CN116069174A - Input association method, electronic equipment and storage medium - Google Patents

Info

Publication number
CN116069174A
Authority
CN
China
Prior art keywords
output result
target
word
sentence
taking
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310144621.6A
Other languages
Chinese (zh)
Inventor
薄满辉
籍焱
王凯
张丽颖
刘丰
尚亚南
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Travelsky Mobile Technology Co Ltd
Original Assignee
China Travelsky Mobile Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Travelsky Mobile Technology Co Ltd
Priority to CN202310144621.6A
Publication of CN116069174A
Current legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/02 Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F3/023 Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
    • G06F3/0233 Character input methods
    • G06F3/0237 Character input methods using prediction or retrieval techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking

Abstract

The invention provides an input association method comprising the following steps: acquiring an input target character string; acquiring a plurality of target entity word libraries; traversing the target entity word libraries and, for the current target entity word library, if any entity word in that library is contained in the target character string, acquiring the fixed sentences corresponding to the entity word from that library as a current output result; acquiring sentences beginning with the target character string from a first set corpus and, if corresponding target sentences are acquired, taking them as a second output result; performing word segmentation on the target character string to obtain a word segmentation set; acquiring, from a second set corpus, the sentences that include each word in the word segmentation set to obtain a corresponding sentence set for each word; if the sentence sets have a non-empty intersection, taking the sentences in the intersection as a third output result; and outputting the results. The invention also provides an electronic device and a storage medium. The invention can output association words that are as rich as possible.

Description

Input association method, electronic equipment and storage medium
Technical Field
The present invention relates to the field of intelligent search, and in particular, to an input association method, an electronic device, and a storage medium.
Background
With the rapid growth of internet technology, people increasingly depend on the internet to obtain the information they need. When a user searches for content using a search box, the user typically enters words in the search box, the search box looks up association words matching the input in a pre-built input association word library, and a list of input association words is displayed below the search box. The user can then click a recommended association word directly and search for the desired content without typing further. However, existing input association methods either require the user to input characters with relatively complete meaning before giving corresponding association words, or, because the corpus is limited, may fail to provide any association word when nothing matches. When the characters input by the user are ambiguous or too few, for example only one character, no corresponding association word can be given, so both applicability and user experience are poor.
Disclosure of Invention
To address the above technical problems, the invention adopts the following technical solution:
An embodiment of the invention provides an input association method, which comprises the following steps:
S100, acquiring an input target character string;
S200, acquiring n target entity word libraries; each target entity word library comprises a plurality of entity words and corresponding fixed sentences, the entity word types corresponding to any two target entity word libraries are different, and the entity words in the same target entity word library correspond to the same entity word type;
S300, traversing the n target entity word libraries; for the $i$-th target entity word library, if any entity word in the $i$-th target entity word library is contained in the target character string, acquiring the fixed sentences corresponding to that entity word from the $i$-th target entity word library as $k_i$ output results, where $i$ takes values from 1 to $n$; taking the $(k_1 + k_2 + \dots + k_i + \dots + k_n)$ output results as a first output result; executing S400;
S400, acquiring a first target sentence beginning with the target character string from a first set corpus; if a corresponding first target sentence is acquired, taking the acquired first target sentence as a second output result and executing S500; otherwise, executing S500;
S500, performing word segmentation on the target character string to obtain a word segmentation set $P = (P_1, P_2, \dots, P_j, \dots, P_m)$, where $P_j$ is the $j$-th word in $P$, $j$ takes values from 1 to $m$, and $m$ is the number of words in $P$; if $m > 1$, executing S600; otherwise, executing S800;
S600, acquiring, from a second set corpus, the second target sentences that include $P_j$ to obtain the sentence set $W_j = (w_{j1}, w_{j2}, \dots, w_{jr}, \dots, w_{jh(j)})$ of $P_j$, where $w_{jr}$ is the $r$-th second target sentence in $W_j$, $r$ takes values from 1 to $h(j)$, and $h(j)$ is the number of second target sentences in $W_j$;
S700, if $W_1 \cap W_2 \cap \dots \cap W_j \cap \dots \cap W_m \neq \text{Null}$, taking the sentences in $W_1 \cap W_2 \cap \dots \cap W_j \cap \dots \cap W_m$ as a third output result, and executing S810;
S800, taking at least part of the first output result and the second output result as a final output result and outputting it;
S810, taking at least part of the first output result, the second output result and the third output result as a final output result and outputting it.
The invention has at least the following beneficial effects:
According to the input association method provided by the embodiments of the invention, the input character string is first matched against fixed sentences and then against the first set corpus. The character string is then segmented into words and matched against the second set corpus. If no suitable association word has been matched by that point, synonym replacement and/or keyword extraction is applied to the character string, and matching is performed against the second set corpus based on the results of the synonym replacement and/or keyword extraction. The association words provided are therefore as rich and accurate as possible, and the user experience is good.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of an input association method according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to fall within the scope of the invention.
An embodiment of the present invention provides an input association method, as shown in fig. 1, which may include the following steps:
S100, acquiring an input target character string.
In the embodiment of the present invention, the target character string may be the character string formed by all characters input by the user in the input box of a set information-providing website. For example, if the user inputs "southern aviation delay", the target character string is "southern aviation delay". As another example, if the user inputs "I want to go to Beijing", the target character string is "I want to go to Beijing".
S200, acquiring n target entity word libraries; each target entity word library comprises a plurality of entity words and corresponding fixed sentences, the entity word types corresponding to any two target entity word libraries are different, and the entity words in the same target entity word library correspond to the same entity word type.
In the embodiment of the present invention, the n target entity word libraries may be stored in advance in a server that is communicatively connected to the set information-providing website. In one example, each target entity word library may include an entity word list storing a number of entity words and one fixed sentence table associated with the entity word list. In another example, each target entity word library may include an entity word list storing a number of entity words and a number of fixed sentence tables associated with those entity words. Preferably, to reduce storage resources, all fixed sentences may be stored in the same table.
The categories and numbers of target entity word banks may be set based on actual needs, and in one exemplary embodiment, the target entity word banks may be word banks associated with aviation, for example, entity word banks that may include entity word categories such as avionics, airports, and security checks. Those skilled in the art will recognize that any method of constructing a target entity word stock is within the scope of the present invention.
S300, traversing the n target entity word libraries; for the $i$-th target entity word library, if any entity word in the $i$-th target entity word library is contained in the target character string, acquiring the fixed sentences corresponding to that entity word from the $i$-th target entity word library as $k_i$ output results, where $i$ takes values from 1 to $n$; taking the $(k_1 + k_2 + \dots + k_i + \dots + k_n)$ output results as a first output result; executing S400.
Specifically, for each target entity word library, each entity word in the library can be compared with the target character string; if any entity word in the library is contained in the target character string, the corresponding fixed sentences are acquired from the library as the output result for that library. Those skilled in the art will appreciate that it is possible that no entity word in any target entity word library is contained in the target character string, in which case the first output result is Null.
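The traversal in S300 can be sketched in Python as follows. This is a minimal illustration, not the patented implementation: each word library is assumed to be a dictionary from an entity word to its list of fixed sentences, and the function name, library contents and sample sentences are hypothetical.

```python
from typing import Dict, List

# Assumed representation: one dict per entity word type,
# mapping an entity word to its fixed sentences.
EntityLibrary = Dict[str, List[str]]


def match_fixed_sentences(target: str, libraries: List[EntityLibrary]) -> List[str]:
    """Collect the fixed sentences of every entity word contained in target (S300)."""
    first_output: List[str] = []
    for library in libraries:  # i = 1..n
        for entity_word, sentences in library.items():
            if entity_word in target:  # the entity word occurs in the input string
                first_output.extend(sentences)  # these count towards k_i
    return first_output


# Hypothetical sample data for two entity word types.
airline_library = {"China Southern": ["China Southern flight delay compensation"]}
airport_library = {"Beijing": ["Beijing Capital Airport transfer guide"]}
libraries = [airline_library, airport_library]
```

An empty list plays the role of a Null first output result when no entity word is contained in the input.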
S400, acquiring a first target sentence beginning with the target character string from a first set corpus, taking the acquired first target sentence as a second output result if the corresponding first target sentence is acquired, and executing S500; otherwise, S500 is directly performed.
In the embodiment of the present invention, the first set corpus may be a prefix tree corpus, and may be an existing prefix tree corpus.
In the embodiment of the invention, the acquired target sentence is a sentence with the intention being the same as or close to the intention of the target character string.
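As a rough illustration of prefix lookup in a prefix tree corpus (S400), the following Python sketch stores sentences character by character in a trie and returns every stored sentence that begins with the input. The class and method names are hypothetical, and the sketch only performs literal prefix matching; it makes no attempt to model the intent similarity mentioned above.

```python
from typing import Dict, List


class TrieNode:
    def __init__(self) -> None:
        self.children: Dict[str, "TrieNode"] = {}
        self.is_sentence = False  # True if a stored sentence ends at this node


class PrefixTreeCorpus:
    """Hypothetical prefix-tree corpus keyed on single characters."""

    def __init__(self, sentences: List[str]) -> None:
        self.root = TrieNode()
        for sentence in sentences:
            self.insert(sentence)

    def insert(self, sentence: str) -> None:
        node = self.root
        for ch in sentence:
            node = node.children.setdefault(ch, TrieNode())
        node.is_sentence = True

    def starting_with(self, prefix: str) -> List[str]:
        """Return all stored sentences that begin with prefix, sorted."""
        node = self.root
        for ch in prefix:
            child = node.children.get(ch)
            if child is None:
                return []  # no stored sentence has this prefix
            node = child
        results: List[str] = []
        stack = [(node, prefix)]
        while stack:  # depth-first walk of the subtree below the prefix
            current, text = stack.pop()
            if current.is_sentence:
                results.append(text)
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        return sorted(results)


corpus = PrefixTreeCorpus(
    ["flight delay refund", "flight delay notice", "baggage allowance"]
)
```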
S500, performing word segmentation on the target character string to obtain a word segmentation set $P = (P_1, P_2, \dots, P_j, \dots, P_m)$, where $P_j$ is the $j$-th word in $P$, $j$ takes values from 1 to $m$, and $m$ is the number of words in $P$; if $m > 1$, executing S600; otherwise, i.e., $m \le 1$, executing S800.
S600, acquiring, from a second set corpus, the second target sentences that include $P_j$ to obtain the sentence set $W_j = (w_{j1}, w_{j2}, \dots, w_{jr}, \dots, w_{jh(j)})$ of $P_j$, where $w_{jr}$ is the $r$-th second target sentence in $W_j$, $r$ takes values from 1 to $h(j)$, and $h(j)$ is the number of second target sentences in $W_j$.
In the embodiment of the present invention, the corpus in the second set corpus may be the same as the corpus stored in the first set corpus, except that the manner of storing the corpus is different, and the second set corpus is the existing corpus.
S700, if $W_1 \cap W_2 \cap \dots \cap W_j \cap \dots \cap W_m \neq \text{Null}$, that is, if the m sentence sets $W_1, W_2, \dots, W_j, \dots, W_m$ have an intersection and share at least one sentence, taking the sentences in $W_1 \cap W_2 \cap \dots \cap W_j \cap \dots \cap W_m$ as a third output result, and executing S810.
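Steps S500 to S700 can be sketched with an inverted index, which is one plausible way to store the second set corpus differently from a prefix tree. The sketch below is illustrative only: it assumes whitespace tokenization as a stand-in for a real word segmenter, treats "a sentence includes a word" as whole-token membership, and uses hypothetical function names and sample sentences.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Set


def build_inverted_index(sentences: List[str]) -> Dict[str, Set[str]]:
    """Map each word to the set of sentences that contain it."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for sentence in sentences:
        for word in sentence.split():  # assumption: whitespace segmentation
            index[word].add(sentence)
    return index


def intersect_sentence_sets(words: Sequence[str], index: Dict[str, Set[str]]) -> Set[str]:
    """Compute W_1 ∩ W_2 ∩ ... ∩ W_m for the segmented words (S600, S700)."""
    sentence_sets = [index.get(word, set()) for word in words]  # W_j per word
    if not sentence_sets:
        return set()
    return set.intersection(*sentence_sets)


index = build_inverted_index(
    ["southern flight delay refund", "southern baggage rules", "flight delay notice"]
)
```

An empty result corresponds to the Null intersection, in which case the third output result is not produced.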
S800, taking at least part of the first output result and the second output result as a final output result and outputting.
And S810, taking at least part of the first output result, the second output result and the third output result as final output results, and outputting. In the embodiment of the invention, the output result can be displayed on a display screen of a user.
If the output result includes only the second output result, N sentences may be selected, for example randomly, from the target sentences acquired in S400 as the output result, where N is the set number of output sentences and may be set based on actual needs. Those skilled in the art will recognize that if fewer than N target sentences are acquired in S400, all the acquired target sentences may be taken as the output result.
If the output result includes the first output result and the second output result, the first output result includes A1 fixed sentences and the second output result includes A2 sentences, where $A1 + A2 = N$, N is the set number of output results, and A1 and A2 may be set based on actual needs. The A1 fixed sentences may be selected, for example randomly, from the $(k_1 + k_2 + \dots + k_i + \dots + k_n)$ output results after de-duplication. The A2 sentences may be selected, for example randomly, from the target sentences acquired in S400. Those skilled in the art will recognize that if the total number of sentences in the first output result and the second output result is less than N, all of the first output result and the second output result may be taken as output results.
If the output result includes the first output result, the second output result, and the third output result, the first output result may include B1 fixed sentences, the second output result B2 sentences, and the third output result B3 sentences, where $B1 + B2 + B3 = N$. B1, B2 and B3 may be set based on actual needs. Those skilled in the art will recognize that if the total number of sentences in the first output result, the second output result, and the third output result is less than N, all of the first output result, the second output result, and the third output result may be taken as output results.
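The quota-based selection of the final output can be sketched as below. This is a simplified illustration under stated assumptions: the function name is hypothetical, each group is de-duplicated while preserving order, and the first entries are taken rather than a random sample so that the behaviour is reproducible. The text does not say how unused quota is redistributed when one group is short, so the sketch does not redistribute it.

```python
from typing import List, Sequence


def select_final_output(result_groups: Sequence[List[str]], quotas: Sequence[int]) -> List[str]:
    """Take up to quota sentences from each output result, in order."""
    final: List[str] = []
    for group, quota in zip(result_groups, quotas):
        unique = list(dict.fromkeys(group))  # de-duplicate, keep first occurrences
        final.extend(unique[:quota])  # a short group simply contributes all it has
    return final


first_output = ["a", "a", "b", "c"]
second_output = ["d"]
third_output = ["e", "f"]
# Hypothetical quotas B1 = 2, B2 = 2, B3 = 1, so N = 5.
selected = select_final_output([first_output, second_output, third_output], [2, 2, 1])
```

Replacing the slice with `random.sample` on the de-duplicated list would give the random selection mentioned in the text.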
According to the input association method provided by this embodiment, association words are matched using the target entity word libraries, the first set corpus, and the second set corpus, so that association words as rich as possible can be matched.
In another embodiment of the present invention, S700 further includes: if $W_1 \cap W_2 \cap \dots \cap W_j \cap \dots \cap W_m = \text{Null}$, that is, if the m sentence sets $W_1, W_2, \dots, W_j, \dots, W_m$ have no intersection and share no sentence, the following steps are performed:
S710, obtaining the substitution words of $P_j$ and, based on $P_j$ and its substitution words, forming the word combination $PB_j = (P_j, P_{j1}, P_{j2}, \dots, P_{jx}, \dots, P_{jf(j)})$ of $P_j$, where $P_{jx}$ is the $x$-th substitution word of $P_j$, $x$ takes values from 1 to $f(j)$, and $f(j)$ is the number of substitution words of $P_j$.
In the embodiment of the invention, the substitution words of $P_j$ are words with a meaning similar to that of $P_j$; for example, a substitution word for "late" is "deferred", and substitution words for "south" include words such as "east". The substitution words of $P_j$ may be obtained from a preset substitution word list.
S720, based on $PB_1, PB_2, \dots, PB_j, \dots, PB_m$, obtaining H combined word segmentation sets $PC = (PC_1, PC_2, \dots, PC_s, \dots, PC_H)$, where the $s$-th combined word segmentation set is $PC_s = (PC_{s1}, PC_{s2}, \dots, PC_{sj}, \dots, PC_{sm})$, $PC_{sj}$ is the $j$-th word in $PC_s$, $PC_{sj} \in PB_j$ and $PC_s \neq P$; that is, each combined word segmentation set contains one word from each of the word combinations $PB_1, PB_2, \dots, PB_j, \dots, PB_m$, any two combined word segmentation sets are different, and $PC$ does not include $P$; executing S730; $s$ takes values from 1 to H.
In the embodiment of the present invention, the H combined word segmentation sets may be obtained from $PB_1, PB_2, \dots, PB_j, \dots, PB_m$ by existing permutation and combination methods, i.e., $H = f(1) \cdot f(2) \cdots f(j) \cdots f(m) - 1$.
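Steps S710 and S720 amount to a Cartesian product over the word combinations with the original segmentation removed. The Python sketch below illustrates this with `itertools.product`; the function name and the substitution table are hypothetical. Note that when each $PB_j$ holds $P_j$ plus $f(j)$ substitution words, the product yields $(f(1)+1)(f(2)+1)\cdots(f(m)+1) - 1$ sets; this agrees with the formula in the text only if $f(j)$ is read as the size of $PB_j$.

```python
from itertools import product
from typing import Dict, List, Sequence, Tuple


def combined_word_sets(
    words: Sequence[str], substitutes: Dict[str, List[str]]
) -> List[Tuple[str, ...]]:
    """Build every combined word segmentation set PC_s, excluding P itself."""
    # PB_j: the original word followed by its substitution words.
    candidate_lists = [[word] + substitutes.get(word, []) for word in words]
    original = tuple(words)
    return [combo for combo in product(*candidate_lists) if combo != original]


# Hypothetical preset substitution word list.
substitution_table = {"late": ["deferred"], "south": ["southern", "east"]}
combos = combined_word_sets(["late", "south"], substitution_table)
```

Here $PB_1$ has 2 entries and $PB_2$ has 3, so the product has 6 tuples and 5 remain after removing the original segmentation.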
S730, acquiring, from the second set corpus, the third target sentences that include $PC_{sj}$ to obtain the sentence set $WC_{sj} = (wc^{1}_{sj}, wc^{2}_{sj}, \dots, wc^{u}_{sj}, \dots, wc^{f(sj)}_{sj})$ of $PC_{sj}$, where $wc^{u}_{sj}$ is the $u$-th third target sentence in $WC_{sj}$, $u$ takes values from 1 to $f(sj)$, and $f(sj)$ is the number of third target sentences in $WC_{sj}$.
S740, obtaining a target sentence result set $T = (T_1, T_2, \dots, T_s, \dots, T_H)$, where the $s$-th target sentence result is $T_s = WC_{s1} \cap WC_{s2} \cap \dots \cap WC_{sj} \cap \dots \cap WC_{sm}$; if at least one target sentence result in $T$ is not Null, that is, at least one target sentence result contains a sentence, taking the target sentence results that are not Null as a fourth output result, and executing S900.
In a preferred embodiment of the present invention, taking the target sentence result that is not Null as the fourth output result may include:
If the combined word segmentation sets corresponding to the non-Null target sentence results in T include sets that contain words in P, the third target sentences are acquired from the target sentence results of those sets as the fourth output result; that is, sentences are preferentially acquired from combined word segmentation sets that contain words in P. More preferably, sentences are acquired from the combined word segmentation set that contains the most words in P.
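The preference described above can be sketched as follows: compute $T_s$ for each combined word segmentation set, keep the non-empty results, and return those whose set retains the most original words. The sketch assumes a word-to-sentences index as the second set corpus and compares words position by position; the function name, index contents and sentence identifiers are hypothetical.

```python
from typing import Dict, List, Sequence, Set, Tuple


def fourth_output(
    combos: Sequence[Tuple[str, ...]],
    original_words: Sequence[str],
    index: Dict[str, Set[str]],
) -> Set[str]:
    """Return sentences from the non-empty T_s whose set keeps the most words of P."""
    scored: List[Tuple[int, Set[str]]] = []
    for combo in combos:
        sentence_sets = [index.get(word, set()) for word in combo]  # WC_sj
        t_s = set.intersection(*sentence_sets) if sentence_sets else set()
        if t_s:  # keep only target sentence results that are not Null
            kept = sum(1 for a, b in zip(combo, original_words) if a == b)
            scored.append((kept, t_s))
    if not scored:
        return set()
    best = max(kept for kept, _ in scored)
    result: Set[str] = set()
    for kept, t_s in scored:
        if kept == best:
            result |= t_s
    return result


# Hypothetical index: word -> sentences containing it.
sample_index = {
    "deferred": {"s1", "s2"},
    "south": {"s1"},
    "east": {"s2"},
    "late": set(),
}
sample_combos = [("deferred", "south"), ("deferred", "east"), ("late", "east")]
```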
S900, taking at least part of the first output result, the second output result and the fourth output result as final output results and outputting.
In this embodiment, if the output result includes a first output result, a second output result, and a fourth output result, the first output result may include C1 fixed sentences, the second output result includes C2 sentences, and the fourth output result includes C3 sentences, c1+c2+c3=n. C1, C2 and C3 may be set based on actual needs. Those skilled in the art will recognize that if the total number of sentences in the first output result, the second output result, and the fourth output result is less than N, the first output result, the second output result, and the fourth output result may all be taken as output results.
According to the input association method provided by this embodiment, association words are first matched using the target entity word libraries and the first set corpus; the target character string is then segmented into words and matched using the second set corpus. When no association word is matched from the segmented words, the words in the target character string are replaced with substitution words, and matching is performed against the second set corpus based on the replaced words. Compared with the previous embodiment, this can further match association words that are as rich as possible.
In another embodiment of the present invention, S740 further includes: if T is Null, namely, any target statement result is Null and no statement is included, executing the following steps:
S741, acquiring keywords in P.
In the embodiment of the invention, the keywords are words obtained from word segmentation in P according to a preset rule. In an exemplary embodiment, keywords in P may be obtained based on existing word importance, e.g., keywords in P may be obtained based on entropy of information. Those skilled in the art know that obtaining keywords by entropy of information may be prior art.
S742, acquiring a fourth target sentence corresponding to the keyword from the second set corpus, taking the acquired fourth target sentence as a fifth output result, and executing S1000.
S1000, taking at least part of the first output result, the second output result and the fifth output result as final output results and outputting.
In another embodiment of the present invention, S740 further includes: if T is Null, S743 is performed.
S743, acquiring keywords in P based on the set keyword table; s744 is performed.
The set keyword table may be an existing keyword table, and is stored in the server in advance.
In the embodiment of the invention, if one word in the set keyword table is included in P, the word is taken as the keyword of P. If two or more words in the set keyword table are included in P, in one example, one word may be randomly selected as the keyword of P, and in another example, the word having the highest information entropy may be selected as the keyword of P.
S744, a fifth target sentence corresponding to the keyword in P is obtained from the second set corpus, and S1001 is executed using the obtained fifth target sentence as a fifth output result.
S1001, taking at least part of the first output result, the second output result and the fifth output result as final output results, and outputting.
In an embodiment of the present invention, in S743, if P does not include any keyword in the set keyword table, S1001 may be directly executed, except that the fifth output result at this time is Null.
In another embodiment of the present invention, in S743, if P does not include any keyword in the set keyword table, S745 may be performed:
S745, acquiring the keyword in P based on word importance, and executing S744.
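The keyword selection in S743 and S745 can be sketched as a table lookup with an importance-based fallback. The text does not define how information entropy is computed for a word, so the sketch below substitutes smoothed self-information, $-\log_2 p(w)$, estimated from corpus word counts; this scoring choice, the function names and the sample counts are all assumptions made for illustration.

```python
import math
from collections import Counter
from typing import Mapping, Optional, Sequence, Set


def self_information(word: str, counts: Mapping[str, int]) -> float:
    """Score a word by -log2 of its add-one smoothed corpus probability."""
    total = sum(counts.values())
    probability = (counts.get(word, 0) + 1) / (total + len(counts) + 1)
    return -math.log2(probability)


def pick_keyword(
    words: Sequence[str], keyword_table: Set[str], counts: Mapping[str, int]
) -> Optional[str]:
    """Prefer words in the keyword table (S743); otherwise fall back to importance (S745)."""
    in_table = [word for word in words if word in keyword_table]
    candidates = in_table if in_table else list(words)
    if not candidates:
        return None
    # Rarer words carry more information and are treated as more important.
    return max(candidates, key=lambda word: self_information(word, counts))


# Hypothetical corpus word counts and keyword table.
corpus_counts = Counter({"i": 50, "want": 30, "delay": 10, "refund": 5})
keyword_table = {"refund", "delay"}
```

When several table words are present, this sketch picks the highest-scoring one, which corresponds to the second of the two examples given in the text (the first being random choice).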
In this embodiment, if the output result includes the first output result, the second output result, and the fifth output result, the first output result may include D1 fixed sentences, the second output result D2 sentences, and the fifth output result D3 sentences, where $D1 + D2 + D3 = N$. D1, D2 and D3 may be set based on actual needs. Those skilled in the art will recognize that if the total number of sentences in the first output result, the second output result, and the fifth output result is less than N, all of the first output result, the second output result, and the fifth output result may be taken as output results.
According to the input association method provided by this embodiment, association words are first matched using the target entity word libraries and the first set corpus; the target character string is then segmented into words and matched using the second set corpus. When no association word is matched from the segmented words, the words in the target character string are replaced with substitution words and matched against the second set corpus based on the replaced words. If still no association word is matched, matching is performed based on the keywords in the target character string. Compared with the previous embodiment, this can further match association words that are as rich as possible.
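The fallback order summarized above (word intersection, then substitution words, then keywords) can be expressed as a small driver that tries each stage in turn and stops at the first one that returns sentences. The sketch is schematic: the stage functions are passed in as callables, and the function name and stage labels are hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple

Finder = Callable[[Sequence[str]], List[str]]


def fallback_match(
    words: Sequence[str], stages: Sequence[Tuple[str, Finder]]
) -> Tuple[Optional[str], List[str]]:
    """Run the stages in order and return the first non-empty result with its label."""
    if len(words) <= 1:  # S500: with m <= 1 the method goes straight to S800
        return None, []
    for label, finder in stages:
        sentences = finder(words)
        if sentences:
            return label, sentences
    return None, []


# Stub stages standing in for S600-S700, S710-S740 and S741-S742.
stub_stages = [
    ("third", lambda words: []),
    ("fourth", lambda words: ["sentence from substitution words"]),
    ("fifth", lambda words: ["sentence from keyword"]),
]
```

The returned label indicates which output result (third, fourth or fifth) is combined with the first and second output results for the final output.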
In another embodiment of the present invention, S700 further includes: if $W_1 \cap W_2 \cap \dots \cap W_j \cap \dots \cap W_m = \text{Null}$, the following steps are performed:
S711, acquiring keywords in P.
In the embodiment of the invention, the keywords in P can be acquired based on the existing word importance, for example, the keywords in P can be acquired based on the information entropy. Those skilled in the art know that obtaining keywords by entropy of information may be prior art.
S712, obtaining a sixth target sentence corresponding to the keyword in the P from the second set corpus, and taking the obtained sixth target sentence as a fourth output result, and executing S820.
S820, taking at least part of the first output result, the second output result and the fourth output result as final output results and outputting.
In another embodiment of the present invention, S700 further includes: if $W_1 \cap W_2 \cap \dots \cap W_j \cap \dots \cap W_m = \text{Null}$, S713 is performed.
S713, acquiring keywords in P based on the set keyword table; s714 is performed.
The set keyword table may be an existing keyword table, and is stored in the server in advance.
In the embodiment of the invention, if one word in the set keyword table is included in P, the word is taken as the keyword of P. If two or more words in the set keyword table are included in P, in one example, one word may be randomly selected as the keyword of P, and in another example, the word having the highest information entropy may be selected as the keyword of P.
S714, acquiring sentences corresponding to the keywords from the second set corpus, taking the acquired sentences as a fourth output result, and executing S820.
S820, taking at least part of the first output result, the second output result and the fourth output result as output results and outputting.
In an embodiment of the present invention, in S713, if P does not include any keyword in the set keyword table, S820 may be performed directly, except that the fourth output result at this time is Null.
In another embodiment of the present invention, in S713, if P does not include any keyword in the set keyword table, S715 may be performed:
S715, acquiring the keyword in P based on word importance, and executing S714.
In this embodiment, if the output result includes a first output result, a second output result, and a fourth output result, the first output result may include C1 fixed sentences, the second output result includes C2 sentences, and the fourth output result includes C3 sentences, c1+c2+c3=n. C1, C2 and C3 may be set based on actual needs. Those skilled in the art will recognize that if the total number of sentences in the first output result, the second output result, and the fourth output result is less than N, the first output result, the second output result, and the fourth output result may all be taken as output results.
In this embodiment, if $W_1 \cap W_2 \cap \dots \cap W_j \cap \dots \cap W_m = \text{Null}$, matching is performed based on keywords, which, like the substitution-word matching described above, can further match association words that are as rich as possible.
In another embodiment of the present invention, S100 is replaced with:
S110, acquiring the length L of the target character string; if L > L0, executing S200; otherwise, executing S400. L0 is a set length and may be set based on actual needs; in one exemplary embodiment, L0 is 2 characters or 3 characters, preferably 3 characters.
In this embodiment, fixed sentence matching is performed only when the target string length is greater than L0, so that matching time can be saved and matching efficiency can be improved compared with the foregoing embodiment.
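The length check can be sketched as a thin gate in front of the fixed sentence matcher. In this illustration the matcher is passed in as a callable, length is counted in characters with `len`, and the function and parameter names are hypothetical; the default threshold of 3 follows the preferred value of L0 given above.

```python
from typing import Callable, List


def gated_fixed_sentences(
    target: str, matcher: Callable[[str], List[str]], l0: int = 3
) -> List[str]:
    """Run fixed sentence matching only when the input is longer than l0 characters."""
    if len(target) > l0:
        return matcher(target)
    return []  # short input: skip entity word library matching entirely


calls: List[str] = []


def recording_matcher(text: str) -> List[str]:
    """Stub matcher that records its inputs, used to show when matching runs."""
    calls.append(text)
    return ["fixed sentence for " + text]
```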
Embodiments of the present invention also provide a non-transitory computer-readable storage medium that may be disposed in an electronic device to store at least one instruction or at least one program for implementing the method of one of the method embodiments, the at least one instruction or the at least one program being loaded and executed by a processor to implement the methods provided by the embodiments described above.
Embodiments of the present invention also provide an electronic device comprising a processor and the aforementioned non-transitory computer-readable storage medium.
Embodiments of the present invention also provide a computer program product comprising program code for causing an electronic device to carry out the steps of the method according to the various exemplary embodiments of the invention as described in the specification, when said program product is run on the electronic device.
While certain specific embodiments of the invention have been described in detail by way of example, it will be appreciated by those skilled in the art that the above examples are for illustration only and are not intended to limit the scope of the invention. Those skilled in the art will also appreciate that many modifications may be made to the embodiments without departing from the scope and spirit of the invention. The scope of the present disclosure is defined by the appended claims.

Claims (10)

1. An input association method, comprising the steps of:
S100, acquiring an input target character string;
S200, acquiring n target entity word libraries; each target entity word library comprises a plurality of entity words and corresponding fixed sentences, the entity word types corresponding to any two target entity word libraries are different, and the entity words in the same target entity word library correspond to the same entity word type;
S300, traversing the n target entity word libraries, and for the i-th target entity word library, if any entity word in the i-th target entity word library is contained in the target character string, acquiring the fixed sentences corresponding to that entity word from the i-th target entity word library as $k_i$ output results, where i ranges from 1 to n; taking the $(k_1 + k_2 + \dots + k_i + \dots + k_n)$ output results as a first output result; executing S400;
S400, acquiring a first target sentence beginning with the target character string from a first set corpus; if a corresponding first target sentence is acquired, taking the acquired first target sentence as a second output result and executing S500; otherwise, executing S500;
S500, performing word segmentation on the target character string to obtain a word segmentation set $P = (P_1, P_2, \dots, P_j, \dots, P_m)$, where $P_j$ is the j-th word in P, j ranges from 1 to m, and m is the number of words in P; if $m > 1$, executing S600; otherwise, executing S800;
S600, obtaining, from the second set corpus, the second target sentences corresponding to $P_j$, to obtain the sentence set $W_j = (w_{j1}, w_{j2}, \dots, w_{jr}, \dots, w_{jh(j)})$ of $P_j$, where $w_{jr}$ is the r-th second target sentence in $W_j$, r ranges from 1 to h(j), and h(j) is the number of second target sentences in $W_j$;
S700, if $W_1 \cap W_2 \cap \dots \cap W_j \cap \dots \cap W_m \neq \text{Null}$, taking the sentences in $W_1 \cap W_2 \cap \dots \cap W_j \cap \dots \cap W_m$ as a third output result, and executing S810;
S800, taking at least part of the first output result and the second output result as a final output result, and outputting the final output result;
S810, taking at least part of the first output result, the second output result and the third output result as a final output result, and outputting the final output result.
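The flow of steps S100 to S810 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the function names, the dictionary-based entity word libraries, the inverted index standing in for the second set corpus, the whitespace word segmenter and the `limit` truncation are all assumptions made for illustration. When the intersection in S700 is empty, the sketch simply outputs the first and second results, since claim 1 itself does not specify that branch.

```python
from typing import Callable, Dict, List, Set


def entity_matches(target: str, libraries: List[Dict[str, List[str]]]) -> List[str]:
    """S300: collect fixed sentences of every entity word contained in the target string."""
    results: List[str] = []
    for library in libraries:  # one library per entity word type (assumed dict layout)
        for word, sentences in library.items():
            if word in target:
                results.extend(sentences)
    return results


def prefix_matches(target: str, first_corpus: List[str]) -> List[str]:
    """S400: sentences of the first set corpus that begin with the target string."""
    return [s for s in first_corpus if s.startswith(target)]


def intersection_matches(words: List[str], index: Dict[str, Set[str]]) -> Set[str]:
    """S600-S700: intersect the sentence sets W_j of all segmented words."""
    sets = [index.get(w, set()) for w in words]
    return set.intersection(*sets) if sets else set()


def associate(
    target: str,
    libraries: List[Dict[str, List[str]]],
    first_corpus: List[str],
    index: Dict[str, Set[str]],
    segment: Callable[[str], List[str]] = str.split,  # stand-in word segmenter
    limit: int = 5,  # hypothetical cap on the number of output sentences
) -> List[str]:
    first = entity_matches(target, libraries)       # first output result
    second = prefix_matches(target, first_corpus)   # second output result
    words = segment(target)                         # S500
    third: List[str] = []
    if len(words) > 1:                              # m > 1, so S600 and S700 run
        third = sorted(intersection_matches(words, index))
    # S800 / S810: output "at least part of" the results; here a simple truncation.
    return (first + second + third)[:limit]
```

A real system would replace `str.split` with a proper word segmenter for the input language.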
2. The method of claim 1, wherein S700 further comprises: if $W_1 \cap W_2 \cap \dots \cap W_j \cap \dots \cap W_m = \text{Null}$, the following steps are performed:
S710, obtaining the substitute words of $P_j$, and forming, based on $P_j$ and its substitute words, the word combination $PB_j = (P_j, P_{j1}, P_{j2}, \dots, P_{jx}, \dots, P_{jf(j)})$ of $P_j$, where $P_{jx}$ is the x-th substitute word of $P_j$, x ranges from 1 to f(j), and f(j) is the number of substitute words of $P_j$;
S720, based on $PB_1, PB_2, \dots, PB_j, \dots, PB_m$, obtaining H combined word segmentation sets $PC = (PC_1, PC_2, \dots, PC_s, \dots, PC_H)$, the s-th combined word segmentation set being $PC_s = (PC_{s1}, PC_{s2}, \dots, PC_{sj}, \dots, PC_{sm})$, where $PC_{sj}$ is the j-th word in $PC_s$, $PC_{sj} \in PB_j$, and $PC_s \neq P$; s ranges from 1 to H, $H = f(1) \cdot f(2) \cdots f(j) \cdots f(m) - 1$; executing S730;
S730, obtaining, from the second set corpus, the third target sentences corresponding to $PC_{sj}$, to obtain the sentence set $WC_{sj} = (wc_{sj}^{1}, wc_{sj}^{2}, \dots, wc_{sj}^{u}, \dots, wc_{sj}^{f(sj)})$ of $PC_{sj}$, where $wc_{sj}^{u}$ is the u-th third target sentence in $WC_{sj}$, u ranges from 1 to f(sj), and f(sj) is the number of third target sentences in $WC_{sj}$;
S740, obtaining a target sentence result set $T = (T_1, T_2, \dots, T_s, \dots, T_H)$, where the s-th target sentence result is $T_s = WC_{s1} \cap WC_{s2} \cap \dots \cap WC_{sj} \cap \dots \cap WC_{sm}$; if at least one target sentence result in T is not Null, taking the target sentence results that are not Null as a fourth output result, and executing S900;
S900, taking at least part of the first output result, the second output result and the fourth output result as a final output result, and outputting the final output result.
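The substitute-word fallback of S710 to S740 amounts to enumerating combinations across the word combinations PB_j and intersecting sentence sets for each one. The sketch below is a hedged illustration: the function names, the `substitutes` mapping and the inverted `index` are assumed, not taken from the specification. Because each PB_j as defined holds f(j) + 1 words, the full Cartesian product enumerated here contains (f(1) + 1)(f(2) + 1)…(f(m) + 1) − 1 combinations, which may differ from the count H stated in S720.

```python
from itertools import product
from typing import Dict, List, Set, Tuple


def combined_segmentations(
    words: List[str], substitutes: Dict[str, List[str]]
) -> List[Tuple[str, ...]]:
    """S710-S720: every combination drawn from PB_1..PB_m, excluding P itself."""
    # PB_j = (P_j, P_j1, ..., P_jf(j)): the original word followed by its substitutes.
    pools = [[w] + substitutes.get(w, []) for w in words]
    original = tuple(words)
    return [combo for combo in product(*pools) if combo != original]


def substitute_matches(
    words: List[str],
    substitutes: Dict[str, List[str]],
    index: Dict[str, Set[str]],
) -> Dict[Tuple[str, ...], Set[str]]:
    """S730-S740: intersect sentence sets per combination and keep non-empty results."""
    results: Dict[Tuple[str, ...], Set[str]] = {}
    for combo in combined_segmentations(words, substitutes):
        sets = [index.get(w, set()) for w in combo]
        common = set.intersection(*sets)
        if common:  # T_s is not Null
            results[combo] = common
    return results
```

Keeping the combination as the dictionary key makes it possible to apply a later preference, such as favouring combinations that still contain a word from P.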
3. The method of claim 2, wherein taking a target sentence result that is not Null as a fourth output result comprises:
if a combined word segmentation set corresponding to a target sentence result in T that is not Null includes a word from P, taking the third target sentences obtained for that combined word segmentation set as the fourth output result.
4. The method of claim 2, wherein S740 further comprises: if T is Null, executing S741;
S741, obtaining keywords in P; the keywords are words selected from the words in P according to a preset rule;
S742, acquiring, from the second set corpus, a fourth target sentence corresponding to the keywords, taking the acquired fourth target sentence as a fifth output result, and executing S1000;
S1000, taking at least part of the first output result, the second output result and the fifth output result as a final output result, and outputting the final output result.
5. The method of claim 2, wherein S740 further comprises: if T is Null, executing S743;
S743, acquiring keywords in P based on a set keyword table; executing S744;
S744, obtaining, from the second set corpus, a fifth target sentence corresponding to the keywords in P, taking the obtained fifth target sentence as a fifth output result, and executing S1001;
S1001, taking at least part of the first output result, the second output result and the fifth output result as a final output result, and outputting the final output result.
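The keyword fallback of S743 and S744 can be sketched as a table lookup followed by a corpus lookup. The names `keywords_from_table` and `keyword_fallback`, the set-based keyword table and the inverted `index` standing in for the second set corpus are assumptions for illustration only; the claims do not fix any of these details.

```python
from typing import Dict, List, Set


def keywords_from_table(words: List[str], keyword_table: Set[str]) -> List[str]:
    """S743: keep the segmented words that appear in the set keyword table."""
    return [w for w in words if w in keyword_table]


def keyword_fallback(
    words: List[str], keyword_table: Set[str], index: Dict[str, Set[str]]
) -> List[str]:
    """S744: gather the sentences indexed under each keyword, without duplicates."""
    results: List[str] = []
    for keyword in keywords_from_table(words, keyword_table):
        for sentence in sorted(index.get(keyword, set())):
            if sentence not in results:
                results.append(sentence)
    return results
```

The same lookup would serve the preset-rule variant of S741 and S742 if `keywords_from_table` were swapped for a rule-based selector.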
6. The method of claim 1, wherein S700 further comprises: if $W_1 \cap W_2 \cap \dots \cap W_j \cap \dots \cap W_m = \text{Null}$, the following steps are performed:
S711, acquiring keywords in P; the keywords are words selected from the words in P according to a preset rule;
S712, obtaining, from the second set corpus, a sixth target sentence corresponding to the keywords in P, taking the obtained sixth target sentence as a fourth output result, and executing S820;
S820, taking at least part of the first output result, the second output result and the fourth output result as a final output result, and outputting the final output result.
7. The method of claim 1, wherein in S800, the first output result includes A1 fixed sentences and the second output result includes A2 first target sentences, where A1 + A2 = N and N is a set number of output sentences;
in S810, the first output result includes B1 fixed sentences, the second output result includes B2 first target sentences, and the third output result includes B3 second target sentences, where B1 + B2 + B3 = N.
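The claim fixes only the total N, not how the N slots are divided among the output results. One possible division, shown purely as an assumption, is a round-robin pick across the result groups so that each group contributes before any group contributes twice. The function name `allocate` is hypothetical.

```python
from itertools import zip_longest
from typing import List


def allocate(groups: List[List[str]], total: int) -> List[str]:
    """Pick up to `total` sentences, taking one from each group in turn."""
    if total <= 0:
        return []
    selected: List[str] = []
    # zip_longest pads exhausted groups with None, so longer groups backfill the quota.
    for row in zip_longest(*groups):
        for sentence in row:
            if sentence is not None and sentence not in selected:
                selected.append(sentence)
                if len(selected) == total:
                    return selected
    return selected
```

With three groups and N = 4, for instance, the counts B1, B2 and B3 fall out of the round-robin order rather than being configured up front; a fixed per-group quota would be an equally valid reading of the claim.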
8. The method of claim 1, wherein the first set corpus is a prefix tree corpus.
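A prefix tree supports the "sentences beginning with the target character string" lookup of S400 in time proportional to the prefix length plus the size of the matching subtree. The following is a generic character-level trie offered as a sketch; the class name `PrefixTree` and its methods are hypothetical, and the specification does not state how its prefix tree corpus is built.

```python
from typing import Dict, List


class PrefixTree:
    """Character-level trie holding the sentences of a corpus."""

    def __init__(self) -> None:
        self.children: Dict[str, "PrefixTree"] = {}
        self.is_sentence = False  # True when a stored sentence ends at this node

    def insert(self, sentence: str) -> None:
        node = self
        for ch in sentence:
            node = node.children.setdefault(ch, PrefixTree())
        node.is_sentence = True

    def starting_with(self, prefix: str) -> List[str]:
        """Return every stored sentence that begins with `prefix`, sorted."""
        node = self
        for ch in prefix:
            child = node.children.get(ch)
            if child is None:
                return []  # no stored sentence has this prefix
            node = child
        results: List[str] = []
        stack = [(node, prefix)]
        while stack:  # iterative depth-first walk of the subtree under the prefix
            current, text = stack.pop()
            if current.is_sentence:
                results.append(text)
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        return sorted(results)
```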
9. A non-transitory computer readable storage medium having stored therein at least one instruction or at least one program, wherein the at least one instruction or the at least one program is loaded and executed by a processor to implement the method of any one of claims 1-8.
10. An electronic device comprising a processor and the non-transitory computer readable storage medium of claim 9.
CN202310144621.6A 2023-02-21 2023-02-21 Input association method, electronic equipment and storage medium Pending CN116069174A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310144621.6A CN116069174A (en) 2023-02-21 2023-02-21 Input association method, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310144621.6A CN116069174A (en) 2023-02-21 2023-02-21 Input association method, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116069174A true CN116069174A (en) 2023-05-05

Family

ID=86174849

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310144621.6A Pending CN116069174A (en) 2023-02-21 2023-02-21 Input association method, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116069174A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117057347A (en) * 2023-10-13 2023-11-14 北京睿企信息科技有限公司 Word segmentation method, electronic equipment and storage medium
CN117057347B (en) * 2023-10-13 2024-01-19 北京睿企信息科技有限公司 Word segmentation method, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
US20190065506A1 (en) Search method and apparatus based on artificial intelligence
CN110929125B (en) Search recall method, device, equipment and storage medium thereof
CN106960030B (en) Information pushing method and device based on artificial intelligence
CN109937417A (en) The system and method for context searching for electronic records
CN111552799B (en) Information processing method, information processing device, electronic equipment and storage medium
CN110162768B (en) Method and device for acquiring entity relationship, computer readable medium and electronic equipment
CN112214593A (en) Question and answer processing method and device, electronic equipment and storage medium
CN112256860A (en) Semantic retrieval method, system, equipment and storage medium for customer service conversation content
CN110968684A (en) Information processing method, device, equipment and storage medium
JP2022050379A (en) Semantic retrieval method, apparatus, electronic device, storage medium, and computer program product
CN111325018B (en) Domain dictionary construction method based on web retrieval and new word discovery
WO2018169597A1 (en) Systems and methods for verbatim-text mining
CN111611807A (en) Keyword extraction method and device based on neural network and electronic equipment
CN116775847A (en) Question answering method and system based on knowledge graph and large language model
CN115795061B (en) Knowledge graph construction method and system based on word vector and dependency syntax
JP6973255B2 (en) Word vector changing device, method, and program
CN114329225A (en) Search method, device, equipment and storage medium based on search statement
CN116069174A (en) Input association method, electronic equipment and storage medium
US20220222442A1 (en) Parameter learning apparatus, parameter learning method, and computer readable recording medium
CN110019714A (en) More intent query method, apparatus, equipment and storage medium based on historical results
KR20190110174A (en) A core sentence extraction method based on a deep learning algorithm
Kaur et al. Query based approach for referrer field analysis of log data using web mining techniques for ontology improvement
CN116680387A (en) Dialogue reply method, device, equipment and storage medium based on retrieval enhancement
CN112948561B (en) Method and device for automatically expanding question-answer knowledge base
CN113434789B (en) Search sorting method based on multi-dimensional text features and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination