CN110019735B - Statement matching method, storage medium and terminal equipment - Google Patents

Statement matching method, storage medium and terminal equipment

Info

Publication number
CN110019735B
Authority
CN
China
Prior art keywords
matching
word
library
sentence
hit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201711473293.5A
Other languages
Chinese (zh)
Other versions
CN110019735A (en
Inventor
董延平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
TCL Technology Group Co Ltd
Original Assignee
TCL Technology Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by TCL Technology Group Co Ltd filed Critical TCL Technology Group Co Ltd
Priority to CN201711473293.5A priority Critical patent/CN110019735B/en
Publication of CN110019735A publication Critical patent/CN110019735A/en
Application granted granted Critical
Publication of CN110019735B publication Critical patent/CN110019735B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a sentence matching method, a storage medium and a terminal device. The method divides an input text into a plurality of words and a matching text into a plurality of sentences according to grammar; matches the words of the input text against the sentences of the matching text in input order and records matching information for each word; stops matching and generates a matching library when the number of interfering words recorded in the matching information reaches a preset threshold; and, after a matching library is obtained, repeats the matching process for the remaining words until the word library and/or the matching library is empty. A matching result is finally obtained from all the matching libraries generated. By comparing interfering words against the matching libraries, the method avoids errors caused by skipped sentences, omitted input and misplaced input, and thereby improves matching accuracy.

Description

Statement matching method, storage medium and terminal equipment
Technical Field
The present invention relates to the field of natural language processing technologies, and in particular, to a sentence matching method, a storage medium, and a terminal device.
Background
Sentence matching technology has become one of the core technologies of natural language processing and plays an important role in many business systems, such as voice assistants, listening tests and chat robots. Existing methods generally decompose an input text into a plurality of words, match each word against the matching text, collect the matching results of all the words, and derive an overall matching result from them. However, this word-by-word approach handles uncertainty in the input sentence poorly, such as skipped sentences, omitted input and erroneous input, so sentence matching accuracy is low.
There is thus a need for improvement in the prior art.
Disclosure of Invention
Aiming at the defects of the prior art, the invention aims to provide a sentence matching method, a storage medium and terminal equipment.
In order to solve the technical problems, the technical scheme adopted by the invention is as follows:
a sentence matching method, comprising:
acquiring an input text and a matched text, carrying out grammar analysis on the input text and the matched text, and determining a word library corresponding to the input text and a sentence library corresponding to the matched text;
sequentially matching the words in the word library with the sentence library according to the input sequence, and recording matching information of each word, wherein the matching information comprises the interference count;
when the interference count reaches a preset number of times, stopping matching and recording the matched words, with the interfering words removed, and the hit sentences in the sentence library to generate a matching library;
updating the word library and the sentence library according to the matching library, and repeating the step of matching the words in the updated word library with the updated sentence library until the word library and/or the matching library is empty;
and obtaining all the matching libraries generated by matching, and generating a matching result according to the obtained all the matching libraries.
The sentence matching method, wherein sequentially matching the words in the word library with the sentence library according to the input sequence and recording matching information of each word, the matching information comprising the interference count, specifically comprises:
matching the first word with the sentence library according to the input sequence of the words contained in the word library;
if the first word hits the sentence library, marking the sentence hit by the first word as a first hit sentence and setting the interference count to 0;
matching the second word with the sentence library, and judging whether the second word is an interfering word;
and when the second word is an interfering word, incrementing the interference count by one, and matching a third word with the sentence library, until the interference count reaches the preset number of times.
The sentence matching method, wherein the matching the second word with the sentence library, and determining whether the second word is an interfering word specifically includes:
matching the second word with a sentence library, and obtaining a hit mark of the second word;
determining a second hit statement corresponding to the second word according to the hit mark, and comparing a second position of the second hit statement with a first position of the first hit statement;
and when the second position is smaller than the first position, judging that the second word is an interference word.
The sentence matching method, wherein the matching the second word with the sentence library, and determining whether the second word is an interfering word further includes:
when the second position is larger than the first position, detecting whether the first hit statement is matched;
and if the first hit sentence has not been matched, judging that the second word is an interfering word.
The sentence matching method, wherein stopping matching and recording the matched words, with the interfering words removed, and the hit sentences in the sentence library to generate a matching library when the interference count reaches the preset number of times specifically comprises:
when the interference count reaches the preset number of times, stopping matching of the word library and the sentence library, and judging whether the hit sentence precedes the sentence in which the interfering word is located;
when the hit sentence precedes the sentence in which the interfering word is located, extracting the matched words, the hit sentence and all sentences before the hit sentence;
and removing the interfering words contained in the matched words to update the matched words, and generating a matching library from the updated matched words, the hit sentence and all sentences before the hit sentence.
The sentence matching method, wherein stopping matching and recording the matched words, with the interfering words removed, and the hit sentences in the sentence library when the interference count reaches the preset number of times further comprises:
when the hit sentence does not precede the sentence in which the interfering word is located, comparing the hit rate of the hit sentence with a preset threshold;
if the hit rate is less than or equal to the preset threshold, removing the interfering words contained in the matched words to update the matched words, and generating a matching library from the updated matched words;
if the hit rate is greater than the preset threshold, removing the interfering words contained in the matched words to update the matched words, and generating a matching library from the updated matched words, the hit sentence and all sentences before the hit sentence.
The sentence matching method, wherein the steps of updating the word library and the sentence library according to the matching library and matching the words in the updated word library with the updated sentence library are repeated until the word library and/or the matching library is empty, specifically include:
extracting matched words and hit sentences contained in the matching library, deleting the matched words in the word library and deleting the hit sentences in the sentence library so as to update the word library and the sentence library;
and repeatedly matching the words in the updated word library with the updated sentence library until all the words are matched.
The sentence matching method, wherein the obtaining all the matching libraries generated by matching, and generating the matching result according to the obtained all the matching libraries, specifically comprises:
and respectively determining the matching results of the matching libraries, and generating the matching results of the input text and the matching text according to the matching results of the matching libraries.
A computer readable storage medium storing one or more programs executable by one or more processors to implement the steps in the sentence matching method as described in any of the above.
A terminal device, comprising: a processor, a memory, and a communication bus, the memory having stored thereon a computer readable program executable by the processor;
the communication bus realizes connection communication between the processor and the memory;
the processor, when executing the computer readable program, implements the steps in the sentence matching method as described in any of the above.
Beneficial effects: compared with the prior art, the invention provides a sentence matching method, a storage medium and a terminal device. The method divides an input text into a plurality of words and a matching text into a plurality of sentences according to grammar; matches the words of the input text against the sentences of the matching text in input order and records matching information for each word; stops matching and generates a matching library when the number of interfering words recorded in the matching information reaches a preset threshold; and, after a matching library is obtained, repeats the matching process for the remaining words until the word library and/or the matching library is empty. A matching result is finally obtained from all the matching libraries generated. By comparing interfering words against the matching libraries, the method avoids errors caused by skipped sentences, omitted input and misplaced input, and thereby improves matching accuracy.
Drawings
FIG. 1 is a flowchart of a sentence matching method according to a preferred embodiment of the present invention.
Fig. 2 is a flowchart of step S20 in a preferred embodiment of the sentence matching method according to the present invention.
Fig. 3 is a schematic structural diagram of a preferred embodiment of a terminal device according to the present invention.
Detailed Description
The invention provides a sentence matching method, a storage medium and a terminal device. To make the purposes, technical schemes and effects of the invention clearer and more definite, the invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless expressly stated otherwise, as understood by those skilled in the art. It will be further understood that the terms "comprises" and/or "comprising," when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. It will be understood that when an element is referred to as being "connected" or "coupled" to another element, it can be directly connected or coupled to the other element or intervening elements may also be present. Further, "connected" or "coupled" as used herein may include wirelessly connected or wirelessly coupled. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
It will be understood by those skilled in the art that all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs unless defined otherwise. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The invention will be further described by the description of embodiments with reference to the accompanying drawings.
Referring to fig. 1, fig. 1 is a flowchart of a preferred embodiment of a sentence matching method according to the present invention. The method comprises the following steps:
S10, acquiring an input text and a matching text, parsing the input text and the matching text, and determining a word library corresponding to the input text and a sentence library corresponding to the matching text.
Specifically, the input text is generated from the text received from the user, and the words in the input text are ordered in the order of receipt. That is, the words are ordered from front to back according to the user's input, so that a word entered earlier is placed before a word entered later. For example, if the words received from the user are I love you, then by word order I is in the first position, love in the second and you in the third.
The matching text is a benchmark text for verifying the input text, and the input text is generated by the user according to the matching text. For example, the terminal device plays the matching text in the form of voice, and the user listens to the voice broadcast and enters what was heard into the terminal device to form the input text. In addition, the matching text can be obtained from a preset matching text database according to the input text. For example, the matching text database is composed of a large number of texts, each of which consists of sentences capable of expressing a complete meaning. The input text can carry a text identifier; when the input text is received, the corresponding text is looked up in the matching text database according to the text identifier, and the text found is used as the matching text corresponding to the input text.
Meanwhile, in the present embodiment, parsing the input text refers to dividing the input text in units of words so as to obtain a plurality of words. For example, if the input text is I love you, the input text is divided into I, love and you, and the resulting words are sorted according to their input sequence, i.e. I is the first word, love is the second word and you is the third word. In addition, when an input text is received, the language type of the input text can be determined, the corresponding dividing method is selected according to the language type, and the input text is divided by that method, which improves the accuracy of the division. For example, when the language type is English, the input text is divided using spaces as delimiters; when the language type is Chinese, bytes are used as the dividing standard, and every two bytes are divided into one word.
The matching text is parsed in units of sentences so as to divide it into a plurality of sentences, so that the sentence to which each input word belongs can be determined and problems such as skipped sentences or a wrong input order can be avoided. In this embodiment, dividing the matching text in units of sentences may specifically be: obtaining the punctuation marks carried by the matching text, and dividing the matching text into a plurality of sentences according to the obtained punctuation marks. For example, if the matching text is I love you. You love me. We are couple. then three periods are extracted, so the matching text is divided into three sentences at those periods, namely I love you; You love me; and We are couple. The sentences are ordered by their position in the matching text: I love you is the first sentence, You love me is the second sentence and We are couple is the third sentence.
The word library consists of the words obtained by dividing the input text, and the sentence library consists of the sentences obtained by dividing the matching text. In addition, each word in the word library carries an input number, which is obtained by ordering the words according to the input sequence. Each sentence in the sentence library carries a matching number, which is determined by the position of the sentence in the matching text. Meanwhile, each word contained in each sentence in the sentence library carries a matching identifier, which is generated from the position of the word in its sentence and the matching number of that sentence. For example, if the matching text is I love you. You love me. We are couple. then the matching number of I love you is 1 and the matching identifier of I in I love you is 1.1.
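The construction of the two libraries described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names `build_word_library` and `build_sentence_library` are hypothetical, the input is assumed to be space-delimited English, and sentences are assumed to end with `.`, `!` or `?`.

```python
import re

def build_word_library(input_text):
    """Split space-delimited input text into (input_number, word) pairs, numbered from 1."""
    return [(i, w) for i, w in enumerate(input_text.split(), start=1)]

def build_sentence_library(matching_text):
    """Split the matching text at sentence-ending punctuation (an assumed rule).

    Returns {matching_number: [(matching_identifier, word), ...]}, where the
    identifier has the form "<sentence number>.<word position>", e.g. "1.1".
    """
    sentences = [s.strip() for s in re.split(r"[.!?]", matching_text) if s.strip()]
    library = {}
    for n, sentence in enumerate(sentences, start=1):
        library[n] = [(f"{n}.{p}", w) for p, w in enumerate(sentence.split(), start=1)]
    return library

words = build_word_library("I love you")
sentences = build_sentence_library("I love you. You love me. We are couple.")
```

Here `words` holds the input numbers and `sentences[1][0]` is the pair `("1.1", "I")`, mirroring the matching identifier in the example above.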
And S20, sequentially matching words in the word library with the sentence library according to the input sequence, and recording matching information of each word, wherein the matching information comprises the interference times.
Specifically, the input sequence is the order in which the words appear in the input text. That is, the words in the word library are matched with the sentence library one by one in the order of the input numbers they carry, and the matching information obtained for each word is recorded. The matching information may include the interfering words, the interference count, the interference location, the hit mark, the first hit sentence and the matched words. Matching in the order of the input numbers proceeds as follows: the first word is matched with the sentence library, the first matching information of the first word is recorded, and it is judged whether the interference count carried by the first matching information has reached the preset number of times. If it has not, the second word is matched with the sentence library, the matching information of the second word is generated from the first matching information and the hit information of the second word, and it is judged whether the interference count carried by the second matching information has reached the preset number of times. If it has not, the third word is matched with the sentence library, and so on until the interference count reaches the preset number of times, at which point the matching library is generated from the matching information of the word at which matching stopped.
For example, as shown in fig. 2, the matching the words in the word library with the sentence library according to the input sequence, and recording the matching information of each word specifically includes:
S21, matching the first word with the sentence library according to the input sequence of the words contained in the word library;
S22, if the first word hits the sentence library, marking the sentence hit by the first word as a first hit sentence and setting the interference count to 0;
S23, matching the second word with the sentence library, and judging whether the second word is an interfering word;
and S24, when the second word is an interfering word, incrementing the interference count by one, and matching a third word with the sentence library, until the interference count reaches the preset number of times.
Specifically, the first word hitting the sentence library means that the sentence library contains a sentence that includes the first word; among the sentences containing the first word, the one with the smallest matching number is the first hit sentence. That is, if several sentences in the sentence library contain the first word, the sentence with the foremost matching number is taken as the first hit sentence. For example, if the first word hits the first sentence and the third sentence in the sentence library, the first sentence is the first hit sentence. In addition, when the first word does not hit the sentence library, matching is stopped. In other words, the first hit sentence is the sentence hit by the earliest-input word among all the words being matched in the current round.
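Selecting the first hit sentence can be illustrated with a short sketch. The helper name `first_hit_sentence` is hypothetical, the sentence library is assumed to map each matching number to its list of words, and the case-insensitive comparison is an assumption not stated in the text.

```python
def first_hit_sentence(word, sentence_library):
    """Return the smallest matching number whose sentence contains `word`, or None."""
    target = word.lower()  # assumption: matching ignores case
    for number in sorted(sentence_library):
        if target in (w.lower() for w in sentence_library[number]):
            return number
    return None  # the word does not hit the sentence library

library = {
    1: ["I", "love", "you"],
    2: ["You", "love", "me"],
    3: ["We", "are", "couple"],
}
```

With this library, a word that appears in both the first and second sentences, such as `love`, resolves to sentence 1, and a word that appears nowhere yields `None`, which corresponds to stopping the match.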
Meanwhile, in this embodiment, the matching the second word with the sentence library, and determining whether the second word is an interfering word specifically includes:
matching the second word with a sentence library, and obtaining a hit mark of the second word;
determining a second hit statement corresponding to the second word according to the hit mark, and comparing a second position of the second hit statement with a first position of the first hit statement;
when the second position is smaller than the first position, judging that the second word is an interference word;
when the second position is larger than the first position, detecting whether the first hit statement is matched;
and if the first hit sentence has not been matched, judging that the second word is an interfering word.
Specifically, the second hit sentence refers to the sentence hit by the second word; when the second word hits nothing, it is directly judged to be an interfering word. When the second word does hit, its hit mark is obtained, and whether the second word is an interfering word is judged from the hit mark and the matching information of the first word. When the matching number of the second hit sentence is smaller than that of the first hit sentence, the second word is an interfering word. When the matching number of the second hit sentence is equal to that of the first hit sentence, it is judged whether the matching identifier corresponding to the hit mark of the second word comes before the matching identifier corresponding to the first word; if so, the second word is an interfering word. When the matching number of the second hit sentence is larger than that of the first hit sentence, it is judged whether the first hit sentence has been matched; if not, the second word is an interfering word. In addition, once the second word is determined to be an interfering word, the interference count in the matching information corresponding to the second word is incremented by one, i.e. the interference count corresponding to the second word is 1, and so on until the interference count reaches the preset number of times.
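The decision rules above can be condensed into one small function. This is a sketch under stated assumptions: the name `is_interfering` is hypothetical, each hit is represented as a `(sentence_number, word_position)` tuple derived from the matching identifier, and whether the first hit sentence "has been matched" is passed in as a boolean because the text does not spell out how that state is computed.

```python
def is_interfering(second_hit, first_hit, first_sentence_matched):
    """Classify the second word as interfering or not.

    second_hit: (sentence_number, word_position), or None if the word hit nothing.
    first_hit: (sentence_number, word_position) of the first word's hit.
    first_sentence_matched: assumed flag saying the first hit sentence is done.
    """
    if second_hit is None:
        return True  # no hit at all: directly an interfering word
    s_num, s_pos = second_hit
    f_num, f_pos = first_hit
    if s_num < f_num:
        return True  # hit lands in an earlier sentence
    if s_num == f_num:
        return s_pos < f_pos  # same sentence, but before the first word
    return not first_sentence_matched  # later sentence while the first is unfinished
```

For instance, a second word hitting position 1.3 after a first hit at 1.1 is not interfering, whereas a hit at 2.1 is interfering as long as the first hit sentence has not yet been matched.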
S30, stopping matching and recording the matched words and the hit sentences in the sentence library without the interference words when the interference times reach the preset times so as to generate a matching library.
Specifically, stopping matching refers to stopping matching of the words arranged behind when the number of times of interference carried by the matching information corresponding to the words in the word library reaches a preset number of times. The matching library comprises matched words, matched sentences and all sentences arranged in front of the matched sentences, wherein the matched words do not comprise interfering words. That is, the interfering words are removed from all the words that have been matched to obtain matched words. Correspondingly, stopping matching and recording the matched words and the hit sentences in the sentence library except the interfering words when the number of times of interference reaches the preset number of times, so as to generate the matching library specifically comprises the following steps:
when the number of times of interference reaches the preset number of times, stopping matching of the word library and the sentence library, and judging whether the hit sentence is prior to the sentence where the interference word is located;
when the hit sentence precedes the sentence in which the interfering word is located, extracting the matched words, the hit sentence and all sentences before the hit sentence;
and removing the disturbing words contained in the matched words to update the matched words, and generating a matching library according to the updated matched words, the hit sentences and all sentences before the hit sentences.
Specifically, the hit sentence preceding the sentence in which the interfering word is located means that the matching number of the hit sentence is smaller than the matching number of the sentence in which the interfering word is located. Furthermore, when the preset number of times is 2, two interfering words are recorded; correspondingly, the hit sentence precedes the sentences in which the interfering words are located. In addition, there may be a case in which the hit sentence does not precede the sentence in which the interfering word is located. Correspondingly, stopping matching and recording the matched words, with the interfering words removed, and the hit sentences in the sentence library to generate the matching library when the interference count reaches the preset number of times further comprises:
when the hit sentence does not precede the sentence in which the interfering word is located, comparing the hit rate of the hit sentence with a preset threshold;
if the hit rate is less than or equal to the preset threshold, removing the interfering words contained in the matched words to update the matched words, and generating a matching library from the updated matched words;
if the hit rate is greater than the preset threshold, removing the interfering words contained in the matched words to update the matched words, and generating a matching library from the updated matched words, the hit sentence and all sentences before the hit sentence.
Specifically, when the hit sentence does not precede the sentence in which the interfering word is located, the hit rate of the hit sentence needs to be calculated, where the hit rate is the ratio of the number of words hit in the hit sentence to the total number of words in the hit sentence. That is, the total number of words contained in the hit sentence is obtained, the number of hit words is extracted, and the hit rate is determined from these two counts. The hit sentences are then screened by hit rate to remove those whose hit rate does not exceed the preset threshold. Specifically, the hit sentence in each matching library is extracted and its hit rate is compared with the preset threshold. If the hit rate is greater than the preset threshold, the interfering words contained in the matched words are removed to update the matched words, and a matching library is generated from the updated matched words, the hit sentence and all sentences before the hit sentence. If the hit rate is less than or equal to the preset threshold, the matched word is judged to be a misinput and is deleted from the word library.
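The hit-rate screening can be sketched as below. The function names are hypothetical, the default threshold of 0.5 is an illustrative value only (the text does not give one), and words are compared case-insensitively as an assumption.

```python
def hit_rate(sentence_words, matched_words):
    """Ratio of words in the hit sentence that were hit to all words in the sentence."""
    if not sentence_words:
        return 0.0
    matched = {w.lower() for w in matched_words}
    hits = sum(1 for w in sentence_words if w.lower() in matched)
    return hits / len(sentence_words)

def keep_hit_sentence(sentence_words, matched_words, threshold=0.5):
    """True if the hit sentence should go into the matching library.

    The threshold default is a hypothetical value chosen for illustration.
    """
    return hit_rate(sentence_words, matched_words) > threshold
```

With the sentence `I love you` and matched words `I` and `you`, two of three words are hit, so the sentence passes a threshold of 0.5; a sentence with only one of three words hit does not.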
And S40, updating the word library and the sentence library according to the matching library, and repeating the step of matching the words in the updated word library with the updated sentence library until the word library and/or the matching library is/are empty.
Specifically, updating the word library refers to deleting from the word library the words that appear in the matching library, and updating the sentence library refers to deleting the hit sentences and all sentences before the hit sentences. Repeatedly matching the words in the updated word library with the updated sentence library means that steps S20-S40 are executed repeatedly until all the words in the word library have been matched. Updating the word library and the sentence library according to the matching library, and repeating the step of matching the words in the updated word library with the updated sentence library until the word library and/or the matching library is empty, specifically includes: extracting the matched words and hit sentences contained in the matching library, deleting the matched words from the word library and the hit sentences from the sentence library so as to update both libraries, and repeatedly matching the words in the updated word library with the updated sentence library until the word library and/or the matching library is empty.
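One round of the library update might look like the following sketch. The name `update_libraries` and the data shapes are assumptions: the word library is a list of `(input_number, word)` pairs, the sentence library maps matching numbers to word lists, and the matching library is reduced here to the set of matched input numbers plus the matching number of the last hit sentence. Sentences up to and including the hit sentence are dropped, following the first sentence of the paragraph above.

```python
def update_libraries(word_library, sentence_library, matched_numbers, last_hit_number):
    """Remove matched words and consumed sentences after one matching round.

    matched_numbers: input numbers of the words recorded in the matching library.
    last_hit_number: matching number of the hit sentence; it and every earlier
    sentence are removed from the sentence library.
    """
    remaining_words = [(n, w) for n, w in word_library if n not in matched_numbers]
    remaining_sentences = {
        n: s for n, s in sentence_library.items() if n > last_hit_number
    }
    return remaining_words, remaining_sentences

word_library = [(1, "I"), (2, "You"), (3, "are"), (4, "couple")]
sentence_library = {
    1: ["I", "love", "you"],
    2: ["You", "love", "me"],
    3: ["We", "are", "couple"],
}
new_words, new_sentences = update_libraries(word_library, sentence_library, {1, 2}, 1)
```

The caller would repeat matching on `new_words` and `new_sentences` until the word library is empty or a round produces an empty matching library.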
S50, acquiring all the matching libraries generated by matching, and generating a matching result according to all the acquired matching libraries.
Specifically, generating the matching result according to the matching libraries means analyzing the information in each matching library separately, obtaining a matching result for each matching library from that analysis, and summarizing the matching results of all the matching libraries to obtain the matching result of the input text and the matching text. Correspondingly, the step of acquiring all the matching libraries generated by matching and generating the matching result according to all the acquired matching libraries is specifically: determining the matching result of each matching library separately, and generating the matching result of the input text and the matching text according to the matching results of the matching libraries.
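The summarizing in step S50 can be pictured as merging one result per matching library into a single result. The sketch below is an assumption-laden illustration: the function name `merge_results` and the category keys `correct`, `case_error` and `missed` are hypothetical labels, and the sample values are the per-sentence results stated in steps H9 and H15 of the first embodiment.

```python
def merge_results(per_library_results):
    """Combine the per-matching-library results into one overall result."""
    merged = {}
    for result in per_library_results:
        for category, items in result.items():
            # Keep the order in which the matching libraries were generated.
            merged.setdefault(category, []).extend(items)
    return merged


# Results stated for the first and third sentences in the first embodiment.
first = {"correct": ["I"], "case_error": ["You"], "missed": ["love"]}
third = {"correct": ["are", "couple"], "missed": ["We"]}

print(merge_results([first, third]))
# {'correct': ['I', 'are', 'couple'], 'case_error': ['You'], 'missed': ['love', 'We']}
```

Sentences that were skipped entirely, such as "You love me" in the first embodiment, would also have to be reported as missed input; that bookkeeping is omitted here.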
In addition, in order to further explain the sentence matching method provided by the present invention, two specific embodiments are described below.
Example 1
In this embodiment, the matching text is "I love You. You love me. We are couple." The input text is "I you are couple". The sentence matching method comprises the following steps:
H1, performing grammar analysis on the matching text and the input text to obtain a word library and a sentence library; the word library is: 1. I, 2. you, 3. are, 4. couple; the sentence library is: 1. I love You; 2. You love me; 3. We are couple;
H2, matching the first word "I" with the sentence library to obtain the matching information corresponding to the first word, wherein the matching information is: first hit sentence: 1; hit marks: I (1.1); matched: I; interference count: 0; interference words: none; interference location: none;
H3, matching the second word "you" with the sentence library to obtain the matching information corresponding to the second word, wherein the matching information is: first hit sentence: 1; hit marks: I (1.1), you (1.3), you (2.1); matched: I, you; interference count: 0; interference words: none; interference location: none;
H4, matching the third word "are" with the sentence library to obtain the matching information corresponding to the third word, wherein the matching information is: first hit sentence: 1; hit marks: I (1.1), you (1.3), you (2.1); matched: I, you; interference count: 1 (jump to sentence 3); interference words: are; interference location: 3;
H5, matching the fourth word "couple" with the sentence library to obtain the matching information corresponding to the fourth word, wherein the matching information is: first hit sentence: 1; hit marks: I (1.1), you (1.3), you (2.1); matched: I, you; interference count: 2 (jump to sentence 3); interference words: are, couple; interference location: 3;
H6, stopping the matching because the interference count has reached the preset number of times;
H7, comparing the hit rates of the hit sentences, and determining the first sentence as the hit sentence according to the hit rate;
H8, judging whether the matched position precedes the interference position;
H9, the matched position precedes the interference position, so the matching result corresponding to the first sentence is determined, wherein the matching result is: correct input: I; case error: You; missed input: love;
H10, deleting the matched words from the word library, and deleting the hit sentence and all sentences before the hit sentence from the sentence library, to obtain an updated word library and an updated sentence library, wherein the updated word library comprises: 1. are, 2. couple; the updated sentence library comprises: 2. You love me; 3. We are couple;
H11, matching the first word "are" with the sentence library to obtain the matching information corresponding to the first word, wherein the matching information is: first hit sentence: 3; hit marks: are (3.2); matched: are;
H12, matching the second word "couple" with the sentence library to obtain the matching information corresponding to the second word, wherein the matching information is: first hit sentence: 3; hit marks: are (3.2), couple (3.3); matched: are, couple; interference count: 0; interference words: none; interference location: none;
H14, the input is now empty; comparing the hit rates of the hit sentences, and determining the third sentence as the hit sentence according to the hit rate;
H15, the matched position precedes the interference position, so the matching result corresponding to the third sentence is determined, wherein the matching result is: correct input: are, couple; missed input: We;
H16, determining the final matching result of the input text and the matching text according to the determined matching results, wherein the final result is: correct input: I, are, couple; case error: You; missed input: love; You love me; We.
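Step H1 can be illustrated with a small sketch that builds the two libraries. It assumes that the "grammar analysis" can be approximated by splitting the matching text at sentence-ending punctuation and the input text at whitespace; the actual analysis used by the method is not specified in this passage, and the function name `build_libraries` is hypothetical. The sample matching text is assumed to consist of the three sentences listed in the sentence library of step H1.

```python
import re


def build_libraries(input_text, matching_text):
    """Split the input text into words and the matching text into sentences."""
    word_library = input_text.split()
    sentence_library = [
        part.strip()
        for part in re.split(r"[.!?]", matching_text)
        if part.strip()  # drop the empty piece after the final period
    ]
    return word_library, sentence_library


words, sentences = build_libraries(
    "I you are couple",
    "I love You. You love me. We are couple.",
)
print(words)      # ['I', 'you', 'are', 'couple']
print(sentences)  # ['I love You', 'You love me', 'We are couple']
```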
Example 2
In this embodiment, the matching text is "I love You. You love me. We are couple." The input text is "We I love You". The sentence matching method comprises the following steps:
M1, performing grammar analysis on the matching text and the input text to obtain a word library and a sentence library; the word library is: 1. We, 2. I, 3. love, 4. You; the sentence library is: 1. I love You; 2. You love me; 3. We are couple;
M2, matching the first word "We" with the sentence library to obtain the matching information corresponding to the first word, wherein the matching information is: first hit sentence: 3; hit marks: We (3.1); matched: We; interference count: 0; interference words: none; interference location: none;
M3, matching the second word "I" with the sentence library to obtain the matching information corresponding to the second word, wherein the matching information is: first hit sentence: 3; hit marks: We (3.1); matched: We; interference count: 1; interference words: I; interference location: 1;
M4, matching the third word "love" with the sentence library to obtain the matching information corresponding to the third word, wherein the matching information is: first hit sentence: 3; hit marks: We (3.1); matched: We; interference count: 2; interference words: I, love; interference location: 1, (2.2);
M6, stopping the matching because the interference count has reached the preset number of times;
M7, comparing the hit rates of the hit sentences, and determining the third sentence as the hit sentence according to the hit rate;
M8, judging whether the matched position precedes the interference position;
M9, determining that the matched position does not precede the interference position and that the hit rate is one third, so the match on the third sentence is judged to be a jump sentence, and the matching result is output as: erroneous input: We;
M10, deleting the erroneous input from the word library to obtain an updated word library and an updated sentence library, wherein the updated word library comprises: 1. I, 2. love, 3. You; the updated sentence library comprises: 1. I love You; 2. You love me; 3. We are couple;
M11, matching the first word "I" with the sentence library to obtain the matching information corresponding to the first word, wherein the matching information is: first hit sentence: 1; hit marks: I (1.1); matched: I;
M12, matching the second word "love" with the sentence library to obtain the matching information corresponding to the second word, wherein the matching information is: first hit sentence: 1; hit marks: I (1.1), love (1.2), love (2.2); matched: I, love; interference count: 0; interference words: none; interference location: none;
M13, matching the third word "You" with the sentence library to obtain the matching information corresponding to the third word, wherein the matching information is: first hit sentence: 1; hit marks: I (1.1), love (1.2), You (1.3), love (2.2); matched: I, love, You; interference count: 0; interference words: none; interference location: none;
M14, the input is now empty; comparing the hit rates of the hit sentences, and determining the first sentence as the hit sentence according to the hit rate;
M15, the matched position precedes the interference position, so the matching result corresponding to the first sentence is determined, wherein the matching result is: correct input: I love You;
M16, determining the final matching result of the input text and the matching text according to the determined matching results, wherein the final result is: correct input: I love You; missed input: You love me; We are couple; erroneous input: We.
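The hit marks recorded in the steps above, written as (sentence.word) positions, can be reproduced with a short sketch. Two assumptions are made: positions are 1-based as in the examples, and words are compared without regard to case, which is suggested by the "case error" category but not stated as the matching rule. The function name `hit_marks` is hypothetical.

```python
def hit_marks(word, sentence_library):
    """Return (sentence number, word number) for every occurrence of word."""
    marks = []
    for sentence_no, sentence in enumerate(sentence_library, start=1):
        for word_no, candidate in enumerate(sentence.split(), start=1):
            # Case-insensitive comparison (assumption); case errors would be
            # detected separately by comparing the original spellings.
            if candidate.lower() == word.lower():
                marks.append((sentence_no, word_no))
    return marks


library = ["I love You", "You love me", "We are couple"]
print(hit_marks("love", library))  # [(1, 2), (2, 2)]
print(hit_marks("We", library))    # [(3, 1)]
```

With this library, "love" yields the marks 1.2 and 2.2 shown in step M12, and "We" yields the mark 3.1 shown in step M2.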
The present invention also provides a terminal device which, as shown in Fig. 3, comprises at least one processor 20, a display screen 21 and a memory 22, and may further comprise a communication interface 23 and a bus 24. The processor 20, the display screen 21, the memory 22 and the communication interface 23 may communicate with one another via the bus 24. The display screen 21 is configured to display a user guidance interface preset in the initial setting mode. The communication interface 23 may transmit information. The processor 20 may invoke logic instructions in the memory 22 to perform the methods of the embodiments described above.
Further, the logic instructions in the memory 22 described above may be implemented in the form of software functional units and, when sold or used as a stand-alone product, stored in a computer-readable storage medium.
The memory 22, as a computer-readable storage medium, may be configured to store software programs and computer-executable programs, such as the program instructions or modules corresponding to the methods in the embodiments of the present disclosure. The processor 20 performs functional applications and data processing, i.e. implements the methods of the embodiments described above, by running the software programs, instructions or modules stored in the memory 22.
The memory 22 may include a program storage area and a data storage area; the program storage area may store an operating system and at least one application program required for a function; the data storage area may store data created according to the use of the terminal device, etc. In addition, the memory 22 may include high-speed random access memory, and may also include nonvolatile memory, for example any of a variety of media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk; it may also be a transitory storage medium.
In addition, the specific process by which the instructions in the storage medium and in the terminal device are loaded and executed by the processor has been described in detail in the above method, and is not repeated here.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A sentence matching method, comprising:
acquiring an input text and a matched text, carrying out grammar analysis on the input text and the matched text, and determining a word library corresponding to the input text and a sentence library corresponding to the matched text;
sequentially matching words in a word library with the sentence library according to an input sequence, and recording matching information of each word, wherein the matching information comprises interference times; when the second word of the input text is an interference word, accumulating the interference times by one;
when the number of times of interference reaches a preset number of times, stopping matching and recording matched words with interference words removed and hit sentences in a sentence library to generate a matching library, wherein the matching library comprises matched words, matched sentences and all sentences arranged in front of the matched sentences, and the matched words do not comprise the interference words;
updating the word library and the sentence library according to the matching library, and repeating the step of matching the words in the updated word library with the updated sentence library until the word library and/or the matching library is empty;
and obtaining all the matching libraries generated by matching, and generating a matching result according to the obtained all the matching libraries.
2. The sentence matching method according to claim 1, wherein the matching the words in the word bank with the sentence bank sequentially according to the input order, and recording matching information of each word, wherein the matching information includes the number of interference specifically including:
matching the first word with the sentence library according to the input sequence of the words contained in the word library;
if the first word hits the statement library, the statement hit by the first word is marked as a first hit statement and the interference times are marked as 0;
matching the second word with the sentence library, and judging whether the second word is an interfering word or not;
and when the second word is an interference word, accumulating the interference times by one, and matching a third word with the sentence library until the interference times reach the preset times.
3. The sentence matching method according to claim 2, wherein the matching the second word with the sentence library, and determining whether the second word is an interfering word specifically includes:
matching the second word with a sentence library, and obtaining a hit mark of the second word;
determining a second hit statement corresponding to the second word according to the hit mark, and comparing a second position of the second hit statement with a first position of the first hit statement;
and when the second position is smaller than the first position, judging that the second word is an interference word.
4. The sentence matching method according to claim 3, wherein said matching the second word with the sentence library, determining whether the second word is an interfering word further includes:
when the second position is larger than the first position, detecting whether the first hit statement is matched;
and if the first hit statement is not matched, judging that the second word is an interference word.
5. The sentence matching method according to claim 1, wherein stopping matching and recording the matched words excluding the disturbing word and hit sentences in the sentence library to generate a matching library when the number of disturbing times reaches a preset number of times comprises:
when the number of times of interference reaches the preset number of times, stopping matching of the word library and the sentence library, and judging whether the hit sentence is prior to the sentence where the interference word is located;
when the hit sentence is prior to the sentence in which the interference word is located, extracting the matched word, the hit sentence and all sentences before the hit sentence;
and removing the disturbing words contained in the matched words to update the matched words, and generating a matching library according to the updated matched words, the hit sentences and all sentences before the hit sentences.
6. The sentence matching method according to claim 5, wherein stopping matching and recording the matched words excluding the disturbing word and the hit sentences in the sentence library to generate the matching library when the number of disturbing times reaches a preset number of times further comprises:
when the hit sentence is not prior to the sentence in which the interference word is located, comparing the hit rate of the hit sentence with a preset threshold value;
if the hit rate is smaller than or equal to the preset threshold value, removing the interference words contained in the matched words to update the matched words, and generating a matching library according to the updated matched words;
if the hit rate is larger than the preset threshold value, removing the interference words contained in the matched words to update the matched words, and generating a matching library according to the updated matched words, the hit statement and all statements before the hit statement.
7. The sentence matching method according to claim 1, wherein the steps of updating the word library and the sentence library according to the matching library, and repeating the step of matching the word in the updated word library with the updated sentence library until the word library and/or the matching library is empty include:
extracting matched words and hit sentences contained in the matching library, deleting the matched words in the word library and deleting the hit sentences in the sentence library so as to update the word library and the sentence library;
and repeatedly matching the words in the updated word library with the updated sentence library until all the words are matched.
8. The sentence matching method according to claim 1, wherein the step of obtaining all the matching libraries generated by the matching, and generating the matching result according to the obtained all the matching libraries is specifically:
and respectively determining the matching results of the matching libraries, and generating the matching results of the input text and the matching text according to the matching results of the matching libraries.
9. A computer-readable storage medium storing one or more programs executable by one or more processors to implement the steps in the sentence matching method of any of claims 1-8.
10. A terminal device, comprising: a processor, a memory, and a communication bus; the memory has stored thereon a computer readable program executable by the processor;
the communication bus realizes connection communication between the processor and the memory;
the processor, when executing the computer readable program, implements the steps of the sentence matching method as claimed in any one of claims 1-8.
CN201711473293.5A 2017-12-29 2017-12-29 Statement matching method, storage medium and terminal equipment Active CN110019735B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201711473293.5A CN110019735B (en) 2017-12-29 2017-12-29 Statement matching method, storage medium and terminal equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201711473293.5A CN110019735B (en) 2017-12-29 2017-12-29 Statement matching method, storage medium and terminal equipment

Publications (2)

Publication Number Publication Date
CN110019735A CN110019735A (en) 2019-07-16
CN110019735B true CN110019735B (en) 2023-06-23

Family

ID=67187168

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201711473293.5A Active CN110019735B (en) 2017-12-29 2017-12-29 Statement matching method, storage medium and terminal equipment

Country Status (1)

Country Link
CN (1) CN110019735B (en)

Also Published As

Publication number Publication date
CN110019735A (en) 2019-07-16

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 516006 TCL science and technology building, No. 17, Huifeng Third Road, Zhongkai high tech Zone, Huizhou City, Guangdong Province

Applicant after: TCL Technology Group Co.,Ltd.

Address before: 516006 Guangdong province Huizhou Zhongkai hi tech Development Zone No. nineteen District

Applicant before: TCL Corp.

GR01 Patent grant