WO2022116436A1 - Text semantic matching method and apparatus for long and short sentences, computer device and storage medium

Info

Publication number: WO2022116436A1
Application number: PCT/CN2021/083780
Authority: WO
Inventors: 谢静文; 阮晓雯; 徐亮
Original assignee: 平安科技（深圳）有限公司
Priority date: 2020-12-01
Filing date: 2021-03-30
Publication date: 2022-06-09
Also published as: CN112446218A

Abstract

A text semantic matching method and apparatus for long and short sentences, a computer device and a storage medium. The method comprises: obtaining a sentence to be matched and a target sample sentence, and comparing the length of characters to be matched corresponding to the sentence with the length of target sample characters corresponding to the target sample sentence (S10); when the length of the characters is less than the length of the target sample characters, recording the length of the characters as the window length of a sliding window (S20); sliding the sliding window on the target sample sentence, matching a target sample field of the target sample sentence covered by the sliding window with the sentence to obtain a first semantic distance result (S30); determining a first semantic score between the sentence and the target sample sentence according to the first semantic distance result corresponding to the sentence and the target sample sentence (S40); when the first semantic score exceeds a preset score threshold, recording the target sample sentence as a semantic matching sentence corresponding to the sentence (S50). The method improves the accuracy of semantic matching between long and short sentences.

Description

Method, device, computer equipment and storage medium for semantic matching of long and short sentences

This application claims priority to Chinese patent application No. 202011382663.6, filed on December 1, 2020 and entitled "Long and Short Sentence Text Semantic Matching Method, Device, Computer Equipment and Storage Medium", the entire content of which is incorporated herein by reference.

Technical field

The present application relates to the technical field of semantic parsing, and in particular, to a method, apparatus, computer device and storage medium for semantic matching of long and short sentences.

Background art

With the development of science and technology, the field of natural language processing technology has also gradually developed, and natural language processing technology has been widely used in various scenarios such as similar sentence matching and similar expression recall.

The inventors realized that, at present, in scenarios such as similar sentence matching and similar expression recall, text matching is usually performed by an end-to-end deep learning model or by unsupervised semantic matching, which directly outputs the semantic similarity between two sentences for comparison. Matching between short sentences uses end-to-end models or character matching methods. For matching between a longer text and a short sentence, however, the prior art often has to split the text into fields with the same number of characters as the short sentence and then match the similarity between the short sentence and each field. In this case an end-to-end model often cannot accurately cover all of the semantic information, while judging by character similarity alone easily leads to misjudgment, resulting in low semantic matching accuracy.

Summary of the application

Embodiments of the present application provide a method, an apparatus, a computer device, and a storage medium for semantic matching of long and short sentences, so as to solve the problem of low accuracy of semantic matching between long and short sentences.

A method for semantic matching of long and short sentences, including:

Obtain the sentence to be matched and the target sample sentence, and compare the length of the character to be matched corresponding to the sentence to be matched with the length of the target sample character corresponding to the target sample sentence;

When the length of the character to be matched is less than the length of the character of the target sample, the length of the character to be matched is recorded as the window length of the sliding window;

Sliding the sliding window on the target sample sentence, and matching the target sample field of the target sample sentence covered by the sliding window with the sentence to be matched to obtain a first semantic distance result;

Determine the first semantic score between the sentence to be matched and the target sample sentence according to the first semantic distance result corresponding to the sentence to be matched and the target sample sentence;

When the first semantic score exceeds a preset score threshold, the target sample sentence is recorded as a semantic matching sentence corresponding to the to-be-matched sentence.

A long-short sentence text semantic matching device, comprising:

a sentence obtaining module, configured to obtain a sentence to be matched and a target sample sentence, and compare the length of the character to be matched corresponding to the sentence to be matched with the length of the target sample character corresponding to the target sample sentence;

A window length recording module, configured to record the length of the character to be matched as the window length of the sliding window when the length of the character to be matched is less than the length of the character of the target sample;

A first sentence matching module, configured to slide the sliding window on the target sample sentence, and match the target sample field of the target sample sentence covered by the sliding window with the sentence to be matched to obtain a first semantic distance result;

The first semantic score determination module is used to determine the first semantic score between the sentence to be matched and the target sample sentence according to the first semantic distance result corresponding to the sentence to be matched and the target sample sentence;

A matching sentence determination module, configured to record the target sample sentence as a semantic matching sentence corresponding to the to-be-matched sentence when the first semantic score exceeds a preset score threshold.

A computer device comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:

One or more readable storage media storing computer-readable instructions, wherein the computer-readable instructions, when executed by one or more processors, cause the one or more processors to perform the following steps:

In the above method, apparatus, computer device and storage medium for semantic matching of long and short sentences, the method obtains a sentence to be matched and a target sample sentence, and compares the to-be-matched character length of the sentence to be matched with the target sample character length of the target sample sentence; when the to-be-matched character length is less than the target sample character length, records the to-be-matched character length as the window length of a sliding window; slides the sliding window on the target sample sentence and matches the target sample field covered by the sliding window with the sentence to be matched to obtain a first semantic distance result; determines a first semantic score between the sentence to be matched and the target sample sentence according to the first semantic distance result; and, when the first semantic score exceeds a preset score threshold, records the target sample sentence as a semantic matching sentence corresponding to the sentence to be matched.

The present application defines a sliding window indicator: the target sample field of the target sample sentence covered by the sliding window is matched with the sentence to be matched to obtain a first semantic distance result, and the first semantic score between the sentence to be matched and the target sample sentence is then determined according to that result, so as to determine whether part of the semantic information in the target sample sentence matches the sentence to be matched. In this way, a target sample sentence that would otherwise not be recalled (when the semantic similarity between the target sample sentence and the sentence to be matched is less than the preset similarity threshold, the target sample sentence would be directly judged not to match the sentence to be matched) has a chance of being recalled. This provides more sample data for target scenarios and also improves the accuracy of semantic matching between short sentences and long sentences.

The details of one or more embodiments of the application are set forth in the accompanying drawings and the description below, and other features and advantages of the application will be apparent from the description, drawings, and claims.

Description of drawings

In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application; those of ordinary skill in the art can obtain other drawings from these drawings without creative labor.

FIG. 1 is a schematic diagram of an application environment of a method for semantic matching of long and short sentences in an embodiment of the present application;

FIG. 2 is a flowchart of a method for semantic matching of long and short sentences in an embodiment of the present application;

FIG. 3 is a flowchart of step S30 in the method for semantic matching of long and short sentences in an embodiment of the present application;

FIG. 4 is a flowchart of step S40 in the method for semantic matching of long and short sentences in an embodiment of the present application;

FIG. 5 is another flowchart of a method for semantic matching of long and short sentences in an embodiment of the present application;

FIG. 6 is a schematic block diagram of a device for semantic matching of long and short sentences in an embodiment of the present application;

FIG. 7 is a schematic block diagram of a first sentence matching module in a device for semantic matching of long and short sentences in an embodiment of the present application;

FIG. 8 is a schematic block diagram of a first semantic score determination module in a device for semantic matching of long and short sentences in an embodiment of the present application;

FIG. 9 is a schematic diagram of a computer device in an embodiment of the present application.

Detailed description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application. Obviously, the described embodiments are part of the embodiments of the present application, not all of the embodiments. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of the present application.

The method for semantic matching of long and short sentences provided by the embodiments of the present application can be applied in the application environment shown in FIG. 1. Specifically, the method is applied in a long and short sentence text semantic matching system, which includes a client and a server as shown in FIG. 1 and is used to solve the problem of low accuracy of semantic matching between long and short sentences. The client, also called the user side, is the program that corresponds to the server and provides local services to the user. The client can be installed on, but is not limited to, personal computers, laptops, smartphones, tablets, and portable wearable devices. The server can be implemented as an independent server or as a server cluster composed of multiple servers.

In one embodiment, as shown in FIG. 2 , a method for semantic matching of long and short sentences is provided, and the method is applied to the server in FIG. 1 as an example for description, including the following steps:

S10: Obtain the sentence to be matched and the target sample sentence, and compare the length of the character to be matched corresponding to the sentence to be matched with the length of the target sample character corresponding to the target sample sentence.

The sentences to be matched may be sentences in different application scenarios. Exemplarily, in the field of multi-round intelligent interactive robots, the sentences to be matched may be sentences for robot reply. The target sample sentences can also be sentences in different application scenarios. Preferably, the target sample sentences and the sentences to be matched are sentences in the same application scenario. The length of the characters to be matched refers to the number of characters in the sentence to be matched; the length of the target sample character refers to the number of characters in the target sample sentence. Further, there is a large difference in character length between the sentence to be matched and the target sample sentence. For example, the character length of the sentence to be matched is 4-6 characters, while the character length of the target sample is 12-16 characters.

In a specific embodiment, before step S10, it further includes:

S01: Obtain a sentence to be matched and a target sample text; the target sample text includes multiple sentences.

The target sample text is the text to be checked for a sentence that semantically matches the sentence to be matched, and it contains multiple sentences. The target sample text is segmented at periods; that is, each stretch of text ending with a period is split off as one sentence (because a complete sentence usually carries one independent piece of semantic information). The sentence to be matched is generally a single sentence; if it contains multiple periods, it can be split in the same way.
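The period-based segmentation described above can be sketched as follows. This is only an illustrative sketch: the function name split_sentences is hypothetical, and treating both the Chinese full stop and the Western period as sentence boundaries is an assumption that would, for example, also split decimal numbers.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Split text into sentences, each ending at a period (assumed: '。' or '.')."""
    # Each match is a run of non-period characters plus an optional closing period.
    pieces = re.findall(r"[^。.]+[。.]?", text)
    # Drop whitespace-only fragments left between sentences.
    return [p.strip() for p in pieces if p.strip()]


if __name__ == "__main__":
    sample = "今天天气很好。我想去公园散步。"
    for sentence in split_sentences(sample):
        print(sentence)
```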

S02: Input the to-be-matched sentence and each of the sentences into a preset similarity recognition model, and determine the semantic similarity between the to-be-matched sentence and each of the sentences.

The preset similarity recognition model may be a model pre-trained by methods such as machine learning, and the preset similarity recognition model is used to determine the semantic similarity between two sentences.

Specifically, after the sentence to be matched and the target sample text are acquired, the sentence to be matched and each sentence in the target sample text are input into the preset similarity recognition model, which calculates the edit distance, or the Jaccard coefficient, between the sentence to be matched and each sentence to determine the semantic similarity between them.
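As one possible illustration of the edit distance calculation mentioned above, the following sketch computes the Levenshtein distance between two sentences and converts it into a similarity in the range 0 to 1. The function names and the normalization by the longer sentence length are assumptions made for illustration; the application does not specify how the preset similarity recognition model maps the distance to a similarity.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca with cb
        prev = curr
    return prev[-1]


def edit_similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 minus distance divided by the longer length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


if __name__ == "__main__":
    print(edit_similarity("我想申请退款", "我要申请退款"))
```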

S03: Determine the highest semantic similarity among the semantic similarities corresponding to the sentences to be matched and the sentences.

S04: When the highest semantic similarity is less than a preset similarity threshold and the difference between it and the preset similarity threshold is less than a preset similarity difference, record the sentence corresponding to the highest semantic similarity as the target sample sentence.

Wherein, the preset similarity threshold may be set according to the requirements of the actual application scenario. Exemplarily, the preset similarity threshold may be set to 0.9, 0.95, or the like. The preset similarity difference value can be any value from 0.1 to 0.5.

Understandably, after the sentence to be matched and each of the sentences are input into the preset similarity recognition model and the semantic similarity between the sentence to be matched and each sentence is determined, the sentence that best matches the sentence to be matched must be selected. The highest semantic similarity among the semantic similarities is therefore determined and compared with the preset similarity threshold. When the highest semantic similarity is less than the preset similarity threshold, the difference between the highest semantic similarity and the preset similarity threshold is determined and compared with the preset similarity difference; when the difference is less than the preset similarity difference, the sentence corresponding to the highest semantic similarity is recorded as the target sample sentence.

In the prior art, if the preset similarity recognition model determines that the semantic similarity between two sentences is less than the preset similarity threshold, the two sentences are directly judged to be dissimilar. In this embodiment, for a sentence whose semantic similarity is less than the preset similarity threshold, it is determined whether the difference between the semantic similarity and the preset similarity threshold is less than the preset similarity difference, and if so, a further semantic similarity judgment is performed through steps S10-S40.

Further, when the highest semantic similarity is less than the preset similarity threshold and the difference between it and the preset similarity threshold is greater than or equal to the preset similarity difference, the further semantic similarity judgment of steps S10-S40 is not performed.

In another specific embodiment, after determining the highest semantic similarity among the semantic similarities corresponding to the sentences to be matched and the sentences, the method further includes:

When the highest semantic similarity is greater than or equal to a preset similarity threshold, the sentence corresponding to the highest semantic similarity is recorded as a semantic matching sentence corresponding to the to-be-matched sentence.

Understandably, when the highest semantic similarity is greater than or equal to the preset similarity threshold, the sentence corresponding to the highest semantic similarity already semantically matches the sentence to be matched, so it is directly recorded as the semantic matching sentence corresponding to the sentence to be matched.
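The three outcomes of steps S03 and S04 and of the alternative embodiment can be summarized in a short sketch. The function name route_best_sentence, the decision labels, and the default values 0.9 and 0.3 (chosen from the example ranges given above) are illustrative assumptions, not part of the application.

```python
from typing import Optional


def route_best_sentence(
    similarities: dict[str, float],
    sim_threshold: float = 0.9,   # preset similarity threshold (example value)
    max_gap: float = 0.3,         # preset similarity difference (example value)
) -> tuple[str, Optional[str]]:
    """Return a (decision, sentence) pair for the best-scoring candidate."""
    if not similarities:
        return ("no_match", None)
    # S03: find the sentence with the highest semantic similarity.
    best_sentence = max(similarities, key=similarities.get)
    best = similarities[best_sentence]
    if best >= sim_threshold:
        # Direct semantic match; no sliding-window check is needed.
        return ("semantic_match", best_sentence)
    if sim_threshold - best < max_gap:
        # S04: close enough to the threshold, so pass on to steps S10-S40.
        return ("target_sample", best_sentence)
    # Too far below the threshold; steps S10-S40 are skipped.
    return ("no_match", None)


if __name__ == "__main__":
    print(route_best_sentence({"sentence A": 0.72, "sentence B": 0.41}))
```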

In a specific embodiment, before step S10, that is, before obtaining the length of the characters to be matched corresponding to the sentences to be matched, and the length of the target sample characters corresponding to each of the target sample sentences, the steps include:

(1) Obtaining a preset text recognition model; wherein the preset text recognition model may be a word2vec or BERT model trained on a large number of training samples, and is used to convert sentences into word vectors.

(2) Inputting the sentence to be matched into the preset text recognition model to obtain the to-be-matched word vectors corresponding to the sentence to be matched; at the same time, inputting the target sample sentence into the preset text recognition model to obtain the target sample word vectors corresponding to the target sample sentence.

Specifically, after the sentence to be matched and the target sample sentence are obtained, the preset text recognition model is obtained, the sentence to be matched is input into it, and word embedding processing is performed on the sentence to be matched; that is, the sentence to be matched is segmented into words and then converted into word vectors, giving the to-be-matched word vectors corresponding to the sentence to be matched. In the same way, the target sample sentence is input into the preset text recognition model and word embedding processing is performed on it to obtain the target sample word vectors corresponding to the target sample sentence.

(3) Determining the to-be-matched character length of the sentence to be matched according to the to-be-matched word vectors; and at the same time determining the target sample character length of the target sample sentence according to the target sample word vectors.

Specifically, after the to-be-matched word vectors and the target sample word vectors are obtained from the preset text recognition model, the to-be-matched character length of the sentence to be matched is determined from the number of to-be-matched word vectors, and the target sample character length of the target sample sentence is determined from the number of target sample word vectors.
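Steps (1) to (3) can be illustrated with a toy stand-in for the preset text recognition model. The sketch below does not use word2vec or BERT; toy_embed is a hypothetical placeholder that produces a deterministic pseudo-vector, and producing one vector per character is an assumption. The point shown is only that the character length is taken from the number of vectors.

```python
import hashlib


def toy_embed(token: str, dim: int = 4) -> list[float]:
    """Hypothetical placeholder for a trained embedding model."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return [byte / 255.0 for byte in digest[:dim]]


def sentence_vectors(sentence: str) -> list[list[float]]:
    """Assumption: one vector per non-space character."""
    return [toy_embed(ch) for ch in sentence if not ch.isspace()]


def char_length(vectors: list[list[float]]) -> int:
    """Step (3): the character length is the number of vectors."""
    return len(vectors)


if __name__ == "__main__":
    to_match = sentence_vectors("我想退款")
    target = sentence_vectors("您好我想咨询一下如何申请退款")
    print(char_length(to_match), char_length(target))
```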

S20: When the length of the character to be matched is less than the character length of the target sample, record the length of the character to be matched as the window length of the sliding window.

Specifically, after obtaining the sentence to be matched and the target sample sentence, and comparing the length of the character to be matched corresponding to the sentence to be matched with the length of the target sample character corresponding to the target sample sentence, if the length of the character to be matched is smaller than the target sample character length, the length of the character to be matched is recorded as the window length of the sliding window.
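A minimal sketch of the length comparison in steps S10 and S20 follows. The function name window_length is hypothetical, and returning None when the sentence to be matched is not shorter is an assumption, since this embodiment only describes the shorter case.

```python
from typing import Optional


def window_length(to_match: str, target_sample: str) -> Optional[int]:
    """Return the sliding-window length, or None if S20 does not apply."""
    to_match_len = len(to_match)          # to-be-matched character length
    target_len = len(target_sample)       # target sample character length
    if to_match_len < target_len:
        # S20: the window is exactly as long as the sentence to be matched.
        return to_match_len
    return None


if __name__ == "__main__":
    print(window_length("申请退款", "您好我想咨询一下如何申请退款"))
```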

S30: Slide the sliding window on the target sample sentence, and match the target sample field of the target sample sentence covered by the sliding window with the to-be-matched sentence to obtain a first semantic distance result.

The target sample field refers to the character segment of the target sample sentence covered by the sliding window. The semantic distance result indicates whether the target sample sentence contains key semantic information that matches the sentence to be matched.

Specifically, as shown in FIG. 3, in step S30, it includes:

S301: Align the start characters of the target sample sentence with the start characters of the to-be-matched sentence, and record the first target sample field covered by the sliding window as the first intercepted sentence.

Wherein, the starting character refers to the character at the starting position (ie, the first position) in the sentence.

Specifically, after the to-be-matched character length is recorded as the window length of the sliding window, the start character of the target sample sentence is aligned with the start character of the sentence to be matched, so that the sliding window starts covering the target sample sentence from its first character and no character information is missed. The sliding window is then placed on the target sample sentence, and the first target sample field it covers is recorded as the first intercepted sentence. Understandably, the length of the first intercepted sentence is equal to the window length, which is also equal to the to-be-matched character length.

S302: Perform semantic matching on the first intercepted sentence and the to-be-matched sentence to obtain a semantic result of the to-be-matched sentence and the first intercepted sentence.

Among them, a word sense result can be regarded as a word sense distance value, that is, it represents the word sense distance between each intercepted sentence and the sentence to be matched.

Specifically, after the start character of the target sample sentence is aligned with the start character of the sentence to be matched and the first target sample field covered by the sliding window is recorded as the first intercepted sentence, semantic matching is performed on the first intercepted sentence and the sentence to be matched to obtain a semantic result, which indicates whether the first intercepted sentence and the sentence to be matched are semantically similar. It should be noted that the semantic matching here is judged on the basis of sentence structure information, which indicates whether the character composition of the first intercepted sentence and the sentence to be matched is similar, or whether their structures are similar (for example, a subject-predicate-object structure). This information can be used as a supplement to semantic information.

S303: Slide the sliding window to the right by one character length on the target sample sentence, and record the second target sample field covered by the sliding window as the second intercepted sentence.

Specifically, once the semantic result of the sentence to be matched and the first intercepted sentence has been obtained, the semantic matching between the first intercepted sentence and the sentence to be matched is complete. The sliding window is then slid to the right by one character length on the target sample sentence, and the second target sample field covered by the sliding window is recorded as the second intercepted sentence. Understandably, the character length of the second intercepted sentence is equal to the character length of the sentence to be matched.

S304: Perform semantic matching on the second intercepted sentence and the to-be-matched sentence to obtain a semantic result of the to-be-matched sentence and the second intercepted sentence.

Specifically, after the sliding window is slid to the right by one character length on the target sample sentence and the second target sample field covered by the sliding window is recorded as the second intercepted sentence, semantic matching is performed on the second intercepted sentence and the sentence to be matched to obtain a semantic result, which indicates whether the second intercepted sentence and the sentence to be matched are semantically similar.

S305: When it is detected that the end character of the sliding window has been aligned with the end character of the target sample sentence, record all semantic results as first semantic distance results.

Among them, the end character refers to the last character in the sentence.

Specifically, after steps S301 to S304, if it is detected that the end character of the sliding window has been aligned with the end character of the target sample sentence, all semantic results are recorded as the first semantic distance result. If the end character of the current sliding window is not yet aligned with the end character of the target sample sentence, there are still unrecognized characters in the target sample sentence, and the sliding window continues to move until its end character is aligned with the end character of the target sample sentence.
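Steps S301 to S305 can be sketched as a loop over window positions. In this sketch the distance function is passed in as a parameter because the application does not fix a particular one; mismatch_ratio is a deliberately simple, hypothetical stand-in that compares characters position by position, and the function names are illustrative. For a window of length n on a target sample sentence of length m, the loop produces m - n + 1 semantic results.

```python
from typing import Callable


def sliding_window_distances(
    to_match: str,
    target_sample: str,
    distance: Callable[[str, str], float],
) -> list[float]:
    """Collect one distance per window position (S301-S305)."""
    window = len(to_match)  # S20: window length equals the to-be-matched length
    if window == 0 or window >= len(target_sample):
        raise ValueError("the sentence to be matched must be shorter and non-empty")
    results = []
    # The first window starts at the start character (S301); the last window
    # ends at the end character of the target sample sentence (S305).
    for start in range(len(target_sample) - window + 1):
        field = target_sample[start:start + window]   # intercepted sentence
        results.append(distance(to_match, field))     # S302 / S304
    return results


def mismatch_ratio(a: str, b: str) -> float:
    """Hypothetical distance: share of positions whose characters differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    print(sliding_window_distances("申请退款", "我想申请退款处理", mismatch_ratio))
```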

S40: Determine a first semantic score between the sentence to be matched and the target sample sentence according to the result of the first semantic distance corresponding to the sentence to be matched and the target sample sentence.

The first semantic score indicates the semantic similarity between the target sample sentence and the sentence to be matched. A higher first semantic score indicates that the target sample sentence contains more key semantic information matching the sentence to be matched.

Specifically, as shown in FIG. 4, step S40, that is, determining the first semantic score between the sentence to be matched and the target sample sentence according to the first semantic distance result corresponding to the sentence to be matched and the target sample sentence, includes:

S401: Differentiate the first semantic distance result corresponding to the sentence to be matched and the target sample sentence, to obtain a word meaning curve corresponding to the first semantic distance result.

The first semantic distance result can be regarded as a continuous density sequence formed by combining the multiple semantic results, and the word meaning curve corresponding to the first semantic distance result is obtained by taking the derivative of that sequence.

S402: Determine whether there is a word-meaning peak in the word-meaning curve through a peak-seeking identification algorithm.

The peak-seeking identification algorithm is used to find out whether a word-meaning peak appears in the word-meaning curve, and the word-meaning peak is used to represent the first semantic score between the sentence to be matched and the target sample sentence corresponding to the word-meaning curve. In this embodiment, the peak-seeking identification algorithm can perform a global search in the word meaning curve. During the global search process, if the word meaning curve has a point where the curve first rises and then falls, it is a word meaning peak.

Specifically, after the first semantic distance result corresponding to the sentence to be matched and the target sample sentence is differentiated and the word meaning curve is obtained, the peak-seeking identification algorithm is used to find out whether there is a word meaning peak in the word meaning curve. If there is, the first semantic score between the sentence to be matched and the target sample sentence corresponding to the word meaning curve can be determined according to the word meaning peak.

S403: When there is a word meaning peak in the word meaning curve, determine a first semantic score between the sentence to be matched and the target sample sentence corresponding to the word meaning curve according to the word meaning peak.

Specifically, after determining whether there is a word meaning peak in the word meaning curve by a peak-seeking identification algorithm, when there is a word meaning peak in the word meaning curve, according to the peak size of the word meaning peak, or the area occupied by the word meaning peak, Determine the first semantic score between the sentence to be matched and the target sample sentence corresponding to the semantic curve. Exemplarily, the larger the peak value of the word sense peak, the higher the first semantic score; or the larger the area occupied by the word sense peak, the higher the first semantic score.

S404: When the word meaning curve does not have a word meaning peak, determine that the first semantic score between the sentence to be matched and the target sample sentence corresponding to the word meaning curve is 0.

Specifically, after the peak-seeking identification algorithm has determined whether a word-meaning peak exists in the word-meaning curve, when no word-meaning peak exists, this indicates that the target sample sentence corresponding to the word-meaning curve does not match the sentence to be matched, and the first semantic score between the sentence to be matched and the target sample sentence corresponding to the word-meaning curve is determined to be 0.
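Steps S403 and S404 can be illustrated with the sketch below. The text only states that a larger peak gives a higher score, so the concrete mapping used here (100 times the height of the tallest peak, with curve values assumed to lie in [0, 1]) is a hypothetical choice, as is the function name `first_semantic_score`.

```python
from typing import Sequence


def first_semantic_score(curve: Sequence[float]) -> float:
    """Score a word-meaning curve: 0 without a peak, higher for taller peaks."""
    # Heights of all interior points that rise and then fall.
    peak_heights = [
        curve[i]
        for i in range(1, len(curve) - 1)
        if curve[i - 1] < curve[i] > curve[i + 1]
    ]
    if not peak_heights:
        return 0.0  # S404: no word-meaning peak means a score of 0
    # S403 (assumed mapping): scale the tallest peak to a 0-100 range.
    return 100.0 * max(peak_heights)


print(first_semantic_score([0.1, 0.5, 0.2]))  # 50.0
print(first_semantic_score([0.1, 0.2, 0.3]))  # 0.0
```

An area-based variant, which the text also permits, would accumulate the curve values around the peak instead of using its height alone.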

S50: When the first semantic score exceeds a preset score threshold, record the target sample sentence as a semantic matching sentence corresponding to the sentence to be matched.

The preset score threshold may be determined according to different application scenarios, and for example, the preset score threshold may be a value such as 90 or 95.

Specifically, after the first semantic score between the sentence to be matched and the target sample sentence is determined according to the first semantic distance result corresponding to the sentence to be matched and the target sample sentence, the first semantic score is compared with the preset score threshold. When the first semantic score exceeds the preset score threshold, the target sample sentence is recorded as the semantic matching sentence corresponding to the sentence to be matched; when the first semantic score does not exceed the preset score threshold, this indicates that the target sample sentence does not semantically match the sentence to be matched.
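The comparison in step S50 is a strict "exceeds" test, which the following sketch makes explicit. The function name `is_semantic_match` is hypothetical, and the default threshold of 90 is simply one of the example values mentioned above.

```python
def is_semantic_match(score: float, threshold: float = 90.0) -> bool:
    """Return True only when the score strictly exceeds the preset threshold."""
    return score > threshold


print(is_semantic_match(95.0))  # True
print(is_semantic_match(90.0))  # False: equal to the threshold does not exceed it
```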

In this embodiment, a sliding window indicator is defined so that the target sample field of the target sample sentence covered by the sliding window is matched with the sentence to be matched to obtain a first semantic distance result. The first semantic score between the sentence to be matched and the target sample sentence is then determined according to the first semantic distance result, so as to determine whether part of the semantic information in the target sample sentence matches the sentence to be matched. As a result, a target sample sentence that would not originally be recalled (when the semantic similarity between the target sample sentence and the sentence to be matched is less than the preset similarity threshold, the target sample sentence would be directly determined not to match the sentence to be matched) now has a chance of being recalled. This provides more sample data for target scenarios that lack samples, and also improves the accuracy of semantic matching between short sentences and long sentences, so that the semantic matching similarity between long and short sentences is higher.

In one embodiment, as shown in FIG. 5, after step S10, the method further includes:

S60: When the length of the character to be matched is greater than or equal to the character length of the target sample, record the character length of the target sample as the window length of the sliding window.

Specifically, after the sentence to be matched and the target sample sentence are obtained, and the length of the character to be matched corresponding to the sentence to be matched is compared with the target sample character length corresponding to the target sample sentence, if the length of the character to be matched is greater than or equal to the target sample character length, the target sample character length is recorded as the window length of the sliding window.
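The two branches of the length comparison (the shorter-sentence branch of the earlier embodiment and step S60 here) can be summarized in a short sketch. It assumes, for simplicity, that character length is the plain string length given by `len`, whereas the embodiments derive it from word vectors; the function name `choose_window` and the returned labels are hypothetical.

```python
from typing import Tuple


def choose_window(to_be_matched: str, target_sample: str) -> Tuple[int, str]:
    """Return (window length, name of the sentence the window slides over)."""
    if len(to_be_matched) < len(target_sample):
        # Shorter sentence to be matched: its length is the window length,
        # and the window slides over the target sample sentence.
        return len(to_be_matched), "target_sample"
    # S60: otherwise the target sample length is the window length,
    # and the window slides over the sentence to be matched.
    return len(target_sample), "to_be_matched"


print(choose_window("cd", "abcde"))   # (2, 'target_sample')
print(choose_window("abcde", "cd"))   # (2, 'to_be_matched')
```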

S70: Slide the sliding window on the sentence to be matched, and match the to-be-matched field of the sentence to be matched covered by the sliding window with the target sample sentence to obtain a second semantic distance result.

Specifically, when the length of the character to be matched is greater than or equal to the target sample character length, after the target sample character length is recorded as the window length of the sliding window, the starting character of the target sample sentence is aligned with the starting character of the sentence to be matched, and the first to-be-matched field covered by the sliding window (that is, the field in the sentence to be matched made up of as many characters as the window length) is recorded as the third intercepted sentence. Semantic matching is performed between the third intercepted sentence and the target sample sentence to obtain the semantic result of the target sample sentence and the third intercepted sentence. The sliding window is then slid to the right by one character length on the sentence to be matched, the second to-be-matched field covered by the sliding window is recorded as the fourth intercepted sentence, and the fourth intercepted sentence is semantically matched with the target sample sentence to obtain the semantic result of the target sample sentence and the fourth intercepted sentence. When it is detected that the end character of the sliding window is aligned with the end character of the sentence to be matched, all semantic results are recorded as the second semantic distance result.
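The sliding procedure of step S70 can be sketched as follows. The embodiments do not fix a particular semantic matching function at this point, so `char_overlap` below is only a toy stand-in (Jaccard overlap of character sets), and both function names are hypothetical.

```python
from typing import Callable, List


def char_overlap(a: str, b: str) -> float:
    """Toy stand-in for semantic matching: Jaccard overlap of character sets."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def slide_and_match(
    long_sentence: str,
    short_sentence: str,
    match: Callable[[str, str], float] = char_overlap,
) -> List[float]:
    """Slide a window of len(short_sentence) one character at a time."""
    window = len(short_sentence)
    if window == 0 or window > len(long_sentence):
        return []
    results: List[float] = []
    # The last start index aligns the window end with the sentence end.
    for start in range(len(long_sentence) - window + 1):
        field = long_sentence[start:start + window]  # intercepted sentence
        results.append(match(field, short_sentence))
    return results


# Windows over "abcde" with length 2: "ab", "bc", "cd", "de".
results = slide_and_match("abcde", "cd")
print(len(results))  # 4
```

In practice the stand-in would be replaced by whatever semantic matching model the deployment uses; only the windowing logic is the point of this sketch.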

S80: According to the second semantic distance result corresponding to the sentence to be matched and the target sample sentence, determine a second semantic score between the sentence to be matched and the target sample sentence.

The second semantic score indicates the semantic similarity between the target sample sentence and the sentence to be matched. A higher second semantic score indicates that the target sample sentence contains more key semantic information matching the sentence to be matched.

Specifically, after the sliding window is slid on the sentence to be matched and the to-be-matched field of the sentence to be matched covered by the sliding window is matched with the target sample sentence to obtain the second semantic distance result, derivation processing is performed on the second semantic distance result to obtain a word-meaning curve corresponding to the second semantic distance result. Whether a word-meaning peak exists in the word-meaning curve is then determined through the peak-seeking identification algorithm. When a word-meaning peak exists in the word-meaning curve, the second semantic score between the sentence to be matched and the target sample sentence corresponding to the word-meaning curve is determined according to the word-meaning peak. When no word-meaning peak exists in the word-meaning curve, the second semantic score between the sentence to be matched and the target sample sentence corresponding to the word-meaning curve is determined to be 0.

S90: When the second semantic score exceeds the preset score threshold, record the target sample sentence as a semantic matching sentence corresponding to the to-be-matched sentence.

Specifically, after the second semantic score between the sentence to be matched and the target sample sentence is determined according to the second semantic distance result corresponding to the sentence to be matched and the target sample sentence, the second semantic score is compared with the preset score threshold. When the second semantic score exceeds the preset score threshold, the target sample sentence is recorded as the semantic matching sentence corresponding to the sentence to be matched; when the second semantic score does not exceed the preset score threshold, this indicates that the target sample sentence does not semantically match the sentence to be matched.
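Putting the steps together, the sketch below chains window selection, sliding, peak detection, scoring and thresholding into one function. It is a simplified illustration under several assumptions that are not in the original text: string length stands in for character length, a character-set overlap stands in for semantic matching, the per-window results are used directly as the word-meaning curve, and the score is 100 times the tallest peak. All names are hypothetical.

```python
from typing import List


def overlap(a: str, b: str) -> float:
    """Toy stand-in for semantic matching (Jaccard overlap of characters)."""
    union = set(a) | set(b)
    return len(set(a) & set(b)) / len(union) if union else 0.0


def match_long_short(to_be_matched: str, target_sample: str,
                     threshold: float = 90.0) -> bool:
    """Return True when the target sample is recorded as a semantic match."""
    # The shorter sentence sets the window; the window slides over the longer.
    if len(to_be_matched) < len(target_sample):
        short, long_ = to_be_matched, target_sample
    else:
        short, long_ = target_sample, to_be_matched
    window = len(short)
    if window == 0:
        return False
    # One semantic result per window position.
    curve: List[float] = [
        overlap(long_[i:i + window], short)
        for i in range(len(long_) - window + 1)
    ]
    # Interior points that rise and then fall are treated as peaks.
    peaks = [curve[i] for i in range(1, len(curve) - 1)
             if curve[i - 1] < curve[i] > curve[i + 1]]
    score = 100.0 * max(peaks) if peaks else 0.0
    return score > threshold


print(match_long_short("cd", "abcde"))  # True
print(match_long_short("xy", "abcde"))  # False
```

Note that in this sketch a curve with fewer than three points has no interior point and therefore no peak, so two sentences of equal length always score 0; a real implementation would need to decide how to treat that boundary case.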

It should be understood that the size of the sequence numbers of the steps in the above embodiments does not mean the sequence of execution, and the execution sequence of each process should be determined by its function and internal logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.

In one embodiment, a long-short sentence text semantic matching device is provided, which corresponds one-to-one to the long-short sentence text semantic matching method in the above embodiment. As shown in FIG. 6, the long-short sentence text semantic matching device includes a sentence obtaining module 10, a first window length recording module 20, a first sentence matching module 30, a first semantic score determination module 40 and a first matching sentence determination module 50. The detailed description of each functional module is as follows:

The sentence obtaining module 10 is configured to obtain a sentence to be matched and a target sample sentence, and compare the length of the character to be matched corresponding to the sentence to be matched with the length of the target sample character corresponding to the target sample sentence;

The first window length recording module 20 is configured to record the length of the character to be matched as the window length of the sliding window when the length of the character to be matched is less than the length of the character of the target sample;

The first sentence matching module 30 is used to slide the sliding window on the target sample sentence, and match the target sample field of the target sample sentence covered by the sliding window with the sentence to be matched to obtain the first semantic distance result;

The first semantic score determination module 40 is configured to determine the first semantic score between the sentence to be matched and the target sample sentence according to the first semantic distance result corresponding to the sentence to be matched and the target sample sentence;

The first matching sentence determination module 50 is configured to record the target sample sentence as a semantic matching sentence corresponding to the to-be-matched sentence when the first semantic score exceeds a preset score threshold.

Preferably, the long-short sentence text semantic matching device further comprises:

a sample text obtaining module, used to obtain a sentence to be matched and a target sample text; the target sample text contains a plurality of sentences;

a semantic similarity determination module, configured to input the to-be-matched sentence and each of the sentences into a preset similarity recognition model to determine the semantic similarity between the to-be-matched sentence and each of the sentences;

a highest similarity determination module, used for determining the highest semantic similarity among the semantic similarities between the sentence to be matched and each of the sentences;

The target sample sentence determination module is used to record the sentence corresponding to the highest semantic similarity as the target sample sentence when the highest semantic similarity is smaller than the preset similarity threshold and the difference between the highest semantic similarity and the preset similarity threshold is smaller than the preset similarity difference.

The semantic matching sentence recording module is configured to record the sentence corresponding to the highest semantic similarity as the semantic matching sentence corresponding to the to-be-matched sentence when the highest semantic similarity is greater than or equal to a preset similarity threshold.

The text recognition model acquisition module is used to acquire the preset text recognition model;

A word vector determination module, configured to input the sentence to be matched into the preset text recognition model to obtain a to-be-matched word vector corresponding to the sentence to be matched, and to input the target sample sentence into the preset text recognition model to obtain the target sample word vector corresponding to the target sample sentence;

A character length determination module, configured to determine the length of the characters to be matched in the sentence to be matched according to each of the word vectors to be matched; at the same time, determine the length of the target sample characters of the target sample sentence according to each of the target sample word vectors.
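The routing performed by the highest similarity determination module, the target sample sentence determination module and the semantic matching sentence recording module listed above can be sketched as a three-way decision. The function name `route_candidate`, the returned labels and the numeric values are hypothetical, and the similarities are assumed to have been computed already by some similarity recognition model.

```python
from typing import Dict, Optional, Tuple


def route_candidate(
    similarities: Dict[str, float],
    similarity_threshold: float,
    similarity_difference: float,
) -> Tuple[str, Optional[str]]:
    """Decide what to do with the sentence that has the highest similarity."""
    if not similarities:
        return "no_candidate", None
    best = max(similarities, key=lambda s: similarities[s])
    best_sim = similarities[best]
    if best_sim >= similarity_threshold:
        # Already similar enough: record directly as the semantic match.
        return "direct_match", best
    if similarity_threshold - best_sim < similarity_difference:
        # Just below the threshold: pass on to sliding-window matching.
        return "target_sample", best
    return "no_candidate", None


print(route_candidate({"s1": 0.95, "s2": 0.50}, 0.9, 0.1))  # ('direct_match', 's1')
print(route_candidate({"s1": 0.85, "s2": 0.50}, 0.9, 0.1))  # ('target_sample', 's1')
print(route_candidate({"s1": 0.50}, 0.9, 0.1))              # ('no_candidate', None)
```

Only sentences that narrowly miss the similarity threshold are forwarded to the sliding-window stage, which keeps the more expensive window-by-window matching limited to borderline candidates.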

Preferably, as shown in Figure 7, the first sentence matching module 30 includes the following units:

Character alignment unit 301, for aligning the starting character of the target sample sentence with the starting character of the sentence to be matched, and recording the first target sample field covered by the sliding window as the first intercepted sentence;

a first semantic matching unit 302, configured to perform semantic matching on the first intercepted sentence and the to-be-matched sentence, to obtain a semantic result of the to-be-matched sentence and the first intercepted sentence;

Window sliding unit 303, for sliding the sliding window to the right by one character length on the target sample sentence, and recording the second target sample field covered by the sliding window as the second intercepted sentence;

The second semantic matching unit 304 is configured to perform semantic matching between the second intercepted sentence and the to-be-matched sentence to obtain a semantic result of the to-be-matched sentence and the second intercepted sentence;

The semantic distance result recording unit 305 is configured to record all semantic results as the first semantic distance result when it is detected that the end character of the sliding window has been aligned with the end character of the target sample sentence.

Preferably, as shown in FIG. 8 , the first semantic score determination module 40 includes:

A word meaning curve determination unit 401, configured to perform derivation processing on the first word meaning distance result corresponding to the sentence to be matched and the target sample sentence, to obtain a word meaning curve corresponding to the first word meaning distance result;

A word meaning peak determining unit 402, configured to determine whether there is a word meaning peak in the word meaning curve through a peak-seeking identification algorithm;

A first semantic score determination unit 403, configured to determine, according to the word meaning peak, a first semantic score between the sentence to be matched and the target sample sentence corresponding to the word meaning curve when there is a word meaning peak in the word meaning curve;

The second semantic score determining unit 404 is configured to determine that the first semantic score between the sentence to be matched and the target sample sentence corresponding to the word meaning curve is 0 when the word meaning curve does not have a word meaning peak.

Preferably, the long-short sentence text semantic matching device further comprises:

A second window length recording module, configured to record the length of the target sample character as the window length of the sliding window when the length of the character to be matched is greater than or equal to the length of the target sample character;

The second sentence matching module is used to slide the sliding window on the sentence to be matched, and match the to-be-matched field of the sentence to be matched covered by the sliding window with the target sample sentence to obtain the second semantic distance result;

A second semantic score determination module, configured to determine a second semantic score between the to-be-matched sentence and the target sample sentence according to the second semantic distance result corresponding to the to-be-matched sentence and the target sample sentence;

The second matching sentence determination module is configured to record the target sample sentence as a semantic matching sentence corresponding to the to-be-matched sentence when the second semantic score exceeds the preset score threshold.

For the specific limitation of the apparatus for semantic matching of long and short sentences, please refer to the above limitation on the method for semantic matching of long and short sentences, which will not be repeated here. Each module in the above-mentioned apparatus for semantic matching of long and short sentences can be implemented in whole or in part by software, hardware and combinations thereof. The above modules can be embedded in or independent of the processor in the computer device in the form of hardware, or stored in the memory in the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.

In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 9. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a readable storage medium and an internal memory. The readable storage medium stores an operating system, computer-readable instructions and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the readable storage medium. The database of the computer device stores the data used in the long-short sentence text semantic matching method of the above embodiment. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer-readable instructions, when executed by the processor, implement a method for semantic matching of long and short sentences. The readable storage medium provided by this embodiment includes a non-volatile readable storage medium and a volatile readable storage medium.

In one embodiment, there is provided a computer device comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:

In one embodiment, one or more readable storage media are provided having computer-readable instructions stored thereon, wherein the computer-readable instructions, when executed by one or more processors, cause the one or more processors to perform the following steps:

Those of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be implemented by instructing relevant hardware through computer-readable instructions. The computer-readable instructions can be stored in a non-volatile computer-readable storage medium or a volatile computer-readable storage medium, and, when executed, may include the processes of the foregoing method embodiments. Any reference to memory, storage, database or other medium used in the various embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in various forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Those skilled in the art can clearly understand that, for convenience and brevity of description, the division into the above functional units and modules is used only as an example. In practical applications, the above functions may be allocated to different functional units or modules for completion, that is, the internal structure of the device may be divided into different functional units or modules to complete all or part of the functions described above.

The above embodiments are only used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that the technical solutions described in the above embodiments can still be modified, or some of their technical features can be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to deviate from the spirit and scope of the technical solutions in the embodiments of the present application, and should be included within the scope of protection of this application.

Claims

A method for semantic matching of long and short sentences, including:

Obtain the sentence to be matched and the target sample sentence, and compare the length of the character to be matched corresponding to the sentence to be matched with the length of the target sample character corresponding to the target sample sentence;

When the length of the character to be matched is less than the length of the character of the target sample, the length of the character to be matched is recorded as the window length of the sliding window;

Sliding the sliding window on the target sample sentence, and matching the target sample field of the target sample sentence covered by the sliding window with the sentence to be matched to obtain a first semantic distance result;

Determine the first semantic score between the sentence to be matched and the target sample sentence according to the first semantic distance result corresponding to the sentence to be matched and the target sample sentence;

When the first semantic score exceeds a preset score threshold, the target sample sentence is recorded as a semantic matching sentence corresponding to the to-be-matched sentence.
The method for semantic matching of long and short sentences according to claim 1, wherein before acquiring the sentence to be matched and the target sample sentence and comparing the length of the character to be matched corresponding to the sentence to be matched with the length of the target sample character corresponding to the target sample sentence, the method further comprises:

Obtain the sentence to be matched and the target sample text; the target sample text contains multiple sentences;

Inputting the to-be-matched sentence and each of the sentences into a preset similarity recognition model to determine the semantic similarity between the to-be-matched sentence and each of the sentences;

determining the highest semantic similarity among the semantic similarities corresponding to the sentences to be matched and the sentences;

When the highest semantic similarity is smaller than the preset similarity threshold and the difference between it and the preset similarity threshold is smaller than the preset similarity difference, the sentence corresponding to the highest semantic similarity is recorded as target sample sentence.
The method for semantic matching of long and short sentences as claimed in claim 2, wherein after determining the highest semantic similarity among the semantic similarities corresponding to the sentences to be matched and the sentences, the method further comprises:

When the highest semantic similarity is greater than or equal to a preset similarity threshold, the sentence corresponding to the highest semantic similarity is recorded as a semantic matching sentence corresponding to the to-be-matched sentence.
The method for semantic matching of long-short sentence text according to claim 1, wherein before acquiring the length of the characters to be matched corresponding to the sentences to be matched and the length of the target sample characters corresponding to each of the target sample sentences, the method comprises:

Get the preset text recognition model;

Inputting the to-be-matched sentence into the preset text recognition model to obtain a to-be-matched word vector corresponding to the to-be-matched sentence; at the same time, inputting the target sample sentence into the preset text recognition model, obtaining the target sample word vector corresponding to the target sample sentence;

The to-be-matched character length of the to-be-matched sentence is determined according to each of the to-be-matched word vectors; at the same time, the target sample character length of the target sample sentence is determined according to each of the target sample word vectors.
The method for semantic matching of long and short sentences according to claim 1, wherein sliding the sliding window on the target sample sentence and matching the target sample field of the target sample sentence covered by the sliding window with the sentence to be matched to obtain the first semantic distance result comprises:

Align the starting character of the target sample sentence with the starting character of the sentence to be matched, and record the first target sample field covered by the sliding window as the first intercepted sentence;

The first intercepted sentence and the to-be-matched sentence are semantically matched to obtain the semantic result of the to-be-matched sentence and the first intercepted sentence;

On the target sample sentence, slide the sliding window to the right by one character length, and record the second target sample field covered by the sliding window as the second intercepted sentence;

The second intercepted sentence and the to-be-matched sentence are semantically matched to obtain the semantic result of the to-be-matched sentence and the second intercepted sentence;

When it is detected that the end character of the sliding window has been aligned with the end character of the target sample sentence, all semantic results are recorded as the first semantic distance result.
The method for semantic matching of long and short sentences according to claim 1, wherein determining the first semantic score between the sentence to be matched and the target sample sentence according to the first semantic distance result corresponding to the sentence to be matched and the target sample sentence comprises:

Perform derivation processing on the first word sense distance result corresponding to the sentence to be matched and the target sample sentence, to obtain a word meaning curve corresponding to the first word sense distance result;

Determine whether there is a word-meaning peak in the word-meaning curve by using a peak-seeking identification algorithm;

When there is a word meaning peak in the word meaning curve, determine the first semantic score between the sentence to be matched and the target sample sentence corresponding to the word meaning curve according to the word meaning peak;

When the word meaning curve does not have a word meaning peak, it is determined that the first semantic score between the sentence to be matched and the target sample sentence corresponding to the word meaning curve is 0.
The method for semantic matching of long-short sentence text according to claim 1, wherein after the length comparison of the length of the characters to be matched corresponding to the sentence to be matched and the length of the target sample character corresponding to the target sample sentence, the method further comprises:

When the length of the character to be matched is greater than or equal to the length of the target sample character, recording the length of the target sample character as the window length of the sliding window;

sliding the sliding window on the to-be-matched sentence, and matching the to-be-matched field of the to-be-matched sentence covered by the sliding window with the target sample sentence to obtain a second lexical distance result;

According to the second semantic distance result corresponding to the to-be-matched sentence and the target sample sentence, determine a second semantic score between the to-be-matched sentence and the target sample sentence;

When the second semantic score exceeds the preset score threshold, the target sample sentence is recorded as a semantic matching sentence corresponding to the to-be-matched sentence.
A long-short sentence text semantic matching device, comprising:

a sentence obtaining module, configured to obtain a sentence to be matched and a target sample sentence, and compare the length of the character to be matched corresponding to the sentence to be matched with the length of the target sample character corresponding to the target sample sentence;

a first window length recording module, configured to record the length of the character to be matched as the window length of the sliding window when the length of the character to be matched is less than the length of the character of the target sample;

The first sentence matching module is used to slide the sliding window on the target sample sentence, and match the target sample field of the target sample sentence covered by the sliding window with the sentence to be matched to obtain the first semantic distance result;

a first semantic score determination module, configured to determine a first semantic score between the to-be-matched sentence and the target sample sentence according to the first semantic distance result corresponding to the to-be-matched sentence and the target sample sentence;

The first matching sentence determination module is configured to record the target sample sentence as a semantic matching sentence corresponding to the to-be-matched sentence when the first semantic score exceeds a preset score threshold.
A computer device comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:

Obtain the sentence to be matched and the target sample sentence, and compare the length of the character to be matched corresponding to the sentence to be matched with the length of the target sample character corresponding to the target sample sentence;

When the length of the character to be matched is less than the length of the character of the target sample, the length of the character to be matched is recorded as the window length of the sliding window;

Sliding the sliding window on the target sample sentence, and matching the target sample field of the target sample sentence covered by the sliding window with the sentence to be matched to obtain a first semantic distance result;

Determine the first semantic score between the sentence to be matched and the target sample sentence according to the first semantic distance result corresponding to the sentence to be matched and the target sample sentence;

When the first semantic score exceeds a preset score threshold, the target sample sentence is recorded as a semantic matching sentence corresponding to the to-be-matched sentence.
The computer device according to claim 9, wherein before acquiring the sentence to be matched and the target sample sentence and comparing the length of the character to be matched corresponding to the sentence to be matched with the length of the target sample character corresponding to the target sample sentence, the processor further implements the following steps when executing the computer-readable instructions:

Obtain the sentence to be matched and the target sample text; the target sample text contains multiple sentences;

Inputting the to-be-matched sentence and each of the sentences into a preset similarity recognition model to determine the semantic similarity between the to-be-matched sentence and each of the sentences;

Determining the highest semantic similarity among the semantic similarities between the to-be-matched sentence and each of the sentences;

When the highest semantic similarity is smaller than the preset similarity threshold and the difference between it and the preset similarity threshold is smaller than the preset similarity difference, the sentence corresponding to the highest semantic similarity is recorded as the target sample sentence.
The computer device according to claim 10, wherein after determining the highest semantic similarity among the semantic similarities between the to-be-matched sentence and each of the sentences, the processor further implements the following steps when executing the computer-readable instructions:

When the highest semantic similarity is greater than or equal to a preset similarity threshold, the sentence corresponding to the highest semantic similarity is recorded as a semantic matching sentence corresponding to the to-be-matched sentence.
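The pre-screening described in the two preceding claims amounts to a three-way decision on the best-scoring sentence. The sketch below is illustrative only: `prescreen`, `word_overlap`, and the decision labels are hypothetical names, and the word-overlap measure merely stands in for the preset similarity recognition model.

```python
from typing import Callable, List, Optional, Tuple


def word_overlap(a: str, b: str) -> float:
    """Toy stand-in for the similarity recognition model (assumption):
    Jaccard overlap of the whitespace-separated word sets."""
    sa, sb = set(a.split()), set(b.split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def prescreen(
    query: str,
    sentences: List[str],
    similarity: Callable[[str, str], float],
    sim_threshold: float,
    max_gap: float,
) -> Tuple[str, Optional[str]]:
    """Classify the best-scoring sentence of the target sample text.

    Returns ("match", s) when the highest similarity reaches the threshold,
    ("candidate", s) when it falls short by less than `max_gap` (s then
    becomes the target sample sentence for the sliding-window stage),
    and ("none", None) otherwise.
    """
    if not sentences:
        return ("none", None)
    best_sim, best = max(
        ((similarity(query, s), s) for s in sentences), key=lambda pair: pair[0]
    )
    if best_sim >= sim_threshold:
        return ("match", best)
    if sim_threshold - best_sim < max_gap:
        return ("candidate", best)
    return ("none", None)
```

Only near misses are passed on to the sliding-window stage, which keeps the more expensive window-by-window matching away from sentences that are clearly unrelated.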
The computer device according to claim 9, wherein before the to-be-matched character length of the sentence to be matched is compared with the target sample character length of the target sample sentence, the processor further implements the following steps when executing the computer-readable instructions:

Obtain a preset text recognition model;

Inputting the to-be-matched sentence into the preset text recognition model to obtain the to-be-matched word vectors corresponding to the to-be-matched sentence; at the same time, inputting the target sample sentence into the preset text recognition model to obtain the target sample word vectors corresponding to the target sample sentence;

The to-be-matched character length of the to-be-matched sentence is determined according to each of the to-be-matched word vectors; at the same time, the target sample character length of the target sample sentence is determined according to each of the target sample word vectors.
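One plausible reading of the steps above is that the model emits one vector per character, so that the character length is simply the number of vectors. The sketch below assumes exactly that; `toy_char_vectors` is a hypothetical stand-in for the preset text recognition model, and a real model may tokenize differently.

```python
from typing import List


def toy_char_vectors(sentence: str) -> List[List[float]]:
    """Hypothetical stand-in for the text recognition model (assumption):
    emits one small vector per character of the sentence."""
    return [[float(ord(ch)), 1.0] for ch in sentence]


def char_length(vectors: List[List[float]]) -> int:
    """Character length under the one-vector-per-character assumption."""
    return len(vectors)


query_len = char_length(toy_char_vectors("理赔进度"))
sample_len = char_length(toy_char_vectors("请问我的理赔进度到哪一步了"))
use_sliding_window = query_len < sample_len  # window length would be query_len
```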
The computer device according to claim 9, wherein sliding the sliding window on the target sample sentence and matching the target sample field of the target sample sentence covered by the sliding window with the sentence to be matched to obtain the first semantic distance result comprises:

Align the starting character of the target sample sentence with the starting character of the sentence to be matched, and record the first target sample field covered by the sliding window as the first intercepted sentence;

The first intercepted sentence and the to-be-matched sentence are semantically matched to obtain the semantic result of the to-be-matched sentence and the first intercepted sentence;

On the target sample sentence, slide the sliding window to the right by one character length, and record the second target sample field covered by the sliding window as the second intercepted sentence;

The second intercepted sentence and the to-be-matched sentence are semantically matched to obtain the semantic result of the to-be-matched sentence and the second intercepted sentence;

When it is detected that the end character of the sliding window has been aligned with the end character of the target sample sentence, all semantic results are recorded as the first semantic distance result.
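The window movement described above can be sketched as a generator that yields each intercepted sentence in order, stopping once the window's end reaches the end of the target sample sentence. The names `window_fields` and `semantic_distance_result` are illustrative, and the `match` callable is an assumption standing in for the semantic matching step.

```python
from typing import Callable, Iterator, List


def window_fields(sample: str, window: int) -> Iterator[str]:
    """Yield the field covered by the window at each position, moving one
    character to the right per step, from the first character of `sample`
    until the window's end aligns with the last character."""
    if window <= 0 or window > len(sample):
        raise ValueError("window must be between 1 and len(sample)")
    for start in range(len(sample) - window + 1):
        yield sample[start:start + window]


def semantic_distance_result(
    query: str, sample: str, match: Callable[[str, str], float]
) -> List[float]:
    """Collect one semantic result per intercepted sentence, in slide order."""
    return [match(query, field) for field in window_fields(sample, len(query))]
```

A sample of n characters and a window of w characters produce n - w + 1 intercepted sentences, so `window_fields("abcde", 3)` yields `"abc"`, `"bcd"`, and `"cde"`.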
The computer device according to claim 9, wherein determining the first semantic score between the to-be-matched sentence and the target sample sentence according to the first semantic distance result corresponding to the to-be-matched sentence and the target sample sentence comprises:

Perform derivation processing on the first semantic distance result corresponding to the sentence to be matched and the target sample sentence, to obtain a word meaning curve corresponding to the first semantic distance result;

Determine whether there is a word-meaning peak in the word-meaning curve by using a peak-seeking identification algorithm;

When there is a word meaning peak in the word meaning curve, determine the first semantic score between the sentence to be matched and the target sample sentence corresponding to the word meaning curve according to the word meaning peak;

When the word meaning curve does not have a word meaning peak, it is determined that the first semantic score between the sentence to be matched and the target sample sentence corresponding to the word meaning curve is 0.
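The claim does not spell out the derivation or the peak-seeking algorithm, so the following is only one possible reading. It assumes that the derivation is a discrete first difference over the per-window results, that a peak is an interior point where that difference changes from positive to negative, and that the score is the value at the highest peak. All function names are hypothetical.

```python
from typing import List


def first_differences(values: List[float]) -> List[float]:
    """Discrete derivative: the change between consecutive window results."""
    return [b - a for a, b in zip(values, values[1:])]


def find_peaks(values: List[float]) -> List[int]:
    """Indices of interior points where the slope turns from rising to falling."""
    diffs = first_differences(values)
    return [
        i + 1
        for i in range(len(diffs) - 1)
        if diffs[i] > 0 and diffs[i + 1] < 0
    ]


def semantic_score(values: List[float]) -> float:
    """Value at the highest peak, or 0 when the curve has no peak."""
    peaks = find_peaks(values)
    return max(values[i] for i in peaks) if peaks else 0.0
```

Under these assumptions a flat or monotonic sequence of window results has no peak and scores 0, which mirrors the last step above. A best match sitting in the very first or very last window would also go undetected by this simple sign-change test, so a practical implementation might need to handle the boundaries separately.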
One or more readable storage media storing computer-readable instructions, wherein the computer-readable instructions, when executed by one or more processors, cause the one or more processors to perform the following steps:

Obtain the sentence to be matched and the target sample sentence, and compare the to-be-matched character length of the sentence to be matched with the target sample character length of the target sample sentence;

When the to-be-matched character length is less than the target sample character length, the to-be-matched character length is recorded as the window length of the sliding window;

Sliding the sliding window on the target sample sentence, and matching the target sample field of the target sample sentence covered by the sliding window with the sentence to be matched to obtain a first semantic distance result;

Determine the first semantic score between the sentence to be matched and the target sample sentence according to the first semantic distance result corresponding to the sentence to be matched and the target sample sentence;

When the first semantic score exceeds a preset score threshold, the target sample sentence is recorded as a semantic matching sentence corresponding to the to-be-matched sentence.
The readable storage medium according to claim 15, wherein before obtaining the sentence to be matched and the target sample sentence and comparing the to-be-matched character length of the sentence to be matched with the target sample character length of the target sample sentence, the computer-readable instructions, when executed by one or more processors, cause the one or more processors to further perform the following steps:

Obtain the sentence to be matched and the target sample text; the target sample text contains multiple sentences;

Inputting the to-be-matched sentence and each of the sentences into a preset similarity recognition model to determine the semantic similarity between the to-be-matched sentence and each of the sentences;

Determining the highest semantic similarity among the semantic similarities between the to-be-matched sentence and each of the sentences;

When the highest semantic similarity is smaller than the preset similarity threshold and the difference between it and the preset similarity threshold is smaller than the preset similarity difference, the sentence corresponding to the highest semantic similarity is recorded as the target sample sentence.
The readable storage medium according to claim 16, wherein after determining the highest semantic similarity among the semantic similarities between the to-be-matched sentence and each of the sentences, the computer-readable instructions, when executed by one or more processors, cause the one or more processors to further perform the following steps:

When the highest semantic similarity is greater than or equal to a preset similarity threshold, the sentence corresponding to the highest semantic similarity is recorded as a semantic matching sentence corresponding to the to-be-matched sentence.
The readable storage medium according to claim 15, wherein before the to-be-matched character length of the sentence to be matched is compared with the target sample character length of the target sample sentence, the computer-readable instructions, when executed by one or more processors, cause the one or more processors to further perform the following steps:

Obtain a preset text recognition model;

Inputting the to-be-matched sentence into the preset text recognition model to obtain the to-be-matched word vectors corresponding to the to-be-matched sentence; at the same time, inputting the target sample sentence into the preset text recognition model to obtain the target sample word vectors corresponding to the target sample sentence;

The to-be-matched character length of the to-be-matched sentence is determined according to each of the to-be-matched word vectors; at the same time, the target sample character length of the target sample sentence is determined according to each of the target sample word vectors.
The readable storage medium according to claim 15, wherein sliding the sliding window on the target sample sentence and matching the target sample field of the target sample sentence covered by the sliding window with the sentence to be matched to obtain the first semantic distance result comprises:

Align the starting character of the target sample sentence with the starting character of the sentence to be matched, and record the first target sample field covered by the sliding window as the first intercepted sentence;

The first intercepted sentence and the to-be-matched sentence are semantically matched to obtain the semantic result of the to-be-matched sentence and the first intercepted sentence;

On the target sample sentence, slide the sliding window to the right by one character length, and record the second target sample field covered by the sliding window as the second intercepted sentence;

The second intercepted sentence and the to-be-matched sentence are semantically matched to obtain the semantic result of the to-be-matched sentence and the second intercepted sentence;

When it is detected that the end character of the sliding window has been aligned with the end character of the target sample sentence, all semantic results are recorded as the first semantic distance result.
The readable storage medium according to claim 15, wherein determining the first semantic score between the to-be-matched sentence and the target sample sentence according to the first semantic distance result corresponding to the to-be-matched sentence and the target sample sentence comprises:

Perform derivation processing on the first semantic distance result corresponding to the sentence to be matched and the target sample sentence, to obtain a word meaning curve corresponding to the first semantic distance result;

Determine whether there is a word-meaning peak in the word-meaning curve by using a peak-seeking identification algorithm;

When there is a word meaning peak in the word meaning curve, determine the first semantic score between the sentence to be matched and the target sample sentence corresponding to the word meaning curve according to the word meaning peak;

When the word meaning curve does not have a word meaning peak, it is determined that the first semantic score between the sentence to be matched and the target sample sentence corresponding to the word meaning curve is 0.