CN113656547A - Text matching method, device, equipment and storage medium - Google Patents


Info

Publication number
CN113656547A
CN113656547A
Authority
CN
China
Prior art keywords: statement, text, sentence, information, search
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110942420.1A
Other languages
Chinese (zh)
Other versions
CN113656547B (en)
Inventor
沈越 (Shen Yue)
Current Assignee
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN202110942420.1A
Publication of CN113656547A
Application granted; publication of CN113656547B
Legal status: Active

Classifications

    • G06F16/3344 — Information retrieval; query execution using natural language analysis
    • G06F16/36 — Creation of semantic tools, e.g. ontology or thesauri
    • G06N3/045 — Neural network architectures; combinations of networks
    • Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to artificial intelligence and provides a text matching method, apparatus, device, and storage medium. When a text matching request is received, the method obtains a search statement from the request and obtains the length requirement of a statement dimension reduction model. The search statement is encoded according to the length requirement to obtain a statement code, the statement code is analyzed by the statement dimension reduction model to obtain statement information, and the statement information is normalized to obtain statement features. Texts to be selected and the corresponding information to be selected are obtained from the request, and the information to be selected is filtered to obtain features to be selected. The text similarity between the search statement and each text to be selected is then calculated from the statement features and the features to be selected, and the text to be selected with the greatest text similarity is determined as the target text. The invention improves both text matching efficiency and matching accuracy. The invention further relates to blockchain technology: the target text can be stored in a blockchain.

Description

Text matching method, device, equipment and storage medium
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a text matching method, a text matching device, text matching equipment and a storage medium.
Background
Text matching refers to selecting, from a knowledge base, a text whose semantics are similar to those of a search sentence; it can improve a user's reading efficiency. In current implementations, the search sentence and every candidate text are jointly processed by a BERT model to select the best match. However, this approach repeats many processing steps and trains a large number of BERT parameters, so matching efficiency is low.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a text matching method, apparatus, device, and storage medium that can improve matching efficiency and matching accuracy.
In one aspect, the present invention provides a text matching method, where the text matching method includes:
when a text matching request is received, acquiring a search statement according to the text matching request;
obtaining a pre-trained statement dimension reduction model and obtaining the length requirement of the statement dimension reduction model;
coding the search statement according to the length requirement to obtain statement codes;
analyzing the statement codes based on the statement dimension reduction model to obtain statement information;
normalizing the statement information to obtain statement characteristics;
acquiring a plurality of texts to be selected and information to be selected corresponding to each text to be selected according to the text matching request;
filtering the information to be selected to obtain the characteristics to be selected;
calculating the text similarity between the search sentence and each text to be selected according to the sentence characteristics and the characteristics to be selected;
and determining the text to be selected with the maximum text similarity as a target text.
According to the preferred embodiment of the present invention, the obtaining a search statement according to the text matching request includes:
analyzing the message of the text matching request to obtain the data information carried by the message;
extracting a statement path and a statement identifier from the data information, and calculating the query total amount of the statement path and the statement identifier;
acquiring a query template according to the query total amount;
writing the statement path and the statement mark into the query template to obtain a query statement;
and operating the query statement to obtain the search statement.
According to a preferred embodiment of the present invention, the encoding the search statement according to the length requirement to obtain a statement code includes:
splitting the search statement to obtain a plurality of search characters and a split serial number of each search character;
acquiring a character vector of each search character based on the character mapping table;
splicing the character vectors according to the splitting serial numbers to obtain initial codes;
determining the sentence type of the search sentence according to the sentence mark;
splicing a preset identifier, the type identifier of the statement type and the initial code to obtain an intermediate code, and calculating the code length of the intermediate code;
if the coding length is larger than the length requirement, processing the intermediate code according to the length requirement to obtain the statement code; or
If the coding length is smaller than the length requirement, taking the length difference value between the coding length and the length requirement as a filling digit, and filling the intermediate code to obtain the statement code; or
And if the coding length is equal to the length requirement, determining the intermediate code as the statement code.
According to the preferred embodiment of the present invention, before obtaining the pre-trained sentence dimensionality reduction model, the method further includes:
acquiring a learner and acquiring initial requirements of the learner;
obtaining a training sample, wherein the training sample comprises a sample sentence and a similar text;
extracting semantic codes of the similar texts;
coding the sample statement according to the initial requirement to obtain a sample code;
performing dimensionality reduction processing on the sample code based on the learner to obtain a predictive code;
and adjusting the initial requirement and the network parameters of the learner according to the coding distance between the predictive coding and the semantic coding until the coding distance is not reduced any more, so as to obtain the statement dimension reduction model.
According to a preferred embodiment of the present invention, the sentence dimensionality reduction model includes a convolution layer, a pooling layer and a full-link layer, and the analyzing the sentence code based on the sentence dimensionality reduction model to obtain the sentence information includes:
performing feature extraction on the statement codes based on a plurality of convolution cores in the convolution layer to obtain convolution features;
screening the convolution characteristics based on a pooling function in the pooling layer to obtain a pooling result;
acquiring a weight matrix and a bias value in the full connection layer;
and calculating the product of the pooling result and the weight matrix, and calculating the sum of the product and the bias value to obtain the statement information.
According to a preferred embodiment of the present invention, the filtering the information to be selected to obtain the feature to be selected includes:
acquiring a preset list, wherein the preset list comprises initial representations of preset stop words and preset characters;
traversing the information to be selected based on the initial characterization;
and deleting the information which is the same as the initial characterization from the information to be selected to obtain the characteristics to be selected.
According to a preferred embodiment of the present invention, the calculating the text similarity between the search sentence and each text to be selected according to the sentence characteristic and the feature to be selected includes:
for each text to be selected, extracting a first character feature from the sentence feature, and extracting a second character feature from the feature to be selected;
calculating the product of each first character feature and each second character feature to obtain character similarity;
selecting the similarity with the largest value from the character similarities as the target similarity of each first character feature;
and calculating the sum of the target similarity corresponding to each first character feature in the sentence features to obtain the text similarity.
On the other hand, the present invention further provides a text matching apparatus, including:
the acquisition unit is used for acquiring a search statement according to a text matching request when the text matching request is received;
the obtaining unit is further configured to obtain a pre-trained statement dimension reduction model, and obtain a length requirement of the statement dimension reduction model;
the coding unit is used for coding the search statement according to the length requirement to obtain statement codes;
the analysis unit is used for analyzing the statement codes based on the statement dimension reduction model to obtain statement information;
the processing unit is used for carrying out normalization processing on the statement information to obtain statement characteristics;
the acquiring unit is further used for acquiring a plurality of texts to be selected and the information to be selected corresponding to each text to be selected according to the text matching request;
the filtering unit is used for filtering the information to be selected to obtain the characteristics to be selected;
the calculation unit is used for calculating the text similarity between the search sentence and each text to be selected according to the sentence characteristics and the characteristics to be selected;
and the determining unit is used for determining the text to be selected with the maximum text similarity as the target text.
In another aspect, the present invention further provides an electronic device, including:
a memory storing computer readable instructions; and
a processor executing computer readable instructions stored in the memory to implement the text matching method.
In another aspect, the present invention further provides a computer-readable storage medium, in which computer-readable instructions are stored, and the computer-readable instructions are executed by a processor in an electronic device to implement the text matching method.
According to the above technical scheme, normalizing the statement information places the statement features and the features to be selected at the same operational magnitude when the text similarity is calculated, which improves the accuracy of the similarity calculation; because the modular length (norm) of the features no longer needs to be analyzed, it also improves the calculation efficiency. In addition, instead of directly generating global feature vectors for the search statement and the texts to be selected, the text similarity is calculated from their low-level feature coding sequences, so the relation between the search statement and each text to be selected can be analyzed at a fine granularity, which improves the matching accuracy of the target text. Finally, the information to be selected corresponding to each text to be selected is obtained directly from the text matching request, without further analysis of the texts to be selected, which improves the matching efficiency of the target text.
Drawings
FIG. 1 is a flow chart of a preferred embodiment of the text matching method of the present invention.
Fig. 2 is a functional block diagram of a preferred embodiment of the text matching apparatus of the present invention.
Fig. 3 is a schematic structural diagram of an electronic device implementing a text matching method according to a preferred embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a flow chart of a text matching method according to a preferred embodiment of the present invention. The order of the steps in the flow chart may be changed and some steps may be omitted according to different needs.
The text matching method can acquire and process related data based on artificial intelligence technology. Artificial Intelligence (AI) is the theory, method, technique, and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence: to sense the environment, acquire knowledge, and use that knowledge to obtain the best result.
The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a robot technology, a biological recognition technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
The text matching method is applied to one or more electronic devices. An electronic device is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored computer readable instructions; its hardware includes, but is not limited to, a microprocessor, an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), a Digital Signal Processor (DSP), an embedded device, and the like.
The electronic device may be any electronic product capable of performing human-computer interaction with a user, for example, a Personal computer, a tablet computer, a smart phone, a Personal Digital Assistant (PDA), a game machine, an interactive Internet Protocol Television (IPTV), a smart wearable device, and the like.
The electronic device may include a network device and/or a user device. Wherein the network device includes, but is not limited to, a single network electronic device, an electronic device group consisting of a plurality of network electronic devices, or a Cloud Computing (Cloud Computing) based Cloud consisting of a large number of hosts or network electronic devices.
The network in which the electronic device is located includes, but is not limited to: the internet, a wide area Network, a metropolitan area Network, a local area Network, a Virtual Private Network (VPN), and the like.
And S10, when a text matching request is received, acquiring a search sentence according to the text matching request.
In at least one embodiment of the present invention, the text matching request carries data information such as a statement path and a statement identifier. The text matching request may be triggered by any user.
The search sentence refers to a sentence that requires semantic text matching. For example, the search statement may be a request for text commenting on the weather.
In at least one embodiment of the present invention, the obtaining, by the electronic device, the search sentence according to the text matching request includes:
analyzing the message of the text matching request to obtain the data information carried by the message;
extracting a statement path and a statement identifier from the data information, and calculating the query total amount of the statement path and the statement identifier;
acquiring a query template according to the query total amount;
writing the statement path and the statement mark into the query template to obtain a query statement;
and operating the query statement to obtain the search statement.
The sentence path refers to a path for storing the search sentence, and a plurality of sentences which need to be subjected to text matching are stored in the sentence path.
The sentence mark is a mark capable of uniquely identifying the search sentence.
The number of placeholders in the query template is the same as the query total amount.
An appropriate query template can be obtained from the query total amount, so the template does not need to be corrected when the query statement is generated, which improves the generation efficiency of the query statement. Because the search statement is then obtained by running the query statement, there is no need to traverse the statements in the statement path one by one or to locate within the statement path manually, so the search statement can be obtained more efficiently.
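The query construction in S10 might be sketched as below. This is a minimal illustration only: the template dictionary, its SQL-like placeholder syntax, and the assumption that the query total simply counts the statement path and the statement identifier are all hypothetical, not the patent's specification.

```python
def build_query(statement_path, statement_id, templates):
    """Select the query template whose placeholder count equals the
    query total (here: one statement path + one statement identifier = 2),
    then write the path and identifier into it.
    The `templates` dict and its SQL-like syntax are illustrative."""
    query_total = 2  # statement path + statement identifier
    template = templates[query_total]
    return template.format(statement_path, statement_id)
```

Running the resulting query statement would then return the stored search statement.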
And S11, acquiring a pre-trained statement dimension reduction model and acquiring the length requirement of the statement dimension reduction model.
In at least one embodiment of the present invention, the statement dimension reduction model refers to a model for performing dimension reduction processing on the representation information.
The length requirement refers to the length of the characterization information input into the statement dimension reduction model. For example, the length requirement may be 128 bits.
In at least one embodiment of the present invention, before obtaining the pre-trained sentence dimensionality reduction model, the method further includes:
acquiring a learner and acquiring initial requirements of the learner;
obtaining a training sample, wherein the training sample comprises a sample sentence and a similar text;
extracting semantic codes of the similar texts;
coding the sample statement according to the initial requirement to obtain a sample code;
performing dimensionality reduction processing on the sample code based on the learner to obtain a predictive code;
and adjusting the initial requirement and the network parameters of the learner according to the coding distance between the predictive coding and the semantic coding until the coding distance is not reduced any more, so as to obtain the statement dimension reduction model.
The initial requirement refers to the maximum length of the characterization information input into the learner, and the initial requirement is preset.
The coding distance refers to a difference value between the predictive coding and the semantic coding.
Adjusting the initial requirement ensures that the statement codes input into the statement dimension reduction model contain comprehensive characterization information, so information in the search statement is not lost; adjusting the network parameters improves the accuracy of the model's dimension reduction of the statement codes.
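The training procedure above can be sketched with a linear learner and a squared Euclidean coding distance; both are assumptions, since the patent fixes neither the learner's architecture nor the distance measure. Training stops once the coding distance no longer decreases, as described.

```python
import numpy as np

def train_reducer(sample_codes, semantic_codes, lr=0.05, max_epochs=500, tol=1e-6):
    """Fit a linear dimension-reduction map W so that sample_code @ W
    approximates the semantic code of the similar text.
    Stops when the coding distance no longer decreases (within tol)."""
    in_dim = sample_codes.shape[1]
    out_dim = semantic_codes.shape[1]
    rng = np.random.default_rng(0)
    W = rng.normal(scale=0.01, size=(in_dim, out_dim))
    prev = np.inf
    for _ in range(max_epochs):
        pred = sample_codes @ W                           # predictive codes
        diff = pred - semantic_codes
        dist = float(np.mean(np.sum(diff ** 2, axis=1)))  # coding distance
        if prev - dist < tol:                             # no longer decreasing
            break
        prev = dist
        grad = 2 * sample_codes.T @ diff / len(sample_codes)
        W -= lr * grad                                    # adjust parameters
    return W, dist
```

In the patent, the initial (length) requirement would be adjusted alongside these network parameters; that outer loop is omitted here.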
And S12, coding the search statement according to the length requirement to obtain statement codes.
In at least one embodiment of the present invention, the statement code refers to a vector representation of the search statement, the length of the vector representation being the length requirement.
In at least one embodiment of the present invention, the electronic device performs coding processing on the search statement according to the length requirement, and obtaining a statement code includes:
splitting the search statement to obtain a plurality of search characters and a split serial number of each search character;
acquiring a character vector of each search character based on the character mapping table;
splicing the character vectors according to the splitting serial numbers to obtain initial codes;
determining the sentence type of the search sentence according to the sentence mark;
splicing a preset identifier, the type identifier of the statement type and the initial code to obtain an intermediate code, and calculating the code length of the intermediate code;
if the coding length is larger than the length requirement, processing the intermediate code according to the length requirement to obtain the statement code; or
If the coding length is smaller than the length requirement, taking the length difference value between the coding length and the length requirement as a filling digit, and filling the intermediate code to obtain the statement code; or
And if the coding length is equal to the length requirement, determining the intermediate code as the statement code.
Wherein the plurality of search characters include characters such as punctuation marks in the search sentence.
The character mapping table stores a plurality of characters and vector representation of each character.
The sentence type refers to a sentence type of the search sentence, and accordingly, the type identifier refers to an identifier capable of indicating the sentence type. For example, the sentence type is a question sentence, and the type identifier may be Q.
Marking the initial code with the type identifier allows the statement code to be analyzed later, and comparing the code length with the length requirement guarantees the length of the statement code, improving its ability to characterize the search statement.
Specifically, the splicing, by the electronic device, the character vector according to the split serial number to obtain an initial code includes:
and splicing the character vectors according to the split serial numbers from small to large to obtain the initial codes.
Specifically, the splicing, by the electronic device, the preset identifier, the type identifier of the sentence type, and the initial code to obtain the intermediate code includes:
splicing the type identification at the tail end of the preset identification to obtain splicing information;
and splicing the initial codes at the tail end of the splicing information to obtain the intermediate codes.
And S13, analyzing the statement code based on the statement dimension reduction model to obtain statement information.
In at least one embodiment of the present invention, the statement information refers to information obtained by performing dimension reduction processing on the statement code.
In at least one embodiment of the present invention, the statement dimension reduction model includes a convolution layer, a pooling layer, and a full link layer, and the analyzing, by the electronic device, the statement code based on the statement dimension reduction model to obtain the statement information includes:
performing feature extraction on the statement codes based on a plurality of convolution cores in the convolution layer to obtain convolution features;
screening the convolution characteristics based on a pooling function in the pooling layer to obtain a pooling result;
acquiring a weight matrix and a bias value in the full connection layer;
and calculating the product of the pooling result and the weight matrix, and calculating the sum of the product and the bias value to obtain the statement information.
Wherein the plurality of convolution kernels, the pooling function, the weight matrix, and the bias value are generated from training the learner.
The convolution layer further improves the ability of the convolution features to characterize the search statement, and the pooling layer removes interference information from the convolution features, improving the accuracy of the subsequent text similarity.
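The forward pass of S13 can be sketched as follows; the kernel shapes, the choice of max-pooling as the screening function, and the weight values are hypothetical, since the patent only names the layer types.

```python
import numpy as np

def reduce_dimension(code, kernels, W, b):
    """Sketch of the statement dimension reduction model's forward pass:
    1-D convolutions over the statement code, max-pooling per kernel,
    then a fully connected layer (pooled @ W + b)."""
    code = np.asarray(code)                      # shape: (seq_len, dim)
    pooled = []
    for k in kernels:                            # each k: (width, dim)
        width = k.shape[0]
        # Valid 1-D convolution along the sequence axis.
        feats = [float(np.sum(code[i:i + width] * k))
                 for i in range(code.shape[0] - width + 1)]
        pooled.append(max(feats))                # max-pooling screens features
    pooled = np.array(pooled)
    return pooled @ W + b                        # fully connected layer
```

In the patent, the kernels, the pooling function, the weight matrix, and the bias value are all produced by training the learner.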
And S14, carrying out normalization processing on the statement information to obtain statement characteristics.
In at least one embodiment of the present invention, the sentence characteristic refers to the statement information after its values have been mapped into the interval [0, 1].
In at least one embodiment of the present invention, the electronic device performs normalization processing on the sentence information, so as to ensure that the sentence features and the candidate features are in the same operation level, thereby improving the accuracy of the text similarity.
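The patent does not name the normalization scheme. One sketch consistent with the later remark that the modular length of the features need not be analyzed is an L2 (unit-length) normalization of each character-level vector, so that subsequent dot products act as cosine similarities:

```python
import numpy as np

def normalize_rows(info):
    """L2-normalize each character-level vector of the statement information,
    so later dot products behave as cosine similarities."""
    info = np.asarray(info, dtype=float)
    norms = np.linalg.norm(info, axis=1, keepdims=True)
    norms[norms == 0] = 1.0                # leave all-zero rows unchanged
    return info / norms
```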
And S15, obtaining a plurality of texts to be selected and the information to be selected corresponding to each text to be selected according to the text matching request.
In at least one embodiment of the present invention, the candidate text refers to a text that needs to be matched with the search sentence.
The information to be selected is based on the representation information obtained by coding, dimension reduction and normalization processing of the text to be selected.
In at least one embodiment of the present invention, the acquiring, by the electronic device, a plurality of texts to be selected and candidate information corresponding to each text to be selected according to the text matching request includes:
extracting a text path from the data information;
determining all texts in the text path as the plurality of texts to be selected, and acquiring a text identifier of each text to be selected from the text path;
and acquiring the information to be selected from the vector list corresponding to the text path based on each text identifier.
And S16, filtering the information to be selected to obtain the characteristics to be selected.
In at least one embodiment of the present invention, the candidate feature refers to candidate information from which the predetermined stop word and the predetermined symbol are removed.
In at least one embodiment of the present invention, the filtering, by the electronic device, the information to be selected to obtain the feature to be selected includes:
acquiring a preset list, wherein the preset list comprises initial representations of preset stop words and preset characters;
traversing the information to be selected based on the initial characterization;
and deleting the information which is the same as the initial characterization from the information to be selected to obtain the characteristics to be selected.
In the above embodiment, the information to be selected is filtered after the text to be selected has been encoded, rather than before encoding. This prevents preset stop words and preset characters that carry coding significance from being eliminated prematurely, which improves the characterization accuracy of the information to be selected; at the same time, filtering the information to be selected simplifies the features to be selected and improves the calculation efficiency of the text similarity.
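The filtering in S16 can be sketched as a simple set-membership deletion; representing the initial characterizations of the preset stop words and preset characters as plain vectors is an assumption for illustration.

```python
def filter_candidates(candidate_info, preset_list):
    """Drop every candidate vector identical to the initial characterization
    of a preset stop word or preset character."""
    blocked = {tuple(v) for v in preset_list}  # initial characterizations
    return [v for v in candidate_info if tuple(v) not in blocked]
```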
And S17, calculating the text similarity between the search sentence and each text to be selected according to the sentence characteristics and the characteristics to be selected.
In at least one embodiment of the present invention, the text similarity refers to a similarity between the search sentence and each text to be selected.
In at least one embodiment of the present invention, the calculating, by the electronic device, the text similarity between the search sentence and each text to be selected according to the sentence characteristic and the feature to be selected includes:
for each text to be selected, extracting a first character feature from the sentence feature, and extracting a second character feature from the feature to be selected;
calculating the product of each first character feature and each second character feature to obtain character similarity;
selecting the similarity with the largest value from the character similarities as the target similarity of each first character feature;
and calculating the sum of the target similarity corresponding to each first character feature in the sentence features to obtain the text similarity.
The text similarity is calculated from the relationship between the first character features in the statement features and the second character features in the features to be selected. Because these character features belong to low-level feature coding sequences, the similarity between the search statement and each text to be selected can be determined at a fine granularity, which improves the accuracy of the text similarity.
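The calculation in S17 resembles a late-interaction ("maximum similarity") scheme: each first character feature keeps only its best match among the second character features, and the best matches are summed. A minimal sketch, assuming the dot product as the character similarity:

```python
import numpy as np

def text_similarity(sentence_feats, candidate_feats):
    """For each first character feature, take the maximum dot-product
    similarity over the second character features, then sum the maxima."""
    sims = np.asarray(sentence_feats) @ np.asarray(candidate_feats).T
    return float(sims.max(axis=1).sum())   # max per first feature, then sum
```

The candidate text with the largest returned value would then be selected as the target text, per S18.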
And S18, determining the text to be selected with the maximum text similarity as the target text.
In at least one embodiment of the present invention, the target text refers to a candidate text most similar to the search sentence.
It is emphasized that, to further ensure the privacy and security of the target text, the target text may also be stored in a node of a blockchain.
In at least one embodiment of the invention, the method further comprises:
acquiring a request number of the text matching request;
packaging the request number and the target text to obtain a feedback result;
and sending the feedback result to a trigger terminal of the text matching request.
Through the embodiment, the feedback result can be sent to the trigger terminal in time, and timeliness is improved.
According to the technical solution, normalizing the sentence information ensures that the sentence features and the features to be selected are of the same order of magnitude when the text similarity is calculated, which improves the calculation accuracy of the text similarity; it also means the vector moduli of the sentence features and the features to be selected do not need to be analyzed in the subsequent similarity calculation, which improves the calculation efficiency. In addition, rather than directly generating global feature vectors for the search sentence and the texts to be selected, the text similarity is calculated from their low-level feature coding sequences, so the relationship between the search sentence and the texts to be selected can be analyzed at a fine granularity, which improves the matching accuracy of the target text. Furthermore, the invention obtains the information to be selected corresponding to each text to be selected directly according to the text matching request, without further analysis of the texts to be selected, which improves the matching efficiency of the target text.
Fig. 2 is a functional block diagram of a preferred embodiment of the text matching apparatus according to the invention. The text matching device 11 includes an obtaining unit 110, an encoding unit 111, an analyzing unit 112, a processing unit 113, a filtering unit 114, a calculating unit 115, a determining unit 116, an extracting unit 117, a dimension reducing unit 118, an adjusting unit 119, a packaging unit 120, and a sending unit 121. The module/unit referred to herein is a series of computer readable instruction segments that can be accessed by the processor 13 and perform a fixed function and that are stored in the memory 12. In the present embodiment, the functions of the modules/units will be described in detail in the following embodiments.
When receiving a text matching request, the obtaining unit 110 obtains a search sentence according to the text matching request.
In at least one embodiment of the present invention, the text matching request carries data information such as a statement path and a statement identifier. The text matching request may be triggered by any user.
The search sentence refers to a sentence needing text semantic matching. For example, the search statement may be: text for weather comments.
In at least one embodiment of the present invention, the obtaining unit 110 obtains the search sentence according to the text matching request includes:
analyzing the message of the text matching request to obtain the data information carried by the message;
extracting a statement path and a statement identifier from the data information, and calculating the query total amount of the statement path and the statement identifier;
acquiring a query template according to the query total amount;
writing the statement path and the statement mark into the query template to obtain a query statement;
and operating the query statement to obtain the search statement.
The sentence path refers to a path for storing the search sentence, and a plurality of sentences which need to be subjected to text matching are stored in the sentence path.
The sentence mark is a mark capable of uniquely identifying the search sentence.
The number of objects that the query template accepts is the same as the total query amount.
An appropriate query template can be obtained from the total query amount, so the query template does not need to be corrected when the query statement is generated, which improves the generation efficiency of the query statement. The search statement is then obtained by running the query statement, so the statements corresponding to the statement identifier do not need to be traversed one by one in the statement path, and the statement path does not need to be located, which improves the acquisition efficiency of the search statement.
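A minimal sketch of this lookup under illustrative assumptions — a dictionary of parameterized SQL templates keyed by the total query amount, and a hypothetical `sentences` table (none of these names come from the original):

```python
import sqlite3

# Hypothetical query templates keyed by the total number of objects to
# write in; two placeholders match a statement path plus an identifier.
QUERY_TEMPLATES = {
    2: "SELECT content FROM sentences WHERE path = ? AND sentence_id = ?",
}

def fetch_search_statement(conn, statement_path, statement_id):
    total = 2  # query total amount: statement path + statement identifier
    template = QUERY_TEMPLATES[total]  # pick the template by total amount
    # Write the path and identifier into the template and run the query.
    row = conn.execute(template, (statement_path, statement_id)).fetchone()
    return row[0] if row else None

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sentences (path TEXT, sentence_id TEXT, content TEXT)")
conn.execute("INSERT INTO sentences VALUES ('/queries', 's1', 'text for weather comments')")
print(fetch_search_statement(conn, "/queries", "s1"))  # → text for weather comments
```

Selecting a template whose placeholder count already matches the total amount is what removes the need to correct the template at generation time.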
The obtaining unit 110 obtains a pre-trained sentence dimensionality reduction model and obtains a length requirement of the sentence dimensionality reduction model.
In at least one embodiment of the present invention, the statement dimension reduction model refers to a model for performing dimension reduction processing on the representation information.
The length requirement refers to the length of the characterization information input into the statement dimension reduction model. For example, the length requirement may be 128 bits.
In at least one embodiment of the present invention, before obtaining a pre-trained sentence dimensionality reduction model, the obtaining unit 110 obtains a learner, and obtains an initial requirement of the learner;
the obtaining unit 110 obtains a training sample, where the training sample includes a sample sentence and a similar text;
the extraction unit 117 extracts semantic codes of the similar texts;
the encoding unit 111 encodes the sample statement according to the initial requirement to obtain a sample code;
the dimension reduction unit 118 performs dimension reduction processing on the sample code based on the learner to obtain a prediction code;
the adjusting unit 119 adjusts the initial requirement and the network parameter of the learner according to the coding distance between the predictive coding and the semantic coding until the coding distance is not reduced any more, so as to obtain the statement dimension reduction model.
The initial requirement refers to the maximum length of the characterization information input into the learner, and the initial requirement is preset.
The coding distance refers to a difference value between the predictive coding and the semantic coding.
Adjusting the initial requirement as a parameter ensures that the sentence codes fed into the sentence dimension reduction model contain comprehensive characterization information, which prevents information in the search sentence from being lost; adjusting the network parameters improves the accuracy with which the sentence dimension reduction model reduces the dimensionality of the sentence codes.
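The training loop can be sketched as follows, with a linear learner standing in for the real network, squared Euclidean distance as the coding distance, and a plain gradient-descent update; the learner architecture, learning rate, and update rule are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy training sample: a 16-dimensional sample code and the 4-dimensional
# semantic code extracted from its similar text.
sample_code = rng.normal(size=16)
sample_code /= np.linalg.norm(sample_code)   # unit norm keeps updates stable
semantic_code = rng.normal(size=4)

# Linear "learner": dimensionality reduction as a 4x16 projection matrix.
weights = rng.normal(scale=0.1, size=(4, 16))

lr = 0.5
prev_dist = float("inf")
while True:
    predictive_code = weights @ sample_code      # dimension reduction step
    error = predictive_code - semantic_code
    dist = float(error @ error)                  # coding distance
    if dist >= prev_dist:                        # distance no longer reduced
        break
    prev_dist = dist
    weights -= lr * np.outer(error, sample_code)  # adjust network parameters

print(f"final coding distance: {prev_dist:.3e}")
```

The stopping rule mirrors the text: training continues only while the coding distance between the predictive coding and the semantic coding keeps decreasing.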
The encoding unit 111 performs encoding processing on the search statement according to the length requirement to obtain a statement code.
In at least one embodiment of the present invention, the statement code refers to a vector representation of the search statement, the length of the vector representation being the length requirement.
In at least one embodiment of the present invention, the encoding unit 111 performs encoding processing on the search statement according to the length requirement, and obtaining a statement code includes:
splitting the search statement to obtain a plurality of search characters and a split serial number of each search character;
acquiring a character vector of each search character based on the character mapping table;
splicing the character vectors according to the splitting serial numbers to obtain initial codes;
determining the sentence type of the search sentence according to the sentence mark;
splicing a preset identifier, the type identifier of the statement type and the initial code to obtain an intermediate code, and calculating the code length of the intermediate code;
if the coding length is larger than the length requirement, processing the intermediate code according to the length requirement to obtain the statement code; or
If the coding length is smaller than the length requirement, taking the length difference value between the coding length and the length requirement as a filling digit, and filling the intermediate code to obtain the statement code; or
And if the coding length is equal to the length requirement, determining the intermediate code as the statement code.
Wherein the plurality of search characters include characters such as punctuation marks in the search sentence.
The character mapping table stores a plurality of characters and vector representation of each character.
The sentence type refers to a sentence type of the search sentence, and accordingly, the type identifier refers to an identifier capable of indicating the sentence type. For example, the sentence type is a question sentence, and the type identifier may be Q.
Identifying the initial code with the type identifier allows the statement code to be analyzed subsequently, and enforcing the relationship between the code length and the length requirement guarantees the length of the statement code, which improves the ability of the statement code to characterize the search sentence.
Specifically, the splicing, by the encoding unit 111, the character vector according to the split sequence number to obtain an initial code includes:
and splicing the character vectors according to the split serial numbers from small to large to obtain the initial codes.
Specifically, the splicing, by the encoding unit 111, the preset identifier, the type identifier of the sentence type, and the initial code to obtain the intermediate code includes:
splicing the type identification at the tail end of the preset identification to obtain splicing information;
and splicing the initial codes at the tail end of the splicing information to obtain the intermediate codes.
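The encoding steps can be sketched end to end as follows, using a toy character mapping table with 1-dimensional character vectors so the length bookkeeping is easy to follow; the mapping table, preset identifier value, type identifier vectors, zero padding, and truncation of an over-long intermediate code are all illustrative assumptions:

```python
# Toy character mapping table: each character maps to a 1-dimensional vector.
CHAR_TABLE = {"h": [0.1], "i": [0.2], "?": [0.9]}
PRESET_ID = [0.0]        # assumed preset identifier vector
TYPE_IDS = {"Q": [0.5]}  # question sentences are marked with Q

def encode_statement(search_statement, sentence_type, length_requirement):
    # Split the search statement into characters; the split serial number
    # is simply each character's position in the statement.
    chars = list(search_statement)
    # Splice character vectors in ascending split-serial-number order.
    initial_code = [v for ch in chars for v in CHAR_TABLE[ch]]
    # Splice the type identifier at the tail of the preset identifier,
    # then the initial code at the tail of that splicing information.
    intermediate = PRESET_ID + TYPE_IDS[sentence_type] + initial_code
    if len(intermediate) > length_requirement:
        # Code length greater than the requirement: truncate (assumed).
        return intermediate[:length_requirement]
    # Code length smaller than the requirement: pad by the length difference.
    return intermediate + [0.0] * (length_requirement - len(intermediate))

code = encode_statement("hi?", "Q", 8)
print(code)  # → [0.0, 0.5, 0.1, 0.2, 0.9, 0.0, 0.0, 0.0]
```

When the code length equals the length requirement, the intermediate code is returned unchanged, matching the third branch of the method.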
The analysis unit 112 analyzes the sentence code based on the sentence dimension reduction model to obtain the sentence information.
In at least one embodiment of the present invention, the statement information refers to information obtained by performing dimension reduction processing on the statement code.
In at least one embodiment of the present invention, the sentence dimensionality reduction model includes a convolution layer, a pooling layer and a full-link layer, and the analyzing unit 112 analyzes the sentence code based on the sentence dimensionality reduction model to obtain the sentence information includes:
performing feature extraction on the statement codes based on a plurality of convolution kernels in the convolution layer to obtain convolution features;
screening the convolution characteristics based on a pooling function in the pooling layer to obtain a pooling result;
acquiring a weight matrix and a bias value in the full connection layer;
and calculating the product of the pooling result and the weight matrix, and calculating the sum of the product and the bias value to obtain the statement information.
Wherein the plurality of convolution kernels, the pooling function, the weight matrix, and the bias value are generated from training the learner.
The convolution layer further improves the ability of the convolution features to characterize the search sentence, and the pooling layer removes interference information from the convolution features, which improves the accuracy of the subsequently calculated text similarity.
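A minimal NumPy sketch of the three layers, assuming 1-D convolution over the statement code, max pooling as the pooling function, and randomly initialized parameters in place of the trained weight matrix and bias value:

```python
import numpy as np

rng = np.random.default_rng(1)

def reduce_dimensions(statement_code, kernels, weight_matrix, bias):
    k = kernels.shape[1]
    # Convolution layer: slide each kernel over the statement code.
    windows = np.lib.stride_tricks.sliding_window_view(statement_code, k)
    conv_features = windows @ kernels.T          # (positions, n_kernels)
    # Pooling layer: max pooling screens the convolution features.
    pooled = conv_features.max(axis=0)           # (n_kernels,)
    # Fully connected layer: product with the weight matrix, plus bias.
    return pooled @ weight_matrix + bias

statement_code = rng.normal(size=128)            # e.g. 128-length statement code
kernels = rng.normal(size=(4, 3))                # 4 convolution kernels of width 3
weight_matrix = rng.normal(size=(4, 8))          # reduce to 8 dimensions
bias = rng.normal(size=8)
statement_info = reduce_dimensions(statement_code, kernels, weight_matrix, bias)
print(statement_info.shape)  # → (8,)
```

In the patent these parameters are not random but are generated by training the learner, as noted above.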
The processing unit 113 performs normalization processing on the sentence information to obtain a sentence characteristic.
In at least one embodiment of the present invention, the sentence feature refers to the feature information obtained by scaling the sentence information into the interval [0, 1].
In at least one embodiment of the present invention, the processing unit 113 performs normalization processing on the sentence information, so as to ensure that the sentence features and the candidate features are in the same operation level, thereby improving the accuracy of the text similarity.
The obtaining unit 110 obtains a plurality of texts to be selected and candidate information corresponding to each text to be selected according to the text matching request.
In at least one embodiment of the present invention, the candidate text refers to a text that needs to be matched with the search sentence.
The information to be selected is the characterization information obtained by encoding, dimensionality reduction and normalization of the text to be selected.
In at least one embodiment of the present invention, the obtaining unit 110 obtains a plurality of texts to be selected and candidate information corresponding to each text to be selected according to the text matching request, where the obtaining unit includes:
extracting a text path from the data information;
determining all texts in the text path as the plurality of texts to be selected, and acquiring a text identifier of each text to be selected from the text path;
and acquiring the information to be selected from the vector list corresponding to the text path based on each text identifier.
The filtering unit 114 performs filtering processing on the information to be selected to obtain the features to be selected.
In at least one embodiment of the present invention, the candidate feature refers to the candidate information from which the preset stop words and preset characters have been removed.
In at least one embodiment of the present invention, the filtering unit 114 performs filtering processing on the candidate information, and obtaining the candidate features includes:
acquiring a preset list, wherein the preset list comprises initial representations of preset stop words and preset characters;
traversing the information to be selected based on the initial characterization;
and deleting the information which is the same as the initial characterization from the information to be selected to obtain the characteristics to be selected.
In the above embodiment, the information to be selected is filtered after the text to be selected has been encoded, rather than before encoding. This prevents preset stop words and preset characters that carry coding significance from being eliminated, which improves the representation accuracy of the information to be selected; at the same time, filtering the information to be selected makes the features to be selected more compact, which improves the calculation efficiency of the text similarity.
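The filtering steps can be sketched as follows, with a hypothetical preset list holding the initial characterizations of preset stop words and preset characters as hashable tuples (the vectors shown are illustrative):

```python
# Hypothetical preset list: the initial characterizations (encoded
# vectors) of preset stop words and preset characters.
PRESET_LIST = {(0.0, 0.0), (0.9, 0.9)}

def filter_candidate_info(candidate_info):
    """Traverse the candidate information and delete every entry whose
    characterization matches one in the preset list; what remains is
    kept as the candidate features."""
    return [vec for vec in candidate_info
            if tuple(vec) not in PRESET_LIST]

info = [[0.1, 0.2], [0.0, 0.0], [0.3, 0.4], [0.9, 0.9]]
print(filter_candidate_info(info))  # → [[0.1, 0.2], [0.3, 0.4]]
```

Because the comparison happens on encoded characterizations rather than raw text, stop words that mattered during encoding have already contributed context before being dropped.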
The calculation unit 115 calculates the text similarity between the search sentence and each text to be selected according to the sentence characteristics and the features to be selected.
In at least one embodiment of the present invention, the text similarity refers to a similarity between the search sentence and each text to be selected.
In at least one embodiment of the present invention, the calculating unit 115 calculates the text similarity between the search sentence and each text to be selected according to the sentence characteristic and the feature to be selected includes:
for each text to be selected, extracting a first character feature from the sentence feature, and extracting a second character feature from the feature to be selected;
calculating the product of each first character feature and each second character feature to obtain character similarity;
selecting the similarity with the largest value from the character similarities as the target similarity of each first character feature;
and calculating the sum of the target similarity corresponding to each first character feature in the sentence features to obtain the text similarity.
The text similarity is calculated from the relationship between the first character features in the sentence features and the second character features in the features to be selected. Because both belong to low-level feature coding sequences, the similarity between the search sentence and the text to be selected can be determined at a fine granularity, which improves the accuracy of the text similarity.
The determining unit 116 determines the text to be selected with the largest text similarity as the target text.
In at least one embodiment of the present invention, the target text refers to a candidate text most similar to the search sentence.
It is emphasized that, to further ensure the privacy and security of the target text, the target text may also be stored in a node of a blockchain.
In at least one embodiment of the present invention, the obtaining unit 110 obtains a request number of the text matching request;
the packaging unit 120 packages the request number and the target text to obtain a feedback result;
the sending unit 121 sends the feedback result to the trigger terminal of the text matching request.
Through the embodiment, the feedback result can be sent to the trigger terminal in time, and timeliness is improved.
According to the technical solution, normalizing the sentence information ensures that the sentence features and the features to be selected are of the same order of magnitude when the text similarity is calculated, which improves the calculation accuracy of the text similarity; it also means the vector moduli of the sentence features and the features to be selected do not need to be analyzed in the subsequent similarity calculation, which improves the calculation efficiency. In addition, rather than directly generating global feature vectors for the search sentence and the texts to be selected, the text similarity is calculated from their low-level feature coding sequences, so the relationship between the search sentence and the texts to be selected can be analyzed at a fine granularity, which improves the matching accuracy of the target text. Furthermore, the invention obtains the information to be selected corresponding to each text to be selected directly according to the text matching request, without further analysis of the texts to be selected, which improves the matching efficiency of the target text.
Fig. 3 is a schematic structural diagram of an electronic device implementing a text matching method according to a preferred embodiment of the present invention.
In one embodiment of the present invention, the electronic device 1 includes, but is not limited to, a memory 12, a processor 13, and computer readable instructions, such as a text matching program, stored in the memory 12 and executable on the processor 13.
It will be appreciated by those skilled in the art that the schematic diagram is only an example of the electronic device 1 and does not constitute a limitation of the electronic device 1; the device may comprise more or fewer components than shown, some components may be combined, or different components may be used. For example, the electronic device 1 may further comprise input/output devices, network access devices, a bus, and the like.
The processor 13 may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The processor 13 is the operation core and control center of the electronic device 1: it connects the parts of the whole electronic device 1 through various interfaces and lines, and executes the operating system of the electronic device 1 and the various installed application programs, program code, and the like.
Illustratively, the computer readable instructions may be partitioned into one or more modules/units that are stored in the memory 12 and executed by the processor 13 to implement the present invention. The one or more modules/units may be a series of computer readable instruction segments capable of performing specific functions, which are used for describing the execution process of the computer readable instructions in the electronic device 1. For example, the computer readable instructions may be divided into an acquisition unit 110, an encoding unit 111, an analysis unit 112, a processing unit 113, a filtering unit 114, a calculation unit 115, a determination unit 116, an extraction unit 117, a dimension reduction unit 118, an adjustment unit 119, an encapsulation unit 120, and a transmission unit 121.
The memory 12 may be used for storing the computer readable instructions and/or modules, and the processor 13 implements various functions of the electronic device 1 by executing or executing the computer readable instructions and/or modules stored in the memory 12 and invoking data stored in the memory 12. The memory 12 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the storage data area may store data created according to use of the electronic device, and the like. The memory 12 may include non-volatile and volatile memories, such as: a hard disk, a memory, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), at least one magnetic disk storage device, a Flash memory device, or other storage device.
The memory 12 may be an external memory and/or an internal memory of the electronic device 1. Further, the memory 12 may be a memory having a physical form, such as a memory stick, a TF Card (Trans-flash Card), or the like.
The integrated modules/units of the electronic device 1 may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, all or part of the flow of the method according to the above embodiments may be implemented by instructing the relevant hardware through computer readable instructions, which may be stored in a computer-readable storage medium; when the computer readable instructions are executed by a processor, the steps of the method embodiments may be implemented.
Wherein the computer readable instructions comprise computer readable instruction code which may be in source code form, object code form, an executable file or some intermediate form, and the like. The computer-readable medium may include: any entity or device capable of carrying said computer readable instruction code, recording medium, U-disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM).
The block chain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism, an encryption algorithm and the like. A block chain (Blockchain), which is essentially a decentralized database, is a series of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, so as to verify the validity (anti-counterfeiting) of the information and generate a next block. The blockchain may include a blockchain underlying platform, a platform product service layer, an application service layer, and the like.
In conjunction with fig. 1, the memory 12 of the electronic device 1 stores computer-readable instructions to implement a text matching method, and the processor 13 executes the computer-readable instructions to implement:
when a text matching request is received, acquiring a search statement according to the text matching request;
obtaining a pre-trained statement dimension reduction model and obtaining the length requirement of the statement dimension reduction model;
coding the search statement according to the length requirement to obtain statement codes;
analyzing the statement codes based on the statement dimension reduction model to obtain statement information;
normalizing the statement information to obtain statement characteristics;
acquiring a plurality of texts to be selected and information to be selected corresponding to each text to be selected according to the text matching request;
filtering the information to be selected to obtain the characteristics to be selected;
calculating the text similarity between the search sentence and each text to be selected according to the sentence characteristics and the characteristics to be selected;
and determining the text to be selected with the maximum text similarity as a target text.
Specifically, the processor 13 may refer to the description of the relevant steps in the embodiment corresponding to fig. 1 for a specific implementation method of the computer readable instructions, which is not described herein again.
In the embodiments provided in the present invention, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is only one logical functional division, and other divisions may be realized in practice.
The computer readable storage medium has computer readable instructions stored thereon, wherein the computer readable instructions when executed by the processor 13 are configured to implement the steps of:
when a text matching request is received, acquiring a search statement according to the text matching request;
obtaining a pre-trained statement dimension reduction model and obtaining the length requirement of the statement dimension reduction model;
coding the search statement according to the length requirement to obtain statement codes;
analyzing the statement codes based on the statement dimension reduction model to obtain statement information;
normalizing the statement information to obtain statement characteristics;
acquiring a plurality of texts to be selected and information to be selected corresponding to each text to be selected according to the text matching request;
filtering the information to be selected to obtain the characteristics to be selected;
calculating the text similarity between the search sentence and each text to be selected according to the sentence characteristics and the characteristics to be selected;
and determining the text to be selected with the maximum text similarity as a target text.
The modules described as separate parts may or may not be physically separate, and parts displayed as modules may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, functional modules in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional module.
The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claim concerned.
Furthermore, it is obvious that the word "comprising" does not exclude other elements or steps, and the singular does not exclude the plural. The plurality of units or devices may also be implemented by one unit or device through software or hardware. The terms first, second, etc. are used to denote names, but not any particular order.
Finally, it should be noted that the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting, and although the present invention is described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications or equivalent substitutions may be made on the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.

Claims (10)

1. A text matching method, characterized in that the text matching method comprises:
when a text matching request is received, acquiring a search statement according to the text matching request;
obtaining a pre-trained statement dimension reduction model and obtaining the length requirement of the statement dimension reduction model;
coding the search statement according to the length requirement to obtain statement codes;
analyzing the statement codes based on the statement dimension reduction model to obtain statement information;
normalizing the statement information to obtain statement characteristics;
acquiring a plurality of texts to be selected and information to be selected corresponding to each text to be selected according to the text matching request;
filtering the information to be selected to obtain the characteristics to be selected;
calculating the text similarity between the search sentence and each text to be selected according to the sentence characteristics and the characteristics to be selected;
and determining the text to be selected with the maximum text similarity as a target text.
2. The text matching method of claim 1, wherein the obtaining a search statement according to the text matching request comprises:
analyzing the message of the text matching request to obtain the data information carried by the message;
extracting a statement path and a statement identifier from the data information, and calculating the query total amount of the statement path and the statement identifier;
acquiring a query template according to the query total amount;
writing the statement path and the statement mark into the query template to obtain a query statement;
and operating the query statement to obtain the search statement.
3. The text matching method of claim 2, wherein the encoding the search sentence according to the length requirement to obtain a sentence code comprises:
splitting the search statement to obtain a plurality of search characters and a split serial number of each search character;
acquiring a character vector of each search character based on the character mapping table;
splicing the character vectors according to the splitting serial numbers to obtain initial codes;
determining the sentence type of the search sentence according to the sentence mark;
splicing a preset identifier, the type identifier of the statement type and the initial code to obtain an intermediate code, and calculating the code length of the intermediate code;
if the coding length is larger than the length requirement, processing the intermediate code according to the length requirement to obtain the statement code; or
If the coding length is smaller than the length requirement, taking the length difference value between the coding length and the length requirement as a filling digit, and filling the intermediate code to obtain the statement code; or
And if the coding length is equal to the length requirement, determining the intermediate code as the statement code.
4. The text matching method of claim 1, wherein, prior to obtaining the pre-trained sentence dimensionality reduction model, the method further comprises:
acquiring a learner and acquiring an initial requirement of the learner;
obtaining a training sample, wherein the training sample comprises a sample sentence and a similar text;
extracting a semantic code of the similar text;
encoding the sample sentence according to the initial requirement to obtain a sample code;
performing dimensionality reduction on the sample code based on the learner to obtain a prediction code;
and adjusting the network parameters of the learner and the initial requirement according to the coding distance between the prediction code and the semantic code, until the coding distance no longer decreases, so as to obtain the sentence dimensionality reduction model.
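The training loop of claim 4 can be sketched as distance minimization with a stopping rule of "the coding distance no longer decreases". For brevity the learner is reduced to a single linear layer on synthetic data (the patent's learner is the convolutional model of claim 5, and the adjustable "initial requirement" is omitted); all values here are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 8)) * 0.1       # learner: linear dimensionality reduction, 8 -> 4

sample_code = rng.normal(size=8)        # code of the sample sentence
semantic_code = rng.normal(size=4)      # semantic code of the similar text (target)

prev = float("inf")
lr = 0.01
while True:
    pred = W @ sample_code                              # prediction code
    dist = float(np.sum((pred - semantic_code) ** 2))   # coding distance
    if dist >= prev - 1e-9:                             # stop once distance no longer drops
        break
    prev = dist
    grad = 2.0 * np.outer(pred - semantic_code, sample_code)
    W -= lr * grad                                      # adjust network parameters
```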
5. The text matching method of claim 1, wherein the sentence dimensionality reduction model comprises a convolution layer, a pooling layer, and a fully connected layer, and the analyzing the sentence code based on the sentence dimensionality reduction model to obtain the sentence information comprises:
performing feature extraction on the sentence code based on a plurality of convolution kernels in the convolution layer to obtain convolution features;
screening the convolution features based on a pooling function in the pooling layer to obtain a pooling result;
acquiring a weight matrix and a bias value in the fully connected layer;
and calculating the product of the pooling result and the weight matrix, and calculating the sum of the product and the bias value to obtain the sentence information.
6. The text matching method according to claim 1, wherein the filtering the information to be selected to obtain the features to be selected comprises:
acquiring a preset list, wherein the preset list comprises initial representations of preset stop words and preset characters;
traversing the information to be selected based on the initial representations;
and deleting, from the information to be selected, the information identical to the initial representations, so as to obtain the features to be selected.
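The filtering step of claim 6 is, in effect, stop-word removal against a preset list. A minimal sketch, with an entirely made-up preset list:

```python
# Hypothetical preset list of stop words and characters (initial representations)
PRESET_LIST = {"the", "a", ",", "."}

def filter_candidate_info(candidate_info):
    # traverse the candidate information; keep only tokens that match
    # no entry in the preset list
    return [tok for tok in candidate_info if tok not in PRESET_LIST]
```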
7. The text matching method of claim 1, wherein the calculating the text similarity between the search sentence and each text to be selected according to the sentence features and the features to be selected comprises:
for each text to be selected, extracting first character features from the sentence features, and extracting second character features from the features to be selected;
calculating the product of each first character feature and each second character feature to obtain character similarities;
selecting, from the character similarities, the similarity with the largest value as the target similarity of each first character feature;
and calculating the sum of the target similarities corresponding to the first character features in the sentence features to obtain the text similarity.
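Claim 7 amounts to a greedy character-level matching score: dot products between every pair of character features, a row-wise maximum, then a sum (similar in spirit to soft token matching). A compact sketch, assuming both feature sets are dense vectors stacked into matrices:

```python
import numpy as np

def text_similarity(sentence_feats: np.ndarray, candidate_feats: np.ndarray) -> float:
    # character similarities: dot product of every first character feature
    # with every second character feature
    sims = sentence_feats @ candidate_feats.T
    # target similarity: for each first character feature, the largest value;
    # text similarity: the sum of the target similarities
    return float(sims.max(axis=1).sum())
```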
8. A text matching apparatus, characterized in that the text matching apparatus comprises:
an acquisition unit configured to, when a text matching request is received, obtain a search sentence according to the text matching request;
the acquisition unit is further configured to obtain a pre-trained sentence dimensionality reduction model and obtain a length requirement of the sentence dimensionality reduction model;
an encoding unit configured to encode the search sentence according to the length requirement to obtain a sentence code;
an analysis unit configured to analyze the sentence code based on the sentence dimensionality reduction model to obtain sentence information;
a processing unit configured to normalize the sentence information to obtain sentence features;
the acquisition unit is further configured to obtain, according to the text matching request, a plurality of texts to be selected and the information to be selected corresponding to each text to be selected;
a filtering unit configured to filter the information to be selected to obtain the features to be selected;
a calculation unit configured to calculate the text similarity between the search sentence and each text to be selected according to the sentence features and the features to be selected;
and a determination unit configured to determine the text to be selected with the largest text similarity as the target text.
9. An electronic device, characterized in that the electronic device comprises:
a memory storing computer readable instructions; and
a processor executing the computer-readable instructions stored in the memory to implement the text matching method of any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores computer-readable instructions, and the computer-readable instructions are executed by a processor in an electronic device to implement the text matching method of any one of claims 1 to 7.
CN202110942420.1A 2021-08-17 2021-08-17 Text matching method, device, equipment and storage medium Active CN113656547B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110942420.1A CN113656547B (en) 2021-08-17 2021-08-17 Text matching method, device, equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113656547A true CN113656547A (en) 2021-11-16
CN113656547B CN113656547B (en) 2023-06-30

Family

ID=78479901

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110942420.1A Active CN113656547B (en) 2021-08-17 2021-08-17 Text matching method, device, equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113656547B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113887192A (en) * 2021-12-06 2022-01-04 阿里巴巴达摩院(杭州)科技有限公司 Text matching method and device and storage medium
CN116108163A (en) * 2023-04-04 2023-05-12 之江实验室 Text matching method, device, equipment and storage medium
CN116383491A (en) * 2023-03-21 2023-07-04 北京百度网讯科技有限公司 Information recommendation method, apparatus, device, storage medium, and program product

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170083623A1 (en) * 2015-09-21 2017-03-23 Qualcomm Incorporated Semantic multisensory embeddings for video search by text
CN109766424A (en) * 2018-12-29 2019-05-17 安徽省泰岳祥升软件有限公司 It is a kind of to read the filter method and device for understanding model training data
CN111209395A (en) * 2019-12-27 2020-05-29 铜陵中科汇联科技有限公司 Short text similarity calculation system and training method thereof
CN111427995A (en) * 2020-02-26 2020-07-17 平安科技(深圳)有限公司 Semantic matching method and device based on internal countermeasure mechanism and storage medium
CN111563387A (en) * 2019-02-12 2020-08-21 阿里巴巴集团控股有限公司 Sentence similarity determining method and device and sentence translation method and device
CN112966073A (en) * 2021-04-07 2021-06-15 华南理工大学 Short text matching method based on semantics and shallow features
CN113239700A (en) * 2021-04-27 2021-08-10 哈尔滨理工大学 Text semantic matching device, system, method and storage medium for improving BERT

Also Published As

Publication number Publication date
CN113656547B (en) 2023-06-30

Similar Documents

Publication Publication Date Title
CN111814466A (en) Information extraction method based on machine reading understanding and related equipment thereof
CN113656547B (en) Text matching method, device, equipment and storage medium
CN113094478B (en) Expression reply method, device, equipment and storage medium
CN113283675B (en) Index data analysis method, device, equipment and storage medium
CN113408268A (en) Slot filling method, device, equipment and storage medium
CN113536770B (en) Text analysis method, device and equipment based on artificial intelligence and storage medium
CN115222443A (en) Client group division method, device, equipment and storage medium
CN114037545A (en) Client recommendation method, device, equipment and storage medium
CN113705468A (en) Digital image identification method based on artificial intelligence and related equipment
CN113268597A (en) Text classification method, device, equipment and storage medium
CN111783425A (en) Intention identification method based on syntactic analysis model and related device
US11481389B2 (en) Generating an executable code based on a document
CN113420545B (en) Abstract generation method, device, equipment and storage medium
CN113627186B (en) Entity relation detection method based on artificial intelligence and related equipment
CN113342977B (en) Invoice image classification method, device, equipment and storage medium
CN113486169B (en) Synonymous statement generation method, device, equipment and storage medium based on BERT model
CN114942749A (en) Development method, device and equipment of approval system and storage medium
CN112989044B (en) Text classification method, device, equipment and storage medium
CN112989820B (en) Legal document positioning method, device, equipment and storage medium
CN113326365A (en) Reply statement generation method, device, equipment and storage medium
CN114842982A (en) Knowledge expression method, device and system for medical information system
CN114757205A (en) Text evaluation method, device, equipment and storage medium
CN113486680A (en) Text translation method, device, equipment and storage medium
CN112632264A (en) Intelligent question and answer method and device, electronic equipment and storage medium
CN113468334B (en) Ciphertext emotion classification method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant