CN114281931A - Text matching method, device, equipment, medium and computer program product - Google Patents

Text matching method, device, equipment, medium and computer program product

Info

Publication number
CN114281931A
Authority
CN
China
Prior art keywords
sample data
matching
distance
model
target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111057181.8A
Other languages
Chinese (zh)
Inventor
张子恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202111057181.8A priority Critical patent/CN114281931A/en
Publication of CN114281931A publication Critical patent/CN114281931A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a text matching method, apparatus, device, medium, and computer program product, and relates to the field of computer technologies. The method comprises the following steps: acquiring training sample data, wherein the training sample data is marked with a sample label and comprises first sample data and second sample data; performing text matching on the training sample data through a text matching model to obtain a predicted matching result; determining a distance loss value based on a difference between the first sample data and the second sample data; determining a matching loss value based on a difference between the sample label and the predicted matching result; and training the text matching model based on the matching loss value and the distance loss value to obtain a target matching model, wherein the target matching model is used for matching target text content to obtain a matching result. Introducing the distance loss value into the training process improves the accuracy of the trained model.

Description

Text matching method, device, equipment, medium and computer program product
Technical Field
The present application relates to the field of computer technologies, and in particular, to a text matching method, apparatus, device, medium, and computer program product.
Background
Text matching is an important basic problem in Natural Language Processing (NLP) and can be applied to a large number of NLP tasks; for example, in medical scenarios, the medical term standardization task, the knowledge graph alignment task, and the medical question-answer matching task all involve text matching. A text matching model is established according to the task, and the parameters of the text matching model are trained with training data, so as to obtain a target model capable of completing the corresponding task.
During model training, a large amount of noise or complex random expressions exist in the training data. For example, in the medical term standardization task, many inputs should not be normalized at all and should be rejected without giving a result; this is the "matching overhang" problem. In the related art, when dealing with the "matching overhang" problem, rejection is generally performed by constructing a classification model in advance, that is, a classification model is placed before the text matching model, and the classification model may be a support vector machine obtained through multiple rounds of training.
However, when the "matching overhang" problem is handled with such a pre-positioned classification model, it is difficult to construct its training data, and the performance of the pre-positioned classification model greatly influences the overall performance of the task: once the classification model makes a classification error, the subsequent text matching model produces a completely wrong result, so that the task performance is degraded and the accuracy of the final model is low.
Disclosure of Invention
The embodiment of the application provides a text matching method, a text matching device, text matching equipment, a text matching medium and a computer program product, and the accuracy of a text matching model can be improved. The technical scheme is as follows:
in one aspect, a text matching method is provided, and the method includes:
acquiring training sample data, wherein the training sample data is marked with a sample label, the training sample data comprises first sample data and second sample data, the first sample data corresponds to a null matching relationship, and the second sample data corresponds to a reference matching relationship;
performing text matching on the training sample data through a text matching model to obtain a predicted matching result;
determining a distance loss value based on a difference between the first sample data and the second sample data;
determining a match loss value based on a difference between the sample label and the predicted match result;
training the text matching model based on the matching loss value and the distance loss value to obtain a target matching model, wherein the target matching model is used for matching target text content to obtain a matching result.
In another aspect, there is provided a text matching apparatus, the apparatus including:
the acquisition module is used for acquiring training sample data, wherein the training sample data is marked with a sample label, the training sample data comprises first sample data and second sample data, the first sample data corresponds to a null matching relationship, and the second sample data corresponds to a reference matching relationship;
the prediction module is used for performing text matching on the training sample data through a text matching model to obtain a prediction matching result;
a determination module to determine a distance loss value based on a difference between the first sample data and the second sample data;
the determination module is further configured to determine a match loss value based on a difference between the sample label and the predicted match result;
and the training module is used for training the text matching model based on the matching loss value and the distance loss value to obtain a target matching model, and the target matching model is used for matching target text contents to obtain a matching result.
In another aspect, a computer device is provided, where the computer device includes a processor and a memory, the memory stores at least one instruction, at least one program, a set of codes, or a set of instructions, and the at least one instruction, the at least one program, the set of codes, or the set of instructions is loaded and executed by the processor to implement the text matching method according to any one of the embodiments of the present application.
In another aspect, a computer-readable storage medium is provided, in which at least one program code is stored, and the program code is loaded and executed by a processor to implement the text matching method described in any of the embodiments of the present application.
In another aspect, a computer program product or computer program is provided, the computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer readable storage medium, and the processor executes the computer instructions to cause the computer device to execute the text matching method in any of the above embodiments.
The technical scheme provided by the application at least comprises the following beneficial effects:
in order to solve the matching overhang problem of data with a null matching relationship in a text matching task, when the text matching model is trained through a loss function, a distance loss value indicating the difference between the first sample data and the second sample data and a matching loss value indicating the difference between the sample label and the predicted matching result are obtained, where the first sample data is sample data with a null matching relationship, and the model parameters of the text matching model are trained according to the matching loss value and the distance loss value to obtain a target matching model capable of completing the text matching task. By adding the distance loss value as a training criterion alongside the matching loss value during model training, the accuracy of the obtained target matching model can be improved; moreover, the method can be applied to various text matching tasks, uniformly reducing the influence of the "matching overhang" problem on text matching models and improving their performance.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a schematic illustration of an implementation environment provided by an exemplary embodiment of the present application;
FIG. 2 is a flow diagram of a text matching method provided by an exemplary embodiment of the present application;
FIG. 3 is a flowchart of a distance loss value acquisition method provided by an exemplary embodiment of the present application;
FIG. 4 is a flow chart of a text matching method provided by another exemplary embodiment of the present application;
FIG. 5 is an architectural diagram of a twin network provided by an exemplary embodiment of the present application;
FIG. 6 is a block diagram of a text matching apparatus according to an exemplary embodiment of the present application;
FIG. 7 is a block diagram of a text matching apparatus according to another exemplary embodiment of the present application;
fig. 8 is a schematic structural diagram of a server according to an exemplary embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application more clear, embodiments of the present application will be described in further detail below with reference to the accompanying drawings.
First, terms referred to in the embodiments of the present application are briefly described:
artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
Artificial intelligence technology is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. Artificial intelligence software technology mainly comprises computer vision, speech processing, natural language processing, machine learning/deep learning, automatic driving, intelligent transportation, and the like.
Natural language processing is an important direction in the fields of computer science and artificial intelligence. It studies various theories and methods that enable effective communication between humans and computers using natural language. Natural language processing is a science integrating linguistics, computer science and mathematics. Research in this field involves natural language, i.e., the language that people use every day, so it is closely related to linguistics. Natural language processing techniques typically include text processing, semantic understanding, machine translation, robot question answering, knowledge graphs, and the like.
Machine Learning (ML) is a multi-domain interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and other disciplines. It specializes in studying how computers simulate or implement human learning behavior in order to acquire new knowledge or skills and reorganize existing knowledge structures to continuously improve their own performance. Machine learning is the core of artificial intelligence, is the fundamental way to make computers intelligent, and is applied in all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.
Text matching: an important basic problem in natural language processing that can be applied to a large number of NLP tasks. NLP tasks that involve matching texts according to preset criteria, such as information retrieval, question answering systems, paraphrase identification, dialogue systems and machine translation, can all be abstracted as text matching. Specifically, in the medical field, hundreds of different ways of writing are often used for the same diagnosis, and the problem to be solved by medical term standardization is to find the corresponding standard medical term expression for each clinical expression, i.e., to match sentences that appear in the clinic with standardized terms.
Matching overhang (dangling mapping): in matching problems, a large amount of noise or complex random expressions exist in the data, which greatly affects many text matching problems. For example, in the medical AI field, for the medical term standardization task a large amount of input should not be normalized and should be rejected without giving a result; for the medical knowledge graph alignment task, a large number of entity nodes do not exist in both graphs at the same time, so that a large number of entities should not have predicted equivalent aligned entities. Such problems are collectively called the "matching overhang" problem.
The text matching method provided by the embodiment of the application is a general matching overhang recognition method based on multi-task learning, can migrate among a plurality of tasks or machine learning models, and uniformly solves the problem of matching overhang so as to improve the performance of each machine learning model.
Referring to fig. 1, a schematic diagram of an implementation environment provided by an exemplary embodiment of the present application is shown. The implementation environment includes: a terminal 110, a server 120 and a communication network 130.
The terminal 110 includes various types of terminal devices such as a mobile phone, a tablet computer, a desktop computer, and a laptop computer. The terminal 110 is configured to provide sample data to the server 120, where the sample data may be training sample data labeled with a sample label or initial sample data that is not labeled. In some embodiments, the terminal 110 is also used to indicate to the server 120 the text matching model or text matching task to be trained.
The server 120 is configured to provide the training function for a text matching model. The server 120 trains the text matching model using the acquired training sample data, according to the text matching model or the text matching task to be trained that is indicated by the terminal 110. After the training of the target matching model is completed, the server 120 may receive target text content sent by the terminal 110, input the target text content into the target matching model to obtain a matching result, and return the matching result to the terminal 110. Alternatively, the server may send the trained target matching model directly to the terminal 110.
It should be noted that the server 120 may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a Content Delivery Network (CDN), a big data and artificial intelligence platform, and the like.
Cloud Technology is a hosting technology that unifies a series of resources such as hardware, software and networks in a wide area network or a local area network to realize calculation, storage, processing and sharing of data. Cloud technology is a general term for the network technology, information technology, integration technology, management platform technology, application technology and the like applied based on the cloud computing business model; it can form a resource pool that is used on demand and is flexible and convenient. Cloud computing technology will become an important support: background services of technical network systems, such as video websites, picture websites and other web portals, require a large amount of computing and storage resources. With the rapid development and application of the internet industry, each item may have its own identification mark that needs to be transmitted to a background system for logical processing; data at different levels are processed separately, and all kinds of industry data need strong system background support, which can only be realized through cloud computing.
In some embodiments, the server 120 described above may also be implemented as a node in a blockchain system. The Blockchain (Blockchain) is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, a consensus mechanism, an encryption algorithm and the like. The block chain, which is essentially a decentralized database, is a string of data blocks associated by using a cryptographic method, and each data block contains information of a batch of network transactions, which is used for verifying the validity (anti-counterfeiting) of the information and generating a next block. The blockchain may include a blockchain underlying platform, a platform product services layer, and an application services layer.
The blockchain underlying platform can comprise processing modules such as user management, basic services, smart contracts and operation monitoring. The user management module is responsible for identity information management of all blockchain participants, including maintenance of public and private key generation (account management), key management, and maintenance of the correspondence between a user's real identity and blockchain address (authority management); when authorized, it also supervises and audits the transactions of certain real identities and provides risk control rule configuration (risk control auditing). The basic service module is deployed on all blockchain node devices and is used to verify the validity of service requests and record valid requests to storage after consensus is completed; for a new service request, the basic service first performs interface adaptation, parsing and authentication (interface adaptation), then encrypts the service information through a consensus algorithm (consensus management), transmits it completely and consistently to the shared ledger (network communication), and records and stores it. The smart contract module is responsible for contract registration and issuance, contract triggering and contract execution; developers can define contract logic through a programming language, publish it to the blockchain (contract registration), and have the contract triggered and executed by keys or other events according to the logic of the contract clauses, while the module also provides functions for upgrading and cancelling contracts. The operation monitoring module is mainly responsible for deployment, configuration modification, contract setting and cloud adaptation during product release, as well as visual output of real-time status during product operation.
Illustratively, the terminal 110 and the server 120 are connected via a communication network 130.
Referring to fig. 2, a text matching method according to an embodiment of the present application is shown, and in the embodiment of the present application, the method is applied to a server shown in fig. 1, and the method includes:
step 201, training sample data is obtained.
The training sample data is used for training the text matching model, and the training sample data is marked with a sample label, wherein the training sample data comprises first sample data and second sample data, the first sample data corresponds to a null matching relationship, and the second sample data corresponds to a reference matching relationship. The empty matching relationship means that the text content serving as sample data does not have any matching relationship with other text contents in the preset matching task, and the reference matching relationship means that the text content serving as sample data has a matching relationship with at least one text content in the other text contents in the preset matching task. Illustratively, when the preset matching task is an information retrieval task, for example, a medical knowledge base retrieval task, the data corresponding to the null matching relationship is index data unrelated to the medical knowledge base, and the data corresponding to the reference matching relationship includes index data in the medical knowledge base and similar data associated with the index data.
In some embodiments, the first sample data is training sample data labeled with a dangling label, and the second sample data is training sample data labeled with a reference label, where the dangling label indicates that the first sample data corresponds to a null matching relationship, and the reference label indicates a reference matching relationship corresponding to the second sample data.
In some embodiments, the training sample data is obtained by labeling the initial sample data with a label. Optionally, the process of annotating the initial sample data may be manually completed, or may be completed through an annotation module in the server. Schematically, acquiring initial sample data; responding to the fact that the matching relation of the initial sample data in the target task is the empty matching relation, and labeling a suspension label for the initial sample data to obtain first sample data; or, in response to that the matching relation of the initial sample data in the target task is a reference matching relation, marking a reference label for the initial sample data to obtain second sample data; the target task is used for indicating a text matching task which needs to be completed by the target matching model; and obtaining training sample data based on the first sample data and the second sample data.
Optionally, the target task includes at least one of a term standardization task, a knowledge graph alignment task, a question-answer matching task, a knowledge base retrieval task, a synonym mining task, and a knowledge graph entity linking task.
The term normalization task is used to indicate matching of sentences and normalization terms in a preset field, for example, in the medical field, there are hundreds of different writing styles about the same diagnosis, and different sentence contents corresponding to the same diagnosis are normalized to correspond to the normalization terms corresponding to the diagnosis, i.e., the medical term normalization task.
The knowledge graph alignment task is used for indicating that entities pointing to the same object should be placed in a matching relationship. A knowledge graph is formed by a plurality of interconnected entities and their attributes; entities are objectively existing and mutually distinguishable real objects, are the basic units of the knowledge graph, and are the important language units that carry information in a text. In other words, a knowledge graph is composed of individual pieces of knowledge, where each piece of knowledge is represented as an SPO triple (Subject-Predicate-Object). Knowledge graph alignment, also called entity alignment, aims to judge whether entities of two or more different knowledge graphs point to the same object in the real world; if a plurality of entities represent the same object, an alignment relationship is constructed between them. For example, "caries" and "tooth decay" both point to the same disease, and thus the two entities are placed in an alignment relationship.
The question-answer matching task is used for indicating that the input question is matched with the candidate answer. The question-answer matching task may be a search task for a preset field, that is, a text matching model that completes the question-answer matching task searches for candidate answers in an information search field corresponding to the preset field according to input question content, and if a target answer matching the input question content can be searched, the corresponding target answer is output.
And the knowledge base retrieval task is used for indicating that the input content is matched with the knowledge content in the preset knowledge base. And the input content and the output knowledge content have semantic matching relationship. Taking a knowledge base in the medical field as an example, when the input content is "tooth decay", the model is output by matching the knowledge content related to the "tooth decay" in the medical knowledge base according to the "tooth decay", for example, the clinical name "caries" corresponding to the "tooth decay" is output, and/or the knowledge content that the "tooth decay" belongs to bacterial diseases is output.
The synonym mining task is used for instructing the model to acquire an output vocabulary whose meaning is similar to that of the input vocabulary. Illustratively, the synonym mining task may also be directed at a preset field. Taking the medical field as an example, the model performs synonym matching on the input vocabulary over the vocabulary corresponding to the medical field and outputs a vocabulary with a similar meaning; for example, when the input vocabulary is "tooth decay", the output vocabulary may be "caries", and the like.
The knowledge graph entity linking task is used for indicating that entity content in a knowledge graph is matched with content in a preset form, where the preset form comprises at least one of a text form, an image form, a video form and a media data form. Entity Linking refers to linking entity elements in a text to entity content in a preset library, where the entity content may be in text, image, video or media data form. For example, if the input entity element is "decayed tooth", the output entity content may be content such as a decayed-tooth picture or a decayed-tooth treatment video stored in the medical library.
Optionally, the server obtains training sample data from the terminal, for example, when the initial sample data completes the labeling process in the terminal, the terminal uploads the training sample data to the server; alternatively, the server acquires training sample data from the database, which is not limited herein.
Step 202, performing text matching on the training sample data through a text matching model to obtain a prediction matching result.
In some embodiments, the text matching model is a model to be trained, and the text matching model is determined according to a target task, that is, a target task is obtained, and the target task is used for indicating a text matching requirement realized by the text matching model; model information corresponding to the text matching model is obtained based on the target task, and the model information comprises at least one of model structure, initial parameters, model loss functions and the like. Illustratively, the terminal provides model information such as a model structure, initial parameters, a model loss function and the like of the text matching model for the server, the server constructs the text matching model according to the model information, and the text matching model is trained through training sample data; or the terminal provides task content corresponding to the target task to the server, the task content indicates a text matching task which needs to be completed by a target matching model obtained by the server training, the server obtains a text matching model capable of realizing the target task from a database, model parameters of the text matching model are initialized randomly, and the text matching model is trained according to obtained training sample data.
Illustratively, training sample data is input into the obtained text matching model, and a corresponding prediction matching result under the current model parameters is output and obtained, wherein the prediction matching result is used for training the current text matching model.
In some embodiments, the text matching model performs feature extraction on the training sample data to obtain a sample feature vector. Illustratively, the text matching model may include one of a Convolutional Neural Network (CNN), a Visual Geometry Group (VGG) network, a Support Vector Machine (SVM) classifier, an Encoder, and a Decoder; the specific network structure is determined by the specific text matching task, which is not limited herein.
Step 203, a distance loss value is determined based on a difference between the first sample data and the second sample data.
In some embodiments, the difference between the first sample data and the second sample data is embodied by a vector distance in a vector space. In order to solve the overhang problem, after a target matching model is obtained through training, the distance between the vector representation of the sample data marked with the overhang label and the vector representations of all other sample data is required to be larger than or equal to a preset vector distance. Schematically, acquiring a preset vector distance corresponding to the first sample data; acquiring difference data between the first sample data and the second sample data; a distance loss value is determined based on the difference data and a preset vector distance.
Optionally, the preset vector distance may be preset by the system, or may be determined according to training sample data. If the preset vector distance is preset by the system, the preset vector distance has an incidence relation with the target task or the text matching model.
In some embodiments, the distance information of the first sample data and the second sample data in the sample set in the vector space is determined as the difference data, i.e. the distance loss value is determined based on the difference between the distance information and a preset vector distance. The sample set includes all other sample data except the first sample data in the training sample data, or the sample set is a sample set obtained by randomly sampling second sample data based on the first sample data. Illustratively, the distance information includes, but is not limited to, at least one of euclidean distance, cosine similarity distance, mahalanobis distance, manhattan distance, and the like.
In this embodiment of the application, the server determines the distance loss value according to the difference between the distance information and the preset vector distance. If the distance loss value converges, the vector representation of the first sample data after feature extraction by the current text matching model is far enough, in the vector space, from the vector representations corresponding to the other second sample data (i.e. the positive samples); the trained target matching model can then correctly recognize input text content, so that no matching operation is performed on text content that should not be matched, which solves the matching overhang problem.
Step 204, a matching loss value is determined based on the difference between the sample label and the predicted matching result.
The matching loss value is determined by a loss function corresponding to the text matching model. Illustratively, a corresponding model loss function is obtained according to the text matching model, the prediction matching result is input into the model loss function, the loss condition between the prediction matching result and the sample label is output, and the text matching model is trained based on the loss condition.
In some embodiments, the model loss function may be provided by the terminal, for example, when uploading model information of a text matching model, the terminal uploads the model loss function together with the model information; alternatively, the model loss function may be obtained from a database, for example, the server reads the corresponding model loss function from the database according to the model identification of the text matching model.
In the embodiment of the present application, a fixed step order is not limited between step 203 and step 204, and step 203 may be executed first, step 204 may be executed first, or step 203 and step 204 may be executed simultaneously.
And step 205, training the text matching model based on the matching loss value and the distance loss value to obtain a target matching model.
The target matching model is used for matching the target text content to obtain a matching result.
In some embodiments, the matching loss value and the distance loss value are subjected to weighted summation according to a preset weight relation to obtain a target loss value; performing iterative training on model parameters of the text matching model based on the target loss value; and responding to the convergence of the target loss value to obtain a target matching model.
Illustratively, the preset weight relationship may be preset by the system, or may be obtained through network parameter tuning. In one example, the preset weight relationship includes β corresponding to the matching loss value and γ corresponding to the distance loss value, where β and γ are hyper-parameters obtained through network parameter tuning, and the target loss value is obtained through the target loss function L_new shown in formula 1, where L_old is the model loss function and L_x is the distance loss function used to obtain the distance loss value.

Formula 1: L_new = β·L_old + γ·L_x
After the target loss value is obtained, the current model parameter of the text matching model can be adjusted according to the target loss value, and the adjustment process of the model parameter through the matching loss value and the distance loss value is repeated until the text matching model is converged, so that the target matching model is obtained.
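For illustration only, the following is a minimal sketch of one such training step in a PyTorch-style setup; the function and argument names (match_loss_fn, distance_loss_fn, beta, gamma) are assumptions used for the example rather than names taken from this disclosure.

```python
def training_step(model, match_loss_fn, distance_loss_fn, optimizer,
                  batch, beta=1.0, gamma=0.1):
    # One training iteration combining the matching loss and the distance
    # loss according to formula 1: L_new = beta * L_old + gamma * L_x.
    first, second, labels = batch["first"], batch["second"], batch["labels"]

    # Step 202: predicted matching result under the current model parameters.
    prediction = model(first, second)

    # Step 204: matching loss between the sample labels and the predictions.
    match_loss = match_loss_fn(prediction, labels)

    # Step 203: distance loss between first (dangling) and second samples.
    distance_loss = distance_loss_fn(model, first, second)

    # Weighted summation into the target loss, then parameter update.
    target_loss = beta * match_loss + gamma * distance_loss
    optimizer.zero_grad()
    target_loss.backward()
    optimizer.step()
    return target_loss.item()
```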
To sum up, in order to solve the matching overhang problem of data with a null matching relationship in a text matching task, when the text matching model is trained through a loss function, a distance loss value indicating the difference between the first sample data and the second sample data and a matching loss value indicating the difference between the sample label and the predicted matching result are obtained, where the first sample data is sample data with a null matching relationship, and the model parameters of the text matching model are trained according to the matching loss value and the distance loss value to obtain a target matching model capable of completing the text matching task. By adding the distance loss value as a training criterion alongside the matching loss value during model training, the accuracy of the obtained target matching model can be improved; moreover, the method can be applied to various text matching tasks, uniformly reducing the influence of the "matching overhang" problem on text matching models and improving their performance.
Referring to fig. 3, a distance loss value obtaining method according to an embodiment of the present application is shown, in the embodiment of the present application, a determination of a distance loss value in a text matching method is schematically illustrated, where the method includes:
step 301, randomly sampling the second sample data based on the first sample data to obtain a sample set.
Illustratively, the sample set includes a target amount of second sample data.
In some embodiments, the number of the sample sets obtained by random sampling may be one or more, and is not limited herein.
Step 302, determining a vector distance between the second sample data and the first sample data in the sample set.
Illustratively, feature extraction is performed on the first sample data and the second sample data respectively through a text matching model to obtain a first vector corresponding to the first sample data and a second vector corresponding to the second sample data, and a vector distance between the first vector and the second vector is obtained.
Step 303, determining the mean value of the vector distances between all the second sample data and the first sample data in the sample set as a preset vector distance.
Illustratively, the preset vector distance is determined by taking the mean of the vector distances between the target number of second vectors and the first vector in the sample set. In one example, the preset vector distance λ_x is obtained through formula 2, where N_x represents the sample set corresponding to the first sample data x, n represents the number of second sample data in the sample set, ‖·‖₂ represents the Euclidean (L2) distance between two feature vectors in the vector space, and W is the non-linear mapping from the first vector x to the second vector y computed in the vector space.

Formula 2: λ_x = (1/n) · Σ_{y∈N_x} ‖W(x) − y‖₂
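A minimal sketch of how the preset vector distance in formula 2 could be computed is given below, assuming PyTorch tensors; the mapping W is passed in as an arbitrary callable and all names are illustrative.

```python
import torch

def preset_vector_distance(x_vec: torch.Tensor,
                           sampled_vecs: torch.Tensor,
                           mapping) -> torch.Tensor:
    # Formula 2: lambda_x is the mean L2 distance between the mapped first
    # sample W(x) and the n randomly sampled second samples in N_x.
    # x_vec: (d,) embedding of the first (dangling) sample x.
    # sampled_vecs: (n, d) embeddings of the sampled set N_x.
    # mapping: the non-linear mapping W into the space of the second vectors.
    mapped = mapping(x_vec)                                        # W(x)
    dists = torch.norm(mapped.unsqueeze(0) - sampled_vecs, dim=1)  # ||W(x) - y||_2
    return dists.mean()
```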
step 304, determining a first Euclidean distance of the first sample data in the vector space.
Illustratively, to determine the distance loss function, it is necessary to determine distance information of the first sample data and the second sample data in the sample set in the vector space. In an embodiment of the present application, the distance information includes a first euclidean distance and a second euclidean distance. And calculating the L2 distance of the first vector in the vector space to obtain the first Euclidean distance. The distance information is described by taking only the euclidean distance as an example, and the distance information may be of a type such as a cosine similarity distance, a mahalanobis distance, and a manhattan distance, and is not limited herein.
Step 305, determining a non-linear mapping of the first sample data in the vector space to the second sample data in the sample set.
A non-linear mapping of the first vector to the second vector is calculated in vector space to find a second euclidean distance.
Step 306, determining a second Euclidean distance of the nonlinear mapping in the vector space.
In this embodiment, the L2 distance in the vector space of the nonlinear mapping of the first vector to the second vector is calculated to obtain the second euclidean distance, and the determined first euclidean distance and the second euclidean distance are determined as the distance information.
Step 307, a distance loss value is determined based on the difference between the distance information and the preset vector distance.
Illustratively, the absolute value of the difference between the preset vector distance and the second Euclidean distance is determined, and the first Euclidean distance and this absolute value are accumulated over the target number of second sample data in the sample set to obtain the distance loss value. In one example, the distance loss value corresponds to the distance loss function L_x shown in formula 3, where N_x represents the sample set corresponding to the first sample data x, n represents the number of second sample data in the sample set, ‖·‖₂ represents the Euclidean (L2) distance between two feature vectors in the vector space, W is the non-linear mapping from the first vector x to the second vector y computed in the vector space, abs() is the absolute value, λ_x is the preset vector distance corresponding to the first sample data x, and α is a hyper-parameter obtained through network parameter tuning.

Formula 3: L_x = Σ_{y∈N_x} ( α·‖x‖₂ + abs(λ_x − ‖W(x) − y‖₂) )
In summary, the distance loss value obtaining method provided in the embodiments of the present application determines the distance loss value through the distance loss function, which by default regards the "overhang" samples (first sample data) as so-called background samples, i.e. the farther they are from the other samples (the positive samples) the better. First, a positive sample set N_x of size n is randomly sampled for one "overhang" sample x, where each sample y should be at a distance of at least λ_x from x, and λ_x is the average distance from all samples in the current randomly sampled set N_x to the "overhang" sample x. In other words, the distance loss function L_x expects that, for any randomly sampled set N_x, the distance from the "overhang" sample x to each sample in the set should be greater than or equal to the average distance. Since multiple different sample sets are randomly sampled for each "overhang" sample x, after training the vector representation of an "overhang" sample is at a distance of at least λ_x from all other sample vector representations, where λ_x > 0.
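For illustration, a possible implementation of the above distance loss is sketched below in PyTorch. Since the exact role of the hyper-parameter α is not fully specified here, it is assumed to weight the norm term of the "overhang" sample; this and the variable names are assumptions rather than part of the disclosure.

```python
import torch

def distance_loss(x_vec: torch.Tensor,
                  sampled_vecs: torch.Tensor,
                  mapping, alpha: float = 0.1) -> torch.Tensor:
    # Distance loss for one "overhang" sample x against a randomly sampled
    # positive set N_x (steps 301-307): for each sampled vector y, accumulate
    # the first Euclidean distance and abs(lambda_x - ||W(x) - y||_2).
    # The weighting of the norm term by alpha is an assumption.
    mapped = mapping(x_vec)                                               # W(x)
    second_dists = torch.norm(mapped.unsqueeze(0) - sampled_vecs, dim=1)  # ||W(x) - y||_2
    lambda_x = second_dists.mean()                                        # preset vector distance (formula 2)
    first_dist = alpha * torch.norm(x_vec)                                # norm of the dangling sample
    return (first_dist + torch.abs(lambda_x - second_dists)).sum()
```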
Referring to fig. 4, a text matching method provided by an exemplary embodiment of the present application is illustrated. In the embodiment of the present application, the method is applied to the medical term standardization task, which requires that a given non-standard query input be mapped to a medical concept in a medical standard system. The method comprises the following steps:
step 401, obtaining training sample data and a term standardization model.
The training sample data is used to train the term normalized model. In the embodiment of the present application, the model architecture of the standardized model uses a twin network architecture, and the network architecture may also be other network architectures, which are not limited herein.
As shown in fig. 5, which shows an architecture diagram of the above twin network for the medical term standardization task, the term standardization model 500 is a twin network with three fully connected layers and shared weights. The input data 501 comprises wordA and wordB. The input data 501 is modeled by a natural language model to obtain semantic word vectors 502, which comprise embedA and embedB, and hand-designed features are extracted from the input data 501 to obtain artificial feature vectors 503, which comprise featA and featB. After the semantic word vectors 502 pass through three dense layers (dense) 510, they are concatenated with the artificial feature vectors 503 at a fully connected layer (concat) 520; the outputs of the two sub-networks (network a and network b) are connected to a hidden layer before the output layer, and the similarity between the two input data 501 is evaluated by a contrastive loss 530.
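For illustration only, below is a minimal PyTorch sketch of a twin network of the kind described above: three shared fully connected layers over the semantic word vectors, concatenation with the hand-crafted feature vectors, and a final layer whose pairwise Euclidean distance feeds the loss. The layer sizes and names are assumptions.

```python
import torch
import torch.nn as nn

class TwinMatcher(nn.Module):
    # Three fully connected layers with shared weights over each semantic word
    # vector; the result is concatenated with the hand-crafted feature vector
    # and projected to a final embedding whose pairwise distance feeds the loss.
    def __init__(self, embed_dim=256, feat_dim=4, hidden_dim=128, out_dim=64):
        super().__init__()
        self.dense = nn.Sequential(            # shared sub-network (network a == network b)
            nn.Linear(embed_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim), nn.ReLU(),
        )
        self.head = nn.Linear(hidden_dim + feat_dim, out_dim)  # after concat with featA/featB

    def embed(self, word_vec, feat_vec):
        h = self.dense(word_vec)
        return self.head(torch.cat([h, feat_vec], dim=-1))

    def forward(self, word_a, feat_a, word_b, feat_b):
        emb_a = self.embed(word_a, feat_a)
        emb_b = self.embed(word_b, feat_b)
        # Euclidean distance d between the two final embeddings.
        return torch.norm(emb_a - emb_b, dim=-1)
```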
In some embodiments, the hand-designed artificial feature vector comprises: (1) the Chinese word edit distance; (2) the Chinese Pinyin edit distance; (3) the number of Chinese characters with the same radicals; (4) the longest common subsequence of the Chinese words. A sketch of how such features could be computed is given below.
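The sketch below computes two of these features, the word edit distance and the longest common subsequence, in plain Python; the Pinyin edit distance and the same-radical count would additionally require a Pinyin or radical dictionary (for example via a package such as pypinyin), which is only noted here as an assumption.

```python
def edit_distance(a: str, b: str) -> int:
    # Levenshtein edit distance between two Chinese strings (feature 1).
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (ca != cb))
    return dp[len(b)]

def longest_common_subsequence(a: str, b: str) -> int:
    # Length of the longest common subsequence of two strings (feature 4).
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, ca in enumerate(a, 1):
        for j, cb in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if ca == cb else max(dp[i - 1][j], dp[i][j - 1])
    return dp[len(a)][len(b)]

def hand_crafted_features(word_a: str, word_b: str) -> list:
    # Illustrative feature vector; the Pinyin and radical features are omitted
    # here because they require external dictionaries.
    return [edit_distance(word_a, word_b),
            longest_common_subsequence(word_a, word_b)]
```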
In the embodiment of the present application, the data format corresponding to the medical term standardization task is [non-standard data, standard term]. Therefore, the data used for training the twin network is the labeled data (positive samples, i.e. the second sample data) and the labeled data with the standard word randomly replaced (negative samples, i.e. the first sample data).
And step 402, performing text matching on the training sample data through the term standardized model to obtain a prediction matching result.
Inputting training sample data into the term standardization model, and outputting a prediction matching result for evaluating the similarity between the input training sample data pairs, wherein the prediction matching result is obtained by prediction based on the model parameters corresponding to the current term standardization model.
And step 403, obtaining a model loss function corresponding to the term standardized model.
In the embodiment of the present application, the loss function L of the twin network during training is shown in formula 4, where N is the number of all sample pairs, y is the training label, d is the Euclidean distance between the final embeddings of the two input samples, and margin is a predefined threshold (usually set to 0.5).

Formula 4: L = (1/(2N)) · Σ_{i=1}^{N} [ y_i·d_i² + (1 − y_i)·max(margin − d_i, 0)² ]
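A minimal PyTorch sketch of the contrastive loss in formula 4, assuming d holds the Euclidean distances between the final embeddings of a batch of sample pairs and y holds the corresponding 0/1 training labels:

```python
import torch

def contrastive_loss(d: torch.Tensor, y: torch.Tensor, margin: float = 0.5) -> torch.Tensor:
    # Formula 4: mean over sample pairs of y*d^2 + (1-y)*max(margin - d, 0)^2,
    # scaled by 1/2.
    positive = y * d.pow(2)
    negative = (1 - y) * torch.clamp(margin - d, min=0).pow(2)
    return 0.5 * (positive + negative).mean()
```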
and step 404, determining a matching loss value corresponding to the predicted matching result based on the model loss function.
And inputting the predicted matching result into the model loss function, and obtaining a matching loss value corresponding to the predicted matching result, wherein for the model loss function, the form of the predicted matching result in the embodiment of the application is the final embedding of the input sample.
At step 405, a distance loss value is determined based on a difference between the first sample data and the second sample data.
In the embodiment of the present application, in combination with the distance loss value obtaining method given in steps 301 to 307, a distance loss function for measuring the difference between positive and negative samples is determined, and the distance loss function is shown in formula 5, where I is an identity matrix, N_x represents the sample set corresponding to the first sample data x, n represents the number of second sample data in the sample set, ‖·‖₂ represents the Euclidean (L2) distance between two feature vectors in the vector space, abs() is the absolute value, λ_x is the preset vector distance corresponding to the first sample data x as given by formula 6, and α is a hyper-parameter obtained through network parameter tuning.

Formula 5: L_x = Σ_{y∈N_x} ( α·‖x‖₂ + abs(λ_x − ‖I·x − y‖₂) )

Formula 6: λ_x = (1/n) · Σ_{y∈N_x} ‖I·x − y‖₂
and 406, training the term standardization model based on the matching loss value and the distance loss value to obtain a target standardization model.
The target standardization model is used for matching input data pairs to obtain a standardization result.
In the embodiment of the present application, the content of training the model by combining the matching loss value and the distance loss value is the same as that in step 205, and is not described herein again.
In one example, to evaluate the above medical term standardization task, a large amount of hospital outpatient medical record data was manually labeled by medical experts, yielding 253 valid items and 400 items of data that should not be normalized (i.e. data with the "matching overhang" problem). Illustratively, the evaluation index employed is the exact match, i.e. a prediction is counted as correct if and only if all of the correct standard words are output by the engine. Table 1 shows the evaluation results. It can be seen from the table that the text matching method provided in the present application enables the optimized engine to achieve a better normalization effect on diagnostic data from real scenarios than the original engine; the improvement is especially obvious on the test data with the "matching overhang" problem, which shows that the text matching method provided by the present application is effective in solving the "matching overhang" problem.
Table 1: evaluation results of the original engine and the optimized engine (reproduced as an image in the original publication).
To sum up, in order to solve the matching overhang problem existing in the medical term standardization task, i.e. that certain inputs should not be standardized, when the term standardization model is trained through a loss function, a distance loss value is determined based on the difference between the first sample data and the second sample data, a matching loss value of the term standardization model is determined based on the difference between the sample label of the training sample data and the predicted matching result, and the model parameters of the term standardization model are trained according to the matching loss value and the distance loss value to obtain a target standardization model capable of completing the medical term standardization task. By adding the distance loss value as a training criterion alongside the matching loss value during model training, the precision of the obtained term standardization model can be improved and its performance enhanced.
In one example, the text matching method provided by the embodiment of the application can also be applied to a knowledge graph alignment task. Taking MultiKE as an example, MultiKE is a deep-learning-based knowledge graph entity alignment method that mainly uses multi-view learning to learn a comprehensive vector embedding based on multiple views, namely the name view, the relation view and the attribute view. Schematically, the three views are described as follows:
(1) Name view: word embeddings of words are first obtained through a pre-trained word embedding model, and character embeddings are obtained through a pre-trained Skip-Gram model, as shown in formula 7, where LP converts characters or single words into embeddings.
Formula 7: LP(o) = the pre-trained word embedding of o if o is a word covered by the word embedding model; otherwise, the character embedding of o obtained from the pre-trained Skip-Gram model.
A final embedded representation of the name input is then obtained through formula 8, where n is a positive integer, l is the input, the encoder is obtained by training an autoencoder, and [ ; ] represents a concatenation (splicing) operation.

Formula 8: Φ(l) = encoder([LP(o_1); LP(o_2); …; LP(o_n)])
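To make formula 8 concrete, the following small PyTorch sketch concatenates the literal embeddings LP(o_i) of the name tokens and passes them through the encoder half of a trained autoencoder; the encoder architecture and all names are assumptions.

```python
import torch
import torch.nn as nn

def name_view_embedding(token_embeddings, encoder: nn.Module) -> torch.Tensor:
    # Formula 8: Phi(l) = encoder([LP(o_1); LP(o_2); ...; LP(o_n)]).
    # token_embeddings: list of n tensors of shape (d,) produced by LP;
    # encoder: the encoder half of a trained autoencoder.
    concatenated = torch.cat(token_embeddings, dim=-1)  # splicing operation [;]
    return encoder(concatenated)
```

For instance, the encoder could be as simple as nn.Sequential(nn.Linear(n * d, d), nn.Tanh()), trained as one half of an autoencoder; this choice is an assumption.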
The entity name is passed through the module corresponding to formula 8 to obtain the name-based embedded representation of the entity, which is expressed by formula 9.

Formula 9: h^(1) = Φ(name(h))
(2) Relation view: in order to preserve the relational structure of the knowledge graph, a TransE model is adopted, in which the relation is used as a translation vector between the head entity and the tail entity, as expressed by formula 10, where ‖·‖ represents the L1 or L2 norm. An objective function can be constructed from formula 10 (e.g. using sigmoid as the activation function and a logistic loss). Through training, the resulting entity and relation embeddings make ‖h^(2) + r − t^(2)‖ as small as possible.

Formula 10: f_rel(h^(2), r, t^(2)) = −‖h^(2) + r − t^(2)‖
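A minimal sketch of the TransE-style relation-view score in formula 10, written in PyTorch; the choice of the L2 norm is an assumption, since the text allows either the L1 or the L2 norm.

```python
import torch

def relation_score(h: torch.Tensor, r: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
    # Formula 10: f_rel(h, r, t) = -||h + r - t||; training makes ||h + r - t||
    # small for observed relation triples, so higher scores mean more plausible triples.
    return -torch.norm(h + r - t, p=2, dim=-1)
```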
(3) Attribute view: for attributes, a convolutional neural network is used to extract entity features from the attributes and attribute values, as shown in formula 11, where <a; v> represents an attribute-value pair of the entity, W is a transformation matrix, Ω is a convolution kernel, and σ is an activation function.

Formula 11: CNN(<a; v>) = σ(vec(σ(<a; v> ∗ Ω))·W)
The goal of the CNN is to make its output as close as possible to the embedded representation of the entity, as shown in formula 12; the goal of minimizing ‖h^(3) − CNN(<a; v>)‖ can be achieved by constructing a corresponding loss function.

Formula 12: f_attr(h^(3), a, v) = −‖h^(3) − CNN(<a; v>)‖
After the steps corresponding to the three views are carried out, multiple embedded representations (h^(1), h^(2), h^(3)) of an entity are obtained. In addition to the above objectives, these embedded representations must also satisfy objectives defined on the training set (derived from the output of PARIS). For a relation triple (h, r, t) in the knowledge graph G, (h, r, t′) also appears with a very high probability in G′ if t is known to be aligned with t′. Accordingly, taking the relation view as an example, an alignment loss function, shown in formula 13 below, can be constructed with the goal of maximizing the probability that the triple (h, r, t′) holds.
Formula 13: the cross-graph alignment loss over relation triples, which maximizes the probability that the triple (h, r, t′) holds (the formula is reproduced as an image in the original publication).
similarly, a similar alignment loss function can be constructed and optimized on the property view.
The above content is the loss function used in training a standard MultiKE model. In the embodiment of the present application, to deal with the "matching overhang" problem, the text matching method provided in the embodiment of the present application is introduced into the training framework of the whole model, that is, the distance loss function given by formula 3 is added to the loss function of the training model.
In a complete training process, the name-view embeddings are first obtained through the pre-trained word/character embeddings and by training the autoencoder. In each training epoch, the objective function of the relation view is minimized first, then the objective function of the attribute view, and finally the newly introduced distance loss function corresponding to the "matching overhang" problem. The alignment loss functions of the relation view and the attribute view are then minimized according to the training data. After training is complete, the embedded representations of the multiple views of an entity are integrated to obtain a more comprehensive embedded representation of the entity. Then, according to a nearest neighbor search algorithm, for each unaligned entity, its nearest neighbor in the vector space is judged to be its aligned counterpart entity.
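A sketch of the final nearest-neighbor search step described above, assuming the integrated entity embeddings of the two knowledge graphs are provided as matrices; names are illustrative.

```python
import torch

def nearest_neighbor_alignment(source_embeds: torch.Tensor,
                               target_embeds: torch.Tensor) -> torch.Tensor:
    # For each unaligned source entity embedding, return the index of its
    # nearest neighbor among the target entity embeddings in vector space.
    # source_embeds: (m, d); target_embeds: (k, d).
    distances = torch.cdist(source_embeds, target_embeds)  # (m, k) pairwise L2 distances
    return distances.argmin(dim=1)
```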
Referring to fig. 6, a block diagram of a text matching apparatus according to an exemplary embodiment of the present application is shown, where the apparatus includes the following modules:
an obtaining module 610, configured to obtain training sample data, where the training sample data is labeled with a sample label, where the training sample data includes first sample data and second sample data, the first sample data corresponds to a null matching relationship, and the second sample data corresponds to a reference matching relationship;
the prediction module 620 is configured to perform text matching on the training sample data through a text matching model to obtain a prediction matching result;
a determining module 630 for determining a distance loss value based on a difference between the first sample data and the second sample data;
the determining module 630 is further configured to determine a match loss value based on a difference between the sample label and the predicted match result;
the training module 640 is configured to train the text matching model based on the matching loss value and the distance loss value to obtain a target matching model, where the target matching model is configured to match target text contents to obtain a matching result.
In an alternative embodiment, as shown in fig. 7, the determining module 630 further includes:
an obtaining unit 631 configured to obtain a preset vector distance corresponding to the first sample data;
the obtaining unit 631 is further configured to obtain difference data between the first sample data and the second sample data;
a determining unit 632, configured to determine the distance loss value based on the difference data and the preset vector distance.
In an optional embodiment, the determining module 630 further includes:
a sampling unit 633, configured to perform random sampling on the second sample data based on the first sample data, so as to obtain a sample set;
the determining unit 632, further configured to determine a vector distance between a second sample data in the sample set and the first sample data;
the determining unit 632 is further configured to determine a mean value of vector distances between all second sample data in the sample set and the first sample data as the preset vector distance.
In an optional embodiment, the determining unit 632 is further configured to determine, as the difference data, distance information of the first sample data and the second sample data in the sample set in a vector space;
the determining unit 632 is further configured to determine the distance loss value based on a difference between the distance information and the preset vector distance.
In an optional embodiment, the determining unit 632 is further configured to determine a first euclidean distance of the first sample data in the vector space;
the determining unit 632, further configured to determine a non-linear mapping of the first sample data in vector space to a second sample data in the sample set;
the determining unit 632 is further configured to determine a second euclidean distance of the non-linear mapping in the vector space;
the determining unit 632 is further configured to determine the difference data according to the first euclidean distance and the second euclidean distance.
In an optional embodiment, the set of samples includes a target amount of second sample data;
the determining unit 632 is further configured to determine an absolute value of a difference between the preset vector distance and the second euclidean distance;
the determining unit 632 is further configured to accumulate the first Euclidean distance and the absolute values over the target number of second sample data in the sample set to obtain the distance loss value.
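The operations of units 631 and 632 can be read as computing something like the following rough sketch. It is only an illustration: in particular, the reading of the "first Euclidean distance" as the norm of the dangling sample's embedding and the form of the non-linear mapping are assumptions, not definitions from the disclosure.

```python
import numpy as np

def distance_loss(x_dangling, sampled_refs, mapping):
    """x_dangling  : embedding of the first sample data (null matching relationship)
    sampled_refs   : embeddings of randomly sampled second sample data (the sample set)
    mapping        : assumed non-linear mapping between the dangling sample and each
                     reference sample, with signature mapping(x, ref) -> vector"""
    # Preset vector distance: mean vector distance between the dangling sample
    # and the sampled references.
    preset = np.mean([np.linalg.norm(x_dangling - ref) for ref in sampled_refs])

    loss = 0.0
    for ref in sampled_refs:                           # target number of second samples
        d1 = np.linalg.norm(x_dangling)                # one reading of the "first Euclidean distance"
        d2 = np.linalg.norm(mapping(x_dangling, ref))  # "second Euclidean distance" of the mapping
        loss += d1 + abs(preset - d2)                  # accumulate to form the distance loss value
    return loss
```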
In an optional embodiment, the obtaining module 610 is further configured to obtain initial sample data;
the device further comprises:
a labeling module 650, configured to perform label tagging on a dangling label for the initial sample data to obtain the first sample data in response to that a matching relationship of the initial sample data in a target task is the null matching relationship; or, in response to that the matching relationship of the initial sample data in the target task is the reference matching relationship, performing reference label labeling on the initial sample data to obtain the second sample data; the target task is used for indicating a text matching task which needs to be completed by the target matching model;
the obtaining module 610 is further configured to obtain the training sample data based on the first sample data and the second sample data.
In an optional embodiment, the target task comprises at least one of a medical term standardization task, a knowledge graph alignment task, a medical question and answer matching task, a medical knowledge base retrieval task, a medical synonym mining task, and a knowledge graph entity chain referring task;
wherein the term standardization task is used for indicating that sentences in a preset field are matched with standardized terms; the knowledge graph alignment task is used for indicating that entities pointing to the same object construct a matching relationship; the question-answer matching task is used for indicating that an input question is matched with a candidate answer; the knowledge base retrieval task is used for indicating that input content is matched with knowledge content in a preset knowledge base; the synonym mining task is used for indicating that an output vocabulary with meaning characteristics similar to the input vocabulary is obtained; the knowledge graph entity chain referring task is used for indicating that entity content in a knowledge graph is matched with content in a preset form, wherein the preset form comprises at least one of a text form, an image form, a video form and a media data form.
In an optional embodiment, the training module 640 further includes:
the calculating unit 641 is configured to perform weighted summation on the matching loss value and the distance loss value according to a preset weight relationship to obtain a target loss value;
a training unit 642, configured to perform iterative training on model parameters of the text matching model based on the target loss value;
a determining unit 643, configured to obtain the target matching model in response to convergence of the target loss value.
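In code form, the weighting and convergence check performed by units 641 to 643 might look like the following sketch; the weights, tolerance, and function names are placeholders rather than values from the disclosure.

```python
import numpy as np

def target_loss(match_loss, dist_loss, w_match=1.0, w_dist=1.0):
    # Weighted summation of the matching loss and the distance loss
    # according to a preset weight relationship.
    return w_match * match_loss + w_dist * dist_loss

def train_until_converged(step_fn, tol=1e-4, max_iters=1000):
    """step_fn runs one iteration of parameter updates and returns the current
    target loss value; training stops once that value stops improving."""
    prev = np.inf
    cur = prev
    for _ in range(max_iters):
        cur = step_fn()
        if abs(prev - cur) < tol:     # target loss value has converged
            break
        prev = cur
    return cur
```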
In an optional embodiment, the obtaining module 610 is further configured to obtain a target task, where the target task is used to indicate a text matching requirement implemented by the text matching model;
the determining module 630 is further configured to obtain model information corresponding to the text matching model based on the target task, where the model information includes at least one of a model structure, an initial parameter, a model loss function, and the like.
To sum up, in order to solve the "matching overhang" (dangling matching) problem of data having a null matching relationship in a text matching task, when the text matching model is trained through a loss function, a distance loss value indicating the difference between the first sample data and the second sample data and a matching loss value indicating the difference between the sample label and the predicted matching result are obtained, where the first sample data is sample data having the null matching relationship. The model parameters of the text matching model are then trained according to the matching loss value and the distance loss value to obtain a target matching model capable of completing the text matching task. By adding the distance loss value as an additional training criterion, the accuracy of the obtained target matching model is improved; meanwhile, the method can be applied to various text matching tasks, uniformly reducing the influence of the "matching overhang" problem on text matching models and improving their performance.
It should be noted that: the text matching apparatus provided in the foregoing embodiment is only illustrated by dividing the functional modules, and in practical applications, the functions may be distributed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules, so as to complete all or part of the functions described above. In addition, the text matching device and the text matching method provided by the above embodiments belong to the same concept, and specific implementation processes thereof are described in the method embodiments in detail and are not described herein again.
Fig. 8 shows a schematic structural diagram of a server according to an exemplary embodiment of the present application. Specifically, the structure includes the following.
The server 800 includes a Central Processing Unit (CPU) 801, a system Memory 804 including a Random Access Memory (RAM) 802 and a Read Only Memory (ROM) 803, and a system bus 805 connecting the system Memory 804 and the CPU 801. The server 800 also includes a mass storage device 806 for storing an operating system 813, application programs 814, and other program modules 815.
The mass storage device 806 is connected to the central processing unit 801 through a mass storage controller (not shown) connected to the system bus 805. The mass storage device 806 and its associated computer-readable media provide non-volatile storage for the server 800. That is, the mass storage device 806 may include a computer-readable medium (not shown) such as a hard disk or a Compact Disc Read-Only Memory (CD-ROM) drive.
Without loss of generality, computer readable media may comprise computer storage media and communication media. Computer storage media includes volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data. Computer storage media includes RAM, ROM, Erasable Programmable Read-Only Memory (EPROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), flash Memory or other solid state Memory technology, CD-ROM, Digital Versatile Disks (DVD), or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage, or other magnetic storage devices. Of course, those skilled in the art will appreciate that computer storage media is not limited to the foregoing. The system memory 804 and mass storage device 806 as described above may be collectively referred to as memory.
According to various embodiments of the present application, the server 800 may also operate through a remote computer connected to a network, such as the Internet. That is, the server 800 may be connected to the network 812 through the network interface unit 811 coupled to the system bus 805, or may be connected to other types of networks or remote computer systems (not shown) using the network interface unit 811.
The memory further includes one or more programs, and the one or more programs are stored in the memory and configured to be executed by the CPU.
Embodiments of the present application further provide a computer device, which includes a processor and a memory, where at least one instruction, at least one program, a set of codes, or a set of instructions is stored in the memory, and the at least one instruction, the at least one program, the set of codes, or the set of instructions is loaded and executed by the processor to implement the text matching method provided by the above-mentioned method embodiments. Optionally, the computer device may be a terminal or a server.
Embodiments of the present application further provide a computer-readable storage medium having at least one instruction, at least one program, code set, or instruction set stored thereon, loaded and executed by a processor, to implement the text matching method provided by the above method embodiments.
Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the text matching method described in any of the above embodiments.
Optionally, the computer-readable storage medium may include: a Read Only Memory (ROM), a Random Access Memory (RAM), a Solid State Drive (SSD), or an optical disc. The Random Access Memory may include a resistive Random Access Memory (ReRAM) and a Dynamic Random Access Memory (DRAM). The above-mentioned serial numbers of the embodiments of the present application are merely for description and do not represent the merits of the embodiments.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (15)

1. A method of text matching, the method comprising:
acquiring training sample data, wherein the training sample data is marked with a sample label, the training sample data comprises first sample data and second sample data, the first sample data corresponds to a null matching relationship, and the second sample data corresponds to a reference matching relationship;
performing text matching on the training sample data through a text matching model to obtain a predicted matching result;
determining a distance loss value based on a difference between the first sample data and the second sample data;
determining a match loss value based on a difference between the sample label and the predicted match result;
training the text matching model based on the matching loss value and the distance loss value to obtain a target matching model, wherein the target matching model is used for matching target text content to obtain a matching result.
2. The method of claim 1, wherein said determining a distance loss value based on a difference between said first sample data and said second sample data comprises:
acquiring a preset vector distance corresponding to the first sample data;
obtaining difference data between the first sample data and the second sample data;
determining the distance loss value based on the difference data and the preset vector distance.
3. The method of claim 2, wherein the obtaining the preset vector distance corresponding to the first sample data comprises:
randomly sampling the second sample data based on the first sample data to obtain a sample set;
determining a vector distance between second sample data in the set of samples and the first sample data;
determining the preset vector distance as the mean of the vector distances between all second sample data in the sample set and the first sample data.
4. The method of claim 3, wherein said obtaining difference data between said first sample data and said second sample data comprises:
determining distance information in vector space between the first sample data and the second sample data in the sample set as the difference data.
5. The method of claim 4, wherein said determining distance information in vector space between said first sample data and said second sample data in said set of samples as said difference data comprises:
determining a first Euclidean distance of the first sample data within the vector space;
determining a non-linear mapping of the first sample data within a vector space to second sample data in the sample set;
determining a second Euclidean distance of the non-linear mapping within the vector space;
and determining the difference data according to the first Euclidean distance and the second Euclidean distance.
6. The method of claim 5, wherein the set of samples includes a target amount of second sample data;
said determining said distance loss value based on said difference data and said preset vector distance comprises:
determining an absolute value of a difference between the preset vector distance and the second Euclidean distance;
and accumulating the first Euclidean distance and the absolute values over the target number of second sample data in the sample set to obtain the distance loss value.
7. The method according to any one of claims 1 to 6, wherein said obtaining training sample data comprises:
acquiring initial sample data;
responding to the fact that the matching relation of the initial sample data in the target task is the empty matching relation, and labeling a dangling label for the initial sample data to obtain first sample data; or, in response to that the matching relationship of the initial sample data in the target task is the reference matching relationship, performing reference label labeling on the initial sample data to obtain the second sample data; the target task is used for indicating a text matching task which needs to be completed by the target matching model;
and obtaining the training sample data based on the first sample data and the second sample data.
8. The method of claim 7,
the target task comprises at least one of a term standardization task, a knowledge graph alignment task, a question and answer matching task, a knowledge base retrieval task, a synonym mining task and a knowledge graph entity chain referring task;
wherein the term standardization task is used for indicating that sentences in a preset field are matched with standardized terms; the knowledge graph alignment task is used for indicating that entities pointing to the same object construct a matching relationship; the question-answer matching task is used for indicating that an input question is matched with a candidate answer; the knowledge base retrieval task is used for indicating that input content is matched with knowledge content in a preset knowledge base; the synonym mining task is used for indicating that an output vocabulary with meaning characteristics similar to the input vocabulary is obtained; the knowledge graph entity chain referring task is used for indicating that entity content in a knowledge graph is matched with content in a preset form, wherein the preset form comprises at least one of a text form, an image form, a video form and a media data form.
9. The method of any one of claims 1 to 6, wherein training the text matching model based on the matching loss value and the distance loss value to obtain a target matching model comprises:
carrying out weighted summation on the matching loss value and the distance loss value according to a preset weight relation to obtain a target loss value;
performing iterative training on model parameters of the text matching model based on the target loss value;
and responding to the convergence of the target loss value to obtain the target matching model.
10. The method according to any one of claims 1 to 6, wherein before said obtaining training sample data, further comprising:
acquiring a target task, wherein the target task is used for indicating a text matching requirement realized by the text matching model;
and obtaining model information corresponding to the text matching model based on the target task, wherein the model information comprises at least one of model structure, initial parameters, model loss functions and the like.
11. A text matching apparatus, characterized in that the apparatus comprises:
the acquisition module is used for acquiring training sample data, wherein the training sample data is marked with a sample label, the training sample data comprises first sample data and second sample data, the first sample data corresponds to a null matching relationship, and the second sample data corresponds to a reference matching relationship;
the prediction module is used for performing text matching on the training sample data through a text matching model to obtain a prediction matching result;
a determination module to determine a distance loss value based on a difference between the first sample data and the second sample data;
the determination module is further configured to determine a match loss value based on a difference between the sample label and the predicted match result;
and the training module is used for training the text matching model based on the matching loss value and the distance loss value to obtain a target matching model, and the target matching model is used for matching target text contents to obtain a matching result.
12. The apparatus of claim 11, wherein the determining module further comprises:
an obtaining unit, configured to obtain a preset vector distance corresponding to the first sample data;
the obtaining unit is further configured to obtain difference data between the first sample data and the second sample data;
a determining unit for determining the distance loss value based on the difference data and the preset vector distance.
13. A computer device comprising a processor and a memory, the memory having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions, the at least one instruction, the at least one program, the set of codes, or the set of instructions being loaded and executed by the processor to implement a text matching method according to any one of claims 1 to 10.
14. A computer-readable storage medium, having at least one program code stored therein, the program code being loaded and executed by a processor to implement the text matching method according to any one of claims 1 to 10.
15. A computer program product comprising a computer program/instructions, characterized in that the computer program/instructions are stored in a computer-readable storage medium; a processor of a computer device reads the computer program/instructions from the computer-readable storage medium and executes them to cause the computer device to implement the text matching method according to any one of claims 1 to 10.
CN202111057181.8A 2021-09-09 2021-09-09 Text matching method, device, equipment, medium and computer program product Pending CN114281931A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111057181.8A CN114281931A (en) 2021-09-09 2021-09-09 Text matching method, device, equipment, medium and computer program product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111057181.8A CN114281931A (en) 2021-09-09 2021-09-09 Text matching method, device, equipment, medium and computer program product

Publications (1)

Publication Number Publication Date
CN114281931A true CN114281931A (en) 2022-04-05

Family

ID=80868522

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111057181.8A Pending CN114281931A (en) 2021-09-09 2021-09-09 Text matching method, device, equipment, medium and computer program product

Country Status (1)

Country Link
CN (1) CN114281931A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116167336A (en) * 2023-04-22 2023-05-26 拓普思传感器(太仓)有限公司 Sensor data processing method based on cloud computing, cloud server and medium
CN116167336B (en) * 2023-04-22 2023-07-07 拓普思传感器(太仓)有限公司 Sensor data processing method based on cloud computing, cloud server and medium
CN116910223A (en) * 2023-08-09 2023-10-20 北京安联通科技有限公司 Intelligent question-answering data processing system based on pre-training model
CN116910223B (en) * 2023-08-09 2024-06-11 北京安联通科技有限公司 Intelligent question-answering data processing system based on pre-training model
CN116860312A (en) * 2023-09-05 2023-10-10 成都智慧锦城大数据有限公司 Program abnormal text information maintenance method, device and storage medium
CN116860312B (en) * 2023-09-05 2023-11-07 成都智慧锦城大数据有限公司 Program abnormal text information maintenance method, device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination