CN116050405A - Text processing, question-answer text processing and text processing model training method


Info

Publication number
CN116050405A
Authority
CN
China
Prior art keywords
text
sample
training sample
training
prompt
Prior art date
Legal status
Pending
Application number
CN202211674486.8A
Other languages
Chinese (zh)
Inventor
戴翼 (Dai Yi)
郎皓 (Lang Hao)
李永彬 (Li Yongbin)
Current Assignee
Alibaba China Co Ltd
Original Assignee
Alibaba China Co Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba China Co Ltd filed Critical Alibaba China Co Ltd
Priority to CN202211674486.8A priority Critical patent/CN116050405A/en
Publication of CN116050405A publication Critical patent/CN116050405A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/279 Recognition of textual entities
    • G06F 40/289 Phrasal analysis, e.g. finite state techniques or chunking


Abstract

Embodiments of this specification provide a text processing method, a question-answer text processing method, and a text processing model training method. The text processing method includes: performing feature extraction on a text to be processed to obtain text features of the text to be processed; matching, using the text features, a plurality of target prompt messages corresponding to the text to be processed, wherein the plurality of target prompt messages have different granularities; and inputting the text features and the plurality of target prompt messages into a text processing model, and obtaining a text processing result through processing by the text processing model, wherein the text processing model is trained based on a plurality of training sample texts and training sample prompt information corresponding to each training sample text. Matching a plurality of target prompt messages to the text to be processed better guides the text processing model in processing the text; and because the target prompt messages differ in granularity, the knowledge acquired by the text processing model differs in granularity as well, further improving the accuracy of the text processing result.

Description

Text processing, question-answer text processing and text processing model training method
Technical Field
The embodiments of this specification relate to the field of computer technology, and in particular to a text processing method. One or more embodiments of this specification further relate to a question-answer text processing method, a text processing model training method, a text processing apparatus, a question-answer text processing apparatus, a text processing model training apparatus, a computing device, a computer-readable storage medium, and a computer program.
Background
With the development of computer technology, more and more work and learning tasks can be processed automatically, and computer technology is gradually being applied to everyday education and learning activities; for example, processing text with computer technology greatly saves human resources.
At present, over the life cycle of a deployed text processing system, the system must continuously learn new tasks and adapt to changes in its deployment environment, which gives rise to a lifelong-learning requirement. Under this requirement, text processing systems often suffer from problems such as forgetting old tasks (catastrophic forgetting), unknown task identity at test time, and an inability to share knowledge at fine granularity, all of which degrade text processing accuracy. A highly accurate text processing scheme is therefore needed.
Disclosure of Invention
In view of this, the embodiments of this specification provide a text processing method. One or more embodiments of this specification further relate to a question-answer text processing method, a text processing model training method, a text processing apparatus, a question-answer text processing apparatus, a text processing model training apparatus, a computing device, a computer-readable storage medium, and a computer program, so as to solve the technical drawbacks of the prior art.
According to a first aspect of embodiments of the present specification, there is provided a text processing method, including:
performing feature extraction on a text to be processed to obtain text features of the text to be processed;
matching, using the text features, a plurality of target prompt messages corresponding to the text to be processed, wherein the plurality of target prompt messages have different granularities;
inputting the text features and the plurality of target prompt messages into a text processing model, and obtaining a text processing result through processing by the text processing model, wherein the text processing model is trained based on a plurality of training sample texts and training sample prompt information corresponding to each training sample text.
According to a second aspect of embodiments of the present specification, there is provided a question-answer text processing method, including:
performing feature extraction on a question-answer text to be processed to obtain question-answer text features of the question-answer text to be processed;
matching, using the question-answer text features, a plurality of target question-answer prompt messages corresponding to the question-answer text to be processed, wherein the plurality of target question-answer prompt messages have different granularities;
inputting the question-answer text features and the plurality of target question-answer prompt messages into a text processing model, and obtaining a question-answer text processing result through processing by the text processing model, wherein the text processing model is trained based on a plurality of training sample texts and training sample prompt information corresponding to each training sample text.
According to a third aspect of embodiments of the present disclosure, there is provided a text processing model training method applied to cloud-side equipment, the method including:
acquiring a second sample set, wherein the second sample set comprises a plurality of training sample texts, and the training sample texts carry training sample prompt information and sample processing results;
performing feature extraction on the plurality of training sample texts to obtain training text features of the plurality of training sample texts;
matching, using the training text features, a plurality of training sample prompt messages corresponding to each training sample text, wherein the training sample prompt information comprises training sample global prompt information, training sample form prompt information, training sample task prompt information, and training sample meta prompt information;
training an initial processing model according to the training text features of the plurality of training sample texts and the plurality of training sample prompt messages corresponding to each training sample text, to obtain model parameters of the trained text processing model;
and sending the model parameters of the trained text processing model to the end-side device.
According to a fourth aspect of embodiments of the present specification, there is provided a text processing apparatus comprising:
a first extraction module configured to perform feature extraction on a text to be processed to obtain text features of the text to be processed;
a first matching module configured to match, using the text features, a plurality of target prompt messages corresponding to the text to be processed, wherein the plurality of target prompt messages have different granularities;
a first obtaining module configured to input the text features and the plurality of target prompt messages into a text processing model and obtain a text processing result through processing by the text processing model, wherein the text processing model is trained based on a plurality of training sample texts and training sample prompt information corresponding to each training sample text.
According to a fifth aspect of embodiments of the present specification, there is provided a question-answering text processing apparatus, including:
a second extraction module configured to perform feature extraction on a question-answer text to be processed to obtain question-answer text features of the question-answer text to be processed;
a second matching module configured to match, using the question-answer text features, a plurality of target question-answer prompt messages corresponding to the question-answer text to be processed, wherein the plurality of target question-answer prompt messages have different granularities;
a second obtaining module configured to input the question-answer text features and the plurality of target question-answer prompt messages into the text processing model and obtain a question-answer text processing result through processing by the text processing model, wherein the text processing model is trained based on a plurality of training sample texts and training sample prompt information corresponding to each training sample text.
According to a sixth aspect of embodiments of the present specification, there is provided a text processing model training apparatus applied to cloud-side equipment, the apparatus including:
an acquisition module configured to acquire a second sample set, wherein the second sample set comprises a plurality of training sample texts, and the training sample texts carry training sample prompt information and sample processing results;
a third extraction module configured to perform feature extraction on the plurality of training sample texts to obtain training text features of the plurality of training sample texts;
a third matching module configured to match, using the training text features, a plurality of training sample prompt messages corresponding to each training sample text, wherein the training sample prompt information comprises training sample global prompt information, training sample form prompt information, training sample task prompt information, and training sample meta prompt information;
a training module configured to train an initial processing model according to the training text features of the plurality of training sample texts and the plurality of training sample prompt messages corresponding to each training sample text, to obtain model parameters of the trained text processing model;
and a sending module configured to send the model parameters of the trained text processing model to the end-side device.
According to a seventh aspect of embodiments of the present specification, there is provided a computing device comprising:
a memory and a processor;
the memory is configured to store computer executable instructions that, when executed by the processor, implement the steps of the methods provided in the first, second or third aspects above.
According to an eighth aspect of embodiments of the present specification, there is provided a computer readable storage medium storing computer executable instructions which when executed by a processor implement the steps of the method provided in the first or second or third aspects above.
According to a ninth aspect of embodiments of the present specification, there is provided a computer program, wherein the computer program, when executed in a computer, causes the computer to perform the steps of the method provided in the first or second or third aspect described above.
According to the text processing method provided by the embodiments of this specification, feature extraction is performed on a text to be processed to obtain its text features; a plurality of target prompt messages corresponding to the text to be processed are matched using the text features, the target prompt messages having different granularities; and the text features and the plurality of target prompt messages are input into a text processing model, which processes them to produce a text processing result, the text processing model having been trained based on a plurality of training sample texts and the training sample prompt information corresponding to each training sample text. First, the plurality of matched target prompt messages better guide the text processing model in processing the text to be processed. Because the target prompt messages differ in granularity, prompt information of different granularities can learn and store knowledge of different granularities and from different angles: it helps the text processing model acquire specific knowledge, stores knowledge at coarse granularity, and enables knowledge sharing at fine granularities that cannot be partitioned manually and that traditional methods do not address. Second, the text processing model is trained based on a plurality of training sample texts and the training sample prompt information corresponding to each; because every training sample text is paired with sample prompt information during training, the sample prompt information is trained and tested together with the training sample text, which alleviates both the unknown-test-task problem and catastrophic forgetting, further improving the accuracy of the text processing result.
Drawings
FIG. 1 is a block diagram of a text processing system provided in one embodiment of the present disclosure;
FIG. 2 is a block diagram of another text processing system provided in one embodiment of the present disclosure;
FIG. 3 is a flow chart of a text processing method provided by one embodiment of the present disclosure;
FIG. 4 is a schematic diagram illustrating the diversity and locality of meta prompt information in a text processing method according to an embodiment of the present disclosure;
FIG. 5 is a flow chart of a method of question-answering text processing provided by one embodiment of the present disclosure;
FIG. 6 is a flow chart of another method of question-answering text processing provided by one embodiment of the present disclosure;
FIG. 7 is a block diagram of a text processing model training system provided in one embodiment of the present disclosure;
FIG. 8 is a flow chart of a text processing model training method provided in one embodiment of the present disclosure;
FIG. 9 is an interface diagram of a text processing interface provided by one embodiment of the present disclosure;
FIG. 10 is a process flow diagram of a text processing method provided in one embodiment of the present disclosure;
FIG. 11 is a schematic diagram of a text processing device according to an embodiment of the present disclosure;
Fig. 12 is a schematic structural view of a question-answering text processing device according to one embodiment of the present disclosure;
FIG. 13 is a schematic diagram of a text processing model training apparatus according to one embodiment of the present disclosure;
FIG. 14 is a block diagram of a computing device provided in one embodiment of the present description.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of this specification. This specification may, however, be embodied in many forms other than those described herein, and those skilled in the art can make similar generalizations without departing from its spirit; this specification is therefore not limited by the specific implementations disclosed below.
The terminology used in the one or more embodiments of the specification is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of the specification. As used in this specification, one or more embodiments and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that, although the terms first, second, etc. may be used in one or more embodiments of this specification to describe various information, this information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of one or more embodiments of this specification, a first may also be referred to as a second, and similarly, a second may also be referred to as a first. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
First, terms related to one or more embodiments of the present specification will be explained.
Text-to-text: text-to-text is a common paradigm for solving various types of natural language processing problems, in which each task is cast as mapping an input text to an output text.
Scheduled sampling: scheduled sampling is a sampling method that mitigates the discrepancy between the training and inference scenarios.
With the development of computer technology, more and more work and learning tasks can be processed automatically, and computer technology is gradually being applied to everyday education and learning activities; for example, processing text with computer technology greatly saves human resources. At present, over the life cycle of a deployed text processing system, the system must continuously learn new tasks and adapt to changes in its deployment environment, which gives rise to a lifelong-learning requirement. Under this requirement, text processing systems often suffer from forgetting old tasks, known as "catastrophic forgetting", which degrades text processing accuracy.
To solve the above problems, the embodiments of this specification provide a multi-level prompt scheme for text processing. Taking a question-answering system as an example, prompts at different levels can learn and store question-answering knowledge of different granularities and from different angles, helping the question-answering system acquire specific knowledge. The scheme can store question-answer knowledge at coarse granularities such as task and form, and can also enable knowledge sharing at fine granularities that cannot be partitioned manually and that traditional methods do not address.
Specifically, feature extraction is performed on a text to be processed to obtain text features of the text to be processed; a plurality of target prompt messages corresponding to the text to be processed are matched using the text features, the target prompt messages having different granularities; and the text features and the plurality of target prompt messages are input into a text processing model, which processes them to produce a text processing result, the text processing model being trained based on a plurality of training sample texts and the training sample prompt information corresponding to each training sample text. Matching a plurality of target prompt messages to the text to be processed better guides the text processing model in processing it. Because the target prompt messages differ in granularity, prompt information of different granularities can learn and store knowledge of different granularities and from different angles, helping the model acquire specific knowledge, store knowledge at coarse granularity, and share knowledge at fine granularities that cannot be partitioned manually and that traditional methods do not address, further improving the accuracy of the text processing result.
In the present specification, a text processing method is provided, and the present specification relates to a question-answer text processing method, a text processing model training method, a text processing apparatus, a question-answer text processing apparatus, a text processing model training apparatus, a computing device, and a computer-readable storage medium, one by one, in the following embodiments.
Referring to fig. 1, fig. 1 shows a frame diagram of a text processing system according to an embodiment of the present disclosure, where the text processing system includes a server 100 and a client 200;
Client 200: sends the text to be processed to the server 100.
Server 100: performs feature extraction on the text to be processed to obtain its text features; matches, using the text features, a plurality of target prompt messages corresponding to the text to be processed, wherein the plurality of target prompt messages have different granularities; inputs the text features and the plurality of target prompt messages into a text processing model, and obtains a text processing result through processing by the text processing model, wherein the text processing model is trained based on a plurality of training sample texts and training sample prompt information corresponding to each training sample text; and sends the text processing result to the client 200.
Client 200: receives the text processing result sent by the server 100.
By applying the scheme of the embodiments of this specification, feature extraction is performed on the text to be processed to obtain its text features; a plurality of target prompt messages corresponding to the text to be processed are matched using the text features, the target prompt messages having different granularities; and the text features and the plurality of target prompt messages are input into a text processing model, which processes them to produce a text processing result, the text processing model being trained based on a plurality of training sample texts and the training sample prompt information corresponding to each training sample text. The plurality of matched target prompt messages better guide the text processing model, and because they differ in granularity they learn and store knowledge of different granularities and from different angles, helping the model acquire specific knowledge, store knowledge at coarse granularity, and share knowledge at fine granularities that cannot be partitioned manually and that traditional methods do not address, further improving the accuracy of the text processing result.
Referring to fig. 2, fig. 2 illustrates a block diagram of another text processing system provided in one embodiment of the present description, which may include a server 100 and a plurality of clients 200. Communication connection can be established between the plurality of clients 200 through the server 100, and in a text processing scenario, the server 100 is used to provide text processing services between the plurality of clients 200, and the plurality of clients 200 can respectively serve as a transmitting end or a receiving end, so that real-time communication can be realized through the server 100.
The user may interact with the server 100 through the client 200 to receive data transmitted by other clients 200, to transmit data to other clients 200, and so on. In a text processing scenario, the user may publish a data stream to the server 100 through the client 200; the server 100 then performs text processing on the data stream and pushes the text processing result to the other clients with which communication has been established.
The client 200 and the server 100 establish a connection through a network. The network provides a medium for the communication link between the client and the server, and may include various connection types, such as wired or wireless communication links or fiber-optic cables. Data transmitted by the client 200 may need to be encoded, transcoded, or compressed before being distributed to the server 100.
The client 200 may be a browser, an APP (Application), a web application such as an H5 (HTML5, HyperText Markup Language version 5) application, a light application (also called an applet, a lightweight application), or a cloud application, etc. The client 200 may be developed based on a software development kit (SDK) of a corresponding service provided by the server, such as an SDK for real-time communication (RTC). The client 200 may be deployed in an electronic device, and may need to run depending on the device or on some APP in the device. The electronic device may, for example, have a display screen and support information browsing, and may be a personal mobile terminal such as a mobile phone, a tablet computer, or a personal computer. Various other types of applications are also commonly deployed in electronic devices, such as human-machine dialog applications, model training applications, text processing applications, web browser applications, shopping applications, search applications, instant messaging tools, mailbox clients, and social platform software.
The server 100 may include servers that provide various services, such as a server providing communication services for multiple clients, a background-training server that supports a model used on a client, and a server that processes data sent by a client. It should be noted that the server 100 may be implemented as a distributed server cluster formed by a plurality of servers, or as a single server. The server may also be a server of a distributed system, or a server that incorporates a blockchain. The server may also be a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (CDN), and big data and artificial intelligence platforms, or an intelligent cloud computing server or intelligent cloud host with artificial intelligence technology.
It should be noted that, the text processing method provided in the embodiment of the present disclosure is generally executed by the server, but in other embodiments of the present disclosure, the client may also have a similar function to the server, so as to execute the text processing method provided in the embodiment of the present disclosure. In other embodiments, the text processing method provided in the embodiments of the present disclosure may be performed by the client and the server together.
Referring to fig. 3, fig. 3 shows a flowchart of a text processing method according to an embodiment of the present disclosure, which specifically includes the following steps:
step 302: and extracting the characteristics of the text to be processed to obtain the text characteristics of the text to be processed.
In one or more embodiments of the present disclosure, in order to obtain an accurate text processing result corresponding to a text to be processed, feature extraction may be performed on the text to be processed, and according to text features of the text to be processed, a plurality of target prompt messages corresponding to the text to be processed are matched, so that a text processing model is guided to perform text processing by using the plurality of target prompt messages.
Specifically, the text to be processed is the text that the text processing model is to process; it may be text from various scenarios, such as emotion analysis text, subject-matter extraction text, and question-answer text, selected according to the actual situation, which the embodiments of this specification do not limit in any way. The text features of the text to be processed are a representation of the text used for subsequently matching the plurality of target prompt messages corresponding to it.
In practical applications, there are various ways of extracting features from a text to be processed, and a feature extraction method may be used to extract text features of the text to be processed, or a text encoder may be used to extract text features of the text to be processed. Any manner in which text features of the text to be processed may be obtained may be used in the present specification, and this embodiment is not limited thereto.
Step 304: match, using the text features, a plurality of target prompt messages corresponding to the text to be processed, wherein the plurality of target prompt messages have different granularities.
In one or more embodiments of the present disclosure, after feature extraction is performed on a text to be processed, text features of the text to be processed are obtained, and then the text features may be further utilized to match a plurality of target prompt messages corresponding to the text to be processed, where granularity of the plurality of target prompt messages is different.
Specifically, using the text features of the text to be processed, the text can be matched to target prompt messages at different levels, and the target prompt messages at different levels have different granularities. Granularity refers to the degree of coarseness or fineness of data within the same dimension. For example, four levels of target prompt messages may be matched for a question-answer text to be processed: a target global prompt message "text processing", a target form prompt message "multiple choice", a target task prompt message "multiple-choice task A", and a target meta prompt message "English", with the granularity of the prompt information becoming progressively finer, as illustrated by the sketch below.
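For intuition only, the four prompt levels from the example above can be represented as a simple mapping; the level names and values below are hypothetical examples, not structures mandated by the embodiments:

```python
# Hypothetical representation of the four matched prompt levels; in the
# described scheme each value would be a soft prompt (a continuous vector)
# rather than a string.
matched_prompts = {
    "global": "text processing",         # coarsest level, shared by all texts
    "form":   "multiple choice",         # text form of the input
    "task":   "multiple-choice task A",  # the specific task
    "meta":   "English",                 # finest level
}
```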
When the text features are used to match the target prompt messages corresponding to the text to be processed, the target prompt keys corresponding to the text to be processed can be matched first, and the target prompt features corresponding one-to-one to those target prompt keys are then used as the target prompt messages corresponding to the text to be processed. In practical applications, a target prompt feature can be understood as a soft prompt: a continuous vector that can be flexibly inserted into a text to acquire knowledge related to that text.
In practical applications, the plurality of target prompt messages corresponding to the text to be processed may include a target global prompt message and a target form prompt message; the target global prompt message may be preset, and the target form prompt message corresponding to the text to be processed may be determined according to the text form of the text to be processed. The plurality of target prompt messages may also include a target task prompt message and a target meta prompt message, which may be determined according to the feature similarity between the text features of the text to be processed and candidate prompt information.
Step 306: input the text features and the plurality of target prompt messages into the text processing model, and obtain a text processing result through processing by the text processing model, wherein the text processing model is trained based on a plurality of training sample texts and training sample prompt information corresponding to each training sample text.
In one or more embodiments of the present disclosure, feature extraction is performed on a text to be processed to obtain text features of the text to be processed, and after matching a plurality of target prompt messages corresponding to the text to be processed by using the text features, the text features and the plurality of target prompt messages may be further input into a text processing model, and a text processing result is obtained through processing of the text processing model.
In particular, the text processing model is a machine learning model, such as a Text-to-Text Transfer Transformer (T5), and can be understood as a trained program that can find patterns in new data and make predictions. Such a model is represented as a mathematical function that accepts requests in the form of input data, makes predictions on the input data, and then provides an output in response. The text processing result is the processing result corresponding to the text to be processed; for example, if the text to be processed is the emotion analysis text "the weather is really nice today, and the breeze on my face feels so gentle", the corresponding text emotion analysis result is "happy, comfortable".
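The following sketch runs a public T5 checkpoint in the text-to-text style described above; the checkpoint name and the textual task prefix are assumptions of this example (the embodiments prepend soft prompt vectors rather than text), so this is a stand-in, not the patented model:

```python
from transformers import T5ForConditionalGeneration, T5Tokenizer

# "t5-small" is an assumed public stand-in for the text processing model.
tokenizer = T5Tokenizer.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Approximate the prompt-guided input with a plain textual task prefix.
inputs = tokenizer("sst2 sentence: the weather is really nice today",
                   return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=8)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```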
By applying the scheme of the embodiments of this specification, feature extraction is performed on the text to be processed to obtain its text features; a plurality of target prompt messages corresponding to the text to be processed are matched using the text features, the target prompt messages having different granularities; and the text features and the plurality of target prompt messages are input into the text processing model, which processes them to produce a text processing result. The plurality of matched target prompt messages better guide the text processing model, and because they differ in granularity they learn and store knowledge of different granularities and from different angles, helping the model acquire specific knowledge, store knowledge at coarse granularity, and share knowledge at fine granularities that cannot be partitioned manually and that traditional methods do not address, further improving the accuracy of the text processing result.
In practical applications, the method for extracting the characteristics of the text to be processed and obtaining the text characteristics of the text to be processed is various, and the method is specifically selected according to practical situations, and the embodiment of the present disclosure is not limited in any way.
In one possible implementation of this specification, a feature extraction method may be used to extract the text features of the text to be processed. Feature extraction methods include, but are not limited to, one-hot encoding (One-Hot) and term frequency-inverse document frequency (TF-IDF). Term frequency represents the frequency of a keyword in a document, and inverse document frequency reflects how prevalent the keyword is across documents.
In another possible implementation manner of the present disclosure, the text encoder may be used to extract text features of the text to be processed, that is, the feature extraction is performed on the text to be processed, so as to obtain the text features of the text to be processed, and the method may include the following steps:
inputting the text to be processed into a text encoder, and obtaining the text characteristics of the text to be processed through encoding processing of the text encoder.
In particular, the text encoder includes, but is not limited to, BERT, Long Short-Term Memory (LSTM) networks, etc., selected according to the actual situation, which the embodiments of this specification do not limit in any way. To avoid catastrophic forgetting, the parameters of the text encoder may be fixed, as in the sketch below.
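A minimal sketch of frozen-encoder feature extraction, assuming a BERT-style encoder from the transformers library and mean pooling over token states (the pooling choice is an assumption of this example):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
for p in encoder.parameters():
    p.requires_grad = False  # fix encoder parameters to avoid catastrophic forgetting

def encode(text: str) -> torch.Tensor:
    """Return the text feature h(x) for a text to be processed."""
    batch = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state  # (1, seq_len, dim)
    return hidden.mean(dim=1).squeeze(0)             # (dim,)
```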
By applying the scheme of the embodiment of the specification, the text to be processed is input into the text encoder, and the text characteristics of the text to be processed are obtained through the encoding processing of the text encoder, so that the efficiency and the accuracy for obtaining the text characteristics of the text to be processed are improved.
In an optional embodiment of the present disclosure, the plurality of target prompt messages corresponding to the text to be processed includes a target global prompt message and a target form prompt message; matching a plurality of target prompt messages corresponding to the text to be processed by using the text characteristics can comprise the following steps:
acquiring preset target global prompt information;
analyzing the text to be processed, and determining the text form of the text to be processed;
and determining target form prompt information corresponding to the text to be processed according to the text form.
Specifically, there is only one preset target global prompt message, which is used for sharing text knowledge; it is used with all texts and requires no matching. The target form prompt information corresponds one-to-one to the text form of the text to be processed.
In practical applications, the text to be processed is analyzed, and various ways of determining the text form of the text to be processed are available, and the text form to be processed is specifically selected according to practical situations, which is not limited in any way in the embodiments of the present specification.
In an optional implementation manner of the present disclosure, a plurality of preset text form templates may be obtained, a text to be processed is matched with the plurality of text form templates, a target text form template corresponding to the text to be processed is determined, and a text form corresponding to the target text form template is further used as a text form of the text to be processed.
In another optional implementation manner of the present disclosure, a text analysis model may be trained in advance, a text to be processed is input into the text analysis model, and a text form of the text to be processed is obtained through analysis processing of the text analysis model, where the text analysis model is trained based on a plurality of sample texts and text forms corresponding to the respective sample texts.
It should be noted that texts of different forms correspond to different form prompt keys. Therefore, after the text form corresponding to the text to be processed is determined, the form prompt key corresponding to that text form can be determined, and the form prompt feature corresponding to that form prompt key is then used as the target form prompt information corresponding to the text to be processed, as sketched below.
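A sketch of the one-to-one mapping from text form to form prompt key and soft prompt feature; the form names and dimension below are hypothetical:

```python
import torch

PROMPT_DIM = 768  # assumed to match the text encoder's hidden size

# One form prompt key per text form; each key indexes a learnable soft prompt.
form_prompt_keys = {"multiple_choice": 0, "span_extraction": 1, "free_form": 2}
form_prompt_features = torch.nn.Embedding(len(form_prompt_keys), PROMPT_DIM)

def target_form_prompt(text_form: str) -> torch.Tensor:
    idx = form_prompt_keys[text_form]               # key for this text form
    return form_prompt_features(torch.tensor(idx))  # its soft prompt feature
```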
By applying the scheme of the embodiments of this specification, the preset target global prompt message is acquired, the text to be processed is analyzed to determine its text form, and the target form prompt information corresponding to the text to be processed is determined according to the text form, so that the target global prompt information and the target form prompt information corresponding to the text to be processed are determined accurately, making text processing more accurate.
In another optional embodiment of the present disclosure, the plurality of target prompt messages corresponding to the text to be processed include target task prompt messages and target meta prompt messages, and the matching the plurality of target prompt messages corresponding to the text to be processed by using text features may include the following steps:
acquiring a plurality of candidate prompt messages, wherein the candidate prompt messages comprise candidate task prompt messages and candidate meta prompt messages;
and determining target task prompt information and target meta prompt information corresponding to the text to be processed according to the feature similarity between the text features and the candidate prompt information.
Specifically, the candidate prompt information consists of candidate prompt keys learned in advance, including candidate task prompt keys and candidate meta prompt keys.
After the plurality of candidate prompt messages are acquired, the cosine similarity between the text features of the text to be processed and each candidate task prompt message, and between the text features and each candidate meta prompt message, can be computed; according to the computed cosine similarities, the most similar candidate task prompt message and candidate meta prompt message are used as the target task prompt message and target meta prompt message corresponding to the text to be processed, as in the sketch below.
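A sketch of the similarity-based matching, assuming the candidate prompt keys are stored as rows of a tensor; the key with the highest cosine similarity to the text feature is selected:

```python
import torch
import torch.nn.functional as F

def match_candidate(h_x: torch.Tensor, candidate_keys: torch.Tensor) -> int:
    """Return the index of the candidate prompt key most similar to h(x).

    h_x: (dim,) text feature; candidate_keys: (num_candidates, dim).
    For target meta prompt information, the top-M' keys could be taken
    instead via sims.topk(m_prime).indices.
    """
    sims = F.cosine_similarity(candidate_keys, h_x.unsqueeze(0), dim=-1)
    return int(sims.argmax())
```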
By applying the scheme of the embodiment of the specification, a plurality of candidate prompt messages are obtained, wherein the candidate prompt messages comprise candidate task prompt messages and candidate meta prompt messages; and determining target task prompt information and target meta prompt information corresponding to the text to be processed according to the feature similarity between the text features and the candidate prompt information. Therefore, the target task prompt information and the target element prompt information corresponding to the text to be processed are accurately determined, and further, the text processing is more accurate.
In an alternative embodiment of this specification, for the candidate task prompt information, the sample task prompt information of a sample text may be pulled toward the text feature (query) of that sample text by an exponential angular triplet loss and pushed away from the queries of sample texts of other tasks, so as to optimize the sample task prompt information. That is, acquiring the plurality of candidate prompt messages may include the following steps:
acquiring a first sample set, wherein the first sample set comprises a plurality of sample texts, and task attributes carried by the plurality of sample texts are different;
extracting a first sample text from a plurality of sample texts, wherein the first sample text is any one of the plurality of sample texts;
screening a negative sample text from the plurality of sample texts according to a first task attribute carried by the first sample text, wherein the task attribute of the negative sample text is different from the first task attribute;
calculating task prompt loss according to the first sample text, the negative sample text and first sample task prompt information corresponding to the first sample text;
and optimizing the first sample task prompt information according to the task prompt loss, and returning to execute the step of extracting the first sample text from the plurality of sample texts until a preset stop condition is reached, so as to obtain candidate task prompt information.
Specifically, task attributes carried by the sample text include, but are not limited to, question-and-answer tasks, emotion analysis tasks, subject matter extraction tasks, and the like. Sample task prompt information corresponding to each sample text is generated by random initialization, and is optimized according to task prompt loss.
It should be noted that, the manner of obtaining the first sample set may be multiple, that is, a large amount of sample texts may be manually input to form the first sample set, or that a large amount of sample texts may be read from other data obtaining devices or databases to form the first sample set, and the manner of obtaining the first sample set is specifically selected according to the actual situation, which is not limited in any way in the embodiment of the present disclosure.
There are multiple ways to screen a negative sample text from the plurality of sample texts according to the first task attribute carried by the first sample text. One way is to obtain the task attributes of all sample texts, search among them for task attributes different from the first task attribute, and use a sample text corresponding to any such attribute as the negative sample text. Alternatively, any sample text may be selected at random and its task attribute compared with the first task attribute: if they are the same, another sample text is selected at random; if they are different, that sample text is used as the negative sample text.
In practical applications, the preset stop condition includes traversing all sample texts as positive sample texts. When calculating the task prompt loss according to the first sample text, the negative sample text, and the first sample task prompt information corresponding to the first sample text, the first sample text can be regarded as the positive sample text. For a sample text P, it is desirable that the sample task prompt information corresponding to its task attribute be close to the feature representation produced by encoding P and far from the representations produced by encoding sample texts under other task attributes. Therefore, a sample text N whose task attribute differs from that of P can be selected as a negative example, P is taken as the positive example, and together with the sample task prompt information corresponding to the task attribute of P they are input into the exponential angular triplet loss function to calculate the task prompt loss, as in the sketch below.
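The exact form of the exponential angular triplet loss is not given in this section, so the following is only a plausible sketch under the assumption that it exponentiates the gap between the positive and negative cosine distances:

```python
import torch
import torch.nn.functional as F

def cos_dist(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    return 1.0 - F.cosine_similarity(a, b, dim=-1)

def exp_triplet_loss(task_prompt_key: torch.Tensor,
                     h_pos: torch.Tensor,
                     h_neg: torch.Tensor) -> torch.Tensor:
    """Pull the task prompt key toward h(P) and away from h(N).

    h_pos: feature of the positive sample text P; h_neg: feature of a
    negative sample text N with a different task attribute. Assumed form.
    """
    return torch.exp(cos_dist(task_prompt_key, h_pos)
                     - cos_dist(task_prompt_key, h_neg))
```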
By applying the scheme of the embodiments of this specification, a negative sample text is screened from the plurality of sample texts according to the first task attribute carried by the first sample text, the task attribute of the negative sample text being different from the first task attribute; the task prompt loss is calculated according to the first sample text, the negative sample text, and the first sample task prompt information corresponding to the first sample text; the first sample task prompt information is optimized according to the task prompt loss, and the step of extracting a first sample text from the plurality of sample texts is performed again until all sample texts have been traversed as positive sample texts, yielding all candidate task prompt information. Optimizing the sample task prompt information makes the finally obtained candidate task prompt information more accurate.
In an optional embodiment of this specification, for the candidate meta prompt information, in order to make the plurality of sample meta prompt messages matched to a sample text closer to the sample text feature while ensuring a certain diversity and locality, a first meta prompt loss may be calculated at the local (batch) level according to the sample text features and sample meta prompt information of a plurality of sample texts, so as to optimize the sample meta prompt information and obtain the candidate meta prompt information; here the local level refers to a batch of sample texts input to the model at the same time. That is, acquiring the plurality of candidate prompt messages may include the following steps:
extracting features from the plurality of sample texts to obtain the sample text features of the plurality of sample texts;
calculating a first meta prompt loss according to the sample text features and the sample meta prompt information of the plurality of sample texts;
and optimizing the sample meta prompt information according to the first meta prompt loss until a preset stop condition is reached, to obtain the candidate meta prompt information.
It should be noted that the sample meta prompt information of each sample text is generated by random initialization and optimized according to the first meta prompt loss. The specific implementation of extracting features from the plurality of sample texts to obtain their sample text features is the same as the above-described feature extraction on the text to be processed, and is not repeated in the embodiments of this specification.
In practical applications, the first meta prompt loss may be calculated according to the following formula (1):

$$L_m(x) = \sum_{i \in S(x)} \max\big(\lVert p_i - h(x) \rVert,\ \eta\big) + \sum_{\substack{i, j \in S(x) \\ i \neq j}} \max\big(\gamma - \lVert p_i - p_j \rVert,\ 0\big) \tag{1}$$

where $L_m$ is the first meta prompt loss, $x$ is the input sample text, $S(x)$ is the index set of the $M'$ meta prompt keys matched to $x$, $p_i$ is the meta prompt feature corresponding to the $i$-th meta prompt key, $h(x)$ is the sample text feature obtained by encoding $x$ with the text encoder, and $\lVert \cdot \rVert$ is the cosine distance between two vectors. The loss has two terms. The first term pulls the $M'$ matched meta prompt features close to the sample text feature $h(x)$ (locality), while using the parameter $\eta$ as a margin so that some distance to $h(x)$ is kept, preserving diversity. The second term keeps the $M'$ meta prompt features from overlapping by maintaining pairwise distance, promoting diversity, with the parameter $\gamma$ as a margin to avoid spacing the meta prompt keys too far apart.
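A sketch of formula (1) in code, under the reconstruction above; matched_keys holds the M' matched meta prompt features and target is h(x) (or, for formula (2) below, a cluster center). The margin values are placeholders:

```python
import torch
import torch.nn.functional as F

def meta_prompt_loss(matched_keys: torch.Tensor, target: torch.Tensor,
                     eta: float = 0.1, gamma: float = 0.3) -> torch.Tensor:
    """First meta prompt loss: locality term plus diversity term.

    matched_keys: (m, dim) matched meta prompt features; target: (dim,).
    """
    # Locality: pull each matched key toward the target, but no closer than eta.
    locality = torch.clamp(
        1.0 - F.cosine_similarity(matched_keys, target.unsqueeze(0), dim=-1),
        min=eta).sum()
    # Diversity: penalize pairs of matched keys closer than gamma.
    k = F.normalize(matched_keys, dim=-1)
    pair_dist = 1.0 - k @ k.t()                    # (m, m) cosine distances
    m = matched_keys.size(0)
    off_diag = pair_dist[~torch.eye(m, dtype=torch.bool)]
    diversity = torch.clamp(gamma - off_diag, min=0.0).sum()
    return locality + diversity
```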
By applying the scheme of the embodiments of this specification, features are extracted from the plurality of sample texts to obtain their sample text features; the first meta prompt loss is calculated according to the sample text features and sample meta prompt information of the plurality of sample texts; and the sample meta prompt information is optimized according to the first meta prompt loss until the preset stop condition is reached, yielding the candidate meta prompt information. This avoids the situation where the meta prompt keys of different texts become so similar that their meta prompt features are redundant and overlapping, ensures the locality of the meta prompt information, and promotes global diversity, making the target prompt information of the text to be processed more accurate.
In another optional embodiment of this specification, for the candidate meta prompt information, in order to make the meta prompt keys diverse at the global level (across all sample texts) and distributed around each sample text cluster, so that the meta prompt features acquire diverse knowledge, a second meta prompt loss may be calculated according to the plurality of sample text features, the sample meta prompt information of the plurality of sample texts, and at least one cluster center feature, and used to optimize the sample meta prompt information. That is, before the sample meta prompt information is optimized according to the first meta prompt loss until the preset stop condition is reached to obtain the candidate meta prompt information, the method may further include the following steps:
clustering the plurality of sample text features to determine at least one cluster center feature;
calculating a second meta prompt loss according to the plurality of sample text features, the sample meta prompt information of the plurality of sample texts, and the at least one cluster center feature;
Accordingly, optimizing the sample meta prompt information according to the first meta prompt loss until the preset stop condition is reached to obtain the candidate meta prompt information may include the following step:
optimizing the sample meta prompt information according to the first meta prompt loss and the second meta prompt loss until the preset stop condition is reached, to obtain the candidate meta prompt information.
It should be noted that the sample text features of all sample texts may be clustered directly. To improve clustering efficiency, a small portion of the previously learned sample texts may instead be saved as memory samples, with E samples saved per task; assuming i tasks have been learned, there are iE memory samples in total. Features are generated for all previously stored sample texts by the text encoder, the sample text features are clustered into 5i classes by the K-means clustering algorithm, and each cluster center feature is determined, as in the sketch below.
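A sketch of the memory-sample clustering, assuming scikit-learn's KMeans; E and the feature matrix layout are as described above:

```python
import numpy as np
from sklearn.cluster import KMeans

def memory_cluster_centers(memory_features: np.ndarray,
                           num_tasks_learned: int) -> np.ndarray:
    """Cluster stored memory-sample features into 5*i classes.

    memory_features: (i * E, dim) features of the iE memory samples,
    produced by the frozen text encoder.
    """
    km = KMeans(n_clusters=5 * num_tasks_learned, n_init=10)
    km.fit(memory_features)
    return km.cluster_centers_  # one center feature c_k per cluster
```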
In practical applications, the sample text feature h(x) in formula (1) above may be replaced with each cluster center feature, so that each cluster center is surrounded by nearby meta prompt keys; that is, the second meta prompt loss is calculated according to the following formula (2):

$$L_c = \sum_{k} \bigg[ \sum_{i \in S(c_k)} \max\big(\lVert p_i - c_k \rVert,\ \eta\big) + \sum_{\substack{i, j \in S(c_k) \\ i \neq j}} \max\big(\gamma - \lVert p_i - p_j \rVert,\ 0\big) \bigg] \tag{2}$$

where $c_k$ is the center of the $k$-th cluster of the memory samples and $S(c_k)$ is the index set of the meta prompt keys matched to $c_k$.
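Reusing the meta_prompt_loss sketch above, formula (2) simply swaps the target from h(x) to each cluster center; matched_keys_for below is a hypothetical helper that returns the meta prompt keys matched to a given center:

```python
import torch

# Hypothetical usage: sum the loss over all cluster centers c_k.
centers = memory_cluster_centers(memory_features, num_tasks_learned)
loss_2 = sum(
    meta_prompt_loss(matched_keys_for(c_k),
                     torch.as_tensor(c_k, dtype=torch.float32))
    for c_k in centers)
```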
Referring to FIG. 4, FIG. 4 is a schematic diagram illustrating the diversity and locality of meta prompt information in a text processing method according to an embodiment of this specification. As shown in FIG. 4, (a) shows the case where only diversity is ensured, (b) the case where only locality is ensured, and (c) the case where both diversity and locality are ensured by the above method.
By applying the scheme of the embodiments of this specification, the plurality of sample text features are clustered to determine at least one cluster center feature; the second meta prompt loss is calculated according to the plurality of sample text features, the sample meta prompt information of the plurality of sample texts, and the at least one cluster center feature; and the sample meta prompt information is optimized according to the first meta prompt loss and the second meta prompt loss until the preset stop condition is reached, yielding the candidate meta prompt information. This ensures the global diversity of the candidate meta prompt information, making the target prompt information of the text to be processed more accurate.
In an alternative embodiment of the present disclosure, the training manner of the text processing model may include the following steps:
obtaining a second sample set, wherein the second sample set comprises a plurality of training sample texts, the training sample texts carry training sample prompt information and sample processing results, and the training sample prompt information comprises training sample global prompt information, training sample form prompt information, training sample task prompt information, and training sample meta prompt information;
Extracting a first training sample text from a plurality of training sample texts, wherein the first training sample text is any one of the plurality of training sample texts;
inputting the first training sample text and the first training sample prompt information into an initial processing model to obtain a first prediction result corresponding to the first training sample text;
calculating a prediction loss value according to the first prediction result and a first sample processing result carried by the first training sample text;
and adjusting model parameters of the initial processing model according to the predicted loss value, and returning to execute the step of extracting the first training sample text from the plurality of training sample texts until a preset processing stop condition is reached, so as to obtain the text processing model.
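By way of illustration only, the training procedure above may be sketched as follows (a minimal sketch; the model interface, the cross-entropy loss and the stopping values are assumptions):

    import random
    import torch
    import torch.nn.functional as F

    def train_text_processing_model(model, optimizer, samples,
                                    threshold=0.05, max_iters=10000):
        # samples: list of (training_sample_text, sample_prompts, sample_result)
        for step in range(max_iters):
            text, prompts, target = random.choice(samples)  # extract a first training sample
            logits = model(text, prompts)                   # first prediction result
            loss = F.cross_entropy(logits, target)          # prediction loss value
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()                                # adjust model parameters
            if loss.item() <= threshold:                    # preset processing stop condition
                break
        return model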
Specifically, the training sample prompt information carried by the training sample text is obtained by extracting features of a plurality of training sample texts to obtain training text features of the plurality of training sample texts and matching the training text features.
It should be noted that the second sample set may be obtained in various manners: a large number of training sample texts may be manually input to form the second sample set, or a large number of training sample texts may be read from other data acquisition devices or databases to form the second sample set. The manner of obtaining the second sample set is selected according to the actual situation, which is not limited in any way in the embodiments of the present specification.
In one possible implementation manner of the present disclosure, the preset processing stop condition includes the predicted loss value being less than or equal to a preset threshold. The first training sample text and the first training sample prompt information are input into the initial processing model to obtain the first prediction result corresponding to the first training sample text; after the first prediction result is obtained, the prediction loss value is calculated according to the first prediction result and the first sample processing result carried by the first training sample text, and the prediction loss value is compared with the preset threshold.
Specifically, if the predicted loss value is greater than the preset threshold, the difference between the first prediction result and the first sample processing result carried by the first training sample text is large, and the prediction capability of the initial processing model for the first training sample text and the first training sample prompt information is poor; in this case, the model parameters of the initial processing model are adjusted, and the step of extracting a first training sample text from the plurality of training sample texts is executed again to continue training the initial processing model. Once the predicted loss value is less than or equal to the preset threshold, the difference between the first prediction result and the first sample processing result carried by the first training sample text is small, the preset processing stop condition is reached, and the trained text processing model is obtained.
According to the scheme of this embodiment of the specification, the prediction loss value is calculated from the first prediction result and the first sample processing result carried by the first training sample text and compared with the preset threshold; training of the initial processing model continues while the prediction loss value is greater than the preset threshold and completes once the prediction loss value is less than or equal to the preset threshold. By continuously adjusting the model parameters of the initial processing model, the finally obtained text processing model is made more accurate.
In another possible implementation manner of the present disclosure, in addition to comparing the predicted loss value with the preset threshold, whether training of the current initial processing model is complete may also be determined in combination with the number of iterations.
Specifically, if the predicted loss value is greater than the preset threshold, the model parameters of the initial processing model are adjusted, the step of extracting the first training sample text from the plurality of training sample texts is executed again, and training of the initial processing model continues; once the preset number of iterations is reached, iteration stops and the trained text processing model is obtained. The preset threshold and the preset number of iterations are selected according to the actual situation, and the embodiments of the present disclosure impose no limitation on them.
In practical applications, many functions may be used to calculate the predicted loss value, such as the cross entropy loss function, the L1 norm loss function, the maximum loss function, the mean square error loss function and the logarithmic loss function, selected according to the actual situation; the embodiments of the present disclosure impose no limitation on this. Preferably, the cross entropy loss function is used: the cross entropy of the first prediction result and the first sample processing result carried by the first training sample text is calculated as the predicted loss value, which improves the efficiency of calculating the predicted loss value and thus the training efficiency of the model.
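For instance, the cross entropy between a first prediction result and the carried first sample processing result may be computed as follows (the shapes and values are illustrative assumptions):

    import torch
    import torch.nn.functional as F

    pred_logits = torch.tensor([[2.0, 0.5, -1.0]])  # first prediction result (1 sample, 3 classes)
    target = torch.tensor([0])                      # first sample processing result (class index)
    predicted_loss_value = F.cross_entropy(pred_logits, target)
    print(predicted_loss_value.item())              # about 0.24, compared with the preset threshold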
In an optional embodiment of the present disclosure, the training environment may contain samples of tasks that have never been trained on, i.e., the open-world problem. To solve this problem, whether a sample text belongs to an untrained (open) task, and whether the open-environment task prompt is applicable, may be determined based on the task prompt of the open environment in combination with a preset task identification condition. That is, the first training sample prompt information includes first training sample task prompt information, and before inputting the first training sample text and the first training sample prompt information into the initial processing model to obtain the first prediction result corresponding to the first training sample text, the method may further include the following steps:
Extracting features of the first training sample text to obtain first training text features of the first training sample text;
calculating the distance between the first training text feature and the first training sample prompt information;
and under the condition that the distance is greater than the preset task identification condition, replacing the task prompt information of the first training sample with the appointed task prompt information.
It should be noted that the specific implementation of "extracting the features of the first training sample text to obtain the first training text features of the first training sample text" is the same as the implementation of "extracting the features of the text to be processed to obtain the text features of the text to be processed" described above, and will not be repeated in the embodiments of this specification.
In practical applications, during the training phase each task corresponds to a continuously optimized dynamic decision boundary, called an adaptable decision boundary, which can be understood as the preset task identification condition. For an input training sample text x, see formula (3): if the cosine distance between the text feature encoding h(x) and every task prompt key is greater than the adaptable decision boundary corresponding to that task, the text belongs to a task the text processing model has not encountered during the training phase; therefore, the designated task prompt information may be used to replace the training sample task prompt information corresponding to the training sample text, where the designated task prompt information is trained with all training sample texts during the training phase. Specifically, the boundary optimization loss can be calculated by the following formula (4), fixing k_t(T_i) and optimizing the adaptable decision boundary δ_i:
||h(x), k_t(T_i)|| > δ_i for all i = 1, …, N   (3)

L_b = | ||h(x), k_t(T_i)|| − δ_i |   (4)
wherein T_i is the task corresponding to the training sample text x, k_t(T_i) is the task prompt key corresponding to task T_i, and N is the number of known task categories.
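The open-task check of formula (3) and the boundary optimization loss of formula (4) may be sketched as follows (using cosine distance; the tensor layout and all names are hypothetical assumptions):

    import torch
    import torch.nn.functional as F

    def is_open_task(h_x, task_keys, deltas):
        # Formula (3): x belongs to an unseen (open) task if its cosine distance
        # to every known task prompt key exceeds that task's decision boundary.
        cos = F.cosine_similarity(h_x.unsqueeze(0), task_keys, dim=1)  # (N,)
        return bool(((1.0 - cos) > deltas).all())

    def boundary_loss(h_x, key_ti, delta_i):
        # Formula (4): L_b = | ||h(x), k_t(T_i)|| - delta_i |; k_t(T_i) is held
        # fixed (detached) so that only the boundary delta_i is optimized.
        dist = 1.0 - F.cosine_similarity(h_x, key_ti.detach(), dim=0)
        return (dist - delta_i).abs()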
By applying the scheme of the embodiment of the specification, feature extraction is performed on the first training sample text to obtain the first training text features of the first training sample text; the distance between the first training text features and the first training sample prompt information is calculated; and the first training sample task prompt information is replaced with the designated task prompt information when the distance is greater than the preset task identification condition. In this way, the text processing model can also process previously unseen tasks, improving the universality of the text processing model.
In an optional embodiment of the present disclosure, when the text processing model is trained, the training sample task prompt information carried by the training sample text may undergo secondary optimization to improve the accuracy of the text processing model; that is, the training sample text also carries a task attribute. After extracting the first training sample text from the plurality of training sample texts, the method may further include the following steps:
Determining a first prediction attribute of the first training sample text according to the first training sample text and first training sample task prompt information carried by the first training sample text;
calculating an attribute loss value according to the first predicted attribute and the first task attribute carried by the first training sample text;
and optimizing the task prompt information of the first training sample according to the attribute loss value until a preset optimization stop condition is reached, and obtaining the optimized task prompt information of the first training sample.
It should be noted that feature extraction can be performed on the training sample text to determine its corresponding training text features, and a plurality of training sample prompt information corresponding to the training sample text, including the training sample task prompt information, is then obtained by matching against the training text features.
Further, after the training sample task prompt information is obtained, the predicted attribute of the training sample text can be predicted from it, and the attribute loss between the predicted attribute and the real task attribute of the training sample text can be calculated, so that the training sample task prompt information is optimized according to the attribute loss. The preset optimization stop condition includes a preset optimization stop threshold and a preset number of optimization iterations; reference may be made to the training process of the text processing model, which is not repeated in the embodiments of this specification.
In practical applications, the training sample task prompt information corresponding to a training sample text can be selected through scheduled sampling: the real task information of the current sample is used with decreasing probability; otherwise, the task information of the sample is inferred, i.e., the training sample task prompt information obtained by matching against the training text features is used.
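The scheduled sampling described here may be sketched as follows (the linear decay schedule is an assumption; names are hypothetical):

    import random

    def select_task_prompt(step, total_steps, real_task_prompt, inferred_task_prompt):
        # Use the real task information with decreasing probability; otherwise
        # use the task prompt inferred by matching the training text features.
        p_real = max(0.0, 1.0 - step / total_steps)
        return real_task_prompt if random.random() < p_real else inferred_task_prompt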
By applying the scheme of the embodiment of the specification, the first prediction attribute of the first training sample text is determined according to the first training sample text and the first training sample task prompt information it carries; the attribute loss value is calculated according to the first prediction attribute and the first task attribute carried by the first training sample text; and the first training sample task prompt information is optimized according to the attribute loss value until a preset optimization stop condition is reached, obtaining the optimized first training sample task prompt information. This secondary optimization of the training sample task prompt information carried by the training sample text improves the accuracy of the text processing model.
In the embodiments of this specification, aiming at the fact that a text processing system must be continuously trained on new tasks during its life cycle, and that training on new tasks easily causes forgetting of old tasks, a hierarchical-prompt-based lifelong learning text processing scheme is provided, which uses prompt information covering multiple knowledge representation granularities to acquire the different kinds of knowledge arising in lifelong learning of the text processing system. To address the lack of task annotations for inputs in test scenarios, a multi-level prompt matching mechanism is provided, better adapting lifelong learning of the text processing system to real environments. In addition, learnable candidate prompt keys in one-to-one correspondence with the prompt features are provided, so that each sample text can be matched to a prompt key at every level and the prompt features are acquired automatically, trained and tested together with the sample text; this solves the problem that general lifelong learning methods cannot be applied in real environments, where the test task is unknown and the task attribute of a sample text cannot be obtained.
Referring to fig. 5, fig. 5 shows a flowchart of a question-answer text processing method according to an embodiment of the present disclosure, which specifically includes the following steps:
step 502: and extracting features of the to-be-processed question-answer text to obtain question-answer text features of the to-be-processed question-answer text.
Step 504: and matching a plurality of target question-answer prompt messages corresponding to the question-answer text to be processed by utilizing the question-answer text characteristics, wherein the granularity of the plurality of target question-answer prompt messages is different.
Step 506: inputting the question-answering text characteristics and a plurality of target question-answering prompt messages into a text processing model, and processing the text processing model to obtain a question-answering text processing result, wherein the text processing model is obtained by training based on a plurality of training sample texts and training sample prompt messages corresponding to the training sample texts.
Specifically, the question-answer text to be processed includes various types of question-answer texts, such as multiple-choice question-answer texts and dialogue-scenario question-answer texts, selected according to the actual situation; the embodiments of the present specification impose no limitation on this.
It should be noted that, the specific implementation manners of step 502, step 504, and step 506 are the same as the implementation manner of the text processing method shown in fig. 3, and will not be described in detail in the embodiment of the present disclosure.
By applying the scheme of the embodiment of the specification, feature extraction is performed on the question-answer text to be processed to obtain its question-answer text features; a plurality of target question-answer prompt information corresponding to the question-answer text to be processed is matched using the question-answer text features, the plurality of target question-answer prompt information differing in granularity; and the question-answer text features and the plurality of target question-answer prompt information are input into the text processing model and processed to obtain the question-answer text processing result, wherein the text processing model is trained based on a plurality of training sample texts and the training sample prompt information corresponding to each training sample text. Matching a plurality of target question-answer prompt information for the question-answer text to be processed better guides the text processing model in processing it; and because the plurality of target question-answer prompt information differs in granularity, question-answer prompt information of different granularities can learn and store knowledge of different granularities and angles, helping the text processing model acquire task-specific knowledge, store knowledge at coarse granularity and share knowledge at fine granularities that cannot be manually divided and are not addressed by traditional methods, further improving the accuracy of the question-answer text processing result.
Referring to fig. 6, fig. 6 shows a flowchart of another question-answer text processing method provided in an embodiment of the present disclosure, which specifically includes:
Three to-be-processed tasks sent by the user are received: task 1, task 2 and task 3. Task 1 is an extractive question-answering task and comprises to-be-processed question-answer text 1; task 2 is an abstractive question-answering task and comprises to-be-processed question-answer text 2; task 3 is a multiple-choice question-answering task and comprises to-be-processed question-answer text 3. Feature extraction is performed on to-be-processed question-answer texts 1, 2 and 3 respectively to obtain question-answer text features 1, 2 and 3. Taking to-be-processed question-answer text 1 as an example, a plurality of target question-answer prompt information corresponding to it is matched using question-answer text features 1, including target global prompt information (General Prompt), target form prompt information (Format Prompt), target task prompt information (Task Prompt) and target meta-prompt information (Meta Prompt); question-answer text features 1, the target global prompt information, the target form prompt information, the target task prompt information and the target meta-prompt information are concatenated and input into the text processing model, which processes them to obtain question-answer text processing result 1, as sketched below. Similarly, question-answer text processing result 2 corresponding to to-be-processed question-answer text 2 and question-answer text processing result 3 corresponding to to-be-processed question-answer text 3 can be obtained.
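The concatenation step above may be sketched as follows (representing each prompt and the text features as embedding tensors is an assumption; names are hypothetical):

    import torch

    def build_model_input(qa_text_features, general_p, format_p, task_p, meta_p):
        # Prepend the four granularities of prompt features to the
        # question-answer text features before feeding the text processing model.
        return torch.cat([general_p, format_p, task_p, meta_p, qa_text_features], dim=0)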
By applying the scheme of the embodiment of the specification, corresponding prompt information can be matched for each input, and the prompt information and the input are fed together through a text-to-text processing model to produce the output, realizing efficient and accurate question-answer text processing.
Referring to FIG. 7, FIG. 7 illustrates a frame diagram of a text processing model training system provided in one embodiment of the present description, wherein the text processing model training system includes a cloud-side device 702 and an end-side device 704;
the end-side device 704 is configured to construct a second sample set, and send the second sample set to the cloud-side device 702, where the second sample set includes a plurality of training sample texts, and the training sample texts carry training sample prompt information and sample processing results;
the cloud side device 702 is configured to perform feature extraction on a plurality of training sample texts to obtain training text features of the plurality of training sample texts; matching a plurality of training sample prompt messages corresponding to each training sample text by utilizing training text characteristics, wherein the training sample prompt messages comprise training sample global prompt messages, training sample form prompt messages, training sample task prompt messages and training sample meta-prompt messages; training the initial processing model according to training text characteristics of a plurality of training sample texts and a plurality of training sample prompt messages corresponding to each training sample text to obtain model parameters of a text processing model obtained through training;
Cloud-side device 702 is further configured to send model parameters of the trained text processing model to end-side device 704.
The end-side device 704 is further configured to construct a text processing model according to the model parameters of the text processing model sent by the cloud-side device 702.
By applying the scheme of the embodiment of the specification, the cloud-side device acquires a second sample set, wherein the second sample set comprises a plurality of training sample texts carrying training sample prompt information and sample processing results; performs feature extraction on the plurality of training sample texts to obtain their training text features; matches, using the training text features, a plurality of training sample prompt information corresponding to each training sample text, wherein the training sample prompt information comprises training sample global prompt information, training sample form prompt information, training sample task prompt information and training sample meta-prompt information; trains the initial processing model according to the training text features of the plurality of training sample texts and the plurality of training sample prompt information corresponding to each training sample text, obtaining the model parameters of the trained text processing model; and sends the model parameters of the trained text processing model to the end-side device. Matching a plurality of training sample prompt information for each training sample text using the training text features helps the text processing model acquire task-specific knowledge, store knowledge at coarse granularity and share knowledge at fine granularities that cannot be manually divided and are not addressed by traditional methods, further improving the accuracy of the text processing model.
Referring to fig. 8, fig. 8 shows a flowchart of a text processing model training method provided in an embodiment of the present disclosure, where the text processing model training method is applied to cloud-side equipment, and specifically includes the following steps:
step 802: the cloud-side device acquires a second sample set.
The second sample set comprises a plurality of training sample texts, and the training sample texts carry training sample prompt information and sample processing results.
Step 804: and the cloud side equipment performs feature extraction on the plurality of training sample texts to obtain training text features of the plurality of training sample texts.
Step 806: and the cloud side equipment matches a plurality of training sample prompt messages corresponding to each training sample text by utilizing the training text characteristics.
The training sample prompt information comprises training sample global prompt information, training sample form prompt information, training sample task prompt information and training sample meta-prompt information.
Step 808: the cloud side equipment trains the initial processing model according to training text characteristics of a plurality of training sample texts and a plurality of training sample prompt messages corresponding to each training sample text, and model parameters of a text processing model obtained through training are obtained.
Step 810: and the cloud side equipment transmits the model parameters of the text processing model obtained through training to the end side equipment.
It should be noted that, the specific implementation manners of step 802, step 804, step 806, and step 808 are the same as the implementation manner of the text processing method shown in fig. 3, and will not be described in detail in the embodiment of the present disclosure.
By applying the scheme of the embodiment of the specification, the cloud-side device acquires a second sample set, wherein the second sample set comprises a plurality of training sample texts carrying training sample prompt information and sample processing results; performs feature extraction on the plurality of training sample texts to obtain their training text features; matches, using the training text features, a plurality of training sample prompt information corresponding to each training sample text, wherein the training sample prompt information comprises training sample global prompt information, training sample form prompt information, training sample task prompt information and training sample meta-prompt information; trains the initial processing model according to the training text features of the plurality of training sample texts and the plurality of training sample prompt information corresponding to each training sample text, obtaining the model parameters of the trained text processing model; and sends the model parameters of the trained text processing model to the end-side device. Matching a plurality of training sample prompt information for each training sample text using the training text features helps the text processing model acquire task-specific knowledge, store knowledge at coarse granularity and share knowledge at fine granularities that cannot be manually divided and are not addressed by traditional methods, further improving the accuracy of the text processing model.
Referring to fig. 9, fig. 9 shows an interface schematic of a text processing interface according to an embodiment of the present disclosure. The text processing interface comprises a text uploading interface and a text processing result display interface, wherein the text uploading interface comprises a text upload box, a "determine" control and a "cancel" control, and the text processing result display interface comprises a text processing result display box. The user uploads the text to be processed in the text upload box, for example, "Text: Today is clear, suitable for going out for a walk. Question: How is the weather today?", and clicks the "determine" control; the end-side device inputs the text to be processed and the plurality of target prompt information corresponding to it into the text processing model, which processes it to obtain the text processing result, and "clear" is displayed in the text processing result display box.
It should be noted that, the manner in which the user operates the control includes any manner such as clicking, double clicking, touch control, mouse hovering, sliding, long pressing, voice control or shaking, and the embodiment of the present disclosure does not limit the foregoing.
The text processing method provided in the present specification will be further described with reference to fig. 10 by taking an application of the text processing method in the field of question answering as an example. Fig. 10 shows a process flow chart of a text processing method according to an embodiment of the present disclosure, which specifically includes the following steps:
Step 1002: and receiving a question-answer text processing request input by a user, wherein the question-answer text processing request comprises a question-answer text to be processed.
Step 1004: and inputting the to-be-processed question-answering text into a text encoder, and obtaining the question-answering text characteristics of the to-be-processed question-answering text through encoding processing of the text encoder.
Step 1006: and acquiring preset target global prompt information.
Step 1008: and analyzing the to-be-processed question-answer text to determine the question-answer text form of the to-be-processed question-answer text.
Step 1010: and determining target form prompt information corresponding to the to-be-processed question-answering text according to the question-answering text form.
Step 1012: and determining target task prompt information corresponding to the question-answering text to be processed according to the feature similarity between the question-answering text features and the candidate task prompt information.
Step 1014: and determining target meta-prompt information corresponding to the to-be-processed question-answer text according to the feature similarity between the question-answer text features and the candidate meta-prompt information.
Step 1016: the question-answer text features, the target global prompt information, the target form prompt information, the target task prompt information and the target meta-prompt information are concatenated and input into the text processing model, which processes them to obtain the question-answer text processing result.
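Steps 1012 and 1014 amount to similarity-based lookups over candidate prompt keys; a compact sketch follows (cosine similarity as the matching criterion and mean-pooled features are assumptions):

    import torch
    import torch.nn.functional as F

    def match_prompt(text_features, candidate_keys, candidate_prompts):
        # Return the candidate prompt whose key is most similar to the
        # (mean-pooled) text features of the question-answer text.
        query = text_features.mean(dim=0, keepdim=True)            # (1, d)
        sims = F.cosine_similarity(query, candidate_keys, dim=1)   # (K,)
        return candidate_prompts[int(sims.argmax())]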
It should be noted that, the specific implementation manner of steps 1002 to 1016 is the same as the implementation manner of the text processing method shown in fig. 3, and the description of the embodiment of the present disclosure is omitted.
By applying the scheme of the embodiment of the specification, matching a plurality of target prompt information for the text to be processed better guides the text processing model in processing the text to be processed; and because the plurality of target prompt information differs in granularity, prompt information of different granularities can learn and store knowledge of different granularities and angles, helping the text processing model acquire task-specific knowledge, store knowledge at coarse granularity and share knowledge at fine granularities that cannot be manually divided and are not addressed by traditional methods, further improving the accuracy of the text processing result.
It should be noted that, the information and data such as the text to be processed, the question-answer text to be processed, the sample text, the training sample text and the like in the above embodiment of the method are all information and data authorized by the user or fully authorized by each party, and the collection, the use and the processing of the related data need to comply with the related laws and regulations and standards of the related country and region, and are provided with corresponding operation entries for the user to select authorization or rejection.
Corresponding to the above text processing method embodiment, the present disclosure further provides an embodiment of a text processing apparatus, and fig. 11 shows a schematic structural diagram of a text processing apparatus provided in one embodiment of the present disclosure. As shown in fig. 11, the apparatus includes:
the first extraction module 1102 is configured to perform feature extraction on the text to be processed to obtain text features of the text to be processed;
a first matching module 1104 configured to match a plurality of target prompt messages corresponding to the text to be processed by using text features, wherein the granularity of the plurality of target prompt messages is different;
the first obtaining module 1106 is configured to input the text feature and the plurality of target prompt information into a text processing model, and obtain a text processing result through processing of the text processing model, where the text processing model is obtained by training based on the plurality of training sample texts and training sample prompt information corresponding to each training sample text.
Optionally, the first extraction module 1102 is further configured to input the text to be processed into a text encoder, and obtain text features of the text to be processed through encoding processing of the text encoder.
Optionally, the first matching module 1104 is further configured to obtain a plurality of candidate prompt messages, where the candidate prompt messages include candidate task prompt messages and candidate meta prompt messages; and determining target task prompt information and target meta prompt information corresponding to the text to be processed according to the feature similarity between the text features and the candidate prompt information.
Optionally, the plurality of target prompt messages include a target global prompt message and a target form prompt message; the first matching module 1104 is further configured to obtain preset target global prompt information; analyzing the text to be processed, and determining the text form of the text to be processed; and determining target form prompt information corresponding to the text to be processed according to the text form.
Optionally, the first matching module 1104 is further configured to obtain a first sample set, wherein the first sample set includes a plurality of sample texts, and task attributes carried by the plurality of sample texts are different; extracting a first sample text from the plurality of sample texts, wherein the first sample text is any one of the plurality of sample texts; screening a negative sample text from the plurality of sample texts according to a first task attribute carried by the first sample text, wherein the task attribute of the negative sample text is different from the first task attribute; calculating task prompt loss according to the first sample text, the negative sample text and first sample task prompt information corresponding to the first sample text; and optimizing the first sample task prompt information according to the task prompt loss, and returning to execute the step of extracting the first sample text from the plurality of sample texts until a preset stop condition is reached, so as to obtain candidate task prompt information.
Optionally, the first matching module 1104 is further configured to perform feature extraction on the plurality of sample texts to obtain sample text features of the plurality of sample texts; calculate a first meta-prompt loss according to the sample text features and the sample meta-prompt information of the plurality of sample texts; and optimize the sample meta-prompt information according to the first meta-prompt loss until a preset stop condition is reached, so as to obtain candidate meta-prompt information.
Optionally, the apparatus further comprises: a clustering module configured to cluster the plurality of sample text features to determine at least one cluster center feature, and to calculate a second meta-prompt loss according to the plurality of sample text features, the sample meta-prompt information of the plurality of sample texts and the at least one cluster center feature; the first matching module 1104 is further configured to optimize the sample meta-prompt information according to the first meta-prompt loss and the second meta-prompt loss until a preset stop condition is reached, thereby obtaining candidate meta-prompt information.
Optionally, the apparatus further comprises: a text processing model training module configured to acquire a second sample set, wherein the second sample set comprises a plurality of training sample texts, the training sample texts carry training sample prompt information and sample processing results, and the training sample prompt information comprises training sample global prompt information, training sample form prompt information, training sample task prompt information and training sample meta-prompt information; extracting a first training sample text from the plurality of training sample texts, wherein the first training sample text is any one of the plurality of training sample texts; inputting the first training sample text and the first training sample prompt information into an initial processing model to obtain a first prediction result corresponding to the first training sample text; calculating a prediction loss value according to the first prediction result and a first sample processing result carried by the first training sample text; and adjusting model parameters of the initial processing model according to the predicted loss value, and returning to execute the step of extracting the first training sample text from the plurality of training sample texts until a preset processing stop condition is reached, so as to obtain the text processing model.
Optionally, the first training sample prompt includes a first training sample task prompt; the apparatus further comprises: the computing module is configured to perform feature extraction on the first training sample text to obtain first training text features of the first training sample text; calculating the distance between the first training text feature and the first training sample prompt information; and under the condition that the distance is greater than the preset task identification condition, replacing the task prompt information of the first training sample with the appointed task prompt information.
Optionally, the training sample text also carries task attributes; the apparatus further comprises: the optimizing module is configured to determine a first prediction attribute of the first training sample text according to the first training sample text and first training sample task prompt information carried by the first training sample text; calculating an attribute loss value according to the first predicted attribute and the first task attribute carried by the first training sample text; and optimizing the task prompt information of the first training sample according to the attribute loss value until a preset optimization stop condition is reached, and obtaining the optimized task prompt information of the first training sample.
By applying the scheme of the embodiment of the specification, feature extraction is performed on the text to be processed to obtain its text features; a plurality of target prompt information corresponding to the text to be processed is matched using the text features, the plurality of target prompt information differing in granularity; and the text features and the plurality of target prompt information are input into the text processing model and processed to obtain the text processing result, wherein the text processing model is trained based on a plurality of training sample texts and the training sample prompt information corresponding to each training sample text. Matching a plurality of target prompt information for the text to be processed better guides the text processing model in processing it; and because the plurality of target prompt information differs in granularity, prompt information of different granularities can learn and store knowledge of different granularities and angles, helping the text processing model acquire task-specific knowledge, store knowledge at coarse granularity and share knowledge at fine granularities that cannot be manually divided and are not addressed by traditional methods, further improving the accuracy of the text processing result.
The above is an exemplary scheme of a text processing apparatus of the present embodiment. It should be noted that, the technical solution of the text processing apparatus and the technical solution of the text processing method belong to the same concept, and details of the technical solution of the text processing apparatus, which are not described in detail, can be referred to the description of the technical solution of the text processing method.
Corresponding to the above-mentioned question-answering text processing method embodiment, the present disclosure further provides a question-answering text processing apparatus embodiment, and fig. 12 shows a schematic structural diagram of a question-answering text processing apparatus provided in one embodiment of the present disclosure. As shown in fig. 12, the apparatus includes:
a second extraction module 1202 configured to perform feature extraction on the to-be-processed question-answer text, and obtain question-answer text features of the to-be-processed question-answer text;
a second matching module 1204, configured to match a plurality of target question-answer prompt information corresponding to the question-answer text to be processed by using the question-answer text feature, where the granularity of the plurality of target question-answer prompt information is different;
the second obtaining module 1206 is configured to input the question-answer text features and the multiple target question-answer prompt information into a text processing model, and obtain a question-answer text processing result through processing of the text processing model, where the text processing model is obtained through training based on multiple training sample texts and training sample prompt information corresponding to each training sample text.
By applying the scheme of the embodiment of the specification, feature extraction is performed on the question-answer text to be processed to obtain its question-answer text features; a plurality of target question-answer prompt information corresponding to the question-answer text to be processed is matched using the question-answer text features, the plurality of target question-answer prompt information differing in granularity; and the question-answer text features and the plurality of target question-answer prompt information are input into the text processing model and processed to obtain the question-answer text processing result, wherein the text processing model is trained based on a plurality of training sample texts and the training sample prompt information corresponding to each training sample text. Matching a plurality of target question-answer prompt information for the question-answer text to be processed better guides the text processing model in processing it; and because the plurality of target question-answer prompt information differs in granularity, question-answer prompt information of different granularities can learn and store knowledge of different granularities and angles, helping the text processing model acquire task-specific knowledge, store knowledge at coarse granularity and share knowledge at fine granularities that cannot be manually divided and are not addressed by traditional methods, further improving the accuracy of the question-answer text processing result.
The above is an exemplary scheme of a question-answering text processing apparatus of the present embodiment. It should be noted that, the technical solution of the question-answering text processing device and the technical solution of the question-answering text processing method belong to the same concept, and the details of the technical solution of the question-answering text processing device which are not described in detail can be referred to the description of the technical solution of the question-answering text processing method.
Corresponding to the above embodiment of the text processing model training method, the present disclosure further provides an embodiment of a text processing model training device, and fig. 13 shows a schematic structural diagram of a text processing model training device provided in one embodiment of the present disclosure. As shown in fig. 13, the apparatus includes:
an obtaining module 1302 configured to obtain a second sample set, where the second sample set includes a plurality of training sample texts, and the training sample texts carry training sample prompt information and sample processing results;
a third extraction module 1304 configured to perform feature extraction on the plurality of training sample texts to obtain training text features of the plurality of training sample texts;
a third matching module 1306 configured to match a plurality of training sample prompt messages corresponding to each training sample text by using training text features, where the training sample prompt messages include training sample global prompt messages, training sample form prompt messages, training sample task prompt messages, and training sample meta prompt messages;
The training module 1308 is configured to train the initial processing model according to training text characteristics of a plurality of training sample texts and a plurality of training sample prompt messages corresponding to each training sample text, and obtain model parameters of a text processing model obtained by training;
a sending module 1310 configured to send model parameters of the trained text processing model to the end-side device.
By applying the scheme of the embodiment of the specification, the cloud-side device acquires a second sample set, wherein the second sample set comprises a plurality of training sample texts carrying training sample prompt information and sample processing results; performs feature extraction on the plurality of training sample texts to obtain their training text features; matches, using the training text features, a plurality of training sample prompt information corresponding to each training sample text, wherein the training sample prompt information comprises training sample global prompt information, training sample form prompt information, training sample task prompt information and training sample meta-prompt information; trains the initial processing model according to the training text features of the plurality of training sample texts and the plurality of training sample prompt information corresponding to each training sample text, obtaining the model parameters of the trained text processing model; and sends the model parameters of the trained text processing model to the end-side device. Matching a plurality of training sample prompt information for each training sample text using the training text features helps the text processing model acquire task-specific knowledge, store knowledge at coarse granularity and share knowledge at fine granularities that cannot be manually divided and are not addressed by traditional methods, further improving the accuracy of the text processing model.
The above is an exemplary scheme of a text processing model training apparatus of the present embodiment. It should be noted that, the technical solution of the text processing model training device and the technical solution of the text processing model training method belong to the same concept, and details of the technical solution of the text processing model training device which are not described in detail can be referred to the description of the technical solution of the text processing model training method.
FIG. 14 illustrates a block diagram of a computing device provided in one embodiment of the present description. The components of computing device 1400 include, but are not limited to, a memory 1410 and a processor 1420. Processor 1420 is coupled to memory 1410 via bus 1430, and database 1450 is used to store data.
Computing device 1400 also includes an access device 1440, which access device 1440 enables computing device 1400 to communicate via one or more networks 1460. Examples of such networks include public switched telephone networks (PSTN, public Switched Telephone Network), local area networks (LAN, local Area Network), wide area networks (WAN, wide Area Network), personal area networks (PAN, personal Area Network), or combinations of communication networks such as the internet. The access device 1440 may include one or more of any type of network interface, wired or wireless (e.g., network interface card (NIC, network Interface Card)), such as an IEEE802.11 wireless local area network (WLAN, wireless Local Area Networks) wireless interface, a worldwide interoperability for microwave access (Wi-MAX, world Interoperability for Microwave Access) interface, an ethernet interface, a universal serial bus (USB, universal Serial Bus) interface, a cellular network interface, a bluetooth interface, a near-field communication (NFC, near Field Communication) interface, and so forth.
In one embodiment of the present description, the above-described components of computing device 1400, as well as other components not shown in FIG. 14, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device illustrated in FIG. 14 is for exemplary purposes only and is not intended to limit the scope of the present description. Those skilled in the art may add or replace other components as desired.
Computing device 1400 may be any type of stationary or mobile computing device, including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smart phone), wearable computing device (e.g., smart watch, smart glasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop computer or personal computer (PC, personal Computer). Computing device 1400 may also be a mobile or stationary server.
Wherein the processor 1420 is operative to execute computer-executable instructions that, when executed by the processor, perform the steps of the text processing method or question-and-answer text processing method or text processing model training method described above.
The foregoing is a schematic illustration of a computing device of this embodiment. It should be noted that, the technical solution of the computing device belongs to the same concept as the technical solutions of the text processing method, the question-answer text processing method and the text processing model training method, and details of the technical solution of the computing device, which are not described in detail, can be described by referring to the technical solutions of the text processing method, the question-answer text processing method or the text processing model training method.
An embodiment of the present disclosure also provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the above-described text processing method or question-answering text processing method or text processing model training method.
The above is an exemplary version of a computer-readable storage medium of the present embodiment. It should be noted that, the technical solution of the storage medium belongs to the same concept as the technical solution of the text processing method, the question-answer text processing method and the text processing model training method, and details of the technical solution of the storage medium which are not described in detail can be referred to the description of the technical solution of the text processing method, the question-answer text processing method or the text processing model training method.
An embodiment of the present specification further provides a computer program, wherein the computer program, when executed in a computer, causes the computer to perform the steps of the above-described text processing method or question-answering text processing method or text processing model training method.
The above is an exemplary version of a computer program of the present embodiment. It should be noted that, the technical solution of the computer program and the technical solutions of the text processing method, the question-answer text processing method and the text processing model training method belong to the same concept, and the details of the technical solution of the computer program, which are not described in detail, can be referred to the description of the technical solutions of the text processing method, the question-answer text processing method or the text processing model training method.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The computer instructions include computer program code that may be in source code form, object code form, executable file or some intermediate form, etc. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer Memory, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth.
It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of combinations of actions, but it should be understood by those skilled in the art that the embodiments are not limited by the order of actions described, as some steps may be performed in other order or simultaneously according to the embodiments of the present disclosure. Further, those skilled in the art will appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily all required for the embodiments described in the specification.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to the related descriptions of other embodiments.
The preferred embodiments of the present specification disclosed above are merely used to help clarify the present specification. Alternative embodiments are not intended to be exhaustive or to limit the invention to the precise form disclosed. Obviously, many modifications and variations are possible in light of the teaching of the embodiments. The embodiments were chosen and described in order to best explain the principles of the embodiments and the practical application, to thereby enable others skilled in the art to best understand and utilize the invention. This specification is to be limited only by the claims and the full scope and equivalents thereof.

Claims (14)

1. A text processing method, comprising:
extracting characteristics of a text to be processed to obtain text characteristics of the text to be processed;
matching a plurality of target prompt messages corresponding to the text to be processed by utilizing the text characteristics, wherein the granularity of the target prompt messages is different;
inputting the text characteristics and the target prompt messages into a text processing model, and processing the text processing model to obtain a text processing result, wherein the text processing model is obtained by training based on a plurality of training sample texts and training sample prompt messages corresponding to the training sample texts.
2. The method of claim 1, wherein the extracting features of the text to be processed to obtain the text features of the text to be processed comprises:
inputting the text to be processed into a text encoder, and obtaining the text features of the text to be processed through encoding by the text encoder.
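One plausible realization of such a text encoder, sketched with PyTorch's built-in Transformer modules; a pretrained encoder (for example a BERT-style model) would be a natural drop-in, but a small randomly initialized TransformerEncoder keeps the example self-contained. The vocabulary size, model width and mean pooling are illustrative assumptions.

    import torch

    vocab_size, d_model = 1000, 64
    embedding = torch.nn.Embedding(vocab_size, d_model)
    encoder = torch.nn.TransformerEncoder(
        torch.nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True),
        num_layers=2,
    )

    def encode(token_ids):                        # token_ids: (batch, seq_len)
        hidden = encoder(embedding(token_ids))    # (batch, seq_len, d_model)
        return hidden.mean(dim=1)                 # pool to one feature per text

    features = encode(torch.randint(0, vocab_size, (1, 12)))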
3. The method of claim 1, wherein the matching, by using the text features, the plurality of target prompt messages corresponding to the text to be processed comprises:
acquiring a plurality of candidate prompt messages, wherein the candidate prompt messages comprise candidate task prompt messages and candidate meta prompt messages;
and determining target task prompt messages and target meta prompt messages corresponding to the text to be processed according to feature similarities between the text features and the candidate prompt messages.
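A hedged sketch of this matching step: pick the nearest candidate from each pool. The pool sizes and the choice of cosine similarity as the feature-similarity measure are assumptions, not taken from the claim.

    import torch

    def match_prompts(text_feature, task_prompts, meta_prompts):
        task_sim = torch.cosine_similarity(text_feature, task_prompts, dim=-1)
        meta_sim = torch.cosine_similarity(text_feature, meta_prompts, dim=-1)
        return task_prompts[task_sim.argmax()], meta_prompts[meta_sim.argmax()]

    feat = torch.randn(64)
    target_task, target_meta = match_prompts(feat, torch.randn(5, 64), torch.randn(8, 64))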
4. The method of claim 1, wherein the plurality of target prompt messages comprise target global prompt information and target form prompt information, and the matching, by using the text features, the plurality of target prompt messages corresponding to the text to be processed comprises:
acquiring preset target global prompt information;
analyzing the text to be processed, and determining the text form of the text to be processed;
and determining target form prompt information corresponding to the text to be processed according to the text form.
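The claim leaves "text form" abstract; the sketch below assumes three toy forms (question, dialogue, plain prose) detected by simple surface heuristics, alongside one preset global prompt. Both the forms and the heuristics are illustrative assumptions.

    GLOBAL_PROMPT = "global"                      # preset, input-independent prompt
    FORM_PROMPTS = {"question": "form:question",
                    "dialogue": "form:dialogue",
                    "prose": "form:prose"}

    def form_prompt(text: str) -> str:
        lines = text.splitlines() or [""]
        if text.rstrip().endswith("?"):
            return FORM_PROMPTS["question"]
        if ":" in lines[0]:                       # "speaker: utterance" style line
            return FORM_PROMPTS["dialogue"]
        return FORM_PROMPTS["prose"]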
5. The method of claim 3, wherein the acquiring a plurality of candidate prompt messages comprises:
acquiring a first sample set, wherein the first sample set comprises a plurality of sample texts, and task attributes carried by the plurality of sample texts are different;
extracting a first sample text from the plurality of sample texts, wherein the first sample text is any one of the plurality of sample texts;
screening a negative sample text from the plurality of sample texts according to a first task attribute carried by the first sample text, wherein the task attribute of the negative sample text is different from the first task attribute;
calculating a task prompt loss according to the first sample text, the negative sample text and first sample task prompt information corresponding to the first sample text;
and optimizing the first sample task prompt information according to the task prompt loss, and returning to the step of extracting the first sample text from the plurality of sample texts until a preset stop condition is reached, to obtain candidate task prompt messages.
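One common reading of such a loss is a triplet-style contrastive objective: the task prompt should lie close to features of samples carrying its own task attribute and far from the negative sample. The triplet-margin form below is an assumption; the claim only states that a loss is computed from the three quantities.

    import torch
    import torch.nn.functional as F

    def task_prompt_loss(prompt, pos_feat, neg_feat, margin=0.2):
        pos = 1 - F.cosine_similarity(prompt, pos_feat, dim=-1)   # pull toward own task
        neg = 1 - F.cosine_similarity(prompt, neg_feat, dim=-1)   # push from other tasks
        return F.relu(pos - neg + margin)

    prompt = torch.randn(64, requires_grad=True)
    loss = task_prompt_loss(prompt, torch.randn(64), torch.randn(64))
    loss.backward()                               # the gradient optimizes the prompt itself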
6. The method of claim 3, wherein the acquiring a plurality of candidate prompt messages comprises:
extracting features of a plurality of sample texts to obtain sample text features of the plurality of sample texts;
calculating a first meta-prompt loss according to the sample text features and sample meta prompt information of the plurality of sample texts;
and optimizing the sample meta prompt information according to the first meta-prompt loss until a preset stop condition is reached, to obtain candidate meta prompt messages.
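One way to read the first meta-prompt loss: each meta prompt should stay close to the sample features it summarizes, so the mean cosine distance between every feature and its nearest meta prompt is minimized. This formulation is an assumption for illustration.

    import torch
    import torch.nn.functional as F

    def first_meta_prompt_loss(sample_feats, meta_prompts):
        sims = F.cosine_similarity(sample_feats.unsqueeze(1),   # (N, 1, d)
                                   meta_prompts.unsqueeze(0),   # (1, M, d)
                                   dim=-1)                      # (N, M)
        return (1 - sims.max(dim=1).values).mean()              # nearest prompt per sample

    meta = torch.randn(4, 64, requires_grad=True)
    loss = first_meta_prompt_loss(torch.randn(32, 64), meta)
    loss.backward()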
7. The method of claim 6, wherein before the optimizing the sample meta prompt information according to the first meta-prompt loss until a preset stop condition is reached to obtain candidate meta prompt messages, the method further comprises:
clustering the plurality of sample text features to determine at least one cluster center feature;
and calculating a second meta-prompt loss according to the plurality of sample text features, the sample meta prompt information of the plurality of sample texts and the at least one cluster center feature;
and the optimizing the sample meta prompt information according to the first meta-prompt loss until a preset stop condition is reached to obtain candidate meta prompt messages comprises:
optimizing the sample meta prompt information according to the first meta-prompt loss and the second meta-prompt loss until a preset stop condition is reached, to obtain candidate meta prompt messages.
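A sketch of the clustering step under stated assumptions: k-means over the sample features yields the cluster center features, and the second meta-prompt loss pulls each meta prompt toward its nearest center. Both k-means and the nearest-center pairing are illustrative choices, not stated in the claim.

    import torch
    import torch.nn.functional as F
    from sklearn.cluster import KMeans

    feats = torch.randn(128, 64)                  # sample text features
    centers = torch.tensor(
        KMeans(n_clusters=4, n_init=10).fit(feats.numpy()).cluster_centers_,
        dtype=torch.float32,
    )                                             # at least one cluster center feature
    meta = torch.randn(4, 64, requires_grad=True)

    sims = F.cosine_similarity(meta.unsqueeze(1), centers.unsqueeze(0), dim=-1)
    second_loss = (1 - sims.max(dim=1).values).mean()
    second_loss.backward()                        # combined with the first loss in practice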
8. The method of claim 1, wherein the training of the text processing model comprises:
obtaining a second sample set, wherein the second sample set comprises a plurality of training sample texts, the training sample texts carry training sample prompt information and sample processing results, and the training sample prompt information comprises training sample global prompt information, training sample form prompt information, training sample task prompt information and training sample meta prompt information;
extracting a first training sample text from the plurality of training sample texts, wherein the first training sample text is any one of the plurality of training sample texts;
inputting the first training sample text and the first training sample prompt information into an initial processing model to obtain a first prediction result corresponding to the first training sample text;
calculating a prediction loss value according to the first prediction result and a first sample processing result carried by the first training sample text;
and adjusting model parameters of the initial processing model according to the prediction loss value, and returning to the step of extracting the first training sample text from the plurality of training sample texts until a preset processing stop condition is reached, to obtain the text processing model.
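A minimal training loop matching the claimed flow, with toy data: sample a training text, run the model on its features plus prompt information, compare against the carried sample processing result, and repeat until a preset stop condition. All shapes, the linear model and the cross-entropy loss are assumptions.

    import torch

    model = torch.nn.Linear(64 + 64, 2)                 # features ++ pooled prompt info
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = torch.nn.CrossEntropyLoss()

    features = torch.randn(256, 64)                     # training text features
    prompts = torch.randn(256, 64)                      # pooled training sample prompts
    labels = torch.randint(0, 2, (256,))                # carried sample processing results

    for step in range(1000):                            # preset processing stop condition
        i = torch.randint(0, 256, (16,))                # extract a training batch
        pred = model(torch.cat([features[i], prompts[i]], dim=-1))
        loss = loss_fn(pred, labels[i])                 # prediction loss value
        opt.zero_grad()
        loss.backward()
        opt.step()                                      # adjust model parameters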
9. The method of claim 8, wherein the first training sample prompt information comprises first training sample task prompt information; and before the inputting the first training sample text and the first training sample prompt information into an initial processing model to obtain a first prediction result corresponding to the first training sample text, the method further comprises:
extracting features of the first training sample text to obtain a first training text feature of the first training sample text;
calculating a distance between the first training text feature and the first training sample task prompt information;
and in a case where the distance is greater than a preset task recognition threshold, replacing the first training sample task prompt information with designated task prompt information.
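A sketch of this guard: if a sample's feature is too far from its matched task prompt, fall back to a designated default prompt. The Euclidean distance and the threshold value are illustrative assumptions.

    import torch

    DESIGNATED_PROMPT = torch.zeros(64)               # stand-in designated task prompt

    def select_task_prompt(text_feat, task_prompt, threshold=5.0):
        distance = torch.dist(text_feat, task_prompt)             # L2 distance
        return DESIGNATED_PROMPT if distance > threshold else task_prompt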
10. The method of claim 8, wherein the training sample texts further carry task attributes; and after the extracting the first training sample text from the plurality of training sample texts, the method further comprises:
determining a first predicted attribute of the first training sample text according to the first training sample text and first training sample task prompt information carried by the first training sample text;
calculating an attribute loss value according to the first predicted attribute and a first task attribute carried by the first training sample text;
and optimizing the first training sample task prompt information according to the attribute loss value until a preset optimization stop condition is reached, to obtain optimized first training sample task prompt information.
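One way to realize the attribute loss, under stated assumptions: a small classifier (added here for illustration, not named in the claim) predicts the task attribute from the text feature concatenated with the task prompt, and the cross-entropy against the carried attribute back-propagates into the prompt.

    import torch
    import torch.nn.functional as F

    n_attrs = 5                                         # assumed number of task attributes
    clf = torch.nn.Linear(64 + 64, n_attrs)             # hypothetical attribute classifier
    task_prompt = torch.randn(64, requires_grad=True)
    opt = torch.optim.Adam([task_prompt], lr=1e-2)      # only the prompt is optimized

    feat = torch.randn(64)                              # first training text feature
    true_attr = torch.tensor([3])                       # carried first task attribute
    pred = clf(torch.cat([feat, task_prompt])).unsqueeze(0)
    attr_loss = F.cross_entropy(pred, true_attr)        # attribute loss value
    opt.zero_grad()
    attr_loss.backward()
    opt.step()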
11. A question-answering text processing method, comprising:
extracting features of a question-answering text to be processed to obtain question-answering text features of the question-answering text to be processed;
matching a plurality of target question-answer prompt messages corresponding to the question-answering text to be processed by using the question-answering text features, wherein the granularities of the plurality of target question-answer prompt messages are different;
and inputting the question-answering text features and the plurality of target question-answer prompt messages into a text processing model, and processing by the text processing model to obtain a question-answering text processing result, wherein the text processing model is trained based on a plurality of training sample texts and training sample prompt messages corresponding to the training sample texts.
12. A text processing model training method, applied to a cloud-side device, the method comprising:
obtaining a second sample set, wherein the second sample set comprises a plurality of training sample texts, and the training sample texts carry training sample prompt information and sample processing results;
extracting features of the plurality of training sample texts to obtain training text features of the plurality of training sample texts;
matching a plurality of training sample prompt messages corresponding to each training sample text by using the training text features, wherein the training sample prompt messages comprise training sample global prompt messages, training sample form prompt messages, training sample task prompt messages and training sample meta prompt messages;
training an initial processing model according to the training text features of the plurality of training sample texts and the training sample prompt messages corresponding to each training sample text, to obtain model parameters of a trained text processing model;
and sending the model parameters of the trained text processing model to a terminal device.
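A sketch of the cloud-to-terminal hand-off at the end of claim 12: the trained model's parameters are serialized and shipped to the device. The transport is left abstract, and send_to_terminal is a hypothetical placeholder, not an API from the patent.

    import io
    import torch

    model = torch.nn.Linear(128, 2)                 # trained text processing model

    buffer = io.BytesIO()
    torch.save(model.state_dict(), buffer)          # model parameters as bytes
    payload = buffer.getvalue()

    def send_to_terminal(data: bytes) -> None:      # placeholder for the real transport
        print(f"sending {len(data)} bytes of model parameters to the terminal device")

    send_to_terminal(payload)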
13. A computing device, comprising:
a memory and a processor;
the memory is configured to store computer-executable instructions, and the processor is configured to execute the computer-executable instructions which, when executed by the processor, implement the steps of the method of any one of claims 1 to 10, claim 11 or claim 12.
14. A computer-readable storage medium storing computer-executable instructions which, when executed by a processor, implement the steps of the method of any one of claims 1 to 10, claim 11 or claim 12.
CN202211674486.8A 2022-12-26 2022-12-26 Text processing, question-answer text processing and text processing model training method Pending CN116050405A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211674486.8A CN116050405A (en) 2022-12-26 2022-12-26 Text processing, question-answer text processing and text processing model training method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211674486.8A CN116050405A (en) 2022-12-26 2022-12-26 Text processing, question-answer text processing and text processing model training method

Publications (1)

Publication Number Publication Date
CN116050405A (en) 2023-05-02

Family

ID=86117355

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211674486.8A Pending CN116050405A (en) 2022-12-26 2022-12-26 Text processing, question-answer text processing and text processing model training method

Country Status (1)

Country Link
CN (1) CN116050405A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116501858A (en) * 2023-06-21 2023-07-28 阿里巴巴(中国)有限公司 Text processing and data query method
CN116501858B (en) * 2023-06-21 2023-11-14 阿里巴巴(中国)有限公司 Text processing and data query method
CN117332072A (en) * 2023-12-01 2024-01-02 阿里云计算有限公司 Dialogue processing, voice abstract extraction and target dialogue model training method
CN117332072B (en) * 2023-12-01 2024-02-13 阿里云计算有限公司 Dialogue processing, voice abstract extraction and target dialogue model training method
CN117540012A (en) * 2024-01-04 2024-02-09 阿里云计算有限公司 Text generation method and system
CN117540012B (en) * 2024-01-04 2024-04-30 阿里云计算有限公司 Text generation method and system

Similar Documents

Publication Publication Date Title
CN110377911B (en) Method and device for identifying intention under dialog framework
CN107846350B (en) Method, computer readable medium and system for context-aware network chat
CN116050405A (en) Text processing, question-answer text processing and text processing model training method
CN111666416B (en) Method and device for generating semantic matching model
US11238132B2 (en) Method and system for using existing models in connection with new model development
CN117521675A (en) Information processing method, device, equipment and storage medium based on large language model
CN111738010A (en) Method and apparatus for generating semantic matching model
CN116303558A (en) Query statement generation method, data query method and generation model training method
CN110727871A (en) Multi-mode data acquisition and comprehensive analysis platform based on convolution decomposition depth model
CN117313837A (en) Large model prompt learning method and device based on federal learning
CN116363457B (en) Task processing, image classification and data processing method of task processing model
CN117291185A (en) Task processing method, entity identification method and task processing data processing method
CN117093864A (en) Text generation model training method and device
CN116561270A (en) Question-answering method and question-answering model training method
CN115757723A (en) Text processing method and device
CN115910062A (en) Audio recognition method, device, equipment and storage medium
CN117573842B (en) Document retrieval method and automatic question-answering method
CN116467500B (en) Data relation identification, automatic question-answer and query sentence generation method
CN117633540B (en) Sample data construction method and device
CN118212460A (en) Image classification method, automatic question-answering method, image class feature fusion model training method and information processing method based on deep learning model
CN117972222B (en) Enterprise information retrieval method and device based on artificial intelligence
US20240177243A1 (en) Intelligent platform for audit response using a metaverse-driven approach for regulator reporting requirements
CN117789099B (en) Video feature extraction method and device, storage medium and electronic equipment
CN116611435A (en) Entity processing model training method, entity identification method and device
CN117971420A (en) Task processing, traffic task processing and task processing model training method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination