CN117648986B - Task processing and code processing method, computing device, medium, and program product - Google Patents

Task processing and code processing method, computing device, medium, and program product

Info

Publication number
CN117648986B
CN117648986B (application CN202410114939.4A)
Authority: CN (China)
Prior art keywords: task, data, target, model, processed
Legal status: Active
Application number
CN202410114939.4A
Other languages
Chinese (zh)
Other versions
CN117648986A (en)
Inventor
张昕东
潘庆
Current Assignee
Zhejiang Alibaba Robot Co ltd
Original Assignee
Zhejiang Alibaba Robot Co ltd
Priority date
Filing date
Publication date
Application filed by Zhejiang Alibaba Robot Co ltd
Priority to CN202410114939.4A
Publication of CN117648986A
Application granted
Publication of CN117648986B


Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

Embodiments of this specification provide task processing and code processing methods, computing devices, media, and program products. The task processing method includes: acquiring data to be processed; inputting the data to be processed into a target recognition model to recognize its data type, where the target recognition model is obtained by training a classification model on sample data corresponding to a target task; and, when the data type matches the task type of the target task, inputting the data to be processed into a task processing model to execute the target task and obtain a task processing result, where the task processing model is trained on task sample groups corresponding to the target task. Because the target recognition model identifies the data type of the data to be processed and only data whose type matches the target task is passed to the task processing model, the consumption of the task processing model is reduced, misuse is avoided, and processing efficiency is improved.

Description

Task processing and code processing method, computing device, medium, and program product
Technical Field
Embodiments of the present disclosure relate to the field of computer technology, and in particular, to a task processing and code processing method, a computing device, a medium, and a program product.
Background
With the development of computer technology, dialog generation has gradually become a core technology in the field of AI-generated content (AIGC). Dialog generation techniques process a user's input request and generate a natural-language reply that fits the context. They can be applied to scenarios such as virtual assistants, customer service robots, and intelligent dialogue systems, providing users with a personalized interaction experience.
In practice, user input is often arbitrary, and a large number of invalid inputs reduce both the efficiency of model recognition and the efficiency of human-computer interaction.
Therefore, a task processing method is needed to avoid misuse of the model and improve the processing efficiency of the model.
Disclosure of Invention
In view of this, the embodiments of this specification provide a task processing method. One or more embodiments of this specification further relate to a code processing method, a task processing device, a code processing device, a computing device, a computer-readable storage medium, and a computer program product, which solve the technical drawbacks of the prior art.
According to a first aspect of embodiments of the present specification, there is provided a task processing method, including: acquiring data to be processed; inputting the data to be processed into a target recognition model, and recognizing the data type of the data to be processed, wherein the target recognition model is obtained by training a classification model based on sample data corresponding to a target task; under the condition that the data type accords with the task type of the target task, inputting the data to be processed into a task processing model to execute the target task and obtain a task processing result, wherein the task processing model is obtained by training based on a task sample group corresponding to the target task.
According to a second aspect of embodiments of the present specification, there is provided a code processing method, including: acquiring task initial information sent by a front-end user; inputting the task initial information into a target recognition model, and determining the information type of the task initial information, wherein the target recognition model is obtained by training a classification model based on a task initial sample; under the condition that the information type accords with the code task type, inputting task initial information into a code processing model to execute a code question-answering task, and obtaining a code processing result, wherein the code processing model is obtained by training based on a code processing sample pair; and feeding back the code processing result to the front-end user.
According to a third aspect of embodiments of the present specification, there is provided a task processing device including: the first acquisition module is configured to acquire data to be processed; the first recognition module is configured to input data to be processed into a target recognition model and recognize the data type of the data to be processed, wherein the target recognition model is obtained by training the classification model based on sample data corresponding to a target task; the first processing module is configured to input data to be processed into the task processing model to execute the target task under the condition that the data type accords with the task type of the target task, and obtain a task processing result, wherein the task processing model is obtained based on task sample sets corresponding to the target task through training.
According to a fourth aspect of embodiments of the present specification, there is provided a code processing apparatus comprising: the second acquisition module is configured to acquire task initial information sent by a front-end user; the second recognition module is configured to input the task initial information into the target recognition model, and determine the information type of the task initial information, wherein the target recognition model is obtained by training the classification model based on the task initial sample; the second processing module is configured to input the task initial information into the code processing model to execute the code question-answering task to obtain a code processing result under the condition that the information type accords with the code task type, wherein the code processing model is obtained based on the code processing sample pair training; and the feedback module is configured to feed back the code processing result to the front-end user.
According to a fifth aspect of embodiments of the present specification, there is provided a computing device comprising: a memory and a processor; the memory is for storing computer programs/instructions which, when executed by the processor, implement the steps of the method provided in the first or second aspect described above.
According to a sixth aspect of embodiments of the present specification, there is provided a computer-readable storage medium storing a computer program/instruction which, when executed by a processor, implements the steps of the method provided in the first or second aspect described above.
According to a seventh aspect of embodiments of the present description, there is provided a computer program product comprising computer programs/instructions which, when executed by a processor, implement the steps of the method provided in the first or second aspect described above.
According to the task processing method provided by the embodiments of this specification, data to be processed is acquired; the data to be processed is input into a target recognition model to recognize its data type, where the target recognition model is obtained by training a classification model on sample data corresponding to a target task; and, when the data type matches the task type of the target task, the data to be processed is input into a task processing model to execute the target task and obtain a task processing result, where the task processing model is trained on task sample groups corresponding to the target task. Because the target recognition model identifies the data type of the data to be processed, only data whose type matches the task type of the target task is passed to the task processing model and data whose type does not match is filtered out, so the consumption of the task processing model is reduced, misuse of the model is avoided, and its processing efficiency is improved.
Drawings
FIG. 1 (a) is an interactive schematic diagram of a task processing system provided in one embodiment of the present description;
FIG. 1 (b) is an architecture diagram of a task processing system provided by one embodiment of the present description;
FIG. 1 (c) is an architecture diagram of another task processing system provided by one embodiment of the present description;
FIG. 2 is a flow chart of a method of task processing provided in one embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a task processing method according to one embodiment of the present disclosure;
FIG. 4 is a schematic flow chart of training a target recognition model in a task processing method according to an embodiment of the present disclosure;
FIG. 5 is a schematic diagram of intercepting alert words in a task processing method according to an embodiment of the present disclosure;
FIG. 6 is a flow chart of a code processing method provided by one embodiment of the present disclosure;
FIG. 7 is a process flow diagram of a code processing method provided by one embodiment of the present disclosure;
FIG. 8 is a schematic diagram of a task processing device according to an embodiment of the present disclosure;
FIG. 9 is a schematic diagram of a code processing apparatus according to an embodiment of the present disclosure;
FIG. 10 is a block diagram of a computing device provided in one embodiment of the present description.
Detailed Description
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present description. However, this specification can be implemented in many forms other than those described herein, and those skilled in the art can make similar generalizations without departing from its spirit; therefore, this specification is not limited to the specific implementations disclosed below.
The terminology used in the one or more embodiments of the specification is for the purpose of describing particular embodiments only and is not intended to be limiting of the one or more embodiments of the specification. As used in this specification, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used in one or more embodiments of the present specification refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that, although the terms first, second, etc. may be used in one or more embodiments of this specification to describe various information, this information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, without departing from the scope of one or more embodiments of the present description, a first may also be referred to as a second, and similarly, a second may also be referred to as a first. The word "if" as used herein may be interpreted as "when", "upon", or "in response to determining", depending on the context.
Furthermore, it should be noted that, user information (including, but not limited to, user equipment information, user personal information, etc.) and data (including, but not limited to, data for analysis, stored data, presented data, etc.) according to one or more embodiments of the present disclosure are information and data authorized by a user or sufficiently authorized by each party, and the collection, use, and processing of relevant data is required to comply with relevant laws and regulations and standards of relevant countries and regions, and is provided with corresponding operation entries for the user to select authorization or denial.
In one or more embodiments of the present description, a large model refers to a deep learning model with a large number of parameters, typically ranging from hundreds of millions up to trillions of parameters. A large model, also called a Foundation Model, is pre-trained on large-scale unlabeled corpora to produce a pre-trained model with more than one hundred million parameters; such a model can adapt to a wide range of downstream tasks and has good generalization capability. Examples include Large Language Models (LLM) and multi-modal pre-trained models.
In practical applications, the pre-trained large model can be adapted to different tasks by fine-tuning with a small number of samples. Large models can be widely applied in fields such as natural language processing (NLP) and computer vision, in particular to computer-vision tasks such as visual question answering (VQA), image captioning (IC), and image generation, and to NLP tasks such as text-based sentiment classification, text summarization, and machine translation. The main application scenarios of large models include digital assistants, intelligent robots, search, online education, office software, e-commerce, and intelligent design.
First, terms related to one or more embodiments of the present specification will be explained.
AIGC: AI-generated content. In some embodiments of this specification it refers to code generation based on large artificial-intelligence models, such as code-completion generation and code question-answer generation.
Big data platform: a platform for performing offline computation on multiple machines in parallel with high efficiency.
Code question-answering: question answering in the code domain. Besides general free-form questions and answers, it covers specific purposes such as code annotation generation, test case generation, and code interpretation.
Code question-answer large model: a domain-specific large model obtained by further optimizing and training a general-purpose large model. It mainly solves coding problems in research and development; it has a huge number of parameters, a low inference output speed, and a high cost.
Intention recognition: judging the user's question-answer intent with a classification model. In some embodiments of this specification a binary or multi-class classification model is used, and whether the question input by the user is one the large model should handle is finally decided by adjusting a threshold.
Small model: a model with fewer parameters. Compared with a large model, it handles relatively simple task types and has weaker understanding and transfer ability, but it is fast and low-cost.
Bert model: a model with a relatively large number of parameters that can be used for text classification tasks.
Fasttext model: a deep learning model with a small number of parameters that can be used for text classification tasks.
MultinomialNB model: the Multinomial Naive Bayes model, a relatively simple machine learning model that can be used for text classification.
Jieba word segmentation: a Chinese word segmentation tool used for tasks such as natural language analysis and text processing.
Green network interception: interception whose goal is to limit the spread of and access to objectionable, illegal, or harmful content in order to keep the network environment healthy and orderly. In the embodiments of this specification it generally refers to a sensitive-information interception service within the enterprise.
With the development of computer technology, AIGC is becoming a core technology in the field of artificial-intelligence content generation. AIGC technology processes a user's task request through a large task processing model and generates a task processing result for the request. It can be applied to scenarios such as virtual assistants, customer service robots, and intelligent dialogue systems, providing users with a personalized interaction experience.
In practical applications, if the large task processing model has no interception mechanism, every task request is handled by the large model, which wastes its performance budget. For example, when a user asks a code-unrelated question, such as a daily-life question, in a code large-model product, the answer quality is poor because the model has not been specially tuned for such questions. Some user inputs may even be sensitive and carry public-opinion risk. Moreover, the task processing cost of the large model is high, and irrelevant task requests increase that cost.
If the large task processing model is instead given an interception mechanism based on regular expressions or blacklist-word filtering, a large number of requests are filtered by mistake; keywords are hard to enumerate exhaustively, so recall is low; too many rules degrade performance; and some scenarios require semantic-level understanding that regular expressions alone cannot capture.
Therefore, a task processing method is needed to avoid misuse of the large task processing model and improve the recognition efficiency of the large task processing model.
In order to solve the above problems, this specification provides a task processing method, and further relates to a code processing method, a task processing system, a task processing device, a code processing device, a computing device, and a computer-readable storage medium, which are described in detail one by one in the following embodiments.
Referring to fig. 1 (a), fig. 1 (a) is an interaction schematic diagram of a task processing system according to an embodiment of the present disclosure, specifically, the task processing system includes a target recognition model and a task processing model, and the task processing system acquires data to be processed and inputs the data to be processed into the target recognition model; the target recognition model is used for recognizing the data type of the data to be processed, wherein the target recognition model is obtained by training the classification model based on sample data corresponding to a target task; the task processing system is also used for inputting the data to be processed into the task processing model under the condition that the data type accords with the task type of the target task; the task processing model is used for executing the target task and obtaining a task processing result, wherein the task processing model is obtained based on training of a task sample group corresponding to the target task.
The task processing system acquires data to be processed; inputs the data to be processed into the target recognition model to recognize its data type, where the target recognition model is obtained by training a classification model on sample data corresponding to a target task; and, when the data type matches the task type of the target task, inputs the data to be processed into the task processing model to execute the target task and obtain a task processing result, where the task processing model is trained on task sample groups corresponding to the target task. Because the target recognition model identifies the data type of the data to be processed, only data whose type matches the task type of the target task is passed to the task processing model and data whose type does not match is filtered out, so the consumption of the task processing model is reduced, misuse of the model is avoided, and its processing efficiency is improved.
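To make the routing concrete, the following is a minimal sketch of the gating flow described above, written in Python. The names handle_request, recognizer, task_model, and CODE_TASK are hypothetical placeholders that do not come from this specification, and the predict/generate interfaces are assumptions.
    # A minimal sketch of the gating flow; model objects and the task-type
    # constant are hypothetical placeholders, not part of this specification.
    CODE_TASK = "code_qa"

    def handle_request(text: str, recognizer, task_model) -> str:
        data_type = recognizer.predict(text)   # target recognition model
        if data_type == CODE_TASK:             # data type matches the target task
            return task_model.generate(text)   # only now call the large task model
        # data whose type does not match is filtered and never reaches the model
        return "Sorry, this assistant only answers code-related questions."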
Referring to fig. 1 (b), fig. 1 (b) shows an architecture diagram of a task processing system provided in one embodiment of the present specification, where the task processing system may include a client 100 and a server 200;
The client 100 is configured to send data to be processed to the server 200; the server 200 is configured to obtain data to be processed; inputting the data to be processed into a target recognition model, and recognizing the data type of the data to be processed, wherein the target recognition model is obtained by training a classification model based on sample data corresponding to a target task; under the condition that the data type accords with the task type of the target task, inputting the data to be processed into a task processing model to execute the target task and obtain a task processing result, wherein the task processing model is obtained by training based on a task sample group corresponding to the target task;
The client 100 is further configured to receive a task processing result sent by the server 200.
By applying the scheme of the embodiment of the specification, the data type of the data to be processed is identified through the target identification model, the data to be processed, the data type of which accords with the task type of the target task, is input into the task processing model, the data to be processed, the data type of which does not accord with the task type of the target task, is filtered, the consumption of the task processing model is reduced, the misuse of the task processing model is avoided, and the processing efficiency of the task processing model is accelerated.
Referring to fig. 1 (c), fig. 1 (c) shows an architecture diagram of another task processing system provided in an embodiment of the present disclosure, where the task processing system may include a plurality of clients 100 and a server 200, where the clients 100 may include an end-side device, and the server 200 may include a cloud-side device. Communication connection can be established between the plurality of clients 100 through the server 200, in a task processing scenario, the server 200 is used to provide task processing services between the plurality of clients 100, and the plurality of clients 100 can respectively serve as a transmitting end or a receiving end, so that communication is realized through the server 200.
The user may interact with the server 200 through the client 100 to receive data transmitted from other clients 100, or transmit data to other clients 100, etc. In the task processing scenario, it may be that the user issues data to be processed to the server 200 through the client 100, and the server 200 generates a task processing result according to the data to be processed and pushes the task processing result to other clients that establish communications.
Wherein, the client 100 and the server 200 establish a connection through a network. The network provides a medium for a communication link between client 100 and server 200. The network may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others. The data transmitted by the client 100 may need to be encoded, transcoded, compressed, etc. before being distributed to the server 200.
The client 100 may be a browser, an Application (APP), a web application such as an HTML5 (HyperText Markup Language 5, also called H5) application, a light application (also referred to as an applet, a lightweight application), or a cloud application, and the client 100 may be developed based on a software development kit (SDK) of the corresponding service provided by the server 200, such as an SDK based on real-time communication (RTC). The client 100 may be deployed in an electronic device and may need to run depending on the device or on some APP in the device. The electronic device may, for example, have a display screen and support information browsing, and may be a personal mobile terminal such as a mobile phone, a tablet computer, or a personal computer. Various other types of applications are also commonly deployed in electronic devices, such as human-machine conversation applications, model training applications, text processing applications, web browser applications, shopping applications, search applications, instant messaging tools, mailbox clients, and social platform software.
The server 200 may include a server that provides various services, such as a server that provides communication services for multiple clients, a server for background training that provides support for a model used on a client, a server that processes data sent by a client, and so on. It should be noted that, the server 200 may be implemented as a distributed server cluster formed by a plurality of servers, or may be implemented as a single server. The server may also be a server of a distributed system or a server that incorporates a blockchain. The server may also be a cloud server for cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), and basic cloud computing services such as big data and artificial intelligence platforms, or an intelligent cloud computing server or intelligent cloud host with artificial intelligence technology.
It should be noted that, the task processing method provided in the embodiments of the present disclosure is generally executed by the server, but in other embodiments of the present disclosure, the client may also have a similar function to the server, so as to execute the task processing method provided in the embodiments of the present disclosure. In other embodiments, the task processing method provided in the embodiments of the present disclosure may be performed by the client and the server together.
Referring to fig. 2, fig. 2 is a flowchart of a task processing method according to an embodiment of the present disclosure, which specifically includes the following steps.
Step 202: and obtaining data to be processed.
The data to be processed may be data input by a user and may be of different types depending on the specific application scenario and task, including but not limited to: text data, including questions, comments, articles, reports, etc. entered by the user; image data, including photographs uploaded by the user, image files, images captured by a vision sensor, etc.; audio data, including user voice input, music, voice recordings, etc.; video data, including video files uploaded by the user, real-time video streams, recorded event videos, etc.; table data, including spreadsheets, database records, CSV (Comma-Separated Values) files, etc.; and sensor data, including temperature, humidity, position, acceleration, etc.
In practical applications, the user first inputs a question or requirement through an application interface or in some other way; the input may be text, voice, or another modality. The task processing system then receives the data to be processed: text input can be received directly, while voice input can be converted into text. Finally, the received data is cleaned and pre-processed to remove non-critical information such as special characters and noise, which may be done with text-processing or speech-processing techniques, and the cleaned data is formatted to meet the requirements of subsequent processing, for example converted into a format recognizable by the target recognition model.
In one or more embodiments of this specification, taking the user input "please write an example of bubble sort" as an example: first, the user enters the text "please write an example of bubble sort" through the application interface; then the task processing system receives the data to be processed, cleans and pre-processes it, removes punctuation marks and other special symbols, and converts it into the input format of the subsequent target recognition model.
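As an illustration of the cleaning and pre-processing step, the following is a rough sketch; the exact cleaning rules are not specified here, so the regular expressions below (strip punctuation, collapse whitespace) are only an assumption.
    import re

    def preprocess(raw_text: str) -> str:
        """Assumed cleaning step: strip punctuation/special symbols and extra spaces."""
        text = raw_text.strip()
        text = re.sub(r"[^\w\s]", "", text)   # drop punctuation and other special symbols
        return re.sub(r"\s+", " ", text)      # collapse repeated whitespace

    print(preprocess("Please write an example of bubble sort!!"))
    # -> "Please write an example of bubble sort"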
Step 204: inputting the data to be processed into a target recognition model, and recognizing the data type of the data to be processed, wherein the target recognition model is obtained by training a classification model based on sample data corresponding to a target task.
The data type of the data to be processed refers to the task type corresponding to the question raised by the user, such as image generation, free question answering, or code generation. The target recognition model is a neural network model for recognizing the task type of the target task; it is obtained by training on sample data of different task types. It can be understood that the sample data includes the task type of the target task, so the trained target recognition model can recognize data to be processed whose data type is the task type of the target task. The target recognition model may specifically be an intention recognition model or an intention classification model. The target task is a task that the subsequent task processing model can execute; for example, when the task processing model is a code question-answer model, the target task is a code question-answer task such as code generation or code annotation.
In practical applications, the target recognition model is trained, and various user intentions are recognized by using sample data corresponding to the labeled target tasks through supervised learning or other related technologies. The annotated sample data typically contains different user inputs and corresponding intent labels, and the target recognition model is able to recognize and infer intent expressed by the user inputs through learning of the data sets.
In one or more embodiments of this specification, taking the code question-answer scenario as an example, in the training phase labeled sample data is input into the classification model to be trained so that it learns to identify the intent of different data to be processed; for example, "annotate the following code" is labeled as a target task, while "how is the local weather today" is labeled as a non-target task. Based on the sample data, the task processing system can train the classification model with a supervised learning method: the classification model learns to extract features from user input and to classify them according to the labeled intent labels, yielding the target recognition model. After training is completed, the target recognition model can be used to recognize and infer the intent of user input in real time.
It can be understood that the target recognition model can be obtained through training sample data of other scenes besides the code question-answer scene, such as a picture generation scene, a free question-answer scene and the like, and the application scene of the target recognition model is not particularly limited in the specification.
Further, before inputting the data to be processed into the target recognition model, the method further comprises the following steps: acquiring a sample set and a classification model, wherein the sample set comprises a plurality of sample data corresponding to a target task, and the sample data carries a data type label; inputting the sample data into a classification model to obtain the predicted data type of the sample data; training the classification model based on the loss between the data type label and the predicted data type to obtain a target recognition model.
In practical applications, first, the task processing system needs to acquire a sample set containing a plurality of sample data corresponding to the target task; the sample set may contain sample data of various data types, such as text, images, and audio. At the same time, a classification model needs to be prepared, such as a Bert, fasttext, or MultinomialNB text classification model, for judging the data type of the data to be processed. Second, each sample datum may carry a data type label indicating its actual data type. Then, the sample data is input into the classification model, which processes it to obtain a predicted data type; the classification model is trained based on the loss between the data type label and the predicted data type, where the loss measures the difference between the predicted and the real data type, and the model parameters are updated through a back-propagation algorithm to reduce the loss. Finally, after a series of training steps, the classification model acquires a good ability to predict data types; the trained classification model is used as the target recognition model for recognizing the data type of the data to be processed, so that once the data to be processed is input into the target recognition model, its data type can be recognized.
In one or more embodiments of this specification, taking the target task as a code question-answering task and the sample data as question texts: first, the task processing system needs to obtain a sample set containing a plurality of sample data and prepare a classification model. Second, each sample text needs to carry a data type label representing its real data type, and the labeled sample data is input into the classification model for training so that it learns to identify the intent of different sample data; for example, "complete the following code" is labeled as a target task, while "how is the local weather today" is labeled as a non-target task. Then, a sample text is input into the classification model, the predicted data type of the sample is obtained through the forward pass of the model, the loss between the predicted data type and the real data type label is calculated to measure their difference, and the parameters of the classification model are updated with the back-propagation algorithm to reduce the loss and improve the model's ability to predict the data type of sample data. Finally, after a series of training steps, the classification model can accurately predict the data type of sample data, and the trained classification model is used as the target recognition model.
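The loss-and-back-propagation loop described above can be sketched as follows. This is only a toy illustration, assuming a bag-of-words linear classifier in PyTorch with two data-type labels; the vocabulary size, optimizer, learning rate, and random batch are illustrative assumptions, not values from this specification.
    import torch
    import torch.nn as nn

    VOCAB_SIZE, NUM_TYPES = 5000, 2              # e.g. target task vs non-target task
    model = nn.Linear(VOCAB_SIZE, NUM_TYPES)     # toy stand-in for the classification model
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    def train_step(bow_features: torch.Tensor, type_labels: torch.Tensor) -> float:
        logits = model(bow_features)             # predicted data type (scores)
        loss = criterion(logits, type_labels)    # loss between label and prediction
        optimizer.zero_grad()
        loss.backward()                          # back-propagation
        optimizer.step()                         # update parameters to reduce the loss
        return loss.item()

    # one toy batch: four bag-of-words vectors with their data-type labels
    features = torch.rand(4, VOCAB_SIZE)
    labels = torch.tensor([1, 0, 1, 0])
    print(train_step(features, labels))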
Referring to fig. 3, fig. 3 is a schematic level diagram of a task processing method provided in an embodiment of the present disclosure. The task processing method includes a data layer, a training layer, and an inference layer. The data layer includes an open-source data set for providing various publicly available data sets as sample data, and also includes a non-open-source data set for providing sample data in a specific field according to the task requirements of an enterprise. The training layer includes a data integration unit, a format conversion unit, a classification model training unit, and a classification model uploading unit: the data integration unit integrates sample data from different data sources into the same format; the format conversion unit converts the uniformly formatted sample data into the format required for training the classification model; the classification model training unit trains the classification model with the integrated and format-converted data so that the classification model can learn to identify data types from the sample data; and the classification model uploading unit uploads the trained target recognition model to a location that the inference layer can call, such as a cloud server or a model repository. The inference layer includes a model calling unit, an interception and filtering unit, and a model inference unit: the model calling unit calls the uploaded target recognition model from the model repository or server; the inference layer further includes a data type recognition unit that performs data type recognition on the data to be processed through the target recognition model before inference; and the model inference unit calls the task processing model to perform inference on the data to be processed when the data type recognized by the target recognition model matches the task type of the target task, so as to meet the user's needs.
Based on the above, a target recognition model capable of accurately recognizing the data type of the data to be processed is finally obtained, and after the data type of the data to be processed is recognized by the target recognition model, the data can be input into a proper task processing model according to specific conditions for further task processing.
Further, obtaining a sample set includes: acquiring positive sample data and negative sample data from a target data set, wherein the target data set is a data set of a target task; a sample set is constructed based on the positive sample data and the negative sample data.
The positive sample data refers to samples meeting the target task requirements, and the negative sample data refers to samples not meeting the target task requirements. For example, where the target task is an image generation task, "generate a flower" is positive sample data, and "how today's local weather" is negative sample data.
In practical applications, first, the task processing system needs to acquire positive sample data and negative sample data from the data set of the target task, i.e. the target data set. Second, a sample set is constructed based on the positive and negative sample data; the sample set should contain enough positive and negative samples so that the classification model can fully learn to distinguish them. The positive and negative samples are randomized and split; when there is more positive sample data, noise may be added to part of it to turn it into negative samples, and when there is more negative sample data, part of the negative samples may be discarded, so that the numbers of positive and negative samples are roughly equal, ensuring that the sample set is representative and balanced and avoiding bias toward either side during learning. When the sample set contains both Chinese and non-Chinese sample data, in order to further ensure balance, the task processing system may translate some of the sample data so that the amounts of Chinese and non-Chinese data are roughly equal, avoiding bias toward recognizing Chinese or non-Chinese data during training of the classification model.
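One of the balancing strategies mentioned above, discarding part of the larger class so that positive and negative samples are roughly equal in number, can be sketched as follows; the data layout and random seed are assumptions for illustration only.
    import random

    def balance(pos: list[str], neg: list[str], seed: int = 0) -> tuple[list[str], list[str]]:
        """Down-sample the larger side so both classes have the same size."""
        random.seed(seed)
        n = min(len(pos), len(neg))
        return random.sample(pos, n), random.sample(neg, n)

    positives = ["generate a flower"] * 300                # samples matching the target task
    negatives = ["how is the local weather today"] * 120   # samples not matching it
    pos_bal, neg_bal = balance(positives, negatives)
    print(len(pos_bal), len(neg_bal))                      # 120 120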
Based on the method, by acquiring the positive sample data and the negative sample data from the target data set, the sample set can be ensured to better represent the actual data situation, so that the difference and the characteristics between the positive sample and the negative sample can be more accurately learned in the training process of the classification model. The classification model is sufficiently trained to maintain accuracy in the face of new unseen samples. By constructing a sample set containing diversified positive and negative samples, the classification model has generalization capability and stronger classification capability on unknown data. The representativeness and the balance of the sample set can improve the accuracy and the performance of the classification model, so that the classification model can more accurately identify positive sample data and negative sample data, and the execution effect of a target task is improved.
Further, inputting the sample data into the classification model to obtain a predicted data type of the sample data, including: based on first format information corresponding to the target code language, performing format conversion on the sample data; performing word segmentation on the sample data subjected to format conversion to obtain word segmentation results; based on the second format information corresponding to the classification model, performing format conversion on the segmentation result to obtain updated sample data; and inputting the updated sample data into the classification model to obtain the predicted data type of the sample data.
The first format information is used for converting sample data from different sources into the same format, the second format information is used for converting the word segmentation result into a format corresponding to the classification model, and the classification model is taken as fasttext model as an example, and the second format information is used for converting the word segmentation result into a format corresponding to the fasttext model, so that the word segmentation result can be directly input into the fasttext model for type identification.
In practical applications, first, the initial sample data is format-converted according to the specification and requirements of the target code language so that it is unified into that specific format; for example, when the target code language is jsonl, the original sample data is uniformly converted into jsonl format. Second, word segmentation is performed on the format-converted sample data: the text is segmented according to certain rules to obtain a word segmentation result, converting the text into words or phrases that can be fed into the classification model. Finally, the word segmentation result is further format-converted according to the input format required by the classification model so that it meets the model's input requirements; taking a fasttext model as the classification model, the second format information is used to convert the word segmentation result into the format corresponding to the fasttext model, so that it can be input directly into the fasttext model for type recognition.
In one or more embodiments of this specification, referring to fig. 4 and taking jsonl format as the first format information and fasttext format as the second format information, fig. 4 is a schematic flow chart of training a target recognition model in a task processing method according to an embodiment of the present disclosure. It specifically includes step 402, unifying the data formats: after the sample data is received, its format is uniformly converted into jsonl format. Step 404, word segmentation: jieba word segmentation is performed on the sample data in jsonl format, and the text in the sample data is segmented according to certain rules to obtain a word segmentation result. Step 406, format conversion: because the classification model is a fasttext model, the word segmentation result is converted into fasttext format to obtain updated sample data, so that it can be input directly into the fasttext model for type recognition. After step 406, the updated sample data obtained from the conversion is input into the fasttext model for model training in step 408, model saving in step 410, and model testing in step 412, yielding a trained classification model, which is determined to be the target recognition model.
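Steps 402 to 412 can be sketched end to end as follows, assuming the jsonl samples have "text" and "label" fields; the file names, label values, and training hyperparameters are assumptions for illustration, not values from this specification.
    import json

    import jieba      # jieba word segmentation (step 404)
    import fasttext   # the fasttext classification model (steps 408-412)

    def jsonl_to_fasttext(jsonl_path: str, out_path: str) -> None:
        """Convert jsonl samples into fastText's '__label__X token token ...' lines."""
        with open(jsonl_path, encoding="utf-8") as fin, \
             open(out_path, "w", encoding="utf-8") as fout:
            for line in fin:
                sample = json.loads(line)
                tokens = " ".join(jieba.cut(sample["text"]))          # step 404: word segmentation
                fout.write(f"__label__{sample['label']} {tokens}\n")  # step 406: fasttext format

    jsonl_to_fasttext("samples.jsonl", "train.txt")                         # steps 402/406
    model = fasttext.train_supervised(input="train.txt", epoch=10, lr=0.5)  # step 408
    model.save_model("intent_model.bin")                                    # step 410
    print(model.predict(" ".join(jieba.cut("为下面的代码生成测试用例"))))      # step 412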
It is understood that the classification model may be fasttext models, and may also be Bert, multinomialNB class classification models, and the specific type of classification model is not limited in the embodiments of the present disclosure.
Based on the method, sample data can be ensured to meet the input format requirement of the classification model, and can be correctly input into the classification model for training.
Further, acquiring data to be processed includes: receiving a task processing request, wherein the task processing request comprises data to be processed and a task identifier of a target task; inputting the data to be processed into a target recognition model, recognizing the data type of the data to be processed, and comprising: identifying whether the target task is a question-answer task executed for the first time according to the task identification; if yes, inputting the data to be processed into a target recognition model for type recognition, and recognizing the data type of the data to be processed.
In practical applications, the user may ask a follow-up question or raise other additional requirements, and such follow-ups often do not literally contain the task type of the target task; for example, "continue output" is not related to the task type of the target task on its face. To prevent a user's follow-up question from being filtered out by mistake, in the embodiments of this specification only the user's first round of question answering needs to be input into the target recognition model for type recognition. Specifically, the task processing system receives a task processing request, where the request includes the data to be processed and the task identifier of the target task; the task processing system determines, according to the task identifier, whether the same target task has been executed before this request; if it has not, the target task is a question-answer task executed for the first time, and the data to be processed is input into the target recognition model for type recognition so as to recognize its data type.
In one or more embodiments of this specification, taking the code question-answer scenario as an example: first, the user inputs "generate a test case for the following code A" to the task processing system, and the task processing system receives a task processing request including the data to be processed (the text input by the user) and the task identifier of the target task (assume the task identifier is "test case generation: code A"). Second, the task processing system determines from the task identifier whether this is the first round of question answering. If it is, the data to be processed is input into the target recognition model for type recognition and the data type of the user's text is recognized. If it is not, the system has already executed the same task before, the question is judged to be a follow-up question from the user, the target task is not being executed for the first time, and there is no need to determine the data type of the user's text.
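The first-round check in this example can be sketched as follows; the session store, task-identifier strings, and model interfaces are hypothetical placeholders rather than anything defined in this specification.
    seen_tasks = set()   # task identifiers already answered in this session (assumed store)

    def route(task_id: str, text: str, recognizer, task_model) -> str:
        first_round = task_id not in seen_tasks
        seen_tasks.add(task_id)
        if first_round:
            # only the first round of question answering goes through type recognition
            if recognizer.predict(text) != "code_qa":
                return "Please ask a code-related question."
        # follow-up questions such as "continue output" skip the filter
        return task_model.generate(text)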
Further, after identifying whether the target task is the question-answer task executed for the first time according to the task identification, the method further comprises: if not, inputting the data to be processed into the task processing model to execute the target task, and obtaining a task processing result.
In practical applications, the task processing system receives a task processing request including the data to be processed and the task identifier of the target task. The task processing system determines, according to the task identifier, whether the same target task has been executed before this request. If it has, the target task is not a question-answer task executed for the first time, the question is a follow-up question from the user, and the data to be processed is input directly into the task processing model to execute the target task and obtain a task processing result.
In one or more embodiments of this specification, taking the code question-answer scenario as an example: first, the user inputs "continue annotating code A" to the task processing system, and the task processing system receives a task processing request including the data to be processed (the text input by the user) and the task identifier of the target task (assume the task identifier is "code annotation: code A"). Second, the task processing system determines from the task identifier whether this is the first round of question answering. If it is not, the system has already executed the same task before, so the question is judged to be a follow-up question from the user; in this case the user's text is input directly into the task processing model to execute the code annotation task without determining its data type, yielding a code annotation result.
In one or more embodiments of this specification, the data to be input for a preset task of some task processing models has a special input mode. Taking the code question-answer scenario as an example, in one special input mode the user selects code A, right-clicks, and chooses "generate unit test case"; the system then automatically clears the session history memory and splices "generate unit test case" with "code A" to start the question answering. A question entered through this special input mode is necessarily a first round of question answering; if the user then types "continue annotating code A" in the input box, it can be judged to be a follow-up question and input directly into the task processing model.
If the input mode of the user is the input mode of a common input box, the task processing system can directly forward the content of the input box to the target recognition model.
Based on the method, whether the target task is the first-round question-answering task or not is judged, the target recognition model can be prevented from mistakenly filtering the questions of the user, and the use experience of the user is improved.
Further, after inputting the data to be processed into the object recognition model, it further includes: obtaining the recognition result of whether the data to be processed contains the appointed warning word or not; if the identification result is that the data to be processed contains the appointed warning word, generating a warning word prompt message; and sending the warning word prompting message to the front-end user so as to prompt the front-end user to delete the designated warning word.
In practical applications, some user questions may contain warning words. To prevent data to be processed that contains warning words from being input into the task processing model, first, after the data to be processed is input into the target recognition model, the target recognition model performs keyword detection on it and judges whether it contains a specified warning word. Second, if the detection result shows that the data to be processed contains a specified warning word, the task processing system generates a warning-word prompt message; the message may contain operation suggestions for the specified warning word, for example prompting the user to delete or modify the content containing the warning word, so that the user can act on the prompt and remove the specified warning word. Finally, the task processing system sends the generated warning-word prompt message to the front-end user interface and displays it to the user through a notification, a pop-up window, or another prompt of the user interface.
In one or more embodiments of this specification, taking the task processing model as a question-answer large model and the target recognition model as a small intention recognition model, referring to fig. 5, fig. 5 is a schematic diagram of intercepting warning words in a task processing method provided in one embodiment of the present disclosure. Specifically, in step 502, a question-answer request input by the user is received; in step 504, after receiving the question-answer request, the task processing system inputs it into the small intention recognition model; in step 506, if the small intention recognition model passes the request, green network interception is performed on it; and in step 508, if the green network interception is passed, the task processing system inputs the question-answer request into the question-answer large model, which performs inference on the request. If the green network interception is not passed, the task processing system generates a warning-word prompt message, for example "Sorry, I am an intelligent coding assistant and cannot answer this question. I can help developers write code and provide technical advice. Is there anything I can help you with?", and returns the prompt message to the client.
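The interception chain of fig. 5 (small intention recognition model, then green network interception, then the question-answer large model) can be sketched as follows; the warning-word list and the model interfaces are simplifying assumptions, and the real green network interception is an in-enterprise service rather than a simple word list.
    ALERT_WORDS = {"password", "secret_key"}   # assumed examples of specified warning words

    REFUSAL = ("Sorry, I am an intelligent coding assistant and cannot answer this "
               "question. I can help developers write code and provide technical "
               "advice. Is there anything I can help you with?")

    def answer(question: str, intent_model, qa_model) -> str:
        if intent_model.predict(question) != "code_qa":    # steps 504/506: intent gate
            return REFUSAL
        if any(word in question for word in ALERT_WORDS):  # stand-in for green network interception
            return REFUSAL                                 # prompt the user to remove the warning word
        return qa_model.generate(question)                 # step 508: question-answer large model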
Based on this, the task processing system can recognize whether the data to be processed contains a specified warning word through the target recognition model, and generate a warning word prompt message to send to the front-end user to remind the user to delete or modify the related content, thereby realizing recognition and handling of specified warning words.
Further, after inputting the data to be processed into the target recognition model and recognizing the data type of the data to be processed, the method further comprises: generating a question-answer guide message when the data type does not conform to the task type of the target task, wherein the question-answer guide message comprises the task type of the target task; and sending the question-answer guide message to the front-end user to guide the front-end user to input updated data to be processed based on the task type.
In practical application, when the data type does not conform to the task type of the target task, firstly, according to the target task type, the system generates a question-answer guide message to prompt the user to input updated data to be processed that conforms to the target task type; the question-answer guide message generally comprises questions, guidance, or prompt information related to the task type so as to guide the user to provide data matching the target task. Secondly, the task processing system sends the generated question-answer guide message to the front-end user interface, typically presenting it to the user through a notification, a pop-up window, or another means of the user interface, so as to guide the user to input updated data to be processed based on the task type.
In one or more embodiments of the present disclosure, taking the task type of the target task as code question answering as an example, the system generates, according to the target task type, a question-answer guide message indicating that it is an intelligent coding assistant, cannot answer the question, can help developers write code and provide technical advice, and asks what it can help with; if the user-side system is an English system or the user selects English in the language setting function of the task processing system, the English message "Sorry, I am an intelligent coding assistant and cannot answer this question. I can help developers write code and provide technical advice. Is there anything I can help you with?" is generated accordingly, prompting the user to input updated code-related data to be processed. Secondly, the task processing system sends the generated question-answer guide message to the front-end user interface, typically presenting it to the user through a notification, a pop-up window, or another means of the user interface, so as to guide the user to input updated data to be processed based on the task type.
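A minimal sketch of selecting the question-answer guide message by the client's language setting is given below; the dictionary keys, default behaviour, and function name are assumptions for illustration.

```python
# Sketch of generating the question-answer guide message according to the
# user's language setting; keys and defaults are illustrative assumptions.
GUIDE_MESSAGES = {
    "en": ("Sorry, I am an intelligent coding assistant and cannot answer this "
           "question. I can help developers write code and provide technical "
           "advice. Is there anything I can help you with?"),
    # The Chinese-language variant from the embodiment would be registered here.
}


def build_guide_message(user_language: str) -> str:
    """Pick the guide message matching the client language, defaulting to English."""
    return GUIDE_MESSAGES.get(user_language, GUIDE_MESSAGES["en"])


print(build_guide_message("en"))  # sent to the front-end user interface
```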
Based on this, by generating the question-answer guide message and guiding the user to input updated data to be processed, the task processing system can provide more personalized and accurate user interaction, making it easier for the user to understand the functions of the task processing system and to provide data of the task type conforming to the target task more quickly, thereby improving the user experience.
Step 206: under the condition that the data type accords with the task type of the target task, inputting the data to be processed into a task processing model to execute the target task and obtain a task processing result, wherein the task processing model is obtained by training based on a task sample group corresponding to the target task.
The task sample group refers to a data set consisting of labeled training data, where the training data is used to train the task processing model so that the task processing model has the capability of processing the target task. For example, in a code question-answering task, one task sample group may contain various code question training data and the code question-answering result labels corresponding to the code question training data.
It will be appreciated that the sample data in step 204 is used to train the target recognition model so that the target recognition model has the ability to recognize the data type of the data to be processed; the sample data may cover various data types and include the data type corresponding to the target task type. For example, if the task type of the target task is a code question-answer type, the sample data may include image generation question sample data with a corresponding image generation type tag, scenario dialog question sample data with a corresponding scenario dialog type tag, and code question-answer question sample data with a corresponding code question-answer type tag.
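For illustration, such a labeled sample set might be organized as simple text-and-label records; the example texts and label names below are hypothetical.

```python
# Hypothetical labeled sample set for training the target recognition model.
sample_set = [
    {"text": "Draw a picture of a sunset over the sea", "label": "image_generation"},
    {"text": "Role-play a customer service conversation", "label": "scenario_dialog"},
    {"text": "Explain what this Python function returns", "label": "code_qa"},
]

# Only the "code_qa" type conforms to the code question-answer target task.
code_samples = [s for s in sample_set if s["label"] == "code_qa"]
print(code_samples)
```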
In practical application, the task processing model may be trained in advance with the task sample group corresponding to the target task, so that the task processing model can execute the target task. When the data type conforms to the task type of the target task, the task processing system inputs the data to be processed into the task processing model, executes the target task, and obtains a task processing result.
In one or more embodiments of the present description, the target task is exemplified by a code question-answering task. When the data type is a code annotation type and accords with the task type of the code question-answering task, the task processing system inputs data to be processed into the task processing model, and the task processing model executes the code annotation task to obtain codes containing the code annotation.
Further, inputting the data to be processed into the task processing model to execute the target task to obtain a task processing result, including: determining a target subtask hit by the data to be processed according to the data type; determining a target subtask model corresponding to the target subtask from the task processing model; and executing the target subtask on the data to be processed by using the target subtask model to obtain a task processing result.
In practical application, the target recognition model is a multi-classification model and can recognize multiple data types. Firstly, the data to be processed is input into the multi-classification model for data type recognition to determine which data type the data to be processed belongs to; secondly, the task processing system determines the target subtask hit by the data to be processed according to the data type output by the multi-classification model, and determines the target subtask model corresponding to the target subtask from the task processing model; finally, the task processing system executes the target subtask on the data to be processed by using the determined target subtask model, and obtains a task processing result.
In one or more embodiments of the present disclosure, taking the target task as code question answering, with the target task including subtasks such as code completion, code annotation, and code testing, as an example: firstly, the task processing system inputs the question text entered by the user into the multi-classification model to identify the question type, i.e., which type of subtask the question text belongs to; secondly, the task processing system determines the target subtask hit by the question text according to the question type output by the multi-classification model, and determines the corresponding target subtask model (the code completion sub-model, the code annotation sub-model, or the code testing sub-model) from the code question-answer model; finally, the task processing system executes the target subtask on the question text by using the determined target subtask model, and obtains a code question-answer result.
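A sketch of routing the recognized data type to the corresponding subtask model is shown below; the sub-model classes are placeholders standing in for the code completion, code annotation, and code testing sub-models.

```python
# Sketch of dispatching to a target subtask model by recognized data type.
# The sub-model classes are placeholders, not real model implementations.
class CodeCompletionModel:
    def run(self, text: str) -> str:
        return "completed code for: " + text


class CodeAnnotationModel:
    def run(self, text: str) -> str:
        return "annotated code for: " + text


class CodeTestModel:
    def run(self, text: str) -> str:
        return "unit tests for: " + text


# Mapping from the data type output by the multi-classification model
# to the corresponding target subtask model.
SUBTASK_MODELS = {
    "code_completion": CodeCompletionModel(),
    "code_annotation": CodeAnnotationModel(),
    "code_test": CodeTestModel(),
}


def dispatch(data_type: str, question_text: str) -> str:
    """Determine the hit target subtask from the data type and run its model."""
    return SUBTASK_MODELS[data_type].run(question_text)


print(dispatch("code_annotation", "def add(a, b): return a + b"))
```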
Based on this, the task processing system can execute corresponding tasks for different types of data and obtain effective task processing results, so that the requirements of specific application scenarios are met.
Further, after inputting the data to be processed into the target recognition model and recognizing the data type of the data to be processed, the method further comprises: generating target prompt information based on the data type; and sending the target prompt information to the front-end user.
In practical application, according to the determined data type, the task processing system generates target prompt information, which is used to provide the front-end user with prompt information related to the target task and may take various forms such as text, audio, video, or tables, so as to guide the user through the next operation. The prompt information may include a description of the task type, operation guidance, or the related data information the user needs to provide. The task processing system sends the generated target prompt information to the front-end user interface, so that the user can see it and perform subsequent operations accordingly.
Based on the method, the interaction experience of the user and the system can be improved, and the user can process tasks more conveniently.
Further, after inputting the data to be processed into the task processing model to execute the target task and obtaining the task processing result, the method further comprises: sending the task processing result to a front-end user; receiving feedback information sent by the front-end user based on the task processing result; and acquiring an updated sample set based on the feedback information, and updating the target recognition model by using the updated sample set.
The feedback information is error reporting information triggered by the front-end user.
In practical application, firstly, after the task processing model executes the target task, the task processing system sends the task processing result to the front-end user interface; after viewing the result, if the user finds that it is wrong or does not meet expectations, the user can click an error-report button on the interface to send feedback information to the task processing system, which receives and records this feedback. Secondly, the task processing system obtains an updated sample set according to the received user feedback; the updated sample set may include sample data provided by the user in the feedback information, or the task processing system may retrieve sample data from a sample database after receiving the feedback. Finally, using the obtained updated sample set, the task processing system performs update training on the target recognition model to improve its ability to recognize the data type of the data to be processed, thereby ensuring that the system recognizes data types more accurately and providing a more reliable basis for subsequent task processing.
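The feedback collection and updated-sample-set construction described above can be sketched as follows; the storage structure and field names are assumptions for illustration.

```python
# Sketch of recording front-end error reports and building an updated sample
# set for retraining the target recognition model; field names are assumptions.
feedback_log: list[dict] = []


def record_feedback(question: str, recognized_type: str, correct_type: str) -> None:
    """Record an error report triggered by the front-end user."""
    feedback_log.append(
        {"text": question, "wrong": recognized_type, "label": correct_type}
    )


def build_update_sample_set() -> list[tuple[str, str]]:
    """Turn accumulated feedback into (text, label) pairs for update training."""
    return [(item["text"], item["label"]) for item in feedback_log]


record_feedback(
    "How do I reverse a linked list in Java?",
    recognized_type="non_code_qa",
    correct_type="code_qa",
)
print(build_update_sample_set())  # used for update training of the target recognition model
```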
In one or more embodiments of the present disclosure, taking a code question-answer scenario as an example: firstly, the user asks the task processing system a code-related question but receives the erroneous processing result "Sorry, I am an intelligent coding assistant and cannot answer this question. I can help developers write code and provide technical advice. Is there anything I can help you with?", indicating that the target recognition model in the task processing system filtered the question by mistake; after the user clicks the error-report button on the interface to trigger the sending of feedback information, the task processing system obtains an updated sample set according to the received user feedback and performs update training on the target recognition model to improve its ability to recognize the data type of the data to be processed.
Based on this, the task processing system can continuously improve and optimize itself, providing higher-quality services.
The method comprises the steps of obtaining data to be processed; inputting the data to be processed into a target recognition model, and recognizing the data type of the data to be processed, wherein the target recognition model is obtained by training a classification model based on sample data corresponding to a target task; under the condition that the data type accords with the task type of the target task, inputting the data to be processed into a task processing model to execute the target task and obtain a task processing result, wherein the task processing model is obtained by training based on a task sample group corresponding to the target task. The data type of the data to be processed is identified through the target identification model, the data to be processed, the data type of which accords with the task type of the target task, is input into the task processing model, the data to be processed, the data type of which does not accord with the task type of the target task, is filtered, the consumption of the task processing model is reduced, the misuse of the task processing model is avoided, and the processing efficiency of the task processing model is accelerated.
Referring to fig. 6, fig. 6 is a flowchart of a code processing method according to an embodiment of the present disclosure, which specifically includes the following steps.
Step 602: and acquiring task initial information sent by a front-end user.
In practical application, firstly, the user inputs the task initial information through the application interface or in another way; then, the task processing system receives the task initial information input by the user, cleans and preprocesses the received data to remove non-key information such as special characters and noise, and formats the cleaned data to meet the requirements of subsequent processing, for example converting text into a specific data structure or converting voice data into a format that a voice recognition model can process.
In one or more embodiments of the present description, taking the user input "generate comments for the following code A: the text of code A" as an example: firstly, the user enters "generate comments for the following code A: the text of code A"; then, the task processing system receives the data to be processed input by the user, cleans and preprocesses the received data, removes punctuation marks to obtain code A, and converts code A into the input format of the subsequent target recognition model.
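A minimal sketch of the cleaning and pre-processing step follows, assuming punctuation removal and whitespace normalisation as described; the regular expressions are illustrative choices, not the embodiment's exact rules.

```python
# Sketch of cleaning task initial information before type recognition:
# strip special characters / punctuation and collapse whitespace.
import re


def clean_initial_info(raw: str) -> str:
    """Remove non-key characters and normalise whitespace (illustrative rules)."""
    text = re.sub(r"[^\w\s]", " ", raw)       # drop punctuation and symbols
    return re.sub(r"\s+", " ", text).strip()  # collapse runs of whitespace


print(clean_initial_info("Generate comments for code A: def add(a, b): return a + b!!!"))
```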
Step 604: inputting the task initial information into a target recognition model, and determining the information type of the task initial information, wherein the target recognition model is obtained by training a classification model based on a task initial sample.
In practical applications, the target recognition model is trained with labeled sample data corresponding to the task initial samples, using supervised learning or other related techniques, so that it can recognize various types of information input by users. Through learning from these sample data, the target recognition model is able to recognize and infer the intent expressed by the user input.
In one or more embodiments of the present disclosure, taking a code question-answer scenario as an example, in the training stage, labeled sample data are input into the classification model for training so that it can identify the intent of different data to be processed; for example, the label of "complete the following code" is "code question-answer task", and the label of "generate a flower" is "non-code question-answer task". Based on the sample data, the task processing system can train the classification model with a supervised learning method; the classification model learns to extract features from user input and to classify according to the labeled intent labels, yielding the target recognition model. After training is completed, the target recognition model can be used to recognize and infer the user's intent in real time.
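As one possible concrete realisation of this supervised training step, the sketch below uses scikit-learn with TF-IDF features and logistic regression; the sample texts, labels, and the choice of library are assumptions for illustration, not the embodiment's actual training setup.

```python
# Minimal sketch of training the intent classification model with labeled data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts = [
    "Complete the following code",
    "Add comments to this function",
    "Generate a flower",
    "Tell me a story about the sea",
]
labels = ["code_qa", "code_qa", "non_code_qa", "non_code_qa"]

vectorizer = TfidfVectorizer()            # extract features from user input
features = vectorizer.fit_transform(texts)

classifier = LogisticRegression()         # the classification model being trained
classifier.fit(features, labels)          # supervised learning on intent labels

# After training, the target recognition model infers user intent in real time.
query = vectorizer.transform(["Please complete the following code snippet"])
print(classifier.predict(query)[0])
```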
Step 606: under the condition that the information type accords with the code task type, inputting the task initial information into a code processing model to execute the code question-answering task, and obtaining a code processing result, wherein the code processing model is obtained by training based on a code processing sample pair.
In practical application, the code processing model is trained in advance with the code processing sample pairs corresponding to the code question-answering task, so that the code processing model can execute the code question-answering task. When the information type conforms to the code task type, the task processing system inputs the task initial information into the code processing model, executes the code question-answering task, and obtains a code processing result.
In one or more embodiments of the present description, the target task is a code question-answering task as an example. When the information type is a code annotation type and conforms to the code task type, the task processing system inputs the task initial information into the code processing model, and the code processing model executes the code annotation task to obtain code containing code annotations.
Step 608: and feeding back the code processing result to the front-end user.
In practical application, the code processing result is transmitted to an interface or an application program where a front-end user is located through a network or other communication modes, so that the user can further read, analyze and process the code processing result.
By the above method, the task initial information sent by the front-end user is obtained; the task initial information is input into the target recognition model to determine its information type, where the target recognition model is obtained by training a classification model based on task initial samples; when the information type conforms to the code task type, the task initial information is input into the code processing model to execute the code question-answering task and obtain a code processing result, where the code processing model is obtained by training based on code processing sample pairs; and the code processing result is fed back to the front-end user. By identifying the information type of the task initial information through the target recognition model, inputting only the task initial information whose information type conforms to the code task type into the code processing model, and filtering out task initial information that does not conform, non-development questions irrelevant to code processing are filtered out in the code question-answer scenario, the consumption of the code processing model is reduced, misuse of the code processing model is avoided, and the processing efficiency of the code processing model is improved.
The task processing method provided in the present specification will be further described with reference to fig. 7 by taking an application of the task processing method to code question answering as an example. Fig. 7 is a flowchart of a processing procedure of a code processing method according to an embodiment of the present disclosure, which specifically includes the following steps.
Step 702, load an intent recognition model.
Specifically, the intention recognition model service loads the trained intention recognition model, and the training process of the intention recognition model may refer to the training process of the target recognition model, which is not described herein.
Step 704, inputting questioning contents.
Specifically, the user inputs code-related questioning contents to the client at the front-end interface. It can be understood that the questioning content, i.e. the data to be processed in the foregoing embodiment, is not described herein again.
Step 706, the questioning content is sent.
Specifically, the client sends the questioning content input by the user to the back end of the question-answering large model.
Step 708, it is determined whether this is a first-round free question.
Specifically, the back end of the large model judges whether the questioning content is a first-round free question. It can be understood that the first-round free question is the first-round question-answer in the foregoing embodiments, which is not repeated here.
Step 710, a request is sent to the intent recognition model.
Specifically, if the back end of the large model judges that the questioning content is a first-round free question, a request is sent to the intention recognition model, where the request carries the questioning content input by the user.
Step 712, invoking the intent recognition model.
Specifically, the intention recognition model service calls the intention recognition model to analyze the questioning content and obtains a confidence corresponding to the questioning content. When the confidence is greater than or equal to the confidence threshold, the intent recognition model service determines the questioning content to be of a code question-answering type; when the confidence is less than the confidence threshold, it determines the questioning content to be of a non-code question-answering type.
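The confidence-threshold decision in step 712 can be written as a small helper; the threshold value and type labels below are assumptions, not values given in the embodiment.

```python
# Sketch of the confidence-threshold decision of the intent recognition service.
CONFIDENCE_THRESHOLD = 0.8  # hypothetical threshold value


def classify_question(confidence: float) -> str:
    """Map the intent model's confidence to a question type."""
    if confidence >= CONFIDENCE_THRESHOLD:
        return "code_qa"      # treated as a code question-answering type
    return "non_code_qa"      # treated as a non-code question-answering type


print(classify_question(0.93))  # not intercepted
print(classify_question(0.42))  # intercepted
```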
Step 714, return whether to intercept.
Specifically, the intent recognition model service returns an interception result to the large model back end: when the questioning content is of a code question-answering type, the questioning content is not intercepted; when the questioning content is of a non-code question-answering type, it is intercepted.
Step 716, if intercepted, the interception message is returned.
Specifically, when the questioning content is of a non-code question-answering type, the back end of the question-answering large model returns the interception message "Sorry, I am an intelligent coding assistant and cannot answer this question. I can help developers write code and provide technical advice. Is there anything I can help you with?".
Step 718, if not intercepted, a request is sent to the question-answering large model.
Specifically, when the questioning content is of a code question-answering type, the back end of the question-answering large model sends a request to the question-answering large model reasoning service.
Step 720, green network interception.
Specifically, the question-answering large model reasoning service performs green network interception through keyword retrieval; the specific process is described above and is not repeated here.
Step 722, call the question-answer big model.
Specifically, the question and answer big model reasoning service calls a question and answer big model to reason the question content, and a question and answer result is generated.
Step 724, return question and answer results.
Specifically, the question-answer large model reasoning service returns a question-answer result to the client.
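To summarise the flow of fig. 7, a minimal orchestration sketch is given below; every component function is a placeholder standing in for the corresponding service (first-round check, intent recognition, green-network interception, large-model inference), and the intercept message follows the wording used above.

```python
# Placeholder orchestration of the fig. 7 pipeline: first-round check,
# intent recognition, green-network interception, then large-model inference.
INTERCEPT_MESSAGE = ("Sorry, I am an intelligent coding assistant and cannot "
                     "answer this question. I can help developers write code "
                     "and provide technical advice. Is there anything I can "
                     "help you with?")


def is_first_round(session_history: list[str]) -> bool:
    return len(session_history) == 0            # first free question in session


def intent_allows(question: str) -> bool:
    return "code" in question.lower()           # stand-in for the intent model


def green_net_allows(question: str) -> bool:
    return "forbidden" not in question.lower()  # stand-in for keyword retrieval


def large_model_answer(question: str) -> str:
    return "model answer for: " + question      # stand-in for model inference


def handle_question(question: str, session_history: list[str]) -> str:
    if is_first_round(session_history) and not intent_allows(question):
        return INTERCEPT_MESSAGE
    if not green_net_allows(question):
        return INTERCEPT_MESSAGE
    return large_model_answer(question)


print(handle_question("How do I write code to sort a list?", []))
```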
Corresponding to the above method embodiments, the present disclosure further provides an embodiment of a task processing device, and fig. 8 is a schematic structural diagram of a task processing device provided in one embodiment of the present disclosure. As shown in fig. 8, the apparatus includes:
a first acquisition module 802 configured to acquire data to be processed;
The first recognition module 804 is configured to input the data to be processed into a target recognition model, and recognize the data type of the data to be processed, wherein the target recognition model is obtained by training the classification model based on sample data corresponding to a target task;
The first processing module 806 is configured to, when the data type conforms to the task type of the target task, input the data to be processed into the task processing model to execute the target task and obtain a task processing result, where the task processing model is obtained by training based on the task sample group corresponding to the target task.
Optionally, the first identifying module 804 further includes a guiding module configured to generate a question-answer guiding message in case that the data type does not conform to the task type of the target task, wherein the question-answer guiding message includes the task type of the target task; and sending the question and answer guide message to the front-end user to guide the front-end user to input updated to-be-processed data based on the task type.
Optionally, the first recognition module 804 further includes a prompt module configured to obtain a recognition result of whether the data to be processed contains a specified warning word; if the recognition result is that the data to be processed contains the specified warning word, generate a warning word prompt message; and send the warning word prompt message to the front-end user so as to prompt the front-end user to delete the specified warning word.
Optionally, the first identifying module 804 is further configured to receive a task processing request, wherein the task processing request includes data to be processed and a task identification of a target task; identifying whether the target task is a question-answer task executed for the first time according to the task identification; if yes, inputting the data to be processed into a target recognition model for type recognition, and recognizing the data type of the data to be processed.
Optionally, the first recognition module 804 is further configured to, if not, input the data to be processed into the task processing model to execute the target task to obtain the task processing result.
Optionally, the first processing module 806 is further configured to determine a target subtask for which the data to be processed hits according to the data type; determining a target subtask model corresponding to the target subtask from the task processing model; and executing the target subtask on the data to be processed by using the target subtask model to obtain a task processing result.
Optionally, the first processing module 806 is further configured to generate the target hint information based on the data type; and sending the target prompt information to the front-end user.
Optionally, the first processing module 806 further includes an updating module configured to send the task processing result to the front-end user; receiving feedback information sent by a front-end user based on a task processing result; and acquiring an updated sample set based on the feedback information, and updating the target recognition model by using the updated sample set.
Optionally, the first identifying module 804 further includes a training module configured to obtain a sample set and a classification model, where the sample set includes a plurality of sample data corresponding to the target task, and the sample data carries a data type tag; inputting the sample data into a classification model to obtain the predicted data type of the sample data; training the classification model based on the loss between the data type label and the predicted data type to obtain a target recognition model.
Optionally, the training module is further configured to obtain positive sample data and negative sample data from a target dataset, wherein the target dataset is a dataset of a target task; a sample set is constructed based on the positive sample data and the negative sample data.
Optionally, the training module further includes a preprocessing module configured to perform format conversion on the sample data based on the first format information corresponding to the target code language; performing word segmentation on the sample data subjected to format conversion to obtain word segmentation results; based on the second format information corresponding to the classification model, performing format conversion on the segmentation result to obtain updated sample data; and inputting the updated sample data into the classification model to obtain the predicted data type of the sample data.
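The pre-processing module's two format conversions and the intermediate word segmentation might look like the sketch below; the tokenizer pattern, padding token, and fixed input length are illustrative assumptions.

```python
# Sketch of the preprocessing module: first format conversion for the target
# code language, word segmentation, then conversion to the classifier's input.
import re


def convert_code_format(sample: str) -> str:
    """First format conversion, e.g. normalising line endings and tabs."""
    return sample.replace("\r\n", "\n").expandtabs(4)


def segment(sample: str) -> list[str]:
    """Word segmentation of the format-converted sample."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", sample)


def to_model_input(tokens: list[str], max_len: int = 16) -> list[str]:
    """Second format conversion: pad/truncate to the classifier's fixed length."""
    return (tokens + ["<pad>"] * max_len)[:max_len]


sample = "def add(a, b):\r\n\treturn a + b"
print(to_model_input(segment(convert_code_format(sample))))
```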
Acquiring data to be processed through the device; inputting the data to be processed into a target recognition model, and recognizing the data type of the data to be processed, wherein the target recognition model is obtained by training a classification model based on sample data corresponding to a target task; under the condition that the data type accords with the task type of the target task, inputting the data to be processed into a task processing model to execute the target task and obtain a task processing result, wherein the task processing model is obtained by training based on a task sample group corresponding to the target task. The data type of the data to be processed is identified through the target identification model, the data to be processed, the data type of which accords with the task type of the target task, is input into the task processing model, the data to be processed, the data type of which does not accord with the task type of the target task, is filtered, the consumption of the task processing model is reduced, the misuse of the task processing model is avoided, and the processing efficiency of the task processing model is accelerated.
The above is a schematic solution of a task processing device of the present embodiment. It should be noted that, the technical solution of the task processing device and the technical solution of the task processing method belong to the same concept, and details of the technical solution of the task processing device, which are not described in detail, can be referred to the description of the technical solution of the task processing method.
Corresponding to the above method embodiments, the present disclosure further provides an embodiment of a code processing apparatus, and fig. 9 is a schematic structural diagram of a code processing apparatus provided in one embodiment of the present disclosure. As shown in fig. 9, the apparatus includes:
a second obtaining module 902, configured to obtain task initial information sent by a front-end user;
A second recognition module 904 configured to input the task initial information into a target recognition model, and determine an information type of the task initial information, wherein the target recognition model is obtained by training a classification model based on the task initial sample;
the second processing module 906 is configured to input the task initial information into the code processing model to execute the code question-answering task to obtain a code processing result under the condition that the information type accords with the code task type, wherein the code processing model is obtained by training based on the code processing sample pair;
A feedback module 908 configured to feed back the code processing result to the front-end user.
Acquiring task initial information sent by a front-end user through the device; inputting the task initial information into a target recognition model, and determining the information type of the task initial information, wherein the target recognition model is obtained by training a classification model based on a task initial sample; under the condition that the information type accords with the code task type, inputting task initial information into a code processing model to execute a code question-answering task, and obtaining a code processing result, wherein the code processing model is obtained by training based on a code processing sample pair; and feeding back the code processing result to the front-end user. The information type of the task initial information is identified through the target identification model, the task initial information with the information type conforming to the code task type is input into the code processing model, non-research and development problems irrelevant to code processing are filtered in a code question-answer scene, consumption of the code processing model is reduced, abuse of the code processing model is avoided, and processing efficiency of the code processing model is accelerated.
The above is a schematic scheme of a code processing apparatus of the present embodiment. It should be noted that, the technical solution of the code processing apparatus and the technical solution of the code processing method belong to the same conception, and details of the technical solution of the code processing apparatus, which are not described in detail, can be referred to the description of the technical solution of the code processing method.
Fig. 10 illustrates a block diagram of a computing device 1000 provided in accordance with one embodiment of the present description. The components of the computing device 1000 include, but are not limited to, a memory 1010 and a processor 1020. Processor 1020 is coupled to memory 1010 via bus 1030 and database 1050 is used to store data.
Computing device 1000 also includes an access device 1040 that enables computing device 1000 to communicate via one or more networks 1060. Examples of such networks include the Public Switched Telephone Network (PSTN), a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), or a combination of communication networks such as the Internet. The access device 1040 may include one or more of any type of network interface, wired or wireless (e.g., a network interface card (network interface controller, NIC)), such as an IEEE 802.11 Wireless Local Area Network (WLAN) interface, a Worldwide Interoperability for Microwave Access (Wi-MAX) interface, an Ethernet interface, a Universal Serial Bus (USB) interface, a cellular network interface, a Bluetooth interface, or a Near Field Communication (NFC) interface.
In one embodiment of the present description, the above-described components of computing device 1000, as well as other components not shown in FIG. 10, may also be connected to each other, such as by a bus. It should be understood that the block diagram of the computing device illustrated in FIG. 10 is for exemplary purposes only and is not intended to limit the scope of the present description. Those skilled in the art may add or replace other components as desired.
Computing device 1000 may be any type of stationary or mobile computing device including a mobile computer or mobile computing device (e.g., tablet, personal digital assistant, laptop, notebook, netbook, etc.), mobile phone (e.g., smart phone), wearable computing device (e.g., smart watch, smart glasses, etc.), or other type of mobile device, or a stationary computing device such as a desktop or personal computer (Personal Computer, PC). Computing device 1000 may also be a mobile or stationary server.
The processor 1020 is configured to execute computer-executable instructions that, when executed by the processor, implement the steps of the task processing method or the code processing method described above.
The foregoing is a schematic illustration of a computing device of this embodiment. It should be noted that, the technical solution of the computing device and the technical solution of the task processing method or the code processing method belong to the same concept, and details of the technical solution of the computing device, which are not described in detail, can be referred to the description of the technical solution of the task processing method or the code processing method.
An embodiment of the present disclosure also provides a computer-readable storage medium storing computer-executable instructions that, when executed by a processor, implement the steps of the task processing method or the code processing method described above.
The above is an exemplary version of a computer-readable storage medium of the present embodiment. It should be noted that, the technical solution of the storage medium and the technical solution of the task processing method or the code processing method belong to the same concept, and details of the technical solution of the storage medium which are not described in detail can be referred to the description of the technical solution of the task processing method or the code processing method.
An embodiment of the present specification also provides a computer program product comprising computer programs/instructions which, when executed by a processor, implement the steps of the task processing method or the code processing method described above.
The foregoing is a schematic version of a computer program product of this embodiment. It should be noted that, the technical solution of the computer program product and the technical solution of the task processing method or the code processing method belong to the same concept, and details of the technical solution of the computer program product, which are not described in detail, can be referred to the description of the technical solution of the task processing method or the code processing method.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The computer instructions include computer program code which may be in source code form, object code form, executable file or some intermediate form, etc. The computer readable medium may include: any entity or device capable of carrying computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer Memory, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and so forth. It should be noted that the content of the computer readable medium can be increased or decreased appropriately according to the requirements of the patent practice, for example, in some areas, according to the patent practice, the computer readable medium does not include an electric carrier signal and a telecommunication signal.
It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of combinations of actions, but it should be understood by those skilled in the art that the embodiments are not limited by the order of actions described, as some steps may be performed in other order or simultaneously according to the embodiments of the present disclosure. Further, those skilled in the art will appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily all required for the embodiments described in the specification.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and for parts of one embodiment that are not described in detail, reference may be made to the related descriptions of other embodiments.
The preferred embodiments of the present specification disclosed above are merely used to help clarify the present specification. Alternative embodiments are not intended to be exhaustive or to limit the invention to the precise forms disclosed. Obviously, many modifications and variations are possible in light of the teaching of the embodiments. The embodiments were chosen and described in order to best explain the principles of the embodiments and the practical application, to thereby enable others skilled in the art to best understand and utilize the invention. This specification is to be limited only by the claims and the full scope and equivalents thereof.

Claims (14)

1. A task processing method, comprising:
Acquiring data to be processed;
Inputting the data to be processed into a target recognition model, and recognizing the data type of the data to be processed, wherein the target recognition model is obtained by training a classification model based on sample data corresponding to a target task;
under the condition that the data type accords with the task type of the target task, inputting the data to be processed into a task processing model to execute the target task to obtain a task processing result, wherein the task processing model is obtained by training based on a task sample group corresponding to the target task;
The obtaining the data to be processed comprises the following steps:
receiving a task processing request, wherein the task processing request comprises the data to be processed and a task identifier of the target task;
The step of inputting the data to be processed into a target recognition model, and recognizing the data type of the data to be processed comprises the following steps:
identifying whether the target task is a question-answer task executed for the first time according to the task identification;
if yes, inputting the data to be processed into the target recognition model for type recognition, and recognizing the data type of the data to be processed.
2. The method of claim 1, further comprising, after said inputting the data to be processed into the object recognition model, recognizing a data type of the data to be processed:
generating a question-answer guide message under the condition that the data type does not accord with the task type of the target task, wherein the question-answer guide message comprises the task type of the target task;
And sending the question-answer guide message to a front-end user so as to guide the front-end user to input updated data to be processed based on the task type.
3. The method of claim 1, further comprising, after said inputting the data to be processed into a target recognition model:
Obtaining a recognition result of whether the data to be processed contains a specified warning word;
if the recognition result is that the data to be processed contains the specified warning word, generating a warning word prompt message;
And sending the warning word prompt message to a front-end user so as to prompt the front-end user to delete the specified warning word.
4. The method of claim 1, further comprising, after said identifying, based on said task identification, whether said target task is a first executed question-answer task:
If not, inputting the data to be processed into the task processing model to execute the target task, and obtaining the task processing result.
5. A method according to any one of claims 1-4, wherein said inputting the data to be processed into a task processing model to execute the target task to obtain a task processing result comprises:
determining a target subtask hit by the data to be processed according to the data type;
determining a target subtask model corresponding to the target subtask from the task processing model;
and executing the target subtask on the data to be processed by using the target subtask model to obtain the task processing result.
6. The method of claim 5, further comprising, after said inputting the data to be processed into the object recognition model, recognizing a data type of the data to be processed:
generating target prompt information based on the data type;
And sending the target prompt information to a front-end user.
7. The method according to any one of claims 1-4, further comprising, after said inputting the data to be processed into a task processing model to execute the target task and obtain a task processing result:
The task processing result is sent to a front-end user;
receiving feedback information sent by the front-end user based on the task processing result;
and acquiring an updating sample set based on the feedback information, and updating the target recognition model by using the updating sample set.
8. The method of any of claims 1-4, further comprising, prior to said inputting the data to be processed into a target recognition model, identifying a data type of the data to be processed:
acquiring a sample set and the classification model, wherein the sample set comprises a plurality of sample data corresponding to the target task, and the sample data carries a data type label;
inputting the sample data into the classification model to obtain the predicted data type of the sample data;
And training the classification model based on the loss between the data type label and the predicted data type to obtain the target recognition model.
9. The method of claim 8, the acquiring a sample set comprising:
Acquiring positive sample data and negative sample data from a target data set, wherein the target data set is a data set of the target task;
The sample set is constructed based on the positive sample data and the negative sample data.
10. The method of claim 8, the inputting the sample data into a classification model resulting in a predicted data type for the sample data, comprising:
Performing format conversion on the sample data based on first format information corresponding to the target code language;
performing word segmentation on the sample data subjected to format conversion to obtain word segmentation results;
Based on second format information corresponding to the classification model, performing format conversion on the word segmentation result to obtain updated sample data;
And inputting the updated sample data into a classification model to obtain the predicted data type of the sample data.
11. A code processing method, comprising:
acquiring task initial information sent by a front-end user;
Inputting the task initial information into a target recognition model, and determining the information type of the task initial information, wherein the target recognition model is obtained by training a classification model based on a task initial sample;
under the condition that the information type accords with the code task type, inputting the task initial information into a code processing model to execute the code processing task to obtain a code processing result, wherein the code processing model is obtained by training based on a code processing sample pair;
Feeding back the code processing result to the front-end user;
the obtaining the task initial information sent by the front-end user includes:
Receiving a task processing request, wherein the task processing request comprises the task initial information and a task identifier of a target task;
The step of inputting the task initial information into a target recognition model and determining the information type of the task initial information comprises the following steps:
identifying whether the target task is a question-answer task executed for the first time according to the task identification;
If yes, inputting the task initial information into a target recognition model for type recognition, and recognizing the information type of the task initial information.
12. A computing device, comprising:
A memory and a processor;
The memory is configured to store a computer program/instruction, and the processor is configured to execute the computer program/instruction, which when executed by the processor, implements the steps of the task processing method according to any one of claims 1 to 10 or the code processing method according to claim 11.
13. A computer readable storage medium storing a computer program/instruction which, when executed by a processor, implements the steps of the task processing method of any one of claims 1 to 10 or the code processing method of claim 11.
14. A computer program product comprising computer programs/instructions which when executed by a processor implement the steps of the task processing method of any one of claims 1 to 10 or the code processing method of claim 11.
GR01 Patent grant