CN117112777B - LLM-based multitasking data processing method and storage medium - Google Patents


Info

Publication number
CN117112777B
CN117112777B (application CN202311382804.8A)
Authority
CN
China
Prior art keywords
task
preset
list
sample
subtask
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311382804.8A
Other languages
Chinese (zh)
Other versions
CN117112777A (en)
Inventor
于伟
石江枫
赵洲洋
靳雯
王全修
王明超
Current Assignee
Rizhao Ruian Information Technology Co ltd
Beijing Rich Information Technology Co ltd
Original Assignee
Rizhao Ruian Information Technology Co ltd
Beijing Rich Information Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Rizhao Ruian Information Technology Co ltd, Beijing Rich Information Technology Co ltd filed Critical Rizhao Ruian Information Technology Co ltd
Priority to CN202311382804.8A priority Critical patent/CN117112777B/en
Publication of CN117112777A publication Critical patent/CN117112777A/en
Application granted granted Critical
Publication of CN117112777B publication Critical patent/CN117112777B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G06F 16/338 — Information retrieval of unstructured textual data; querying; presentation of query results
    • G06F 16/332 — Information retrieval of unstructured textual data; querying; query formulation
    • G06F 18/214 — Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 40/247 — Handling natural language data; lexical tools; thesauruses, synonyms
    • G06F 40/279 — Handling natural language data; recognition of textual entities
    • Y02D 10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The invention provides an LLM-based multi-task data processing method and a storage medium, relating to the field of large language models. The method comprises the following steps: acquiring a preset task list; acquiring the subtask list contained in each preset task; splitting the subtasks according to a first event type in the field of the target sentence to obtain a first combined task list; if the number of subtasks contained in any first combined task is greater than a preset subtask number threshold, dividing that first combined task according to a second event type to obtain the corresponding second combined tasks, and replacing the first combined task in the first combined task list with them to obtain a final combined task list; acquiring the task template list corresponding to the final combined task list; and acquiring the preset task and the target sentence input by a user into the target LLM model, and obtaining the output result. The method thereby realizes the processing of multiple tasks on the same large language model.

Description

LLM-based multitasking data processing method and storage medium
Technical Field
The invention relates to the field of large language models, and in particular to an LLM-based multi-task data processing method and a storage medium.
Background
Natural language processing techniques are widely used in today's society, for example relation extraction, entity recognition, text proofreading, word segmentation and part-of-speech tagging; these applications all require processing large amounts of text data and using natural language processing techniques to understand and analyze the text. On the other hand, with the rapid development of large language models, large language models have begun to be used for natural language processing tasks. However, a large language model has a huge number of parameters, so how to use a single large language model to process a plurality of natural language processing tasks is an important problem.
Disclosure of Invention
Aiming at the above technical problems, the invention adopts the following technical scheme: an LLM-based multi-task data processing method, used for obtaining the output result of a target sentence under a preset task based on a target LLM model, the method comprising the following steps:
S100, acquiring a preset task list A = {A_1, A_2, …, A_i, …, A_m}, where A_i is the i-th preset task, i ranges from 1 to m, and m is the number of preset tasks; the preset task is a natural language processing task processed on the target LLM model;
S200, acquiring the subtask list set B_i = {B_i,1, B_i,2, …, B_i,j, …, B_i,ni} contained in preset task A_i, and dividing B_i according to a first division principle to obtain the first combined task list C_i = {C_i,1, C_i,2, …, C_i,r, …, C_i,si};
wherein the j-th subtask list B_i,j comprises all subtasks of preset task A_i corresponding to the j-th preset field, j ranges from 1 to ni, and ni is the number of preset fields; the r-th first combined task C_i,r corresponding to B_i comprises at least one subtask, r ranges from 1 to si, si is the number of first combined tasks corresponding to B_i, and si ≤ ni; the first division principle is to divide according to the first-level task category of the subtasks;
S300, if the number of subtasks contained in C_i,r is greater than a preset subtask number threshold, dividing C_i,r according to a second division principle to obtain the second combined task list D_i,r = {D_i,r1, D_i,r2, …, D_i,ry, …, D_i,rp} corresponding to C_i,r, and replacing C_i,r in the first combined task list C_i with D_i,r1, D_i,r2, …, D_i,ry, …, D_i,rp, thereby obtaining the final combined task list G_i = {G_i,1, G_i,2, …, G_i,x, …, G_i,q} corresponding to A_i;
wherein D_i,ry is the y-th second combined task obtained by dividing C_i,r according to the second division principle, y ranges from 1 to p, and p is the number of second combined tasks obtained after dividing C_i,r according to the second division principle;
G_i,x is the x-th final combined task, x ranges from 1 to q, q is the number of final combined tasks, and q = si + p - 1;
the second division principle is to divide according to the secondary task category of the subtasks, wherein the secondary task category is lower than the primary task category;
S400, acquiring the task template E_i,x corresponding to G_i,x, thereby obtaining the task template list E_i = {E_i,1, E_i,2, …, E_i,x, …, E_i,q}; the task template E_i,x comprises: the preset sample sentences and the instruction "referring to the preset sample sentences, output the subtask output result corresponding to the target sentence"; wherein the preset sample sentences are: the preset sample sentence corresponding to each subtask contained in G_i,x and the subtask output result corresponding to each preset sample sentence; the subtask output result is the content output after content extraction is performed on a preset sample sentence or the target sentence according to the subtask;
S500, acquiring the preset task A_i and the target sentence input by a user into the target LLM model, and obtaining the subtask output result F_i = {F_i,1, F_i,2, …, F_i,x, …, F_i,q} corresponding to the target sentence, where F_i,x is the subtask output result corresponding to G_i,x.
A non-transitory computer-readable storage medium having stored therein at least one instruction or at least one program, the at least one instruction or the at least one program being loaded and executed by a processor to implement the LLM-based multi-task data processing method described above.
The invention has at least the following beneficial effects:
In summary, the method acquires a preset task list and the subtask list contained in each preset task; splits the subtasks according to a first event type in the field of the target sentence to obtain a first combined task list; if the number of subtasks contained in any first combined task is greater than a preset subtask number threshold, divides that first combined task according to a second event type to obtain the corresponding second combined tasks, which replace the first combined task in the first combined task list, thereby obtaining the final combined task list; acquires the task template list corresponding to the final combined task list; and acquires the preset task and the target sentence input by a user into the target LLM model to obtain the output result. The subtasks of each preset task are combined according to the event types in the target field, and the processing of multiple tasks on the same large language model is realized through the selection of the preset task, so that the capability of the large language model is fully utilized; moreover, multiple natural language processing tasks are built on the same large language model, giving the user a better experience.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments will be briefly described below, and it is apparent that the drawings in the following description are only some embodiments of the present invention, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flowchart of a LLM-based multi-task data processing method according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to fall within the scope of the invention.
The embodiment of the invention provides a LLM-based multi-task data processing method, which is used for acquiring an output result of a target sentence under a preset task based on a target LLM model, as shown in fig. 1, and comprises the following steps:
S100, acquiring a preset task list A = {A_1, A_2, …, A_i, …, A_m}, where A_i is the i-th preset task, i ranges from 1 to m, and m is the number of preset tasks; the preset task is a natural language processing task processed on the target LLM model.
Specifically, the preset task list includes, for example: relationship extraction, entity identification and text proofreading. Further, the preset task list further includes: synonym recognition, keyword recognition, and the like. Those skilled in the art will appreciate that the foregoing list of preset tasks is merely exemplary and is not intended to limit the specific scope of the present invention.
S200, acquiring the subtask list set B_i = {B_i,1, B_i,2, …, B_i,j, …, B_i,ni} contained in preset task A_i, and dividing B_i according to a first division principle to obtain the first combined task list C_i = {C_i,1, C_i,2, …, C_i,r, …, C_i,si}; wherein the j-th subtask list B_i,j comprises all subtasks of preset task A_i corresponding to the j-th preset field, j ranges from 1 to ni, and ni is the number of preset fields; the r-th first combined task C_i,r corresponding to B_i comprises at least one subtask, r ranges from 1 to si, si is the number of first combined tasks corresponding to B_i, and si ≤ ni; the first division principle is to divide according to the first-level task category of the subtasks.
Specifically, in one embodiment of the present invention, when the field in which the target sentence is located is the alert field and A_i is relation extraction, the subtask list corresponding to A_i comprises: sibling relationships, relatives, parents, occupation, age, conversation, buying and selling, shopping, riding, etc.; dividing according to the first event type in the alert field gives the first combined task list: relative-and-friend relationships between people, attribute relationships of the person, and business relationships. Relative-and-friend relationships between people: sibling relationships, relatives, parents; attribute relationships of the person: occupation and age; business relationships: conversation, buying and selling, shopping, riding.
Further, the first event type in the field where the target sentence is located may be determined according to an actual situation. For example, in another embodiment of the present invention, criminal cases, civil cases, administrative cases, economic cases may be classified as the first event type.
S300, if the number of subtasks contained in C_i,r is greater than a preset subtask number threshold, dividing C_i,r according to a second division principle to obtain the second combined task list D_i,r = {D_i,r1, D_i,r2, …, D_i,ry, …, D_i,rp} corresponding to C_i,r, and replacing C_i,r in the first combined task list C_i with D_i,r1, D_i,r2, …, D_i,ry, …, D_i,rp, thereby obtaining the final combined task list G_i = {G_i,1, G_i,2, …, G_i,x, …, G_i,q} corresponding to A_i; wherein D_i,ry is the y-th second combined task obtained by dividing C_i,r according to the second division principle, y ranges from 1 to p, and p is the number of second combined tasks obtained after dividing C_i,r according to the second division principle; G_i,x is the x-th final combined task, x ranges from 1 to q, q is the number of final combined tasks, and q = si + p - 1; the second division principle is to divide according to the secondary task category of the subtasks, and the secondary task category is lower than the primary task category.
Specifically, the preset subtask number threshold C0 may be determined according to actual requirements; optionally, C0 ranges from 1 to 15; preferably, C0 = 5.
Specifically, in an embodiment of the present invention, if C0 = 3, the number of subtasks contained in the business relationship is 4 > C0 = 3, so the business relationship is further divided into "consumption" and "non-consumption" according to the second event type under the business relationship. Consumption: buying and selling, shopping, riding; non-consumption: conversation.
Further, the second event type included in the first combined task may be determined according to actual requirements.
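The two-level division of S200-S300 can be sketched in code. The following Python sketch is illustrative only: the taxonomy dictionaries, function name and example subtasks are assumptions drawn from the alert-field example above; it shows only the mechanics of grouping by first-level category and re-splitting any group that exceeds the threshold C0.

```python
# Hypothetical taxonomy for the alert-field relation-extraction example.
FIRST_LEVEL = {  # subtask -> first-level task category
    "sibling": "kinship", "relative": "kinship", "parent": "kinship",
    "occupation": "attribute", "age": "attribute",
    "talk": "business", "buy_sell": "business",
    "shop": "business", "ride": "business",
}
SECOND_LEVEL = {  # subtask -> second-level category (finer grained)
    "talk": "non-consumption", "buy_sell": "consumption",
    "shop": "consumption", "ride": "consumption",
}

def combine_tasks(subtasks, c0):
    """Return the final combined task list G_i for one preset task."""
    # First division (S200): group by first-level category -> C_i.
    first = {}
    for t in subtasks:
        first.setdefault(FIRST_LEVEL[t], []).append(t)
    # Second division (S300): re-split any group larger than c0 -> D_i,r.
    final = []
    for group in first.values():
        if len(group) > c0:
            second = {}
            for t in group:  # assumes SECOND_LEVEL covers this group
                second.setdefault(SECOND_LEVEL[t], []).append(t)
            final.extend(second.values())
        else:
            final.append(group)
    return final

tasks = ["sibling", "relative", "parent", "occupation", "age",
         "talk", "buy_sell", "shop", "ride"]
groups = combine_tasks(tasks, c0=3)
# With C0 = 3 the four-subtask business group is re-split, so
# q = si + p - 1 = 3 + 2 - 1 = 4 final combined tasks.
```

This reproduces the worked example: kinship and attribute survive the first division intact, while business is re-split into consumption and non-consumption.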
S400, acquiring the task template E_i,x corresponding to G_i,x, thereby obtaining the task template list E_i = {E_i,1, E_i,2, …, E_i,x, …, E_i,q}; the task template E_i,x comprises: the preset sample sentences and the instruction "referring to the preset sample sentences, output the subtask output result corresponding to the target sentence"; wherein the preset sample sentences are: the preset sample sentence corresponding to each subtask contained in G_i,x and the subtask output result corresponding to each preset sample sentence; the subtask output result is the content output after content extraction is performed on a preset sample sentence or the target sentence according to the subtask.
Wherein the preset sample sentences further comprise: the preset sample sentence corresponding to a designated subtask and the designated subtask itself, the designated subtask being a subtask whose output is empty.
Specifically, in one embodiment of the present invention, a task template is: "Referring to the preset sample sentences, output the triples of the help, buying-and-selling and riding relations contained in the target sentence";
Text 1: After the volunteers heard the call for help, they went to rescue the disaster victims.
Output 1: (volunteers, disaster victims, help)
Text 2: The customer purchased goods from the merchant, completing the buying-and-selling transaction.
Output 2: (customer, merchant, buying and selling)
Text 3: The passenger takes a taxi and asks the driver to stop at the destination.
Output 3: (passenger, taxi, riding)
Text 4: {input text, output result}.
S500, acquiring the preset task A_i and the target sentence input by a user into the target LLM model, and obtaining the subtask output result F_i = {F_i,1, F_i,2, …, F_i,x, …, F_i,q} corresponding to the target sentence, where F_i,x is the subtask output result corresponding to G_i,x.
Specifically, the preset task A_i and the target sentence input by the user into the target LLM model are acquired; according to the subtask list B_i contained in A_i, the target sentence is processed through each final combined task contained in A_i, and the output result of each final combined task on the target sentence is taken as the output result F_i.
In summary, the method acquires a preset task list and the subtask list contained in each preset task; splits the subtasks according to a first event type in the field of the target sentence to obtain a first combined task list; if the number of subtasks contained in any first combined task is greater than a preset subtask number threshold, divides that first combined task according to a second event type to obtain the corresponding second combined tasks, which replace the first combined task in the first combined task list, thereby obtaining the final combined task list; acquires the task template list corresponding to the final combined task list; and acquires the preset task and the target sentence input by a user into the target LLM model to obtain the output result. The subtasks of each preset task are combined according to the event types in the target field, and the processing of multiple tasks on the same large language model is realized through the selection of the preset task, so that the capability of the large language model is fully utilized; moreover, multiple natural language processing tasks are built on the same large language model, giving the user a better experience.
Further, before the target LLM model is used to obtain the output result of the target sentence under the preset task, the method further comprises training an LLM model to obtain the target LLM model.
Specifically, the target LLM model is obtained by the following steps:
s001, acquiring a history task list and a history sample set corresponding to each history task, wherein the history sample comprises history sentences and history results of the history tasks corresponding to the history sentences. Wherein the history task list includes all subtasks.
S002, acquiring an instruction corresponding to each historical task, wherein the instruction is as follows: and outputting a subtask output result corresponding to the history sample according to the history sample corresponding to the history task.
S003, inputting the instruction corresponding to each historical task and the historical sample set corresponding to each historical task into the LLM model to obtain a target LLM model.
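The assembly of training records in S001-S002 can be sketched as follows. This is an illustrative Python sketch only: the record fields ("instruction", "input", "output"), the function name and the example data are assumptions, and the actual fine-tuning of S003 is not shown.

```python
# Sketch of S001-S002: turn each historical task's sample set into
# instruction-tuning records (field names are assumptions).
def build_training_records(history):
    """history: {task_name: [(sentence, result), ...]} -> list of records."""
    records = []
    for task, samples in history.items():
        # S002: one instruction per historical task.
        instruction = (f"According to the historical sample for task "
                       f"'{task}', output the subtask output result.")
        for sentence, result in samples:
            records.append({"instruction": instruction,
                            "input": sentence,
                            "output": result})
    return records

records = build_training_records({
    "relation extraction": [("The passenger took a taxi.",
                             "(passenger, taxi, riding)")]})
```

Each record pairs the task instruction with one historical sentence and its result, which is the shape S003 feeds into the LLM.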
Further, the length of the character content input in each task template is less than or equal to M0, where M0 is a preset input character length threshold. It will be appreciated that the input length of a large language model is limited, so the longest length of a task template is set to M0; specifically, M0 may be determined according to actual requirements and the capabilities of the large language model.
Further, in S400, acquiring the task template E_i,x corresponding to G_i,x comprises the following steps:
S410, acquiring the reference sample list set H_i = {H_i,1, H_i,2, …, H_i,j, …, H_i,ni} corresponding to the subtask list set B_i, where H_i,j is the reference sample list corresponding to B_i,j, H_i,j = {H_i,j1, H_i,j2, …, H_i,ja, …, H_i,jb}, H_i,ja is the a-th reference sample in H_i,j, a ranges from 1 to b, and b is the number of reference samples in the reference sample list corresponding to B_i,j.
Specifically, each subtask B_i,j has a corresponding reference sample list, and each reference sample list contains b reference samples.
S420, acquiring the subtasks contained in G_i,x and marking them as intermediate subtasks, thereby obtaining the intermediate subtask list Q_i,x = {Q_i,x1, Q_i,x2, …, Q_i,xc, …, Q_i,xd}, and, based on H_i, acquiring the reference sample list corresponding to each intermediate subtask and marking it as the intermediate sample list set R_i,x = {R_i,x1, R_i,x2, …, R_i,xc, …, R_i,xd}; Q_i,xc is the c-th intermediate subtask contained in G_i,x, R_i,xc is the intermediate sample list corresponding to Q_i,xc, c ranges from 1 to d, and d is the number of intermediate subtasks contained in G_i,x.
S430, extracting one intermediate sample from each intermediate sample list as a sample to be used, obtaining the to-be-used combination list J_i,x = {J_i,x1, J_i,x2, …, J_i,xe, …, J_i,xf}, and obtaining the character count list K_i,x = {K_i,x1, K_i,x2, …, K_i,xe, …, K_i,xf} corresponding to J_i,x; J_i,xe is the e-th to-be-used combination corresponding to Q_i,x, K_i,xe is the character count corresponding to J_i,xe, e ranges from 1 to f, f is the number of to-be-used combinations corresponding to Q_i,x, and f = b^d.
Specifically, one intermediate sample is randomly extracted from each intermediate sample list as a sample to be used, and all the samples to be used together form one to-be-used combination; it can be understood that the intermediate samples of the intermediate sample lists are permuted and combined to obtain all possible combinations as the to-be-used combination list.
S440, if K_i,xe ≤ M0, taking J_i,xe as the preset sample sentences in the task template E_i,x corresponding to G_i,x.
Specifically, S440 further comprises: if the to-be-used combination list J_i,x contains multiple combinations with K_i,xe ≤ M0, marking each to-be-used combination with K_i,xe ≤ M0 as a to-be-determined combination, and randomly selecting one to-be-determined combination as the preset sample sentences in the task template E_i,x corresponding to G_i,x.
Further, those skilled in the art will know that any method for randomly extracting one from a plurality of combinations to be used falls within the scope of the present invention, and will not be described herein.
S450, if there is no K_i,xe ≤ M0, obtaining K_i,x0 = min{K_i,x1, K_i,x2, …, K_i,xe, …, K_i,xf} and the J_i,x0 corresponding to K_i,x0, and executing S460.
S460, acquiring a new sample, replacing with the new sample the sample to be used in J_i,x0 that belongs to the same intermediate subtask as the new sample, and taking the result as the preset sample sentences in the task template E_i,x corresponding to G_i,x; wherein the character count of the sample to be used in J_i,x0 belonging to the same subtask, minus the character count of the new sample, is greater than or equal to K_i,x0 - M0, and the new sample is a newly added reference sample corresponding to that intermediate subtask.
It can be understood that if there is no K_i,xe ≤ M0, no combination of the intermediate samples meets the length requirement M0; at this time a new sample is acquired, and the character count of the new sample ensures that, after it replaces the sample to be used of the same intermediate subtask in J_i,x0, the M0 character count requirement can be met.
On the whole, the reference sample list set H_i corresponding to the subtask list set B_i is acquired; the subtasks contained in G_i,x are marked as intermediate subtasks to obtain the intermediate subtask list, and, based on H_i, the reference sample list corresponding to each intermediate subtask is acquired and marked as the intermediate sample list set; one intermediate sample is extracted from each intermediate sample list as a sample to be used, the to-be-used combination list is obtained, and the corresponding character count list is obtained. If K_i,xe ≤ M0, J_i,xe is taken as the preset sample sentences in the task template E_i,x corresponding to G_i,x; if there is no K_i,xe ≤ M0, K_i,x0 = min{K_i,x1, K_i,x2, …, K_i,xe, …, K_i,xf} and the corresponding J_i,x0 are obtained, a new sample is acquired, and the new sample replaces the sample to be used in J_i,x0 belonging to the same intermediate subtask, the result serving as the preset sample sentences in the task template E_i,x corresponding to G_i,x. All to-be-used combinations are obtained by permuting and combining the intermediate samples of the intermediate sample lists, and the character counts of the to-be-used combinations are checked so that the preset sample sentences meet the M0 character limit.
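A minimal Python sketch of the S410-S450 selection logic follows, assuming samples are plain strings and the character count of a combination is the sum of its samples' lengths; enumerating all b^d combinations with itertools.product is one reading of the "permute and combine" step, and the function name is an assumption.

```python
# Sketch of S430-S450: enumerate one-sample-per-subtask combinations,
# compute each combination's character count, and pick one within the
# M0 budget -- or return the shortest one for the S460 replacement step.
from itertools import product
import random

def pick_samples(intermediate_lists, m0, rng=random.Random(0)):
    """intermediate_lists: d lists of sample strings (R_i,x).
    Returns (samples, fits): fits is True when the M0 budget was met."""
    combos = list(product(*intermediate_lists))        # f = b**d combos
    counts = [sum(len(s) for s in c) for c in combos]  # K_i,xe values
    feasible = [c for c, k in zip(combos, counts) if k <= m0]
    if feasible:
        # S440: randomly pick among the to-be-determined combinations.
        return list(rng.choice(feasible)), True
    # S450: no combination fits; hand back the shortest one (J_i,x0)
    # so the caller can swap in a shorter new sample (S460).
    shortest = combos[counts.index(min(counts))]
    return list(shortest), False

samples, fits = pick_samples([["aaaa", "bb"], ["cc", "dddd"]], m0=5)
```

With M0 = 5 only the combination ("bb", "cc") fits, so it is returned with fits = True; the exhaustive b^d enumeration is affordable here because both b and d are small in practice.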
Further, S440 also comprises: if there is a K_i,xe ≤ M0 and M0 - K_i,xe ≥ K0, where K0 is a preset gap threshold, executing the following steps:
S441, from R_i,x1 to R_i,xd, randomly extracting h intermediate samples meeting a preset requirement, where h is initialized to 1; the preset requirement is that the preset vector corresponding to the intermediate sample carries the first identifier at the positions corresponding to the intermediate samples contained in J_i,x; the preset vector is used to characterize the conflict situation between any two reference samples, and the first identifier indicates that the two reference samples do not conflict.
Specifically, if there is a K_i,xe ≤ M0 and M0 - K_i,xe ≥ K0, it can be understood that a to-be-used combination meeting the character count requirement exists, but its character count is much smaller than M0; at this time, h intermediate samples meeting the preset requirement are randomly extracted from R_i,x1 to R_i,xd. The preset gap threshold K0 may be determined according to actual requirements.
Specifically, meeting the preset requirement means that the preset vector corresponding to the intermediate sample carries the first identifier at the positions corresponding to the intermediate samples contained in J_i,x, i.e. the h randomly extracted intermediate samples do not conflict with the intermediate samples contained in J_i,x, so as to avoid conflicts between the extracted samples and the intermediate samples contained in J_i,x.
Further, the first identifier is an identifier used to characterize that two reference samples do not conflict.
S442, adding the h intermediate samples meeting the preset requirement to J_i,xe, and calculating the character count L_i,xe of J_i,xe after the addition.
S443, if d + h = A0, or M0 - L_i,xe < K0, executing S444; otherwise, letting h = h + 1 and executing S441; A0 is a preset number of samples.
Optionally, 6 ≤ A0 ≤ 8; preferably, A0 = 7.
S444, taking J_i,xe with the h added intermediate samples meeting the preset requirement as the preset sample sentences in the task template E_i,x corresponding to G_i,x.
In summary, if there is a K_i,xe ≤ M0 and M0 - K_i,xe ≥ K0, h intermediate samples meeting the preset requirement are randomly extracted from R_i,x1 to R_i,xd and added to J_i,xe, and the character count L_i,xe of J_i,xe after the addition is calculated, until d + h equals the preset number of samples A0 or M0 - L_i,xe < K0; J_i,xe with the h added intermediate samples is then taken as the preset sample sentences in the task template E_i,x corresponding to G_i,x. When the character count is too small, intermediate samples are added so that an intermediate subtask corresponds to one or more intermediate samples; the intermediate samples corresponding to one intermediate subtask give different descriptions of the same subtask, so the target LLM model understands the intermediate subtask better and the large language model achieves a better prediction effect.
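The S441-S444 padding loop can be sketched as follows. This Python sketch is illustrative: the conflict predicate stands in for the patent's "preset vector", the function name is an assumption, and the skip-over-budget guard is our own addition to keep the prompt within M0 rather than anything the patent specifies.

```python
# Sketch of S441-S444: when the chosen combination leaves at least K0
# characters of slack under M0, append non-conflicting extra samples
# until the sample count reaches A0 or the slack drops below K0.
def pad_combination(combo, pool, m0, k0, a0, conflicts=lambda a, b: False):
    """combo: chosen to-be-used samples; pool: candidate extra samples."""
    combo = list(combo)
    used = sum(len(s) for s in combo)
    for cand in pool:
        if len(combo) >= a0 or m0 - used < k0:
            break  # S443 stop conditions: d + h = A0 or M0 - L < K0
        if any(conflicts(cand, s) for s in combo):
            continue  # the preset vector marks a conflict; skip (S441)
        if used + len(cand) > m0:
            continue  # guard (assumption): never exceed the M0 budget
        combo.append(cand)  # S442: add the sample and update the count
        used += len(cand)
    return combo

padded = pad_combination(["abc"], ["defg", "hi", "jklmn"],
                         m0=12, k0=3, a0=7)
```

Here "defg" and "hi" fit within the budget of 12 characters, while "jklmn" would overshoot it and is skipped, so the padded combination is ["abc", "defg", "hi"].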
Embodiments of the present invention also provide a non-transitory computer-readable storage medium, which may be disposed in an electronic device to store at least one instruction or at least one program for implementing the method of the method embodiments; the at least one instruction or the at least one program is loaded and executed by a processor to implement the method provided by the embodiments described above.
Embodiments of the present invention also provide an electronic device comprising a processor and the aforementioned non-transitory computer-readable storage medium.
Embodiments of the present invention also provide a computer program product comprising program code for causing an electronic device to carry out the steps of the method according to the various exemplary embodiments of the invention as described in the specification, when said program product is run on the electronic device.
While certain specific embodiments of the invention have been described in detail by way of example, it will be appreciated by those skilled in the art that the above examples are for illustration only and are not intended to limit the scope of the invention. Those skilled in the art will also appreciate that many modifications may be made to the embodiments without departing from the scope and spirit of the invention. The scope of the invention is defined by the appended claims.

Claims (9)

1. An LLM-based multitasking data processing method, characterized in that the method obtains an output result of a target sentence under a preset task based on a target LLM model, and comprises the following steps:
S100, acquiring a preset task list A = {A_1, A_2, ..., A_i, ..., A_m}, where A_i is the i-th preset task, i ranges from 1 to m, and m is the number of preset tasks; a preset task is a natural language processing task processed on the target LLM model, and the preset task list at least comprises: relation extraction, entity recognition and text checking;
S200, acquiring the subtask list set B_i = {B_i,1, B_i,2, ..., B_i,j, ..., B_i,ni} contained in the preset task A_i, and dividing B_i according to a first division principle to obtain a first divided combined task list C_i = {C_i,1, C_i,2, ..., C_i,r, ..., C_i,si};
wherein the j-th subtask list B_i,j comprises all subtasks of the preset task A_i corresponding to the j-th preset field, j ranges from 1 to ni, and ni is the number of preset fields; the r-th first combined task C_i,r corresponding to B_i comprises at least one subtask, r ranges from 1 to si, si is the number of first combined tasks corresponding to B_i, and si ≤ ni; the first division principle is to divide according to the primary task category of the subtasks;
S300, if C_i,r contains more subtasks than a preset subtask number threshold, dividing C_i,r according to a second division principle to obtain the second combined task list D_i,r = {D_i,r1, D_i,r2, ..., D_i,ry, ..., D_i,rp} corresponding to C_i,r, and replacing C_i,r in the first combined task list C_i with D_i,r1, D_i,r2, ..., D_i,ry, ..., D_i,rp, thereby obtaining the final combined task list G_i = {G_i,1, G_i,2, ..., G_i,x, ..., G_i,q} corresponding to A_i;
wherein D_i,ry is the y-th second combined task obtained by dividing C_i,r according to the second division principle, y ranges from 1 to p, and p is the number of second combined tasks obtained after dividing C_i,r according to the second division principle;
G_i,x is the x-th final combined task, x ranges from 1 to q, q is the number of final combined tasks, and q = si + p - 1;
the second division principle is to divide according to the secondary task category of the subtasks, the secondary task category being lower than the primary task category;
S400, acquiring the task template E_i,x corresponding to G_i,x, thereby obtaining a task template list E_i = {E_i,1, E_i,2, ..., E_i,x, ..., E_i,q}; the task template E_i,x comprises: preset sample sentences and the instruction "referring to the preset sample sentences, output the subtask output results corresponding to the target sentence"; wherein the preset sample sentences are: the preset sample sentence corresponding to each subtask in G_i,x and the subtask output result corresponding to each preset sample sentence; a subtask output result is the content output after content extraction is performed on a preset sample sentence or the target sentence according to the subtask;
S500, acquiring the preset task A_i and the target sentence input to the target LLM model by a user, and obtaining the subtask output result F_i = {F_i,1, F_i,2, ..., F_i,x, ..., F_i,q} corresponding to the target sentence, where F_i,x is the subtask output result corresponding to G_i,x.
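The two-stage division of steps S200-S300 (group subtasks by primary category, then split any oversized group by secondary category) can be sketched as follows. This is an illustrative sketch only; `build_final_task_list`, the tuple layout, and the threshold parameter are assumptions, not identifiers from the patent.

```python
from collections import defaultdict

def build_final_task_list(subtasks, max_per_group):
    """Sketch of S200-S300: `subtasks` is a list of
    (name, primary_category, secondary_category) tuples; returns the
    final combined task list G_i as a list of subtask groups."""
    primary = defaultdict(list)
    for task in subtasks:                 # first division principle
        primary[task[1]].append(task)
    final = []
    for group in primary.values():
        if len(group) <= max_per_group:   # small enough: keep as one C_i,r
            final.append(group)
        else:                             # second division principle
            secondary = defaultdict(list)
            for task in group:
                secondary[task[2]].append(task)
            final.extend(secondary.values())
    return final
```

With a threshold of 2, three "relation" subtasks would be split by secondary category while a single "entity" subtask stays in its own group, matching the q = si + p - 1 count in the claim.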
2. The LLM-based multitasking data processing method of claim 1, wherein the preset sample sentences further comprise: a preset sample sentence corresponding to a designated subtask and the designated subtask, the designated subtask being a subtask whose output is empty.
3. The LLM-based multitasking data processing method of claim 1, wherein the length of the character content entered in each task template is less than or equal to M0, wherein M0 is a preset input character length threshold.
4. The LLM-based multitasking data processing method of claim 3, wherein in S400, acquiring the task template E_i,x corresponding to G_i,x comprises the following steps:
S410, acquiring the reference sample list set H_i = {H_i,1, H_i,2, ..., H_i,j, ..., H_i,ni} corresponding to the subtask list set B_i, where H_i,j is the reference sample list corresponding to B_i,j, H_i,j = {H_i,j1, H_i,j2, ..., H_i,ja, ..., H_i,jb}, H_i,ja is the a-th reference sample in H_i,j, a ranges from 1 to b, and b is the number of reference samples in the reference sample list corresponding to B_i,j;
S420, acquiring the subtasks contained in G_i,x and marking them as intermediate subtasks, thereby obtaining an intermediate subtask list Q_i,x = {Q_i,x1, Q_i,x2, ..., Q_i,xc, ..., Q_i,xd}, and acquiring, based on H_i, the reference sample list corresponding to each intermediate subtask and marking it as the intermediate sample list set R_i,x = {R_i,x1, R_i,x2, ..., R_i,xc, ..., R_i,xd}, where Q_i,xc is the c-th intermediate subtask contained in G_i,x, R_i,xc is the intermediate sample list corresponding to Q_i,xc, c ranges from 1 to d, and d is the number of intermediate subtasks contained in G_i,x;
S430, extracting one intermediate sample from each intermediate sample list as a sample to be used, obtaining the to-be-used combination list J_i,x = {J_i,x1, J_i,x2, ..., J_i,xe, ..., J_i,xf}, and obtaining the character count list K_i,x = {K_i,x1, K_i,x2, ..., K_i,xe, ..., K_i,xf} corresponding to J_i,x, where J_i,xe is the e-th to-be-used combination corresponding to Q_i,x, K_i,xe is the character count corresponding to J_i,xe, e ranges from 1 to f, f is the number of to-be-used combinations corresponding to Q_i,x, and f = b^d;
S440, if there exists K_i,xe ≤ M0, taking J_i,xe as the preset sample sentences in the task template E_i,x corresponding to G_i,x;
S450, if there is no K_i,xe ≤ M0, obtaining K_i,x0 = min{K_i,x1, K_i,x2, ..., K_i,xe, ..., K_i,xf} and the J_i,x0 corresponding to K_i,x0, and executing S460;
S460, acquiring a new sample, replacing the to-be-used sample in J_i,x0 that belongs to the same intermediate subtask as the new sample, and taking the result as the preset sample sentences in the task template E_i,x corresponding to G_i,x; wherein the character count of the to-be-used sample in J_i,x0 that belongs to the same subtask as the new sample, minus the character count of the new sample, is greater than or equal to K_i,x0 - M0, and the new sample is a newly added reference sample corresponding to any intermediate subtask.
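Steps S440 and S450 of this claim amount to a fit-or-fallback selection over the candidate combinations. The sketch below is illustrative only (`choose_combination` and its parameter names are hypothetical, not from the patent), with samples modeled as plain strings:

```python
def choose_combination(combos, m0):
    """Sketch of S440/S450: pick a to-be-used combination whose total
    character count K fits the input limit M0; if none fits, fall back
    to the minimal-count combination J_i,x0, which S460 would then
    shrink by swapping in a shorter new sample."""
    counts = [sum(len(s) for s in c) for c in combos]   # K_i,x
    fitting = [c for c, k in zip(combos, counts) if k <= m0]
    if fitting:
        return fitting[0], True       # S440: a fitting combination exists
    i = counts.index(min(counts))     # S450: minimal K_i,x0
    return combos[i], False           # caller proceeds to S460
```

The boolean return signals whether S460 is still needed; claim 5 refines the fitting case by selecting randomly among all fitting combinations rather than taking the first.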
5. The LLM-based multitasking data processing method of claim 4, wherein S440 further comprises: if the to-be-used combination list J_i,x contains a plurality of K_i,xe ≤ M0, marking the to-be-used combinations with K_i,xe ≤ M0 as candidate combinations, and randomly selecting one candidate combination as the preset sample sentences in the task template E_i,x corresponding to G_i,x.
6. The LLM-based multitasking data processing method of claim 4, wherein S440 further comprises: if there exists K_i,xe ≤ M0 and M0 - K_i,xe ≥ K0, where K0 is a preset gap threshold, executing the following steps:
S441, randomly extracting h intermediate samples meeting the preset requirements from R_i,x1 to R_i,xd, where h is initialized to 1; an intermediate sample meets the preset requirements when, in its corresponding preset vector, the positions corresponding to the intermediate samples contained in J_i,x are marked with the first identifier; the preset vector characterizes the conflict situation between any two reference samples, and the first identifier indicates that the two reference samples do not conflict;
S442, adding the h intermediate samples meeting the preset requirements to J_i,xe, and calculating the character count L_i,xe after the addition;
S443, if d+h=a0, or, M0-L i,xe < K0, execute S444; otherwise, h=h+1, S441 is executed, and A0 is the number of preset samples;
S444, taking J_i,xe with the h added intermediate samples meeting the preset requirements as the preset sample sentences in the task template E_i,x corresponding to G_i,x.
7. The LLM-based multitasking data processing method of claim 6, wherein 6 ≤ A0 ≤ 8.
8. The LLM-based multitasking data processing method of claim 7, wherein A0 = 7.
9. A non-transitory computer readable storage medium having at least one instruction or at least one program stored therein, wherein the at least one instruction or the at least one program is loaded and executed by a processor to implement the LLM based multitasking data processing method of any one of claims 1-8.
CN202311382804.8A 2023-10-24 2023-10-24 LLM-based multitasking data processing method and storage medium Active CN117112777B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311382804.8A CN117112777B (en) 2023-10-24 2023-10-24 LLM-based multitasking data processing method and storage medium

Publications (2)

Publication Number Publication Date
CN117112777A CN117112777A (en) 2023-11-24
CN117112777B (en) 2024-01-26

Family

ID=88798739

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311382804.8A Active CN117112777B (en) 2023-10-24 2023-10-24 LLM-based multitasking data processing method and storage medium

Country Status (1)

Country Link
CN (1) CN117112777B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117370638B * 2023-12-08 2024-05-07 Aerospace Information Research Institute, Chinese Academy of Sciences Method and device for decomposing and scheduling basic model task with enhanced thought diagram prompt

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116431316A * 2023-06-06 2023-07-14 Alibaba (China) Co., Ltd. Task processing method, system, platform and automatic question-answering method
CN116541497A * 2023-04-25 2023-08-04 Baidu Times Network Technology (Beijing) Co., Ltd. Task type dialogue processing method, device, equipment and storage medium
CN116594757A * 2023-07-18 2023-08-15 Shenzhen Xumi Yuntu Space Technology Co., Ltd. Method and device for executing complex tasks by using large language model
CN116756579A * 2023-08-22 2023-09-15 Tencent Technology (Shenzhen) Co., Ltd. Training method of large language model and text processing method based on large language model
CN116821290A * 2023-05-31 2023-09-29 Institute of Automation, Chinese Academy of Sciences Multitasking dialogue-oriented large language model training method and interaction method
CN116861921A * 2023-07-10 2023-10-10 Xiamen University Robot task analysis method and device based on large language model and readable medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11989527B2 (en) * 2021-08-24 2024-05-21 Unlikely Artificial Intelligence Limited Computer implemented methods for the automated analysis or use of data, including use of a large language model
CN115796299A * 2021-10-01 2023-03-14 Google LLC Transparent and controllable human-intelligent interaction via a chain of machine-learned language models


Non-Patent Citations (2)

Title
Adversarial GLUE: a multi-task benchmark for robustness evaluation of language models; Boxin Wang et al.; arXiv:2111.02840v2 [cs.CL]; pp. 1-19 *
A joint model for Chinese event extraction based on multi-task learning; He Ruifang et al.; Journal of Software; Vol. 30, No. 4; pp. 1015-1030 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant