CN116976306A - Multi-model collaboration method based on large-scale language model - Google Patents
Multi-model collaboration method based on large-scale language model Download PDFInfo
- Publication number
- CN116976306A CN116976306A CN202310958947.2A CN202310958947A CN116976306A CN 116976306 A CN116976306 A CN 116976306A CN 202310958947 A CN202310958947 A CN 202310958947A CN 116976306 A CN116976306 A CN 116976306A
- Authority
- CN
- China
- Prior art keywords
- model
- task
- models
- tasks
- small
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/04—Inference or reasoning models
Abstract
The application discloses a multi-model collaboration method based on a large-scale language model, belonging to the technical field of artificial intelligence. The method comprises the steps of model preparation, task understanding and decomposition, model matching, subtask execution, and iterative task feedback.
Description
Technical Field
The application relates to the technical field of artificial intelligence, and in particular to a multi-model collaboration method based on a large-scale language model.
Background
LLM: large Language Model (large-scale language model); langChain: langChain is a framework based on Large Language Models (LLMs) and is intended to provide a generic interface for the development of LLMs applications; small model: the deep learning model of non-LLM has the general parameters of less than or equal to one hundred million, can process data of various modes such as images, texts and the like, and can only process one data in a small model, so that a specific problem is solved; ebedding: mapping the original data into a semantic space, wherein the purpose is that in the semantic space, quantized values can be used for judging semantic similarity between texts;
the large-scale language model LLM refers to a model capable of automatically predicting the next word in natural language, such as GPT-4, GLM, etc. Because of the self-learning capability of LLMs, the LLMs can learn on the basis of a large amount of data without manual intervention, can continuously improve the performance of the LLMs, and can generate high-quality natural language texts, thus being very useful for tasks such as natural language generation, machine translation and the like. But current inputs and outputs are limited to text only.
The closest existing scheme is as follows: LangChain is a framework based on large language models (LLMs) that aims to provide a generic interface for developing LLM applications, simplify development, and help developers quickly build complex LLM applications; when indexing text data, it supports calling a third-party model to compute an embedding mapping of the text.
Current LLM technology has achieved great success, but it has drawbacks: LLMs have difficulty processing complex information such as images and speech. In addition, some complex tasks require multiple models working together, which is beyond the capability of a single LLM. And although LLMs show excellent zero-shot and few-shot results, they are still less effective than certain expert models (e.g., fine-tuned models). To handle such complex AI tasks, LLMs should cooperate with external models; however, LangChain can currently only call other language models to map text data when indexing it, and when a complex task is encountered there is no way to automatically call other non-language models for the corresponding processing.
Disclosure of Invention
The purpose of the application is to provide a multi-model collaboration method based on a large-scale language model, solving the problem in the prior art that, when a complex task is encountered, other non-language models cannot be called automatically to perform the corresponding processing.
To achieve the above purpose, the application provides the following technical solution: a multi-model collaboration method based on a large-scale language model, comprising the following steps:
s1, preparing a model, wherein the model comprises a large model and a model library, and the model library comprises a deep learning small model capable of processing each mode;
s2, task understanding and disassembling, namely after a request is received, splitting the request into a series of structured task sequences, and identifying the dependency relationship and execution sequence among the tasks;
s3, matching the model, namely after resolving a task list in S2, matching subtasks with the small models, firstly, acquiring text descriptions of the small models, transmitting the descriptions to the LLM model, enabling the model to understand the capacity of the small models in terms of semantics through task setting and prompting, particularly the input and output requirements and limitations of the model, then dynamically selecting the models by using context task allocation, and adding the tasks after user inquiry and resolution into prompt information to select the small model most suitable for the task;
s4, executing subtasks, carrying out reasoning calculation of the small models on heterogeneous reasoning terminals for calculation stability, and enabling the small models to quickly calculate on the reasoning terminals and return results of the corresponding subtasks by adapting devices of different architectures;
s5, performing task feedback iteration, enabling the LLM to judge whether the sub-tasks disassembled in the S2 are completed or not according to the returned results of the sub-tasks, combining the results of the sub-tasks according to the context, and if the problem of the total task is not solved, continuing to judge the total problem by repeating the mode of disassembling and calling the small model until the LLM judges that the total problem is solved, and returning part of intermediate processes and final results.
Preferably, the large model comprises GPT4, GLM and VICUNA, and the deep learning small models are used to process data of various modalities such as images and text.
Preferably, in step S2, after the LLM is prompted to understand the task, the task type and data-format type are determined semantically, the task requirements are identified from the input, the data and scene information involved in the task are extracted, and the task is then split, according to its type, into a planning-related input and output, where the input is the user's request and the output is the desired task sequence.
Preferably, step S2 further comprises analyzing the dependency information between tasks: by understanding the logical relationships between tasks, the execution order and resource dependencies are determined.
Preferably, in step S3, because the prompt has a limited length, not all model information can be added to it; the models are therefore filtered by subtask type, the remaining small models are ranked by semantic matching degree, and the top K models are selected as candidate models for the subtask.
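The filter-rank-truncate selection described above can be sketched as follows. The model records, type tags, and scores are invented for illustration, and `match_score` stands in for the semantic matching degree an embedding comparison would produce:

```python
def shortlist(subtask_type, models, match_score, k=3):
    # Filter by declared task type, rank by semantic matching degree,
    # and keep only the top K so the candidate descriptions fit
    # within the prompt's length budget.
    eligible = [m for m in models if subtask_type in m["types"]]
    eligible.sort(key=lambda m: match_score(m), reverse=True)
    return eligible[:k]

# Hypothetical model-library entries and matching scores.
models = [
    {"name": "ocr-a", "types": {"ocr"}, "desc": "reads printed text"},
    {"name": "ocr-b", "types": {"ocr"}, "desc": "reads handwriting"},
    {"name": "asr-a", "types": {"asr"}, "desc": "transcribes speech"},
]
scores = {"ocr-a": 0.9, "ocr-b": 0.7, "asr-a": 0.2}
top = shortlist("ocr", models, lambda m: scores[m["name"]], k=2)
assert [m["name"] for m in top] == ["ocr-a", "ocr-b"]
```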
Compared with the prior art, the application has the beneficial effects that:
1) The application can not only call other language models to map text data, but also automatically call other non-language models according to the task, so that it can process not only text data and tasks but also image and speech data and tasks.
2) The application enables the large language model to cooperate with other models to automatically complete specific complex tasks, solving problems that a large language model alone cannot solve and improving task-solving efficiency.
Drawings
FIG. 1 is a schematic flow chart of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the application; all other embodiments obtained by those skilled in the art based on these embodiments without inventive effort fall within the scope of the application.
Embodiment:
Referring to FIG. 1, the application provides the following technical solution: a multi-model collaboration method based on a large-scale language model, comprising the following steps:
s1, preparing a model, wherein one part is a large model of the current mainstream, such as GPT4, GLM, VICUNA and the like, the other part is a model library, and the model library comprises deep learning small models capable of processing various modes, wherein the parameters are generally less than one hundred million, and can process data of various modes, such as images, texts and the like, and generally one small model can only process one type of data, so that a specific problem, such as OCR or speech synthesis, speech recognition and the like, is solved;
s2, task understanding and disassembling, namely splitting the task into a series of structured task sequences after receiving a request, identifying the dependency relationship and the execution sequence among the tasks, judging the type of the task and the type of a data format through semantics after the LLM carries out corresponding understanding on the tasks through prompts, identifying the task requirement from input, extracting data and scene information related to the task, splitting the task into a group of input and output related to task planning according to the type of the task, wherein the input is a request of a user, the output is an expected task sequence, analyzing the information of the dependency relationship among the tasks, and determining the execution sequence and resource dependence through understanding the logic relationship among the tasks;
s3, model matching, namely after a task list is analyzed in S2, matching subtasks with small models, firstly, acquiring text descriptions of the small models, transmitting the descriptions to the LLM models, enabling the models to understand the capacity of the small models in terms of semantics through task setting and prompting, particularly the input and output requirements and limitations of the models, then dynamically selecting the models by using context task allocation, and adding tasks after user query and analysis into prompt information to select the small models most suitable for the task, wherein all model information cannot be added into the prompt information due to word number limitation of the prompt information, therefore, filtering the models according to subtask types, sequencing the residual small models according to semantic matching degree, and selecting the front K models as candidate models of the subtasks;
s4, executing subtasks, carrying out reasoning calculation of the small models on heterogeneous reasoning terminals for calculation stability, and enabling the small models to quickly calculate on the reasoning terminals and return results of the corresponding subtasks by adapting devices of different architectures;
s5, performing task feedback iteration, enabling the LLM to judge whether the sub-tasks disassembled in the S2 are completed or not according to the returned results of the sub-tasks, combining the results of the sub-tasks according to the context, and if the problem of the total task is not solved, continuing to judge the total problem by repeating the mode of disassembling and calling the small model until the LLM judges that the total problem is solved, and returning part of intermediate processes and final results.
While the fundamental and principal features and advantages of the application have been shown and described, it will be apparent to those skilled in the art that the application is not limited to the details of the foregoing exemplary embodiments, and may be embodied in other specific forms without departing from its spirit or essential characteristics. The present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are intended to be embraced therein.
Although embodiments of the present application have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the application, the scope of which is defined by the appended claims and their equivalents.
Claims (5)
1. A multi-model collaboration method based on a large-scale language model, characterized in that the method comprises the following steps:
s1, preparing a model, wherein the model comprises a large model and a model library, and the model library comprises a deep learning small model capable of processing each mode;
s2, task understanding and disassembling, namely after a request is received, splitting the request into a series of structured task sequences, and identifying the dependency relationship and execution sequence among the tasks;
s3, matching the model, namely after resolving a task list in S2, matching subtasks with the small models, firstly, acquiring text descriptions of the small models, transmitting the descriptions to the LLM model, enabling the model to understand the capacity of the small models in terms of semantics through task setting and prompting, particularly the input and output requirements and limitations of the model, then dynamically selecting the models by using context task allocation, and adding the tasks after user inquiry and resolution into prompt information to select the small model most suitable for the task;
s4, executing subtasks, carrying out reasoning calculation of the small models on heterogeneous reasoning terminals for calculation stability, and enabling the small models to quickly calculate on the reasoning terminals and return results of the corresponding subtasks by adapting devices of different architectures;
s5, performing task feedback iteration, enabling the LLM to judge whether the sub-tasks disassembled in the S2 are completed or not according to the returned results of the sub-tasks, combining the results of the sub-tasks according to the context, and if the problem of the total task is not solved, continuing to judge the total problem by repeating the mode of disassembling and calling the small model until the LLM judges that the total problem is solved, and returning part of intermediate processes and final results.
2. The multi-model collaboration method based on a large-scale language model of claim 1, wherein the large model comprises GPT4, GLM and VICUNA, and the deep learning small models are used to process data of various modalities such as images and text.
3. The multi-model collaboration method based on a large-scale language model of claim 1, wherein in step S2, after the LLM is prompted to understand the task, the task type and data-format type are determined semantically, the task requirements are identified from the input, the data and scene information involved in the task are extracted, and the task is split, according to its type, into a planning-related input and output, where the input is the user's request and the output is the desired task sequence.
4. The multi-model collaboration method based on a large-scale language model of claim 3, wherein step S2 further comprises analyzing the dependency information between tasks, and determining the execution order and resource dependencies by understanding the logical relationships between tasks.
5. The multi-model collaboration method based on a large-scale language model of claim 1, wherein in step S3, because the prompt has a limited length, not all model information can be added to it; the models are therefore filtered by subtask type, the remaining small models are ranked by semantic matching degree, and the top K models are selected as candidate models for the subtask.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310958947.2A CN116976306A (en) | 2023-08-01 | 2023-08-01 | Multi-model collaboration method based on large-scale language model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116976306A true CN116976306A (en) | 2023-10-31 |
Family
ID=88482791
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310958947.2A Pending CN116976306A (en) | 2023-08-01 | 2023-08-01 | Multi-model collaboration method based on large-scale language model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116976306A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117196546A (en) * | 2023-11-08 | 2023-12-08 | 杭州实在智能科技有限公司 | RPA flow executing system and method based on page state understanding and large model driving |
CN117217238A (en) * | 2023-11-09 | 2023-12-12 | 成都理工大学 | Intelligent interaction system and method based on large language model |
CN117217238B (en) * | 2023-11-09 | 2024-01-30 | 成都理工大学 | Intelligent interaction system and method based on large language model |
CN117633174A (en) * | 2023-11-22 | 2024-03-01 | 北京万物可知技术有限公司 | Voting consensus system based on multiple large model conversations |
CN117420760A (en) * | 2023-11-24 | 2024-01-19 | 东莞市新佰人机器人科技有限责任公司 | Multi-mode control algorithm fusion method suitable for autonomous cooperation of robot |
CN117311949A (en) * | 2023-11-28 | 2023-12-29 | 三六零数字安全科技集团有限公司 | Task security operation method, device, equipment and storage medium |
CN117370638A (en) * | 2023-12-08 | 2024-01-09 | 中国科学院空天信息创新研究院 | Method and device for decomposing and scheduling basic model task with enhanced thought diagram prompt |
CN117370638B (en) * | 2023-12-08 | 2024-05-07 | 中国科学院空天信息创新研究院 | Method and device for decomposing and scheduling basic model task with enhanced thought diagram prompt |
CN117649129A (en) * | 2023-12-25 | 2024-03-05 | 宏景科技股份有限公司 | Multi-agent cooperative system and strategy method suitable for industrial digitization |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||