CN116976306A - Multi-model collaboration method based on large-scale language model - Google Patents

Multi-model collaboration method based on large-scale language model

Info

Publication number
CN116976306A
CN116976306A
Authority
CN
China
Prior art keywords
model
task
models
tasks
small
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310958947.2A
Other languages
Chinese (zh)
Inventor
Li Xiang (李翔)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhuhai Zhuohuan Technology Co ltd
Original Assignee
Zhuhai Zhuohuan Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhuhai Zhuohuan Technology Co ltd filed Critical Zhuhai Zhuohuan Technology Co ltd
Priority to CN202310958947.2A priority Critical patent/CN116976306A/en
Publication of CN116976306A publication Critical patent/CN116976306A/en
Pending legal-status Critical Current

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/04 Inference or reasoning models

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Machine Translation (AREA)

Abstract

The application discloses a multi-model collaboration method based on a large-scale language model, belonging to the technical field of artificial intelligence. The method comprises the steps of model preparation, task understanding and decomposition, model matching, subtask execution, and iterative task feedback.

Description

Multi-model collaboration method based on large-scale language model
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a multi-model collaboration method based on a large-scale language model.
Background
LLM: large Language Model (large-scale language model); langChain: langChain is a framework based on Large Language Models (LLMs) and is intended to provide a generic interface for the development of LLMs applications; small model: the deep learning model of non-LLM has the general parameters of less than or equal to one hundred million, can process data of various modes such as images, texts and the like, and can only process one data in a small model, so that a specific problem is solved; ebedding: mapping the original data into a semantic space, wherein the purpose is that in the semantic space, quantized values can be used for judging semantic similarity between texts;
A large-scale language model (LLM) is a model that can automatically predict the next word in natural language, such as GPT-4 or GLM. Because of their self-learning capability, LLMs can learn from large amounts of data without manual intervention and continuously improve their own performance, and they can generate high-quality natural language text, which makes them very useful for tasks such as natural language generation and machine translation. However, their current inputs and outputs are limited to text.
The closest existing scheme is as follows: LangChain is a framework based on large language models (LLMs) that aims to provide a generic interface for developing LLM applications, reduce the difficulty of application development, and help developers build complex LLM applications quickly; when indexing text data, it supports calling a third-party model to compute an embedding mapping of the text.
Current LLM technology has achieved great success, but it still has drawbacks: LLMs have difficulty processing complex information such as images and speech. In addition, some complex tasks require multiple models working in concert, which is beyond the capability of a single LLM. And although LLMs show excellent zero-shot and few-shot results, they are still not as effective as some expert models (e.g., fine-tuned models). To handle such complex AI tasks, LLMs should work in coordination with external models; however, LangChain can currently only call other language models to map text data when indexing it, and when a complex task is encountered there is no way to automatically call other non-language models to do the corresponding processing.
Disclosure of Invention
The application aims to provide a multi-model collaboration method based on a large-scale language model, which solves the problem in the prior art that, when a complex task is encountered, other non-language models cannot be automatically called to do the corresponding processing.
In order to achieve the above purpose, the present application provides the following technical solution: a multi-model collaboration method based on a large-scale language model, comprising the following steps:
S1, model preparation: prepare a large model and a model library, wherein the model library contains small deep learning models capable of processing each modality;
S2, task understanding and decomposition: after a request is received, split it into a series of structured task sequences and identify the dependency relationships and execution order among the tasks;
S3, model matching: after the task list from S2 has been resolved, match each subtask with a small model. First, obtain the text description of each small model and pass the descriptions to the LLM so that, through task setup and prompting, the LLM semantically understands each small model's capability, in particular its input and output requirements and limitations. Then select models dynamically using in-context task allocation: the user query and the tasks parsed from it are added to the prompt so that the small model best suited to each task is selected;
S4, subtask execution: for computational stability, the inference of the small models is performed on heterogeneous inference terminals; by adapting to devices of different architectures, each small model can compute quickly on its inference terminal and return the result of the corresponding subtask;
S5, task feedback iteration: the LLM judges from the returned subtask results whether the subtasks decomposed in S2 have been completed, and combines the subtask results according to the context; if the overall problem has not been solved, the decompose-and-call cycle over the small models is repeated until the LLM judges that the overall problem is solved, whereupon part of the intermediate process and the final result are returned.
Preferably, the large model comprises GPT4, GLM and VICUNA, and the small deep learning models are used to process data of various modalities such as images and text.
Preferably, in step S2, after the LLM has been prompted to understand the task, the type of the task and the type of the data format are determined semantically, the task requirements are identified from the input, and the data and scene information involved in the task are extracted; the task is then split, according to its type, into a set of inputs and outputs for task planning, where the input is the user's request and the output is the desired task sequence.
Preferably, step S2 further includes analyzing the dependency information between tasks: by understanding the logical relationships among the tasks, the execution order and resource dependencies are determined.
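As a rough illustration of such a structured task sequence, the sketch below represents subtasks with explicit dependencies and derives an execution order by topological sort. The SubTask fields and task-type names are illustrative assumptions, not a format fixed by the application.

```python
# Sketch of a structured task sequence with dependencies (field names and
# task types are illustrative, not prescribed by the application).
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # Python 3.9+

@dataclass
class SubTask:
    task_id: str
    task_type: str                       # e.g. "ocr", "speech_recognition"
    inputs: dict                         # data and scene information from the request
    depends_on: list = field(default_factory=list)

def execution_order(tasks: list[SubTask]) -> list[str]:
    """Derive the execution sequence from inter-task dependencies."""
    graph = {t.task_id: set(t.depends_on) for t in tasks}
    return list(TopologicalSorter(graph).static_order())

tasks = [
    SubTask("t1", "ocr", {"image": "invoice.png"}),
    SubTask("t2", "semantic_analysis", {"text": "<output of t1>"}, depends_on=["t1"]),
]
print(execution_order(tasks))  # ['t1', 't2']
```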
Preferably, in step S3, because the prompt is limited in length, not all model information can be added to it; the models are therefore filtered by subtask type, the remaining small models are ranked by semantic matching degree, and the top K models are selected as candidate models for the subtask.
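A minimal sketch of this filter-then-rank candidate selection follows. The model-library fields and the score function (a stand-in for embedding-based semantic matching) are assumptions made for illustration only.

```python
# Sketch of filter-then-rank candidate selection for a subtask.
# score() stands in for embedding-based semantic matching between the
# subtask and a model description; here a toy keyword count is used.

def top_k_candidates(models, subtask_type, score, k=3):
    """models: dicts with 'name', 'task_type' and 'description' fields."""
    filtered = [m for m in models if m["task_type"] == subtask_type]  # filter by type
    ranked = sorted(filtered, key=lambda m: score(m["description"]), reverse=True)
    return ranked[:k]                                                 # keep top K

model_library = [
    {"name": "ocr-small", "task_type": "ocr", "description": "reads printed text in images"},
    {"name": "tts-small", "task_type": "speech_synthesis", "description": "synthesizes speech"},
]
candidates = top_k_candidates(model_library, "ocr",
                              score=lambda d: d.count("text"), k=1)
print([m["name"] for m in candidates])  # ['ocr-small']
```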
Compared with the prior art, the application has the beneficial effects that:
1) The application can not only call other language models to map text data, but also automatically call other non-language models according to the task, so that it can process not only textual data and tasks but also image and speech data and tasks.
2) The application enables the large language model to collaborate with other models to automatically complete specific complex tasks, solving problems that current large language models cannot solve on their own and improving task-solving efficiency.
Drawings
FIG. 1 is a schematic flow chart of the present application.
Detailed Description
The following is a clear and complete description of the technical solutions in the embodiments of the present application with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the application. All other embodiments obtained by those skilled in the art based on the embodiments of the application without inventive effort fall within the scope of the application.
Examples:
Referring to FIG. 1, the present application provides the following technical solution: a multi-model collaboration method based on a large-scale language model, comprising the following steps:
S1, model preparation: one part is a current mainstream large model, such as GPT4, GLM or VICUNA; the other part is a model library containing small deep learning models capable of processing various modalities. These small models generally have fewer than one hundred million parameters; as a class they can handle data of various modalities such as images and text, but a single small model can usually process only one kind of data and therefore solves one specific problem, such as OCR, speech synthesis, or speech recognition;
S2, task understanding and decomposition: after a request is received, it is split into a series of structured task sequences, and the dependency relationships and execution order among the tasks are identified. After the LLM has been prompted to understand the tasks, the type of each task and the type of the data format are determined semantically, the task requirements are identified from the input, and the data and scene information involved in the task are extracted; the task is then split, according to its type, into a set of inputs and outputs for task planning, where the input is the user's request and the output is the expected task sequence. The dependency information between tasks is also analyzed, and the execution order and resource dependencies are determined by understanding the logical relationships among the tasks;
S3, model matching: after the task list from S2 has been analyzed, subtasks are matched with small models. First, the text description of each small model is obtained and passed to the LLM so that, through task setup and prompting, the LLM semantically understands each small model's capability, in particular its input and output requirements and limitations. Models are then selected dynamically using in-context task allocation: the user query and the tasks parsed from it are added to the prompt so that the small model best suited to each task is selected. Because the prompt is limited in length, not all model information can be added to it; the models are therefore filtered by subtask type, the remaining small models are ranked by semantic matching degree, and the top K models are selected as candidate models for the subtask;
S4, subtask execution: for computational stability, the inference of the small models is performed on heterogeneous inference terminals; by adapting to devices of different architectures, each small model can compute quickly on its inference terminal and return the result of the corresponding subtask;
S5, task feedback iteration: the LLM judges from the returned subtask results whether the subtasks decomposed in S2 have been completed, and combines the subtask results according to the context; if the overall problem has not been solved, the decompose-and-call cycle over the small models is repeated until the LLM judges that the overall problem is solved, whereupon part of the intermediate process and the final result are returned.
While the fundamental and principal features of the application and advantages of the application have been shown and described, it will be apparent to those skilled in the art that the application is not limited to the details of the foregoing exemplary embodiments, but may be embodied in other specific forms without departing from the spirit or essential characteristics thereof; the present embodiments are therefore to be considered in all respects as illustrative and not restrictive, the scope of the application being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.
Although embodiments of the present application have been shown and described, it will be understood by those skilled in the art that various changes, modifications, substitutions and alterations can be made therein without departing from the principles and spirit of the application, the scope of which is defined in the appended claims and their equivalents.

Claims (5)

1. A multi-model collaboration method based on a large-scale language model, characterized in that it comprises the following steps:
S1, model preparation: prepare a large model and a model library, wherein the model library contains small deep learning models capable of processing each modality;
S2, task understanding and decomposition: after a request is received, split it into a series of structured task sequences and identify the dependency relationships and execution order among the tasks;
S3, model matching: after the task list from S2 has been resolved, match each subtask with a small model. First, obtain the text description of each small model and pass the descriptions to the LLM so that, through task setup and prompting, the LLM semantically understands each small model's capability, in particular its input and output requirements and limitations. Then select models dynamically using in-context task allocation: the user query and the tasks parsed from it are added to the prompt so that the small model best suited to each task is selected;
S4, subtask execution: for computational stability, the inference of the small models is performed on heterogeneous inference terminals; by adapting to devices of different architectures, each small model can compute quickly on its inference terminal and return the result of the corresponding subtask;
S5, task feedback iteration: the LLM judges from the returned subtask results whether the subtasks decomposed in S2 have been completed, and combines the subtask results according to the context; if the overall problem has not been solved, the decompose-and-call cycle over the small models is repeated until the LLM judges that the overall problem is solved, whereupon part of the intermediate process and the final result are returned.
2. The multi-model collaboration method based on a large-scale language model of claim 1, wherein the large model comprises GPT4, GLM and VICUNA, and the small deep learning models are used to process data of various modalities such as images and text.
3. The multi-model collaboration method based on a large-scale language model of claim 1, wherein, in step S2, after the LLM has been prompted to understand the task, the type of the task and the type of the data format are determined semantically, the task requirements are identified from the input, and the data and scene information involved in the task are extracted; the task is then split, according to its type, into a set of inputs and outputs for task planning, where the input is the user's request and the output is the desired task sequence.
4. The multi-model collaboration method based on a large-scale language model of claim 3, wherein step S2 further includes analyzing the dependency information between tasks: by understanding the logical relationships among the tasks, the execution order and resource dependencies are determined.
5. The multi-model collaboration method based on a large-scale language model of claim 1, wherein, in step S3, because the prompt is limited in length, not all model information can be added to it; the models are therefore filtered by subtask type, the remaining small models are ranked by semantic matching degree, and the top K models are selected as candidate models for the subtask.
CN202310958947.2A 2023-08-01 2023-08-01 Multi-model collaboration method based on large-scale language model Pending CN116976306A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310958947.2A CN116976306A (en) 2023-08-01 2023-08-01 Multi-model collaboration method based on large-scale language model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310958947.2A CN116976306A (en) 2023-08-01 2023-08-01 Multi-model collaboration method based on large-scale language model

Publications (1)

Publication Number Publication Date
CN116976306A true CN116976306A (en) 2023-10-31

Family

ID=88482791

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310958947.2A Pending CN116976306A (en) 2023-08-01 2023-08-01 Multi-model collaboration method based on large-scale language model

Country Status (1)

Country Link
CN (1) CN116976306A (en)


Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117196546A (en) * 2023-11-08 2023-12-08 杭州实在智能科技有限公司 RPA flow executing system and method based on page state understanding and large model driving
CN117217238A (en) * 2023-11-09 2023-12-12 成都理工大学 Intelligent interaction system and method based on large language model
CN117217238B (en) * 2023-11-09 2024-01-30 成都理工大学 Intelligent interaction system and method based on large language model
CN117633174A (en) * 2023-11-22 2024-03-01 北京万物可知技术有限公司 Voting consensus system based on multiple large model conversations
CN117420760A (en) * 2023-11-24 2024-01-19 东莞市新佰人机器人科技有限责任公司 Multi-mode control algorithm fusion method suitable for autonomous cooperation of robot
CN117311949A (en) * 2023-11-28 2023-12-29 三六零数字安全科技集团有限公司 Task security operation method, device, equipment and storage medium
CN117370638A (en) * 2023-12-08 2024-01-09 中国科学院空天信息创新研究院 Method and device for decomposing and scheduling basic model task with enhanced thought diagram prompt
CN117370638B (en) * 2023-12-08 2024-05-07 中国科学院空天信息创新研究院 Method and device for decomposing and scheduling basic model task with enhanced thought diagram prompt
CN117649129A (en) * 2023-12-25 2024-03-05 宏景科技股份有限公司 Multi-agent cooperative system and strategy method suitable for industrial digitization

Similar Documents

Publication Publication Date Title
CN116976306A (en) Multi-model collaboration method based on large-scale language model
WO2020093761A1 (en) Entity and relationship joint extraction method oriented to software bug knowledge
CN109325040B (en) FAQ question-answer library generalization method, device and equipment
CN111414380B (en) Method, equipment and storage medium for generating SQL (structured query language) sentences of Chinese database
CN111930906A (en) Knowledge graph question-answering method and device based on semantic block
CN114547072A (en) Method, system, equipment and storage medium for converting natural language query into SQL
CN113011337B (en) Chinese character library generation method and system based on deep meta learning
CN116127020A (en) Method for training generated large language model and searching method based on model
CN112766990B (en) Intelligent customer service auxiliary system and method based on multi-round dialogue improvement
CN112527986A (en) Multi-round dialog text generation method, device, equipment and storage medium
CN114238373A (en) Method and device for converting natural language question into structured query statement
CN111930912A (en) Dialogue management method, system, device and storage medium
CN111460303A (en) Data processing method and device, electronic equipment and computer readable storage medium
CN112989829B (en) Named entity recognition method, device, equipment and storage medium
JP2023539225A (en) Selection and description of machine learning models for multidimensional datasets
CN113239698A (en) Information extraction method, device, equipment and medium based on RPA and AI
CN116680368B (en) Water conservancy knowledge question-answering method, device and medium based on Bayesian classifier
CN116186219A (en) Man-machine dialogue interaction method, system and storage medium
CN113468345B (en) Entity co-reference detection data processing system based on knowledge graph
CN114822726A (en) Construction method, analysis method, device, storage medium and computer equipment
CN114490974A (en) Automatic information reply method, device, system, electronic equipment and readable medium
CN111091011B (en) Domain prediction method, domain prediction device and electronic equipment
CN114997395A (en) Training method of text generation model, method for generating text and respective devices
CN114333795A (en) Speech recognition method and apparatus, computer readable storage medium
CN115114281A (en) Query statement generation method and device, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination