CN116975192A - Large language model operation chain device and method based on tree structure - Google Patents

Large language model operation chain device and method based on tree structure

Info

Publication number
CN116975192A
Authority
CN
China
Prior art keywords
sub
model
language model
module
tree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310898633.8A
Other languages
Chinese (zh)
Inventor
郭红森
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Shuheng Information Technology Co ltd
Original Assignee
Shanghai Shuheng Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Shuheng Information Technology Co ltd filed Critical Shanghai Shuheng Information Technology Co ltd
Priority to CN202310898633.8A
Publication of CN116975192A
Legal status: Pending


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31Indexing; Data structures therefor; Storage structures
    • G06F16/316Indexing structures
    • G06F16/322Trees
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals

Abstract

The invention relates to a tree-structure-based large language model operation chain device and method. The device comprises: a hierarchical structure processing module, which decomposes a large-scale language model into a plurality of sub-models and organizes them according to a tree structure; a context modeling module, which extracts context information from input data and determines the sub-model path to be computed from the extracted context information; a computing resource allocation module, which receives the sub-model path determined by the context modeling module, computes the resource requirements of that path, and dynamically allocates computing resources to the computation; and a model parameter storage module, which stores the parameters of the large language model and of its sub-models. Compared with the prior art, the invention achieves efficient operation of a large-scale language model under limited computing resources, reduces wasted computing resources, maintains high prediction accuracy, and addresses the shortcomings of the prior art in computing resource requirements, context modeling, computation speed, and other aspects.

Description

Large language model operation chain device and method based on tree structure
[ technical field ]
The invention relates to the technical field of language processing, in particular to a large language model operation chain device and method based on a tree structure.
[ background Art ]
In the prior art, the operation of large-scale language models mainly depends on high-performance computing devices such as GPUs and TPUs. These devices support the complex parallel computation required for the high-speed multiply-add operations over a large-scale language model's weight matrices. However, the prior art mainly runs language models with a single-level structure and a fixed computing resource allocation scheme. This approach has the following problems and drawbacks:
(1) Computing resource requirements grow with model size: as the scale of a language model expands, its computing resource demand grows exponentially; once the model reaches a certain scale, traditional computing devices such as GPUs and TPUs may no longer meet the operational requirements; moreover, the scale of large-scale language models is expected to keep growing as the accuracy requirements of various natural language processing tasks increase.
(2) Fixed computing resource allocation: existing large-scale language model operation techniques generally adopt a fixed strategy when allocating computing resources; for language model applications under different scenarios and requirements, fixed allocation can lead to wasted resources and performance bottlenecks.
(3) Lack of context modeling: the prior art does not fully exploit context information when computing a large-scale language model; modeling the context information can effectively reduce the amount of computation and improve computation speed; in addition, for specific tasks or input data, context modeling helps filter out irrelevant information and improves the accuracy of inference results.
(4) High computational complexity: in the prior art, the operational complexity of a large-scale language model is high, because the multiply-add operations over the weight matrices cause the amount of computation to grow exponentially as depth and width increase; this poses challenges for computing resource demand and performance optimization.
In summary, the prior art cannot effectively solve the problem of large-scale language model inference speed under limited computing resources; providing a tree-structure-based large language model operation chain device that overcomes the shortcomings of the prior art in computing resource requirements, context modeling, computation speed, and other aspects is therefore of great significance.
[ summary of the invention ]
The invention aims to overcome the above drawbacks by providing a tree-structure-based large language model operation chain device that achieves efficient operation of a large-scale language model under limited computing resources, reduces wasted computing resources, maintains high prediction accuracy, offers good usability and extensibility, and addresses the shortcomings of the prior art in computing resource requirements, context modeling, computation speed, and other aspects.
The tree-structure-based large language model operation chain device comprises a hierarchical structure processing module, a context modeling module, a computing resource allocation module, and a model parameter storage module. The hierarchical structure processing module is used for decomposing a large-scale language model into a plurality of sub-models and organizing them according to a tree structure; the context modeling module is used for extracting context information from input data and determining the sub-model path to be computed from the extracted context information; the computing resource allocation module is used for receiving the sub-model path determined by the context modeling module, computing the resource requirements of that path, and dynamically allocating computing resources for the computation; and the model parameter storage module is used for storing the parameters of the large language model and of its sub-models.
Further, in the hierarchical structure processing module, each node of the tree structure corresponds to one sub-model, and the connections between nodes represent the associations between sub-models.
Further, in the context modeling module, the constraints imposed by the context information are used to reduce the amount of computation and to filter out irrelevant nodes.
Further, in the model parameter storage module, model parameters are updated through the training process and are used to calculate output results during the reasoning process.
The invention also provides a tree-structure-based large language model operation chain method, comprising the following steps: (1) input data is first sent to the context modeling module, which extracts context information and determines the sub-model path to be computed; (2) the computing resource allocation module dynamically allocates computing resources according to the resource requirements of the sub-model path, enabling parallel computation among the sub-models; (3) the input data and the resource allocation result are passed to the hierarchical structure processing module, which decomposes the large-scale language model into a plurality of sub-models organized in a tree structure, and the sub-model path is then computed efficiently; (4) after the computation is completed, the computation result is processed to form the predicted output result.
Further, in step (4), after the computation is completed, the parameter storage module is used to store the parameters of the large language model and of each of its sub-models, so that output results can be calculated during the reasoning process.
Further, in step (1), the context modeling module is designed to extract key information from the input data, and a pre-trained classifier or clustering algorithm determines the sub-model path to be computed from the extracted information.
Further, when a question to be answered is input, the context modeling module extracts useful information from the question and determines the computational sub-model path; after the computing resource allocation module dynamically allocates resources, the required sub-models are computed in the tree structure, and finally the corresponding answer is generated from a knowledge base or other data sources.
Further, when speech exercise data to be corrected is input, the context modeling module extracts useful information from the data and determines the computational sub-model path; after the computing resource allocation module dynamically allocates resources, the required sub-models are computed in the tree structure, and finally the score and feedback for the speech exercise are generated.
Further, when writing exercise text to be corrected is input, the context modeling module extracts useful information from the text and determines the computational sub-model path; after the computing resource allocation module dynamically allocates resources, the required sub-models are computed in the tree structure, and finally the score and feedback for the writing exercise are generated.
Compared with the prior art, the invention has the following advantages:
(1) Reduced computing resource requirements: the invention decomposes the large-scale language model into a plurality of sub-models through the hierarchical structure processing module, which reduces the scale of each individual model and thus its computing resource demand; the invention can therefore still operate efficiently under limited computing resources and adapts to devices of different performance levels.
(2) Improved computation speed: the invention uses the context modeling module to constrain the sub-model computation path, reducing the amount of computation and filtering out irrelevant sub-models; the computing resource allocation module allocates resources intelligently according to the input data and the sub-models' resource requirements, making full use of computing resources and thereby improving computation speed.
(3) High accuracy maintained: the invention organizes the sub-models in a tree structure, reducing computational complexity while maintaining high accuracy; in addition, the context modeling module filters out irrelevant information according to the input data, improving the model's prediction accuracy.
(4) Flexible adaptation to different scenarios: the invention adopts a modular design in which each module can be flexibly adjusted and optimized, making it suitable for language model applications under different scenarios and requirements; for example, the tree hierarchy in the hierarchical structure processing module may be optimized for specific task requirements, or the policy of the computing resource allocation module may be adjusted to match computing device performance.
(5) Environmental friendliness: by reducing computing resource demand, the invention also reduces energy consumption, contributing to green computing and environmental protection.
(6) Ease of use and extension: the structure and working principle of the invention are clear and easy to understand and implement; at the same time, the modular design gives the invention good extensibility and generality, so it can be applied to natural language processing, machine learning, and other related fields.
In summary, the invention achieves efficient operation of a large-scale language model under limited computing resources, reduces wasted computing resources, maintains high prediction accuracy, and offers good usability and extensibility; it therefore has broad application value and can be applied to many natural language processing tasks, such as chatbots, real-time translation, and knowledge graph construction.
[ description of the drawings ]
FIG. 1 is a schematic diagram of the structure of the present invention;
FIG. 2 is a schematic flow chart of the present invention.
[ detailed description of the preferred embodiments ]
The invention provides a large language model operation chain device and a large language model operation chain method based on a tree structure, which can effectively solve the problems and disadvantages in the aspects of calculation resource requirements, context modeling, calculation speed and the like in the prior art.
The invention is further described below with reference to the accompanying drawings:
As shown in FIG. 1, the invention mainly comprises a hierarchical structure processing module (module 1), a context modeling module (module 2), a computing resource allocation module (module 3), and a model parameter storage module (module 4). Specifically:
hierarchical processing module (module 1): the module is responsible for decomposing the large-scale language model into a plurality of sub-models and then organizing according to a tree structure. Each node of the tree structure corresponds to one sub-model, and the connection between the nodes represents the association between the sub-models. By decomposing and reorganizing the large-scale language model, the scale of the individual models is reduced, thereby reducing the computational resource requirements.
Context modeling module (module 2): this module extracts context information from the input data and determines the sub-model path to be computed based on the extracted context information. By exploiting the constraints of the context information, the amount of computation is reduced and irrelevant nodes are filtered out, which improves the accuracy of the prediction result.
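One way such path selection could work is sketched below, reusing the SubModelNode sketch above: children whose sub-models do not match the extracted context are pruned, so irrelevant nodes never enter the computation. The keyword-overlap score is a placeholder assumption standing in for whatever relevance measure an implementation actually uses.

```python
def select_path(node, context_keywords, threshold=0):
    """Walk the sub-model tree, keeping only children relevant to the context."""
    path = [node]
    for child in node.children:
        # Placeholder relevance score: keyword overlap with the sub-model id.
        score = sum(1 for kw in context_keywords if kw in child.submodel_id)
        if score > threshold:             # irrelevant nodes are filtered out here
            path.extend(select_path(child, context_keywords, threshold))
    return path
```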
Computing resource allocation module (module 3): this module receives the sub-model path determined by the context modeling module and dynamically allocates computing resources for the computation. Resources are intelligently adjusted and distributed according to the input data and the sub-models' resource requirements, making full use of the available computing resources and improving computation speed.
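A simplified allocation policy in this spirit is sketched below; the per-node cost estimate and the thread-pool sizing rule are assumptions for illustration, not the patent's prescribed strategy.

```python
from concurrent.futures import ThreadPoolExecutor
import os

def estimate_cost(node):
    # Assumed stand-in for a real per-sub-model cost model (FLOPs, memory, ...).
    return 1 + len(node.children)

def allocate_and_run(path, run_submodel, max_workers=None):
    """Size a worker pool to the selected path's estimated cost and run the
    sub-models in parallel; run_submodel evaluates one sub-model."""
    limit = max_workers or os.cpu_count() or 1
    total_cost = sum(estimate_cost(n) for n in path)
    workers = max(1, min(limit, total_cost))   # dynamic allocation: scale pool to demand
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_submodel, path))
```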
Model parameter storage module (module 4): this component is responsible for storing the parameters of the large language model and of its sub-models. Model parameters are updated through the training process and are used to calculate output results during the reasoning process.
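The storage module could be as simple as a keyed parameter table written during training and read back during inference; the dict-backed store below is a sketch under that assumption.

```python
import numpy as np

class ParameterStore:
    """Holds the full model's parameters and per-sub-model parameter slices."""
    def __init__(self):
        self._params = {}

    def save(self, key, tensor):
        self._params[key] = np.asarray(tensor)   # written/updated during training

    def load(self, key):
        return self._params[key]                 # read back during inference

store = ParameterStore()
store.save("params/vocab", np.random.randn(1000, 64))
```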
The specific working steps and working principles of the invention are as follows:
(1) First, the input data is fed into a context modeling module (module 2), which extracts the context information and determines the sub-model paths to be calculated.
(2) The computing resource allocation module (module 3) dynamically allocates computing resources according to the sub-model paths.
(3) The input data and the calculation resource allocation result are transmitted to a hierarchical processing module (module 1) to perform efficient operation on the sub-model path.
(4) After the computation is completed, the computation result is processed (e.g., by applying an activation function, normalization, and the like) to form the predicted output result.
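Tying the four steps together, an end-to-end pass could be sketched as follows, reusing the illustrative helpers above; extract_context, the summation of sub-model outputs, and the softmax postprocessing are assumptions, and run_submodel is an assumed callable that evaluates one sub-model node and returns a logit vector.

```python
import numpy as np

def extract_context(text):
    # Assumed toy extractor: lower-cased tokens serve as context keywords.
    return set(text.lower().split())

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

def run_chain(text, root, run_submodel):
    keywords = extract_context(text)                # step (1): context modeling
    path = select_path(root, keywords)              # step (1): sub-model path selection
    outputs = allocate_and_run(path, run_submodel)  # steps (2)-(3): allocate and compute
    logits = np.sum(outputs, axis=0)                # combine sub-model outputs
    return softmax(logits)                          # step (4): postprocess into a prediction
```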
Through this technical scheme, the invention realizes efficient operation of a large-scale language model while exploiting the advantages of the tree structure, so that the model maintains high accuracy while computational complexity is reduced. The invention is therefore applicable to natural language processing, machine learning, and other related fields.
The invention is further illustrated below in connection with specific examples:
example 1: intelligent question-answering system
The invention can be applied to construct an intelligent question-answering system that generates answers to questions posed by users. In this embodiment, the tree-structure-based large language model operation chain device is employed to realize efficient operation. The specific implementation process is as follows:
(1) The large-scale question-answering model is decomposed into a plurality of sub-models and organized according to a tree structure. Each sub-model represents a portion of the weights in the original model, such as vocabulary, grammar rules, domain knowledge, and the like.
(2) The context modeling module is designed to extract useful information, such as keywords, entities, and domain types, from the input question. A pre-trained classifier or clustering algorithm determines the sub-model path to be computed for the input question based on the extracted information (see the routing sketch after this list).
(3) The computing resource allocation module dynamically allocates computing resources according to the input data and the sub-models' resource requirements. According to task demands and computing device performance, parallel computation among different sub-models is realized in a multi-core processor environment.
(4) Parameters of the large language question-answering model and of its sub-models are stored, and are used in the reasoning process to calculate the output result.
(5) When a user inputs a question to be answered, the context modeling module extracts useful information from the question and determines the computational sub-model path. After the computing resource allocation module dynamically allocates resources, the required sub-models are computed in the tree structure, and finally the corresponding answer is generated from the system's knowledge base or other data sources.
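As an illustration of step (2), a lightweight router could map question features to a sub-model path; the keyword table, path labels, and fallback rule below are hypothetical, standing in for the pre-trained classifier or clustering algorithm.

```python
# Hypothetical keyword-to-sub-model routing table for the Q&A embodiment.
ROUTES = {
    "definition": ["root", "vocabulary"],
    "grammar":    ["root", "grammar"],
    "medical":    ["root", "domain"],
}

def route_question(question):
    tokens = question.lower().split()
    for keyword, path_ids in ROUTES.items():
        if keyword in tokens:
            return path_ids
    return ["root"]   # fall back to the root sub-model only

print(route_question("What is the definition of entropy?"))  # ['root', 'vocabulary']
```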
Therefore, applying the tree-structure-based large language model operation chain device to an intelligent question-answering system allows users' questions to be answered effectively under limited computing resources. In addition, this embodiment maintains high accuracy while reducing computational complexity and adopts a flexible computing resource allocation strategy, so it can be widely applied to various natural language processing scenarios.
Example 2: intelligent voice training scoring system
The invention can be applied to construct an intelligent speech training scoring system for evaluating and providing feedback on speech exercises submitted by a user. The specific implementation process is as follows:
(1) The large-scale speech scoring model is decomposed into a plurality of sub-models and organized according to a tree structure. Each sub-model represents a portion of the weights in the original model, such as pronunciation accuracy, speech speed, intonation, fluency, etc.
(2) The context modeling module is designed to extract key information, such as voice features, grammatical structure, and semantic information, from the input speech data. A pre-trained classifier or clustering algorithm determines the sub-model path to be computed from the extracted information.
(3) The computing resource allocation module dynamically allocates computing resources according to the input data and the sub-models' resource requirements. According to task demands and computing device performance, parallel computation among different sub-models is realized in a multi-core processor environment.
(4) Parameters of the large language speech scoring model and of its sub-models are stored, and are used in the reasoning process to calculate the output result.
(5) When the user inputs speech exercise data to be corrected, the context modeling module extracts useful information from the data and determines the computational sub-model path. After the computing resource allocation module dynamically allocates resources, the required sub-models are computed in the tree structure, and finally the score and feedback for the speech exercise are generated.
Therefore, with the intelligent speech training scoring system constructed by the invention, a user can obtain evaluation of and feedback on speech exercises in a short time, effectively improving learning and practice outcomes. Meanwhile, the invention achieves efficient operation under limited computing resources and reduces the system's running cost.
Example 3: athens writing correction system
The invention can be applied to the construction of an elegance (iles) writing modification system for evaluating and providing feedback on elegance writing exercises submitted by a user. The specific implementation process is as follows:
(1) The large-scale yawing writing correction model is decomposed into a plurality of sub-models and organized according to a tree structure. Each sub-model represents a portion of the weights in the original model, such as scoring criteria, grammar rules, vocabulary, etc.
(2) And the design context modeling module is used for extracting key information such as keywords, topics, entities and the like from the input Athens writing text. A pre-trained classifier or clustering algorithm determines the sub-model paths to be calculated according to the extracted information.
(3) The computing resource allocation module dynamically allocates computing resources according to the input data and the sub-model computing resource requirements. According to task demands and computing device performances, parallel computation among different sub-models is realized in the multi-core processor environment.
(4) The parameters of the large language elegance writing correction model and the sub-model parameters thereof are stored. Used in the reasoning process to calculate the output result.
(5) When the user enters the Athens writing exercise text to be modified, the context modeling module extracts useful information from the text and determines a computational submodel path. And after the computing resource allocation module dynamically allocates resources, computing a required submodel in the tree structure, and finally generating the score and feedback of the Athens writing exercise.
Therefore, by using the yawing writing correction system constructed by the invention, a user can obtain evaluation and feedback of the yawing writing exercise in a short time, so that the learning and exercise effects are effectively improved. Meanwhile, the invention realizes high-efficiency operation under the condition of limited computing resources, and reduces the running cost of the system.
The foregoing is merely a specific implementation of the embodiments of the present invention, but the protection scope of the embodiments of the present invention is not limited thereto; any changes or substitutions within the technical scope disclosed by the embodiments of the present invention shall be covered by their protection scope. Therefore, the protection scope of the embodiments of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A tree-structure-based large language model operation chain device, comprising:
the hierarchical structure processing module is used for decomposing the large-scale language model into a plurality of sub-models and then organizing according to the tree structure;
the context modeling module is used for extracting context information from input data and determining a sub-model path to be calculated according to the extracted context information;
the computing resource allocation module is used for receiving the sub-model path determined by the context modeling module, computing resource requirements according to the sub-model path, and dynamically allocating computing resources for the computing process;
and the model parameter storage module is used for storing parameters of the large language model and sub-model parameters thereof.
2. The tree-structure-based large language model operation chain device according to claim 1, wherein: in the hierarchical structure processing module, each node of the tree structure corresponds to one sub-model, and the connection between the nodes represents the association between the sub-models.
3. The tree-structure-based large language model operation chain device according to claim 1, wherein: in the context modeling module, the calculation amount is reduced by utilizing the constraint of the context information, and irrelevant nodes are filtered.
4. The tree-structure-based large language model operation chain device according to claim 1, wherein: in the model parameter storage module, model parameters are updated through a training process and are used for calculating an output result in a reasoning process.
5. The tree structure-based large language model operation chain method is characterized by comprising the following steps of:
(1) Firstly, input data is sent to a context modeling module, context information is extracted, and a sub-model path required to be calculated is determined;
(2) The computing resource allocation module dynamically allocates computing resources according to the path computing resource requirements of the sub-models, and realizes parallel computing among the sub-models;
(3) The input data and the calculation resource allocation result are transmitted to a hierarchical structure processing module, the hierarchical structure processing module is utilized to decompose the large-scale language model into a plurality of sub-models, the sub-models are organized according to a tree structure, and then efficient operation is carried out on paths of the sub-models;
(4) And after the calculation is completed, processing the calculation result to form a prediction output result.
6. The tree-structure-based large language model operation chain method according to claim 5, wherein: in step (4), after the calculation is completed, the parameter storage module is used for storing the parameters of the large language model and the parameters of each of its sub-models, so as to calculate the output result in the reasoning process.
7. The tree-structure-based large language model operation chain method according to claim 5, wherein: in step (1), the context modeling module is designed to extract key information from the input data, and a pre-trained classifier or clustering algorithm determines the sub-model path to be calculated according to the extracted information.
8. The tree-structure-based large language model operation chain method according to claim 5, wherein: when a question to be answered is input, the context modeling module extracts useful information from the question and determines the computational sub-model path; after the computing resource allocation module dynamically allocates resources, the required sub-models are computed in the tree structure, and finally the corresponding answer is generated from a knowledge base or other data sources.
9. The tree-structure-based large language model operation chain method according to claim 5, wherein: when speech exercise data to be corrected is input, the context modeling module extracts useful information from the data and determines the computational sub-model path; after the computing resource allocation module dynamically allocates resources, the required sub-models are computed in the tree structure, and finally the score and feedback for the speech exercise are generated.
10. The tree-structure-based large language model operation chain method according to claim 5, wherein: when writing exercise text to be corrected is input, the context modeling module extracts useful information from the text and determines the computational sub-model path; after the computing resource allocation module dynamically allocates resources, the required sub-models are computed in the tree structure, and finally the score and feedback for the writing exercise are generated.
CN202310898633.8A 2023-07-21 2023-07-21 Large language model operation chain device and method based on tree structure Pending CN116975192A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310898633.8A CN116975192A (en) 2023-07-21 2023-07-21 Large language model operation chain device and method based on tree structure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310898633.8A CN116975192A (en) 2023-07-21 2023-07-21 Large language model operation chain device and method based on tree structure

Publications (1)

Publication Number Publication Date
CN116975192A (en) 2023-10-31

Family

ID=88484380

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310898633.8A Pending CN116975192A (en) 2023-07-21 2023-07-21 Large language model operation chain device and method based on tree structure

Country Status (1)

Country Link
CN (1) CN116975192A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117349034A (en) * 2023-12-05 2024-01-05 创意信息技术股份有限公司 Hierarchical loading method and device for large language model
CN117349034B (en) * 2023-12-05 2024-02-23 创意信息技术股份有限公司 Hierarchical loading method and device for large language model

Similar Documents

Publication Publication Date Title
WO2021047286A1 (en) Text processing model training method, and text processing method and apparatus
Liang Learning executable semantic parsers for natural language understanding
Pichotta et al. Using sentence-level LSTM language models for script inference
CN112988785A (en) SQL conversion method and system based on language model coding and multitask decoding
CN110287482B (en) Semi-automatic participle corpus labeling training device
CN111274790B (en) Chapter-level event embedding method and device based on syntactic dependency graph
CN116975192A (en) Large language model operation chain device and method based on tree structure
CN107662617A (en) Vehicle-mounted interactive controlling algorithm based on deep learning
CN112000770A (en) Intelligent question and answer oriented sentence-to-sentence matching method based on semantic feature map
CN110378489A (en) Representation of knowledge learning model based on the projection of entity hyperplane
WO2021139233A1 (en) Method and apparatus for generating data extension mixed strategy, and computer device
Zhao et al. Synchronously improving multi-user English translation ability by using AI
CN113326367B (en) Task type dialogue method and system based on end-to-end text generation
Khayut et al. Modeling of intelligent system thinking in complex adaptive systems
CN110297894A (en) A kind of Intelligent dialogue generation method based on auxiliary network
CN116821307B (en) Content interaction method, device, electronic equipment and storage medium
CN116932776A (en) Knowledge graph-based large model knowledge updating method and device
CN110888944A (en) Attention convolution neural network entity relation extraction method based on multiple convolution window sizes
CN111104508A (en) Method, system and medium for representing word bag model text based on fault-tolerant rough set
CN115796187A (en) Open domain dialogue method based on dialogue structure diagram constraint
CN102270190B (en) Computer modeling and solving processing method of complex decision-making problem
Mok et al. Scaling understanding up to mental spaces
Li et al. Semantic understanding processing model based on machine learning
CN112905806B (en) Knowledge graph materialized view generator based on reinforcement learning and generation method
CN117056459B (en) Vector recall method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination