CN117009113A

CN117009113A - Method and device for calling artificial intelligent model, computer equipment and storage medium

Info

Publication number: CN117009113A
Application number: CN202311134066.5A
Authority: CN
Inventors: 谢鑫
Original assignee: Ping An Bank Co Ltd
Current assignee: Ping An Bank Co Ltd
Priority date: 2023-09-04
Filing date: 2023-09-04
Publication date: 2023-11-07

Abstract

The application relates to the field of artificial intelligence and financial science and technology, and discloses a calling method, a device, computer equipment and a storage medium of an artificial intelligence model, wherein the calling method comprises the following steps: receiving a user instruction, wherein the user instruction carries task description information of a target downstream task; generating first model input data according to the task description information; invoking a target artificial intelligent model, and taking the input data of the first model as the input of a target artificial intelligent model for executing a target downstream task; and obtaining the final output of the target artificial intelligent model for executing the target downstream task. The introduction tool greatly expands the potential functions of the artificial intelligent model, overcomes the defect that the artificial intelligent model cannot interact with the outside, breaks through the use limitation of the artificial intelligent model, ensures that the function expansion of software application based on artificial intelligent development is more flexible, and meets the requirements under different scenes. The performance and flexibility of the software application are improved, and the functions of the software application and the artificial intelligence model are enriched.

Description

Method and device for calling artificial intelligent model, computer equipment and storage medium

Technical Field

The present application relates to the field of artificial intelligence and financial technology, and in particular, to a method, an apparatus, a computer device, and a storage medium for invoking an artificial intelligence model.

Background

With the mature development of artificial intelligence technology, more and more artificial intelligence models are being applied in various software applications. For example, natural language processing, computer vision, graphics learning, and other artificial intelligence fields have been widely studied and used.

Artificial intelligence models, while powerful, still have some drawbacks. In particular, artificial intelligence models support little networking, cannot handle very long text, cannot calculate data accurately, cannot interact with external data in real time, and so on. These deficiencies result in significant limitations on the capabilities of the artificial intelligence model, as well as limitations on the use of the artificial intelligence model by applications developed based on artificial intelligence techniques.

Disclosure of Invention

The application mainly aims to provide a method, a device, computer equipment and a storage medium for calling an artificial intelligent model, which can solve the technical problem that software application developed based on an AI technology in the prior art has larger limitation on the calling of the artificial intelligent model.

To achieve the above object, a first aspect of the present application provides a method for invoking an artificial intelligence model, the method comprising:

receiving a user instruction, wherein the user instruction carries task description information of a target downstream task;

generating first model input data according to the task description information;

invoking a target artificial intelligent model, and taking the input data of the first model as the input of a target artificial intelligent model for executing a target downstream task;

obtaining the final output of the target artificial intelligent model for executing the target downstream task; or, calling a target tool selected by the target artificial intelligent model, and performing data interaction with the target artificial intelligent model to obtain the final output of the target artificial intelligent model for executing the target downstream task.

To achieve the above object, a second aspect of the present application provides an artificial intelligence model invoking device, including:

the instruction receiving module is used for receiving a user instruction, wherein the user instruction carries task description information of a target downstream task;

the first input generation module is used for generating first model input data according to the task description information;

the first calling module is used for calling the target artificial intelligent model and taking the input data of the first model as the input of the target artificial intelligent model for executing the target downstream task;

The first output receiving module is used for obtaining the final output of the target artificial intelligent model for executing the target downstream task; or, calling a target tool selected by the target artificial intelligent model, and performing data interaction with the target artificial intelligent model to obtain the final output of the target artificial intelligent model for executing the target downstream task.

To achieve the above object, a third aspect of the present application provides a computer-readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform the steps of:

To achieve the above object, a fourth aspect of the present application provides a computer apparatus including a memory and a processor, the memory storing a computer program which, when executed by the processor, causes the processor to perform the steps of:

The embodiment of the application has the following beneficial effects:

according to the application, model input data is generated according to task description information of the target downstream task, and the target artificial intelligent model is called to directly obtain final output, or a target tool selected by the target artificial intelligent model is called to assist the target artificial intelligent model to complete the target downstream task to obtain final output. By introducing tools and perfect functional support, the application greatly expands the functions and the potential performances of the artificial intelligent model, overcomes the defect that the artificial intelligent model cannot interact with the outside, breaks the use limitation of the artificial intelligent model, ensures that the function expansion of the software application developed based on the artificial intelligent model is more flexible, and meets the requirements under different scenes. The proper tool method can be intelligently selected through the artificial intelligence model, so that the performance, efficiency and flexibility of the software application are improved, and the functions of the software application and the artificial intelligence model are enriched. The threshold and cost of developing applications with artificial intelligence techniques is also reduced.

Drawings

In order to more clearly illustrate the embodiments of the application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

Wherein:

FIG. 1 is an application environment diagram of a calling method of an artificial intelligence model in an embodiment of the application;

FIG. 2 is a flowchart of a method for invoking an artificial intelligence model in an embodiment of the application;

FIG. 3 is a diagram illustrating the invocation of an artificial intelligence model in accordance with an embodiment of the present application;

FIG. 4 is a block diagram of a device for calling an artificial intelligence model according to an embodiment of the present application;

fig. 5 is a block diagram of a computer device in an embodiment of the application.

Detailed Description

The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.

FIG. 1 is an application environment diagram of a calling method of an artificial intelligence model in one embodiment. Referring to fig. 1, the invocation method of the artificial intelligence model is applied to an invocation system of the artificial intelligence model. The invocation system of the artificial intelligence model includes a terminal 110 and a server 120. The terminal 110 and the server 120 are connected through a network, and the terminal 110 may be a desktop terminal or a mobile terminal, and the mobile terminal may be at least one of a mobile phone, a tablet computer, a notebook computer, and the like. The server 120 may be implemented as a stand-alone server or as a server cluster composed of a plurality of servers. The terminal 110 is configured to generate a user instruction in response to a user operation, and send the user instruction to the server 120, where the server 120 is configured to receive the user instruction, and the user instruction carries task description information of a target downstream task; generating first model input data according to the task description information; invoking a target artificial intelligent model, and taking the input data of the first model as the input of a target artificial intelligent model for executing a target downstream task; obtaining the final output of the target artificial intelligent model for executing the target downstream task; or, calling a target tool selected by the target artificial intelligent model, and performing data interaction with the target artificial intelligent model to obtain the final output of the target artificial intelligent model for executing the target downstream task.

As shown in FIG. 2, in one embodiment, a method of invoking an artificial intelligence model is provided. The calling method of the artificial intelligent model specifically comprises the following steps:

s100: and receiving a user instruction, wherein the user instruction carries task description information of the target downstream task.

Specifically, artificial intelligence (Artificial Intelligence, abbreviated as AI) is a multi-domain interdisciplinary discipline for researching and developing theory, methods, techniques and application systems for simulating, extending and expanding human intelligence, and relates to multiple fields of computer science, neuroscience, psychology, linguistics, economics, mathematics, biology and the like. The research field of the method comprises machine learning, natural language processing, computer vision, robotics, intelligent control, intelligent decision making, intelligent searching, intelligent optimization, intelligent data analysis, intelligent modeling, intelligent computer aided design, intelligent computer aided education, intelligent computer aided system and the like. Machine learning is the core of artificial intelligence, which is a technique that allows a computer to automatically extract features from data and build models, thereby enabling automated learning and prediction.

The calling method of the artificial intelligence model can be applied to an application server corresponding to the local application. The application server receives, via the application client, user instructions for describing the purpose and desire of the user, which may include, but are not limited to, task description information carrying the target downstream task.

The target downstream task is an executable task which can be realized, migrated and emerged by the target artificial intelligence model to be called. The target downstream tasks that the target artificial intelligence model can perform are related to the type of the target artificial intelligence model.

For example, the target artificial intelligence model is a language model (e.g., large Language Model (LLM)), then the target downstream task may be at least one of a translation task, a text classification task, a question-answer task, a dialogue task, a summary generation task, a natural language generation task, a risk prediction task, dialogue grammar correction, text keyword extraction, text emotion recognition, parsing unstructured data into structured data (e.g., extracting a form of data from unstructured text), implementing code based on descriptions, interpretation code, optimization code, chat, role playing and chat, text expansion, and so forth.

As another example, the target artificial intelligence model is a speech model, and the target downstream task may be a speech recognition task, a tone recognition task, a speech synthesis task, or the like, without limitation.

As another example, the target artificial intelligence model is a visual model, and the target downstream task may be a picture classification task, a face recognition task, a video composition task, or the like, without limitation.

The task description information includes input data (which may include text, speech, pictures, etc.) to be input to the target artificial intelligence model, and may also include output examples or specified output formats, prompts or language descriptions for indicating model functions, etc., to which the present application is not limited.

S200: and generating first model input data according to the task description information.

Specifically, if the target artificial intelligence model already has functionality to perform the target downstream task, the first model input data includes input data of the target artificial intelligence model when performing the target downstream task.

If the target artificial intelligence model has a plurality of existing functions and some potential functions, but the target artificial intelligence model needs to be motivated to perform the target downstream task, the first model input data includes motivated data for motivating the target artificial intelligence model to perform the target downstream task, input data when the target artificial intelligence model performs the target downstream task, and the like.

If the target artificial intelligence model needs to assist in completing the target downstream task with the external tool, the first model input data includes descriptive data for instructing the target artificial intelligence model to make a target tool selection.

Thus, the first model input data includes at least one of excitation data for exciting a function of the target artificial intelligence model to perform a target downstream task, input data of the target artificial intelligence model when performing the target downstream task, description data for instructing the target artificial intelligence model to make a target tool selection.

Of course, the first model input data may also include other data, as the application is not limited in this regard.

S300: and calling the target artificial intelligent model, and taking the first model input data as input of the target artificial intelligent model for executing a target downstream task.

Specifically, the target artificial intelligence model is an artificial intelligence model or a machine learning model to be called, which is determined by an application server according to task description information of a target downstream task.

The target artificial intelligence model selected by the different target downstream task application servers may be different, as the application is not limited in this regard.

In addition, the application server can call various artificial intelligent models developed by different manufacturers through interfaces, so that popularization and use of the artificial intelligent models in software application are realized.

Of course, part or all of the artificial intelligence model may be integrated into the application server to provide the application server for local invocation. In summary, the application server may invoke either a local artificial intelligence model or an external artificial intelligence model.

In addition, the target artificial intelligence model is a model that has been trained in advance.

S400: obtaining the final output of the target artificial intelligent model for executing the target downstream task; or, calling a target tool selected by the target artificial intelligent model, and performing data interaction with the target artificial intelligent model to obtain the final output of the target artificial intelligent model for executing the target downstream task.

Specifically, if the target artificial intelligence model already has functionality to perform the target downstream task, the first model input data includes input data of the target artificial intelligence model when performing the target downstream task. And the target artificial intelligent model executes the target downstream task according to the input data of the first model to obtain the final output of the target downstream task.

If the target artificial intelligence model already has a function of executing the target downstream task, but the target artificial intelligence model can execute the target downstream task with the help of other tools, the first model input data includes input data when the target artificial intelligence model executes the target downstream task and description data for indicating the target artificial intelligence model to perform target tool selection.

The target artificial intelligence model selects target tools from given tools according to the description data of the target tool selection, and the selected target tools are used for an application server to call. And the application server and the target artificial intelligent model perform data interaction according to the input data when the target downstream task is executed, the calling result of the calling target tool, the intermediate output of the target artificial intelligent model and the like. The method is equivalent to the fact that the application server is used as a bridge to indirectly realize the calling of the artificial intelligent model to the tool, the function of the artificial intelligent model is expanded, and the limitation of using the artificial intelligent model is broken.

And the final target artificial intelligent model indirectly calls a target tool through an application server to assist in executing the target downstream task, and final output of the target downstream task is obtained.

If the target artificial intelligence model needs to be motivated to execute the function of the target downstream task, the first model input data comprises motivated data for motivating the target artificial intelligence model to execute the function of the target downstream task and input data when the target artificial intelligence model executes the target downstream task.

With the continued development of new models, large-scale artificial intelligence models have emerged with a number of unexpected capabilities beyond researchers. The unpredictable ability for these to occur on small models, but on large models, is called the emerging capability. Emerging capabilities (EmerrgentCapabilities) refer to the ability of a model to process specific tasks without training for those tasks, and can sometimes be understood as model potential. For example, the large language model (Large Language Model, abbreviated as LLM) does not accept dialogue corpus training, so dialogue capability is also a representation of the emerging capability of the large language model, and common translation, programming, reasoning, semantic understanding and the like belong to the emerging capability of the large language model, and the large language model has great technical potential.

And the target artificial intelligent model determines the background and identity of the target artificial intelligent model for realizing the target downstream task according to the excitation data, so as to excite the capability of the target artificial intelligent model for realizing the target downstream task. And then, the target artificial intelligence model executes the target downstream task according to the input data to obtain the final output of the target downstream task.

If the target artificial intelligence model needs to be stimulated to execute the function of the target downstream task, and the target artificial intelligence model needs other tool assistance to execute the target downstream task. The first model input data includes motivation data for motivating the target artificial intelligence model to perform the function of the target downstream task, input data for the target artificial intelligence model to perform the target downstream task, and description data for directing the target artificial intelligence model to perform target tool selection.

And the target artificial intelligent model determines the background and identity of the target artificial intelligent model for realizing the target downstream task according to the excitation data, so as to excite the capability of the target artificial intelligent model for realizing the target downstream task.

The calling scheme of the artificial intelligence model can be applied to various application scenes, particularly in the field of financial science and technology, such as banking business, with various business types, such as: face recognition, identity recognition, user risk assessment, identity card information extraction, image text recognition extraction, video text recognition extraction and the like, and various application scenes need artificial intelligent models. The scheme of the application can greatly promote the expansion of the functions of the artificial intelligent model, maximize the utilization of the potential functions of the artificial intelligent model, and enable the function expansion of the software application to be more flexible.

According to the embodiment, model input data are generated according to task description information of a target downstream task, and a target artificial intelligent model is called to directly obtain final output, or a target tool selected by the target artificial intelligent model is called to assist the target artificial intelligent model to complete the target downstream task to obtain final output. According to the embodiment, through introducing tools and perfect functional support, the functions and the potential performances of the artificial intelligent model are greatly expanded, the defect that the artificial intelligent model cannot interact with the outside is overcome, the use limitation of the artificial intelligent model is broken, the function expansion of software application developed based on the artificial intelligent model is more flexible, and the requirements under different scenes are met. The proper tool method can be intelligently selected through the artificial intelligence model, so that the performance, efficiency and flexibility of the software application are improved, and the functions of the software application and the artificial intelligence model are enriched. The threshold and cost of developing applications with artificial intelligence techniques is also reduced.

In one embodiment, before obtaining the final output of the target artificial intelligence model to perform the target downstream task in step S400, the method further comprises:

the capability of the target artificial intelligent model for processing the target downstream task is stimulated through prompt engineering;

or,

and exciting the capability of the target artificial intelligent model to process the target downstream task by fine-tuning the target artificial intelligent model in advance.

Specifically, the target artificial intelligence model is a large model pre-trained (pre-training) through a large, diverse data set. For example, the target artificial intelligence model is one of a language pre-training model (e.g., a Large Language Model (LLM)), an image pre-training model, a visual pre-training model (e.g., a model built based on ViT (Vision Transformer)), and the like, as the application is not limited in this regard.

Taking the example of a large language model, during the pre-training phase, the large language model learns from a large, diverse dataset, typically containing billions of words from different sources, such as websites, books, and articles. This stage allows large language models to learn general language patterns and characterizations.

Downstream tasks of the language pre-training model include, but are not limited to, translation, text classification, question-answering, dialogue, abstract generation tasks, natural language generation, and the like.

The downstream tasks of the image pre-training model or the visual pre-training model include, but are not limited to, image recognition, document image classification, document layout analysis, form detection, text detection, and the like.

Before invoking the target artificial intelligence model to perform the target downstream task, it is necessary to ensure that the target artificial intelligence model has the ability to perform the target downstream task. The present embodiment provides two alternative methods of hint engineering (prompt engineering) and fine-tuning (fine-tuning) to motivate the ability of the target artificial intelligence model to perform target downstream tasks.

Prompt engineering refers to the ability to trigger the emergence of a model by designing special prompts. This approach does not require additional training of the model, but only requires the model to be guided through appropriate prompts to accomplish specific tasks. Prompt engineering is typically used to quickly solve new problems without updating model parameters. By giving a reasonable hint (e.g., a natural language instruction), the guided model performs an efficient outcome output without updating model parameters, essentially a way to guide and excite model capabilities.

Taking a Large Language Model (LLM) as an example, the performance of the model in a downstream task can be improved by simply conditioning based on a small number of input and output examples in the large language model to form text that is output by the guided model. Unlike traditional supervised learning, this approach does not require large amounts of labeling data, but only small amounts of example data to allow the model to complete downstream tasks. How to design context-learning prompt text is a key to applying a large-scale language model, which is called prompt engineering (Prompt Engineering). The core idea of hint engineering is to design a set of valid hints (promts) to guide the AI model to output content that meets a particular condition or requirement. The method has become an important technology in the AI field in recent years, has wide application prospect, and is applied to a plurality of industries and scenes, such as intelligent customer service, automatic writing, virtual assistants and the like.

Fine tuning refers to additional training for specific tasks based on a pre-trained large model. This approach requires additional training of the model, but may improve the performance of the model on a particular task. Fine tuning can be used to solve some problems that cannot be solved by prompt engineering. It modifies the model part parameters by inputting additional samples, thereby strengthening certain part of the model capability. Is also essentially a method of guiding and exciting the capabilities of the model. In the fine tuning stage, the model is further trained on a more specific, smaller data set associated with the target task or domain, which helps the model adapt to the specific requirements of the task.

According to the embodiment, specific functions of the target artificial intelligent model are excited through prompt engineering or fine tuning, so that the artificial intelligent model has the capability of executing a target downstream task, the emerging capability and the technical potential of the artificial intelligent model are excited, and the use limit of the artificial intelligent model is reduced. The two excitation methods can be selected to supplement each other, so that the functions and infinite potential of the artificial intelligent model can be more comprehensively mined.

In one embodiment, generating the first model input data according to the task description information in step S200 includes:

And constructing a target prompt word according to the task description information and the prompt word template, and generating first model input data according to the target prompt word.

Specifically, the present embodiment selects a hint engineering to motivate the ability of the target artificial intelligence model to process tasks downstream of the target.

The first model input data includes motivation data for motivating a target artificial intelligence model to perform a function of a target downstream task. More specifically, the motivational data of the first model input data includes target prompters constructed from task description information and prompter templates. The target prompt word facilitates setting roles for the target artificial intelligence model and specifying target downstream tasks.

The hint word schemes can be divided into system messages and user messages. The system message is used for setting background and identity for the target artificial intelligence model, and the user message is used for providing specific working instructions. Through the prompt words, the artificial intelligent model can be stimulated to realize specific functions and execute target downstream tasks. Such as translating tasks, discriminating which tools the call, etc.

For example, the artificial intelligence model is required to implement the translation task of the specified format output, and the hint words of the system message are as follows:

you are a phonetic assistant in english that can distinguish english words provided by the user between english and american. Before outputting the answer, please check again whether the result is correct. Ensure that the answer you provide is outbound and proof, if not, please answer you don't know. The output format is as follows:

English:

sound-beautifying:

in the english sound, the "r" sound is attenuated or omitted so that the pronunciation is close to s. In the beauty sound, the "r" sound remains and is pronounced as "r".

The source is as follows:

Cambridge Dictionary.(n.d.).Fierce.Retrieved from<https://dictionary.cambridge.org/dictionary/english/fierce>。

in this embodiment, it is clear that the target downstream task in the prompt word is to output the translation result according to the specified format, and an output example of the specified format is given in the prompt word. Roles are designated for the artificial intelligent model, tasks are formulated and output formats are designated through prompt words.

The potential capability of the artificial intelligent model can be excited under the conditions that a large amount of data is not needed to finely adjust the model and model parameters of the artificial intelligent model are not changed through prompt engineering, a user flexibly customizes tasks for the model according to requirements, the model is guided to rapidly complete specific downstream tasks, the use limitation of the artificial intelligent model is broken, and the artificial intelligent model is more flexible to call and has diversified functions.

In one embodiment, if the first model input data is used to instruct the target artificial intelligence model to select a target tool from at least two given existing tools, the step S400 of calling the target tool selected by the target artificial intelligence model, performing data interaction with the target artificial intelligence model, and obtaining a final output of the target artificial intelligence model to execute a target downstream task includes:

Receiving a tool selection result of the target artificial intelligent model according to the input data output of the first model;

if the tool selection result indicates that at least one target tool exists, calling the corresponding target tool according to the task description information and the intermediate output data of the target artificial intelligent model;

and generating intermediate input data according to a calling result obtained by calling the target tool, and inputting the intermediate input data into the target artificial intelligent model until the final output of the target artificial intelligent model is obtained, wherein the intermediate output data is obtained by the target artificial intelligent model according to the input intermediate input data.

In particular, the artificial intelligence model may be able to complete some downstream tasks independently, while additional tool assistance may be required to complete other downstream tasks. Artificial intelligence models are not universal, and some functions do not, requiring the assistance of external tools to assist in accomplishing certain downstream tasks. However, the artificial intelligent model itself has the defects of no support for networking, no processing of ultra-long text, no accurate calculation of data, no real-time interaction with external data, and the like, so that the artificial intelligent model cannot directly call tools or interfaces. Based on the above, the application server is required to serve as a bridge to call the target tool to assist the artificial intelligence model to complete the target downstream task.

The first model input data of this embodiment includes description data for instructing the target artificial intelligence model to make a target tool selection, and in particular for instructing the target artificial intelligence model to make a target tool selection from at least two given existing tools.

For example, in order for a large language model to solve a task, the large language model needs to determine which target tool or tools should be invoked by the application server from the existing "tools", and in this embodiment, the task is selected by issuing tools to the large language model through prompt engineering. The prompting words of the system message can be specifically as follows:

you are an expert good at selecting the appropriate calling tool. There is a description of the task that needs to be solved, as well as the names of the tools and the uses of the tools that you can choose. You need to select a tool that is appropriate for solving the task based on the task to be solved and return the name of this tool. Without attempting to interpret your selection, without attempting to guide the user, the name is returned directly, if no appropriate tool is returned, please return the JSON string "{ result: null }). The input data format is as follows:

task name translation of an article

Optional tools:

tool 1 { name: calculator, description: "tool for mathematical computation, such as addition, subtraction, multiplication, division, etc. }

Tool 2 { name: translatizer, description @ for translating a language into a language, such as translation from text into English, etc. }

....

Tool n: {........}

The output is in the form of a json string, which is in the following format:

{result:"translator"}。

in this embodiment, the engineering of the prompt specifically instructs the target artificial intelligence model to select a target tool from the existing tools based on the task of "translate an article" and specify the formats of the input data and output data.

The tools 1 and 2 mentioned above are not limited to the local tools or external tools or plug-in tools that can be called by the application server. The specific embodiment described above selects a target tool to be invoked for an application server by a target artificial intelligence model based on a target downstream task.

Of course, the foregoing is merely illustrative, and the application is not limited in this regard to the manner in which the words of the prompts are presented and described in the context of a particular application.

The target tools selected by the target artificial intelligence model may include at least one, and may be empty (i.e., without invoking any target tools, the target artificial intelligence model may independently accomplish target downstream tasks). In the case that the target tools comprise at least one, the application server invokes the tool to select one or more target tools indicated by the results, generates intermediate input data according to the invoking results obtained by invoking the target tools, and inputs the intermediate input data to the target artificial intelligence model; the target artificial intelligent model obtains intermediate output data or final output according to the intermediate input data, the target artificial intelligent model outputs the intermediate output data or final output to the application server, and the application server obtains final output, or the application server calls other target tools according to the intermediate output data. And iterating the input and output to realize the data interaction between the application server and the target artificial intelligent model until the application server acquires the final output of the target artificial intelligent model for completing the target downstream task.

If the target tools include multiple target tools, the application server may call different target tools through multiple rounds, or may call all target tools in one round, which is specifically determined according to the actual downstream task and the application scenario, and the application is not limited to this.

The capability of the target artificial intelligence model of the embodiment to execute the target downstream task may be implemented through fine tuning in advance, or may be implemented through prompt engineering, which is not limited in this embodiment.

Under the condition that the capability of the target artificial intelligent model to execute the target downstream task is stimulated by the prompt engineering, the capability of the target artificial intelligent model to execute the target downstream task is stimulated by the prompt engineering preferentially, and then the target tool selection is conducted by the target artificial intelligent model is instructed by the prompt engineering.

The first model input data may include both motivation data for motivating the target artificial intelligence model to perform a function of a target downstream task, and description data for directing the target artificial intelligence model to target tool selection. Alternatively, the first model input data includes hints comprising descriptive data for instructing the target artificial intelligence model to make a target tool selection. Before generating the first model input data, the application server inputs a prompt word containing motivation data for motivating the target artificial intelligence model to perform a function of a target downstream task to the target artificial intelligence model.

The first model input data may also include input data of the target artificial intelligence model when performing the target downstream task.

For another example, the target downstream tasks are: automated acquisition of market research reports based on keywords, we want to provide keywords and mailboxes for Large Language Models (LLM) to help with the process.

The processing flow is specifically as follows:

1. converting the random input of the user into structured data:

input: aaaa@xxx.com intelligentized bank bbbb@yyy.com "

And (3) outputting:

{

the keyword is an intelligent bank,

emailList:[xiexin446@pingan.com.cn,””xiexin344@163.com”]

}

2. based on the task demand description, let the large language model select the appropriate target tool, the hint words such as:

the let promt1=' i needs to acquire related articles according to keywords, and sort out url of the articles, and the format of url is as follows:

among the target tools selected are the let toolone=gettoolbydescription (prot 1, data. Keyword), let toolGetArticleByUrl, let toolGetSummaryByArticle, let sendEmailTool.

You need to return to me in the following format:

let toolGetArticleByUrl =gettoolbydescription ("i need to obtain the content of the article corresponding to the web address according to url, and return the content of the article out", url list [ i ])

let toolGetSummaryByArticle =gettoolbydescription ("abstract" of what i will provide, extract the abstract within 100 words, arc)

let sendEmailTool = getToolByDescription ('according to me provided article list, send to mailbox list', data. Email list)

Thus, the process of acquiring and transmitting the automatic research report is completed.

For another example, the large language model may also select a query tool (search tool), and the application server may call the query tool to call search functions provided by other applications to obtain large data required by the large language model to perform the target downstream task.

The large language model may also select a tool for the database to hold data, which the application server may call to store some data in the database for the large language model.

Based on these tools, large language models can implement complex capabilities such as LLM-based dynamic intelligent knowledge bases (i.e., uploading documents and asking based on the document content). The method involves storing, searching, summarizing and answering the documents, which requires that a detection tool, a database storage tool and a database query tool are provided, and intelligent question and answer can be realized by matching with the summarizing capability of a large language model.

In addition, let LLM judge that selecting a certain tool is a selectable option, not an unnecessary option, we can also realize the preset tool call from the code logic level by oneself.

According to the method, the system and the device, the target artificial intelligent model is indicated by prompting engineering to complete a target downstream task, a target tool is selected for an application server, the application server replaces the target artificial intelligent model to call the target tool, the target artificial intelligent model is assisted to complete the target downstream task, the indirect calling tool of the artificial intelligent model is realized, the limitation that the artificial intelligent model cannot call a third party tool or an interface is broken, the function of the artificial intelligent model is expanded, the artificial intelligent model can be more flexibly and widely applied, and meanwhile, the function of software application corresponding to the application server is enriched.

In one embodiment, the intermediate output data and the final output of the target artificial intelligence model are both structured data.

Specifically, the input data (including intermediate input data) of the target artificial intelligent model, which is provided by the application server, can include a prompt word, and the prompt word can indicate the format of the next output to the target artificial intelligent model, so that each output of the target artificial intelligent model can be ensured to be structured data. The structured data is convenient for calculation mechanism solution and analysis, can improve the efficiency of model calling and reduce calculation cost.

In one embodiment, the step S400 of calling the target tool selected by the target artificial intelligence model, performing data interaction with the target artificial intelligence model, and obtaining a final output of the target artificial intelligence model for executing the target downstream task, further includes:

if the tool selection result indicates that the target tool does not need to be called, directly acquiring the final output of the target artificial intelligent model for executing the target downstream task;

or,

and if the tool selection result indicates that the target tool does not need to be called, generating second model input data according to the task description information, calling the target artificial intelligent model, taking the second model input data as the input of the target artificial intelligent model, and obtaining the final output of the target artificial intelligent model for executing the target downstream task.

Specifically, if the first model input data includes description data for instructing the target artificial intelligence model to perform the target tool selection, and does not include input data when the target artificial intelligence model performs the target downstream task, the application server needs to generate second model input data including input data when the target artificial intelligence model performs the target downstream task without calling the target tool, and the target artificial intelligence model independently completes the target downstream task without additional target tools according to the second model input data.

In addition, the second model input data may also be provided to the target artificial intelligence model in the form of a prompt word, and the second model input data may specify an input format, an output format, and the like in addition to describing the target downstream task, which is not limited in this regard by the present application.

According to the embodiment, the target artificial intelligent model can still be indicated to complete the target downstream task under the condition that a target tool is not needed, and the flexibility of calling the target artificial intelligent model is realized.

In one embodiment, the target artificial intelligence model is a large language model;

if the task description information includes non-text data, the method further includes:

determining a target conversion tool to be called according to the data type of the non-text data, or determining the target conversion tool to be called according to the data type of the non-text data and a target downstream task;

invoking a target conversion tool to convert the non-text data into text data which can be input as a large language model;

wherein the non-text data includes at least one of picture data, voice data, and video data.

Specifically, large Language Model (LLM), i.e. a large-scale language model or a large language model, is a natural language processing model or AI model based on deep learning for processing language words (or symbology), which can learn the grammar and semantics of natural language, find rules therein, and automatically generate content conforming to some rules, such as generating human-readable text, aiming at understanding and generating human language, according to prompt (prompt). Large language models are typically based on neural network models, trained using large-scale corpora, and can perform a wide range of tasks, including text summarization, emotion analysis, natural language generation, text classification, text abstract generation, machine translation, and the like.

Large language models perform well on tiny tasks, and downstream tasks can be described in terms of instructions by natural language descriptions (i.e., instructions). The large language model is able to perform new tasks by understanding task instructions without using explicit samples, which can greatly improve generalization ability. Through mental chain reasoning strategies, large language models can solve such tasks to arrive at final answers by utilizing a promtt mechanism involving intermediate reasoning steps.

The large language model may be, for example, a GPT (generated Pre-Trained Transformer (generated Pre-training transducer model)), BERT (Bidirectional Encoder Representations from Transformer), roBERTa, or the like model or a model based on the above, which is not limited by the present application.

In this embodiment, when the task description information includes non-text data, the application server may determine a target conversion tool to be invoked according to a data type of the non-text data, or may determine a target conversion tool to be invoked from a plurality of existing conversion tools according to a data type of the non-text data and a target downstream task, invoke the target conversion tool, and convert the non-text data into text data that can be input as the large language model.

More specifically, the target conversion tool includes one or more of an OCR tool, an ASR tool, and the like. OCR tools, i.e., optical character recognition (Optical Character Recognition) tools, are used for image text recognition to convert text in a picture or video image into a text format. ASR tools, i.e., speech recognition technology, also known as automatic speech recognition (Automatic Speech Recognition, ASR), aim to convert lexical content in human speech into computer-readable text input.

Of course, the conversion tool called by the application server is not limited to the above tool, and the conversion tool may be a third party tool, a locally integrated tool, or a conversion tool calling another server, which is not limited in the present application.

By calling the conversion tool, under the condition that the large language model does not support processing of multi-mode data (pictures, audio and video), complex processing of intermediate data and the like, the capability of the large language model is greatly released, the function of the large language model is expanded, the function expansion of software application corresponding to an application server is more flexible, and the requirements of different application scenes are met.

In one embodiment, the existing tools include at least one of tool methods for invoking external interfaces, third party tools installed in the native in plug-in form, code tools integrated in the native code.

In particular, existing tools and translation tools of the present application all refer to code modules that can be invoked by an application server. The existing tool may be a tool method for calling an external interface provided by an external application, or may be a third party tool installed in the local in a plug-in form, or may be a code tool integrated in the local code, or the like, not limited thereto.

More specifically, for example, existing tools may include, but are not limited to, tools for performing database read-write functions, tools for implementing POST requests, tools for implementing GET requests, local calculator tools, tools for invoking external calculators, tool methods for invoking external translation tools, tools for invoking external retrieval interfaces, and other expansion tools or expansion interfaces or custom interfaces, among others.

According to the embodiment, through diversified tool layout, functions of software applications corresponding to the application server and development and mining of various functions of the artificial intelligent model are greatly expanded.

FIG. 3 is a diagram illustrating the invocation of an artificial intelligence model in accordance with an embodiment of the present application; referring to fig. 3, taking an artificial intelligence model as a Large Language Model (LLM) as an example, an application server can convert pictures and videos into input text through an OCR tool, can convert audio or speech into input text through an ASR tool, and text input can be directly used as the input text without conversion. The application server also provides a prompt word template, and the application server can modify the prompt word template according to the user instruction to obtain the prompt word. The hint word is input to the artificial intelligence model, i.e., the large language model of the present embodiment. If the target tool is needed to assist in the process of executing the target downstream task by the large language model, the application server calls a tool method corresponding to the target tool to obtain a calling result, generates intermediate input data according to the calling result and provides the intermediate input data to the large language model, the large language model obtains intermediate output data or final output according to the intermediate input data and provides the intermediate output data to the application server, and the application server possibly calls other target tools according to the intermediate output data to obtain the intermediate input data again and provide the intermediate input data to the large language model, so that the input and the output are iterated until the final output of the target downstream task is obtained.

In addition, the application server can call a data storage tool to store intermediate data required by the large language model, and all data are not required to be fed to the large language model, so that the data burden of the large language model can be reduced.

Referring to fig. 4, the present application further provides an apparatus for invoking an artificial intelligence model, the apparatus comprising:

the instruction receiving module 100 is configured to receive a user instruction, where the user instruction carries task description information of a target downstream task;

a first input generation module 200, configured to generate first model input data according to task description information;

the first calling module 300 is configured to call the target artificial intelligence model, and use the first model input data as input of the target artificial intelligence model to execute the target downstream task;

a first output receiving module 400, configured to obtain a final output of the target artificial intelligence model for executing the target downstream task; or, calling a target tool selected by the target artificial intelligent model, and performing data interaction with the target artificial intelligent model to obtain the final output of the target artificial intelligent model for executing the target downstream task.

In one embodiment, the apparatus further comprises:

the first model function excitation module is used for exciting the capability of the target artificial intelligent model for processing the target downstream task through prompt engineering;

Or,

and the second model function excitation module is used for exciting the capability of the target artificial intelligent model for processing the target downstream task by fine-tuning the target artificial intelligent model in advance.

In one embodiment, the first input generating module 200 is specifically configured to construct a target prompt word according to the task description information and the prompt word template, and generate first model input data according to the target prompt word.

In one embodiment, if the first model input data is used to instruct the target artificial intelligence model to select a target tool from at least two given existing tools, the first output receiving module 400 specifically includes:

the second output receiving module is used for receiving a tool selection result output by the target artificial intelligent model according to the input data of the first model;

the first tool calling module is used for calling a corresponding target tool according to the task description information and the intermediate output data of the target artificial intelligent model if the tool selection result indicates that at least one target tool exists;

and the interaction module is used for generating intermediate input data according to a calling result obtained by calling the target tool and inputting the intermediate input data into the target artificial intelligent model until the final output of the target artificial intelligent model is obtained, wherein the intermediate output data is obtained by the target artificial intelligent model according to the input intermediate input data.

In one embodiment, the first output receiving module 400 is further configured to directly obtain a final output of the target artificial intelligence model for executing the target downstream task if the tool selection result indicates that the target tool does not need to be invoked;

or,

the first output receiving module 400 is further configured to generate second model input data according to the task description information if the tool selection result indicates that the target tool does not need to be called, call the target artificial intelligent model, and use the second model input data as input of the target artificial intelligent model to obtain a final output of the target artificial intelligent model for executing the target downstream task.

if the task description information includes non-text data, the apparatus further includes:

the conversion tool determining module is used for determining a target conversion tool to be called according to the data type of the non-text data or determining the target conversion tool to be called according to the data type of the non-text data and a target downstream task;

the second tool calling module is used for calling a target conversion tool and converting non-text data into text data which can be input as a large language model;

The method introduces a tool method and perfect functional support, greatly expands the functions and the potential performances of the artificial intelligent model, overcomes the defect that the artificial intelligent model cannot interact with the outside, breaks through the use limitation of the artificial intelligent model, ensures that the function expansion of the software application developed based on the artificial intelligent model is more flexible, and meets the requirements under different scenes. The proper tool method can be intelligently selected through the artificial intelligence model, so that the performance and efficiency of the software application are improved. The application greatly improves the efficiency and quality of application development and provides better development experience for users. The access development difficulty of the large language model is reduced. The configuration adaptation work in the application development process is greatly simplified, and the time and energy of a developer are saved.

FIG. 5 illustrates an internal block diagram of a computer device in one embodiment. The computer device may specifically be a terminal or a server. As shown in fig. 5, the computer device includes a processor, a memory, and a network interface connected by a system bus. The memory includes a nonvolatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system, and may also store a computer program which, when executed by a processor, causes the processor to implement the steps of the method embodiments described above. The internal memory may also have stored therein a computer program which, when executed by a processor, causes the processor to perform the steps of the method embodiments described above. It will be appreciated by those skilled in the art that the structure shown in FIG. 5 is merely a block diagram of some of the structures associated with the present inventive arrangements and is not limiting of the computer device to which the present inventive arrangements may be applied, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.

In one embodiment, a computer device is provided comprising a memory and a processor, the memory storing a computer program that, when executed by the processor, causes the processor to perform the steps of:

In one embodiment, a computer readable storage medium is provided, storing a computer program which, when executed by a processor, causes the processor to perform the steps of:

Those skilled in the art will appreciate that the processes implementing all or part of the methods of the above embodiments may be implemented by a computer program for instructing relevant hardware, and the program may be stored in a non-volatile computer readable storage medium, and the program may include the processes of the embodiments of the methods as above when executed. Any reference to memory, storage, database, or other medium used in embodiments provided herein may include non-volatile and/or volatile memory. The nonvolatile memory can include Read Only Memory (ROM), programmable ROM (PROM), electrically Programmable ROM (EPROM), electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double Data Rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous Link DRAM (SLDRAM), memory bus direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), among others.

The technical features of the above embodiments may be arbitrarily combined, and all possible combinations of the technical features in the above embodiments are not described for brevity of description, however, as long as there is no contradiction between the combinations of the technical features, they should be considered as the scope of the description.

The foregoing examples illustrate only a few embodiments of the application and are described in detail herein without thereby limiting the scope of the application. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the application, which are all within the scope of the application. Accordingly, the scope of protection of the present application is to be determined by the appended claims.

Claims

1. A method for invoking an artificial intelligence model, the method comprising:

invoking a target artificial intelligence model, and taking the first model input data as input for the target artificial intelligence model to execute the target downstream task;

Obtaining a final output of the target artificial intelligence model for executing the target downstream task; or, calling a target tool selected by the target artificial intelligent model, and performing data interaction with the target artificial intelligent model to obtain the final output of the target artificial intelligent model for executing the target downstream task.

2. The method of claim 1, wherein prior to obtaining a final output of the target artificial intelligence model to perform the target downstream task, the method further comprises:

exciting the capability of the target artificial intelligence model to process the target downstream task through prompt engineering;

or,

and exciting the capability of the target artificial intelligent model for processing the target downstream task by fine tuning the target artificial intelligent model in advance.

3. The method of claim 1, wherein generating first model input data from the task description information comprises:

4. A method according to claim 3, wherein if the first model input data is used to instruct the target artificial intelligence model to perform target tool selection from at least two given existing tools, the invoking the target tool selected by the target artificial intelligence model to perform data interaction with the target artificial intelligence model, obtaining a final output of the target artificial intelligence model to perform the target downstream task, comprises:

Receiving a tool selection result output by the target artificial intelligent model according to the input data of the first model;

and generating intermediate input data according to a calling result obtained by calling a target tool, and inputting the intermediate input data into the target artificial intelligent model until the final output of the target artificial intelligent model is obtained, wherein the intermediate output data is obtained by the target artificial intelligent model according to the input intermediate input data.

5. The method of claim 4, wherein the invoking the target tool selected by the target artificial intelligence model, performing data interactions with the target artificial intelligence model, obtaining a final output of the target artificial intelligence model to perform the target downstream task, further comprises:

or,

6. The method of claim 1, wherein the target artificial intelligence model is a large language model;

determining a target conversion tool to be called according to the data type of the non-text data, or determining the target conversion tool to be called according to the data type of the non-text data and the target downstream task;

invoking the target conversion tool to convert the non-text data into text data which can be input as a large language model;

7. The method of claim 4, wherein the existing tool comprises at least one of a tool method for invoking an external interface, a third party tool installed in the native as a plug-in, and a code tool integrated in the native code.

8. An apparatus for invoking an artificial intelligence model, the apparatus comprising:

the first calling module is used for calling a target artificial intelligent model and taking the input data of the first model as the input of the target artificial intelligent model for executing the target downstream task;

9. A computer readable storage medium storing a computer program, which when executed by a processor causes the processor to perform the steps of the method according to any one of claims 1 to 7.

10. A computer device comprising a memory and a processor, wherein the memory stores a computer program which, when executed by the processor, causes the processor to perform the steps of the method of any of claims 1 to 7.