CN117539975A - Method, device, equipment and medium for generating prompt word information of large language model - Google Patents

Method, device, equipment and medium for generating prompt word information of large language model

Info

Publication number
CN117539975A
CN117539975A CN202311340716.1A
Authority
CN
China
Prior art keywords
information
word
prompt
input information
input
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311340716.1A
Other languages
Chinese (zh)
Inventor
李雨城
卢伟鹏
王闯
王婷婷
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202311340716.1A
Publication of CN117539975A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/166 Editing, e.g. inserting or deleting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/247 Thesauruses; Synonyms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • Human Computer Interaction (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Machine Translation (AREA)

Abstract

The disclosure provides a method, a device, equipment, and a medium for generating prompt word information of a large language model, and relates to artificial intelligence fields such as machine learning and natural language processing. A specific implementation scheme is as follows: acquiring input information of a user using a large language model; based on the input information and configured parameter information, generating target prompt word information with a pre-trained prompt word generation model, the target prompt word information being used to replace the input information as input to the large language model. This technique can effectively generate target prompt word information that is richer, more comprehensive, and more accurate in content; the target prompt word information then replaces the user's input information as input to the large language model, enabling the model to generate reply text information more efficiently and accurately.

Description

Method, device, equipment and medium for generating prompt word information of large language model
Technical Field
The disclosure relates to the technical field of computers, in particular to artificial intelligence fields such as machine learning and natural language processing, and specifically to a method, a device, equipment, and a medium for generating prompt words for a large language model.
Background
A large language model (LLM) is a language model built from a deep neural network with parameters on the order of billions or more. Large language models are typically pre-trained on large text datasets, which can reach 10 trillion words in size. As a result, existing large language models have very strong emergent ability and generalization ability, can be applied to many different scenarios, and can handle a variety of tasks, such as text summarization, text generation, machine translation, and question answering.
When using an existing large language model, a user inputs information to the model, and the model outputs reply text information for the user according to that input.
Disclosure of Invention
The disclosure provides a method, a device, equipment and a medium for generating a prompt word of a large language model.
According to an aspect of the present disclosure, there is provided a method for generating prompt word information of a large language model, including:
acquiring input information of a user when using a large language model;
based on the input information and configured parameter information, generating target prompt word information with a pre-trained prompt word generation model, the target prompt word information being used to replace the input information as input to the large language model.
According to another aspect of the present disclosure, there is provided a device for generating prompt word information of a large language model, including:
the acquisition module is used for acquiring input information of a user when using the large language model;
a generating module, configured to generate target prompt word information with a pre-trained prompt word generation model based on the input information and configured parameter information, the target prompt word information being used to replace the input information as input to the large language model.
According to still another aspect of the present disclosure, there is provided an electronic apparatus including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor, which, when executed by the at least one processor, enable the at least one processor to perform the method of any one of the possible implementations described above.
According to yet another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform the method of the aspects and any possible implementation described above.
According to yet another aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method of the aspects and any one of the possible implementations described above.
According to the technology of the disclosure, target prompt word information that is richer, more comprehensive, and more accurate in content can be generated effectively; the target prompt word information then replaces the user's input information as input to the large language model, enabling the model to generate reply text information more efficiently and accurately.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic diagram according to a first embodiment of the present disclosure;
FIG. 2 is a schematic diagram according to a second embodiment of the present disclosure;
FIG. 3 is a schematic diagram of an interface for optimizing Prompt information provided by the present disclosure;
FIG. 4 is a schematic diagram of another interface for optimizing Prompt information provided by the present disclosure;
FIG. 5 is a schematic diagram of another interface for optimizing Prompt information provided by the present disclosure;
FIG. 6 is a schematic diagram of an interface provided in the present disclosure for optimizing Prompt information;
FIG. 7 is a schematic diagram according to a third embodiment of the present disclosure;
FIG. 8 is a schematic diagram according to a fourth embodiment of the present disclosure;
fig. 9 is a block diagram of an electronic device for implementing the methods of embodiments of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
It will be apparent that the described embodiments are some, but not all, of the embodiments of the present disclosure. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments in this disclosure without inventive faculty, are intended to be within the scope of this disclosure.
It should be noted that, the terminal device in the embodiments of the present disclosure may include, but is not limited to, smart devices such as a mobile phone, a personal digital assistant (Personal Digital Assistant, PDA), a wireless handheld device, and a Tablet Computer (Tablet Computer); the display device may include, but is not limited to, a personal computer, a television, or the like having a display function.
In addition, the term "and/or" herein is merely an association relationship describing an association object, and means that three relationships may exist, for example, a and/or B may mean: a exists alone, A and B exist together, and B exists alone. In addition, the character "/" herein generally indicates that the front and rear associated objects are an "or" relationship.
In the prior art, during use of a large language model (LLM), because users' input information is highly individual and often not rich enough in content, the reply text information generated by the large language model is usually not accurate enough and does not reflect what the user really needs. To address this problem, the present disclosure provides a method for generating prompt word information: target prompt word information is generated based on the user's input information and then input into the large language model, which enriches the model's input and thereby improves the accuracy of the reply text information the model generates.
FIG. 1 is a schematic diagram according to a first embodiment of the present disclosure. As shown in FIG. 1, the present embodiment provides a method for generating prompt word (Prompt) information of a large language model, which specifically includes the following steps:
s101, acquiring input information of a user when using a large language model;
s102, generating a model by adopting a pre-trained prompt word based on the input information and the configured parameter information, generating target prompt word information which is used for replacing the input information and inputting the target prompt word information into the large language model.
The executing entity of the method for generating prompt word information of a large language model in this embodiment may be a device for generating prompt word information of a large language model. The device may be an independent electronic device, or an application or platform implemented in software. In use, the device can generate the target prompt word information with the prompt word generation model based on the user's input information and the configured parameter information. The prompt word information in this embodiment may also be called Prompt information.
With the above technical solution, target prompt word information can be generated based on the input information and configured parameter information of the user when the large language model is used, expressing the user's intention more richly, comprehensively, and accurately. The target prompt word information is then used as the input of the large language model, which enriches the content of the model's input and helps the model generate more accurate reply text information.
Optionally, the prompt word generation model of the present embodiment may be a separate, pre-trained neural network model, or it may be integrated with the large language model and used together with it.
In the method for generating prompt word information of a large language model of this embodiment, by using the prompt word generation model based on the user's input information and the configured parameter information, target prompt word information that is richer, more comprehensive, and more accurate in content can be generated effectively; the target prompt word information then replaces the user's input information as input to the large language model, enabling the model to generate reply text information more efficiently and accurately.
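The two-step flow of steps S101 and S102 can be sketched as follows. This is a minimal illustration; the function names and the callable models are placeholders assumed for the sketch, not identifiers from the disclosure.

```python
def generate_target_prompt(user_input: str, params: dict, prompt_model) -> str:
    """S102: generate target prompt word information from the user's input
    information plus configured parameter information, using a pre-trained
    prompt word generation model (here any callable)."""
    return prompt_model(user_input, **params)


def answer_with_llm(user_input: str, params: dict, prompt_model, llm) -> str:
    """S101 + S102: the target prompt replaces the raw user input as the
    large language model's input."""
    target_prompt = generate_target_prompt(user_input, params, prompt_model)
    return llm(target_prompt)
```

In use, `prompt_model` and `llm` would be calls into real model-serving APIs; stub callables suffice to show the data flow.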
FIG. 2 is a schematic diagram according to a second embodiment of the present disclosure. As shown in FIG. 2, the method for generating prompt word information of a large language model of this embodiment further details the technical solution of the disclosure on the basis of the embodiment shown in FIG. 1, and specifically may include the following steps:
s201, displaying a plurality of preconfigured input information templates to a user;
In this embodiment, a plurality of input information templates may be configured in advance to guide the user's input. Each input information template may include detailed description information on one topic, used to constrain the reply text information generated by the large language model; for example, it may specify at least one of the topic, the content to include, the intended effect, and the format of the reply text information.
S202, acquiring a target input information template selected by a user as input information;
s203, detecting whether the input information comprises a variable; if so, executing step S204; if not, executing step S207;
s204, popping up variable input prompt information; step S205 is performed;
s205, receiving content information of variables input by a user; step S206 is executed;
s206, filling the content information of the variables into a target input information template to be used as input information; step S207 is performed;
To increase the versatility and generality of the input information, variables may be set in the input information, such as a place variable or a number-of-characters variable. A variable is a field or parameter whose content is unknown. After the user selects the target input information template, the prompt word information generating device can detect whether the template includes a variable; if so, it pops up the variable input prompt so the user can enter the variable's content, and then fills that content into the target input information template to form complete, high-quality input information.
Steps S201-S206 are one implementation of step S101 in the embodiment shown in FIG. 1. In this way, the user can be assisted in entering input information accurately and efficiently.
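The variable-detection-and-fill flow of steps S203-S206 can be sketched as below. The `{name}` placeholder syntax is an assumption made for illustration; the disclosure does not specify a variable notation.

```python
import re


def find_variables(template: str) -> list[str]:
    """S203: return the names of {variable} placeholders in a template."""
    return re.findall(r"\{(\w+)\}", template)


def fill_template(template: str, values: dict[str, str]) -> str:
    """S206: fill the user-supplied variable contents into the template.
    Missing values correspond to the case where the variable input prompt
    (S204) must still be shown to the user."""
    missing = set(find_variables(template)) - set(values)
    if missing:
        raise ValueError(f"variable content still needed for: {sorted(missing)}")
    return template.format(**values)
```

A UI layer would catch the `ValueError` and pop up the variable input prompt of step S204 for each missing name.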
In addition, optionally, in an embodiment of the present disclosure, step S101 of the embodiment shown in fig. 1 may be implemented specifically by the following steps:
(1) Acquiring original input information input by a user;
(2) Extracting the intention of the user based on the original input information;
(3) Based on the intention of the user, a target input information template is acquired from a plurality of preconfigured input information templates as input information.
Unlike steps S201-S206, which directly present all the input information templates for the user to choose from, in the implementation of steps (1)-(3) the user can freely enter original input information. The device for generating prompt word information of a large language model then extracts the user's intention from the original input information and compares the semantic relevance between that intention and the topic of each input information template. It then selects, from the plurality of preconfigured input information templates, the template with the highest semantic relevance to the user's intention as the target input information template, which serves as the user's input information. The topics of the input information templates are preconfigured along with the templates themselves.
This approach can likewise assist the user in entering input information accurately and efficiently.
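Template selection by semantic relevance, as in steps (2)-(3), can be sketched as follows. A production system would compare intent and topic with an embedding model; the bag-of-words Jaccard overlap used here is a deliberately simple stand-in assumption.

```python
def similarity(a: str, b: str) -> float:
    """Crude semantic-relevance stand-in: word-set Jaccard overlap."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / max(len(wa | wb), 1)


def select_template(intent: str, templates: dict[str, str]) -> str:
    """Step (3): pick the template whose preconfigured topic is most
    semantically related to the extracted user intent.
    `templates` maps topic -> template body."""
    best_topic = max(templates, key=lambda topic: similarity(intent, topic))
    return templates[best_topic]
```

Swapping `similarity` for a cosine similarity over sentence embeddings would not change the surrounding logic.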
Alternatively, in an embodiment of the present disclosure, the input information entered by the user may be received directly. This is simpler to implement than the two approaches above, but the target prompt word information generated this way may be less accurate.
S207, acquiring parameter information after the user adjusts the pre-configured parameter information; step S208 is performed;
Optionally, the parameter information in the present embodiment includes at least one of: prompt word length shortening, iteration rounds, quality optimization, and censoring optimization.
For example, in practical application, default values may be set for each piece of parameter information, configured in advance, and used directly. In that case, obtaining the parameter information may simply mean obtaining the preconfigured default parameter information. If the user modifies the default preconfigured parameter information, step S207 applies, and before step S207 the method may further include: receiving the user's adjustment of the parameter information.
S208, generating a model by adopting a pre-trained prompt word based on the input information and the adjusted parameter information, and generating target prompt word information.
For example, for the above step S208, the specific implementation may include the following cases:
In the first case, when the parameter information includes prompt word shortening, step S208 may specifically include the following steps:
(a1) Generating initial prompt word information by adopting a prompt word generation model based on the input information;
(b1) Based on the initial prompt word information, obtaining the target prompt word information according to a preset policy for shortening the prompt word length.
In this case, the initial prompt word information generated by the prompt word generation model may be too lengthy. In this embodiment, a policy for shortening the prompt word length may be preset, for example including at least one of deleting duplicate information and deleting information with low relevance to the input information. Applying this preset policy to the initial prompt word information yields clearer, more concise target prompt word information.
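A minimal sketch of such a shortening policy is shown below: drop duplicate sentences, then drop sentences weakly related to the input. The word-overlap relevance scorer and the `.`-based sentence split are simplifying assumptions.

```python
def _relevance(sentence: str, user_input: str) -> float:
    """Stand-in relevance score: fraction of the sentence's words that
    also appear in the user's input information."""
    ws = set(sentence.lower().split())
    wu = set(user_input.lower().split())
    return len(ws & wu) / max(len(ws), 1)


def shorten_prompt(prompt: str, user_input: str, min_relevance: float = 0.1) -> str:
    """Apply the two preset strategies: delete duplicate information and
    delete information with low relevance to the input information."""
    seen, kept = set(), []
    for sentence in filter(None, (s.strip() for s in prompt.split("."))):
        if sentence in seen:
            continue  # delete duplicate information
        seen.add(sentence)
        if _relevance(sentence, user_input) >= min_relevance:
            kept.append(sentence)
    return ". ".join(kept) + "."
```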
In the second case, when the parameter information includes iteration rounds, step S208 may specifically include the following steps:
If the iteration rounds equal 1 round, generating the target prompt word information with the prompt word generation model based on the input information;
if the iteration rounds are 2 or more, performing that many loop iterations with the prompt word generation model, based on the input information, to obtain the target prompt word information.
For example, if there are 2 iteration rounds, initial prompt word information is generated with the prompt word generation model based on the input information, and target prompt word information is then generated with the prompt word generation model based on the initial prompt word information. If there are 3 iteration rounds, initial prompt word information is generated based on the input information; intermediate prompt word information is then generated based on the initial prompt word information; and target prompt word information is finally generated based on the intermediate prompt word information. Similarly, for any other positive number of iteration rounds, the prompt word generation model is applied in a loop for that many rounds to obtain the target prompt word information.
In an actual application scenario, the more iteration rounds, the more accurate the resulting target prompt word information, but the longer it takes to generate. Users can therefore weigh the trade-off and, through repeated testing and verification, configure a reasonable number of iteration rounds, so that accurate, high-quality target prompt word information can be obtained at low time cost.
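The round-based loop above reduces to feeding each round's output back into the prompt word generation model. A sketch (the model is any callable placeholder):

```python
def iterate_prompt(user_input: str, rounds: int, prompt_model) -> str:
    """Run the prompt word generation model for `rounds` loop iterations:
    round 1 consumes the input information, each later round consumes the
    previous round's prompt word information."""
    text = user_input
    for _ in range(max(rounds, 1)):  # at least one round
        text = prompt_model(text)
    return text
```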
In the third case, when the parameter information includes quality optimization, step S208 may specifically include the following steps:
(a2) Generating initial prompt word information by adopting a prompt word generation model based on the input information;
(b2) Based on the initial prompt word information, a pre-trained prompt word optimization model is adopted to generate target prompt word information.
The prompt word optimization model in this embodiment is another pre-trained neural network model. It can be regarded as a higher-performing model than the prompt word generation model, and it further optimizes the quality of the initial prompt word information generated by the prompt word generation model, producing target prompt word information of higher quality.
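Steps (a2)-(b2) form a simple two-stage pipeline, sketched below with placeholder callables standing in for the two pre-trained models:

```python
def generate_with_quality_optimization(user_input: str, prompt_model, optimizer_model) -> str:
    """(a2): generate initial prompt word information with the prompt word
    generation model; (b2): refine it with the (higher-performing) prompt
    word optimization model to obtain the target prompt word information."""
    initial = prompt_model(user_input)
    return optimizer_model(initial)
```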
In the fourth case, when the parameter information includes censoring optimization, step S208 may specifically include the following steps:
(a3) Generating initial prompt word information by adopting a prompt word generation model based on the input information;
(b3) Performing word segmentation processing on the initial prompt word information to obtain a word segmentation sequence formed by a plurality of word segments;
(c3) Based on a preset sensitive word stock, detecting sensitive words of each word in the word segmentation sequence;
(d3) In response to the segmentation word being a sensitive word, replacing the sensitive word by a synonym based on a preset synonym library;
(e3) And splicing the plurality of segmented words in the segmented word sequence after the sensitive word detection processing according to the positions in the initial prompt word information to obtain target prompt word information.
The sensitive word detection in this case replaces the sensitive words in the target prompt word information, improving its quality. For example, when the input to a large language model contains a sensitive word, the model sometimes ignores that word when generating the reply text information, resulting in inaccurate and incomplete replies. With the censoring optimization of this embodiment, sensitive words can be replaced, making the target prompt word information more standard; when the large language model then uses the target prompt word information to generate reply text information, it can produce replies that are more accurate and of higher quality.
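Steps (a3)-(e3) can be sketched as below. The whitespace tokenizer and the tiny lexicons are toy assumptions; a real system would use a proper word segmenter plus curated sensitive-word and synonym libraries.

```python
# Toy stand-ins for the preset sensitive word library and synonym library.
SENSITIVE_WORDS = {"kill"}
SYNONYMS = {"kill": "terminate"}


def censor_prompt(prompt: str) -> str:
    """(b3) segment the initial prompt word information into a word
    sequence; (c3)-(d3) detect sensitive words and replace each with a
    synonym; (e3) splice the words back together in their original
    positions to obtain the target prompt word information."""
    tokens = prompt.split()  # stand-in for real word segmentation
    cleaned = [SYNONYMS.get(t, t) if t in SENSITIVE_WORDS else t for t in tokens]
    return " ".join(cleaned)
```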
The above four cases of the present embodiment may be used in any combination.
In addition, optionally, the optimized prompt word information generated in this embodiment may be used not only as input to a large language model but also in other test scenarios, which is not limited here.
When the technical solution of the disclosure is used, the user's input information can itself be used as prompt word information and input into the large language model to generate the reply text information. The process of generating prompt word information according to the scheme of this embodiment may therefore also be referred to as a process of optimizing prompt word information.
The process of optimizing, i.e., generating, the prompt word information and usage scenario examples of embodiments of the present disclosure are described below with a specific example. The prompt word information of the present disclosure may also be referred to as Prompt information, and the technical solutions of the present disclosure are described below in FIGS. 3 to 6, taking Prompt information as an example.
For example, FIG. 3 is a schematic diagram of an interface for optimizing Prompt information provided by the present disclosure; it illustrates an interface for online optimization of Prompt information. A large input box may be displayed on the interface; the user can enter the original Prompt information in the input box and then click the optimize button at the lower right, triggering the method for generating Prompt information of a large language model according to the embodiment of the present disclosure. At this time, the information entered by the user is displayed in the original Prompt information column, and the optimized Prompt information column is displayed before the optimized Prompt information is obtained. The method of this embodiment can be used jointly with the large language model, and the inference results area below can show the inference result corresponding to the original Prompt information and the inference result corresponding to the optimized Prompt information, respectively. The inference result corresponding to the original Prompt information is the reply text information generated by the large language model based on the original Prompt information; the inference result corresponding to the optimized Prompt information is the reply text information generated by the large language model based on the optimized Prompt information. Before the results are obtained, the inference result columns are displayed.
FIG. 4 is a schematic diagram of another interface for optimizing Prompt information provided by the present disclosure. As shown in FIG. 4, on the basis of FIG. 3, the optimized Prompt information is shown in the optimized Prompt information column. The inference result corresponding to the original Prompt information and the inference result corresponding to the optimized Prompt information are displayed separately, making it convenient for the user to compare the inference result the large language model generates based on the original Prompt information with the one generated based on the optimized Prompt information. With this function, the original Prompt information and the optimized Prompt information can be compared in real time, and the user can compare the effects before and after optimization at any time according to task requirements.
FIG. 5 is a schematic diagram of another interface for optimizing Prompt information provided by the present disclosure. As shown in FIG. 5, an optimization parameter configuration button may be clicked on the Prompt information online optimization interface, which pops up a parameter configuration interface displaying the four kinds of adjustable parameter information. The user can click buttons to turn prompt word shortening, quality optimization, and censoring optimization on or off, and can also select the number of iteration rounds, thereby completing the parameter configuration.
Fig. 6 is a schematic diagram of an interface for optimizing the Prompt information provided by the present disclosure. As shown in fig. 4, the optimized Prompt information column is further provided with two buttons, copy and save as template; clicking the copy button copies the optimized Prompt information, and clicking the save as template button pops up the save-as-template interface shown in fig. 6. As shown in fig. 6, the interface may include a template name field, where the user may enter the topic name of the template. It should be noted that description information may be displayed below the template name input box, so that the user may input a suitable template name based on that description. As shown in fig. 6, the save-as-template interface may further provide template tags; the system may be provided with a plurality of template tags, from which the user may select a preset number. The save-as-template interface also displays the optimized Prompt information, which the user may modify on this interface. Clicking the confirm button below saves it as a template; if saving is not desired, the user may click cancel.
The method for generating the prompt word information of the large language model according to this embodiment can effectively improve the performance and accuracy of large language model products, and can optimize the prompt word information by using a plurality of preconfigured input information templates, so that the large language model can achieve higher performance and accuracy. This means that in various natural language processing tasks, the user can obtain more accurate results, thereby improving the practicality and credibility of the large language model.
The method for generating the prompt word information of the large language model has wide applicability: users can optimize the prompt word information without professional knowledge, which makes the method more attractive and able to serve a wider user group.
The method for generating the prompt word information of the large language model can feed back the prompt word information in real time and allows iteration rounds to be set, so that a user can make adjustments at any time according to actual task performance and obtain the optimal prompt word information in a timely manner. This provides the user with more control and adaptability, and helps meet the requirements of different users.
In addition, the optimization of prompt word information, namely Prompt information, is a very important task in natural language processing: the performance and accuracy of the model's returned results are improved by adjusting the input prompt word information. The prior art provides some preset prompt word information, but it cannot meet the requirements of all tasks. Custom prompt word information, on the other hand, requires domain expertise and time to create, so not all users can easily produce it. By adopting the manner of this embodiment, the above problems can be overcome, and prompt word information of higher quality can be generated.
The method for generating the prompt word information of the large language model has certain flexibility, allows a user to start from preset prompt word information, and can optimize the prompt word information without professional knowledge, so that the operability of tasks is improved.
The method for generating the prompt word information of the large language model can perform parameter optimization configuration, improve the accuracy of the prompt word information, further remarkably improve the performance and effect of the large language model, and enable the large language model to be more suitable for various application scenes.
The method for generating the prompt word information of the large language model enables a user to adjust the prompt word information at any time according to task performance, so that the prompt word information can be kept in an optimal state.
The method for generating the prompt word information of the large language model has wide applicability, and is suitable for various natural language tasks including text dialogue, marketing text, code programming, creative writing and the like.
FIG. 7 is a schematic diagram according to a third embodiment of the present disclosure; as shown in fig. 7, the present embodiment provides a device 700 for generating prompt word information of a large language model, including:
an obtaining module 701, configured to obtain input information of a user when using a large language model;
a generating module 702, configured to generate target prompt word information by using a pre-trained prompt word generation model based on the input information and the configured parameter information, where the target prompt word information is used to replace the input information as the input to the large language model.
The implementation principle and technical effect of generating the prompt word information with the above modules of the generating device 700 in this embodiment are the same as those of the related method embodiments above; reference may be made to the detailed description of those embodiments, which is not repeated here.
FIG. 8 is a schematic diagram according to a fourth embodiment of the present disclosure. As shown in fig. 8, the present embodiment provides a device 800 for generating prompt word information of a large language model, which further details the technical solution of the present disclosure on the basis of the embodiment shown in fig. 7. As shown in fig. 8, the device 800 for generating prompt word information of this embodiment includes the same-name, same-function modules of fig. 7: an acquisition module 801 and a generation module 802.
In this embodiment, a generating module 802 is configured to:
acquiring preset default parameter information; or
acquiring the parameter information obtained after the user adjusts the pre-configured parameter information.
Further optionally, the parameter information of this embodiment includes at least one of: shortening the prompt word length, iteration rounds, optimization quality, and censoring optimization.
Further optionally, in an embodiment of the present disclosure, when shortening the prompt word is included in the parameter information, the generating module 802 is configured to:
based on the input information, generating initial prompt word information by adopting the prompt word generation model;
obtaining the target prompt word information based on the initial prompt word information according to a preset strategy for shortening the prompt word length.
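As a non-limiting sketch of the shortening step described above, the following Python example uses a naive word-count truncation as a stand-in for both the prompt word generation model and the preset shortening strategy; every function name and the truncation rule are assumptions for illustration:

```python
def generate_initial_prompt(user_input: str) -> str:
    # Toy stand-in for the pre-trained prompt word generation model.
    return (f"Please act as an expert assistant. Task: {user_input}. "
            "Answer step by step in detail.")

def shorten_prompt(prompt: str, max_words: int = 12) -> str:
    # One possible preset shortening strategy: cap the prompt at max_words words.
    words = prompt.split()
    return " ".join(words[:max_words])

target = shorten_prompt(generate_initial_prompt("summarize this report"))
```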
Further optionally, in an embodiment of the present disclosure, when the parameter information includes iteration rounds, the generating module 802 is configured to:
if the iteration rounds comprise 1 round, generating the target prompt word information by adopting the prompt word generation model based on the input information; or
if the iteration rounds comprise 2 or more rounds, performing loop iterations of the prompt word generation model according to the iteration rounds based on the input information, to obtain the target prompt word information.
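The single-round versus multi-round behavior described above can be sketched as a simple loop; here `refine` is a toy stand-in for one pass of the prompt word generation model, and all names are hypothetical:

```python
def refine(prompt: str, round_no: int) -> str:
    # Toy stand-in for one pass of the prompt word generation model.
    return f"{prompt} [refined r{round_no}]"

def generate_with_iterations(user_input: str, rounds: int) -> str:
    prompt = user_input
    # 1 round means a single pass; 2 or more rounds means loop iteration.
    for round_no in range(1, rounds + 1):
        prompt = refine(prompt, round_no)
    return prompt
```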
Further optionally, in an embodiment of the present disclosure, when the parameter information includes optimization quality, the generating module 802 is configured to:
based on the input information, generating initial prompt word information by adopting the prompt word generation model;
generating the target prompt word information by adopting a pre-trained prompt word optimization model based on the initial prompt word information.
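A minimal sketch of this two-stage pipeline follows, with toy stand-ins for the pre-trained generation and optimization models; all names and the canned text are hypothetical:

```python
def prompt_generation_model(user_input: str) -> str:
    # Toy stand-in for the pre-trained prompt word generation model.
    return f"Task: {user_input}."

def prompt_optimization_model(initial_prompt: str) -> str:
    # Toy stand-in for the pre-trained prompt word optimization model.
    return initial_prompt + " Respond concisely and accurately."

initial = prompt_generation_model("classify sentiment")
target = prompt_optimization_model(initial)
```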
Further optionally, in an embodiment of the present disclosure, when the parameter information includes censoring optimization, the generating module 802 is configured to:
based on the input information, generating initial prompt word information by adopting the prompt word generation model;
performing word segmentation processing on the initial prompt word information to obtain a word segmentation sequence formed by a plurality of word segments;
detecting, based on a preset sensitive word library, whether each word segment in the word segmentation sequence is a sensitive word;
in response to a word segment being a sensitive word, replacing the sensitive word with a synonym based on a preset synonym library;
splicing the plurality of word segments in the word segmentation sequence after the sensitive word detection processing, according to their positions in the initial prompt word information, to obtain the target prompt word information.
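The censoring-optimization steps above (segmentation, sensitive word detection, synonym replacement, splicing) can be sketched as follows; whitespace splitting stands in for a real word segmenter, and the toy lexicons are illustrative only:

```python
SENSITIVE_LEXICON = {"stupid"}          # toy stand-in for the preset sensitive word library
SYNONYM_LIBRARY = {"stupid": "unwise"}  # toy stand-in for the preset synonym library

def censor_prompt(initial_prompt: str) -> str:
    # Step 1: word segmentation (whitespace split stands in for a real segmenter).
    segments = initial_prompt.split()
    # Steps 2-3: detect sensitive word segments and replace each with a synonym.
    cleaned = [SYNONYM_LIBRARY[s.lower()] if s.lower() in SENSITIVE_LEXICON else s
               for s in segments]
    # Step 4: splice the segments back together in their original positions.
    return " ".join(cleaned)
```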
Further optionally, in an embodiment of the present disclosure, the obtaining module 801 is configured to:
presenting a plurality of preconfigured input information templates to the user;
acquiring a target input information template selected by the user as the input information.
Further alternatively, as shown in fig. 8, in an embodiment of the present disclosure, the generating device 800 of the hint word information further includes:
a pop-up module 803, configured to pop up variable input prompt information if the target input information template includes a variable;
the obtaining module 801 is further configured to receive content information of the variable input by the user;
and a filling module 804, configured to fill content information of the variable into the target input information template, as the input information.
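A minimal sketch of detecting variables in a target input information template, flagging missing ones for the pop-up prompt, and filling them in; the `{variable}` syntax and all names are assumptions, not part of the disclosure:

```python
import re

# Hypothetical target input information template with two variables.
TEMPLATE = "Write a {tone} marketing post about {product}."

def find_variables(template: str) -> list:
    # Detect {variable} placeholders in the template.
    return re.findall(r"\{(\w+)\}", template)

def fill_template(template: str, values: dict) -> str:
    # Any missing variable would trigger the pop-up prompt described above.
    missing = [v for v in find_variables(template) if v not in values]
    if missing:
        raise ValueError(f"prompt the user to input: {missing}")
    return template.format(**values)
```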
Further optionally, in an embodiment of the present disclosure, the obtaining module 801 is configured to:
acquiring original input information input by the user;
extracting the user's intention based on the original input information;
acquiring, based on the user's intention, a target input information template from a plurality of preconfigured input information templates as the input information.
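The intent-based template selection described above can be sketched with a toy keyword matcher standing in for real intent extraction; the intent labels and templates are hypothetical:

```python
# Hypothetical mapping from extracted user intents to preconfigured templates.
INTENT_TEMPLATES = {
    "marketing": "Write persuasive marketing copy about: {topic}",
    "coding": "Write well-commented code that does: {topic}",
}

def extract_intent(raw_input: str) -> str:
    # Toy keyword matcher; a real system would use a trained intent classifier.
    return "coding" if "code" in raw_input.lower() else "marketing"

def select_template(raw_input: str) -> str:
    return INTENT_TEMPLATES[extract_intent(raw_input)]
```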
The implementation principle and the technical effect of the generation of the prompt word information by the aid of the modules in the generating device 800 of the prompt word information of the large language model in this embodiment are the same as those of the related method embodiments, and detailed description of the related method embodiments may be referred to herein and will not be repeated.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
Fig. 9 shows a schematic block diagram of an example electronic device 900 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 9, the apparatus 900 includes a computing unit 901 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 902 or a computer program loaded from a storage unit 908 into a Random Access Memory (RAM) 903. In the RAM 903, various programs and data required for the operation of the device 900 can also be stored. The computing unit 901, the ROM 902, and the RAM 903 are connected to each other by a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.
Various components in device 900 are connected to I/O interface 905, including: an input unit 906 such as a keyboard, a mouse, or the like; an output unit 907 such as various types of displays, speakers, and the like; a storage unit 908 such as a magnetic disk, an optical disk, or the like; and a communication unit 909 such as a network card, modem, wireless communication transceiver, or the like. The communication unit 909 allows the device 900 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunications networks.
The computing unit 901 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 901 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 901 performs the respective methods and processes described above, for example, the above-described methods of the present disclosure. For example, in some embodiments, the above-described methods of the present disclosure may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 908. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 900 via the ROM 902 and/or the communication unit 909. When the computer program is loaded into RAM 903 and executed by the computing unit 901, one or more steps of the above-described methods of the present disclosure described above may be performed. Alternatively, in other embodiments, the computing unit 901 may be configured to perform the above-described methods of the present disclosure in any other suitable manner (e.g., by means of firmware).
Various implementations of the systems and techniques described here may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps recited in the present disclosure may be performed in parallel or sequentially or in a different order, provided that the desired results of the technical solutions of the present disclosure are achieved, and are not limited herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (23)

1. A method for generating prompt word information of a large language model comprises the following steps:
acquiring input information of a user when using a large language model;
based on the input information and the configured parameter information, a pre-trained prompt word generation model is adopted to generate target prompt word information, and the target prompt word information is used for replacing the input information and is input into the large language model.
2. The method of claim 1, wherein generating target prompt word information using a pre-trained prompt word generation model based on the input information and the configured parameter information comprises:
acquiring preset default parameter information; or
acquiring the parameter information obtained after the user adjusts the pre-configured parameter information.
3. The method of claim 1 or 2, wherein the parameter information comprises at least one of: shortening the prompt word length, iteration rounds, optimization quality, and censoring optimization.
4. The method of claim 3, wherein when the parameter information includes shortening the prompt word, generating target prompt word information using a pre-trained prompt word generation model based on the input information and the configured parameter information comprises:
based on the input information, generating initial prompt word information by adopting the prompt word generation model;
obtaining the target prompt word information based on the initial prompt word information according to a preset strategy for shortening the prompt word length.
5. The method of claim 3, wherein when the parameter information includes iteration rounds, generating target prompt word information using a pre-trained prompt word generation model based on the input information and the configured parameter information comprises:
if the iteration rounds comprise 1 round, generating the target prompt word information by adopting the prompt word generation model based on the input information; or
if the iteration rounds comprise 2 or more rounds, performing loop iterations of the prompt word generation model according to the iteration rounds based on the input information, to obtain the target prompt word information.
6. The method of claim 3, wherein when the parameter information includes optimization quality, generating target prompt word information using a pre-trained prompt word generation model based on the input information and the configured parameter information comprises:
based on the input information, generating initial prompt word information by adopting the prompt word generation model;
generating the target prompt word information by adopting a pre-trained prompt word optimization model based on the initial prompt word information.
7. The method of claim 3, wherein when the parameter information includes censoring optimization, generating target prompt word information using a pre-trained prompt word generation model based on the input information and the configured parameter information comprises:
based on the input information, generating initial prompt word information by adopting the prompt word generation model;
performing word segmentation processing on the initial prompt word information to obtain a word segmentation sequence formed by a plurality of word segments;
detecting, based on a preset sensitive word library, whether each word segment in the word segmentation sequence is a sensitive word;
in response to a word segment being a sensitive word, replacing the sensitive word with a synonym based on a preset synonym library;
and splicing the plurality of word segments in the word segmentation sequence after the sensitive word detection processing, according to their positions in the initial prompt word information, to obtain the target prompt word information.
8. The method of any of claims 1-7, wherein obtaining input information from a user while using a large language model comprises:
presenting a plurality of preconfigured input information templates to the user;
and acquiring a target input information template selected by a user as the input information.
9. The method of claim 8, wherein after obtaining the user-selected target input information template, the method further comprises:
if the target input information template includes a variable, popping up variable input prompt information;
receiving content information of the variable input by the user;
and filling the content information of the variable into the target input information template to serve as the input information.
10. The method of any of claims 1-7, wherein obtaining input information from a user while using a large language model comprises:
acquiring original input information input by the user;
extracting the user's intention based on the original input information;
and acquiring, based on the user's intention, a target input information template from a plurality of preconfigured input information templates as the input information.
11. A device for generating prompt word information of a large language model comprises:
the acquisition module is used for acquiring input information of a user when using the large language model;
a generating module, configured to generate target prompt word information by adopting a pre-trained prompt word generation model based on the input information and the configured parameter information, where the target prompt word information is used to replace the input information as the input to the large language model.
12. The apparatus of claim 11, wherein the means for generating is configured to:
acquiring preset default parameter information; or
acquiring the parameter information obtained after the user adjusts the pre-configured parameter information.
13. The apparatus of claim 11 or 12, wherein the parameter information comprises at least one of: shortening the prompt word length, iteration rounds, optimization quality, and censoring optimization.
14. The apparatus of claim 13, wherein when the parameter information includes shortening the prompt word, the generating module is configured to:
based on the input information, generating initial prompt word information by adopting the prompt word generation model;
obtaining the target prompt word information based on the initial prompt word information according to a preset strategy for shortening the prompt word length.
15. The apparatus of claim 13, wherein when the parameter information includes iteration rounds, the generating module is configured to:
if the iteration rounds comprise 1 round, generating the target prompt word information by adopting the prompt word generation model based on the input information; or
if the iteration rounds comprise 2 or more rounds, performing loop iterations of the prompt word generation model according to the iteration rounds based on the input information, to obtain the target prompt word information.
16. The apparatus of claim 13, wherein when the parameter information includes optimization quality, the generating module is configured to:
based on the input information, generating initial prompt word information by adopting the prompt word generation model;
generating the target prompt word information by adopting a pre-trained prompt word optimization model based on the initial prompt word information.
17. The apparatus of claim 13, wherein when the parameter information includes censoring optimization, the generating module is configured to:
based on the input information, generating initial prompt word information by adopting the prompt word generation model;
performing word segmentation processing on the initial prompt word information to obtain a word segmentation sequence formed by a plurality of word segments;
detecting, based on a preset sensitive word library, whether each word segment in the word segmentation sequence is a sensitive word;
in response to a word segment being a sensitive word, replacing the sensitive word with a synonym based on a preset synonym library;
and splicing the plurality of word segments in the word segmentation sequence after the sensitive word detection processing, according to their positions in the initial prompt word information, to obtain the target prompt word information.
18. The apparatus of any of claims 11-17, wherein the acquisition module is configured to:
presenting a plurality of preconfigured input information templates to the user;
and acquiring a target input information template selected by a user as the input information.
19. The apparatus of claim 18, wherein the apparatus further comprises:
a pop-up module, configured to pop up variable input prompt information if the target input information template includes a variable;
The acquisition module is also used for receiving the content information of the variable input by the user;
and the filling module is used for filling the content information of the variable into the target input information template to be used as the input information.
20. The apparatus of any of claims 11-17, wherein the acquisition module is configured to:
acquiring original input information input by the user;
extracting the user's intention based on the original input information;
and acquiring, based on the user's intention, a target input information template from a plurality of preconfigured input information templates as the input information.
21. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-10.
22. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-10.
23. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any of claims 1-10.
CN202311340716.1A 2023-10-16 2023-10-16 Method, device, equipment and medium for generating prompt word information of large language model Pending CN117539975A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311340716.1A CN117539975A (en) 2023-10-16 2023-10-16 Method, device, equipment and medium for generating prompt word information of large language model


Publications (1)

Publication Number Publication Date
CN117539975A true CN117539975A (en) 2024-02-09

Family

ID=89785043

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311340716.1A Pending CN117539975A (en) 2023-10-16 2023-10-16 Method, device, equipment and medium for generating prompt word information of large language model

Country Status (1)

Country Link
CN (1) CN117539975A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117744754A (en) * 2024-02-19 2024-03-22 浙江同花顺智能科技有限公司 Large language model task processing method, device, equipment and medium
CN117744754B (en) * 2024-02-19 2024-05-10 浙江同花顺智能科技有限公司 Large language model task processing method, device, equipment and medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination