CN118093838A - Large language model prompt word generation method, system, terminal equipment and medium - Google Patents
- Publication number: CN118093838A
- Application number: CN202410494748.5A
- Authority: CN (China)
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F16/3322—Query formulation using system suggestions
- G06F16/3323—Query formulation using system suggestions, using document space presentation or visualization, e.g. category, hierarchy or range presentation and selection
- G06F16/337—Profile generation, learning or modification
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F40/216—Parsing using statistical methods
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
Abstract
The application pertains to the technical field of large language models and provides a large language model prompt word generation method, system, terminal equipment and medium. The method constructs a personalized corpus based on user data; acquires a prompt word input by a user and segments it into unit texts; selects a core word as the root node and determines the relations between the core word and the other unit texts according to grammatical relations; constructs a prompt word analysis tree from the root node and the relations; calculates an evolution probability and a node depth probability for each node and evolves the prompt word analysis tree accordingly to obtain a final prompt word analysis tree; and generates a plurality of new prompt words from the final prompt word analysis tree, calculates a comprehensive score for each new prompt word based on the personalized corpus and a preset professional field corpus, and outputs the new prompt word with the highest comprehensive score to the user. The method and the device can improve the accuracy of prompt word generation.
Description
Technical Field
The application belongs to the technical field of large language models, and particularly relates to a large language model prompt word generation method, a large language model prompt word generation system, terminal equipment and a medium.
Background
A large language model is a deep learning model trained on a large amount of text data. It can generate natural language text or understand the meaning of language text, can handle various natural language tasks such as text classification, question answering and dialogue, and is an important path toward artificial intelligence.
However, when users interact with large language models, the prompt words they enter often suffer from inaccuracy, ambiguity, incompleteness, missing key information, misleading wording, grammatical mistakes, misspellings, and vague or equivocal word choices.
Furthermore, for domain-specific questions, the user may not provide the background information or terminology of the relevant field, so the model fails to understand or answer the question. Some large language models also limit the length of the prompt words a user may enter, which causes prompt words to be truncated or shortened and, in turn, information to be lost or incomplete. A method that can accurately generate prompt words is therefore needed.
Disclosure of Invention
The application provides a large language model prompt word generation method, a large language model prompt word generation system, terminal equipment and a medium, which can improve the accuracy of prompt word generation.
In a first aspect, the present application provides a method for generating a large language model prompt word, including:
constructing a personalized corpus based on pre-acquired user data, wherein the personalized corpus is used for providing prompt words that better meet the requirements and preferences of a specific user, thereby improving interaction experience and accuracy;
acquiring a prompt word input by a user and segmenting the prompt word, wherein the prompt word is segmented into at least one unit text;
selecting a core word from the at least one unit text as a root node, and determining the relations between the core word and the other unit texts according to preset grammatical relations;
constructing a prompt word analysis tree according to the root node and the relations, wherein the nodes of the prompt word analysis tree correspond one-to-one to the unit texts, and the edges of the prompt word analysis tree represent the relations among the nodes;
respectively calculating the evolution probability and the node depth probability of each node, and evolving the prompt word analysis tree according to the evolution probability and the node depth probability until a new prompt word analysis tree obtained by evolution meets a preset evolution termination condition, so as to obtain a final prompt word analysis tree, wherein the node depth probability is used for calculating the contribution of a node's depth to its evolution probability, the effect being to add a depth-related weight to the evolution probability of each node so as to reflect the node's importance in the prompt word structure and its context dependency;
generating a plurality of new prompt words according to the final prompt word analysis tree, calculating the comprehensive score of each new prompt word based on the personalized corpus and a preset professional field corpus, and outputting the new prompt word corresponding to the highest comprehensive score to the user.
Optionally, the user data includes user behavior information, user portrait information, and user feedback information.
Optionally, the core word is a verb bearing the main-predicate relation;
the grammatical relations include a main-predicate relation, a verb-object relation, an indirect object relation, a subordinate main-predicate relation, a subordinate complement relation, an open complement relation, a passive main-predicate relation, an auxiliary verb relation, a series verb relation, a qualifier relation, an adjective modifier relation, a quantity modifier relation, an apposition relation, an idiom relation, a compound word relation, a marker relation, a preposition relation, a subordinate modifier relation, a parallel structure relation, a punctuation relation, a juxtaposition relation, and a flat modifier relation.
Optionally, the evolution probability is calculated by the formula $P_e(v_i) = f(w_i) \cdot P_d(v_i)$, wherein $P_e(v_i)$ represents the evolution probability of the $i$-th node $v_i$, $f$ represents a mapping function from the part-of-speech feature $w_i$ of node $v_i$ to a base probability, $P_d(v_i)$ represents the node depth probability of the $i$-th node $v_i$, and $d_i$ represents the position depth of the $i$-th node $v_i$;
the calculation formula of the node depth probability is $P_d(v_i) = 1 - \frac{d_i}{D_{\max}}$, wherein $D_{\max}$ represents the maximum depth of the prompt word analysis tree.
Optionally, the calculation expression of the comprehensive score is $S = S_{\mathrm{LM}}(T', T) + \sum_{u_j \in T'} I(u_j) \cdot \mathrm{sim}(u_j, C_p, C_d)$;
wherein $S$ represents the comprehensive score, $S_{\mathrm{LM}}(T', T)$ represents the large model score, i.e. the base score given by the large model by comparing the new prompt word $T'$ with the original prompt word $T$, $C_p$ represents the personalized corpus, $C_d$ represents the professional field corpus, $I(u_j)$ represents the importance score of unit text $u_j$, and $\mathrm{sim}(u_j, C_p, C_d)$ represents the cosine similarity score between unit text $u_j$ and the personalized corpus $C_p$ and the professional field corpus $C_d$.
Optionally, the evolution termination condition is that the number of evolutions is greater than or equal to a preset maximum number of evolutions.
In a second aspect, the present application provides a large language model hint word generating system, including:
the corpus construction module is used for constructing a personalized corpus based on the user data acquired in advance;
the prompt word segmentation module is used for acquiring the prompt words input by the user and segmenting the prompt words; wherein the prompt word is segmented into at least one unit text;
the relation determining module is used for selecting a core word from at least one unit text as a root node and determining the relation between the core word and other unit texts according to a preset grammar relation;
the analysis tree construction module is used for constructing a prompt word analysis tree according to the root nodes and the relation; the nodes of the prompt word analysis tree correspond to the unit texts one by one, and the edges of the prompt word analysis tree represent the relation among the nodes;
the analysis tree evolution module is used for calculating the evolution probability and the node depth probability of each node respectively, and evolving the prompt word analysis tree according to the evolution probability and the node depth probability until the new prompt word analysis tree obtained by evolution meets the preset evolution termination condition, so as to obtain a final prompt word analysis tree;
the prompt word generation module is used for generating a plurality of new prompt words according to the final prompt word analysis tree, calculating the comprehensive score of each new prompt word based on the personalized corpus and the preset professional field corpus, and outputting the new prompt word corresponding to the highest comprehensive score to the user.
In a third aspect, the present application provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the method for generating a large language model prompt word described above when executing the computer program.
In a fourth aspect, the present application provides a computer readable storage medium storing a computer program which, when executed by a processor, implements the above-described large language model prompt word generation method.
The scheme of the application has the following beneficial effects:
According to the large language model prompt word generation method provided by the application, the prompt word analysis tree is evolved according to the evolution probability and the node depth probability to obtain the final prompt word analysis tree, so that the words with larger influence on the overall semantics can be retained, which improves the accuracy of the generated prompt words; the comprehensive score of each new prompt word is calculated based on the personalized corpus and the preset professional field corpus, combining the importance and relevance of the new prompt word in the personalized corpus and the professional field corpus, and the new prompt word with the highest comprehensive score is output, which can further improve the accuracy of the generated prompt words.
Other advantageous effects of the present application will be described in detail in the detailed description section which follows.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the embodiments or the description of the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flowchart of a method for generating large language model hint words according to an embodiment of the present application;
FIG. 2 is a schematic diagram of a hint word parse tree according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a large language model hint word generating system according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a terminal device according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, techniques, etc., in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It should be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in the present specification and the appended claims refers to any and all possible combinations of one or more of the associated listed items, and includes such combinations.
As used in the present description and the appended claims, the term "if" may be interpreted as "when", "upon", "in response to determining" or "in response to detecting", depending on the context. Similarly, the phrases "if it is determined" or "if [a described condition or event] is detected" may be interpreted, depending on the context, as "upon determining", "in response to determining", "upon detecting [the described condition or event]" or "in response to detecting [the described condition or event]".
Furthermore, the terms "first," "second," "third," and the like in the description of the present specification and in the appended claims, are used for distinguishing between descriptions and not necessarily for indicating or implying a relative importance.
Reference in the specification to "one embodiment" or "some embodiments" or the like means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," and the like in the specification are not necessarily all referring to the same embodiment, but mean "one or more but not all embodiments" unless expressly specified otherwise. The terms "comprising," "including," "having," and variations thereof mean "including but not limited to," unless expressly specified otherwise.
Aiming at the problem of the low accuracy of prompt words generated by conventional methods, the application provides a large language model prompt word generation method, system, terminal equipment and medium. The method evolves the prompt word analysis tree according to the evolution probability and the node depth probability to obtain a final prompt word analysis tree, so that the words with larger influence on the overall semantics can be retained, which improves the accuracy of the generated prompt words; it calculates the comprehensive score of each new prompt word based on the personalized corpus and the preset professional field corpus, combining the importance and relevance of the new prompt word in the personalized corpus and the professional field corpus, and outputs the new prompt word with the highest comprehensive score, which can further improve the accuracy of the generated prompt words.
The method for generating the large language model prompt word provided by the application is exemplified below.
As shown in FIG. 1, the method for generating the large language model prompt word provided by the application comprises the following steps:
Step 11: constructing a personalized corpus based on pre-acquired user data.
The personalized corpus is used for providing prompt words which better meet the requirements and preferences of specific users, thereby improving interaction experience and accuracy.
Specifically, the user data includes user behavior information, user portrait information, and user feedback information. In an embodiment of the present application, after the data is collected, the data may be stored in a user information table of a corresponding user, and a unique identifier is set for each user to prevent confusion, and the collected user information tables of all users form the personalized corpus. The database may be created using existing database creation methods, the kind of which is not limited here.
In an embodiment of the present application, the above user data is specifically:
User behavior information including user operations in a large language model, search records, click preferences, and the like. The data is analyzed to mine the user's preferences, interests and usage habits, such as words, phrases or expressions that are frequently used by the user.
User profile information including age, gender, occupation, geographic location, etc. Based on this information, a user portrait is created that helps to understand the characteristics and context of the user.
User feedback information including user ratings, preferences, satisfaction, etc. And analyzing the feedback information to know the preference and the requirement of the user for generating the prompt words.
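As a minimal sketch of the storage scheme described above (one user information table per user, keyed by a unique identifier, with all tables together forming the personalized corpus), the following Python example may serve; the field names behavior, profile and feedback are illustrative assumptions, since the embodiment does not fix a schema:

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class UserRecord:
    """One user information table; field names are illustrative only."""
    user_id: str                                            # unique identifier, prevents confusion
    behavior: List[str] = field(default_factory=list)       # operations, search records, click preferences
    profile: Dict[str, str] = field(default_factory=dict)   # age, gender, occupation, geographic location
    feedback: List[str] = field(default_factory=list)       # ratings, preferences, satisfaction

class PersonalizedCorpus:
    """The collected user information tables of all users form the corpus."""
    def __init__(self) -> None:
        self._tables: Dict[str, UserRecord] = {}

    def record(self, user_id: str) -> UserRecord:
        # Create the user's table on first access.
        return self._tables.setdefault(user_id, UserRecord(user_id))

corpus = PersonalizedCorpus()
corpus.record("u001").behavior.append("search: dependency parsing")
```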
Step 12: acquiring the prompt word input by the user, and segmenting the prompt word.
Wherein the prompt word is segmented into at least one unit text.
Illustratively, in one embodiment of the present application, the prompt word is "The cat chased the mouse"; each word is then used as a unit text, and the segmentation result is {The, cat, chased, the, mouse}. In other embodiments of the present application, individual characters (or phrases) may be used as unit texts, which is not limited here.
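A minimal sketch of this segmentation step for the English example above, using whitespace tokenization (an assumption: the embodiment does not prescribe a tokenizer, and a Chinese prompt would need a word segmenter instead):

```python
def segment(prompt: str) -> list[str]:
    # Each word becomes one unit text.
    return prompt.split()

units = segment("The cat chased the mouse")
print(units)  # ['The', 'cat', 'chased', 'the', 'mouse']
```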
Step 13: selecting a core word from the at least one unit text as the root node, and determining the relations between the core word and the other unit texts according to preset grammatical relations.
The core word is a verb with a main-predicate relation.
Specifically, in the embodiment of the present application, the above-mentioned grammatical relations include a main-predicate relation, a verb-object relation, an indirect object relation, a subordinate main-predicate relation, a subordinate complement relation, an open complement relation, a passive main-predicate relation, an auxiliary verb relation, a series verb relation, a qualifier relation, an adjective modifier relation, a quantity modifier relation, an apposition relation, an idiom relation, a compound word relation, a marker relation, a preposition relation, a subordinate modifier relation, a parallel structure relation, a punctuation relation, a juxtaposition relation, and a flat modifier relation.
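The relations listed above correspond closely to standard dependency labels (nsubj, dobj, iobj, amod, det, and so on), so one plausible realization of this step, sketched below with spaCy, takes the root verb of a dependency parse as the core word; the choice of spaCy is an assumption, not something the application specifies:

```python
import spacy

nlp = spacy.load("en_core_web_sm")  # small English pipeline

def core_word_and_relations(prompt: str):
    doc = nlp(prompt)
    # The core word is the verb carrying the main-predicate relation,
    # i.e. the root of the dependency parse.
    root = next(tok for tok in doc if tok.dep_ == "ROOT")
    # Every other token relates to its head through a dependency label.
    relations = [(tok.text, tok.dep_, tok.head.text) for tok in doc if tok is not root]
    return root.text, relations

root, rels = core_word_and_relations("The cat chased the mouse")
# root == "chased"; rels include ("cat", "nsubj", "chased") and ("mouse", "dobj", "chased")
```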
And 14, constructing a prompt word analysis tree according to the root nodes and the relations.
The nodes of the prompt word analysis tree are in one-to-one correspondence with the unit texts, and the edges of the prompt word analysis tree represent the relation among the nodes.
In an embodiment of the present application, the prompt word analysis tree consists of a node set and an edge set. The node set may be represented as $V = \{v_1, v_2, \ldots, v_n\}$, where $v_i$ represents the $i$-th node. Nodes are numbered starting from the root node, which is numbered 1, layer by layer from top to bottom and, within each layer, from left to right. Illustratively, a prompt word analysis tree is shown in FIG. 2.
Further, the value of a node $v_i$ represents the features of the node, with the expression $v_i = (s_i, w_i, p_i)$, where $s_i$ represents the semantic feature of the node (its unit text), $w_i$ represents its part-of-speech feature, and $p_i$ represents its position feature (where the node is located in the tree).
For example, for the node in FIG. 2 whose text is "chased", the semantic feature $s$ is the meaning "chased" as interpreted from context, the part-of-speech feature $w$ is verb, and the position feature $p$ is the node's position in the tree.
The edge set may be represented as $E = \{e_{ij}\}$, where $e_{ij}$ represents the edge connecting node $v_i$ and node $v_j$. The feature value of $e_{ij}$ is expressed as $e_{ij} = (r_{ij})$, where $r_{ij}$ represents the grammatical relation between node $v_i$ and node $v_j$.
The prompt word analysis tree may be represented as $T_i$: when $i = 1$, $T_1$ represents the whole prompt word analysis tree, and for other values of $i$, $T_i$ represents the subtree rooted at node $v_i$. The feature value of $T_i$ is expressed as $T_i = (V_i, E_i, W_i)$, where $V_i$ represents the node set included in tree $T_i$, $E_i$ represents the edge set included in tree $T_i$, and $W_i$ represents the set of words included in tree $T_i$.
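A minimal Python rendering of the representation just described, covering node features $(s_i, w_i, p_i)$, edge features $(r_{ij})$ and the tree $T_i = (V_i, E_i, W_i)$, assuming plain dataclasses since the embodiment does not mandate a concrete data structure:

```python
from dataclasses import dataclass

@dataclass
class Node:
    idx: int   # layer-by-layer number, root = 1
    s: str     # semantic feature: the unit text
    w: str     # part-of-speech feature
    p: int     # position feature: depth in the tree

@dataclass
class Edge:
    i: int     # head node index
    j: int     # dependent node index
    r: str     # grammatical relation r_ij

@dataclass
class ParseTree:
    nodes: list[Node]   # V_i
    edges: list[Edge]   # E_i

    def words(self) -> list[str]:
        return [n.s for n in self.nodes]   # W_i

# T_1: the whole analysis tree for "The cat chased the mouse" (cf. FIG. 2).
tree = ParseTree(
    nodes=[Node(1, "chased", "VERB", 1), Node(2, "cat", "NOUN", 2),
           Node(3, "mouse", "NOUN", 2), Node(4, "The", "DET", 3), Node(5, "the", "DET", 3)],
    edges=[Edge(1, 2, "nsubj"), Edge(1, 3, "dobj"), Edge(2, 4, "det"), Edge(3, 5, "det")],
)
```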
Step 15: respectively calculating the evolution probability and the node depth probability of each node, and evolving the prompt word analysis tree according to the evolution probability and the node depth probability until the new prompt word analysis tree obtained by evolution meets the preset evolution termination condition, so as to obtain a final prompt word analysis tree.
The node depth probability is used for calculating the contribution of a node's depth to its evolution probability. Its effect is to add a depth-related weight to the evolution probability of each node, so as to reflect the node's importance in the prompt word structure and its context dependency.
Specifically, the evolution probability is calculated by the formula $P_e(v_i) = f(w_i) \cdot P_d(v_i)$, wherein $P_e(v_i)$ represents the evolution probability of the $i$-th node $v_i$, $f$ represents a mapping function from the part-of-speech feature $w_i$ of node $v_i$ to a base probability, $P_d(v_i)$ represents the node depth probability of the $i$-th node $v_i$, and $d_i$ represents the position depth of the $i$-th node $v_i$;
the calculation formula of the node depth probability is $P_d(v_i) = 1 - \frac{d_i}{D_{\max}}$, wherein $D_{\max}$ represents the maximum depth of the prompt word analysis tree. In this formula, nodes with smaller depth (closer to the root node) have higher probability weights, because they play a more important role in the prompt word structure and its context dependencies; conversely, nodes with greater depth (closer to the leaf nodes) have lower probability weights, because they have relatively less semantic impact on the overall sentence.
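A small worked sketch of these two probabilities, reusing the `ParseTree` node fields above; the part-of-speech weights standing in for the mapping function $f$ are assumed values, since the application does not disclose the concrete mapping:

```python
# Assumed part-of-speech weights for the mapping function f.
POS_WEIGHT = {"VERB": 0.9, "NOUN": 0.7, "ADJ": 0.5, "DET": 0.2}

def depth_probability(d_i: int, d_max: int) -> float:
    # P_d(v_i) = 1 - d_i / D_max: nodes closer to the root get higher weights.
    return 1.0 - d_i / d_max

def evolution_probability(pos: str, d_i: int, d_max: int) -> float:
    # P_e(v_i) = f(w_i) * P_d(v_i)
    return POS_WEIGHT.get(pos, 0.3) * depth_probability(d_i, d_max)

# For the example tree (D_max = 3): the root verb "chased" at depth 1 scores
# 0.9 * (1 - 1/3) = 0.6, while a determiner at the maximum depth scores 0.
print(evolution_probability("VERB", 1, 3))   # 0.6
```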
In the embodiment of the application, the evolution termination condition is that the evolution frequency is greater than or equal to a preset maximum evolution frequency.
The evolution process in the embodiment of the present application is exemplarily described below.
Illustratively, first, the maximum number of iterations and the number of prompt word analysis trees to generate are initialized.
Then, all nodes are traversed starting from the root node, and whether a node is to evolve is judged by the importance of the feature information it contains (its evolution probability and node depth probability). When the evolution probability of a node is greater than a preset evolution probability threshold, the node is determined to evolve; in an embodiment of the present application, the preset evolution probability threshold may be set to 0.8.
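A sketch of the traversal just described, reusing the definitions above; `rewrite_subtree` is a hypothetical hook standing in for the actual evolution operation, which this passage does not spell out:

```python
EVOLUTION_THRESHOLD = 0.8   # preset evolution probability threshold
MAX_ITERATIONS = 10         # assumed value for the preset maximum number of iterations

def evolve(tree: ParseTree, rewrite_subtree) -> ParseTree:
    for _ in range(MAX_ITERATIONS):              # termination: maximum number of evolutions
        d_max = max(n.p for n in tree.nodes)
        evolving = [n for n in tree.nodes        # traverse all nodes from the root
                    if evolution_probability(n.w, n.p, d_max) > EVOLUTION_THRESHOLD]
        for node in evolving:
            tree = rewrite_subtree(tree, node)   # hypothetical rewrite hook
    return tree
```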
Step 16: generating a plurality of new prompt words according to the final prompt word analysis tree, calculating the comprehensive score of each new prompt word based on the personalized corpus and the preset professional field corpus, and outputting the new prompt word corresponding to the highest comprehensive score to the user.
Specifically, the calculation expression of the comprehensive score is $S = S_{\mathrm{LM}}(T', T) + \sum_{u_j \in T'} I(u_j) \cdot \mathrm{sim}(u_j, C_p, C_d)$;
wherein $S$ represents the comprehensive score, $S_{\mathrm{LM}}(T', T)$ represents the large model score, i.e. the base score given by the large model by comparing the new prompt word $T'$ with the original prompt word $T$, $C_p$ represents the personalized corpus, $C_d$ represents the professional field corpus, $I(u_j)$ represents the importance score of unit text $u_j$, and $\mathrm{sim}(u_j, C_p, C_d)$ represents the cosine similarity score between unit text $u_j$ and the personalized corpus $C_p$ and the professional field corpus $C_d$.
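A sketch of the comprehensive score under the expression above, with a bag-of-words cosine similarity standing in for whatever representation the implementation actually uses (an assumption), and the large model base score passed in as a plain number:

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two bag-of-words vectors.
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def comprehensive_score(new_units, importance, corpus_p: Counter, corpus_d: Counter,
                        lm_base_score: float) -> float:
    # S = S_LM + sum_j I(u_j) * sim(u_j, C_p, C_d), with sim taken here as the
    # sum of the similarities to the two corpora.
    total = lm_base_score
    for u in new_units:
        sim = cosine(Counter([u]), corpus_p) + cosine(Counter([u]), corpus_d)
        total += importance.get(u, 0.0) * sim
    return total

corpus_p = Counter("the cat chased the mouse quickly".split())
corpus_d = Counter("feline predation behavior study".split())
score = comprehensive_score(["cat", "chased"], {"cat": 0.8, "chased": 1.0},
                            corpus_p, corpus_d, lm_base_score=0.5)
```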
According to the large language model prompt word generation method provided by the application, the prompt word analysis tree is evolved according to the evolution probability and the node depth probability to obtain the final prompt word analysis tree, so that the words with larger influence on the overall semantics can be retained, which improves the accuracy of the generated prompt words; the comprehensive score of each new prompt word is calculated based on the personalized corpus and the preset professional field corpus, combining the importance and relevance of the new prompt word in the personalized corpus and the professional field corpus, and the new prompt word with the highest comprehensive score is output, which can further improve the accuracy of the generated prompt words.
The large language model prompt word generating system provided by the application is exemplified below.
As shown in FIG. 3, the large language model prompt word generation system 300 includes:
The corpus construction module 301 is configured to construct a personalized corpus based on the user data collected in advance;
The prompt word segmentation module 302 is configured to obtain a prompt word input by a user and segment the prompt word, wherein the prompt word is segmented into at least one unit text;
The relation determining module 303 is configured to select a core word from at least one unit text as a root node, and determine the relations between the core word and other unit texts according to preset grammatical relations;
The analysis tree construction module 304 is configured to construct a prompt word analysis tree according to the root node and the relations, wherein the nodes of the prompt word analysis tree correspond one-to-one to the unit texts, and the edges of the prompt word analysis tree represent the relations among the nodes;
The analysis tree evolution module 305 is configured to calculate an evolution probability and a node depth probability of each node respectively, and evolve the analysis tree of the hint word according to the evolution probability and the node depth probability until a new analysis tree of the hint word obtained by evolution meets a preset evolution termination condition, so as to obtain a final analysis tree of the hint word;
The prompt word generation module 306 is configured to generate a plurality of new prompt words according to the final prompt word analysis tree, calculate the comprehensive score of each new prompt word based on the personalized corpus and the preset professional field corpus, and output the new prompt word corresponding to the highest comprehensive score to the user.
It should be noted that, because the content of information interaction and execution process between the above devices/units is based on the same concept as the method embodiment of the present application, specific functions and technical effects thereof may be referred to in the method embodiment section, and will not be described herein.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions. The functional units and modules in the embodiment may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit, where the integrated units may be implemented in a form of hardware or a form of a software functional unit. In addition, the specific names of the functional units and modules are only for distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working process of the units and modules in the above system may refer to the corresponding process in the foregoing method embodiment, which is not described herein again.
As shown in FIG. 4, an embodiment of the present application provides a terminal device. The terminal device D10 of this embodiment includes: at least one processor D100 (only one processor is shown in FIG. 4), a memory D101, and a computer program D102 stored in the memory D101 and executable on the at least one processor D100; the processor D100 implements the steps in any of the various method embodiments described above when executing the computer program D102.
Specifically, when the processor D100 executes the computer program D102, a personalized corpus is constructed based on pre-acquired user data; a prompt word input by the user is acquired and segmented; a core word is selected from at least one unit text as the root node, and the relations between the core word and the other unit texts are determined according to preset grammatical relations; a prompt word analysis tree is constructed according to the root node and the relations; the evolution probability and the node depth probability of each node are respectively calculated, and the prompt word analysis tree is evolved according to the evolution probability and the node depth probability until a new prompt word analysis tree obtained by evolution meets a preset evolution termination condition, so as to obtain a final prompt word analysis tree; a plurality of new prompt words are generated according to the final prompt word analysis tree, the comprehensive score of each new prompt word is calculated based on the personalized corpus and the preset professional field corpus, and the new prompt word corresponding to the highest comprehensive score is output to the user. Evolving the prompt word analysis tree according to the evolution probability and the node depth probability to obtain the final prompt word analysis tree retains the words with larger influence on the overall semantics and improves the accuracy of the generated prompt words; calculating the comprehensive score of each new prompt word based on the personalized corpus and the preset professional field corpus combines the importance and relevance of the new prompt word in both corpora, and outputting the new prompt word with the highest comprehensive score can further improve the accuracy of the generated prompt words.
The processor D100 may be a central processing unit (CPU), and may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory D101 may, in some embodiments, be an internal storage unit of the terminal device D10, for example a hard disk or a memory of the terminal device D10. The memory D101 may also, in other embodiments, be an external storage device of the terminal device D10, for example a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card provided on the terminal device D10. Further, the memory D101 may also include both an internal storage unit and an external storage device of the terminal device D10. The memory D101 is used for storing an operating system, an application program, a boot loader (BootLoader), data, and other programs, such as the program code of the computer program. The memory D101 may also be used to temporarily store data that has been output or is to be output.
Embodiments of the present application also provide a computer readable storage medium storing a computer program which, when executed by a processor, implements steps for implementing the various method embodiments described above.
Embodiments of the present application provide a computer program product enabling a terminal device to carry out the steps of the method embodiments described above when the computer program product is run on the terminal device.
The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer readable storage medium. Based on such understanding, the present application may implement all or part of the flow of the methods of the above embodiments by instructing the relevant hardware through a computer program, which may be stored in a computer readable storage medium; when executed by a processor, the computer program may implement the steps of each of the method embodiments described above. The computer program comprises computer program code, which may be in source code form, object code form, an executable file, some intermediate form, or the like. The computer readable medium may include at least: any entity or device capable of carrying the computer program code to the large language model prompt word generation system/terminal equipment, a recording medium, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunication signal, and a software distribution medium, such as a USB flash drive, a removable hard disk, a magnetic disk, or an optical disk. In some jurisdictions, computer readable media may not be electrical carrier signals and telecommunication signals in accordance with legislation and patent practice.
In the foregoing embodiments, the descriptions of the embodiments are emphasized, and in part, not described or illustrated in any particular embodiment, reference is made to the related descriptions of other embodiments.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/network device and method may be implemented in other manners. For example, the apparatus/network device embodiments described above are merely illustrative, e.g., the division of the modules or units is merely a logical functional division, and there may be additional divisions in actual implementation, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection via interfaces, devices or units, which may be in electrical, mechanical or other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
While the foregoing is directed to the preferred embodiments of the present application, it will be appreciated by those skilled in the art that various modifications and adaptations can be made without departing from the principles of the present application, and such modifications and adaptations are intended to be comprehended within the scope of the present application.
Claims (9)
1. A method for generating a large language model prompt word, comprising:
constructing a personalized corpus based on pre-acquired user data, wherein the personalized corpus is used for providing prompt words that better meet the requirements and preferences of a specific user;
acquiring a prompt word input by a user, and dividing the prompt word; wherein the prompt word is segmented into at least one unit text;
selecting a core word from the at least one unit text as a root node, and determining the relation between the core word and other unit texts according to a preset grammar relation;
constructing a prompt word analysis tree according to the root node and the relations, wherein the nodes of the prompt word analysis tree correspond one-to-one to the unit texts, and the edges of the prompt word analysis tree represent the relations among the nodes;
respectively calculating the evolution probability and the node depth probability of each node, and evolving the prompt word analysis tree according to the evolution probability and the node depth probability until a new prompt word analysis tree obtained by evolution meets a preset evolution termination condition, so as to obtain a final prompt word analysis tree, wherein the node depth probability is used for calculating the contribution of a node's depth to its evolution probability, which adds a depth-related weight to the evolution probability of the node so as to reflect the node's importance in the prompt word structure and its context dependency;
generating a plurality of new prompt words according to the final prompt word analysis tree, calculating the comprehensive score of each new prompt word based on the personalized corpus and a preset professional field corpus, and outputting the new prompt word corresponding to the highest comprehensive score to the user.
2. The large language model prompt word generation method of claim 1, wherein the user data includes historical input data, historical behavior data, user personal information, prompt word feedback information, prompt word interaction information, user preference information, and user portrait information.
3. The large language model prompt word generation method of claim 1, wherein the core word is a verb bearing the main-predicate relation;
the grammatical relations include a main-predicate relation, a verb-object relation, an indirect object relation, a subordinate main-predicate relation, a subordinate complement relation, an open complement relation, a passive main-predicate relation, an auxiliary verb relation, a series verb relation, a qualifier relation, an adjective modifier relation, a quantity modifier relation, an apposition relation, an idiom relation, a compound word relation, a marker relation, a preposition relation, a subordinate modifier relation, a parallel structure relation, a punctuation relation, a juxtaposition relation, and a flat modifier relation.
4. The large language model prompt word generation method according to claim 1, wherein the calculation formula of the evolution probability is $P_e(v_i) = f(w_i) \cdot P_d(v_i)$, wherein $P_e(v_i)$ represents the evolution probability of the $i$-th node $v_i$, $f$ represents a mapping function from the part-of-speech feature $w_i$ of node $v_i$ to a base probability, $P_d(v_i)$ represents the node depth probability of the $i$-th node $v_i$, and $d_i$ represents the position depth of the $i$-th node $v_i$;
the calculation formula of the node depth probability is $P_d(v_i) = 1 - \frac{d_i}{D_{\max}}$, wherein $D_{\max}$ represents the maximum depth of the prompt word analysis tree.
5. The large language model prompt word generation method of claim 1, wherein the calculation expression of the comprehensive score is $S = S_{\mathrm{LM}}(T', T) + \sum_{u_j \in T'} I(u_j) \cdot \mathrm{sim}(u_j, C_p, C_d)$;
wherein $S$ represents the comprehensive score, $S_{\mathrm{LM}}(T', T)$ represents the large model score, i.e. the base score given by the large model by comparing the new prompt word $T'$ with the original prompt word $T$, $C_p$ represents the personalized corpus, $C_d$ represents the professional field corpus, $I(u_j)$ represents the importance score of unit text $u_j$, and $\mathrm{sim}(u_j, C_p, C_d)$ represents the cosine similarity score between unit text $u_j$ and the personalized corpus $C_p$ and the professional field corpus $C_d$.
6. The large language model prompt word generation method according to claim 1, wherein the evolution termination condition is that the number of evolutions is equal to or greater than a preset maximum number of evolutions.
7. A large language model prompt word generation system, comprising:
The corpus construction module is used for constructing a personalized corpus based on pre-acquired user data;
The prompt word segmentation module is used for acquiring the prompt word input by the user and segmenting the prompt word, wherein the prompt word is segmented into at least one unit text;
The relation determining module is used for selecting a core word from the at least one unit text as a root node and determining the relation between the core word and other unit texts according to a preset grammar relation;
The analysis tree construction module is used for constructing a prompt word analysis tree according to the root node and the relation; the nodes of the prompt word analysis tree correspond to the unit texts one by one, and the edges of the prompt word analysis tree represent the relation among the nodes;
The analysis tree evolution module is used for respectively calculating the evolution probability and the node depth probability of each node, and evolving the prompt word analysis tree according to the evolution probability and the node depth probability until a new prompt word analysis tree obtained by evolution meets a preset evolution termination condition, so as to obtain a final prompt word analysis tree;
The prompt word generation module is used for generating a plurality of new prompt words according to the final prompt word analysis tree, calculating the comprehensive score of each new prompt word based on the personalized corpus and the preset professional field corpus, and outputting the new prompt word corresponding to the highest comprehensive score to a user.
8. A terminal device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the large language model prompt word generation method according to any one of claims 1 to 6.
9. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the large language model prompt word generation method according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410494748.5A | 2024-04-24 | 2024-04-24 | Large language model prompt word generation method, system, terminal equipment and medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN118093838A | 2024-05-28 |
CN118093838B | 2024-07-16 |
Family
ID=91155533
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410494748.5A (CN118093838B, active) | Large language model prompt word generation method, system, terminal equipment and medium | 2024-04-24 | 2024-04-24 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118093838B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140163962A1 (en) * | 2012-12-10 | 2014-06-12 | International Business Machines Corporation | Deep analysis of natural language questions for question answering system |
CN103914569A (en) * | 2014-04-24 | 2014-07-09 | 百度在线网络技术(北京)有限公司 | Input prompt method and device and dictionary tree model establishing method and device |
CN106294324A (en) * | 2016-08-11 | 2017-01-04 | 上海交通大学 | A kind of machine learning sentiment analysis device based on natural language parsing tree |
CN116012492A (en) * | 2022-12-13 | 2023-04-25 | 特赞(上海)信息科技有限公司 | Prompt word intelligent optimization method and system for character generation image |
CN116881415A (en) * | 2023-07-11 | 2023-10-13 | 严梓铭 | System and method for large language model second order prompt words for digital person intelligence |
CN117744753A (en) * | 2024-02-19 | 2024-03-22 | 浙江同花顺智能科技有限公司 | Method, device, equipment and medium for determining prompt word of large language model |
Also Published As
Publication number | Publication date |
---|---|
CN118093838B (en) | 2024-07-16 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |