CN117764180A - Illusion determination method and device and electronic equipment - Google Patents
Illusion determination method and device and electronic equipment
- Publication number
- CN117764180A CN117764180A CN202311846383.XA CN202311846383A CN117764180A CN 117764180 A CN117764180 A CN 117764180A CN 202311846383 A CN202311846383 A CN 202311846383A CN 117764180 A CN117764180 A CN 117764180A
- Authority
- CN
- China
- Prior art keywords
- response result
- language model
- prompt
- illusion
- statement
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The application relates to the technical field of artificial intelligence, and in particular to a hallucination determination method, apparatus, and electronic device. In the method, after a first instruction is sent to a language model, a response result is received from the language model, where the first instruction includes a question input by a user. After a second instruction is sent to the language model, a statement is received from the language model. The second instruction includes the response result and a preset Chain of Density (CoD) prompt, which guides the language model to extract the statement from the response result. Based on the statement, it is detected whether a hallucination exists in the response result, a hallucination being a response result that is inconsistent with fact or that does not correspond to the instruction. This scheme optimizes the hallucination determination process and verifies the correctness of the response result.
Description
Technical Field
The application relates to the technical field of artificial intelligence, and in particular to a hallucination determination method, apparatus, and electronic device.
Background
In recent years, language models have attracted extensive attention and research in the field of artificial intelligence, achieved remarkable breakthroughs, and been widely applied across industries. However, language models have only limited access to vertical-domain data in fields such as medicine, law, and electric power. As a result, they lack expertise in these vertical domains, and their response results contain a certain number of hallucination errors.
Meanwhile, large-scale language models such as ChatGPT have been deployed in practical applications, with parameter scales reaching the billion level. Models of this scale are difficult to deploy and incur high computational cost, and hallucinations in the response results severely hamper the application of language models in real-world scenarios. How to determine whether a hallucination exists in the response result of a language model and how to verify the correctness of the response result is therefore an urgent problem to be solved.
Disclosure of Invention
The embodiments of the application provide a hallucination determination method, apparatus, and electronic device, which are used to optimize the hallucination determination process and verify the correctness of the response result.
In a first aspect, embodiments of the present application provide a hallucination determination method, including:
after a first instruction is sent to a language model, receiving a response result from the language model, wherein the first instruction includes a question input by a user;
after a second instruction is sent to the language model, receiving a statement from the language model, wherein the second instruction includes the response result and a preset Chain of Density (CoD) prompt, and the CoD prompt is used to guide the language model to extract the statement from the response result;
detecting, based on the statement, whether a hallucination exists in the response result, the hallucination being a response result that is inconsistent with fact or that does not correspond to the instruction.
Optionally, the CoD prompt includes one or more of the following:
an entity recognition prompt, which is a prompt for recognizing a plurality of entities in the response result;
a text length prompt, which is a prompt for determining the text length of the statement;
an iterative optimization prompt, which is a prompt for setting the number of iterative optimization rounds;
an entity fusion prompt, which is a prompt for setting fusion rules among the plurality of entities.
Optionally, the statement is one of the following types:
a fact, a code snippet, a mathematical expression, or a literature reference.
Optionally, the text length of the statement is less than or equal to the text length of the response result.
Optionally, the CoD prompt may further include a statement output prompt for instructing the language model to output statements of different text types.
Optionally, detecting, based on the statement, whether a hallucination exists in the response result specifically includes:
determining, based on the statement, whether a hallucination exists in the response result by using the hallucination evaluation tool FacTool.
In a second aspect, embodiments of the present application further provide a hallucination determination apparatus, including:
a transceiver module, configured to receive a response result from a language model after sending a first instruction to the language model, wherein the first instruction includes a question input by a user;
the transceiver module being further configured to receive a statement from the language model after sending a second instruction to the language model, wherein the second instruction includes the response result and a preset Chain of Density (CoD) prompt, and the CoD prompt is used to guide the language model to extract the statement from the response result;
a detection module, configured to detect, based on the statement, whether a hallucination exists in the response result, the hallucination being a response result that is inconsistent with fact or that does not correspond to the instruction.
In a third aspect, an embodiment of the present application further provides an electronic device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the computer program, when executed by the processor, causes the processor to implement any one of the hallucination determination methods of the first aspect.
In a fourth aspect, embodiments of the present application further provide a computer-readable storage medium storing a computer program, where the computer program, when executed by a processor, implements any one of the hallucination determination methods of the first aspect.
In a fifth aspect, embodiments of the present application further provide a computer program product including a computer program, where the computer program, when executed by a processor, implements any one of the hallucination determination methods of the first aspect.
For the technical effects of any implementation of the second aspect to the fifth aspect, reference may be made to the technical effects of the corresponding implementation of the first aspect, which are not repeated here.
Drawings
Fig. 1 is a schematic diagram of an application scenario of a hallucination determination method provided in an embodiment of the present application;
Fig. 2 is a flowchart of a hallucination determination method provided in an embodiment of the present application;
Fig. 3 is a schematic structural diagram of a hallucination determination apparatus provided in an embodiment of the present application;
Fig. 4 is a schematic structural diagram of an electronic device provided in an embodiment of the present application.
Detailed Description
In the following, some terms used in the embodiments of the present application are explained to facilitate understanding by those skilled in the art.
1. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist. For example, "A and/or B" may indicate: A exists alone, A and B exist together, or B exists alone. The character "/" generally indicates that the associated objects are in an "or" relationship.
2. A language model is a deep learning model trained on a large amount of text data; it can generate natural language text or understand the meaning of language text. Language models can handle various natural language tasks, such as text classification, question answering, and dialogue, and are an important path toward artificial intelligence.
3. FacTool is an important technical framework for preventing language model hallucinations and is mainly divided into five core modules: claim extraction, query generation, tool querying, evidence collection, and consistency verification. Specifically, claim extraction extracts claims from responses under various task settings; query generation generates queries used to obtain evidence; tool querying obtains evidence using external tools; evidence collection gathers the evidence used to verify the correctness of a claim; and consistency verification verifies the correctness of the claim using multiple tools.
4. The concept of Chain of Thought (CoT) was first proposed in Google's paper "Chain-of-Thought Prompting Elicits Reasoning in Large Language Models". CoT is an improved prompting strategy for improving LLM performance on complex reasoning tasks such as arithmetic reasoning, commonsense reasoning, and symbolic reasoning.
CoT prompting includes two main types:
1. Few-shot CoT. The model is prompted with several demonstrations, each containing a high-quality reasoning chain written manually (or generated by a model).
2. Zero-shot CoT. The natural language sentence "Let's think step by step" is used to explicitly encourage the model to first generate the reasoning chain, after which the prompt "Therefore, the answer is" is used to generate the answer. A similar phrase is "Let's work this out in a step by step way to be sure we have the right answer". A minimal sketch of this two-stage prompting is given after this list of terms.
5. An entity, in the sense of the essence of a thing, refers to a concrete thing, an individual subject, or the bearer of a phenomenon; it generally refers to that which can exist independently, serving as the basis of all attributes and the origin of all things.
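As referenced in term 4 above, the two-stage zero-shot CoT pattern can be illustrated with the following minimal Python sketch. The `llm` callable is a placeholder for any language-model client and is not part of the embodiments.

```python
from typing import Callable

LLM = Callable[[str], str]  # placeholder: any client that maps a prompt to model text


def zero_shot_cot(question: str, llm: LLM) -> str:
    """Two-stage zero-shot CoT: first elicit the reasoning chain,
    then append "Therefore, the answer is" to extract the final answer."""
    reasoning = llm(f"{question}\nLet's think step by step.")
    answer = llm(
        f"{question}\nLet's think step by step.\n{reasoning}\n"
        "Therefore, the answer is"
    )
    return answer
```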
For the purposes of making the objects, technical solutions, and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings. It is apparent that the described embodiments are only some, but not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the present disclosure without inventive effort fall within the scope of protection of the present disclosure.
The application scenarios described in the embodiments of the present application are intended to describe the technical solutions of the embodiments more clearly and do not constitute a limitation on them. A person of ordinary skill in the art will appreciate that, as new application scenarios emerge, the technical solutions provided in the embodiments of the present application are equally applicable to similar technical problems. In the description of the present application, unless otherwise indicated, "a plurality" means two or more.
In recent years, language models have attracted extensive attention and research in the field of artificial intelligence, achieved remarkable breakthroughs, and been widely applied across industries. However, language models have only limited access to vertical-domain data in fields such as medicine, law, and electric power. As a result, they lack expertise in these vertical domains, and their response results contain a certain number of hallucination errors.
Meanwhile, large-scale language models such as ChatGPT have been deployed in practical applications, with parameter scales reaching the billion level. Models of this scale are difficult to deploy and incur high computational cost, and hallucinations in the response results severely hamper the application of language models in real-world scenarios. How to determine whether a hallucination exists in the response result of a language model and how to verify the correctness of the response result is therefore an urgent problem to be solved.
To solve the above problems, embodiments of the present application provide a hallucination determination method, apparatus, and electronic device. For example, after a first instruction is sent to the language model, a response result is received from the language model, where the first instruction includes the question input by the user. After a second instruction is sent to the language model, a statement is received from the language model. The second instruction includes the response result and a preset Chain of Density (CoD) prompt, which guides the language model to extract the statement from the response result. Based on the statement, it is detected whether a hallucination exists in the response result, where a hallucination is a response result that is inconsistent with fact or that does not correspond to the instruction.
Therefore, the present application guides the language model with a CoD prompt to extract the statement from the response result, which makes it easier to detect accurately, based on the statement, whether the response result contains a hallucination. In the prior art, a CoT prompt is generally used to guide the language model to extract the statement from the response result. Statements generated with CoT prompts are often fragmented rather than complete sentences, whereas CoD prompts improve coherence compared with CoT prompts. The statement extracted from the response result under CoD guidance is therefore a complete sentence that better matches the real answer and is more accurate, which helps the subsequent comparison and verification of whether the response result contains a hallucination and reduces the risk caused by hallucinations in the response result.
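The overall flow can be summarized in the following minimal Python sketch. The `llm` and `check_claims` callables are illustrative placeholders (any language-model client and any FacTool-style verifier), not interfaces defined by the embodiments.

```python
from typing import Callable, List

LLM = Callable[[str], str]  # placeholder: any language-model client


def determine_hallucination(
    question: str,
    cod_prompt: str,                            # preset Chain of Density prompt template
    llm: LLM,
    check_claims: Callable[[List[str]], bool],  # FacTool-style verifier: True if every statement holds
) -> bool:
    """Return True if the model's response result is judged to contain a hallucination."""
    # First instruction: only the user's question.
    response = llm(question)
    # Second instruction: the response result plus the CoD prompt, which guides the
    # model to extract dense, self-contained statements from its own answer.
    statements_text = llm(cod_prompt.format(query=question, input=response))
    statements = [line.strip() for line in statements_text.splitlines() if line.strip()]
    # A hallucination exists if any extracted statement fails verification.
    return not check_claims(statements)
```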
As shown in Fig. 1, a schematic diagram of an application scenario of the hallucination determination method in the embodiment of the present application includes a server 100 and a terminal 101, where the server 100 and the terminal 101 may be communicatively connected to each other through a network to implement the hallucination determination method of the present application.
A user may use the terminal 101 to interact with the server 100 through a network, for example to receive or send messages. Various client applications, such as programming applications, web browser applications, and search applications, may be installed on the terminal 101.
It can be appreciated that, in the embodiment of the present application, the server 100 may be implemented as a stand-alone server or as a server cluster formed by a plurality of servers. The terminal 101 may be any of various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, and desktop computers.
As shown in Fig. 2, the flowchart of the hallucination determination method provided in an embodiment of the present application may specifically include the following steps.
Step S201: after sending a first instruction to the language model, receive a response result from the language model.
The first instruction includes a question input by the user.
For example, the user may input the question "What are the indicators of 'Five Implementations and Five In-Place'?". The response result received from the language model is: "The requirement that the Party and the government share responsibility must be implemented, with the chairman and the general manager jointly bearing leadership responsibility for the enterprise's work safety; the 'one post, dual responsibilities' requirement for work safety must be implemented, with all leadership team members bearing corresponding responsibility for work safety within their scope of management; a work safety leading organization must be implemented, with a work safety committee established and chaired by the chairman or the general manager; the safety management force must be put in place, with a work safety management body established in accordance with the law and professional safety management personnel such as registered safety engineers fully staffed; a work safety reporting system must be implemented, with work safety conditions reported regularly to the board of directors and the performance assessment department and disclosed to the public. Safety responsibility, safety investment, safety training, safety management, and emergency rescue must all be in place."
Step S202: after sending a second instruction to the language model, receive a statement from the language model.
The second instruction includes the response result and a preset Chain of Density (CoD) prompt. The CoD prompt is used to guide the language model to extract the statement from the response result.
The CoD prompt includes one or more of the following (a sketch combining these components is given below):
an entity recognition prompt, which is a prompt for recognizing a plurality of entities in the response result;
a text length prompt, which is a prompt for determining the text length of the statement;
an iterative optimization prompt, which is a prompt for setting the number of iterative optimization rounds;
an entity fusion prompt, which is a prompt for setting fusion rules among the plurality of entities.
Optionally, the CoD prompt may further include a statement output prompt. The statement output prompt is used to instruct the language model to output statements of different text types.
It is understood that the text length of the statement is less than or equal to the text length of the response result.
Optionally, the statement is one of the following types:
a fact, a code snippet, a mathematical expression, or a literature reference.
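As referenced above, the optional CoD prompt components can be combined with the response result to form the second instruction. The following Python sketch is one illustrative way to do so; the field names and prompt wording are assumptions made for illustration, not fixed by the embodiments.

```python
from dataclasses import dataclass


@dataclass
class CoDPromptConfig:
    """Illustrative container for the optional CoD prompt components."""
    entity_recognition: str = "Identify 1-3 informative entities missing from the previous statements."
    text_length: str = "Each statement must be no longer than the response result (<= 50 words each)."
    iterations: int = 5  # number of iterative-optimization rounds
    entity_fusion: str = "Fuse and compress entities; never drop an entity already covered."
    statement_output: str = "Output statements as facts, code snippets, math expressions, or references."


def build_second_instruction(response_result: str, cfg: CoDPromptConfig) -> str:
    """Combine the response result with the CoD prompt components into one instruction."""
    return (
        f"Repeat the following 2 steps {cfg.iterations} times.\n"
        f"Step 1: {cfg.entity_recognition}\n"
        "Step 2: Rewrite a denser set of statements of the same length covering all entities.\n"
        f"{cfg.entity_fusion}\n{cfg.text_length}\n{cfg.statement_output}\n"
        f"[Text] {response_result}"
    )
```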
For example, assume the user input question is "What are the indicators of 'Five Implementations and Five In-Place'?". The response result received from the language model is: "The requirement that the Party and the government share responsibility must be implemented, with the chairman and the general manager jointly bearing leadership responsibility for the enterprise's work safety; the 'one post, dual responsibilities' requirement for work safety must be implemented, with all leadership team members bearing corresponding responsibility for work safety within their scope of management; a work safety leading organization must be implemented, with a work safety committee established and chaired by the chairman or the general manager; the safety management force must be put in place, with a work safety management body established in accordance with the law and professional safety management personnel such as registered safety engineers fully staffed; a work safety reporting system must be implemented, with work safety conditions reported regularly to the board of directors and the performance assessment department and disclosed to the public. Safety responsibility, safety investment, safety training, safety management, and emergency rescue must all be in place."
Assume the CoD prompt is: "You need to process a piece of text containing knowledge claims according to a question. A claim is a statement asserting something true or false that can be verified by a human. Your task is to understand the text at the word level while deeply understanding the text and the question, accurately recognize and extract each claim in the provided text, and output the claims according to their true logic. Then, for clarity, resolve any references or other expressions in the claims; where a subject is missing, find the correct subject from the question and the text and add it. Each claim should be compact (fewer than 50 words) and self-contained.
Your answer must be a list of dictionaries. Each dictionary should contain a key "claim" corresponding to an extracted claim (with all references resolved). Split sentences at enumeration commas accurately and without omission, and recombine sentences that have a containment relation accurately and without omission.
Repeat the following 2 steps 5 times.
Step 1: identify 1-3 informative entities from the text (separated by ";") that are missing from the previously generated claims.
Step 2: write a new, denser set of claims of the same length that covers every entity and detail in the previous claims as well as the missing entities.
A missing entity is:
Relevant: related to the main story.
Specific: descriptive yet concise (5 words or fewer).
Novel: not in the previous claims.
Faithful: present in the text.
Anywhere: may appear anywhere in the text.
The guidelines are as follows:
The first set of claims should be long (8-10 sentences) yet highly non-specific, containing little information beyond the entities marked as missing. Use overly verbose language and filler phrases (e.g., "this text discusses") to reach about 80 words.
Make every word count: rewrite the previous claims to improve fluency and make room for additional entities.
Make space by fusing, compressing, and removing uninformative phrases such as "this text discusses".
Claims should become highly dense and concise yet self-contained, i.e., easily understood without the original text.
A missing entity may appear anywhere in the new claims.
Never delete an entity from the previous claims. If space cannot be made, add fewer new entities. Remember to use the same number of words for each set of claims.
Now complete the following:
[Question] {query}
[Text] {input}
[Answer] Please output the best set among the several sets of answers above."
The server sends the response result and the preset CoD prompt to the language model as the second instruction, and the statement received is:
1. "Five Implementations and Five In-Place" means that the requirement that the Party and the government share responsibility must be implemented, with the chairman, the Party organization secretary, and the general manager jointly bearing leadership responsibility for the enterprise's work safety.
2. The "one post, dual responsibilities" requirement for work safety must be implemented, with all leadership team members bearing corresponding responsibility for work safety within their scope of management.
3. A work safety leading organization must be implemented, with a work safety committee established and chaired by the chairman or the general manager.
4. The safety management force must be put in place, with a work safety management body established in accordance with the law and professional safety management personnel such as registered safety engineers fully staffed.
5. A work safety reporting system must be implemented, with work safety conditions reported regularly to the board of directors and the performance assessment department and disclosed to the public.
6. Safety responsibility, safety investment, safety training, safety management, and emergency rescue must all be in place.
In the prior art, a CoT prompt is often used to instruct the language model to extract statements from the response result. For example, the CoT prompt may be: "You need to process a piece of text containing knowledge claims according to a question. A claim is a statement asserting something true or false that can be verified by a human. Your task is to understand the text at the word level while deeply understanding the text and the question, accurately recognize and extract each claim in the provided text, and output the claims according to their true logic. Then, for clarity, resolve any references or other expressions in the claims; where a subject is missing, find the correct subject from the question and the text and add it. Each claim should be compact (fewer than 50 words) and self-contained.
Your answer must be a list of dictionaries. Each dictionary should contain a key "claim" corresponding to an extracted claim (with all references resolved). Split sentences at enumeration commas accurately and without omission, and recombine sentences that have a containment relation accurately and without omission. The example will help with claim generation.
You must answer only in the format described below and answer nothing else. Adding any other description that violates the answer format is prohibited. Start your answer with "[".
Answer format:
[
{{
"claim": Ensure each claim contains fewer than 50 words and conveys a complete idea. For clarity, resolve any references (pronouns or other indicative expressions) in the claim, take care to supplement temporal expressions, and split sentences at enumeration commas accurately and without omission,
}},
...
]"
The server sends the response result and the preset CoT prompt to the language model as the second instruction, and the statement received is:
1. "Five Implementations and Five In-Place" means that the requirement that the Party and the government share responsibility must be implemented, with the chairman, the Party organization secretary, and the general manager jointly bearing leadership responsibility for the enterprise's work safety.
2. The "one post, dual responsibilities" requirement for work safety must be implemented.
3. All leadership team members bear corresponding responsibility for work safety within their scope of management.
4. A work safety leading organization must be implemented.
5. A work safety committee is established.
6. The committee is chaired by the chairman or the general manager.
7. The safety management force must be put in place.
8. A work safety management body is established in accordance with the law.
9. Professional safety management personnel such as registered safety engineers are fully staffed.
10. A work safety reporting system must be implemented.
11. Work safety conditions are reported regularly to the board of directors and the performance assessment department.
12. Work safety conditions are disclosed to the public.
13. Safety responsibility must be in place.
14. Safety investment must be in place.
15. Safety training must be in place.
16. Safety management must be in place.
17. Emergency rescue must be in place.
Therefore, compared with the prior-art approach of using a CoT prompt to guide the language model to extract statements from the response result, the present application uses a CoD prompt. Under the constraint that the statement is no longer than the response result, the sparse statements initially generated by the language model can be iteratively optimized, gradually adding missing important entities and producing statements with better coherence. This makes it easier to determine accurately, based on the statements, whether a hallucination exists in the response result.
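In the embodiments, this iterative optimization is carried out inside a single CoD prompt; purely for illustration, the loop can also be made explicit outside the prompt, as in the following sketch. The `llm` callable and the prompt wording are assumptions, not part of the embodiments.

```python
from typing import Callable, List

LLM = Callable[[str], str]  # placeholder: any language-model client


def densify_statements(response_result: str, llm: LLM, rounds: int = 5) -> List[str]:
    """Iteratively densify extracted statements: each round adds 1-3 missing
    informative entities without exceeding the previous length."""
    statements = llm(f"Extract self-contained statements (one per line) from:\n{response_result}")
    for _ in range(rounds):
        statements = llm(
            "Identify 1-3 informative entities from the text that are missing from the "
            "statements, then rewrite a denser set of statements of the same length, "
            "one per line, covering all previous entities plus the missing ones.\n"
            f"[Text] {response_result}\n[Statements] {statements}"
        )
    return [line.strip() for line in statements.splitlines() if line.strip()]
```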
Step S203: detect, based on the statement, whether a hallucination exists in the response result.
A hallucination is a response result that is inconsistent with fact or that does not correspond to the instruction.
In an alternative embodiment, the hallucination evaluation tool FacTool may be employed to determine, based on the statement, whether a hallucination exists in the response result.
FacTool is an important technical framework for preventing hallucinations in large models and is mainly divided into five core modules: claim extraction, query generation, tool querying, evidence collection, and consistency verification. Specifically, claim extraction extracts claims from responses under various task settings; query generation generates queries used to obtain evidence; tool querying obtains evidence using external tools; evidence collection gathers the evidence used to verify the correctness of a claim; and consistency verification verifies the correctness of the claim using multiple tools.
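A minimal Python sketch of this five-module pipeline is shown below. It is a structural illustration only: the prompts and the `llm`/`tool` callables are placeholders, not FacTool's actual API.

```python
from typing import Callable, List

LLM = Callable[[str], str]   # placeholder: any language-model client
Tool = Callable[[str], str]  # placeholder: search engine / code interpreter wrapper


def extract_claims(response: str, llm: LLM) -> List[str]:
    """Module 1 - claim extraction: pull verifiable claims out of the response."""
    text = llm(f"List, one per line, every verifiable claim in:\n{response}")
    return [line.strip() for line in text.splitlines() if line.strip()]


def generate_queries(claim: str, llm: LLM) -> List[str]:
    """Module 2 - query generation: turn a claim into tool queries."""
    text = llm(f"Write queries (one per line) that would help verify:\n{claim}")
    return [line.strip() for line in text.splitlines() if line.strip()]


def collect_evidence(queries: List[str], tool: Tool) -> List[str]:
    """Modules 3 and 4 - tool querying and evidence collection."""
    return [tool(q) for q in queries]


def claim_is_supported(claim: str, evidence: List[str], llm: LLM) -> bool:
    """Module 5 - consistency verification: label the claim true or false."""
    verdict = llm(f"Claim: {claim}\nEvidence: {evidence}\nIs the claim supported? Answer yes or no.")
    return verdict.strip().lower().startswith("yes")


def response_has_hallucination(response: str, llm: LLM, tool: Tool) -> bool:
    """The response is flagged as hallucinated if any extracted claim is unsupported."""
    for claim in extract_claims(response, llm):
        evidence = collect_evidence(generate_queries(claim, llm), tool)
        if not claim_is_supported(claim, evidence, llm):
            return True
    return False
```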
For example, after claim extraction is completed, each claim may be converted into a list of queries for querying external tools such as a search engine, a Python interpreter, or Google Scholar. In particular:
For a knowledge-based question-answering scenario, the language model is prompted to generate multiple search engine queries for each claim; these queries are intended to help humans verify the authenticity of the claims. For a mathematical problem scenario, the language model is prompted to translate all mathematical operations into executable Python code snippets; these snippets are intended to return "true" when the computed result is consistent with the claimed answer and "false" when it is inconsistent.
After query generation is completed, the queries are used to query the various tools and collect relevant evidence statements.
For example, for a mathematical problem scenario, the execution results of the code snippets are collected: the language model extracts a mathematical claim such as "30/3=10" and converts it into executable Python code, for example "print(round(30/3, 7) == 10)".
After the relevant evidence statements are collected, a consistency verification step assigns each claim a factuality label according to how well the collected evidence supports it. The factuality label represents whether the claim is true or false.
For example, for a mathematical problem scenario, the execution results of all code snippets are compiled. If any code snippet returns "false", the related claim is labeled false; otherwise, if all code snippets return "true", the corresponding claim is labeled true.
If all claims are true, it is determined that no hallucination exists in the response result. If any claim is false, it is determined that a hallucination exists in the response result.
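For the mathematical scenario just described, the verification and labeling steps can be illustrated with the following sketch. In FacTool the check snippet is generated by the language model itself; here the conversion is written out directly, and the claim strings are hypothetical examples.

```python
from typing import List


def verify_math_claim(claim: str) -> bool:
    """Turn a claim of the form 'expression=expected' into an executable check
    and return its factuality label (True/False)."""
    expression, expected = claim.split("=", 1)
    # eval is acceptable here only because the expression comes from our own
    # extraction step in this sketch, not from untrusted input.
    return round(eval(expression), 7) == float(expected)


def response_label(claims: List[str]) -> str:
    """No hallucination only if every claim is labeled true; otherwise hallucination."""
    return "no hallucination" if all(verify_math_claim(c) for c in claims) else "hallucination"


print(verify_math_claim("30/3=10"))            # True  -> labeled "true"
print(response_label(["30/3=10", "15*2=31"]))  # "hallucination" (15*2=31 is false)
```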
According to the method and apparatus of the present application, whether a hallucination exists in the response result is determined based on the statement by using the hallucination evaluation tool FacTool. Hallucinations in the response result can thus be identified in a timely manner, at low cost and with high efficiency, which effectively improves the user experience.
Fig. 3 is a schematic structural diagram of a hallucination determination apparatus provided in an embodiment of the present application. As shown in Fig. 3, the apparatus includes a transceiver module 301 and a detection module 302.
The transceiver module 301 is configured to receive a response result from a language model after sending a first instruction to the language model, where the first instruction includes a question input by a user.
The transceiver module 301 is further configured to receive a statement from the language model after sending a second instruction to the language model, where the second instruction includes the response result and a preset Chain of Density (CoD) prompt, and the CoD prompt is used to guide the language model to extract the statement from the response result.
The detection module 302 is configured to detect, based on the statement, whether a hallucination exists in the response result, the hallucination being a response result that is inconsistent with fact or that does not correspond to the instruction.
Optionally, the CoD prompt includes one or more of the following:
an entity recognition prompt, which is a prompt for recognizing a plurality of entities in the response result;
a text length prompt, which is a prompt for determining the text length of the statement;
an iterative optimization prompt, which is a prompt for setting the number of iterative optimization rounds;
an entity fusion prompt, which is a prompt for setting fusion rules among the plurality of entities.
Optionally, the statement is one of the following types:
a fact, a code snippet, a mathematical expression, or a literature reference.
Optionally, the text length of the statement is less than or equal to the text length of the response result.
Optionally, the CoD prompt may further include a statement output prompt for instructing the language model to output statements of different text types.
Optionally, the detection module 302 is specifically configured to, when detecting, based on the statement, whether a hallucination exists in the response result:
determine, based on the statement, whether a hallucination exists in the response result by using the hallucination evaluation tool FacTool.
Based on the same technical concept, an embodiment of the present application further provides an electronic device, which can implement the functions of the hallucination determination apparatus described above.
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
The electronic device includes at least one processor 401 and a memory 402 connected to the at least one processor 401. The specific connection medium between the processor 401 and the memory 402 is not limited in this embodiment of the present application; in Fig. 4, the processor 401 and the memory 402 are connected by a bus 400 as an example. The bus 400 is shown as a thick line in Fig. 4, and the manner in which the other components are connected is merely illustrative and not limiting. The bus 400 may be divided into an address bus, a data bus, a control bus, and the like; for ease of illustration it is represented by only one thick line in Fig. 4, but this does not mean that there is only one bus or one type of bus. Alternatively, the processor 401 may also be referred to as a controller, and the name is not limiting.
In this embodiment of the present application, the memory 402 stores instructions executable by the at least one processor 401, and the at least one processor 401 can perform the hallucination determination method described above by executing the instructions stored in the memory 402. The processor 401 can implement the functions of the modules in the apparatus shown in Fig. 3.
The processor 401 is the control center of the apparatus; it can connect the various parts of the entire control device through various interfaces and lines, and performs the various functions of the apparatus and processes data by running or executing the instructions stored in the memory 402 and invoking the data stored in the memory 402, thereby monitoring the apparatus as a whole.
In one possible design, the processor 401 may include one or more processing units, and the processor 401 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It can be understood that the modem processor may alternatively not be integrated into the processor 401. In some embodiments, the processor 401 and the memory 402 may be implemented on the same chip; in other embodiments, they may be implemented separately on independent chips.
The processor 401 may be a general-purpose processor, such as a central processing unit (CPU), a digital signal processor, an application-specific integrated circuit, a field-programmable gate array or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logical block diagrams disclosed in the embodiments of the present application. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the hallucination determination method disclosed in connection with the embodiments of the present application may be performed directly by a hardware processor or by a combination of hardware and software modules in the processor.
The memory 402, as a non-volatile computer-readable storage medium, can be used to store non-volatile software programs, non-volatile computer-executable programs, and modules. The memory 402 may include at least one type of storage medium, for example, a flash memory, a hard disk, a multimedia card, a card-type memory, a random access memory (RAM), a static random access memory (SRAM), a programmable read-only memory (PROM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a magnetic memory, a magnetic disk, or an optical disc. The memory 402 is any medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory 402 in this embodiment may also be a circuit or any other apparatus capable of implementing a storage function, and is used to store program instructions and/or data.
By programming the processor 401, the code corresponding to the hallucination determination method described in the foregoing embodiments can be solidified into the chip, so that the chip can perform the hallucination determination method of the embodiment shown in Fig. 2 at runtime. How to design and program the processor 401 is well known to those skilled in the art and is not described in detail here.
It should be noted that the electronic device provided in this embodiment of the present application can implement all the method steps of the foregoing method embodiments and achieve the same technical effects; the parts and beneficial effects that are the same as those of the method embodiments are not described in detail here.
An embodiment of the present application further provides a computer-readable storage medium storing computer-executable instructions, where the computer-executable instructions are used to cause a computer to perform the hallucination determination method of the foregoing embodiments.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart block or blocks and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made in the present application without departing from the spirit or scope of the application. Thus, if such modifications and variations of the present application fall within the scope of the claims and the equivalents thereof, the present application is intended to cover such modifications and variations.
Claims (10)
1. A hallucination determination method, the method comprising:
after a first instruction is sent to a language model, receiving a response result from the language model, wherein the first instruction comprises a question input by a user;
after a second instruction is sent to the language model, receiving a statement from the language model, wherein the second instruction comprises the response result and a preset Chain of Density (CoD) prompt, and the CoD prompt is used to guide the language model to extract the statement from the response result;
detecting, based on the statement, whether a hallucination exists in the response result, the hallucination being a response result inconsistent with fact or a response result not corresponding to the instruction.
2. The method of claim 1, wherein the CoD prompt comprises one or more of the following:
an entity recognition prompt, the entity recognition prompt being a prompt for recognizing a plurality of entities in the response result;
a text length prompt, the text length prompt being a prompt for determining a text length of the statement;
an iterative optimization prompt, the iterative optimization prompt being a prompt for setting a number of iterative optimization rounds;
an entity fusion prompt, the entity fusion prompt being a prompt for setting fusion rules among the plurality of entities.
3. The method of claim 1, wherein the statement is one of the following types:
a fact, a code snippet, a mathematical expression, or a literature reference.
4. The method according to any one of claims 1 to 3, wherein the text length of the statement is less than or equal to the text length of the response result.
5. The method according to any one of claims 1 to 3, wherein the CoD prompt further comprises a statement output prompt for instructing the language model to output statements of different text types.
6. The method according to claim 1, wherein detecting, based on the statement, whether a hallucination exists in the response result specifically comprises:
determining, based on the statement, whether a hallucination exists in the response result by using the hallucination evaluation tool FacTool.
7. A hallucination determination apparatus, comprising:
a transceiver module, configured to receive a response result from a language model after sending a first instruction to the language model, wherein the first instruction comprises a question input by a user;
the transceiver module being further configured to receive a statement from the language model after sending a second instruction to the language model, wherein the second instruction comprises the response result and a preset Chain of Density (CoD) prompt, and the CoD prompt is used to guide the language model to extract the statement from the response result;
a detection module, configured to detect, based on the statement, whether a hallucination exists in the response result, the hallucination being a response result inconsistent with fact or a response result not corresponding to the instruction.
8. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable by the processor, characterized in that the processor implements the steps of the method according to any one of claims 1-6 when executing the computer program.
9. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the method according to any one of claims 1-6.
10. A computer program product, characterized in that the computer program product, when called by a computer, causes the computer to perform the steps of the method according to any of claims 1-6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311846383.XA CN117764180A (en) | 2023-12-28 | 2023-12-28 | Illusion determination method and device and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311846383.XA CN117764180A (en) | 2023-12-28 | 2023-12-28 | Illusion determination method and device and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117764180A true CN117764180A (en) | 2024-03-26 |
Family
ID=90312429
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311846383.XA Pending CN117764180A (en) | 2023-12-28 | 2023-12-28 | Illusion determination method and device and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117764180A (en) |
- 2023-12-28: CN application CN202311846383.XA filed, published as CN117764180A, status pending
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |