CN117744662A - Method, device, electronic equipment and medium for processing prompt information - Google Patents


Info

Publication number
CN117744662A
CN117744662A
Authority
CN
China
Prior art keywords
prompt, prompt information, information, words, target
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311627941.3A
Other languages
Chinese (zh)
Inventor
高奇乐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial and Commercial Bank of China Ltd ICBC
Original Assignee
Industrial and Commercial Bank of China Ltd ICBC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial and Commercial Bank of China Ltd ICBC filed Critical Industrial and Commercial Bank of China Ltd ICBC
Priority to CN202311627941.3A priority Critical patent/CN117744662A/en
Publication of CN117744662A publication Critical patent/CN117744662A/en
Pending legal-status Critical Current

Abstract

The present disclosure provides a method for processing prompt information, which can be applied to the technical field of natural language processing. The method comprises the following steps: performing semantic analysis on prompt information to be processed to obtain a semantic structure graph, wherein the semantic structure graph comprises nodes related to prompt words in the prompt information and edge relations between the nodes, and the edge relations represent semantic relations between the prompt words; determining intermediate prompt words from the prompt information according to the edge relations in the semantic structure graph; and generating target prompt information according to the intermediate prompt words and the intermediate edge relations associated with the intermediate prompt words, wherein the target prompt information is suitable for input to a pre-trained large language model to generate target text. The present disclosure also provides an apparatus, a device, a storage medium, and a program product for processing prompt information.

Description

Method, device, electronic equipment and medium for processing prompt information
Technical Field
The present disclosure relates to the field of natural language processing, and in particular to a method, an apparatus, a device, a medium, and a program product for processing prompt information.
Background
With the rapid development of large language models, various industries are actively attempting to integrate large language models with real business scenarios so as to improve working efficiency and quality. In such applications, a large language model can generate prediction results such as text and images from text input by a user, improving office efficiency; however, the accuracy of the prediction results generated by current large language models is not ideal, the processing time is long, and actual requirements are difficult to meet.
Disclosure of Invention
In view of the foregoing, the present disclosure provides a method, an apparatus, an electronic device, a computer storage medium, and a program product for processing prompt information.
According to a first aspect of the present disclosure, there is provided a method for processing prompt information, comprising:
performing semantic analysis on prompt information to be processed to obtain a semantic structure graph, wherein the semantic structure graph comprises nodes related to prompt words in the prompt information and edge relations between the nodes, and the edge relations represent semantic relations between the prompt words;
determining intermediate prompt words from the prompt information according to the edge relations in the semantic structure graph; and
generating target prompt information according to the intermediate prompt words and the intermediate edge relations associated with the intermediate prompt words, wherein the target prompt information is suitable for input to a pre-trained large language model to generate target text.
According to an embodiment of the present disclosure, determining the intermediate prompt words from the prompt information according to the edge relations in the semantic structure graph comprises:
determining the number of edge relations associated with each node in the semantic structure graph;
comparing the number of edge relations with a preset number threshold to obtain a number comparison result;
determining key nodes from the semantic structure graph according to the number comparison result; and
determining the prompt words associated with the key nodes in the prompt information as the intermediate prompt words.
According to an embodiment of the present disclosure, determining the key nodes from the semantic structure graph according to the number comparison result comprises:
in a case where the number comparison result indicates that the number of edge relations is greater than or equal to the preset number threshold, determining the node corresponding to the number comparison result in the semantic structure graph as a first key node, wherein the key nodes comprise the first key node.
According to an embodiment of the present disclosure, determining the key nodes from the semantic structure graph further comprises:
in a case where the number comparison result indicates that the number of edge relations is smaller than the preset number threshold, matching the prompt word represented by the node corresponding to the number comparison result against preset keywords in a preset keyword set to obtain a keyword matching result; and
in a case where the keyword matching result indicates a match, determining the node corresponding to the keyword matching result in the semantic structure graph as a second key node, wherein the key nodes further comprise the second key node.
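By way of illustration only, the key-node selection described in the above embodiments can be sketched as follows. This is a minimal sketch, not the claimed implementation: the graph representation, threshold value, node names, and keyword set are hypothetical stand-ins.

```python
# Sketch of key-node selection: a node whose number of edge relations
# meets the preset threshold becomes a first key node; otherwise its
# prompt word is matched against a preset keyword set, and a match
# yields a second key node.
def select_key_nodes(edges, node_words, threshold, keyword_set):
    # Count the edge relations associated with each node.
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    key_nodes = []
    for node, word in node_words.items():
        if degree.get(node, 0) >= threshold:
            key_nodes.append(node)   # first key node (degree criterion)
        elif word in keyword_set:
            key_nodes.append(node)   # second key node (keyword criterion)
    return key_nodes

# Hypothetical semantic structure graph for a short prompt.
edges = [("n1", "n2"), ("n1", "n3"), ("n2", "n3"), ("n1", "n4")]
node_words = {"n1": "maker", "n2": "share", "n3": "Friday", "n4": "please"}
print(select_key_nodes(edges, node_words, 3, {"share"}))  # → ['n1', 'n2']
```

Here "n1" qualifies by degree (three edge relations) and "n2" qualifies because its prompt word appears in the preset keyword set, matching the two-criterion scheme of the embodiments.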
According to an embodiment of the present disclosure, the edge relation comprises edge direction information representing a semantic dependency between prompt words,
wherein determining the intermediate prompt words from the prompt information according to the edge relations in the semantic structure graph further comprises:
determining an end node from the semantic structure graph according to the edge direction information, wherein the edge direction information associated with the end node is either first direction information or second direction information, the first direction information representing that the end node points to a node adjacent to the end node, and the second direction information representing that the adjacent node points to the end node; and
determining the prompt word associated with the end node in the prompt information as an intermediate prompt word.
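The end-node criterion above can be illustrated with a short sketch. The graph and node names are hypothetical; the point is that an end node's edges all run in a single direction, either all leaving the node or all entering it.

```python
# Sketch of end-node detection in a directed semantic structure graph:
# an end node has edges of one direction only -- every edge leaves it
# (first direction information) or every edge enters it (second
# direction information).
def find_end_nodes(directed_edges):
    out_deg, in_deg, nodes = {}, {}, set()
    for src, dst in directed_edges:
        out_deg[src] = out_deg.get(src, 0) + 1
        in_deg[dst] = in_deg.get(dst, 0) + 1
        nodes.update((src, dst))

    end_nodes = []
    for n in sorted(nodes):
        # No incoming edges: edges only point away (first direction).
        # No outgoing edges: edges only point toward it (second direction).
        if in_deg.get(n, 0) == 0 or out_deg.get(n, 0) == 0:
            end_nodes.append(n)
    return end_nodes

print(find_end_nodes([("A", "B"), ("B", "C")]))  # → ['A', 'C']
```

In the chain A → B → C, the interior node B has edges in both directions and is excluded, while A and C are end nodes whose prompt words would be kept as intermediate prompt words.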
According to an embodiment of the present disclosure, generating the target prompt information according to the intermediate prompt words and the intermediate edge relations associated with the intermediate prompt words comprises:
determining intermediate semantic relations between the intermediate prompt words according to the intermediate edge relations associated with the intermediate prompt words;
fusing a plurality of the intermediate prompt words according to the intermediate semantic relations to obtain intermediate prompt information; and
processing the intermediate prompt information using a pre-trained text prediction model to obtain the target prompt information.
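A toy sketch of the fusion step follows. The relation labels and the ordering heuristic are assumptions made for illustration; in the disclosure the fused string would then be refined by a pre-trained text prediction model, which is not reproduced here.

```python
# Sketch of fusing intermediate prompt words into intermediate prompt
# information according to their intermediate edge relations: only
# words that participate in at least one intermediate relation are
# kept, in their original order.
def fuse_prompt_words(words, relations):
    """words: intermediate prompt words in graph order.
    relations: (head_word, label, tail_word) triples."""
    connected = {w for h, _, t in relations for w in (h, t)}
    kept = [w for w in words if w in connected]
    return " ".join(kept)

words = ["please", "share", "maker", "Friday"]
relations = [("share", ":ARG1", "maker"), ("share", ":time", "Friday")]
print(fuse_prompt_words(words, relations))  # → 'share maker Friday'
```

The colloquial filler "please" carries no intermediate edge relation and is dropped, leaving a compact intermediate prompt for the text prediction model to polish.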
According to an embodiment of the present disclosure, the method for processing prompt information further comprises:
inputting the target prompt information into a prompt information classification model and outputting a prompt information category; and
labeling the target prompt information according to the prompt information category.
According to an embodiment of the present disclosure, the prompt information classification model comprises a text feature extraction network, a position feature extraction network, a fusion network, and a classification network;
wherein inputting the target prompt information into the prompt information classification model and outputting the prompt information category comprises:
inputting the target prompt information into the text feature extraction network and outputting text features;
inputting the target prompt information into the position feature extraction network and outputting position features;
inputting the text features and the position features into the fusion network and outputting fused features, wherein the fusion network is constructed based on an attention network algorithm; and
inputting the fused features into the classification network and outputting the prompt information category.
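The data flow of this classification model can be sketched numerically. All weights, dimensions, feature values, and category names below are illustrative stand-ins for trained networks; the attention step is reduced to a single scaled dot-product gate and does not represent the actual fusion network.

```python
import math

# Sketch of the prompt information classification model: text features
# and position features are fused with a small attention-style gate,
# then a linear "classification network" scores the categories.
def attention_fuse(text_feat, pos_feat):
    # Scaled dot-product score with the text feature as query and the
    # position feature as key/value, squashed to an attention weight,
    # followed by a residual combination.
    score = sum(t * p for t, p in zip(text_feat, pos_feat))
    weight = 1 / (1 + math.exp(-score / math.sqrt(len(text_feat))))
    return [t + weight * p for t, p in zip(text_feat, pos_feat)]

def classify(fused, class_weights):
    scores = {c: sum(w * f for w, f in zip(ws, fused))
              for c, ws in class_weights.items()}
    return max(scores, key=scores.get)

text_feat = [0.9, 0.1, 0.1]  # stand-in for the text feature network output
pos_feat = [0.2, 0.8, 0.1]   # stand-in for the position feature network output
fused = attention_fuse(text_feat, pos_feat)
weights = {"question": [1.0, 0.0, 0.0], "instruction": [0.0, 1.0, 1.0]}
print(classify(fused, weights))  # → question
```

The two feature vectors enter the fusion step separately, mirroring the two parallel extraction networks, and only the fused features reach the classification step, as in the embodiment.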
A second aspect of the present disclosure provides a text generation method, comprising:
in response to a text editing operation, updating target prompt information according to the text corresponding to the text editing operation to obtain updated target prompt information, wherein the target prompt information is obtained according to the above method for processing prompt information; and
inputting the updated target prompt information into a pre-trained large language model to generate target text.
A third aspect of the present disclosure provides an apparatus for processing prompt information, comprising:
an analysis module configured to perform semantic analysis on prompt information to be processed to obtain a semantic structure graph, wherein the semantic structure graph comprises nodes related to prompt words in the prompt information and edge relations between the nodes, and the edge relations represent semantic relations between the prompt words;
an extraction module configured to determine intermediate prompt words from the prompt information according to the edge relations in the semantic structure graph; and
an optimization module configured to generate target prompt information according to the intermediate prompt words and the intermediate edge relations associated with the intermediate prompt words, wherein the target prompt information is suitable for input to a pre-trained large language model to generate target text.
A fourth aspect of the present disclosure provides a text generating apparatus, comprising:
an updating module configured to, in response to a text editing operation, update target prompt information according to the text corresponding to the text editing operation to obtain updated target prompt information, wherein the target prompt information is obtained according to the above method for processing prompt information; and
a training module configured to input the updated target prompt information into the pre-trained large language model to generate target text.
A fifth aspect of the present disclosure provides an electronic device, comprising: one or more processors; and a memory for storing one or more programs, wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the above method for processing prompt information and the above text generation method.
A sixth aspect of the present disclosure further provides a computer-readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the above method for processing prompt information and the above text generation method.
A seventh aspect of the present disclosure further provides a computer program product comprising a computer program which, when executed by a processor, implements the above method for processing prompt information and the above text generation method.
According to the method, apparatus, device, medium, and program product for processing prompt information of the present disclosure, semantic analysis is performed on the prompt information to obtain a semantic structure graph whose edge relations represent the semantic relations between prompt words, and intermediate prompt words are extracted according to those edge relations; redundant information in the prompt information is thereby filtered out, while intermediate prompt words strongly correlated with the prompt semantic attributes of the prompt information are fully retained. Target prompt information that accurately represents the prompt semantic attributes is then generated according to the intermediate prompt words and their associated intermediate edge relations, so that the large language model can perform text prediction quickly and accurately on the basis of a correct understanding of the prompt semantics. This improves text prediction precision and efficiency and solves the problems of low accuracy and low efficiency when a large language model processes prompt information.
Drawings
The foregoing and other objects, features and advantages of the disclosure will be more apparent from the following description of embodiments of the disclosure with reference to the accompanying drawings, in which:
FIG. 1 schematically illustrates an application scenario diagram of a method for processing prompt information according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates a flowchart of a method for processing prompt information according to an embodiment of the present disclosure;
FIG. 3 schematically illustrates a schematic diagram of a method of generating a semantic structure graph according to an embodiment of the present disclosure;
FIG. 4 schematically illustrates a schematic diagram of a method of generating target prompt information according to an embodiment of the present disclosure;
FIG. 5 schematically illustrates a flowchart of a text generation method according to an embodiment of the present disclosure;
FIG. 6 schematically illustrates a block diagram of an apparatus for processing prompt information according to an embodiment of the present disclosure;
FIG. 7 schematically illustrates a block diagram of a text generating apparatus according to an embodiment of the present disclosure; and
FIG. 8 schematically illustrates a block diagram of an electronic device adapted to implement a method for processing prompt information according to an embodiment of the present disclosure.
Detailed Description
Hereinafter, embodiments of the present disclosure will be described with reference to the accompanying drawings. It should be understood that the description is only exemplary and is not intended to limit the scope of the present disclosure. In the following detailed description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the embodiments of the present disclosure. It may be evident, however, that one or more embodiments may be practiced without these specific details. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the concepts of the present disclosure.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the disclosure. The terms "comprises," "comprising," and/or the like, as used herein, specify the presence of stated features, steps, operations, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, or components.
All terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art unless otherwise defined. It should be noted that the terms used herein should be construed to have meanings consistent with the context of the present specification and should not be construed in an idealized or overly formal manner.
Where an expression like "at least one of A, B and C" is used, it should generally be interpreted in accordance with the meaning commonly understood by those skilled in the art (e.g., "a system having at least one of A, B and C" shall include, but not be limited to, a system having A alone, B alone, C alone, A and B together, A and C together, B and C together, and/or A, B and C together).
In the course of implementing the present disclosure, it was found that a large language model generates results according to text prompt information input by a user, but for relatively complex or colloquial prompt information, the accuracy of the generated results is not ideal and the processing time increases. For enterprise-level large language models (Large Language Model, LLM), this situation is frequent owing to smaller training sets and highly specialized vocabularies, and existing improvement methods often require more manpower and can hardly shorten the processing time of the model. Therefore, a more efficient method is needed to improve the accuracy and efficiency with which enterprise-level large language models process prompt information.
In view of this, embodiments of the present disclosure provide a method, an apparatus, an electronic device, and a medium for processing prompt information. The method comprises the following steps: performing semantic analysis on prompt information to be processed to obtain a semantic structure graph, wherein the semantic structure graph comprises nodes related to prompt words in the prompt information and edge relations between the nodes, and the edge relations represent semantic relations between the prompt words; determining intermediate prompt words from the prompt information according to the edge relations in the semantic structure graph; and generating target prompt information according to the intermediate prompt words and the intermediate edge relations associated with the intermediate prompt words, wherein the target prompt information is suitable for input to a pre-trained large language model to generate target text.
It should be noted that the method and apparatus for processing prompt information provided by the present disclosure may be used in the financial field, for example by financial institutions such as banks, and may also be used in any field other than the financial field; therefore, the application fields of the method and apparatus for processing prompt information provided by the present disclosure are not limited.
In the technical solution of the present disclosure, the user information involved (including but not limited to personal user information, user image information, and user equipment information such as location information) and data (including but not limited to data for analysis, stored data, and displayed data) are information and data authorized by the user or fully authorized by all parties. The collection, storage, use, processing, transmission, provision, disclosure, and application of such data are all conducted in accordance with the relevant laws, regulations, and standards of the relevant countries and regions; necessary security measures are taken; the public interest is not prejudiced; and corresponding operation entries are provided for the user to grant or refuse authorization.
Fig. 1 schematically illustrates an application scenario diagram of a method for processing hint information according to an embodiment of the present disclosure.
As shown in fig. 1, an application scenario 100 according to this embodiment may include a first terminal device 101, a second terminal device 102, a third terminal device 103, a network 104, and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages and the like. Various communication client applications, such as shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients, and social platform software (by way of example only), may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be a variety of electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablets, laptop and desktop computers, and the like.
The server 105 may be a server providing various services, such as a background management server (by way of example only) providing support for websites browsed by users using the terminal devices 101, 102, 103. The background management server may analyze and process the received data such as the user request, and feed back the processing result (e.g., the web page, information, or data obtained or generated according to the user request) to the terminal device.
It should be noted that, the method for processing the prompt information provided by the embodiment of the disclosure may be generally performed by the server 105. Accordingly, the apparatus for processing prompt information provided by the embodiments of the present disclosure may be generally disposed in the server 105. The method for processing hint information provided by the embodiments of the present disclosure may also be performed by a server or cluster of servers other than server 105 and capable of communicating with terminal devices 101, 102, 103 and/or server 105. Accordingly, the apparatus for processing hint information provided by the embodiments of the present disclosure may also be provided in a server or a server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
The method for processing hint information of the disclosed embodiment will be described in detail with reference to fig. 2 to 4 based on the scenario described in fig. 1.
Fig. 2 schematically illustrates a flow chart of a method for processing hint information according to an embodiment of the present disclosure.
As shown in fig. 2, the method 200 includes operations S210 to S230.
In operation S210, semantic analysis is performed on the prompt information to be processed to obtain a semantic structure graph, where the semantic structure graph includes nodes related to the prompt words in the prompt information and edge relations between the nodes, and the edge relations represent semantic relations between the prompt words.
In operation S220, intermediate prompt words are determined from the prompt information according to the edge relations in the semantic structure graph.
In operation S230, target prompt information is generated according to the intermediate prompt words and the intermediate edge relations associated with the intermediate prompt words, where the target prompt information is suitable for input to a pre-trained large language model to generate target text.
According to embodiments of the present disclosure, the prompt information represents a plurality of prompt words that a user may enter in an input dialog.
According to embodiments of the present disclosure, the semantic structure graph may be a structured graph including nodes related to the prompt words in the prompt information and edge relations between the nodes. Each node may represent the semantic concept of one prompt word.
According to an embodiment of the present disclosure, an edge relation represents the semantic relation between two nodes, and the semantic relation represents a relationship or constraint between two prompt words.
According to the embodiment of the present disclosure, intermediate prompt words are determined from the prompt information according to the edge relations in the semantic structure graph; the intermediate prompt words represent extracted prompt words with a high degree of association with the prompt information.
According to embodiments of the present disclosure, the intermediate edge relations represent the semantic relations between the intermediate prompt words.
According to the embodiment of the present disclosure, the intermediate prompt words and the intermediate edge relations associated with them are optimized to obtain a highly accurate target prompt sentence.
According to an embodiment of the present disclosure, the target prompt information comprises a target prompt sentence. The target prompt information represents prompt information generated by adjusting the grammatical structure of the intermediate prompt words after optimizing the intermediate edge relations associated with the intermediate prompt words.
According to an embodiment of the present disclosure, the target prompt information is input into a pre-trained large language model to generate the target text.
According to the embodiment of the present disclosure, semantic analysis is performed on the prompt information, and intermediate prompt words are extracted according to the edge relations in the resulting semantic structure graph that represent the semantic relations between prompt words; redundant information in the prompt information is thereby filtered out, while intermediate prompt words strongly correlated with the prompt semantic attributes of the prompt information are fully retained. Target prompt information that accurately represents the prompt semantic attributes is then generated according to the intermediate prompt words and their associated intermediate edge relations, so that the large language model can perform text prediction quickly and accurately on the basis of a correct understanding of the prompt semantics, improving text prediction precision and efficiency and solving the problems of low accuracy and low efficiency when a large language model processes prompt information.
FIG. 3 schematically illustrates a schematic diagram of a method of generating a semantic structure according to an embodiment of the present disclosure.
According to the embodiment of the present disclosure, semantic analysis is performed on the prompt information to be processed; for example, an AMR (Abstract Meaning Representation) model may be used to analyze the prompt words in the prompt information and the semantic relations between them, and the semantic structure graph is constructed from the nodes related to the prompt words and the edge relations.
As shown in FIG. 3, the semantic structure graph includes nodes such as "maker", "share", and "Friday".
According to embodiments of the present disclosure, the prompt words can represent keywords such as concepts, entities, relationships, and actions.
Table 1 schematically shows a general keyword information table according to an embodiment of the present disclosure.
Keyword      Description
:ARG0        The first argument of a predicate, typically the subject
:ARG1        The second argument of a predicate, typically the object
:mod         A modifier of an entity, typically describing an attribute or constraint of the entity
:time        Time information, commonly used to describe the time or duration of an event
:location    Location information, commonly used to describe the location of an entity or event
:polarity    Polarity information, commonly used to describe affirmative, negative, or interrogative polarity
:name        The name or identifier of an entity
According to the embodiment of the present disclosure, the nouns with which a predicate collocates are called arguments, and an argument represents a prompt word with a nominal part of speech in a piece of prompt information. The keyword ":ARG0" represents the first argument of a predicate, typically the subject; the keyword ":ARG1" represents the second argument of a predicate, typically the object; the keyword ":mod" represents a modifier of an entity, typically used to describe an attribute or constraint of the entity; the keyword ":time" represents time information, commonly used to describe the time or duration of an event; the keyword ":location" represents location information, commonly used to describe the location of an entity or event; the keyword ":polarity" represents polarity information, commonly used to describe affirmative, negative, or interrogative polarity; and the keyword ":name" represents the name or identifier of an entity.
According to embodiments of the present disclosure, the AMR model is semantically pre-trained using a text encoder for learning event semantics and a graph encoder for learning event structure.
According to embodiments of the present disclosure, rules and keywords specific to different industry categories and enterprises may be designed. For example, loan-related keywords may be set for the banking industry, and cure-rate keywords may be set for the medical industry; these special rules and keywords are input into the AMR model for semantic pre-training.
According to the embodiment of the present disclosure, the AMR model performs semantic analysis on the prompt information as follows: first, lexical analysis is performed to divide the input prompt information into a plurality of prompt words and to determine the part of speech of each prompt word; second, syntactic analysis is performed to build a grammatical structure tree of the natural language using techniques such as dependency analysis or phrase structure analysis and to analyze the semantic relations between the prompt words; finally, the prompt information, its lexical analysis result, and its syntactic analysis result are aligned, and the semantic structure graph is constructed from the nodes related to the prompt words and the edge relations between the nodes.
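The three-stage pipeline above can be sketched in miniature. A real system would use a trained AMR parser; the whitespace tokenizer, the "first word is the predicate" heuristic, and the role assignment below are toy assumptions used only to show how nodes and labeled edges are assembled into a semantic structure graph.

```python
# Sketch of the semantic-analysis pipeline: lexical analysis splits the
# prompt into prompt words, a stubbed syntactic step proposes
# dependencies, and the results are aligned into a node/edge semantic
# structure graph with AMR-style role labels as in Table 1.
def build_semantic_graph(prompt):
    # Lexical analysis: split into prompt words (toy tokenizer).
    words = prompt.split()
    # Toy syntactic step: treat the first word as the predicate and the
    # remaining words as its arguments.
    predicate, args = words[0], words[1:]
    # Alignment: one node per prompt word, edges from the predicate to
    # each argument labeled with AMR-style roles (":ARG0", ":ARG1", ...).
    nodes = list(words)
    edges = [(predicate, f":ARG{i}", arg) for i, arg in enumerate(args)]
    return {"nodes": nodes, "edges": edges}

graph = build_semantic_graph("share maker Friday")
print(graph["edges"])  # → [('share', ':ARG0', 'maker'), ('share', ':ARG1', 'Friday')]
```

The resulting dictionary plays the role of the semantic structure graph: its edge list is what the later key-node and end-node steps would consume.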
According to an embodiment of the present disclosure, determining an intermediate hint word from hint information according to an edge relationship in a semantic structure graph includes: determining the quantity of the side relation coefficient related to the node in the semantic structure diagram; comparing the number of the edge relations with a preset number threshold value to obtain a number comparison result; determining key nodes from the semantic structure diagram according to the quantity comparison result; and determining the prompt word associated with the key node in the prompt information as an intermediate prompt word.
According to embodiments of the present disclosure, one node in the semantic structure diagram may have connected edges with multiple nodes, and the number of edge relationships associated with each node may be different.
According to embodiments of the present disclosure, the greater the number of edge relations associated with a node, the more important the node is characterized as being in the semantic structure diagram.
According to the embodiment of the disclosure, the number of edge relations is compared with the preset number threshold to obtain a number comparison result, key nodes are determined from the semantic structure diagram according to the number comparison result, and the prompt words associated with the key nodes in the prompt information are determined as intermediate prompt words.
According to the embodiment of the disclosure, by determining the key nodes in the semantic structure diagram and taking the prompt words associated with the key nodes as intermediate prompt words, colloquial, weakly relevant content in the prompt information can be removed and the key information in the content extracted, which increases the large language model's understanding of the prompt information and improves the accuracy of the result.
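The threshold rule above can be sketched as follows; the graph encoding and the threshold value are illustrative assumptions, not the disclosed implementation.

```python
# Sketch: a node whose number of edge relations meets a preset
# threshold is treated as a key node; its prompt word becomes an
# intermediate prompt word.
def select_key_nodes(graph, threshold):
    return {node for node, edges in graph.items() if len(edges) >= threshold}

graph = {
    "learn": ["I", "Chinese", "today"],   # 3 edge relations
    "I": ["learn"],
    "Chinese": ["learn"],
    "today": ["learn"],
}
key_nodes = select_key_nodes(graph, threshold=2)
```

With these toy values only "learn" survives, matching the idea that highly connected nodes carry the key information.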
According to an embodiment of the present disclosure, determining key nodes from a semantic structure graph according to a number comparison result includes:
and in a case where the number comparison result indicates that the number of edge relations is greater than or equal to the preset number threshold, determining the node corresponding to the number comparison result in the semantic structure diagram as a first key node, wherein the key nodes include the first key node.
According to the embodiment of the disclosure, when the number of edge relations of a node is greater than or equal to the preset number threshold, the node is characterized as having high importance in the semantic structure diagram.
According to an embodiment of the disclosure, the node corresponding to the number comparison result in the semantic structure diagram is determined as a first key node, and the key nodes may include a plurality of first key nodes.
According to the embodiment of the disclosure, a first key node in the semantic structure diagram is one whose number of edge relations is greater than or equal to the preset number threshold, and the semantic relations between the prompt words associated with the first key nodes constitute the key information to be extracted.
According to an embodiment of the present disclosure, determining key nodes from the semantic structure graph further includes:
in a case where the number comparison result indicates that the number of edge relations is smaller than the preset number threshold, matching the prompt word represented by the node corresponding to the number comparison result against preset keywords in a preset keyword set to obtain a keyword matching result; and in a case where the keyword matching result indicates a match, determining the node corresponding to the keyword matching result in the semantic structure diagram as a second key node, wherein the key nodes further include the second key node.
According to the embodiment of the disclosure, when the number of edge relations of a node is smaller than the preset number threshold, the node is characterized as having low importance in the semantic structure diagram.
According to the embodiment of the disclosure, a preset keyword set is composed of a plurality of preset keywords designed according to the requirements of industries and enterprises.
According to the embodiment of the disclosure, the prompt word represented by the node corresponding to the number comparison result is matched against the preset keywords in the preset keyword set to obtain a keyword matching result.
According to the embodiment of the disclosure, in a case where the keyword matching result indicates a match, the preset keyword set contains the prompt word represented by the node corresponding to the number comparison result.
According to an embodiment of the disclosure, the node corresponding to the number comparison result in the semantic structure diagram is determined as a second key node, and the key nodes may include a plurality of second key nodes.
According to the embodiment of the disclosure, when the number of edge relations of a node in the semantic structure diagram indicates low importance, the prompt word represented by the node can be matched against the preset keywords in the preset keyword set, and in a case where the keyword matching result indicates a match, the node corresponding to the keyword matching result in the semantic structure diagram is determined as a second key node. In this way, in addition to retaining the semantic relations between high-importance prompt words in the semantic structure diagram, the semantic relations between low-importance prompt words that are nonetheless required by industries and enterprises are also retained.
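The keyword fallback above can be sketched as follows; the graph, threshold, and keyword set are illustrative assumptions.

```python
# Sketch: nodes below the edge-count threshold are kept as second key
# nodes only if their prompt word matches a preset industry/enterprise
# keyword.
def select_second_key_nodes(graph, threshold, preset_keywords):
    return {node for node, edges in graph.items()
            if len(edges) < threshold and node in preset_keywords}

graph = {
    "bank": ["loan rate", "customer", "branch"],
    "loan rate": ["bank"],
    "customer": ["bank"],
    "branch": ["bank"],
}
preset_keywords = {"loan rate", "cure rate"}  # hypothetical banking/medical keywords
second_key_nodes = select_second_key_nodes(graph, 2, preset_keywords)
```

Here "loan rate" has too few edge relations to qualify on importance alone, but is retained because the hypothetical banking keyword set demands it.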
According to an embodiment of the present disclosure, the edge relation includes edge direction information characterizing semantic dependencies between prompt words,
wherein determining the intermediate prompt words from the prompt information according to the edge relations in the semantic structure diagram further includes:
determining an end node from the semantic structure diagram according to the edge direction information, wherein the edge direction information associated with the end node is first direction information or second direction information, the first direction information indicating that the end node points to a node adjacent to the end node, and the second direction information indicating that a node adjacent to the end node points to the end node; and determining the prompt word associated with the end node in the prompt information as an intermediate prompt word.
According to an embodiment of the present disclosure, the edge relation includes edge direction information characterizing semantic dependencies between prompt words.
According to embodiments of the present disclosure, the first direction information characterizes a direction from the end node to a node adjacent to the end node, of which there may be one or more. For example, according to the semantic dependencies of end node A, node B, and node C, end node A points to node B and end node A points to node C.
According to an embodiment of the present disclosure, the second direction information characterizes a direction from a node adjacent to the end node to the end node. For example, according to the semantic dependencies of end node A, node B, and node C, node B points to end node A and node C points to end node A.
According to the embodiment of the disclosure, the key node is taken as an end node, and the prompt word associated with the end node in the prompt information is determined to be an intermediate prompt word.
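The end-node rule can be sketched as follows; edges are encoded as (source, target) pairs and the example graph is hypothetical.

```python
# Sketch: an end node's edge relations are all of the first direction
# (the end node points to its neighbors) or all of the second direction
# (its neighbors point to the end node).
def find_end_nodes(directed_edges, nodes):
    out_deg = {n: 0 for n in nodes}
    in_deg = {n: 0 for n in nodes}
    for src, dst in directed_edges:
        out_deg[src] += 1
        in_deg[dst] += 1
    return {n for n in nodes
            if (out_deg[n] > 0 and in_deg[n] == 0)    # first direction only
            or (in_deg[n] > 0 and out_deg[n] == 0)}   # second direction only

edges = [("A", "B"), ("A", "C"), ("B", "C")]
end_nodes = find_end_nodes(edges, {"A", "B", "C"})
```

Node B both points and is pointed to, so only A and C qualify as end nodes in this toy graph.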
According to an embodiment of the present disclosure, generating the target prompt information from the intermediate prompt words and the intermediate edge relations associated with the intermediate prompt words includes:
determining the intermediate semantic relationship between the intermediate prompt words according to the intermediate side relationship associated with the intermediate prompt words;
fusing a plurality of intermediate prompt words according to the intermediate semantic relation to obtain intermediate prompt information; and
processing the intermediate prompt information using a pre-trained text prediction model to obtain the target prompt information.
According to the embodiment of the disclosure, the intermediate semantic relation among the intermediate prompt words is determined according to the intermediate side relation associated with the intermediate prompt words in the semantic structure diagram, and a plurality of intermediate prompt words are fused according to the intermediate semantic relation to obtain intermediate prompt information.
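The fusion step can be sketched as follows; the role labels and the ordering rule (agent before predicate before theme) are hypothetical stand-ins for the intermediate semantic relations.

```python
# Sketch: order the intermediate prompt words by their semantic
# relations and join them into the intermediate prompt information.
def fuse_intermediate_words(words, relations):
    order = {"agent": 0, "predicate": 1, "theme": 2}  # hypothetical ordering
    ranked = sorted(words, key=lambda w: order.get(relations.get(w), 99))
    return " ".join(ranked)

words = ["Chinese", "I", "learn"]
relations = {"I": "agent", "learn": "predicate", "Chinese": "theme"}
intermediate_prompt = fuse_intermediate_words(words, relations)
```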
According to the embodiment of the disclosure, the intermediate prompt information is input into a pre-trained text prediction model to obtain the target prompt information.
According to an embodiment of the present disclosure, the pre-trained text prediction model may be a masked grammatical error correction (MaskGEC) model.
According to the embodiment of the disclosure, the MaskGEC model first performs mark-error processing, replacing some words or punctuation in the input intermediate prompt information with special marks, for example marking inappropriate words and grammatical structures. Next, it performs error correction, predicting which words in the intermediate prompt information need to be replaced and with what content; for example, a misspelled word may be predicted and replaced with a synonym, a near-synonym, or a random word. Finally, fine-tuning is performed: the predicted result is compared with the ground-truth value, a loss function is computed, and the model is retrained via the loss function.
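The mark-and-correct steps can be sketched as follows; the error positions and predicted replacements are supplied by hand here, standing in for what the MaskGEC model itself would predict.

```python
# Sketch of the two inference-time steps: mark suspect tokens with a
# special symbol, then fill each mark with the model's predicted fix.
def mask_errors(tokens, error_positions, mark="[MASK]"):
    return [mark if i in error_positions else t
            for i, t in enumerate(tokens)]

def apply_corrections(tokens, predictions, mark="[MASK]"):
    preds = iter(predictions)
    return [next(preds) if t == mark else t for t in tokens]

masked = mask_errors(["I", "learn", "Chinese"], {1})
corrected = apply_corrections(masked, ["am learning"])
```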
Fig. 4 schematically illustrates a schematic diagram of a method of generating target hint information according to an embodiment of the present disclosure.
As shown in fig. 4, the intermediate prompt information is "I learn Chinese." The intermediate prompt information is input into a noise model for noise processing to obtain "I @ Chinese.", which is then input into the MaskGEC model. The MaskGEC model adopts a multi-head attention (Transformer) based encoder (Encoders)-decoder (Decoders) architecture, is trained using a linear function (Linear) and a normalized exponential function (Softmax), and finally outputs the target prompt information "I am learning Chinese."
According to an embodiment of the present disclosure, the noise processing may include filler replacement, in which each intermediate prompt word in the intermediate prompt information has a certain probability of being selected and replaced with a filler symbol, and random replacement, in which some intermediate prompt words are randomly extracted from the intermediate prompt information with a certain probability and replaced with random words from a vocabulary.
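The two noise strategies can be sketched as follows; the filler symbol, probabilities, and vocabulary are illustrative assumptions.

```python
import random

# Sketch: filler replacement swaps a word for a filler symbol with
# probability p_filler; random replacement swaps it for a random
# vocabulary word with probability p_random; otherwise the word is kept.
def add_noise(words, vocab, p_filler=0.1, p_random=0.1, seed=0):
    rng = random.Random(seed)
    noised = []
    for w in words:
        r = rng.random()
        if r < p_filler:
            noised.append("@")
        elif r < p_filler + p_random:
            noised.append(rng.choice(vocab))
        else:
            noised.append(w)
    return noised

noised = add_noise(["I", "learn", "Chinese"], vocab=["cat", "run"])
```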
According to the embodiment of the disclosure, by processing the intermediate prompt information with the pre-trained text prediction model, the grammar of the intermediate prompt information can be corrected or polished so that it better conforms to the usage conventions of a specific scenario, allowing the large language model to better understand the textual meaning of the prompt information.
According to an embodiment of the present disclosure, the method for processing prompt information further includes:
inputting target prompt information into a prompt information classification model, and outputting prompt information types; and
and labeling the target prompt information according to the prompt information category.
According to embodiments of the present disclosure, the target prompt information is input to the prompt information classification model, and a prompt information category is output; for example, the target prompt information "I am learning Chinese" may fall into the "Chinese learning" prompt information category.
According to the embodiment of the disclosure, the target prompt information is marked according to the prompt information category.
According to the embodiment of the disclosure, for each piece of target prompt information obtained from the prompt information input by the user, specific generalization and classification can be performed for different enterprises, so that when the large language model encounters similar problems it can locate key points and search more quickly, outputting the correct result in a shorter time and thus shortening the large language model's problem-processing time.
According to an embodiment of the present disclosure, the prompt information classification model includes a text feature extraction network, a position feature extraction network, a fusion network, and a classification network;
wherein inputting the target prompt information into the prompt information classification model and outputting the prompt information category includes the following steps:
inputting the target prompt information into a text feature extraction network, and outputting text features;
inputting the target prompt information into a position feature extraction network, and outputting position features;
inputting the text features and the position features into a fusion network, and outputting the fusion features, wherein the fusion network is constructed based on an attention network algorithm; and
and inputting the fusion characteristics into a classification network, and outputting prompt information types.
According to the embodiment of the disclosure, the text feature extraction network is used to process the target prompt information "I am learning Chinese" to obtain the text features (I, am, learning, Chinese).
According to the embodiment of the disclosure, the position feature extraction network is used to process the target prompt information "I am learning Chinese" to obtain the position features (1, 2, 3, 4).
According to the embodiment of the disclosure, the fusion network is constructed based on a bidirectional attention network algorithm and can fuse the text features and the position features, outputting the fusion features ((I, 1), (am, 2), (learning, 3), (Chinese, 4)), which combine the text content information and the contextual position information of the target prompt information.
According to the embodiment of the disclosure, the fusion characteristics are input into the classification network, the probability of classifying the fusion characteristics into different prompt information categories is output, and the prompt information category with the largest probability value is taken as the prompt information category of the target prompt information.
According to the embodiment of the disclosure, the prompt information classification model comprises a text feature extraction network, a position feature extraction network, a fusion network and a classification network, so that the relation and the dependence among prompt words in the target prompt information can be captured, and important features and information in the text can be better identified.
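A toy end-to-end sketch of the classification pipeline follows. The per-token score table stands in for the text feature extraction network, the 1-based token index stands in for the position feature, a position-weighted sum stands in for the attention-based fusion, and a softmax over the fused scores stands in for the classification network; all names and values are hypothetical.

```python
import math

def classify_prompt(tokens, score_table, categories):
    logits = []
    for cat in categories:
        # Fuse text features (table scores) with position features
        # (later tokens weighted slightly higher), then sum per category.
        fused = sum(score_table.get((tok, cat), 0.0) * (1 + 0.1 * pos)
                    for pos, tok in enumerate(tokens, start=1))
        logits.append(fused)
    exps = [math.exp(x) for x in logits]
    probs = [e / sum(exps) for e in exps]
    return categories[probs.index(max(probs))]  # highest-probability category

score_table = {
    ("learning", "Chinese learning"): 1.0,
    ("Chinese", "Chinese learning"): 2.0,
    ("loan", "banking"): 2.0,
}
category = classify_prompt(["I", "am", "learning", "Chinese"],
                           score_table, ["Chinese learning", "banking"])
```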
Fig. 5 schematically shows a flowchart of a text generation method according to an embodiment of the present disclosure.
As shown in fig. 5, the method 500 includes operations S510 to S520.
In operation S510, in response to the text editing operation, updating the target prompt information according to the text corresponding to the text editing operation, to obtain updated target prompt information, where the target prompt information is obtained according to the above method for processing prompt information.
In operation S520, the updated target prompt information is input to the pre-trained large language model, and the target text is generated.
According to an embodiment of the present disclosure, text editing operations are performed in editable dialog boxes, such as a summary input box, a word-count requirement input box, and a keyword input box.
According to the embodiment of the disclosure, the input prompt information is processed according to the method for processing the prompt information, and the updated target prompt information is obtained.
According to the embodiment of the disclosure, the updated target prompt information is input into a pre-trained large language model, and the target text is automatically generated through autonomous learning of the large language model.
Based on the method for processing the prompt information, the disclosure also provides a device for processing the prompt information. The device will be described in detail below in connection with fig. 6.
Fig. 6 schematically shows a block diagram of an apparatus for processing hint information according to an embodiment of the present disclosure.
As shown in fig. 6, the apparatus 600 for processing hint information of this embodiment includes a parsing module 610, an extracting module 620, and an optimizing module 630.
The parsing module 610 is configured to perform semantic analysis on the prompt information to be processed to obtain a semantic structure diagram, where the semantic structure diagram includes nodes related to the prompt words in the prompt information, and side relations between the nodes, and the side relations represent semantic relations between the prompt words. In an embodiment, the parsing module 610 may be configured to perform the operation S210 described above, which is not described herein.
The extraction module 620 is configured to determine the intermediate hint word from the hint information according to the edge relation in the semantic structure. In an embodiment, the extraction module 620 may be configured to perform the operation S220 described above, which is not described herein.
The optimizing module 630 is configured to generate the target prompt information according to the intermediate prompt words and the intermediate edge relations associated with the intermediate prompt words, where the target prompt information is suitable for being input into a pre-trained large language model to generate a target text. In an embodiment, the optimization module 630 may be configured to perform the operation S230 described above, which is not described herein.
According to the embodiment of the disclosure, by performing semantic analysis on the prompt information, the intermediate prompt words are extracted according to the edge relations, which represent the semantic relations among the prompt words, in the resulting semantic structure diagram; redundant information in the prompt information is thereby filtered out while the intermediate prompt words strongly correlated with the prompt semantics of the prompt information are fully retained. Target prompt information that accurately represents the prompt semantics is then generated from the intermediate prompt words and their associated intermediate edge relations, enabling the large language model to perform text prediction quickly and accurately with an accurate understanding of the prompt semantics, improving text prediction precision and efficiency, and addressing the low accuracy and low efficiency of large language models when processing prompt information.
According to an embodiment of the present disclosure, the extraction module 620 includes an edge relation number determination sub-module, a comparison sub-module, a key node determination sub-module, and a first extraction sub-module.
And the edge relation number determination sub-module is used for determining the number of edge relations associated with the nodes in the semantic structure diagram.
And the comparison sub-module is used for comparing the number of the edge relations with a preset number threshold value to obtain a number comparison result.
And the key node determining submodule is used for determining key nodes from the semantic structure diagram according to the quantity comparison result.
And the first extraction sub-module is used for determining the prompt word associated with the key node in the prompt information as an intermediate prompt word.
According to an embodiment of the present disclosure, the critical node determination submodule includes a first determination unit.
The first determining unit is used for determining, in a case where the number comparison result indicates that the number of edge relations is greater than or equal to the preset number threshold, the node corresponding to the number comparison result in the semantic structure diagram as a first key node, wherein the key nodes include the first key node.
According to an embodiment of the present disclosure, the key node determination submodule further includes a matching unit and a second determination unit.
And the matching unit is used for matching, in a case where the number comparison result indicates that the number of edge relations is smaller than the preset number threshold, the prompt word represented by the node corresponding to the number comparison result against the preset keywords in the preset keyword set, so as to obtain a keyword matching result.
And the second determining unit is used for determining the node corresponding to the keyword matching result in the semantic structure diagram as a second key node under the condition that the keyword matching result characterizations are matched, and the key node further comprises the second key node.
The extraction module 620 also includes an end node determination sub-module and a second extraction sub-module, according to embodiments of the present disclosure.
And the end node determining sub-module is used for determining the end node from the semantic structure diagram according to the edge direction information, wherein the edge direction information associated with the end node is first direction information or second direction information, the first direction information indicating that the end node points to a node adjacent to the end node, and the second direction information indicating that a node adjacent to the end node points to the end node.
And the second extraction sub-module is used for determining the prompt word associated with the end node in the prompt information as an intermediate prompt word.
According to an embodiment of the present disclosure, the optimization module 630 includes a first optimization sub-module, a second optimization sub-module, and a third optimization sub-module.
And the first optimization sub-module is used for determining the intermediate semantic relation among the intermediate prompt words according to the intermediate side relation associated with the intermediate prompt words.
And the second optimization sub-module is used for fusing a plurality of intermediate prompt words according to the intermediate semantic relation to obtain intermediate prompt information.
And the third optimization sub-module is used for processing the intermediate prompt information by utilizing the pre-trained text prediction model to obtain the target prompt information.
According to an embodiment of the present disclosure, the apparatus 600 for processing hint information further includes a classification module.
According to an embodiment of the present disclosure, the classification module includes a category output sub-module and an annotation sub-module.
And the category output sub-module is used for inputting the target prompt information into the prompt information classification model and outputting the prompt information category.
And the labeling sub-module is used for labeling the target prompt information according to the prompt information category.
According to an embodiment of the present disclosure, the category output submodule includes a text feature output unit, a location feature output unit, a fusion feature output unit, and a category output unit.
And the text feature output unit is used for inputting the target prompt information into the text feature extraction network and outputting the text features.
And the position feature output unit is used for inputting the target prompt information into the position feature extraction network and outputting the position features.
And the fusion feature output unit is used for inputting the text features and the position features into a fusion network and outputting the fusion features, wherein the fusion network is constructed based on an attention network algorithm.
And the category output unit is used for inputting the fusion characteristics into the classification network and outputting prompt information categories.
Based on the text generation method, the disclosure also provides a text generation device. The device will be described in detail below in connection with fig. 7.
Fig. 7 schematically shows a block diagram of a text generating apparatus according to an embodiment of the present disclosure.
As shown in fig. 7, the text generating apparatus 700 of this embodiment includes an update module 710 and a training module 720.
The updating module 710 is configured to respond to the text editing operation, update the target prompt information according to the text corresponding to the text editing operation, and obtain updated target prompt information, where the target prompt information is obtained according to the method for processing prompt information. In an embodiment, the updating module 710 may be configured to perform the operation S510 described above, which is not described herein.
The training module 720 is configured to input the updated target prompt information to the pre-trained large language model, and generate a target text. In an embodiment, the training module 720 may be configured to perform the operation S520 described above, which is not described herein.
Any of the parsing module 610, the extraction module 620, and the optimization module 630 or the update module 710 and the training module 720 may be combined into one module for implementation, or any one of the modules may be split into multiple modules, according to embodiments of the present disclosure. Alternatively, at least some of the functionality of one or more of these modules may be combined with at least some of the functionality of other modules and implemented in one module. According to embodiments of the present disclosure, at least one of the parsing module 610, the extraction module 620, and the optimization module 630 or the update module 710 and the training module 720 may be implemented, at least in part, as hardware circuitry, such as a Field Programmable Gate Array (FPGA), a Programmable Logic Array (PLA), a system-on-chip, a system-on-substrate, a system-in-package, or an Application Specific Integrated Circuit (ASIC), or in hardware or firmware in any other reasonable manner of integrating or packaging circuitry, or in any one of, or a suitable combination of, the three implementation approaches of software, hardware, and firmware. Alternatively, at least one of the parsing module 610, the extraction module 620, and the optimization module 630 or the update module 710 and the training module 720 may be at least partially implemented as a computer program module which, when executed, may perform the corresponding functions.
Fig. 8 schematically illustrates a block diagram of an electronic device adapted to implement a method for processing hint information according to an embodiment of the present disclosure.
As shown in fig. 8, an electronic device 800 according to an embodiment of the present disclosure includes a processor 801 that can perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 802 or a program loaded from a storage section 808 into a Random Access Memory (RAM) 803. The processor 801 may include, for example, a general purpose microprocessor (e.g., a CPU), an instruction set processor and/or an associated chipset and/or a special purpose microprocessor (e.g., an Application Specific Integrated Circuit (ASIC)), or the like. The processor 801 may also include on-board memory for caching purposes. The processor 801 may include a single processing unit or multiple processing units for performing the different actions of the method flows according to embodiments of the disclosure.
In the RAM 803, various programs and data required for the operation of the electronic device 800 are stored. The processor 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. The processor 801 performs various operations of the method flow according to the embodiments of the present disclosure by executing programs in the ROM 802 and/or the RAM 803. Note that the program may be stored in one or more memories other than the ROM 802 and the RAM 803. The processor 801 may also perform various operations of the method flows according to embodiments of the present disclosure by executing programs stored in one or more memories.
According to an embodiment of the present disclosure, the electronic device 800 may also include an input/output (I/O) interface 805, the input/output (I/O) interface 805 also being connected to the bus 804. The electronic device 800 may also include one or more of the following components connected to the I/O interface 805: an input portion 806 including a keyboard, mouse, etc.; an output portion 807 including a display such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and a speaker; a storage section 808 including a hard disk or the like; and a communication section 809 including a network interface card such as a LAN card, a modem, or the like. The communication section 809 performs communication processing via a network such as the internet. The drive 810 is also connected to the I/O interface 805 as needed. A removable medium 811 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 810 as needed so that a computer program read out therefrom is mounted into the storage section 808 as needed.
The present disclosure also provides a computer-readable storage medium that may be embodied in the apparatus/device/system described in the above embodiments; or may exist alone without being assembled into the apparatus/device/system. The computer-readable storage medium carries one or more programs which, when executed, implement methods in accordance with embodiments of the present disclosure.
According to embodiments of the present disclosure, the computer-readable storage medium may be a non-volatile computer-readable storage medium, which may include, for example, but is not limited to: a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. For example, according to embodiments of the present disclosure, the computer-readable storage medium may include ROM 802 and/or RAM 803 and/or one or more memories other than ROM 802 and RAM 803 described above.
Embodiments of the present disclosure also include a computer program product comprising a computer program containing program code for performing the methods shown in the flowcharts. When the computer program product runs in a computer system, the program code is used for enabling the computer system to realize the method for processing prompt information or the text generation method provided by the embodiment of the disclosure.
The above-described functions defined in the system/apparatus of the embodiments of the present disclosure are performed when the computer program is executed by the processor 801. The systems, apparatus, modules, units, etc. described above may be implemented by computer program modules according to embodiments of the disclosure.
In one embodiment, the computer program may be embodied on a tangible storage medium such as an optical storage device or a magnetic storage device. In another embodiment, the computer program may also be transmitted and distributed in the form of a signal over a network medium, and downloaded and installed via the communication section 809 and/or from the removable medium 811. The computer program may include program code that may be transmitted using any appropriate network medium, including but not limited to wireless or wired media, or any suitable combination of the foregoing.
According to embodiments of the present disclosure, program code for carrying out the computer programs provided by embodiments of the present disclosure may be written in any combination of one or more programming languages; in particular, such computer programs may be implemented in high-level procedural and/or object-oriented programming languages, and/or in assembly/machine languages. Such languages include, but are not limited to, Java, C++, Python, C, or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of remote computing devices, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., connected via the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Those skilled in the art will appreciate that the features recited in the various embodiments of the disclosure and/or in the claims may be combined and/or sub-combined in various ways, even if such combinations are not explicitly recited in the disclosure. In particular, the features recited in the various embodiments of the present disclosure and/or the claims may be combined in various ways without departing from the spirit and teachings of the present disclosure. All such combinations fall within the scope of the present disclosure.
The embodiments of the present disclosure are described above. However, these examples are for illustrative purposes only and are not intended to limit the scope of the present disclosure. Although the embodiments are described above separately, this does not mean that the measures in the embodiments cannot be used advantageously in combination. The scope of the disclosure is defined by the appended claims and equivalents thereof. Various alternatives and modifications can be made by those skilled in the art without departing from the scope of the disclosure, and such alternatives and modifications are intended to fall within the scope of the disclosure.

Claims (14)

1. A method for processing prompt information, comprising:
carrying out semantic analysis on prompt information to be processed to obtain a semantic structure diagram, wherein the semantic structure diagram comprises nodes related to prompt words in the prompt information and side relations among the nodes, and the side relations represent semantic relations among the prompt words;
determining intermediate prompt words from the prompt information according to the side relation in the semantic structure diagram; and
generating target prompt information according to the intermediate prompt words and the intermediate side relation associated with the intermediate prompt words, wherein the target prompt information is suitable for being input into a pre-trained large language model, and generating target text.
2. The method of claim 1, wherein determining intermediate prompt words from the prompt information according to the side relation in the semantic structure graph comprises:
determining the number of side relations associated with a node in the semantic structure diagram;
comparing the number of side relations with a preset quantity threshold to obtain a quantity comparison result;
determining key nodes from the semantic structure diagram according to the quantity comparison result; and
determining the prompt word associated with the key node in the prompt information as the intermediate prompt word.
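Claim 2's selection step can be illustrated with a minimal sketch: count the side relations attached to each node of the semantic structure graph and keep the nodes whose count meets a preset quantity threshold. The graph encoding, the example prompt words, and the threshold value are illustrative assumptions, not taken from the patent.

```python
# Sketch of claim 2: key nodes are those whose number of side relations
# meets a preset quantity threshold. Edges are undirected (word, word) pairs.
from collections import defaultdict


def select_key_prompt_words(edges, threshold):
    """Return prompt words whose side-relation count >= threshold."""
    degree = defaultdict(int)
    for word_a, word_b in edges:
        degree[word_a] += 1
        degree[word_b] += 1
    return {word for word, count in degree.items() if count >= threshold}


# Hypothetical semantic structure graph for the prompt
# "summarize the quarterly sales report in a formal tone".
edges = [
    ("report", "summarize"),
    ("report", "quarterly"),
    ("report", "sales"),
    ("tone", "formal"),
]
print(select_key_prompt_words(edges, threshold=2))  # {'report'}
```

With a threshold of 2, only "report" qualifies as a first key node; the remaining words would fall through to claim 4's keyword-matching branch.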
3. The method of claim 2, wherein determining key nodes from the semantic structure graph based on the number comparison results comprises:
under the condition that the quantity comparison result represents that the number of side relations is greater than or equal to the preset quantity threshold, determining a node corresponding to the quantity comparison result in the semantic structure diagram as a first key node, wherein the key node comprises the first key node.
4. A method according to claim 2 or 3, wherein determining key nodes from the semantic structure graph based on the number comparison result further comprises:
under the condition that the quantity comparison result represents that the number of side relations is smaller than the preset quantity threshold, matching the prompt words represented by the nodes corresponding to the quantity comparison result with preset keywords in a preset keyword set to obtain a keyword matching result; and
under the condition that the keyword matching result represents a match, determining a node corresponding to the keyword matching result in the semantic structure diagram as a second key node, wherein the key node further comprises the second key node.
5. The method according to claim 1 or 2, wherein the side relation comprises side direction information characterizing semantic dependencies between the prompt words,
wherein, according to the side relation in the semantic structure diagram, determining the intermediate prompt word from the prompt information further comprises:
determining an end node from the semantic structure diagram according to the side direction information, wherein the side direction information associated with the end node is first direction information or second direction information, the first direction information represents that the end node points to a node adjacent to the end node, and the second direction information represents that the node adjacent to the end node points to the end node; and
determining the prompt word associated with the end node in the prompt information as the intermediate prompt word.
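Claim 5's end-node rule can be sketched as follows: in a directed semantic structure graph, an end node is one whose side relations all run one way, i.e. it only points to its neighbours (first direction information) or is only pointed to by them (second direction information). The edge encoding and example words are illustrative assumptions.

```python
# Sketch of claim 5: end nodes are pure sources (no incoming side relations)
# or pure sinks (no outgoing side relations) in the directed graph.
from collections import defaultdict


def find_end_nodes(directed_edges):
    """directed_edges: list of (source_word, target_word) pairs."""
    out_deg, in_deg = defaultdict(int), defaultdict(int)
    nodes = set()
    for src, dst in directed_edges:
        out_deg[src] += 1
        in_deg[dst] += 1
        nodes.update((src, dst))
    return {n for n in nodes if in_deg[n] == 0 or out_deg[n] == 0}


edges = [("summarize", "report"), ("report", "sales")]
print(sorted(find_end_nodes(edges)))  # ['sales', 'summarize']
```

Here "summarize" only emits an edge and "sales" only receives one, so both are end nodes; "report" sits in the middle and is excluded.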
6. The method of claim 1, wherein generating target prompt information according to the intermediate prompt words and the intermediate side relations associated with the intermediate prompt words comprises:
determining intermediate semantic relationships among the intermediate prompt words according to intermediate side relationships associated with the intermediate prompt words;
fusing a plurality of intermediate prompt words according to the intermediate semantic relation to obtain intermediate prompt information; and
processing the intermediate prompt information by using a pre-trained text prediction model to obtain the target prompt information.
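The fusion step of claim 6 can be sketched as joining the intermediate prompt words according to the labels of their connecting side relations, yielding a single intermediate prompt string; the patent then passes this string through a pre-trained text prediction model, which the sketch leaves out. The relation-label format is an illustrative assumption.

```python
# Sketch of claim 6: fuse intermediate prompt words using the semantic
# relations carried by their intermediate side relations.
def fuse_prompt_words(intermediate_words, edge_labels):
    """edge_labels: {(word_a, word_b): relation_label} for directed relations."""
    fragments = [
        f"{a} {label} {b}"
        for (a, b), label in edge_labels.items()
        if a in intermediate_words and b in intermediate_words
    ]
    # A pre-trained text prediction model would rewrite this into fluent
    # target prompt information; here we just return the fused string.
    return ", ".join(fragments)


words = {"summarize", "report", "sales"}
labels = {("summarize", "report"): "the", ("report", "sales"): "about"}
print(fuse_prompt_words(words, labels))  # summarize the report, report about sales
```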
7. The method of claim 1, further comprising:
inputting the target prompt information into a prompt information classification model, and outputting the prompt information category; and
labeling the target prompt information according to the prompt information category.
8. The method of claim 7, wherein the hint information classification model includes a text feature extraction network, a location feature extraction network, a fusion network, and a classification network;
wherein inputting the target prompt information into the prompt information classification model and outputting the prompt information category comprises:
inputting the target prompt information into the text feature extraction network, and outputting text features;
inputting the target prompt information into the position feature extraction network, and outputting position features;
inputting the text features and the position features into the fusion network, and outputting fusion features, wherein the fusion network is constructed based on an attention network algorithm; and
inputting the fusion features into the classification network, and outputting the prompt information category.
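The attention-based fusion network of claim 8 can be sketched as a single dot-product attention step over the two feature vectors: the text feature serves as the query, both features serve as keys and values, and the softmax-weighted sum is the fused feature. The feature dimensions and the choice of query are illustrative assumptions; the patent does not fix these details.

```python
# Sketch of claim 8's fusion network: scaled dot-product attention over the
# text feature and the position feature, producing a fused feature vector.
import math


def attention_fuse(text_feat, pos_feat):
    keys = [text_feat, pos_feat]   # the two feature "tokens"
    query = text_feat              # query taken from the text feature (assumption)
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]  # softmax over the two tokens
    # Weighted sum of the two feature vectors -> fused feature.
    return [weights[0] * t + weights[1] * p for t, p in zip(text_feat, pos_feat)]


fused = attention_fuse([1.0, 0.0], [0.0, 1.0])
print(fused)
```

Because the query matches the text feature more closely, the fused vector leans toward the text feature while still mixing in the position feature; a classification network would then consume this fused vector.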
9. A text generation method, comprising:
responding to a text editing operation, and updating target prompt information according to a text corresponding to the text editing operation to obtain updated target prompt information, wherein the target prompt information is obtained according to the method of any one of claims 1 to 8;
inputting the updated target prompt information into a pre-trained large language model to generate a target text.
10. An apparatus for processing prompt information, comprising:
the analysis module is used for carrying out semantic analysis on the prompt information to be processed to obtain a semantic structure diagram, wherein the semantic structure diagram comprises nodes related to prompt words in the prompt information and side relations among the nodes, and the side relations represent semantic relations among the prompt words;
the extraction module is used for determining intermediate prompt words from the prompt information according to the side relationship in the semantic structure diagram; and
the optimization module is used for generating target prompt information according to the intermediate prompt words and the intermediate side relation associated with the intermediate prompt words, wherein the target prompt information is suitable for being input into a pre-trained large language model to generate target text.
11. A text generation apparatus comprising:
the updating module is used for responding to the text editing operation, updating the target prompt information according to the text corresponding to the text editing operation and obtaining updated target prompt information, wherein the target prompt information is obtained according to the method for processing the prompt information;
and the training module is used for inputting the updated target prompt information into the pre-trained large language model to generate a target text.
12. An electronic device, comprising:
one or more processors;
storage means for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to perform the method of any of claims 1-9.
13. A computer readable storage medium having stored thereon executable instructions which, when executed by a processor, cause the processor to perform the method according to any of claims 1 to 9.
14. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1 to 9.
CN202311627941.3A 2023-11-30 2023-11-30 Method, device, electronic equipment and medium for processing prompt information Pending CN117744662A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311627941.3A CN117744662A (en) 2023-11-30 2023-11-30 Method, device, electronic equipment and medium for processing prompt information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311627941.3A CN117744662A (en) 2023-11-30 2023-11-30 Method, device, electronic equipment and medium for processing prompt information

Publications (1)

Publication Number Publication Date
CN117744662A true CN117744662A (en) 2024-03-22

Family

ID=90260055

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311627941.3A Pending CN117744662A (en) 2023-11-30 2023-11-30 Method, device, electronic equipment and medium for processing prompt information

Country Status (1)

Country Link
CN (1) CN117744662A (en)

Similar Documents

Publication Publication Date Title
US11599714B2 (en) Methods and systems for modeling complex taxonomies with natural language understanding
US20210081611A1 (en) Methods and systems for language-agnostic machine learning in natural language processing using feature extraction
US11748232B2 (en) System for discovering semantic relationships in computer programs
US10262062B2 (en) Natural language system question classifier, semantic representations, and logical form templates
US8972408B1 (en) Methods, systems, and articles of manufacture for addressing popular topics in a social sphere
US10831796B2 (en) Tone optimization for digital content
US9934220B2 (en) Content revision using question and answer generation
US20140120513A1 (en) Question and Answer System Providing Indications of Information Gaps
EP2081118A2 (en) Document analysis, commenting, and reporting system
US20180068221A1 (en) System and Method of Advising Human Verification of Machine-Annotated Ground Truth - High Entropy Focus
US10706045B1 (en) Natural language querying of a data lake using contextualized knowledge bases
US10977155B1 (en) System for providing autonomous discovery of field or navigation constraints
US11709893B2 (en) Search method, electronic device and storage medium
WO2021208460A1 (en) Sentence completion method and device, and readable storage medium
CN114580383A (en) Log analysis model training method and device, electronic equipment and storage medium
CN111459959B (en) Method and apparatus for updating event sets
CN111368036B (en) Method and device for searching information
CN117744662A (en) Method, device, electronic equipment and medium for processing prompt information
US11017172B2 (en) Proposition identification in natural language and usage thereof for search and retrieval
US20210149995A1 (en) System and Method for Negation Aware Sentiment Detection
US20220343068A1 (en) Intent detection via multi-hop unified syntactic graph
US20240061833A1 (en) Techniques for augmenting training data for aggregation and sorting database operations in a natural language to database query system
CN117972477A (en) Text expansion model training method, text expansion method, device and equipment
CN116151210A (en) Modeling method and device for business requirements, electronic equipment and medium
CN117332092A (en) Database knowledge graph construction method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination