CN111090991A - Scene error correction method and device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN111090991A
Authority
CN
China
Prior art keywords
text information
corpus
original text
scene
candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911358292.5A
Other languages
Chinese (zh)
Other versions
CN111090991B (en
Inventor
赖佳伟
付志宏
邓卓彬
徐梦笛
罗希意
何径舟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201911358292.5A priority Critical patent/CN111090991B/en
Publication of CN111090991A publication Critical patent/CN111090991A/en
Application granted granted Critical
Publication of CN111090991B publication Critical patent/CN111090991B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Machine Translation (AREA)

Abstract

The application discloses a scene error correction method, a scene error correction device, electronic equipment and a storage medium, and relates to the technical field of error correction. The specific implementation scheme is as follows: when original text information input by a user in a current scene needs to be corrected, a candidate corpus matching the original text information is obtained from the corpus set corresponding to the current scene; fact recognition is performed on the original text information according to the candidate corpus to determine whether the original text information has errors; and when it does, the original text information is corrected according to the candidate corpus and the corrected text information is provided to the user. The application thus provides a way of correcting original text information based on a scene corpus, and accurately corrects original text information in different scenes by using the result of fact recognition on the original text information.

Description

Scene error correction method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for scene error correction, an electronic device, and a storage medium.
Background
When a user has an insufficient feel for the language, or is careless while using it, errors such as word misuse and misattribution (crediting something to the wrong person or thing) can easily occur. In some scenarios, even a very small language error may have very adverse effects. Error correction techniques are intended to solve this problem.
In the related art, input text is corrected either by a traditional error correction mode or by an error correction mode based on deep learning. The traditional mode translates the erroneous text into correct text through a language model and a translation model. The deep-learning mode trains an error correction model on a large-scale supervised corpus and uses the trained model to correct the input text directly. However, in the course of implementing the present application, the inventors found at least the following technical problems in the related art. When the traditional mode is used to correct input text in different scenes, the features of each scene must be selected manually and a language model and a translation model must be trained for each scene, so the approach lacks expandability and universality, and the corrected text it produces may not agree with the actual facts. The deep-learning mode requires manually labeled supervised corpora and a different error correction model for each scene; some models cannot be transferred between scenes, so the cost of scene error correction is high. Therefore, how to implement scene error correction effectively is a technical problem that urgently needs to be solved.
Disclosure of Invention
The application provides a scene error correction method, a scene error correction device, electronic equipment and a storage medium, which take candidate corpora matching the input text directly from a scene corpus and perform fact recognition on the original text information in combination with those candidate corpora, so that original text information in different scenes can be corrected accurately.
An embodiment of a first aspect of the present application provides a scene error correction method, including: acquiring original text information input by a user in a current scene; acquiring a corpus set corresponding to the current scene from a pre-constructed scene corpus; obtaining a candidate corpus matched with the original text information from the corpus set; performing fact recognition on the original text information according to the candidate corpus to determine whether the original text information has errors; and if the original text information has errors, correcting the error of the original text information according to the candidate corpus, and providing the corrected text information for the user.
In an embodiment of the present application, the performing fact recognition on the original text information according to the candidate corpus to determine whether the original text information has errors includes: extracting fact information from the original text information to obtain the fact information described in the original text information, wherein the fact information includes attribute information of an entity, the attribute information including an attribute and a first attribute value of the attribute; acquiring a second attribute value of the attribute in the candidate corpus, wherein the second attribute value is the real attribute value of the attribute; and if the first attribute value is inconsistent with the second attribute value, determining that the original text information has errors. The performing error correction on the original text information according to the candidate corpus includes: replacing the first attribute value in the original text information with the second attribute value to obtain the text information after error correction.
In an embodiment of the present application, the performing fact recognition on the original text information according to the candidate corpus to determine whether the original text information has errors includes: performing inference analysis on the candidate corpus and the original text information to obtain an inference relationship between the candidate corpus and the original text information; and if the inference relationship is of a first type, determining that the original text information has errors, wherein the first type indicates that the candidate corpus is inconsistent with the original text information.
In an embodiment of the present application, the performing inference analysis on the candidate corpus and the original text information to obtain an inference relationship between the candidate corpus and the original text information includes: inputting the candidate corpus and the original text information into a preset natural language inference model to obtain the inference relationship between the candidate corpus and the original text information.
In an embodiment of the present application, the obtaining, from the corpus set, a candidate corpus matched with the original text information includes: determining the relevance between the original text information and each corpus in the corpus set; and according to the relevance, selecting the M corpora with the highest relevance from the corpus set as candidate corpora, wherein M is a positive integer.
In an embodiment of the present application, the determining a correlation degree between the original text information and each corpus in the corpus set includes: acquiring a first keyword feature in the original text information and a second keyword feature in the corpus for each corpus in the corpus set; and determining the relevance of the original text information and the corpus according to the first keyword features and the second keyword features.
In one embodiment of the present application, the method further comprises: receiving a scene expansion request of the user, wherein the scene expansion request comprises scene corpus data of a new scene; and adding scene corpus data of the new scene into the scene corpus.
According to the scene error correction method of the embodiments of the present application, when original text information input by a user in a current scene needs to be corrected, a candidate corpus matching the original text information is obtained from the corpus set corresponding to the current scene, fact recognition is performed on the original text information according to the candidate corpus to determine whether the original text information has errors, and when it does, the original text information is corrected according to the candidate corpus and the corrected text information is provided to the user. The method thus corrects original text information based on a scene corpus, and accurately corrects original text information in different scenes by using the result of fact recognition on the original text information.
An embodiment of a second aspect of the present application provides a scene error correction apparatus, which includes: the first acquisition module is used for acquiring original text information input by a user in a current scene; the second acquisition module is used for acquiring a corpus set corresponding to the current scene from a pre-constructed scene corpus; the matching module is used for acquiring the candidate corpus matched with the original text information from the corpus set; the fact identification module is used for carrying out fact identification on the original text information according to the candidate corpus so as to determine whether the original text information has errors or not; and the error correction module is used for correcting the original text information according to the candidate corpus and providing the corrected text information for the user if the original text information has errors.
In an embodiment of the present application, the fact identification module is specifically configured to: extracting fact information from the original text information to obtain fact information described in the original text information, wherein the fact information includes: attribute information of an entity, the attribute information including an attribute and a first attribute value of the attribute; acquiring a second attribute value of the attribute in the candidate corpus, wherein the second attribute value is a real attribute value of the attribute; if the first attribute value is inconsistent with the second attribute value, determining that the original text information has errors; the error correction module is specifically configured to: and replacing the first attribute value in the original text information with the second attribute value to obtain the text information after error correction.
In one embodiment of the present application, the fact identification module includes: an inference unit, configured to perform inference analysis on the candidate corpus and the original text information to obtain an inference relationship between the candidate corpus and the original text information; and a first determining unit, configured to determine that the original text information has errors if the inference relationship is of a first type, where the first type indicates that the candidate corpus is inconsistent with the original text information.
In an embodiment of the present application, the inference unit is specifically configured to: and inputting the candidate corpus and the original text information into a preset natural language reasoning model to obtain a reasoning relation between the candidate corpus and the original text information.
In one embodiment of the present application, the matching module includes: a second determining unit, configured to determine a degree of correlation between the original text information and each corpus in the corpus set; and the selecting unit is used for selecting the first M corpora with high correlation degrees from the corpus set as candidate corpora according to the correlation degrees, wherein M is a positive integer.
In an embodiment of the application, the second determining unit is specifically configured to: acquiring a first keyword feature in the original text information and a second keyword feature in the corpus for each corpus in the corpus set; and determining the relevance of the original text information and the corpus according to the first keyword features and the second keyword features.
In one embodiment of the present application, the apparatus further comprises: a receiving module, configured to receive a scene extension request of the user, where the scene extension request includes scene corpus data of a new scene; and the adding module is used for adding the scene corpus data of the new scene into the scene corpus.
When original text information input by a user in a current scene needs to be corrected, the scene error correction device of the embodiments of the present application obtains a candidate corpus matching the original text information from the corpus set corresponding to the current scene, performs fact recognition on the original text information according to the candidate corpus to determine whether the original text information has errors, and, when it does, corrects the original text information according to the candidate corpus and provides the corrected text information to the user. The device thus corrects original text information based on a scene corpus, and accurately corrects original text information in different scenes by using the result of fact recognition on the original text information.
An embodiment of a third aspect of the present application provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to enable the at least one processor to execute the scene error correction method of the embodiment of the application.
A fourth aspect of the present application provides a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute a scene error correction method disclosed in the embodiments of the present application.
One embodiment in the above application has the following advantages or benefits: candidate corpora matching the input text are taken directly from the scene corpus, and fact recognition is performed on the original text information in combination with those candidate corpora, so that original text information in different scenes is corrected accurately. Because the candidate corpus matching the original text information is obtained from the corpus set corresponding to the current scene, fact recognition is performed on the original text information according to the candidate corpus to determine whether it has errors, and, when it does, the original text information is corrected according to the candidate corpus and the corrected text information is provided to the user, the technical problem in the related art that corrected text may not agree with the actual facts is overcome. Fact recognition of the original text information is performed according to the candidate corpus of the corresponding scene, which achieves the technical effect of accurately correcting original text information in different scenes.
Other effects of the above-described alternative will be described below with reference to specific embodiments.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
FIG. 1 is a schematic diagram according to a first embodiment of the present application;
FIG. 2 is a schematic illustration according to a second embodiment of the present application;
FIG. 3 is a schematic illustration according to a third embodiment of the present application;
FIG. 4 is a schematic flow chart diagram according to an embodiment of the present application;
FIG. 5 is a schematic illustration according to a fourth embodiment of the present application;
FIG. 6 is a schematic illustration according to a fifth embodiment of the present application;
FIG. 7 is a block diagram of an electronic device used to implement embodiments of the present application.
Detailed Description
The following description of the exemplary embodiments of the present application, taken in conjunction with the accompanying drawings, includes various details of the embodiments of the application for the understanding of the same, which are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
A scene error correction method, apparatus, electronic device, and storage medium according to embodiments of the present application are described below with reference to the accompanying drawings.
Fig. 1 is a schematic diagram according to a first embodiment of the present application. It should be noted that an execution subject of the scene error correction method of this embodiment is a scene error correction device, which may be implemented in a software and/or hardware manner, and the device may be configured in an electronic device, and the electronic device may include, but is not limited to, a terminal device, a server, and the like, which is not limited in this embodiment.
As shown in fig. 1, the scene error correction method may include:
Step 101, obtaining original text information input by a user in a current scene.
Specifically, in the current scene, in the process of inputting a text by a user, when the text input by the user needs to be corrected, the original text information currently input by the user in the current scene may be acquired.
In this embodiment, in order to meet users' personalized input requirements, the original text information may be input by voice, text, or image. With voice input, after the voice information input by the user is acquired, speech recognition may be performed on it, and the text obtained by speech recognition is used as the original text information. With image input, after the image input by the user is acquired, character recognition may be performed on the image, and the recognized characters are used as the original text information.
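The three input paths described above can be sketched as a single normalization step. This is a minimal illustration only: the `TextInput`, `VoiceInput` and `ImageInput` types and the `speech_to_text` and `image_to_text` callables are hypothetical names introduced here, standing in for whatever speech recognition and character recognition components an implementation actually uses.

```python
from dataclasses import dataclass
from typing import Callable, Union


@dataclass
class TextInput:
    text: str


@dataclass
class VoiceInput:
    audio: bytes


@dataclass
class ImageInput:
    image: bytes


UserInput = Union[TextInput, VoiceInput, ImageInput]


def to_original_text(
    user_input: UserInput,
    speech_to_text: Callable[[bytes], str],  # hypothetical speech recognizer
    image_to_text: Callable[[bytes], str],   # hypothetical character recognizer
) -> str:
    """Turn any supported input into the original text information."""
    if isinstance(user_input, TextInput):
        return user_input.text
    if isinstance(user_input, VoiceInput):
        return speech_to_text(user_input.audio)
    if isinstance(user_input, ImageInput):
        return image_to_text(user_input.image)
    raise TypeError(f"unsupported input type: {type(user_input).__name__}")
```

Passing the recognizers in as parameters keeps the sketch independent of any particular recognition library.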
The current scene in this embodiment may be a customer service voice scene, a text review scene, a map scene, a search scene, and the like, which is not specifically limited in this embodiment.
Step 102, obtaining a corpus set corresponding to a current scene from a pre-constructed scene corpus.
In this embodiment, a corpus set corresponding to each scene is stored in the scene corpus, where each corpus set includes a large number of corpora.
It should be understood that, in this embodiment, information described by a corpus in a corpus set is correct text information commonly used in a corresponding scenario, and the correct text information is described based on an actual fact situation.
Step 103, obtaining candidate corpora matched with the original text information from the corpus set.
In this embodiment, a specific implementation manner of obtaining the candidate corpus matched with the original text information from the corpus set is as follows: determining the relevance of the original text information and each corpus in the corpus set; and according to the relevancy, selecting the first M corpora with high relevancy from the corpus set as candidate corpora, wherein M is a positive integer.
In this embodiment, the relevance of the original text information and each corpus in the corpus set in the keyword feature dimension may be determined based on a keyword matching technique.
Specifically, for each corpus in the corpus set, a first keyword feature in the original text information can be obtained, and a second keyword feature in the corpus can be obtained; and determining the relevance of the original text information and the corpus according to the first keyword features and the second keyword features.
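The keyword-based selection of the top M candidate corpora can be sketched as follows. The text does not say how keyword features are extracted or compared, so this sketch assumes word-token sets and Jaccard overlap; `keyword_features`, `relevance` and `select_candidates` are illustrative names, and the sample sentences in the usage note are made up.

```python
import re
from typing import List, Set


def keyword_features(text: str) -> Set[str]:
    """Assumed keyword extractor: the set of lowercase word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def relevance(original: str, corpus_entry: str) -> float:
    """Assumed relevance measure: Jaccard overlap of the two keyword sets."""
    first = keyword_features(original)       # first keyword feature
    second = keyword_features(corpus_entry)  # second keyword feature
    if not first or not second:
        return 0.0
    return len(first & second) / len(first | second)


def select_candidates(original: str, corpus_set: List[str], m: int) -> List[str]:
    """Return the M corpus entries with the highest relevance."""
    if m <= 0:
        raise ValueError("M must be a positive integer")
    ranked = sorted(corpus_set, key=lambda c: relevance(original, c), reverse=True)
    return ranked[:m]
```

For instance, with a corpus set containing "the capital of france is paris" and "mount everest is the highest mountain", the input "the capital of france is lyon" ranks the first entry highest because five of its six keywords overlap.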
In this embodiment, in order to further improve the accuracy of the selected corpus candidate, as an exemplary implementation manner, the relevance of the original text information and each corpus in the corpus set in each preset feature dimension may be further calculated, and then, based on the relevance of the original text information and each corpus in each preset feature dimension, the relevance between the original text information and the corresponding corpus is determined.
The preset feature dimensions may include a keyword feature dimension, a semantic feature dimension, and a dependency syntactic feature dimension.
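Combining per-dimension relevance into one score might look like the sketch below. The text does not state how the dimensions are combined; a weighted average is assumed here, and `combined_relevance`, the scorer callables and the weights are all hypothetical.

```python
from typing import Callable, Dict

# A scorer maps (original text, corpus entry) to a relevance in one dimension.
Scorer = Callable[[str, str], float]


def combined_relevance(
    original: str,
    corpus_entry: str,
    scorers: Dict[str, Scorer],
    weights: Dict[str, float],
) -> float:
    """Weighted average of per-dimension relevance (assumed combination rule)."""
    total_weight = sum(weights[name] for name in scorers)
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(
        weights[name] * scorer(original, corpus_entry)
        for name, scorer in scorers.items()
    )
    return weighted_sum / total_weight
```

A caller would register one scorer per preset dimension, for example keyword, semantic and dependency-syntax scorers, and tune the weights per scene.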
Step 104, performing fact recognition on the original text information according to the candidate corpus to determine whether the original text information has errors.
In this embodiment, a specific implementation of performing fact recognition on the original text information according to the candidate corpus to determine whether the original text information has errors may be: performing inference analysis on the candidate corpus and the original text information to obtain an inference relationship between them, and, if the inference relationship is of a first type, determining that the original text information has errors.
The first type indicates that the candidate corpus contradicts the original text information.
In addition to the first type, the inference relationship in this embodiment may include a second type, which indicates that the candidate corpus is consistent with the original text information. When the inference relationship is of the second type, the fact information described in the candidate corpus agrees with that described in the original text information.
In this embodiment, in order to determine the inference relationship between the candidate corpus and the original text information conveniently and quickly, the candidate corpus and the original text information may be input into a preset Natural Language Inference (NLI) model to obtain the inference relationship between them.
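The inference-based check can be sketched as below. The NLI model is passed in as a callable because the text does not name a specific model; `toy_nli` is a deliberately naive stand-in used only to make the sketch runnable, and treating a contradiction from any single candidate as an error is an assumption about how several candidates are aggregated.

```python
from enum import Enum
from typing import Callable, Iterable


class Relation(Enum):
    CONTRADICTION = "first type"  # candidate corpus contradicts the original text
    CONSISTENT = "second type"    # candidate corpus agrees with the original text


# An NLI model maps (candidate corpus, original text) to an inference relation.
NliModel = Callable[[str, str], Relation]


def has_error(original: str, candidates: Iterable[str], nli_model: NliModel) -> bool:
    """Flag an error if any candidate contradicts the original text (assumed rule)."""
    return any(
        nli_model(candidate, original) is Relation.CONTRADICTION
        for candidate in candidates
    )


def toy_nli(premise: str, hypothesis: str) -> Relation:
    """Naive stand-in for a real NLI model: exact match means consistent."""
    same = premise.strip().lower() == hypothesis.strip().lower()
    return Relation.CONSISTENT if same else Relation.CONTRADICTION
```

In practice `toy_nli` would be replaced by a trained inference model; only the `NliModel` call shape matters to `has_error`.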
Step 105, if the original text information has errors, correcting the original text information according to the candidate corpus, and providing the corrected text information to the user.
For example, in a search scene, the user enters original text information in the search box stating that A's wife is someone other than B, whereas in fact A's wife is B. In this case, the error correction method in the related art does not correct the text input by the user. The error correction method provided in this embodiment, by contrast, can determine through fact recognition against the candidate corpus that the wife described in the original text information is incorrect, modify the original text information into "A's wife is B" according to the candidate corpus, and provide the modified information "A's wife is B" to the user as the corrected text information, thereby implementing accurate error correction of the original text information based on the actual fact information in the corresponding scene.
According to the scene error correction method of this embodiment, when original text information input by a user in a current scene needs to be corrected, a candidate corpus matching the original text information is obtained from the corpus set corresponding to the current scene, fact recognition is performed on the original text information according to the candidate corpus to determine whether the original text information has errors, and when it does, the original text information is corrected according to the candidate corpus and the corrected text information is provided to the user. The method thus corrects original text information based on a scene corpus, and accurately corrects original text information in different scenes by using the result of fact recognition on the original text information.
Based on the foregoing embodiment, in practical applications, when original text information input by a user in a new scene needs to be corrected, the related art requires manual feature selection and manual labeling. In this embodiment, by contrast, the corpus data of the new scene is added to the scene corpus, so that original text information input by the user in the new scene can conveniently be corrected based on the updated scene corpus. The updating of the scene corpus in this embodiment is further described below with reference to fig. 2.
As shown in fig. 2, the method may further include:
Step 201, receiving a scene extension request of a user, wherein the scene extension request includes scene corpus data of a new scene.
In this embodiment, in order to improve the accuracy of subsequent error correction on the text in the new scene, the corpus that does not meet the predetermined standard in the scene corpus data of the new scene may be cleaned, and the processed scene corpus data may be stored in the scene corpus.
Step 202, adding scene corpus data of the new scene into a scene corpus.
In this embodiment, the extension of the error correction scene can be realized by adding scene corpus data of the new scene in the scene corpus, and in the process of extending the new scene, feature selection and manual labeling do not need to be performed manually, so that the labor cost required in the error correction scene is greatly reduced, and the error correction scene can be conveniently extended in the scene corpus at any time according to requirements.
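Steps 201 and 202, including the cleaning mentioned above, can be sketched with a small in-memory store. The `SceneCorpus` class is a hypothetical structure, and the cleaning rule used here (collapse whitespace, drop very short or duplicate entries) is only an assumed example of a "predetermined standard"; the text does not define one.

```python
from typing import Dict, Iterable, List


class SceneCorpus:
    """Hypothetical in-memory scene corpus: one corpus set per scene name."""

    def __init__(self) -> None:
        self._sets: Dict[str, List[str]] = {}

    def corpus_set(self, scene: str) -> List[str]:
        """Return a copy of the corpus set for a scene (empty if unknown)."""
        return list(self._sets.get(scene, []))

    def extend(self, scene: str, corpus_data: Iterable[str], min_length: int = 5) -> int:
        """Clean the new corpus data and add it; return how many entries were added."""
        existing = self._sets.setdefault(scene, [])
        seen = set(existing)
        added = 0
        for entry in corpus_data:
            cleaned = " ".join(entry.split())  # collapse runs of whitespace
            if len(cleaned) < min_length or cleaned in seen:
                continue  # assumed standard: skip too-short or duplicate entries
            existing.append(cleaned)
            seen.add(cleaned)
            added += 1
        return added
```

Adding a scene is then a single `extend` call with that scene's corpus data; no model retraining or labeling step appears in the sketch, which mirrors the point made in the text.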
Fig. 3 is a schematic diagram according to a third embodiment of the present application. It should be noted that the present embodiment is a further refinement or extension of the above-described embodiments.
As shown in fig. 3, the method may include:
Step 301, obtaining original text information input by a user in a current scene.
Step 302, a corpus set corresponding to a current scene is obtained from a pre-constructed scene corpus.
Step 303, obtaining the candidate corpus matched with the original text information from the corpus set.
Step 304, extracting fact information from the original text information to obtain fact information described in the original text information, wherein the fact information includes: attribute information of the entity, the attribute information including an attribute and a first attribute value of the attribute.
Step 305, obtaining a second attribute value of the attribute in the candidate corpus, wherein the second attribute value is a real attribute value of the attribute.
In this embodiment, fact information may be extracted from the candidate corpus to obtain the fact information described in the candidate corpus, where that fact information may include the attribute value of the attribute of the entity (the second attribute value), and may also include attribute values of other attributes of the entity.
Step 306, if the first attribute value and the second attribute value are not consistent, determining that the original text information has errors.
Step 307, replacing the first attribute value in the original text information with the second attribute value to obtain the corrected text information, and providing the corrected text information to the user.
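Steps 304 to 307 can be illustrated with a toy extractor. The text does not specify how fact information is extracted, so this sketch assumes sentences of the fixed form "<entity>'s <attribute> is <value>" and uses a regular expression; `correct` and the pattern are hypothetical, and "Tower X" in the usage note is a made-up entity.

```python
import re
from typing import List

# Assumed sentence shape: "<entity>'s <attribute> is <value>", optional final period.
_FACT = re.compile(r"^(?P<entity>.+?)'s (?P<attribute>.+?) is (?P<value>.+?)\.?$")


def correct(original: str, candidates: List[str]) -> str:
    """Replace the first attribute value with the real one when they differ."""
    text = original.strip()
    stated = _FACT.match(text)  # step 304: fact described in the original text
    if stated is None:
        return original
    for candidate in candidates:
        real = _FACT.match(candidate.strip())  # step 305: fact in the candidate corpus
        if real is None:
            continue
        same_attribute = (
            real["entity"] == stated["entity"]
            and real["attribute"] == stated["attribute"]
        )
        if not same_attribute:
            continue
        if real["value"] == stated["value"]:
            return original  # values agree, no error found
        start, end = stated.span("value")  # steps 306-307: inconsistent, so replace
        return text[:start] + real["value"] + text[end:]
    return original
```

Given the input "Tower X's height is 300 m" and a candidate corpus containing "Tower X's height is 330 m", the function returns "Tower X's height is 330 m". Replacing by match span rather than by string search avoids touching an identical substring elsewhere in the sentence.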
In order to make the present application more clear to those skilled in the art, the scene error correction method of the present embodiment is described below with reference to fig. 4.
In this embodiment, the corpus includes scene-specific corpora and a general corpus. Specifically, after a user inputs a text in a current scene, it may be determined whether corpus data of the current scene exists in the scene corpus. If not, candidate corpora related to the input text may be extracted from the general corpus through a search engine; if so, candidate corpora related to the input text may be extracted through the search engine from the scene corpus corresponding to the current scene. The input text and the candidate corpora are then input to a verification module, which uses the candidate corpora, based on a natural language inference model or a semantic matching model, to determine whether the input text is correct and, when the input text is incorrect, corrects it based on the candidate corpora and outputs the corrected text.
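The flow just described, including the fallback to the general corpus, can be sketched as one function. The `retrieve` and `verify` callables stand in for the search engine and the verification module respectively; their names and signatures are assumptions made for this illustration, not interfaces given in the text.

```python
from typing import Callable, Dict, List, Optional

# retrieve(text, corpus_set) returns candidate corpora related to the text.
Retriever = Callable[[str, List[str]], List[str]]
# verify(text, candidates) returns corrected text, or None if the text is judged correct.
Verifier = Callable[[str, List[str]], Optional[str]]


def correct_in_scene(
    text: str,
    scene: str,
    scene_corpus: Dict[str, List[str]],
    general_corpus: List[str],
    retrieve: Retriever,
    verify: Verifier,
) -> str:
    """Run the flow: pick a corpus, retrieve candidates, verify and correct."""
    corpus_set = scene_corpus.get(scene)
    if not corpus_set:
        # No corpus data for this scene: fall back to the general corpus.
        corpus_set = general_corpus
    candidates = retrieve(text, corpus_set)
    corrected = verify(text, candidates)
    return text if corrected is None else corrected
```

Keeping retrieval and verification as injected callables means the same flow works whether the verifier is built on an inference model or on a semantic matching model.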
In order to implement the foregoing embodiments, an embodiment of the present application further provides a scene error correction device.
Fig. 5 is a schematic diagram according to a fourth embodiment of the present application. As shown in fig. 5, the scene error correction apparatus 100 includes a first obtaining module 110, a second obtaining module 120, a matching module 130, a fact identification module 140, and an error correction module 150, wherein:
The first obtaining module 110 is configured to obtain original text information input by a user in a current scene.
The second obtaining module 120 is configured to obtain a corpus set corresponding to the current scene from a pre-constructed scene corpus.
The matching module 130 is configured to obtain a candidate corpus matched with the original text information from the corpus set.
The fact identification module 140 is configured to perform fact recognition on the original text information according to the candidate corpus to determine whether the original text information has errors.
The error correction module 150 is configured to, if the original text information has errors, correct the original text information according to the candidate corpus and provide the corrected text information to the user.
In an embodiment of the present application, the fact identification module 140 is specifically configured to: extract fact information from the original text information to obtain the fact information described in the original text information, wherein the fact information comprises attribute information of an entity, and the attribute information includes an attribute and a first attribute value of the attribute; acquire a second attribute value of the attribute from the candidate corpus, wherein the second attribute value is the real attribute value of the attribute; and, if the first attribute value and the second attribute value are inconsistent, determine that the original text information has an error.
In an embodiment of the present application, the error correction module 150 is specifically configured to replace the first attribute value in the original text information with the second attribute value to obtain the corrected text information.
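As a rough illustration of the attribute-value comparison and replacement, the sketch below extracts an (entity, attribute, value) triple and swaps in the reference value when the two values differ. The fixed sentence pattern "The <attribute> of <entity> is <value>." and the names `Fact`, `extract_fact`, and `correct_by_attribute` are assumptions made for this example; the application does not specify how fact extraction is performed.

```python
import re
from typing import NamedTuple, Optional


class Fact(NamedTuple):
    entity: str
    attribute: str
    value: str
    start: int  # character span of the value inside the source text
    end: int


# Hypothetical pattern: "The <attribute> of <entity> is <value>."
_PATTERN = re.compile(
    r"^\s*The (?P<attribute>.+?) of (?P<entity>.+?) is (?P<value>.+?)\.?\s*$"
)


def extract_fact(text: str) -> Optional[Fact]:
    """Extract one attribute fact, or None if the text does not fit the pattern."""
    m = _PATTERN.match(text)
    if m is None:
        return None
    return Fact(m["entity"], m["attribute"], m["value"], m.start("value"), m.end("value"))


def correct_by_attribute(original: str, candidate: str) -> str:
    """Replace the first attribute value with the second one when they disagree."""
    stated = extract_fact(original)
    reference = extract_fact(candidate)
    if stated is None or reference is None:
        return original
    if (stated.entity, stated.attribute) != (reference.entity, reference.attribute):
        return original  # the candidate describes a different entity or attribute
    if stated.value == reference.value:
        return original  # values are consistent, so no error is reported
    return original[:stated.start] + reference.value + original[stated.end:]
```

For example, with the original text "The capital of China is Shanghai." and the candidate "The capital of China is Beijing.", the function returns "The capital of China is Beijing.". Replacing by character span rather than by string search avoids touching an identical substring that happens to occur elsewhere in the sentence.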
In one embodiment of the present application, the fact identification module 140 includes:
an inference unit, configured to perform inference analysis on the candidate corpus and the original text information to obtain an inference relation between the candidate corpus and the original text information; and
a first determining unit, configured to determine that the original text information has an error if the inference relation is of a first type, wherein the first type indicates that the candidate corpus is inconsistent with the original text information.
The inference unit is specifically configured to input the candidate corpus and the original text information into a preset natural language inference model to obtain the inference relation between the candidate corpus and the original text information.
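The inference-based check can be sketched as follows, under stated assumptions. The label names ("entailment", "contradiction", "neutral") follow common natural language inference conventions and are not taken from the application; `toy_model` is a deliberately trivial stand-in for a trained model and is included only so the example runs.

```python
from typing import Callable, Iterable

# Assumed interface: (premise, hypothesis) -> label.
NliModel = Callable[[str, str], str]

# Hypothetical label playing the role of the "first type" (inconsistency).
CONTRADICTION = "contradiction"


def has_error(original: str, candidates: Iterable[str], model: NliModel) -> bool:
    """Report an error if any candidate corpus contradicts the original text."""
    return any(model(candidate, original) == CONTRADICTION for candidate in candidates)


def toy_model(premise: str, hypothesis: str) -> str:
    """Toy stand-in: compare sentences that share the same text before ' is '."""
    p_head, _, p_tail = premise.rpartition(" is ")
    h_head, _, h_tail = hypothesis.rpartition(" is ")
    if not p_head or p_head != h_head:
        return "neutral"
    return "entailment" if p_tail == h_tail else CONTRADICTION
```

In practice the callable would wrap whatever preset inference model is deployed; only the mapping from its output to the inconsistent type matters for the decision.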
In one embodiment of the present application, the matching module 130 may include:
a second determining unit, configured to determine the degree of correlation between the original text information and each corpus in the corpus set; and
a selecting unit, configured to select, according to the degrees of correlation, the M corpora with the highest degrees of correlation from the corpus set as candidate corpora, wherein M is a positive integer.
In an embodiment of the application, the second determining unit is specifically configured to: for each corpus in the corpus set, acquire a first keyword feature from the original text information and a second keyword feature from the corpus, and determine the degree of correlation between the original text information and the corpus according to the first keyword feature and the second keyword feature.
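One possible way to realize keyword-based relevance and top-M selection is sketched below. The application does not state how keyword features are compared; Jaccard overlap of keyword sets and the small stopword list are assumptions chosen here for simplicity.

```python
import re
from typing import List, Set

# Hypothetical stopword list used only for this illustration.
_STOPWORDS = {"the", "of", "is", "a", "an", "in"}


def keyword_features(text: str) -> Set[str]:
    """Lower-cased word set with stopwords removed."""
    return {w for w in re.findall(r"\w+", text.lower()) if w not in _STOPWORDS}


def relevance(original: str, corpus: str) -> float:
    """Jaccard overlap between the keyword sets of the two texts."""
    first, second = keyword_features(original), keyword_features(corpus)
    if not first or not second:
        return 0.0
    return len(first & second) / len(first | second)


def top_m(original: str, corpus_set: List[str], m: int) -> List[str]:
    """Select the M corpora with the highest relevance to the original text."""
    if m <= 0:
        raise ValueError("M must be a positive integer")
    ranked = sorted(corpus_set, key=lambda c: relevance(original, c), reverse=True)
    return ranked[:m]
```

Because Python's sort is stable, corpora with equal relevance keep their original order in the corpus set.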
On the basis of the embodiment shown in fig. 5, as shown in fig. 6, the apparatus further includes:
a receiving module 160, configured to receive a scene extension request from a user, wherein the scene extension request includes scene corpus data of a new scene; and
an adding module 170, configured to add the scene corpus data of the new scene to the scene corpus.
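A minimal sketch of scene extension follows, assuming the scene corpus is a mapping from scene name to a list of corpus entries. The class and field names (`SceneExtensionRequest`, `SceneCorpus`, `add_scene`, `corpus_set`) are hypothetical and not taken from the application.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class SceneExtensionRequest:
    """Hypothetical request carrying the corpus data of a new scene."""
    scene: str
    corpus_data: List[str]


class SceneCorpus:
    """Hypothetical store mapping each scene to its corpus set."""

    def __init__(self) -> None:
        self._sets: Dict[str, List[str]] = {}

    def add_scene(self, request: SceneExtensionRequest) -> None:
        """Add the new scene's corpus data, extending the scene if it already exists."""
        self._sets.setdefault(request.scene, []).extend(request.corpus_data)

    def corpus_set(self, scene: str) -> List[str]:
        """Return a copy of the corpus set for a scene (empty if unknown)."""
        return list(self._sets.get(scene, []))
```

Once a scene has been added this way, later inputs in that scene can be matched against its own corpus set rather than the general corpus.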
It should be noted that the explanation of the scene error correction method is also applicable to the scene error correction apparatus of the present embodiment, and is not repeated here.
When the original text information input by the user in the current scene needs to be corrected, the scene error correction device obtains the candidate corpus matched with the original text information from the corpus set corresponding to the current scene, and performs fact recognition on the original text information according to the candidate corpus to determine whether the original text information has errors. When the original text information has errors, the device corrects the original text information according to the candidate corpus and provides the corrected text information to the user. A way of correcting the original text information based on the scene corpus is thus provided, and the original text information in different scenes is corrected accurately by using the result of performing fact recognition on it.
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
Fig. 7 is a block diagram of an electronic device according to an embodiment of the application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in fig. 7, the electronic device includes: one or more processors 701, a memory 702, and interfaces for connecting the various components, including a high-speed interface and a low-speed interface. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories and multiple types of memory, as desired. Also, multiple electronic devices may be connected, with each device providing portions of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In fig. 7, one processor 701 is taken as an example.
The memory 702 is a non-transitory computer readable storage medium as provided herein. The memory stores instructions executable by the at least one processor, so that the at least one processor executes the scene error correction method provided by the application. The non-transitory computer-readable storage medium of the present application stores computer instructions for causing a computer to perform the scene error correction method provided by the present application.
The memory 702, which is a non-transitory computer readable storage medium, may be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the scene correction method in the embodiments of the present application. The processor 701 executes various functional applications of the server and data processing, i.e., implements the scene error correction method in the above-described method embodiments, by running non-transitory software programs, instructions, and modules stored in the memory 702.
The memory 702 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of the electronic device, and the like. Further, the memory 702 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 702 may optionally include memory located remotely from the processor 701, which may be connected to the electronic device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device may further include: an input device 703 and an output device 704. The processor 701, the memory 702, the input device 703 and the output device 704 may be connected by a bus or other means, and fig. 7 illustrates an example of a connection by a bus.
The input device 703 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic apparatus, such as a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, a joystick, or other input devices. The output devices 704 may include a display device, auxiliary lighting devices (e.g., LEDs), and tactile feedback devices (e.g., vibrating motors), among others. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, ASICs (application-specific integrated circuits), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present application can be achieved, and the present invention is not limited herein.
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (16)

1. A method for scene error correction, the method comprising:
acquiring original text information input by a user in a current scene;
acquiring a corpus set corresponding to the current scene from a pre-constructed scene corpus;
obtaining a candidate corpus matched with the original text information from the corpus set;
performing fact recognition on the original text information according to the candidate corpus to determine whether the original text information has errors;
and if the original text information has errors, correcting the error of the original text information according to the candidate corpus, and providing the corrected text information for the user.
2. The method according to claim 1, wherein said performing fact recognition on said original text information according to said candidate corpus to determine whether there is an error in said original text information comprises:
extracting fact information from the original text information to obtain fact information described in the original text information, wherein the fact information includes: attribute information of an entity, the attribute information including an attribute and a first attribute value of the attribute;
acquiring a second attribute value of the attribute in the candidate corpus, wherein the second attribute value is a real attribute value of the attribute;
if the first attribute value is inconsistent with the second attribute value, determining that the original text information has errors;
the performing error correction on the original text information according to the candidate corpus includes:
and replacing the first attribute value in the original text information with the second attribute value to obtain the text information after error correction.
3. The method according to claim 1, wherein said performing fact recognition on said original text information according to said candidate corpus to determine whether there is an error in said original text information comprises:
reasoning and analyzing the candidate corpus and the original text information to obtain a reasoning relation between the candidate corpus and the original text information;
and if the inference relation is of a first type, determining that the original text information has errors, wherein the first type is used for representing that the candidate corpus is inconsistent with the original text information.
4. The method according to claim 3, wherein said reasoning and analyzing the candidate corpus and the original text information to obtain a reasoning relation between the candidate corpus and the original text information comprises:
and inputting the candidate corpus and the original text information into a preset natural language reasoning model to obtain a reasoning relation between the candidate corpus and the original text information.
5. The method according to claim 1, wherein said obtaining a candidate corpus matched with the original text information from the corpus set comprises:
determining the relevancy between the original text information and each corpus in the corpus set;
and according to the relevancy, selecting the first M corpora with high relevancy from the corpus set as candidate corpora, wherein M is a positive integer.
6. The method according to claim 5, wherein said determining the relevancy between the original text information and each corpus in the corpus set comprises:
acquiring a first keyword feature in the original text information and a second keyword feature in the corpus for each corpus in the corpus set;
and determining the relevance of the original text information and the corpus according to the first keyword features and the second keyword features.
7. The method of claim 1, further comprising:
receiving a scene expansion request of the user, wherein the scene expansion request comprises scene corpus data of a new scene;
and adding scene corpus data of the new scene into the scene corpus.
8. An apparatus for scene error correction, the apparatus comprising:
the first acquisition module is used for acquiring original text information input by a user in a current scene;
the second acquisition module is used for acquiring a corpus set corresponding to the current scene from a pre-constructed scene corpus;
the matching module is used for acquiring the candidate corpus matched with the original text information from the corpus set;
the fact identification module is used for carrying out fact identification on the original text information according to the candidate corpus so as to determine whether the original text information has errors or not;
and the error correction module is used for correcting the original text information according to the candidate corpus and providing the corrected text information for the user if the original text information has errors.
9. The apparatus of claim 8, wherein the fact identification module is specifically configured to:
extracting fact information from the original text information to obtain fact information described in the original text information, wherein the fact information includes: attribute information of an entity, the attribute information including an attribute and a first attribute value of the attribute;
acquiring a second attribute value of the attribute in the candidate corpus, wherein the second attribute value is a real attribute value of the attribute;
if the first attribute value is inconsistent with the second attribute value, determining that the original text information has errors;
the error correction module is specifically configured to:
and replacing the first attribute value in the original text information with the second attribute value to obtain the text information after error correction.
10. The apparatus of claim 8, wherein the fact identification module comprises:
the reasoning unit is used for reasoning and analyzing the candidate corpus and the original text information to obtain a reasoning relation between the candidate corpus and the original text information;
and a first determining unit, configured to determine that the original text information has an error if the inference relationship is of a first type, where the first type is used to indicate that the candidate corpus is inconsistent with the original text information.
11. The apparatus according to claim 10, wherein the inference unit is specifically configured to:
and inputting the candidate corpus and the original text information into a preset natural language reasoning model to obtain a reasoning relation between the candidate corpus and the original text information.
12. The apparatus of claim 8, wherein the matching module comprises:
a second determining unit, configured to determine a degree of correlation between the original text information and each corpus in the corpus set;
and the selecting unit is used for selecting the first M corpora with high correlation degrees from the corpus set as candidate corpora according to the correlation degrees, wherein M is a positive integer.
13. The apparatus according to claim 12, wherein the second determining unit is specifically configured to:
acquiring a first keyword feature in the original text information and a second keyword feature in the corpus for each corpus in the corpus set;
and determining the relevance of the original text information and the corpus according to the first keyword features and the second keyword features.
14. The apparatus of claim 8, further comprising:
a receiving module, configured to receive a scene extension request of the user, where the scene extension request includes scene corpus data of a new scene;
and the adding module is used for adding the scene corpus data of the new scene into the scene corpus.
15. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.
16. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-7.
CN201911358292.5A 2019-12-25 2019-12-25 Scene error correction method, device, electronic equipment and storage medium Active CN111090991B (en)


Publications (2)

Publication Number Publication Date
CN111090991A (en) 2020-05-01
CN111090991B (en) 2023-07-04

Family

ID=70397231


Country Status (1)

Country Link
CN (1) CN111090991B (en)



Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190278843A1 (en) * 2017-02-27 2019-09-12 Tencent Technology (Shenzhen) Company Ltd Text entity extraction method, apparatus, and device, and storage medium
US20180286390A1 (en) * 2017-03-28 2018-10-04 Rovi Guides, Inc. Systems and methods for correcting a voice query based on a subsequent voice query with a lower pronunciation rate
CN107239547A (en) * 2017-06-05 2017-10-10 北京智能管家科技有限公司 Voice error correction method, terminal and storage medium for ordering song by voice
CN107679032A (en) * 2017-09-04 2018-02-09 百度在线网络技术(北京)有限公司 Voice changes error correction method and device
CN110232129A (en) * 2019-06-11 2019-09-13 北京百度网讯科技有限公司 Scene error correction method, device, equipment and storage medium
CN110288985A (en) * 2019-06-28 2019-09-27 北京猎户星空科技有限公司 Voice data processing method, device, electronic equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
韦向峰;袁毅;张全;池毓焕;: "富媒体环境下语音和文本内容的对齐研究" *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111724244A (en) * 2020-06-11 2020-09-29 中国建设银行股份有限公司 Objection correction method and device
CN112597754A (en) * 2020-12-23 2021-04-02 北京百度网讯科技有限公司 Text error correction method and device, electronic equipment and readable storage medium
CN112597754B (en) * 2020-12-23 2023-11-21 北京百度网讯科技有限公司 Text error correction method, apparatus, electronic device and readable storage medium
CN113361266A (en) * 2021-06-25 2021-09-07 达闼机器人有限公司 Text error correction method, electronic device and storage medium
CN113361266B (en) * 2021-06-25 2022-12-06 达闼机器人股份有限公司 Text error correction method, electronic device and storage medium
WO2023231987A1 (en) * 2022-05-30 2023-12-07 华为技术有限公司 Text recognition method and electronic device


CN111666417A (en) Method and device for generating synonyms, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant