CN111090991B - Scene error correction method, device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN111090991B
Authority
CN
China
Prior art keywords
text information
corpus
original text
scene
candidate
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911358292.5A
Other languages
Chinese (zh)
Other versions
CN111090991A (en
Inventor
赖佳伟
付志宏
邓卓彬
徐梦笛
罗希意
何径舟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201911358292.5A
Publication of CN111090991A
Application granted
Publication of CN111090991B
Legal status: Active

Landscapes

  • Machine Translation (AREA)

Abstract

The application discloses a scene error correction method, a scene error correction device, an electronic device and a storage medium, and relates to the technical field of error correction. The specific implementation scheme is as follows: when original text information input by a user in a current scene needs to be corrected, a candidate corpus matching the original text information is acquired from the corpus set corresponding to the current scene; fact recognition is performed on the original text information according to the candidate corpus to determine whether the original text information contains an error; and when it does, the original text information is corrected according to the candidate corpus and the corrected text information is provided to the user. A method for correcting original text information based on a scene corpus is thereby provided, and, by using the result of fact recognition on the original text information, original text information in different scenes is corrected accurately.

Description

Scene error correction method, device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a method and apparatus for scene error correction, an electronic device, and a storage medium.
Background
When a user has an insufficient grasp of a language or uses it carelessly, errors such as misused words and misattribution (attributing something to the wrong person or thing) easily occur. In some scenarios even small language errors can have very adverse effects. Error correction techniques address this problem.
Error correction approaches in the related art for input text include a traditional approach and a deep-learning-based approach. The traditional approach translates the erroneous text into correct text by means of a language model and a translation model. The deep-learning-based approach trains an error correction model on a large-scale supervised corpus and applies the trained model directly to the input text. However, in the process of implementing the present application, the inventors found at least the following technical problems in the related art. When the traditional approach is used to correct input text from different scenes, the features of each scene must be selected manually and a scene-specific language model and translation model must be trained, so the approach lacks scalability and generality, and the corrected text it produces may not conform to actual fact. The deep-learning-based approach requires a manually labeled supervised corpus and a different error correction model for each scene; models cannot be shared between scenes, so the cost of scene error correction is high. How to implement scene error correction effectively is therefore a technical problem to be solved.
Disclosure of Invention
The present application provides a scene error correction method, a device, an electronic device and a storage medium, which directly use the candidate corpus in the scene corpus that matches the input text and perform fact recognition on the original text information with that candidate corpus, so that original text information in different scenes is corrected accurately.
An embodiment of a first aspect of the present application provides a scene error correction method, including: acquiring original text information input by a user in a current scene; acquiring a corpus set corresponding to the current scene from a scene corpus set constructed in advance; obtaining a candidate corpus matched with the original text information from the corpus set; carrying out fact recognition on the original text information according to the candidate corpus so as to determine whether the original text information has errors or not; and if the original text information has errors, correcting the error of the original text information according to the candidate corpus, and providing the corrected text information for the user.
In one embodiment of the present application, the performing fact recognition on the original text information according to the candidate corpus to determine whether the original text information has an error includes: extracting the fact information from the original text information to obtain the fact information described in the original text information, wherein the fact information comprises: attribute information of an entity, wherein the attribute information comprises an attribute and a first attribute value of the attribute; acquiring a second attribute value of the attribute in the candidate corpus, wherein the second attribute value is a real attribute value of the attribute; if the first attribute value is inconsistent with the second attribute value, determining that the original text information has errors; the error correction of the original text information according to the candidate corpus comprises the following steps: and replacing the first attribute value in the original text information with the second attribute value to obtain the text information after error correction.
In one embodiment of the present application, the performing fact recognition on the original text information according to the candidate corpus to determine whether the original text information has an error includes: carrying out reasoning analysis on the candidate corpus and the original text information to obtain a reasoning relation between the candidate corpus and the original text information; and if the reasoning relation is of a first type, determining that the original text information has errors, wherein the first type is used for representing that the candidate corpus contradicts the original text information.
In one embodiment of the present application, the performing an inference analysis on the candidate corpus and the original text information to obtain an inference relationship between the candidate corpus and the original text information includes: inputting the candidate corpus and the original text information into a preset natural language reasoning model to obtain the reasoning relation between the candidate corpus and the original text information.
In one embodiment of the present application, the obtaining, from the corpus set, a candidate corpus matching the original text information includes: determining the relevance between the original text information and each corpus in the corpus set; and selecting, according to the relevance, the top M corpora with the highest relevance from the corpus set as the candidate corpus, wherein M is a positive integer.
In one embodiment of the present application, the determining the relevance between the original text information and each corpus in the corpus set includes: aiming at each corpus in the corpus set, acquiring first keyword features in the original text information and acquiring second keyword features in the corpus; and determining the relevance between the original text information and the corpus according to the first keyword features and the second keyword features.
In one embodiment of the present application, the method further comprises: receiving a scene expansion request of the user, wherein the scene expansion request comprises scene corpus data of a new scene; and adding the scene corpus data of the new scene into the scene corpus.
According to the scene error correction method of the embodiments of the present application, when original text information input by a user in a current scene needs to be corrected, a candidate corpus matching the original text information is obtained from the corpus set corresponding to the current scene, fact recognition is performed on the original text information according to the candidate corpus to determine whether the original text information contains an error, and, when it does, the original text information is corrected according to the candidate corpus and the corrected text information is provided to the user. A method for correcting original text information based on a scene corpus is thereby provided, and, by using the result of fact recognition on the original text information, original text information in different scenes is corrected accurately.
An embodiment of a second aspect of the present application proposes a scene error correction device, including: the first acquisition module is used for acquiring original text information input by a user in a current scene; the second acquisition module is used for acquiring a corpus set corresponding to the current scene from a scene corpus set constructed in advance; the matching module is used for acquiring a candidate corpus matched with the original text information from the corpus set; the fact identification module is used for carrying out fact identification on the original text information according to the candidate corpus so as to determine whether the original text information has errors or not; and the error correction module is used for correcting errors of the original text information according to the candidate corpus and providing the corrected text information for the user if the original text information has errors.
In one embodiment of the present application, the fact identification module is specifically configured to: extracting the fact information from the original text information to obtain the fact information described in the original text information, wherein the fact information comprises: attribute information of an entity, wherein the attribute information comprises an attribute and a first attribute value of the attribute; acquiring a second attribute value of the attribute in the candidate corpus, wherein the second attribute value is a real attribute value of the attribute; if the first attribute value is inconsistent with the second attribute value, determining that the original text information has errors; the error correction module is specifically configured to: and replacing the first attribute value in the original text information with the second attribute value to obtain the text information after error correction.
In one embodiment of the present application, the fact identification module includes: the reasoning unit is used for carrying out reasoning analysis on the candidate corpus and the original text information so as to obtain a reasoning relation between the candidate corpus and the original text information; and the first determining unit is used for determining that the original text information has errors if the reasoning relation is of a first type, wherein the first type is used for representing that the candidate corpus contradicts the original text information.
In one embodiment of the present application, the inference unit is specifically configured to: inputting the candidate corpus and the original text information into a preset natural language reasoning model to obtain the reasoning relation between the candidate corpus and the original text information.
In one embodiment of the present application, the matching module includes: the second determining unit is used for determining the relevance between the original text information and each corpus in the corpus set; and the selection unit is used for selecting, according to the relevance, the top M corpora with the highest relevance from the corpus set as the candidate corpus, wherein M is a positive integer.
In one embodiment of the present application, the second determining unit is specifically configured to: aiming at each corpus in the corpus set, acquiring first keyword features in the original text information and acquiring second keyword features in the corpus; and determining the relevance between the original text information and the corpus according to the first keyword features and the second keyword features.
In one embodiment of the present application, the apparatus further comprises: the receiving module is used for receiving a scene expansion request of the user, wherein the scene expansion request comprises scene corpus data of a new scene; and the adding module is used for adding the scene corpus data of the new scene into the scene corpus.
According to the scene error correction device of the embodiments of the present application, when original text information input by a user in a current scene needs to be corrected, a candidate corpus matching the original text information is obtained from the corpus set corresponding to the current scene, fact recognition is performed on the original text information according to the candidate corpus to determine whether the original text information contains an error, and, when it does, the original text information is corrected according to the candidate corpus and the corrected text information is provided to the user. A device for correcting original text information based on a scene corpus is thereby provided, and, by using the result of fact recognition on the original text information, original text information in different scenes is corrected accurately.
An embodiment of a third aspect of the present application provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the scene error correction method of the embodiments of the present application.
An embodiment of a fourth aspect of the present application provides a non-transitory computer-readable storage medium storing computer instructions for causing a computer to execute the scene error correction method disclosed in the embodiments of the present application.
One embodiment of the above application has the following advantages or benefits: the candidate corpus in the scene corpus that matches the input text is used directly, and fact recognition is performed on the original text information with that candidate corpus, so that original text information in different scenes is corrected accurately. Because the technical means adopted are to acquire a candidate corpus matching the original text information from the corpus set corresponding to the current scene, perform fact recognition on the original text information according to the candidate corpus to determine whether it contains an error, and, when it does, correct it according to the candidate corpus and provide the corrected text information to the user, the technical problem in the related art that corrected text may not conform to actual fact is overcome. The technical effect achieved is that fact recognition is performed on the original text information using the candidate corpus of the corresponding scene, and original text information in different scenes is corrected accurately.
Other effects of the above alternative will be described below in connection with specific embodiments.
Drawings
The drawings are for better understanding of the present solution and do not constitute a limitation of the present application. Wherein:
FIG. 1 is a schematic diagram according to a first embodiment of the present application;
FIG. 2 is a schematic diagram according to a second embodiment of the present application;
FIG. 3 is a schematic diagram according to a third embodiment of the present application;
FIG. 4 is a schematic illustration of a specific flow according to one embodiment of the present application;
FIG. 5 is a schematic diagram according to a fourth embodiment of the present application;
FIG. 6 is a schematic diagram according to a fifth embodiment of the present application;
FIG. 7 is a block diagram of an electronic device used to implement an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present application are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present application to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The scene error correction method, apparatus, electronic device and storage medium according to the embodiments of the present application are described below with reference to the accompanying drawings.
Fig. 1 is a schematic diagram according to a first embodiment of the present application. It should be noted that the scene error correction method in this embodiment is executed by a scene error correction device, which may be implemented in software and/or hardware and may be configured in an electronic device. The electronic device may include, but is not limited to, a terminal device or a server, which is not specifically limited in this embodiment.
As shown in fig. 1, the scene error correction method may include:
Step 101, acquiring original text information input by a user in a current scene.
Specifically, while a user is inputting text in the current scene and that text needs to be corrected, the original text information currently input by the user in the current scene can be obtained.
In this embodiment, to meet users' personalized input needs, the original text information may be input by voice, text, image, and so on. For voice input, after the voice information input by the user is obtained, speech recognition can be performed on it and the recognized text used as the original text information. For image input, after the image input by the user is acquired, character recognition can be performed on the image and the recognized characters used as the original text information.
The current scene in this embodiment may be a customer service voice scene, a text review scene, a map scene, a search scene, etc., which is not limited in detail in this embodiment.
Step 102, acquiring a corpus set corresponding to the current scene from a scene corpus set constructed in advance.
In this embodiment, a corpus set corresponding to each scene is stored in the scene corpus, where each corpus set includes a large number of corpora.
It should be understood that the information described by the corpora in the corpus set in this embodiment is correct text information commonly used in the corresponding scene, and this correct text information is described based on actual facts.
Step 103, obtaining the candidate corpus matched with the original text information from the corpus set.
In this embodiment, the candidate corpus matched with the original text information may be obtained from the corpus set as follows: determining the relevance between the original text information and each corpus in the corpus set; and selecting, according to the relevance, the top M corpora with the highest relevance from the corpus set as the candidate corpus, wherein M is a positive integer.
In this embodiment, the relevance of the original text information and each corpus in the corpus set in the keyword feature dimension may be determined based on a keyword matching technique.
Specifically, for each corpus in the corpus set, first keyword features in the original text information can be obtained, and second keyword features in the corpus are obtained; and determining the relevance between the original text information and the corpus according to the first keyword features and the second keyword features.
In this embodiment, to further improve the accuracy of the selected candidate corpus, as an exemplary implementation, the relevance between the original text information and each corpus in the corpus set may be calculated in each preset feature dimension, and the overall relevance between the original text information and the corresponding corpus may then be determined from these per-dimension relevance values.
The preset feature dimensions may include a keyword feature dimension, a semantic feature dimension, and a dependency syntax feature dimension.
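The candidate selection of step 103 can be sketched as follows for the keyword feature dimension only. This is a minimal illustration under stated assumptions, not the patented implementation: the tokenizer `keyword_features`, the Jaccard overlap used in `relevance`, and the sample corpus entries are all hypothetical choices, since the text does not specify how keyword features are extracted or compared.

```python
import re
from typing import List, Set, Tuple


def keyword_features(text: str) -> Set[str]:
    """Hypothetical keyword extractor: lowercase word tokens as a set."""
    return set(re.findall(r"\w+", text.lower()))


def relevance(original: str, corpus_entry: str) -> float:
    """Jaccard overlap of the two keyword sets (an assumed relevance measure)."""
    first = keyword_features(original)       # first keyword features
    second = keyword_features(corpus_entry)  # second keyword features
    if not first or not second:
        return 0.0
    return len(first & second) / len(first | second)


def top_m_candidates(original: str, corpus_set: List[str], m: int) -> List[Tuple[str, float]]:
    """Return the M corpus entries with the highest relevance, best first."""
    scored = [(entry, relevance(original, entry)) for entry in corpus_set]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:m]


# Illustrative corpus set for one scene (made-up sentences).
corpus_set = [
    "The museum opens at nine in the morning",
    "The museum is closed on Mondays",
    "Parking near the station is free",
]
candidates = top_m_candidates("The museum opens at ten", corpus_set, m=2)
```

Supporting the semantic and dependency syntax dimensions would mean adding further scoring functions and combining their outputs; how they are combined is not specified in the text.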
Step 104, carrying out fact recognition on the original text information according to the candidate corpus to determine whether the original text information has errors.
In this embodiment, fact recognition on the original text information according to the candidate corpus may be implemented as follows: reasoning analysis is performed on the candidate corpus and the original text information to obtain the reasoning relation between them, and if the reasoning relation is of a first type, it is determined that the original text information has an error.
The first type is used for representing contradiction between the candidate corpus and the original text information.
In addition to the first type, the reasoning relation in this embodiment may include a second type, which indicates that the candidate corpus is consistent with the original text information. When the reasoning relation is of the second type, the fact information described in the candidate corpus and in the original text information is consistent.
In this embodiment, to determine the reasoning relation between the candidate corpus and the original text information conveniently and quickly, the candidate corpus and the original text information may be input into a preset natural language inference (NLI) model, which outputs the reasoning relation between them.
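The decision rule of step 104 can be sketched as below. The text names no particular NLI model, so `toy_nli_model` is a deliberately trivial stand-in, and the `NliModel` signature is hypothetical. Flagging an error as soon as any one candidate contradicts the original text is also an assumption; the text does not say how results from multiple candidates are combined.

```python
from enum import Enum
from typing import Callable, List


class Relation(Enum):
    CONTRADICTION = "first type"  # candidate corpus contradicts the original text
    CONSISTENT = "second type"    # candidate corpus agrees with the original text


# Hypothetical interface for a preset NLI model: (premise, hypothesis) -> Relation.
NliModel = Callable[[str, str], Relation]


def toy_nli_model(premise: str, hypothesis: str) -> Relation:
    """Stand-in for a trained model: only a case-insensitive exact match is consistent."""
    same = premise.strip().lower() == hypothesis.strip().lower()
    return Relation.CONSISTENT if same else Relation.CONTRADICTION


def has_error(original: str, candidates: List[str], model: NliModel) -> bool:
    """Assumed policy: an error exists if any candidate contradicts the original text."""
    return any(model(candidate, original) is Relation.CONTRADICTION
               for candidate in candidates)


flagged = has_error("the wife is A", ["the wife is B"], toy_nli_model)
```

In practice the stand-in would be replaced by a trained model that may also produce a neutral relation, which this sketch does not cover.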
Step 105, if the original text information has errors, correcting the errors of the original text information according to the candidate corpus, and providing the corrected text information to the user.
For example, in a search scene, the original text information entered by the user in the search box is "the wife is A", whereas in fact the wife is B. Error correction methods in the related art would not correct the "A" input by the user. In the error correction method provided in this embodiment, when fact recognition is performed on the original text information using the candidate corpus, it can be determined that the wife described in the original text information is wrong; the original text information is then modified to "the wife is B" according to the candidate corpus, and "the wife is B" is provided to the user as the corrected text information. In this way, the original text information is corrected accurately in the corresponding scene on the basis of real fact information.
According to the scene error correction method of the embodiments of the present application, when original text information input by a user in a current scene needs to be corrected, a candidate corpus matching the original text information is obtained from the corpus set corresponding to the current scene, fact recognition is performed on the original text information according to the candidate corpus to determine whether the original text information contains an error, and, when it does, the original text information is corrected according to the candidate corpus and the corrected text information is provided to the user. A method for correcting original text information based on a scene corpus is thereby provided, and, by using the result of fact recognition on the original text information, original text information in different scenes is corrected accurately.
Building on the above embodiment, in practical applications, when original text information input by a user in a new scene needs to be corrected, the related art requires manual feature selection and manual labeling. In this embodiment, by contrast, corpus data of the new scene is added to the scene corpus, so that original text information input by the user in the new scene can subsequently be corrected conveniently on the basis of the updated scene corpus. Updating the scene corpus in this embodiment is further described below with reference to Fig. 2.
As shown in fig. 2, the method may further include:
Step 201, receiving a scene expansion request of a user, wherein the scene expansion request includes scene corpus data of a new scene.
In this embodiment, to improve the accuracy of error correction for text in the new scene, corpora in the scene corpus data of the new scene that do not meet a predetermined standard can be cleaned out, and the processed scene corpus data is stored in the scene corpus.
Step 202, adding scene corpus data of the new scene into the scene corpus.
In this embodiment, error correction scenes can be expanded by adding the scene corpus data of a new scene to the scene corpus. Expanding to a new scene requires no manual feature selection or manual labeling, which greatly reduces the labor cost of supporting a new error correction scene, and new scenes can be added to the scene corpus conveniently at any time as needed.
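Steps 201 and 202 can be sketched as a small in-memory store. The class name `SceneCorpus`, its methods, and the cleaning rule (collapse whitespace, drop empty and duplicate entries) are hypothetical; the text only says that corpora failing a predetermined standard are cleaned out, without defining that standard.

```python
from typing import Dict, List


class SceneCorpus:
    """In-memory scene corpus: scene name -> corpus set (hypothetical layout)."""

    def __init__(self) -> None:
        self._sets: Dict[str, List[str]] = {}

    def corpus_set(self, scene: str) -> List[str]:
        """Return a copy of the corpus set for a scene (empty if the scene is unknown)."""
        return list(self._sets.get(scene, []))

    def handle_expansion_request(self, scene: str, corpus_data: List[str]) -> int:
        """Clean the submitted corpus data, add it, and return the number of entries added."""
        existing = self._sets.setdefault(scene, [])
        seen = set(existing)
        added = 0
        for entry in corpus_data:
            cleaned = " ".join(entry.split())  # collapse runs of whitespace
            # Assumed cleaning standard: skip empty and duplicate entries.
            if not cleaned or cleaned in seen:
                continue
            existing.append(cleaned)
            seen.add(cleaned)
            added += 1
        return added


library = SceneCorpus()
added = library.handle_expansion_request(
    "map",
    ["Main Street runs  north to south", "", "Main Street runs north to south"],
)
```

No model is retrained here: supporting a new scene amounts to storing its corpus data, which is the property the embodiment relies on.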
Fig. 3 is a schematic diagram according to a third embodiment of the present application. This embodiment further refines or extends the above embodiments.
As shown in fig. 3, the method may include:
Step 301, obtaining original text information input by a user in a current scene.
Step 302, obtaining a corpus set corresponding to a current scene from a pre-constructed scene corpus.
Step 303, obtaining a candidate corpus matched with the original text information from the corpus set.
Step 304, extracting the fact information from the original text information to obtain the fact information described in the original text information, where the fact information includes: attribute information of an entity, the attribute information including an attribute and a first attribute value of the attribute.
Step 305, obtaining a second attribute value of the attribute in the candidate corpus, wherein the second attribute value is a real attribute value of the attribute.
In this embodiment, fact information extraction may be performed on the candidate corpus to obtain the fact information it describes, from which the second attribute value of the attribute is read. The fact information described in the original text information may include the first attribute value of an attribute of the entity, and may also include attribute values of other attributes of the entity.
Step 306, if the first attribute value and the second attribute value are not consistent, determining that the original text information has an error.
Step 307, replacing the first attribute value in the original text information with the second attribute value to obtain the text information after error correction, and providing the text information after error correction to the user.
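Steps 304 to 307 can be sketched with a single regular expression standing in for fact extraction. The sentence pattern "<entity>'s <attribute> is <value>" and the example sentences are assumptions made purely for illustration; the text does not describe how fact information is actually extracted.

```python
import re

# Assumed sentence shape: "<entity>'s <attribute> is <value>", optional final period.
# A real system would need a proper information-extraction component.
_PATTERN = re.compile(r"^(?P<entity>.+?)'s (?P<attribute>.+?) is (?P<value>.+?)\.?$")


def correct(original: str, candidate: str) -> str:
    """Replace the first attribute value with the second when the two disagree."""
    stated = _PATTERN.match(original)   # fact described in the original text
    real = _PATTERN.match(candidate)    # fact described in the candidate corpus
    if stated is None or real is None:
        return original                 # no fact extracted, nothing to compare
    same_attribute = (
        (stated["entity"], stated["attribute"]) == (real["entity"], real["attribute"])
    )
    if not same_attribute or stated["value"] == real["value"]:
        return original                 # unrelated fact, or values already consistent
    start, end = stated.span("value")
    # Substitute only the value span so the rest of the sentence is untouched.
    return original[:start] + real["value"] + original[end:]


corrected = correct(
    "The library's closing time is 8 pm.",
    "The library's closing time is 9 pm.",
)
```

Replacing only the matched span keeps the user's wording intact apart from the corrected attribute value.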
To make the present application clearer to those skilled in the art, the scene error correction method of this embodiment is described below with reference to Fig. 4.
In this embodiment, the corpus includes scene-specific corpora and a general corpus. Specifically, after a user inputs text in the current scene, it may be determined whether corpus data for the current scene exists in the scene corpus. If not, candidate corpora related to the input text may be retrieved from the general corpus by a search engine; if so, candidate corpora related to the input text may be retrieved by the search engine from the scene corpus corresponding to the current scene. The input text and the candidate corpora are then fed into a verification module, which uses the candidate corpora to determine, based on a natural language reasoning model or a semantic matching model, whether the input text is correct; when the input text is incorrect, the module corrects it based on the candidate corpora and outputs the corrected text.
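The flow described for Fig. 4 can be sketched as follows, with the search engine and the verification module passed in as plain functions. All names here (`select_corpus`, `run_flow`, `toy_retrieve`, `toy_verify`) and the sample data are hypothetical, and the toy retriever and verifier are crude placeholders rather than the models the text refers to.

```python
from typing import Callable, Dict, List, Optional

Retriever = Callable[[str, List[str]], List[str]]     # stands in for the search engine
Verifier = Callable[[str, List[str]], Optional[str]]  # corrected text, or None if correct


def select_corpus(scene: str, scene_corpora: Dict[str, List[str]],
                  general: List[str]) -> List[str]:
    """Use the scene-specific corpus when it exists, otherwise the general corpus."""
    return scene_corpora.get(scene) or general


def run_flow(text: str, scene: str, scene_corpora: Dict[str, List[str]],
             general: List[str], retrieve: Retriever, verify: Verifier) -> str:
    corpus = select_corpus(scene, scene_corpora, general)
    candidates = retrieve(text, corpus)
    corrected = verify(text, candidates)
    return text if corrected is None else corrected


def toy_retrieve(text: str, corpus: List[str]) -> List[str]:
    """Placeholder retrieval: keep entries sharing at least one word with the text."""
    words = set(text.lower().split())
    return [entry for entry in corpus if words & set(entry.lower().split())]


def toy_verify(text: str, candidates: List[str]) -> Optional[str]:
    """Placeholder verification: text is correct only if it equals a candidate."""
    if not candidates or text in candidates:
        return None
    return candidates[0]


scene_corpora = {"map": ["The station is on Main Street"]}
general = ["Water boils at 100 degrees Celsius at sea level"]

in_scene = run_flow("The station is on Oak Street", "map",
                    scene_corpora, general, toy_retrieve, toy_verify)
fallback = run_flow("Water boils at 90 degrees Celsius at sea level", "chat",
                    scene_corpora, general, toy_retrieve, toy_verify)
```

Passing the retriever and verifier as parameters mirrors the separation between retrieval and verification in the described flow, so either can be swapped without touching the corpus selection logic.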
In order to achieve the above embodiments, the present application further provides a scene error correction device.
Fig. 5 is a schematic diagram according to a fifth embodiment of the present application. As shown in fig. 5, the scene error correction apparatus 100 includes a first acquisition module 110, a second acquisition module 120, a matching module 130, a fact identification module 140, and an error correction module 150, wherein:
The first acquisition module 110 is configured to obtain original text information input by a user in a current scene.
The second acquisition module 120 is configured to obtain a corpus set corresponding to the current scene from a pre-constructed scene corpus.
The matching module 130 is configured to obtain a candidate corpus matching the original text information from the corpus set.
The fact identification module 140 is configured to perform fact recognition on the original text information according to the candidate corpus, so as to determine whether the original text information has an error.
The error correction module 150 is configured to correct the original text information according to the candidate corpus if the original text information has an error, and to provide the corrected text information to the user.
In one embodiment of the present application, the fact identification module 140 is specifically configured to: extract fact information from the original text information to obtain the fact information described in the original text information, where the fact information includes attribute information of an entity, and the attribute information includes an attribute and a first attribute value of the attribute; obtain a second attribute value of the attribute from the candidate corpus, where the second attribute value is the real attribute value of the attribute; and, if the first attribute value and the second attribute value are inconsistent, determine that the original text information has an error.
In one embodiment of the present application, the error correction module 150 is specifically configured to replace the first attribute value in the original text information with the second attribute value to obtain the corrected text information.
In one embodiment of the present application, the fact identification module 140 includes:
A reasoning unit, configured to perform inference analysis on the candidate corpus and the original text information to obtain the inference relationship between the candidate corpus and the original text information.
A first determining unit, configured to determine that the original text information has an error if the inference relationship is of a first type, where the first type indicates that the candidate corpus contradicts the original text information.
The reasoning unit is specifically configured to input the candidate corpus and the original text information into a preset natural language reasoning model to obtain the inference relationship between the candidate corpus and the original text information.
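The contradiction check performed by the reasoning unit and the first determining unit can be sketched as below. The application does not specify the natural language reasoning model, so the sketch accepts any callable that maps a (premise, hypothesis) pair to a label. The function `toy_nli` is a deliberately naive, hypothetical stand-in that compares numbers only; it is not a real model, and the label strings are assumptions.

```python
import re
from typing import Callable, Iterable

# Hypothetical interface: (premise, hypothesis) -> "entailment" | "contradiction" | "neutral".
NliModel = Callable[[str, str], str]


def toy_nli(premise: str, hypothesis: str) -> str:
    """Naive stand-in for a trained model: differing numbers count as a contradiction."""
    if premise == hypothesis:
        return "entailment"
    premise_numbers = set(re.findall(r"\d+", premise))
    hypothesis_numbers = set(re.findall(r"\d+", hypothesis))
    if premise_numbers and hypothesis_numbers and premise_numbers != hypothesis_numbers:
        return "contradiction"
    return "neutral"


def has_error(text: str, candidates: Iterable[str], nli_model: NliModel) -> bool:
    """Flag the original text if any candidate corpus contradicts it (the 'first type')."""
    return any(nli_model(candidate, text) == "contradiction" for candidate in candidates)
```

In practice `toy_nli` would be replaced by a trained natural language inference model; only the callable's contract matters to `has_error`.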
In one embodiment of the present application, the matching module 130 may include:
A second determining unit, configured to determine the relevance between the original text information and each corpus in the corpus set.
A selection unit, configured to select, according to the relevance, the M corpora with the highest relevance from the corpus set as the candidate corpus, where M is a positive integer.
In one embodiment of the present application, the second determining unit is specifically configured to: for each corpus in the corpus set, obtain first keyword features from the original text information and second keyword features from the corpus, and determine the relevance between the original text information and the corpus according to the first keyword features and the second keyword features.
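The relevance ranking carried out by the second determining unit and the selection unit can be sketched as follows. The application does not define the keyword features or the relevance measure, so this sketch assumes lowercase word sets as keyword features and Jaccard overlap as the relevance score; the names `keyword_features`, `relevance`, and `top_m` are hypothetical.

```python
import heapq
import re
from typing import List, Set


def keyword_features(text: str) -> Set[str]:
    """Assumed keyword features: the set of lowercase word tokens in the text."""
    return set(re.findall(r"\w+", text.lower()))


def relevance(first: Set[str], second: Set[str]) -> float:
    """Assumed relevance measure: Jaccard overlap of the two keyword sets."""
    union = first | second
    if not union:
        return 0.0
    return len(first & second) / len(union)


def top_m(text: str, corpus_set: List[str], m: int) -> List[str]:
    """Select the M corpora most relevant to the original text as candidates."""
    if m < 1:
        raise ValueError("M must be a positive integer")
    query = keyword_features(text)
    return heapq.nlargest(
        m, corpus_set, key=lambda corpus: relevance(query, keyword_features(corpus))
    )
```

A production system would more likely rely on a search engine's ranking, as the description of fig. 4 suggests; the set-overlap score here only makes the "first keyword features versus second keyword features" comparison concrete.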
On the basis of the embodiment shown in fig. 5, as shown in fig. 6, the apparatus further includes:
The receiving module 160, configured to receive a scene extension request of a user, where the scene extension request includes scene corpus data of a new scene.
The adding module 170, configured to add the scene corpus data of the new scene to the scene corpus.
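The scene extension handled by the receiving module and the adding module can be sketched as a small in-memory store. The request layout (a dictionary with `scene` and `corpora` keys) and the names `SceneCorpus` and `handle_extension_request` are assumptions made for illustration; the application does not specify a request format or storage backend.

```python
from typing import Dict, List


class SceneCorpus:
    """Hypothetical in-memory scene corpus keyed by scene name."""

    def __init__(self) -> None:
        self._scenes: Dict[str, List[str]] = {}

    def add_scene(self, scene: str, corpora: List[str]) -> None:
        """Add corpus data for a scene, creating the scene if it does not exist yet."""
        self._scenes.setdefault(scene, []).extend(corpora)

    def get(self, scene: str) -> List[str]:
        """Return a copy of the corpus set for a scene (empty if the scene is unknown)."""
        return list(self._scenes.get(scene, []))


def handle_extension_request(corpus: SceneCorpus, request: Dict[str, object]) -> None:
    """Receive a scene extension request and add its corpus data to the scene corpus."""
    scene = str(request["scene"])
    corpora = [str(item) for item in request["corpora"]]  # assumed to be an iterable of strings
    corpus.add_scene(scene, corpora)
```

Once a new scene has been added this way, later inputs in that scene can be matched against its own corpus set rather than a general one.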
It should be noted that the foregoing explanation of the scene error correction method is also applicable to the scene error correction device of the present embodiment, and will not be repeated here.
According to the scene error correction device of this embodiment, when original text information input by a user in a current scene needs to be corrected, a candidate corpus matching the original text information is obtained from the corpus set corresponding to the current scene, and fact recognition is performed on the original text information according to the candidate corpus to determine whether the original text information has an error. When the original text information has an error, it is corrected according to the candidate corpus, and the corrected text information is provided to the user. A way of correcting the original text information based on the scene corpus is thus provided, and by drawing on the result of fact recognition performed on the original text information, the original text information can be corrected accurately in different scenes.
According to embodiments of the present application, an electronic device and a readable storage medium are also provided.
Fig. 7 is a block diagram of an electronic device according to an embodiment of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be exemplary only, and are not meant to limit implementations of the application described and/or claimed herein.
As shown in fig. 7, the electronic device includes: one or more processors 701, memory 702, and interfaces for connecting the various components, including high-speed interfaces and low-speed interfaces. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions executed within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output device, such as a display device coupled to the interface. In other embodiments, multiple processors and/or multiple buses may be used, if desired, along with multiple memories. Also, multiple electronic devices may be connected, each providing a portion of the necessary operations (e.g., as a server array, a set of blade servers, or a multiprocessor system). One processor 701 is illustrated in fig. 7.
Memory 702 is a non-transitory computer-readable storage medium provided herein. The memory stores instructions executable by the at least one processor to cause the at least one processor to perform the scene correction method provided herein. The non-transitory computer readable storage medium of the present application stores computer instructions for causing a computer to perform the scene correction method provided by the present application.
The memory 702 is used as a non-transitory computer readable storage medium for storing non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the scene correction method in the embodiments of the present application. The processor 701 executes various functional applications of the server and data processing, i.e., implements the scene correction method in the above-described method embodiments, by running non-transitory software programs, instructions, and modules stored in the memory 702.
Memory 702 may include a storage program area that may store an operating system, at least one application program required for functionality, and a storage data area; the storage data area may store data created according to the use of the electronic device, etc. In addition, the memory 702 may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, memory 702 may optionally include memory located remotely from processor 701, which may be connected to the electronic device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device may further include: an input device 703 and an output device 704. The processor 701, the memory 702, the input device 703, and the output device 704 may be connected by a bus or in other ways; connection by a bus is taken as an example in fig. 7.
The input device 703 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device, such as a touch screen, keypad, mouse, trackpad, touchpad, pointer stick, one or more mouse buttons, trackball, joystick, and like input devices. The output device 704 may include a display apparatus, auxiliary lighting devices (e.g., LEDs), and haptic feedback devices (e.g., vibration motors), among others. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device may be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application-specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor that can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also referred to as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented in a high-level procedural and/or object-oriented programming language, and/or in assembly/machine language. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic disks, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps described in the present application may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions disclosed in the present application can be achieved; no limitation is imposed herein.
The above embodiments do not limit the scope of the application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application are intended to be included within the scope of the present application.

Claims (14)

1. A method of scene correction, the method comprising:
acquiring original text information input by a user in a current scene;
acquiring a corpus set corresponding to the current scene from a scene corpus set constructed in advance;
obtaining a candidate corpus matched with the original text information from the corpus set;
carrying out fact recognition on the original text information according to the candidate corpus so as to determine whether the original text information has errors or not;
if the original text information is wrong, correcting the error of the original text information according to the candidate corpus, and providing the corrected text information to the user;
the step of performing fact recognition on the original text information according to the candidate corpus to determine whether the original text information has errors, including:
extracting the fact information from the original text information to obtain the fact information described in the original text information, wherein the fact information comprises: attribute information of an entity, wherein the attribute information comprises an attribute and a first attribute value of the attribute;
acquiring a second attribute value of the attribute in the candidate corpus, wherein the second attribute value is a real attribute value of the attribute;
if the first attribute value is inconsistent with the second attribute value, determining that the original text information has errors;
or, alternatively,
carrying out reasoning analysis on the candidate corpus and the original text information to obtain a reasoning relation between the candidate corpus and the original text information;
and if the reasoning relation is of a first type, determining that the original text information has errors, wherein the first type is used for representing that the candidate corpus contradicts the original text information.
2. The method of claim 1, wherein
the error correction of the original text information according to the candidate corpus comprises the following steps:
and replacing the first attribute value in the original text information with the second attribute value to obtain the text information after error correction.
3. The method of claim 1, wherein performing an inference analysis on the candidate corpus and the original text information to obtain an inference relationship between the candidate corpus and the original text information comprises:
inputting the candidate corpus and the original text information into a preset natural language reasoning model to obtain the reasoning relation between the candidate corpus and the original text information.
4. The method of claim 1, wherein the obtaining, from the corpus set, a candidate corpus that matches the original text information comprises:
determining the relevance of the original text information and each corpus in the corpus set;
and selecting, according to the relevance, the M corpora with the highest relevance from the corpus set as the candidate corpus, wherein M is a positive integer.
5. The method of claim 4, wherein the determining the relevance of the original text information to each corpus in the corpus set comprises:
aiming at each corpus in the corpus set, acquiring first keyword features in the original text information and acquiring second keyword features in the corpus;
and determining the relevance between the original text information and the corpus according to the first keyword features and the second keyword features.
6. The method according to claim 1, wherein the method further comprises:
receiving a scene expansion request of the user, wherein the scene expansion request comprises scene corpus data of a new scene;
and adding the scene corpus data of the new scene into the scene corpus.
7. A scene correction device, the device comprising:
the first acquisition module is used for acquiring original text information input by a user in a current scene;
the second acquisition module is used for acquiring a corpus set corresponding to the current scene from a scene corpus set constructed in advance;
the matching module is used for acquiring a candidate corpus matched with the original text information from the corpus set;
the fact identification module is used for carrying out fact identification on the original text information according to the candidate corpus so as to determine whether the original text information has errors or not;
the error correction module is used for correcting errors of the original text information according to the candidate corpus and providing the corrected text information for the user if the original text information has errors;
the fact identification module is specifically configured to:
extracting the fact information from the original text information to obtain the fact information described in the original text information, wherein the fact information comprises: attribute information of an entity, wherein the attribute information comprises an attribute and a first attribute value of the attribute;
acquiring a second attribute value of the attribute in the candidate corpus, wherein the second attribute value is a real attribute value of the attribute;
if the first attribute value is inconsistent with the second attribute value, determining that the original text information has errors;
alternatively, the fact identification module includes:
the reasoning unit is used for carrying out reasoning analysis on the candidate corpus and the original text information so as to obtain a reasoning relation between the candidate corpus and the original text information;
and the first determining unit is used for determining that the original text information has errors if the reasoning relation is of a first type, wherein the first type is used for representing that the candidate corpus contradicts the original text information.
8. The apparatus of claim 7, wherein
the error correction module is specifically configured to:
and replacing the first attribute value in the original text information with the second attribute value to obtain the text information after error correction.
9. The apparatus according to claim 7, wherein the inference unit is specifically configured to:
inputting the candidate corpus and the original text information into a preset natural language reasoning model to obtain the reasoning relation between the candidate corpus and the original text information.
10. The apparatus of claim 7, wherein the matching module comprises:
the second determining unit is used for determining the relevance between the original text information and each corpus in the corpus set;
and the selection unit is used for selecting, according to the relevance, the M corpora with the highest relevance from the corpus set as the candidate corpus, wherein M is a positive integer.
11. The apparatus according to claim 10, wherein the second determining unit is specifically configured to:
aiming at each corpus in the corpus set, acquiring first keyword features in the original text information and acquiring second keyword features in the corpus;
and determining the relevance between the original text information and the corpus according to the first keyword features and the second keyword features.
12. The apparatus of claim 7, wherein the apparatus further comprises:
the receiving module is used for receiving a scene expansion request of the user, wherein the scene expansion request comprises scene corpus data of a new scene;
and the adding module is used for adding the scene corpus data of the new scene into the scene corpus.
13. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-6.
14. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-6.
CN201911358292.5A 2019-12-25 2019-12-25 Scene error correction method, device, electronic equipment and storage medium Active CN111090991B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911358292.5A CN111090991B (en) 2019-12-25 2019-12-25 Scene error correction method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911358292.5A CN111090991B (en) 2019-12-25 2019-12-25 Scene error correction method, device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111090991A CN111090991A (en) 2020-05-01
CN111090991B true CN111090991B (en) 2023-07-04

Family

ID=70397231

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911358292.5A Active CN111090991B (en) 2019-12-25 2019-12-25 Scene error correction method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN111090991B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111724244A (en) * 2020-06-11 2020-09-29 中国建设银行股份有限公司 Objection correction method and device
CN112597754B (en) * 2020-12-23 2023-11-21 北京百度网讯科技有限公司 Text error correction method, apparatus, electronic device and readable storage medium
CN113361266B (en) * 2021-06-25 2022-12-06 达闼机器人股份有限公司 Text error correction method, electronic device and storage medium
CN117197811A (en) * 2022-05-30 2023-12-08 华为技术有限公司 Text recognition method and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107239547A (en) * 2017-06-05 2017-10-10 北京智能管家科技有限公司 Voice error correction method, terminal and storage medium for ordering song by voice
CN107679032A (en) * 2017-09-04 2018-02-09 百度在线网络技术(北京)有限公司 Voice changes error correction method and device
CN110232129A (en) * 2019-06-11 2019-09-13 北京百度网讯科技有限公司 Scene error correction method, device, equipment and storage medium
CN110288985A (en) * 2019-06-28 2019-09-27 北京猎户星空科技有限公司 Voice data processing method, device, electronic equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106910501B (en) * 2017-02-27 2019-03-01 腾讯科技(深圳)有限公司 Text entities extracting method and device
US10430449B2 (en) * 2017-03-28 2019-10-01 Rovi Guides, Inc. Systems and methods for correcting a voice query based on a subsequent voice query with a lower pronunciation rate

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
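Wei Xiangfeng; Yuan Yi; Zhang Quan; Chi Yuhuan. Research on the alignment of speech and text content in a rich-media environment. 情报工程 (Issue 02), full text. *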

Also Published As

Publication number Publication date
CN111090991A (en) 2020-05-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant