WO2021174823A1 - Grammatical error correction method, apparatus, computer system, and readable storage medium - Google Patents

Grammatical error correction method, apparatus, computer system, and readable storage medium

Info

Publication number
WO2021174823A1
WO2021174823A1 PCT/CN2020/118197 CN2020118197W WO2021174823A1 WO 2021174823 A1 WO2021174823 A1 WO 2021174823A1 CN 2020118197 W CN2020118197 W CN 2020118197W WO 2021174823 A1 WO2021174823 A1 WO 2021174823A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
real
cursor
smart
error correction
Prior art date
Application number
PCT/CN2020/118197
Other languages
French (fr)
Chinese (zh)
Inventor
金晓辉
阮晓雯
徐亮
Original Assignee
平安科技(深圳)有限公司
Priority date
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021174823A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/42 Syntactic analysis
    • G06F8/423 Preprocessors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Definitions

  • This application relates to the field of intelligent decision-making in artificial intelligence, and in particular to a method, device, computer system, and readable storage medium for grammatical error correction.
  • The above-mentioned approach can only correct grammar based on a limited set of predefined rules and mapping functions; it is time-consuming and its error correction efficiency is low.
  • The purpose of this application is to provide a grammatical error correction method, device, computer system, and readable storage medium that solve the problem that grammatical error correction in the prior art can only be based on limited predefined rules and mapping functions, is time-consuming, and has low error correction efficiency.
  • This application provides a grammatical error correction method, including:
  • This application also provides a grammatical error correction device, including:
  • the preprocessing module is used to obtain the initial text, and insert a smart cursor that can perform actions at a preset position of the initial text;
  • the state acquisition module is used to mark the real-time state of the initial text with the smart cursor to obtain real-time state information
  • An action determining module configured to determine the action data of the smart cursor according to the real-time status information
  • the action execution module is configured to use the smart cursor to process the initial text based on the action data to obtain the target text.
  • The present application also provides a computer-readable storage medium, which includes multiple storage media, each of which stores a computer program; when the computer programs stored in the multiple storage media are executed by a processor, they jointly implement the following steps of the above grammatical error correction method:
  • The grammatical error correction method, device, computer system, and readable storage medium provided in this application insert an actionable smart cursor into the initial text, determine the action data of the smart cursor through the error correction model based on the real-time status information of the initial text, perform the actions on the initial text, and obtain the target text after the smart cursor has completed all actions.
  • This solves the problem that grammatical error correction in the prior art can only be based on limited predefined rules and mapping functions, takes a long time, and has low efficiency.
  • FIG. 1 is a flowchart of Embodiment 1 of the grammatical error correction method described in this application;
  • FIG. 2 is a specific flow chart of real-time status marking of the initial text with smart cursor in the first embodiment of the grammatical error correction method described in this application to obtain real-time status information;
  • FIG. 3 is a specific flowchart of determining the action data of the smart cursor by using an error correction model according to the real-time status information in the first embodiment of the grammatical error correction method according to this application;
  • FIG. 5 is a specific flowchart of obtaining reward and punishment data according to the compilation result in Embodiment 1 of the grammatical error correction method according to this application;
  • FIG. 6 is a flowchart of using the smart cursor to process the initial text based on the action data in the first embodiment of the grammatical error correction method according to the application to obtain a target file;
  • FIG. 7 is a schematic diagram of program modules of Embodiment 2 of the enhanced grammatical error correction method according to this application;
  • FIG. 8 is a schematic diagram of the hardware structure of the computer device in the third embodiment of the computer system of this application.
  • the grammatical error correction method, device, computer system, and readable storage medium provided in this application are suitable for the field of intelligent decision-making, and provide a grammatical error correction method based on a preprocessing module, a state acquisition module, an action determination module, and an action execution module .
  • This application inserts a smart cursor that can perform actions into the initial text, determines the action data of the smart cursor through an error correction model based on the real-time status information of the initial text, executes the actions on the initial text, and obtains the target text after the smart cursor has completed all actions. The smart cursor serves as the smart carrier of the error correction model: it locates the incorrect grammar and modifies it intelligently, which solves the problem that grammatical error correction in the prior art can only be based on limited predefined rules and mapping functions, is time-consuming, and has low error correction efficiency.
  • a grammatical error correction method of this embodiment is used on the server side to automatically identify and correct grammatical errors before program coding, including the following steps:
  • S100 Obtain an initial text, and insert a smart cursor that can perform an action at a preset position of the initial text;
  • the above-mentioned cursor is a flexible means for retrieving data from the table and performing operations.
  • the cursor is mainly used on the server to process SQL statements sent from the client to the server, or batch processing, stored procedure, or trigger.
  • the advantage of the cursor is that it can locate a row in the result set and perform specific operations on the row of data.
  • The above-mentioned preset position is the head position of the initial text, so that the smart cursor can subsequently correct grammatical errors from the head to the tail of the initial text.
  • S200 Mark the real-time status of the initial text with the smart cursor to obtain real-time status information
  • Referring to FIG. 2, marking the real-time status of the initial text with the smart cursor to obtain real-time status information includes the following steps:
  • the serialization is in units of words, and the words include pre-defined functions, custom variables, operators and other units of the program.
  • the moving length of the smart cursor is also in words.
  • the unit is the same as in the serialization process here.
  • S220 Locate the smart cursor in real time, obtain real-time position information of the smart cursor, mark the first processed data based on the position information, and use the first processed data carrying the real-time cursor position mark as the real-time status information.
  • The above-mentioned real-time status information is used to determine the specific position of the smart cursor in the initial file, so that the error correction model in the subsequent step S300 can automatically determine, from the text at the position of the smart cursor, whether error correction is required.
  • Special text, for example <#cursor#>, is used to mark the position of the cursor; the initial position is the very front of the initial text. Other special marks can also be used.
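  • The marking step described above can be illustrated with a minimal sketch. It assumes a simple regex tokenizer for the word-level serialization and the marker spelling `<#cursor#>`; the function names `tokenize` and `mark_state` are hypothetical and do not come from the application.

```python
import re

# Assumed marker spelling; the text notes that other special marks can be used.
CURSOR = "<#cursor#>"

def tokenize(source):
    """Split code into word-level units: identifiers, numbers, or single symbols.

    This regex is an illustrative stand-in for the serialization step.
    """
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", source)

def mark_state(tokens, cursor_pos):
    """Return a copy of the token sequence with the cursor marker at cursor_pos."""
    if not 0 <= cursor_pos <= len(tokens):
        raise ValueError("cursor position out of range")
    return tokens[:cursor_pos] + [CURSOR] + tokens[cursor_pos:]

# The cursor starts at the very front of the text (position 0).
tokens = tokenize("int x = 1")
state = mark_state(tokens, 0)
```

Because both the serialization and the cursor movement are measured in words, the cursor position in this sketch is simply an index into the token list.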
  • S300 Use an error correction model to determine the action data of the smart cursor according to the real-time status information
  • the above action data includes two types: editing action data and navigation action data.
  • The navigation action moves the position of the smart cursor in the initial text: the cursor can move one word to the right, or move down to the starting position of the next statement of code.
  • Editing actions are of three types: insertion, deletion, and replacement.
  • The main editing object is defined as a variable set, including but not limited to semicolons, parentheses, brackets, commas, and dots.
  • The above error correction model is an LSTM network combined with the A2C model.
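  • The navigation and editing actions described above can be enumerated as a discrete action space, as in the following sketch. The names `Navigation`, `Edit`, `EDIT_SYMBOLS`, and `build_action_space` are hypothetical, and the exact symbol list (including braces) is an assumption, since the text only says "including but not limited to".

```python
from enum import Enum

class Navigation(Enum):
    MOVE_RIGHT = "move_right"  # move the cursor one word to the right
    MOVE_DOWN = "move_down"    # move to the start of a following statement

class Edit(Enum):
    INSERT = "insert"
    DELETE = "delete"
    REPLACE = "replace"

# Assumed symbol set for illustration only.
EDIT_SYMBOLS = [";", "(", ")", "[", "]", "{", "}", ",", "."]

def build_action_space():
    """Enumerate (action, symbol) pairs; symbol is None when not applicable."""
    actions = [(nav, None) for nav in Navigation]
    actions.append((Edit.DELETE, None))  # deletion needs no symbol argument
    for sym in EDIT_SYMBOLS:
        actions.append((Edit.INSERT, sym))
        actions.append((Edit.REPLACE, sym))
    return actions

ACTIONS = build_action_space()
```

A flat list like this gives the policy network a fixed number of outputs, one per (action, symbol) pair.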
  • Referring to FIG. 3, using the error correction model to determine the action data of the smart cursor according to the real-time status information includes the following steps:
  • S310 Use a neural network to perform mapping processing on the real-time status information to obtain first data
  • An LSTM neural network is used in step S310: the real-time status information obtained after serialization in S211 is input into the long short-term memory (LSTM) network, and each word of the real-time status information is mapped to a corresponding vector.
  • The LSTM network is used here as an encoding and decoding network.
  • S320 Perform element averaging processing on the first data to obtain second data
  • the Mean Pooling layer (average pooling layer) is used to calculate the element average of the output vector to obtain the Embedding vector of the state, which is the second data.
  • Steps S310 and S320 above process the real-time state information and transform it into the corresponding vector.
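  • The element averaging in step S320 can be sketched as follows, assuming the per-word output vectors are given as plain lists of floats of equal length. The function name `mean_pool` is hypothetical.

```python
def mean_pool(vectors):
    """Average a sequence of equal-length vectors element by element.

    Input: one vector per word (for example, the per-word network outputs).
    Output: a single state embedding with the same dimensionality.
    """
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same length")
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(dim)]

# Two 2-dimensional word vectors pooled into one state embedding.
embedding = mean_pool([[1.0, 2.0], [3.0, 4.0]])
```

Averaging yields a fixed-size embedding regardless of how many words the text contains, which is what lets a fixed-size policy network consume it.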
  • S330 Use a deep reinforcement learning model to process the second data, and determine the smart cursor action data.
  • the deep reinforcement learning model is an A2C model
  • the A2C network is a multi-threaded reinforcement learning algorithm.
  • Each thread contains its own thread network, which is divided into two parts: Actor network and Critic network.
  • the Actor network is used to solve the action strategy
  • the Critic network is used to solve the value function
  • The Actor network takes the state as input and outputs a probability distribution over actions, from which an action is selected as input to the Critic network; the Critic network takes the state and the action as input and estimates the Q-value of the next state. The action data of the smart cursor is determined through the above-mentioned LSTM model combined with the A2C model.
  • the error correction model is trained before determining the action data of the smart cursor according to the real-time status information.
  • the training process includes the following:
  • S331 Obtain training samples, process the training samples by using a neural network, and obtain sample processing data;
  • the training sample may be data similar to the initial text.
  • the training sample is processed by the LSTM neural network to obtain the corresponding sample vector.
  • S332 Use the action network and the state network in the deep reinforcement learning model to process the training sample processing data to obtain the initial action strategy and value function;
  • the linear layer 1 plus the softmax fully connected layer is used as the Actor to generate the action strategy; the linear layer 2 is used as the Critic to generate the value function.
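  • The two heads described above can be sketched in plain Python: a linear layer followed by softmax for the Actor, and a single linear output for the Critic. The weights below are arbitrary illustrative values, and the function names are hypothetical; a real model would learn these parameters during training.

```python
import math
import random

def linear(weights, bias, x):
    """Apply a dense layer: one output per weight row."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def actor(state, weights, bias):
    """Linear layer plus softmax: probability distribution over actions."""
    return softmax(linear(weights, bias, state))

def critic(state, weights, bias):
    """Single linear output: scalar value estimate for the state."""
    return linear([weights], [bias], state)[0]

def sample_action(probs, rng):
    """Draw an action index according to the policy distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]

state = [1.0, 0.0]
probs = actor(state, [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0])
value = critic([1.0, 2.0], [0.5, 0.5], 1.0)
action = sample_action(probs, random.Random(0))
```

Sampling from the distribution, rather than always taking the most probable action, is one common way to keep the policy exploring during training.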
  • A loss function is used to adjust the initial action strategy and the value function to obtain the output sample action data; the loss function itself is adjusted during the training process.
  • S334 Use a compiler to compile the sample action data, and obtain reward and punishment data according to the compiling result
  • The error correction model is trained through compiler feedback. Compared with products in the industry that are trained by supervised learning, there is no need to provide paired samples of erroneous code and correct code, and no need to work out error rules by manually modifying and labeling samples, so sample data can be provided and selected more flexibly and conveniently.
  • The reward and punishment data includes a positive reward and a negative penalty.
  • obtaining the reward and punishment data according to the compilation result includes the following steps:
  • S334-1 Obtain the number of historical errors from the preset database, and obtain the number of errors after compilation according to the compilation result;
  • A preset database stores the number of errors after each compilation and is updated with each compilation. The number of historical errors and the number of errors after the current compilation are read from it to determine whether the errors in the compilation result have increased or decreased, and therefore whether a penalty or a reward applies.
  • S334-2 Determine whether the number of errors has increased based on the number of historical errors and the number of errors after compilation;
  • The reward and punishment data in the database is updated based on the change in the number of errors. If the number of errors fed back in the compilation result increases, a negative penalty of -1 is given; if the number of errors decreases, a positive reward of +1 is given; if the compilation passes, a positive reward of +100 is given and the iteration ends.
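  • The reward rule above translates directly into a small function. In this sketch the error counts are passed in as integers (how they are parsed from compiler output is not shown), and the value 0 for an unchanged error count is an assumption, since the text does not cover that case. The name `compile_reward` is hypothetical.

```python
def compile_reward(prev_errors, new_errors):
    """Map a change in compiler error count to (reward, episode_done)."""
    if new_errors == 0:
        return 100, True   # compilation passed: large reward, end the iteration
    if new_errors > prev_errors:
        return -1, False   # the edit introduced more errors
    if new_errors < prev_errors:
        return 1, False    # the edit removed at least one error
    return 0, False        # unchanged count: not specified in the text (assumption)

# An edit that reduces the error count from 3 to 2 earns +1.
reward, done = compile_reward(3, 2)
```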
  • the initial value of the reward and punishment data is preset, and then it will be updated according to the above-mentioned step S334-4, that is, the latest reward and punishment data is retained to adjust the above-mentioned initial data.
  • S400 Use the smart cursor to process the initial text based on the action data to obtain the target text.
  • the smart cursor is used to process the initial text based on the action data.
  • the process includes the following steps:
  • the data types include edit type and navigation type.
  • The edit type means that the text at the location of the smart cursor needs to be modified, that is, incorrect grammar is corrected.
  • The navigation type guides the smart cursor: the grammar is correct and needs no correction, so the smart cursor moves to the position of the next word.
  • S440 Acquire current position information of the smart cursor, and determine whether the smart cursor is at the end of the initial text based on the position information;
  • As described above, the special text <#cursor#> marks the position of the cursor, so the mark can be used to judge whether the cursor is at the end of the initial text. If it is at the end, the cursor has moved from the head of the initial text to the end and grammatical error correction of the entire initial text is complete. If it is not at the end, steps S310-S330 and S410-S440 above are repeated based on the updated real-time status information to determine the next action of the smart cursor, until the smart cursor reaches the end of the initial text.
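  • The head-to-tail loop described above can be sketched as follows. The cursor is represented as an index into a token list, actions are `(kind, symbol)` tuples, and `toy_policy` is a hand-written stand-in for the trained error correction model; all of these names and the `max_steps` guard are assumptions made for illustration.

```python
def apply_action(tokens, pos, action):
    """Apply one cursor action; return the new token list and cursor position."""
    kind, symbol = action
    tokens = list(tokens)
    if kind == "move_right":
        pos = min(pos + 1, len(tokens))
    elif kind == "insert":
        tokens.insert(pos, symbol)
        pos += 1  # keep the cursor in front of the same original token
    elif kind == "delete":
        if pos < len(tokens):
            del tokens[pos]
    elif kind == "replace":
        if pos < len(tokens):
            tokens[pos] = symbol
    else:
        raise ValueError("unknown action: %r" % (kind,))
    return tokens, pos

def correct(tokens, policy, max_steps=1000):
    """Run the cursor from the head until it reaches the end of the text."""
    pos = 0
    for _ in range(max_steps):  # guard against a policy that never advances
        if pos >= len(tokens):
            break
        tokens, pos = apply_action(tokens, pos, policy(tokens, pos))
    return tokens

def toy_policy(tokens, pos):
    """Stand-in policy: delete a comma directly before ')', otherwise move on."""
    if tokens[pos] == "," and pos + 1 < len(tokens) and tokens[pos + 1] == ")":
        return ("delete", None)
    return ("move_right", None)

fixed = correct(["print", "(", "x", ",", ")"], toy_policy)
```

In the described method the policy would be the LSTM plus A2C model evaluated on the current marked state; the loop structure stays the same.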
  • The above target text can also be stored in a node of a blockchain, and the technical solution of this application can also be applied to other documents stored on the blockchain.
  • The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms, and encryption algorithms.
  • A blockchain, essentially a decentralized database, is a chain of data blocks linked by cryptographic methods. Each data block contains a batch of network transaction information, which is used to verify the validity of the information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
  • The smart cursor serves as the smart carrier of programming grammar error correction and is trained by reinforcement learning. It can locate errors, modify them intelligently, and call the compiler directly during training to judge the result of each modification, so technicians do not need to troubleshoot, modify, and compile manually.
  • This solves the problem that the prior art can only be based on limited predefined rules and mapping functions and has low inspection efficiency.
  • The smart cursor is given a series of actions that enable it to quickly reach the location that caused the program's grammatical error and to choose an action strategy directly from the current text and cursor position information, which makes it more efficient.
  • a syntax error correction device 5 of this embodiment includes:
  • the preprocessing module 51 is configured to obtain the initial text, and insert a smart cursor that can perform actions at a preset position of the initial text;
  • the status acquisition module 52 is configured to mark the initial text with the smart cursor in real-time status to obtain real-time status information
  • the action determining module 53 is configured to determine the action data of the smart cursor according to the real-time status information
  • action data includes two types: editing action data and navigation action data.
  • The navigation action moves the position of the smart cursor in the initial text: the cursor can move one word to the right, or move down to the starting position of the next statement of code.
  • Editing actions are of three types: insertion, deletion, and replacement.
  • The main editing object is defined as a variable set, including but not limited to semicolons, parentheses, brackets, commas, and dots.
  • the action determining module 53 includes the following:
  • the first processing unit 531 is configured to use a neural network to perform mapping processing on the real-time status information to obtain first data;
  • the neural network is an LSTM neural network.
  • the second processing unit 532 is configured to perform element averaging processing on the first data to obtain second data;
  • The Mean Pooling layer (average pooling layer) computes the element-wise average of the output vectors to obtain the Embedding vector of the state.
  • the third processing unit 533 is configured to use a deep reinforcement learning model to process the second data and determine the smart cursor action data.
  • the deep reinforcement learning model is an A2C model, and the A2C network is a multi-threaded reinforcement learning algorithm.
  • Each thread will contain its own thread network, which is divided into two parts: Actor network and Critic network.
  • the Actor network is used to solve the action strategy
  • the Critic network is used to solve the value function.
  • A compiler is used to compile the sample action data; reward and punishment data is obtained according to the compilation result; the loss function and parameters in the error correction model are adjusted based on the reward and punishment data; and processing is repeated until the training process is completed and the error correction model is obtained.
  • the action execution module 54 is configured to use the smart cursor to process the initial text based on the action data to obtain the target text.
  • This technical solution is based on an intelligent decision-making detection model. The preprocessing module inserts an actionable smart cursor into the initial text; the state acquisition module acquires the real-time status information of the initial text; the action determination module determines the action data of the smart cursor through the error correction model based on that information; and the action execution module uses the smart cursor to perform actions on the initial text. The target text is obtained after the smart cursor has completed all actions.
  • The smart cursor serves as the smart carrier of the error correction model to locate incorrect grammar and modify it intelligently, which solves the problem that grammatical error correction in the prior art can only be based on limited predefined rules and mapping functions, takes a long time, and has low error correction efficiency.
  • the technical solution is also based on the first processing unit, the second processing unit, and the third processing unit to determine the action data of the smart cursor through the error correction model.
  • The error correction model is implemented as an LSTM model combined with the A2C model, and the smart cursor is controlled to edit according to the action data, which realizes automatic error correction. After the text is edited, a compiler compiles the edited text and the compilation result is fed back to adjust the error correction model, so that the model learns autonomously and the accuracy of the action data subsequently obtained for the smart cursor improves.
  • The error correction model does not need paired samples of erroneous code and correct code, so training samples can be provided and selected more flexibly and conveniently.
  • This application also provides a computer system, which includes multiple computer devices 6.
  • the components of the grammatical error correction device 5 of the second embodiment can be dispersed in different computer devices.
  • The device can be a smart phone, a tablet, a laptop, a desktop computer, a rack server, a blade server, a tower server, or a cabinet server (including a stand-alone server or a server cluster composed of multiple servers) that executes programs, and so on.
  • the computer equipment in this embodiment at least includes but is not limited to: a memory 61 and a processor 62 that can be communicatively connected to each other through a system bus, as shown in FIG. 8. It should be pointed out that FIG. 8 only shows a computer device with components, but it should be understood that it is not required to implement all the illustrated components, and more or fewer components may be implemented instead.
  • the memory 61 (ie, readable storage medium) includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory, etc.), random access memory (RAM), static random access memory (SRAM), Read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical disks, etc.
  • the memory 61 may be an internal storage unit of a computer device, such as a hard disk or a memory of the computer device.
  • The memory 61 may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card provided on the computer device.
  • the memory 61 may also include both the internal storage unit of the computer device and its external storage device.
  • the memory 61 is generally used to store an operating system and various application software installed in a computer device, such as the program code of the grammatical error correction method in the first embodiment, and so on.
  • the memory 61 may also be used to temporarily store various types of data that have been output or will be output.
  • In some embodiments, the processor 62 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip.
  • the processor 62 is generally used to control the overall operation of the computer equipment.
  • the processor 62 is configured to run program codes or process data stored in the memory 61, for example, to run a syntax error correction device, so as to implement the syntax error correction method of the first embodiment.
  • the network interface 63 may include a wireless network interface or a wired network interface, and the network interface 63 is generally used to establish a communication connection between the computer device 6 and other computer devices 6.
  • the network interface 63 is used to connect the computer device 6 to an external terminal through a network, and to establish a data transmission channel and a communication connection between the computer device 6 and the external terminal.
  • the network may be an intranet (Intranet), the Internet (Internet), a global system of mobile communication (GSM), a wideband code division multiple access (WCDMA), 4G network, 5G Network, Bluetooth (Bluetooth), Wi-Fi and other wireless or wired networks.
  • FIG. 8 only shows the computer device 6 with components 61-63, but it should be understood that it is not required to implement all the components shown, and more or fewer components may be implemented instead.
  • The grammatical error correction device 5 stored in the memory 61 may also be divided into one or more program modules; the one or more program modules are stored in the memory 61 and executed by one or more processors to complete the application.
  • the present application also provides a computer-readable storage medium.
  • the computer-readable storage medium may be non-volatile or volatile, and includes multiple storage media, such as flash memory, hard disk, and multimedia.
  • the computer-readable storage medium of this embodiment is used to store a syntax error correction device, and when executed by the processor 62, the syntax error correction method of the first embodiment is implemented.
  • The computer-readable storage medium includes a storage data area and a storage program area. The storage data area stores data created according to the use of blockchain nodes, and the storage program area stores computer programs; when a computer program is executed by the processor 62, the grammatical error correction method described in any of the embodiments is implemented.

Abstract

Provided are a grammatical error correction method, apparatus, computer system, and readable storage medium, relating to the field of intelligent decision-making in artificial intelligence, comprising: obtaining an initial text, and inserting an actionable smart cursor at a preset position of said initial text (S100); performing real-time status marking of the initial text having a smart cursor to obtain real-time status information (S200); according to the real-time status information, using an error correction model to determine action data of the smart cursor (S300); using the smart cursor to process the initial text on the basis of the action data to obtain a target text (S400). The invention solves the problem in the prior art that grammatical error correction can be based only on limited predefined rules and mapping functions, it being time-consuming and the efficiency of error correction being low.

Description

Grammatical error correction method, apparatus, computer system, and readable storage medium
This application claims priority to the Chinese patent application No. 202010752813.1, filed with the Chinese Patent Office on July 30, 2020 and titled "Grammatical error correction method, apparatus, computer system, and readable storage medium", the entire content of which is incorporated into this application by reference.
Technical field
This application relates to the field of intelligent decision-making in artificial intelligence, and in particular to a grammatical error correction method, apparatus, computer system, and readable storage medium.
Background art
Before a program is run, it usually has to be compiled with a compiler. If the script contains syntax errors, the compiler reports an error and cannot continue. In the prior art, technicians mostly rely on the compiler for feedback on script errors, but that feedback does not accurately locate the specific syntax error, which makes correcting syntax errors very time-consuming and places high professional demands on the technicians.
The inventor found that, to improve the efficiency of grammatical error correction, the prior art also uses preset rules or mappings to check for some common grammatical errors in advance and maps the check results to correctly formed code to correct the erroneous grammar. However, this approach can only correct grammar based on limited predefined rules and mapping functions, takes a long time, and has low error correction efficiency.
Technical problem
The purpose of this application is to provide a grammatical error correction method, apparatus, computer system, and readable storage medium that solve the problem that grammatical error correction in the prior art can only be based on limited predefined rules and mapping functions, takes a long time, and has low error correction efficiency.
技术解决方案Technical solutions
To achieve the above purpose, this application provides a grammatical error correction method, including:
acquiring an initial text, and inserting a smart cursor capable of performing actions at a preset position in the initial text;
marking the real-time state of the initial text carrying the smart cursor to obtain real-time state information;
determining action data for the smart cursor from the real-time state information using an error correction model;
processing the initial text with the smart cursor based on the action data to obtain a target text.
To achieve the above purpose, this application also provides a grammatical error correction apparatus, including:
a preprocessing module, configured to acquire an initial text and insert a smart cursor capable of performing actions at a preset position in the initial text;
a state acquisition module, configured to mark the real-time state of the initial text carrying the smart cursor to obtain real-time state information;
an action determination module, configured to determine action data for the smart cursor from the real-time state information;
an action execution module, configured to process the initial text with the smart cursor based on the action data to obtain a target text.
To achieve the above purpose, this application also provides a computer device including a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, it implements the following steps of the above grammatical error correction method:
acquiring an initial text, and inserting a smart cursor capable of performing actions at a preset position in the initial text;
marking the real-time state of the initial text carrying the smart cursor to obtain real-time state information;
determining action data for the smart cursor from the real-time state information using an error correction model;
processing the initial text with the smart cursor based on the action data to obtain a target text.
To achieve the above purpose, this application also provides a computer-readable storage medium comprising multiple storage media, each storing a computer program. When executed by a processor, the computer programs stored on the multiple storage media jointly implement the following steps of the above grammatical error correction method:
acquiring an initial text, and inserting a smart cursor capable of performing actions at a preset position in the initial text;
marking the real-time state of the initial text carrying the smart cursor to obtain real-time state information;
determining action data for the smart cursor from the real-time state information using an error correction model;
processing the initial text with the smart cursor based on the action data to obtain a target text.
Beneficial effects
The grammatical error correction method, apparatus, computer system, and readable storage medium provided by this application insert a smart cursor capable of performing actions into the initial text, determine the smart cursor's action data with an error correction model based on the real-time state information of the initial text, and then perform the actions on the initial text. The target text is obtained once the smart cursor has completed all of its actions. This solves the prior-art problem that grammatical error correction can only rely on a limited set of predefined rules and mapping functions, takes a long time, and has low error correction efficiency.
Description of the drawings
FIG. 1 is a flowchart of Embodiment 1 of the grammatical error correction method of this application;
FIG. 2 is a detailed flowchart of marking the real-time state of the initial text carrying the smart cursor to obtain real-time state information in Embodiment 1 of the grammatical error correction method of this application;
FIG. 3 is a detailed flowchart of determining the action data of the smart cursor from the real-time state information using the error correction model in Embodiment 1 of the grammatical error correction method of this application;
FIG. 4 is a detailed flowchart of training the error correction model before the action data of the smart cursor is determined from the real-time state information in Embodiment 1 of the grammatical error correction method of this application;
FIG. 5 is a detailed flowchart of obtaining reward and penalty data from the compilation result in Embodiment 1 of the grammatical error correction method of this application;
FIG. 6 is a flowchart of processing the initial text with the smart cursor based on the action data to obtain the target text in Embodiment 1 of the grammatical error correction method of this application;
FIG. 7 is a schematic diagram of the program modules of Embodiment 2, the reinforcement-based grammatical error correction apparatus of this application;
FIG. 8 is a schematic diagram of the hardware structure of the computer device in Embodiment 3 of the computer system of this application.
Embodiments of the present invention
To make the purpose, technical solutions, and advantages of this application clearer, the application is described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only explain this application and do not limit it. All other embodiments obtained by a person of ordinary skill in the art from the embodiments in this application without creative effort fall within the scope of protection of this application.
The grammatical error correction method, apparatus, computer system, and readable storage medium provided by this application are applicable to the field of intelligent decision-making, and provide a grammatical error correction method based on a preprocessing module, a state acquisition module, an action determination module, and an action execution module. This application inserts a smart cursor capable of performing actions into the initial text, determines the smart cursor's action data with an error correction model based on the real-time state information of the initial text, and then performs the actions on the initial text. The target text is obtained once the smart cursor has completed all of its actions. The smart cursor serves as the intelligent carrier of the error correction model: it locates incorrect syntax and modifies it intelligently. This solves the prior-art problem that grammatical error correction can only rely on a limited set of predefined rules and mapping functions, takes a long time, and has low error correction efficiency.
Embodiment 1
Referring to FIG. 1, the grammatical error correction method of this embodiment runs on the server side and automatically identifies and corrects syntax errors before program coding. It includes the following steps:
S100: acquire an initial text, and insert a smart cursor capable of performing actions at a preset position in the initial text;
In this embodiment, a cursor is a flexible means of retrieving data from a table and operating on it. Cursors are mainly used on the server, to process SQL statements sent by the client or data processing requests in batches, stored procedures, and triggers. The advantage of a cursor is that it can be positioned on a particular row of the result set and perform specific operations on that row. Specifically, the preset position is the head of the initial text, so that the smart cursor can subsequently perform grammatical error correction from the head of the initial text to its tail.
S200: mark the real-time state of the initial text carrying the smart cursor to obtain real-time state information;
Specifically, marking the real-time state of the initial text carrying the smart cursor to obtain real-time state information includes the following steps (see FIG. 2):
S210: serialize the initial text to obtain first processed data;
In this embodiment, serialization works in units of words. A word is a unit of the program such as a predefined function, a user-defined variable, or an operator. In subsequent processing the smart cursor also moves in units of words, consistent with the serialization here.
S220: locate the smart cursor in real time to obtain its real-time position information, mark the first processed data based on the position information, and take the first processed data carrying the cursor's real-time position mark as the real-time state information.
In this embodiment, the real-time state information identifies the exact position of the smart cursor in the initial text, so that in step S300 the error correction model can automatically judge, from the text at the cursor's position, whether correction is needed. The cursor acts (moves or edits) according to the text ahead of its position, so its position changes continuously. The real-time state information is therefore obtained by marking the serialized initial text with the cursor's real-time position. Specifically, a special token such as <#cursor#> marks the cursor position, and the initial position is the very front of the initial text. Other special marks may also be used.
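Steps S210 and S220 can be illustrated with a minimal Python sketch. The tokenizer below is a hypothetical stand-in (the text does not specify one): it treats identifiers and numbers as words and every other non-space character as its own word, so multi-character operators such as `==` are split, which a real lexer would avoid. The names `serialize` and `mark_state` are illustrative only; the `<#cursor#>` marker is the one named in the text.

```python
import re

CURSOR = "<#cursor#>"  # special marker named in the text

# Hypothetical tokenizer (assumption): identifiers and numbers are single
# words; any other non-space character is a word of its own.
_WORD = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def serialize(source):
    """S210: split the initial text into words (first processed data)."""
    return _WORD.findall(source)


def mark_state(words, cursor_pos):
    """S220: return the word list with the cursor marker placed
    immediately before the word at index cursor_pos."""
    if not 0 <= cursor_pos <= len(words):
        raise ValueError("cursor position out of range")
    return words[:cursor_pos] + [CURSOR] + words[cursor_pos:]


words = serialize("printf(x);")
state = mark_state(words, 0)  # cursor starts at the head of the text
```

Because the marker is an ordinary element of the word sequence, the model's input changes whenever the cursor moves, even if the code itself is unchanged.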
S300: determine the action data of the smart cursor from the real-time state information using an error correction model;
It should be noted that the action data is of two types: edit action data and navigation action data. A navigation action moves the smart cursor within the initial text, either one word to the right or down to the start of the next statement in the initial text. Edit actions fall into three classes: insert, delete, and replace. The objects being edited are defined as a variable set that includes, but is not limited to, semicolons, parentheses, curly braces, commas, and periods. The error correction model is an LSTM network combined with an A2C model.
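The action space described above can be enumerated as in the following sketch. The class and function names are hypothetical, and the token list is only the set of examples named in the text; the text describes the set as variable, so a real system would presumably configure it.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Kind(Enum):
    MOVE_RIGHT = "move_right"          # navigation: one word to the right
    MOVE_NEXT_LINE = "move_next_line"  # navigation: start of next statement
    INSERT = "insert"                  # edit
    DELETE = "delete"                  # edit
    REPLACE = "replace"                # edit


@dataclass(frozen=True)
class Action:
    kind: Kind
    token: Optional[str] = None  # only used by INSERT and REPLACE


# Example edit objects named in the text; the set is extensible.
EDIT_TOKENS = [";", "(", ")", "{", "}", ",", "."]


def build_action_space(tokens=EDIT_TOKENS):
    """Enumerate every discrete action the cursor can take."""
    actions = [
        Action(Kind.MOVE_RIGHT),
        Action(Kind.MOVE_NEXT_LINE),
        Action(Kind.DELETE),
    ]
    for t in tokens:
        actions.append(Action(Kind.INSERT, t))
        actions.append(Action(Kind.REPLACE, t))
    return actions


ACTIONS = build_action_space()
```

With seven edit tokens this yields 17 discrete actions, which would be the size of the policy's output distribution under these assumptions.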
Specifically, determining the action data of the smart cursor from the real-time state information using the error correction model includes the following steps (see FIG. 3):
S310: map the real-time state information with a neural network to obtain first data;
In this embodiment, step S310 uses an LSTM neural network. The real-time state information obtained after the serialization in S210 is fed into the long short-term memory (LSTM) network, which maps each word of the real-time state information to a corresponding vector. The LSTM network serves here as the encoding and decoding network.
S320: perform element-wise averaging on the first data to obtain second data;
In this embodiment, a mean pooling layer computes the element-wise average of the output vectors to obtain the embedding vector of the state, which is the second data. Steps S310 and S320 together convert the real-time state information into a corresponding vector.
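The element-wise averaging of step S320 can be sketched in plain Python as follows. The LSTM of step S310 is not reproduced here; the input is assumed to be one output vector per word, all of the same dimension, and the function name `mean_pool` is illustrative.

```python
def mean_pool(vectors):
    """Average a sequence of equal-length vectors element by element,
    giving a single fixed-size state embedding (the second data)."""
    if not vectors:
        raise ValueError("at least one vector is required")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same dimension")
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(dim)]


# Two per-word output vectors of dimension 2 (made-up numbers).
embedding = mean_pool([[1.0, 2.0], [3.0, 4.0]])
```

The pooled vector has the same dimension regardless of how many words the text contains, which is what lets a fixed-size actor-critic head consume it.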
S330: process the second data with a deep reinforcement learning model to determine the smart cursor's action data.
In this embodiment, the deep reinforcement learning model is an A2C model. A2C is a multi-threaded reinforcement learning algorithm. Each thread contains its own thread network, which is divided into an Actor network and a Critic network: the Actor network solves for the action policy, and the Critic network solves for the value function. The Actor network takes the state as input and outputs a probability distribution over actions; the action selected from that distribution is then passed to the Critic network. The Critic network takes the state and action as input and estimates the q-value of the next state. The smart cursor's action data is determined by the LSTM model described above combined with the A2C model.
In this embodiment, the error correction model is trained before the action data of the smart cursor is determined from the real-time state information. Referring to FIG. 4, the training process includes the following:
S331: obtain training samples, and process them with a neural network to obtain sample processing data;
Specifically, the training samples only need to be data similar to the initial text. As described in step S310 above, the LSTM neural network processes the training samples to obtain the corresponding sample vectors.
S332: process the sample processing data with the action network and the state network of the deep reinforcement learning model to obtain an initial action policy and a value function;
After the second data is fed into the A2C model, that is, after the embedding vector enters a thread, linear layer 1 followed by a softmax fully connected layer serves as the Actor and generates the action policy, and linear layer 2 serves as the Critic and generates the value function.
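The two heads described above can be sketched without any deep learning framework. This is only an illustration under assumptions: the weights are passed in explicitly rather than learned, the function names are hypothetical, and a real implementation would use a framework with automatic differentiation.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output per weight row, y_j = w_j . x + b_j."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_heads(embedding, actor_w, actor_b, critic_w, critic_b):
    """Actor: linear layer 1 + softmax gives the action policy.
    Critic: linear layer 2 gives a scalar value estimate."""
    policy = softmax(linear(embedding, actor_w, actor_b))
    value = linear(embedding, [critic_w], [critic_b])[0]
    return policy, value


def sample_action(policy, rng=random):
    """Draw an action index according to the policy distribution."""
    return rng.choices(range(len(policy)), weights=policy, k=1)[0]


policy, value = actor_critic_heads(
    [1.0, 0.0],                      # state embedding (made-up)
    [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0],  # actor weights and biases
    [2.0, 3.0], 0.5,                 # critic weights and bias
)
action_index = sample_action(policy, random.Random(0))
```

Sampling from the distribution, rather than always taking the most probable action, is the usual way an actor explores during training; at inference time the highest-probability action could be taken instead.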
S333: obtain sample action data from the initial action policy and value function through loss function processing;
In this embodiment, the loss function adjusts the initial action policy and the value function to produce the output sample action data, and the loss function itself is adjusted during training.
S334: compile the sample action data with a compiler, and obtain reward and penalty data from the compilation result;
During training, the error correction model is trained from compiler feedback. Compared with products in the industry that are trained by supervised learning, this requires neither paired samples of erroneous and correct code nor patterns of errors extracted from manually modified and labeled samples, which makes providing and selecting sample data more flexible and convenient.
Referring to FIG. 5, in step S334 the reward and penalty data includes a positive reward and a negative penalty. Specifically, obtaining the reward and penalty data from the compilation result includes the following steps:
S334-1: obtain the historical error count from a preset database, and obtain the post-compilation error count from the compilation result;
In this embodiment, a preset database stores the error count after each compilation and is updated with each compilation. The historical error count and the post-compilation error count are used to judge whether the compilation result shows more or fewer errors, and thus whether the feedback is a negative penalty or a positive reward.
S334-2: judge, from the historical error count and the post-compilation error count, whether the number of errors has increased;
S334-3: if so, the reward and penalty data is a negative penalty;
S334-4: if not, the reward and penalty data is a positive reward.
By way of example and not limitation, the reward and penalty data in the database is updated according to the change in the number of errors: if the number of errors reported in the compilation result increases, a penalty of -1 is given; if the number of reported errors decreases, a reward of +1 is given; if compilation succeeds, a reward of +100 is given and the iteration ends.
S334-5: after the reward and penalty data is obtained, update the historical error count with the post-compilation error count and store it in the preset database.
Specifically, the initial value of the reward and penalty data is preset and is then updated as described in step S334-5, that is, the latest reward and penalty data is kept and used to adjust the initial data.
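The reward rule can be written as a small function. The example values -1, +1, and +100 come from the text; the text's example does not say what happens when the error count is unchanged, so this sketch follows the two-way branch of steps S334-2 to S334-4 and treats any non-increase as positive, which is an assumption. The class `ErrorHistory` is a hypothetical in-memory stand-in for the preset database.

```python
def compiler_reward(previous_errors, current_errors):
    """Return (reward, done) from two compiler error counts."""
    if current_errors == 0:
        return 100, True    # compilation succeeded: end the iteration
    if current_errors > previous_errors:
        return -1, False    # more errors than before: penalty
    # Fewer errors: reward. An unchanged count is also rewarded here,
    # following the two-way branch in the flowchart (assumption).
    return 1, False


class ErrorHistory:
    """In-memory stand-in for the preset database of error counts."""

    def __init__(self, initial_errors):
        self.errors = initial_errors

    def step(self, current_errors):
        """Score the new count against the stored one, then store it."""
        reward, done = compiler_reward(self.errors, current_errors)
        self.errors = current_errors
        return reward, done


history = ErrorHistory(3)
trace = [history.step(n) for n in (4, 2, 0)]
```

Updating the stored count after every compilation means each reward reflects only the most recent edit, not the distance from the original error count.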
S335: adjust the loss function and parameters of the error correction model based on the reward and penalty data, and process again until training is complete, yielding the trained error correction model.
The loss function is computed from the reward and penalty data, and the parameters of the error correction model are updated. Steps S331 to S334, together with S335 up to the point where the loss function and parameters are adjusted, form one complete iteration. In each iteration, every thread passes its adjustments to the global network through a synchronous update mechanism.
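The text does not state the form of the loss function. As an illustration only, the sketch below assumes the commonly used advantage actor-critic loss: a policy term weighted by the advantage (return minus value estimate) plus a squared-error value term. The discount factor `gamma`, the weight `value_coef`, and both function names are assumptions, not values from the source.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for each step of an episode."""
    returns, running = [], 0.0
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def a2c_loss(action_probs, values, returns, value_coef=0.5):
    """Mean per-step loss: -log pi(a|s) * A + value_coef * A^2,
    where A = G - V(s). In a framework with autograd, A would be
    treated as a constant inside the policy term."""
    if not returns:
        raise ValueError("at least one step is required")
    policy_loss = 0.0
    value_loss = 0.0
    for p, v, g in zip(action_probs, values, returns):
        advantage = g - v
        policy_loss += -math.log(p) * advantage
        value_loss += advantage ** 2
    return (policy_loss + value_coef * value_loss) / len(returns)


# Rewards from three edits: two small improvements, then a clean compile.
returns = discounted_returns([1, 1, 100], gamma=0.5)
loss = a2c_loss([0.5], [0.0], [1.0])
```

Under this form, an action that led to a better-than-expected return (positive advantage) has its probability pushed up, and the critic is pulled toward the observed return.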
S400: process the initial text with the smart cursor based on the action data to obtain a target text.
Specifically, processing the initial text with the smart cursor based on the action data includes the following steps (see FIG. 6):
S410: obtain the data type corresponding to the action data;
As described above, the data types are the edit type and the navigation type. The edit type means the text at the smart cursor's position needs to be modified, that is, incorrect syntax is corrected. The navigation type steers the smart cursor: the syntax is correct and needs no correction, so the smart cursor moves to the position of the next word.
S420: when the data type is the edit type, edit the initial text according to the action data, and update the real-time state information based on the smart cursor's position information;
S430: when the data type is the navigation type, move the smart cursor according to the action data, and update the real-time state information based on the smart cursor's position information.
After the smart cursor has processed the initial text, whether by editing or by moving, the real-time state information must be updated according to the smart cursor's position, so that the next decision (perform another action or stop) can again be made from the real-time state information.
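Steps S410 to S430 can be sketched as a single dispatch function over a word list and a cursor index. Several details here are assumptions that the text leaves open: line breaks are kept as `"\n"` words so that "next statement" can be located, an inserted word lands before the cursor and the cursor then steps past it, and delete and replace act on the word immediately after the cursor. The action names are hypothetical.

```python
def apply_action(words, cursor, action, token=None):
    """Apply one action and return (new_words, new_cursor)."""
    words = list(words)  # leave the caller's list untouched
    if action == "move_right":            # navigation
        cursor = min(cursor + 1, len(words))
    elif action == "move_next_line":      # navigation
        try:
            cursor = words.index("\n", cursor) + 1
        except ValueError:                # no further line break
            cursor = len(words)
    elif action == "insert":              # edit
        words.insert(cursor, token)
        cursor += 1
    elif action in ("delete", "replace"):  # edit
        if cursor >= len(words):
            raise ValueError("no word at the cursor position")
        if action == "delete":
            del words[cursor]
        else:
            words[cursor] = token
    else:
        raise ValueError("unknown action: " + str(action))
    return words, cursor


# Append a missing semicolon at the end of a statement.
fixed, pos = apply_action(["x", "=", "1"], 3, "insert", ";")
```

Returning the new cursor index alongside the new word list gives the caller everything needed to rebuild the real-time state information after each action.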
Obtaining the target text after the initial text has been processed further includes the following steps:
S440: obtain the current position information of the smart cursor, and judge from it whether the smart cursor is at the tail of the initial text;
S450: if so, obtain the target text from the processed initial text;
S460: if not, determine the smart cursor's action data again from the real-time state information.
In this embodiment, the special token <#cursor#> marks the cursor position as described above, so the mark shows whether the cursor is at the tail of the initial text. If it is, the cursor has moved from the head of the initial text to its tail and grammatical error correction of the whole initial text is complete. If it is not, steps S310 to S330 and S410 to S440 are repeated with the updated real-time state information to determine the smart cursor's next action, until the smart cursor reaches the tail of the initial text.
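The overall loop of steps S200 to S460 can be sketched as follows. The trained error correction model is replaced by `toy_policy`, a hand-written rule used purely as a stand-in, and only two actions are supported to keep the example short. The `max_steps` limit is a safeguard added for the sketch and is not part of the described method.

```python
CURSOR = "<#cursor#>"


def run_cursor(words, choose_action, max_steps=1000):
    """Drive the cursor from head to tail, applying chosen actions."""
    words, cursor = list(words), 0
    for _ in range(max_steps):
        if cursor >= len(words):          # S440/S450: cursor at the tail
            break
        # S200: real-time state = word list with the cursor marker.
        state = words[:cursor] + [CURSOR] + words[cursor:]
        kind, token = choose_action(state)  # S300
        if kind == "insert":              # S420: edit type
            words.insert(cursor, token)
            cursor += 1
        elif kind == "move_right":        # S430: navigation type
            cursor += 1
        else:
            raise ValueError("unsupported action: " + str(kind))
    return words


def toy_policy(state):
    """Stand-in for the model: insert ';' before a line break that
    is not already preceded by one; otherwise move right."""
    i = state.index(CURSOR)
    nxt = state[i + 1] if i + 1 < len(state) else None
    prev = state[i - 1] if i > 0 else None
    if nxt == "\n" and prev != ";":
        return ("insert", ";")
    return ("move_right", None)


source = ["x", "=", "1", "\n", "y", "=", "2", ";", "\n"]
target = run_cursor(source, toy_policy)
```

In this run the first statement gains its missing semicolon and the second, already correct, is left unchanged.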
It should be noted that, to further ensure the privacy and security of the target text, the target text may also be stored in a node of a blockchain. The technical solution of this application is also applicable to the classification of other documents stored on a blockchain. The blockchain referred to in this application is a new application model of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked by cryptographic methods, where each block contains the information of a batch of network transactions and is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. A blockchain may include an underlying blockchain platform, a platform product service layer, and an application service layer.
This solution uses the smart cursor as the intelligent carrier for programming syntax error correction trained by reinforcement learning. It can locate errors, modify them intelligently, and during training invoke the compiler directly after each modification to judge the result, so developers no longer need to find errors, modify the code, and compile manually. This solves the prior-art problem of relying only on a limited set of predefined rules and mapping functions with low checking efficiency.
Compared with the rule-based traversal search of the prior art, the set of actions given to the smart cursor lets it quickly reach the position that causes the program's syntax error and choose an action policy directly from the current text and cursor position information, which is more efficient.
Embodiment 2:
Referring to FIG. 7, a grammatical error correction apparatus 5 of this embodiment includes:
a preprocessing module 51, configured to acquire an initial text and insert a smart cursor capable of performing actions at a preset position in the initial text;
a state acquisition module 52, configured to mark the real-time state of the initial text carrying the smart cursor to obtain real-time state information;
an action determination module 53, configured to determine action data for the smart cursor from the real-time state information;
It should be noted that the action data is of two types: edit action data and navigation action data. A navigation action moves the smart cursor within the initial text, either one word to the right or down to the start of the next statement in the initial text. Edit actions fall into three classes: insert, delete, and replace. The objects being edited are defined as a variable set that includes, but is not limited to, semicolons, parentheses, curly braces, commas, and periods.
The action determination module 53 includes the following:
a first processing unit 531, configured to map the real-time state information with a neural network to obtain first data;
The neural network is an LSTM neural network.
a second processing unit 532, configured to perform element-wise averaging on the first data to obtain second data;
Specifically, a mean pooling layer computes the element-wise average of the output vectors to obtain the embedding vector of the state.
a third processing unit 533, configured to process the second data with a deep reinforcement learning model to determine the smart cursor's action data.
The deep reinforcement learning model is an A2C model. A2C is a multi-threaded reinforcement learning algorithm. Each thread contains its own thread network, which is divided into an Actor network and a Critic network: the Actor network solves for the action policy, and the Critic network solves for the value function. During training, a compiler compiles the sample action data, reward and penalty data is obtained from the compilation result, and the loss function and parameters of the error correction model are adjusted based on the reward and penalty data. Processing is repeated until training is complete, yielding the error correction model.
an action execution module 54, configured to process the initial text with the smart cursor based on the action data to obtain a target text.
This technical solution is based on a detection model for intelligent decision-making. The preprocessing module inserts a smart cursor capable of performing actions into the initial text, the state acquisition module obtains the real-time state information of the initial text, the action determination module determines the smart cursor's action data from that information through the error correction model, and the action execution module uses the smart cursor to perform the actions on the initial text. The target text is obtained once the smart cursor has completed all of its actions. The smart cursor serves as the intelligent carrier of the error correction model: it locates incorrect syntax and modifies it intelligently. This solves the prior-art problem that grammatical error correction can only rely on a limited set of predefined rules and mapping functions, takes a long time, and has low error correction efficiency.
This technical solution also uses the first, second, and third processing units to determine the smart cursor's action data through the error correction model, which is implemented as an LSTM model combined with an A2C model. The smart cursor is controlled according to the action data to edit the text and thus correct errors automatically. After the text has been edited, a compiler compiles the edited text and the compilation result is fed back to adjust the error correction model. The model thereby learns autonomously, which improves the accuracy of the action data subsequently obtained for the smart cursor. The error correction model also requires neither paired samples of erroneous and correct code nor patterns of errors extracted from manually modified and labeled samples, which makes providing and selecting training samples more flexible and convenient.
Embodiment 3:
To achieve the above objective, this application further provides a computer device 6. The computer device may comprise multiple computer devices, and the components of the grammatical error correction apparatus 5 of Embodiment 2 may be distributed across different computer devices. The computer device may be a smartphone, tablet computer, notebook computer, desktop computer, rack server, blade server, tower server, or cabinet server (including a standalone server or a server cluster composed of multiple servers) that executes a program. The computer device of this embodiment includes at least, but is not limited to, a memory 61 and a processor 62 that can be communicatively connected to each other through a system bus, as shown in FIG. 8. Note that FIG. 8 shows only some components of the computer device; not all of the illustrated components are required, and more or fewer components may be implemented instead.
In this embodiment, the memory 61 (that is, a readable storage medium) includes flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, an optical disc, and the like. In some embodiments, the memory 61 may be an internal storage unit of the computer device, such as its hard disk or main memory. In other embodiments, the memory 61 may be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card fitted to the computer device. The memory 61 may also include both an internal storage unit and an external storage device of the computer device. In this embodiment, the memory 61 is generally used to store the operating system and the various application software installed on the computer device, such as the program code of the grammatical error correction method of Embodiment 1. The memory 61 may also be used to temporarily store various data that has been output or is to be output.
In some embodiments, the processor 62 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 62 is generally used to control the overall operation of the computer device. In this embodiment, the processor 62 runs the program code or processes the data stored in the memory 61, for example by running the grammatical error correction apparatus, so as to implement the grammatical error correction method of Embodiment 1.
The network interface 63 may include a wireless network interface or a wired network interface, and is generally used to establish a communication connection between the computer device 6 and other computer devices 6. For example, the network interface 63 connects the computer device 6 to an external terminal through a network and establishes a data transmission channel and a communication connection between the two. The network may be a wireless or wired network such as an intranet, the Internet, the Global System for Mobile Communications (GSM), Wideband Code Division Multiple Access (WCDMA), a 4G network, a 5G network, Bluetooth, or Wi-Fi.
Note that FIG. 8 shows only the computer device 6 with components 61-63; not all of the illustrated components are required, and more or fewer components may be implemented instead.
In this embodiment, the grammatical error correction apparatus 5 stored in the memory 61 may also be divided into one or more program modules, which are stored in the memory 61 and executed by one or more processors (the processor 62 in this embodiment) to carry out this application.
Embodiment 4:
To achieve the above objective, this application further provides a computer-readable storage medium, which may be non-volatile or volatile and which comprises multiple storage media, such as flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, an optical disc, a server, an app store, and the like. A computer program is stored on the medium and implements the corresponding functions when executed by the processor 62. The computer-readable storage medium of this embodiment stores the grammatical error correction apparatus, which implements the grammatical error correction method of Embodiment 1 when executed by the processor 62.
In one embodiment, the computer-readable storage medium includes a data storage area and a program storage area. The data storage area stores data created through the use of blockchain nodes, and the program storage area stores a computer program which, when executed by the processor 62, implements the grammatical error correction method of any of the embodiments.
The serial numbers of the above embodiments of this application are for description only and do not indicate the relative merits of the embodiments.
From the description of the above embodiments, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software together with a necessary general-purpose hardware platform, or alternatively by hardware, although in many cases the former is the better implementation.
The above are only preferred embodiments of this application and do not thereby limit its patent scope. Any equivalent structural or process transformation made using the contents of the specification and drawings of this application, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of this application.

Claims (20)

  1. A grammatical error correction method, comprising:
    acquiring an initial text, and inserting a smart cursor capable of performing actions at a preset position in the initial text;
    performing real-time state marking on the initial text carrying the smart cursor to obtain real-time state information;
    determining action data of the smart cursor according to the real-time state information by using an error correction model; and
    processing the initial text with the smart cursor based on the action data to obtain a target text.
  2. The grammatical error correction method according to claim 1, wherein performing real-time state marking on the initial text carrying the smart cursor to obtain real-time state information comprises:
    serializing the initial text to obtain first processed data; and
    locating the smart cursor in real time to obtain real-time position information of the smart cursor, and marking the first processed data based on the position information, so that the first processed data carrying the real-time position mark of the cursor serves as the real-time state information.
  3. The grammatical error correction method according to claim 1, wherein determining the action data of the smart cursor according to the real-time state information by using the error correction model comprises:
    mapping the real-time state information with a neural network to obtain first data;
    performing element-wise averaging on the first data to obtain second data; and
    processing the second data with a deep reinforcement learning model to determine the action data of the smart cursor.
  4. The grammatical error correction method according to claim 1, wherein, before the action data of the smart cursor is determined according to the real-time state information, the error correction model is trained, the training comprising:
    acquiring training samples, and processing the training samples with a neural network to obtain sample processing data;
    processing the sample processing data with the action network and the state network of a deep reinforcement learning model to obtain an initial action policy and a value function;
    obtaining sample action data by loss function processing based on the initial action policy and the value function;
    compiling the sample action data with a compiler, and obtaining reward and punishment data according to the compilation result; and
    adjusting the loss function and the parameters of the error correction model based on the reward and punishment data, and repeating the processing until the training process is complete, to obtain the trained error correction model.
  5. The grammatical error correction method according to claim 4, wherein obtaining the reward and punishment data according to the compilation result comprises:
    obtaining a historical number of errors from a preset database, and obtaining a post-compilation number of errors from the compilation result;
    judging, based on the historical number of errors and the post-compilation number of errors, whether the number of errors has increased;
    if so, the reward and punishment data is a negative penalty; if not, the reward and punishment data is a positive reward; and
    after the reward and punishment data is obtained, updating the historical number of errors with the post-compilation number of errors and storing it in the preset database.
  6. The grammatical error correction method according to claim 1, wherein processing the initial text with the smart cursor based on the action data comprises the following steps:
    obtaining a corresponding data type based on the action data;
    when the data type is edit data, editing the initial text according to the action data, and updating the real-time state information based on the position information of the smart cursor; and
    when the data type is navigation data, moving the smart cursor according to the action data, and updating the real-time state information based on the position information of the smart cursor.
  7. The grammatical error correction method according to claim 1, further comprising, after the initial text is processed and before the target text is obtained, the following steps:
    obtaining current position information of the smart cursor, and judging, based on the position information, whether the smart cursor is at the end of the initial text;
    if so, obtaining the target text based on the processed initial text, and uploading the target text to a blockchain; and
    if not, determining the action data of the smart cursor again according to the real-time state information.
  8. A grammatical error correction apparatus, comprising:
    a preprocessing module, configured to acquire an initial text and insert a smart cursor capable of performing actions at a preset position in the initial text;
    a state acquisition module, configured to perform real-time state marking on the initial text carrying the smart cursor to obtain real-time state information;
    an action determination module, configured to determine action data of the smart cursor according to the real-time state information; and
    an action execution module, configured to process the initial text with the smart cursor based on the action data to obtain a target text.
  9. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the following steps of a grammatical error correction method:
    acquiring an initial text, and inserting a smart cursor capable of performing actions at a preset position in the initial text;
    performing real-time state marking on the initial text carrying the smart cursor to obtain real-time state information;
    determining action data of the smart cursor according to the real-time state information by using an error correction model; and
    processing the initial text with the smart cursor based on the action data to obtain a target text.
  10. The computer device according to claim 9, wherein performing real-time state marking on the initial text carrying the smart cursor to obtain real-time state information comprises:
    serializing the initial text to obtain first processed data; and
    locating the smart cursor in real time to obtain real-time position information of the smart cursor, and marking the first processed data based on the position information, so that the first processed data carrying the real-time position mark of the cursor serves as the real-time state information.
  11. The computer device according to claim 9, wherein determining the action data of the smart cursor according to the real-time state information by using the error correction model comprises:
    mapping the real-time state information with a neural network to obtain first data;
    performing element-wise averaging on the first data to obtain second data; and
    processing the second data with a deep reinforcement learning model to determine the action data of the smart cursor.
  12. The computer device according to claim 9, wherein, before the action data of the smart cursor is determined according to the real-time state information, the error correction model is trained, the training comprising:
    acquiring training samples, and processing the training samples with a neural network to obtain sample processing data;
    processing the sample processing data with the action network and the state network of a deep reinforcement learning model to obtain an initial action policy and a value function;
    obtaining sample action data by loss function processing based on the initial action policy and the value function;
    compiling the sample action data with a compiler, and obtaining reward and punishment data according to the compilation result; and
    adjusting the loss function and the parameters of the error correction model based on the reward and punishment data, and repeating the processing until the training process is complete, to obtain the trained error correction model.
  13. The computer device according to claim 9, wherein processing the initial text with the smart cursor based on the action data comprises the following steps:
    obtaining a corresponding data type based on the action data;
    when the data type is edit data, editing the initial text according to the action data, and updating the real-time state information based on the position information of the smart cursor; and
    when the data type is navigation data, moving the smart cursor according to the action data, and updating the real-time state information based on the position information of the smart cursor.
  14. The computer device according to claim 9, wherein, after the initial text is processed and before the target text is obtained, the steps further comprise:
    obtaining current position information of the smart cursor, and judging, based on the position information, whether the smart cursor is at the end of the initial text;
    if so, obtaining the target text based on the processed initial text, and uploading the target text to a blockchain; and
    if not, determining the action data of the smart cursor again according to the real-time state information.
  15. A computer-readable storage medium, comprising multiple storage media each storing a computer program, wherein the computer programs stored in the multiple storage media, when executed by a processor, jointly implement the following steps of a grammatical error correction method:
    acquiring an initial text, and inserting a smart cursor capable of performing actions at a preset position in the initial text;
    performing real-time state marking on the initial text carrying the smart cursor to obtain real-time state information;
    determining action data of the smart cursor according to the real-time state information by using an error correction model; and
    processing the initial text with the smart cursor based on the action data to obtain a target text.
  16. The computer-readable storage medium according to claim 15, wherein performing real-time state marking on the initial text carrying the smart cursor to obtain real-time state information comprises:
    serializing the initial text to obtain first processed data; and
    locating the smart cursor in real time to obtain real-time position information of the smart cursor, and marking the first processed data based on the position information, so that the first processed data carrying the real-time position mark of the cursor serves as the real-time state information.
  17. The computer-readable storage medium according to claim 15, wherein determining the action data of the smart cursor according to the real-time state information by using the error correction model comprises:
    mapping the real-time state information with a neural network to obtain first data;
    performing element-wise averaging on the first data to obtain second data; and
    processing the second data with a deep reinforcement learning model to determine the action data of the smart cursor.
  18. The computer-readable storage medium according to claim 15, wherein, before the action data of the smart cursor is determined according to the real-time state information, the error correction model is trained, the training comprising:
    acquiring training samples, and processing the training samples with a neural network to obtain sample processing data;
    processing the sample processing data with the action network and the state network of a deep reinforcement learning model to obtain an initial action policy and a value function;
    obtaining sample action data by loss function processing based on the initial action policy and the value function;
    compiling the sample action data with a compiler, and obtaining reward and punishment data according to the compilation result; and
    adjusting the loss function and the parameters of the error correction model based on the reward and punishment data, and repeating the processing until the training process is complete, to obtain the trained error correction model.
  19. The computer-readable storage medium according to claim 15, wherein processing the initial text with the smart cursor based on the action data comprises the following steps:
    obtaining a corresponding data type based on the action data;
    when the data type is edit data, editing the initial text according to the action data, and updating the real-time state information based on the position information of the smart cursor; and
    when the data type is navigation data, moving the smart cursor according to the action data, and updating the real-time state information based on the position information of the smart cursor.
  20. The computer-readable storage medium according to claim 15, wherein, after the initial text is processed and before the target text is obtained, the steps further comprise:
    obtaining current position information of the smart cursor, and judging, based on the position information, whether the smart cursor is at the end of the initial text;
    if so, obtaining the target text based on the processed initial text, and uploading the target text to a blockchain; and
    if not, determining the action data of the smart cursor again according to the real-time state information.
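As an illustration of the state-marking and element-averaging steps recited in the claims above, the following Python sketch serializes a text, marks the cursor position in the sequence, and averages a list of vectors element by element. It is a sketch under stated assumptions rather than the claimed implementation: the whitespace tokenizer, the `<CURSOR>` marker token, and the plain-list vectors (standing in for the neural network's outputs) are all hypothetical.

```python
from typing import List, Sequence

CURSOR_MARK = "<CURSOR>"  # hypothetical marker token


def serialize(text: str) -> List[str]:
    """Turn the text into a token sequence (whitespace split, assumed)."""
    return text.split()


def mark_cursor(tokens: Sequence[str], cursor: int) -> List[str]:
    """Return the token sequence with the cursor position marked."""
    if not 0 <= cursor <= len(tokens):
        raise ValueError("cursor position is outside the text")
    return list(tokens[:cursor]) + [CURSOR_MARK] + list(tokens[cursor:])


def element_average(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Average equally sized vectors element by element."""
    if not vectors:
        raise ValueError("need at least one vector")
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


state = mark_cursor(serialize("int x = 1"), 2)
pooled = element_average([[1.0, 2.0], [3.0, 4.0]])
```

Averaging reduces a variable-length sequence of per-token vectors to one fixed-size vector, which is the kind of input a policy or value network can consume regardless of text length.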
PCT/CN2020/118197 2020-07-30 2020-09-27 Grammatical error correction method, apparatus, computer system, and readable storage medium WO2021174823A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010752813.1A CN111897535A (en) 2020-07-30 2020-07-30 Grammar error correction method, device, computer system and readable storage medium
CN202010752813.1 2020-07-30

Publications (1)

Publication Number Publication Date
WO2021174823A1 (en) 2021-09-10

Family

ID=73182599

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/118197 WO2021174823A1 (en) 2020-07-30 2020-09-27 Grammatical error correction method, apparatus, computer system, and readable storage medium

Country Status (2)

Country Link
CN (1) CN111897535A (en)
WO (1) WO2021174823A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114664121A (en) * 2022-03-23 2022-06-24 合肥置顶信息技术有限公司 Intelligent error-correcting civil aviation meteorological observation making and publishing system and method

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104199831A (en) * 2014-07-31 2014-12-10 深圳市腾讯计算机系统有限公司 Information processing method and device
CN105550171A (en) * 2015-12-31 2016-05-04 北京奇艺世纪科技有限公司 Error correction method and system for query information of vertical search engine
CN110362310A (en) * 2019-03-19 2019-10-22 南京大学 A kind of code syntax errors repair method based on incomplete abstract syntax tree
CN110502754A (en) * 2019-08-26 2019-11-26 腾讯科技(深圳)有限公司 Text handling method and device
US20200097387A1 (en) * 2018-09-25 2020-03-26 International Business Machines Corporation Code dependency influenced bug localization

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP6936318B2 (en) * 2016-09-30 2021-09-15 ロヴィ ガイズ, インコーポレイテッド Systems and methods for correcting mistakes in caption text
CN109753636A (en) * 2017-11-01 2019-05-14 阿里巴巴集团控股有限公司 Machine processing and text error correction method and device calculate equipment and storage medium
CN110162767A (en) * 2018-02-12 2019-08-23 北京京东尚科信息技术有限公司 The method and apparatus of text error correction
CN110969012B (en) * 2019-11-29 2023-04-07 北京字节跳动网络技术有限公司 Text error correction method and device, storage medium and electronic equipment
CN111310473A (en) * 2020-02-04 2020-06-19 四川无声信息技术有限公司 Text error correction method and model training method and device thereof
CN111310447B (en) * 2020-03-18 2024-02-02 河北省讯飞人工智能研究院 Grammar error correction method, grammar error correction device, electronic equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114664121A (en) * 2022-03-23 2022-06-24 合肥置顶信息技术有限公司 Intelligent error-correcting civil aviation meteorological observation making and publishing system and method
CN114664121B (en) * 2022-03-23 2024-01-09 合肥置顶信息技术有限公司 Intelligent error correction civil aviation meteorological observation making and publishing system and method

Also Published As

Publication number Publication date
CN111897535A (en) 2020-11-06

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20923322

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20923322

Country of ref document: EP

Kind code of ref document: A1