WO2016131278A1 - Document error search method and device - Google Patents


Info

Publication number
WO2016131278A1
Authority
WO
WIPO (PCT)
Prior art keywords
synonym
statement
word
words
document
Prior art date
Application number
PCT/CN2015/091129
Other languages
French (fr)
Chinese (zh)
Inventor
张晋
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司
Publication of WO2016131278A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/335 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution

Definitions

  • Embodiments of the present invention relate to, but are not limited to, the field of computer data processing, and in particular to a document error checking method and apparatus.
  • Automatic document error checking can detect many kinds of errors in a document, and some tools can also correct them automatically, which improves the efficiency of writing and revising documents.
  • Document error checking in the related art mainly searches for known incorrect word combinations and then flags or corrects them. This approach is quite limited: it cannot check whether wording is used consistently throughout a document.
  • Embodiments of the present invention provide a document error checking method and apparatus that address the inability of the related art to check wording consistency throughout a document.
  • An embodiment of the present invention provides a document error checking method, including:
  • the method further includes:
  • the statement comparison mode includes:
  • if the front-end strings of two statements are the same, the back-end strings are the same, and the synonym strings differ, the synonym strings of the two statements are judged to be synonymous.
  • the method further includes: writing the synonymous words and sentences recorded through statement comparison into the synonym library.
  • before checking and recording the synonymous words and sentences appearing in the document according to the synonym library, the method further includes:
  • the method further includes: recording the content before modification and the content after modification in the history modification database.
  • modifying the recorded synonymous words and sentences to unified wording includes:
  • modifying the synonymous words and sentences appearing in the document to unified wording based on the user's modification instruction, where the unified wording is a default synonym or a synonym specified by the user.
  • An embodiment of the present invention further provides a document error checking apparatus, including:
  • a synonym checking module, configured to check and record synonymous words and sentences appearing in the document according to the synonym library; and
  • a synonym processing module, configured to modify the recorded synonymous words and sentences to unified wording.
  • the synonym checking module is further configured to check synonymous words and sentences appearing in the document by statement comparison, and to record those detected synonymous words and sentences that are not in the synonym library.
  • the synonym checking module includes:
  • an information acquisition submodule, configured to determine the statement length and the statement comparison mode according to configuration information;
  • a statement comparison submodule, configured to determine the search start position, obtain the start statement, and compare the start statement with all statements after it to determine whether the start statement and the statements after it contain synonymous words and sentences,
  • where the statements after the start statement are all statements obtained by advancing the start position one character at a time past the start statement; and
  • a polling submodule, configured to advance the search start position by one character to obtain a new search start position and then trigger the statement comparison submodule.
  • the apparatus further includes a history modification checking module, configured to retrieve the history modification database, search the whole document for the modified content recorded in that database, present the content found, and, according to the user's instruction, either modify it in the way recorded in the history or ignore it.
  • the apparatus further includes a document modification recording module, configured to record the content before modification and the content after modification in the history modification database.
  • the synonym processing module modifying the recorded synonymous words and sentences to unified wording includes:
  • the synonym processing module presents the recorded synonymous words and sentences;
  • based on the user's modification instruction, the synonym processing module modifies the synonymous words and sentences appearing in the document to unified wording, where the unified wording is a default synonym or a synonym specified by the user.
  • An embodiment of the present invention further provides a computer-readable storage medium storing program instructions that implement the above method when executed.
  • Embodiments of the present invention introduce a synonym library and a statement comparison scheme to check and modify synonymous words and sentences in a document, which addresses the inability of the related art to check wording consistency throughout a document.
  • FIG. 1 is a flowchart of a document error checking method according to Embodiment 1 of the present invention
  • FIG. 2 is a flowchart of searching for synonymous words and sentences using the synonym library according to Embodiment 2 of the present invention
  • FIG. 3 is a flowchart of searching for synonymous words and sentences within a document using statement comparison according to Embodiment 2 of the present invention
  • FIG. 5 shows how the recorded synonymous words and sentences are displayed according to Embodiment 2 of the present invention.
  • FIG. 6 is a schematic diagram of the document to be checked in an application example of the present invention.
  • FIG. 7 is a schematic diagram of the records in the synonym library in the application example of the present invention.
  • FIG. 8 is a schematic diagram of searching for synonymous words and sentences by statement comparison in the application example of the present invention.
  • FIG. 9 is a schematic diagram of the synonymous words and sentences found by the synonym library search and the statement comparison search in the application example of the present invention.
  • FIG. 10 shows the display after the user has entered the unified wording for the displayed synonymous words and sentences in the application example of the present invention.
  • FIG. 11 shows the result of modifying the synonymous words and sentences to the unified wording in the application example of the present invention.
  • FIG. 13 is a schematic structural diagram of a document error checking apparatus according to Embodiment 3 of the present invention.
  • An embodiment of the present invention provides a document error checking method for checking whether words and sentences are used consistently throughout a document. As shown in FIG. 1, the method includes:
  • Step S101: check and record synonymous words and sentences appearing in the document according to the synonym library;
  • In this embodiment, to prevent omissions caused by an incomplete synonym library, the synonymous words and sentences appearing in the document may optionally also be checked by statement comparison after step S101, and those that are not in the synonym library are recorded.
  • The synonymous words and sentences recorded through statement comparison are also written into the synonym library to update it.
  • Step (3): advance the search start position by one character to obtain a new search start position, and return to step (2).
  • Step S102: modify the recorded synonymous words and sentences to unified wording.
  • this step includes: presenting the recorded synonymous words and sentences, and modifying the synonymous words and sentences appearing in the document to unified wording based on the user's modification instruction, where the unified wording may be a default synonym or a synonym specified by the user.
  • this embodiment may also modify wording in the document according to historical modification operations.
  • this operation may be performed before S101 or after S102; when it is performed before S101, it is handled as follows:
  • when the recorded synonymous words and sentences are modified to unified wording, the method further includes: recording the content before modification and the content after modification in the history modification database.
  • The method of this embodiment introduces a synonym library and a statement comparison scheme to check and modify synonymous words and sentences in a document, which addresses the inability of the related art to check wording consistency throughout a document.
  • This embodiment provides a document error checking method and describes the method of Embodiment 1 in more detail.
  • Step A: check for synonymous words and sentences that may appear, according to the records in the synonym library, and record the results
  • In this step, the program traverses the document according to the records in the synonym library and records the search results. When processing is complete, it proceeds to step B.
  • Step S201: start;
  • Step S202: determine whether the synonym library contains any records; if yes, go to step S203; if not, go to step S208;
  • Step S203: take a record from the synonym library;
  • Step S204: search the document for content that matches the record;
  • Step S205: determine whether a match was found; if yes, go to step S206; if not, go to step S207;
  • Step S206: record the search result and continue to step S207;
  • Step S207: determine whether any records remain to be searched; if yes, return to step S203; if not, go to step S208;
  • Step B: search for similar wording within the document and record the results
  • This step prevents omissions that would occur when synonyms are not listed in the synonym library and therefore cannot be found by the library search.
  • the program traverses the document according to the configuration information, searches for possible synonyms based on their context, and records the results. When processing is complete, it proceeds to step C.
  • Step S301: start;
  • Step S302: read the configuration information and determine the length of the identical string before the synonym (the front-end string), the length of the synonym string, and the length of the identical string after the synonym (the back-end string);
  • Step S303: determine search start position 1; for the initial search, the start position may be the first character of the document or a position specified by the user;
  • Step S304: starting from search start position 1, extract front-end string 1, synonym string 1, and back-end string 1 according to the configuration information;
  • Step S305: advance from search start position 1 by one string length (front-end + synonym + back-end string length) to obtain search start position 2;
  • Step S306: starting from search start position 2, extract front-end string 2, synonym string 2, and back-end string 2 according to the configuration information; FIG. 4 is a schematic diagram of extracting strings from search start position 1 and search start position 2;
  • Step S307: determine whether front-end string 1 and front-end string 2 are the same and back-end string 1 and back-end string 2 are the same; if yes, go to step S308; if not, go to step S310;
  • Step S308: determine whether synonym string 1 and synonym string 2 are the same; if yes, go to step S310; if not, go to step S309;
  • Step S309: record the search result;
  • Step S310: determine whether search start position 2 has reached the end of the document; if yes, go to step S312; if not, go to step S311;
  • Step S311: advance search start position 2 by one character and return to step S306;
  • Step S312: determine whether search start position 1 has reached the end of the document; if yes, the process ends; if not, advance search start position 1 by a character string length and then go to step S304.
  • Step C: this step displays the synonyms found in step A and step B so that the user can decide whether they need to be replaced with unified wording, which keeps the terminology of the document consistent; the display is shown in FIG. 5;
  • Step D: after the synonyms found in step A and step B are displayed, the user chooses to ignore or replace them.
  • the program updates the records in the synonym library and adds the unified wording to the records;
  • Step 1: the checking program first takes the first record from the synonym library, which states that "data packet" and "data frame" are synonyms, and then searches according to the flow of FIG. 2 (the synonym library record search flowchart). Both synonyms are found in the document, so the search result is recorded;
  • Step 2: the program runs according to the flowchart of FIG. 3 (the in-document synonym search flowchart). Assuming the configured search conditions are a front-end string length of 5, a synonym length of 3, and a back-end string length of 2, the program finds the synonyms "processing unit", "processing module", and "processing program" in the document. The search is illustrated in FIG. 8, where the front-end and back-end strings are marked with a background color;
  • Step 3: the program displays the results found in steps 1 and 2, as shown in FIG. 9;
  • Step 4: based on the displayed results, the user decides that the synonyms should be replaced with a single term and enters the unified wording in the results interface, as shown in FIG. 10;
  • Step 5: the user chooses to replace the synonyms with the unified wording, and the program automatically modifies the document and the synonym library. The updated document is shown in FIG. 11 (modified content is marked with a background color), and the updated synonym library is shown in FIG. 12.
  • This embodiment provides a document error checking apparatus, as shown in FIG. 13, including:
  • a synonym checking module 1310, configured to check and record synonymous words and sentences appearing in the document according to the synonym library; and
  • a synonym processing module 1320, configured to modify the recorded synonymous words and sentences to unified wording.
  • the synonym checking module 1310 is further configured to check synonymous words and sentences appearing in the document by statement comparison, and to record those detected synonymous words and sentences that are not in the synonym library.
  • the synonym checking module 1310 includes:
  • an information acquisition submodule, configured to determine the statement length and the statement comparison mode according to configuration information;
  • a statement comparison submodule, configured to determine the search start position, obtain the start statement, and compare the start statement with all statements after it to determine whether the start statement and the statements after it contain synonymous words and sentences, where the statements after the start statement are all statements obtained by advancing the start position one character at a time past the start statement; and
  • a polling submodule, configured to advance the search start position by one character to obtain a new search start position and then trigger the statement comparison submodule.
  • the apparatus of this embodiment further includes:
  • a history modification checking module, configured to retrieve the history modification database that records historical modifications, search the whole document for the modified content recorded in that database, present the content found, and, according to the user's instruction, either modify it in the way recorded in the history or ignore it.
  • the apparatus of this embodiment further includes a document modification recording module; when the synonym processing module 1320 modifies the recorded synonymous words and sentences to unified wording, it triggers the document modification recording module to record the content before modification and the content after modification in the history modification database.
  • Embodiments of the present invention introduce a synonym library and a statement comparison scheme to check and modify synonymous words and sentences in a document, which addresses the inability of the related art to check wording consistency throughout a document.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A document error search method and device. The method comprises: according to a database for synonymous characters, words, and sentences, checking and recording the synonymous characters, words, and sentences appearing in a document (S101); by comparing sentences, checking the synonymous characters, words, and sentences appearing in the document, and recording those synonymous characters, words, and sentences that are not in the database; and modifying the recorded synonymous characters, words, and sentences to standard wording (S102).

Description

Document error checking method and apparatus
Technical field
Embodiments of the present invention relate to, but are not limited to, the field of computer data processing, and in particular to a document error checking method and apparatus.
Background
Automatic document error checking can detect many kinds of errors in a document, and some tools can also correct them automatically, which improves the efficiency of writing and revising documents.
Document error checking in the related art mainly searches for known incorrect word combinations and then flags or corrects them. This approach is quite limited: it cannot check whether wording is used consistently throughout a document.
Summary of the invention
The following is an overview of the subject matter described in detail herein. This overview is not intended to limit the scope of the claims.
Embodiments of the present invention provide a document error checking method and apparatus that address the inability of the related art to check wording consistency throughout a document.
An embodiment of the present invention provides a document error checking method, including:
checking and recording synonymous words and sentences appearing in a document according to a synonym library; and
modifying the recorded synonymous words and sentences to unified wording.
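The first of these two steps can be pictured as a plain substring search over the document. The following Python sketch is only an illustration under stated assumptions, not the implementation described by the patent: the function name `find_library_synonyms`, the representation of the synonym library as a list of term groups, and the choice to record a group as soon as any one of its terms is found are all hypothetical.

```python
from typing import Dict, List


def find_library_synonyms(document: str,
                          library: List[List[str]]) -> List[Dict[str, List[int]]]:
    """Search the document for every term of every library record.

    `library` is a hypothetical representation: each record is a list of
    terms that are treated as synonyms of one another.
    """
    results = []
    for record in library:                      # take the records one by one
        hits: Dict[str, List[int]] = {}
        for term in record:
            pos = document.find(term)
            while pos != -1:                    # collect every occurrence
                hits.setdefault(term, []).append(pos)
                pos = document.find(term, pos + 1)
        if hits:                                # assumption: record any match
            results.append(hits)
    return results
```

Each result maps the terms that were found to their character offsets, which is enough information to present the matches to a user later.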
Optionally, before the recorded synonymous words and sentences are modified to unified wording, the method further includes:
checking synonymous words and sentences appearing in the document by statement comparison, and recording those detected synonymous words and sentences that are not in the synonym library.
Optionally, checking synonymous words and sentences appearing in the document by statement comparison includes:
determining the statement length and the statement comparison mode according to configuration information;
determining a search start position to obtain a start statement, and comparing the start statement with all statements after it to determine whether the start statement and the statements after it contain synonymous words and sentences, where the statements after the start statement are all statements obtained by advancing the start position one character at a time past the start statement; and
advancing the search start position by one character to obtain a new search start position, and then returning to the statement comparison step above.
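The scanning procedure above amounts to two nested sliding windows over the text. The Python sketch below is a minimal illustration under assumptions: the names `scan_statements` and `is_synonymous` are hypothetical, the comparison mode is passed in as a callable, and the second window is assumed to start immediately after the start statement so that the two statements never overlap.

```python
from typing import Callable, List, Tuple


def scan_statements(text: str, stmt_len: int,
                    is_synonymous: Callable[[str, str], bool]) -> List[Tuple[int, int]]:
    """Return (start1, start2) offsets of statement pairs judged synonymous."""
    found = []
    last_start = len(text) - stmt_len           # last offset where a full statement fits
    for pos1 in range(0, last_start + 1):       # search start position, one character at a time
        start_stmt = text[pos1:pos1 + stmt_len]
        # every later statement, obtained by shifting one character at a time
        for pos2 in range(pos1 + stmt_len, last_start + 1):
            if is_synonymous(start_stmt, text[pos2:pos2 + stmt_len]):
                found.append((pos1, pos2))
    return found
```

Because every start statement is compared with every later statement, the number of comparisons grows quadratically with document length, which may matter for long documents.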
Optionally, the statement comparison mode includes:
dividing a statement into a front-end string, a synonym string, and a back-end string; and
comparing the strings at the same positions in two statements: if the front-end strings of the two statements are the same, the back-end strings are the same, and the synonym strings differ, the synonym strings of the two statements are judged to be synonymous.
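The three-part split and the comparison rule can be sketched as follows. This is an illustrative Python sketch, not the patent's code: `Window`, `take_window`, and `is_synonym_pair` are hypothetical names, and the three lengths are assumed to come from the configuration information.

```python
from typing import NamedTuple, Optional


class Window(NamedTuple):
    front: str      # identical context before the candidate synonym
    synonym: str    # the candidate synonym itself
    back: str       # identical context after the candidate synonym


def take_window(text: str, start: int, front_len: int,
                syn_len: int, back_len: int) -> Optional[Window]:
    """Cut one statement at `start` into front, synonym, and back strings."""
    end = start + front_len + syn_len + back_len
    if start < 0 or end > len(text):
        return None                             # not enough text left for a full statement
    a = start + front_len
    b = a + syn_len
    return Window(text[start:a], text[a:b], text[b:end])


def is_synonym_pair(w1: Window, w2: Window) -> bool:
    """Same front, same back, different middle: treat the middles as synonyms."""
    return w1.front == w2.front and w1.back == w2.back and w1.synonym != w2.synonym
```

With fixed lengths, only candidates of exactly the configured synonym length can be detected, so the configuration effectively decides which kinds of inconsistencies the check can find.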
Optionally, the method further includes: writing the synonymous words and sentences recorded through statement comparison into the synonym library.
Optionally, before checking and recording the synonymous words and sentences appearing in the document according to the synonym library, the method further includes:
retrieving a history modification database;
searching the whole document for the modified content recorded in the history modification database; and
presenting the content found and, according to the user's instruction, either modifying it in the way recorded in the history or ignoring it.
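The history-based check can be sketched in the same style. The Python code below is only a sketch under assumptions: the history modification database is modeled as a list of hypothetical `(old, new)` pairs, the user's decision is modeled as a callback named `accept`, and overlapping matches from different history entries are not handled.

```python
from typing import Callable, List, Tuple

Candidate = Tuple[int, str, str]                # (offset, old text, new text)


def find_history_candidates(document: str,
                            history: List[Tuple[str, str]]) -> List[Candidate]:
    """Search the whole document for text that was modified in the past."""
    candidates = []
    for old, new in history:
        pos = document.find(old)
        while pos != -1:
            candidates.append((pos, old, new))
            pos = document.find(old, pos + len(old))
    return candidates


def apply_history(document: str, candidates: List[Candidate],
                  accept: Callable[[int, str, str], bool]) -> str:
    """Apply the accepted replacements; rejected candidates are ignored."""
    # Work from right to left so that earlier offsets stay valid.
    for pos, old, new in sorted(candidates, reverse=True):
        if accept(pos, old, new):
            document = document[:pos] + new + document[pos + len(old):]
    return document
```

Separating the search from the replacement mirrors the text: the matches are presented first, and each one is changed only if the user confirms it.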
Optionally, when the recorded synonymous words and sentences are modified to unified wording, the method further includes: recording the content before modification and the content after modification in the history modification database.
Optionally, modifying the recorded synonymous words and sentences to unified wording includes:
presenting the recorded synonymous words and sentences; and
modifying the synonymous words and sentences appearing in the document to unified wording based on the user's modification instruction, where the unified wording is a default synonym or a synonym specified by the user.
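Once the user has confirmed a replacement, the modification itself is a straightforward substitution. The Python sketch below is illustrative only: the function name `unify` is hypothetical, and taking the first term of the group as the default unified wording is an assumption, since the text says only that a default exists. The returned list of changes is the kind of information that could be written to a history modification database.

```python
from typing import List, Optional, Tuple


def unify(document: str, synonyms: List[str],
          unified: Optional[str] = None) -> Tuple[str, List[Tuple[str, str]]]:
    """Replace every synonym in the group with the unified wording."""
    # Assumption: the default unified wording is the first term of the group.
    target = unified if unified is not None else synonyms[0]
    changes = []
    for term in synonyms:
        if term != target and term in document:
            document = document.replace(term, target)
            changes.append((term, target))      # (before, after), e.g. for a history log
    return document, changes
```

This simple version assumes that no synonym is a substring of the unified wording; if one is, a position-based replacement would be needed to avoid rewriting text that was just inserted.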
An embodiment of the present invention further provides a document error checking apparatus, including:
a synonym checking module, configured to check and record synonymous words and sentences appearing in a document according to a synonym library; and
a synonym processing module, configured to modify the recorded synonymous words and sentences to unified wording.
Optionally, the synonym checking module is further configured to check synonymous words and sentences appearing in the document by statement comparison, and to record those detected synonymous words and sentences that are not in the synonym library.
Optionally, the synonym checking module includes:
an information acquisition submodule, configured to determine the statement length and the statement comparison mode according to configuration information;
a statement comparison submodule, configured to determine a search start position, obtain a start statement, and compare the start statement with all statements after it to determine whether the start statement and the statements after it contain synonymous words and sentences, where the statements after the start statement are all statements obtained by advancing the start position one character at a time past the start statement; and
a polling submodule, configured to advance the search start position by one character to obtain a new search start position and then trigger the statement comparison submodule.
Optionally, the apparatus further includes a history modification checking module, configured to retrieve a history modification database, search the whole document for the modified content recorded in the history modification database, present the content found, and, according to the user's instruction, either modify it in the way recorded in the history or ignore it.
Optionally, the apparatus further includes a document modification recording module, configured to record the content before modification and the content after modification in the history modification database.
Optionally, the synonym processing module modifying the recorded synonymous words and sentences to unified wording includes:
the synonym processing module presents the recorded synonymous words and sentences; and
based on the user's modification instruction, the synonym processing module modifies the synonymous words and sentences appearing in the document to unified wording, where the unified wording is a default synonym or a synonym specified by the user.
An embodiment of the present invention further provides a computer-readable storage medium storing program instructions that implement the above method when executed.
Embodiments of the present invention introduce a synonym library and a statement comparison scheme to check and modify synonymous words and sentences in a document, which addresses the inability of the related art to check wording consistency throughout a document.
Other aspects will become apparent upon reading and understanding the drawings and the detailed description.
Brief description of the drawings
FIG. 1 is a flowchart of a document error checking method according to Embodiment 1 of the present invention;
FIG. 2 is a flowchart of searching for synonymous words and sentences using the synonym library according to Embodiment 2 of the present invention;
FIG. 3 is a flowchart of searching for synonymous words and sentences within a document using statement comparison according to Embodiment 2 of the present invention;
FIG. 4 is a schematic diagram of statement comparison in an embodiment of the present invention;
FIG. 5 shows how the recorded synonymous words and sentences are displayed according to Embodiment 2 of the present invention;
FIG. 6 is a schematic diagram of the document to be checked in an application example of the present invention;
FIG. 7 is a schematic diagram of the records in the synonym library in the application example of the present invention;
FIG. 8 is a schematic diagram of searching for synonymous words and sentences by statement comparison in the application example of the present invention;
FIG. 9 is a schematic diagram of the synonymous words and sentences found by the synonym library search and the statement comparison search in the application example of the present invention;
FIG. 10 shows the display after the user has entered the unified wording for the displayed synonymous words and sentences in the application example of the present invention;
FIG. 11 shows the result of modifying the synonymous words and sentences to the unified wording in the application example of the present invention;
FIG. 12 is a schematic diagram of the updated synonym library in the application example of the present invention;
FIG. 13 is a schematic structural diagram of a document error checking apparatus according to Embodiment 3 of the present invention.
Embodiments of the Invention
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention.
Embodiment 1
This embodiment provides a document error checking method for checking whether characters, words, and phrases are used consistently throughout a document. As shown in FIG. 1, the method includes:
Step S101: according to the synonym library, check for and record the synonymous expressions that appear in the document;
In this embodiment, to avoid omissions caused by an incomplete synonym library, after step S101 the document may optionally also be checked for synonymous expressions by sentence comparison, and any synonymous expressions found in this way that are not already in the synonym library are recorded. The synonymous expressions recorded through sentence comparison are also written into the synonym library to update it.
Checking the document for synonymous expressions by sentence comparison includes:
(1) Determine the sentence length and the sentence comparison mode according to configuration information. The comparison mode includes, but is not limited to: dividing each sentence into a front-end string + a synonym string + a back-end string, and comparing the strings at the same positions in two sentences; if the two sentences have identical front-end strings and identical back-end strings but different synonym strings, the synonym strings of the two sentences are determined to be synonymous expressions;
(2) Determine the search start position to obtain a start sentence, and compare the start sentence with every sentence after it to determine whether the start sentence and any later sentence contain synonymous expressions. Here, the sentences after the start sentence are all the sentences obtained by advancing the start position toward the end of the document one character at a time, beginning after the start sentence;
(3) Advance the search start position by one character to obtain a new search start position, and return to step (2).
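The comparison rule in step (1) can be illustrated with a minimal Python sketch. The names `Window`, `cut_window`, and `is_synonym_pair` are hypothetical and do not appear in the source; the three length parameters stand in for the values that the method reads from configuration information.

```python
from typing import NamedTuple, Optional


class Window(NamedTuple):
    """One 'sentence' split into the three parts named in step (1)."""
    front: str   # front-end string (context before the candidate)
    middle: str  # candidate synonym string
    back: str    # back-end string (context after the candidate)


def cut_window(text: str, start: int, front_len: int,
               mid_len: int, back_len: int) -> Optional[Window]:
    """Cut a fixed-length window at `start`; return None if it does not fit."""
    end = start + front_len + mid_len + back_len
    if start < 0 or end > len(text):
        return None
    a = start + front_len
    b = a + mid_len
    return Window(text[start:a], text[a:b], text[b:end])


def is_synonym_pair(w1: Window, w2: Window) -> bool:
    """Same front, same back, different middle: treat the middles as candidates."""
    return w1.front == w2.front and w1.back == w2.back and w1.middle != w2.middle
```

Note that a match is only a candidate: identical context around two different strings suggests, but does not prove, that they are synonyms, which is why the method later asks the user to confirm.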
Step S102: change the recorded synonymous expressions to a unified term.
This step includes: presenting the recorded synonymous expressions and, based on the user's modification instruction, changing the synonymous expressions that appear in the document to a unified term. The unified term may be a default one of the synonymous expressions or one specified by the user.
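As an illustration of step S102, the following Python sketch replaces every member of a synonym group with a unified term. The function name `unify` and the convention that the default unified term is the first member of the group are assumptions made for this example, not details given in the source.

```python
import re
from typing import List, Optional, Tuple


def unify(text: str, group: List[str],
          unified: Optional[str] = None) -> Tuple[str, List[Tuple[str, str]]]:
    """Replace each term in `group` with the unified term.

    Returns the new text and a list of (old, new) pairs, one per replacement.
    Assumption: when `unified` is None, the first term of the group is used.
    """
    target = unified if unified is not None else group[0]
    # Longest terms first so that a longer term wins over a shorter prefix.
    others = sorted((t for t in group if t and t != target), key=len, reverse=True)
    if not others:
        return text, []
    pattern = re.compile("|".join(re.escape(t) for t in others))
    changes: List[Tuple[str, str]] = []

    def replace(match: "re.Match[str]") -> str:
        changes.append((match.group(0), target))
        return target

    return pattern.sub(replace, text), changes
```

The list of changes returned here is the kind of information that could be written to the history modification database described later in this embodiment. The sketch does not handle the case where one term of the group is a substring of the unified term.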
Optionally, this embodiment may also unify expressions in the document according to past modification operations. This operation may be performed before S101 or after S102, preferably before S101, and proceeds as follows:
Retrieve the history modification database;
Search the whole document for the modified content recorded in the history modification database;
Present the content found and, according to the user's instruction, either modify it in the same way as before or ignore it.
Optionally, in this embodiment, when the recorded synonymous expressions are changed to a unified term, the method further includes: recording the content that was modified and the content after modification in the history modification database.
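The history-based pass and the recording of new modifications can be sketched as follows. This is a minimal illustration under stated assumptions: the history modification database is modeled as an in-memory list of `(old, new)` pairs, and the user's instruction is modeled as a callback named `approve`; neither the storage format nor these names come from the source.

```python
from typing import Callable, List, Tuple

# Each entry is (content that was modified, content after modification).
History = List[Tuple[str, str]]


def replay_history(text: str, history: History,
                   approve: Callable[[str, str], bool]) -> str:
    """Search the whole text for previously modified content and, if the
    user approves, apply the same modification again; otherwise ignore it."""
    for old, new in history:
        if old and old in text and approve(old, new):
            text = text.replace(old, new)
    return text


def record_change(history: History, old: str, new: str) -> None:
    """Record a modification so that later runs can replay it."""
    if (old, new) not in history:
        history.append((old, new))
```

In an interactive tool, `approve` would present the match to the user and return their choice; passing `lambda old, new: True` applies every recorded modification without asking.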
In summary, the method of this embodiment introduces a synonym library and a sentence comparison scheme, which together make it possible to check and correct synonymous expressions in a document, and it solves the problem that the related art cannot detect whether wording is used consistently throughout a document.
Embodiment 2
This embodiment provides a document error checking method and describes the method of Embodiment 1 in more detail.
The main processing steps of the document error checking method of this embodiment are:
Step A: check for possible synonymous expressions according to the records in the synonym library, and record the results;
In this step, the program traverses the document according to the records in the synonym library and records the search results. After this step, the process proceeds to step B.
The detailed flow of this step is shown in FIG. 2 and includes:
Step S201: start;
Step S202: check whether the synonym library contains any records; if yes, go to step S203; if no, go to step S208;
Step S203: take one record from the synonym library;
Step S204: search the document for content that matches the record;
Step S205: determine whether anything was found; if yes, go to step S206; if no, go to step S207;
Step S206: record the search result, then go to step S207;
Step S207: determine whether any records remain to be searched; if yes, return to step S203; if no, go to step S208;
Step S208: end.
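The loop of steps S201 to S208 can be sketched in Python as below. The data model is an assumption: the synonym library is represented as a list of term groups, and a search result maps each term that was found to the positions where it occurs. The function names are hypothetical.

```python
from typing import Dict, List


def find_all(text: str, term: str) -> List[int]:
    """Return every start index at which `term` occurs in `text`."""
    positions = []
    i = text.find(term)
    while i != -1:
        positions.append(i)
        i = text.find(term, i + 1)
    return positions


def scan_with_library(text: str,
                      library: List[List[str]]) -> List[Dict[str, List[int]]]:
    """One pass over the library records (S203 to S207)."""
    results = []
    for group in library:                      # S203: take one record
        hits = {}
        for term in group:                     # S204: search the document
            if term:
                positions = find_all(text, term)
                if positions:
                    hits[term] = positions
        if hits:                               # S205: anything found?
            results.append(hits)               # S206: record the result
    return results                             # S208: end
```

This sketch records a group as soon as any of its terms appears. A stricter variant could record a group only when at least two different terms from it appear, since only then is the wording actually inconsistent; the source does not say which behavior is intended.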
Step B: search for similar expressions within the document itself, and record the results;
This step prevents the omissions that would occur when synonymous expressions are not listed in the synonym library and are therefore not found. In this step, the program traverses the document according to the configuration information and the surrounding context, searches for possible synonyms, and records the results. After this step, the process proceeds to step C.
The detailed flow of this step is shown in FIG. 3 and includes:
Step S301: start;
Step S302: read the configuration information and determine: the length of the identical string before the synonymous expression (front-end string); the length of the synonym string; and the length of the identical string after the synonymous expression (back-end string);
Step S303: determine search start position 1; for the initial search, the start position may be the first character of the document or a position specified by the user;
Step S304: starting from search start position 1, take front-end string 1, synonym string 1, and back-end string 1 according to the configuration information;
Step S305: advance from search start position 1 by one full string length (the combined length of the front-end string, the synonym string, and the back-end string) to obtain search start position 2;
Step S306: starting from search start position 2, take front-end string 2, synonym string 2, and back-end string 2 according to the configuration information; FIG. 4 illustrates taking the strings from search start position 1 and search start position 2;
Step S307: determine whether front-end string 1 is identical to front-end string 2 and back-end string 1 is identical to back-end string 2; if yes, go to step S308; if no, go to step S310;
Step S308: determine whether synonym string 1 is identical to synonym string 2; if yes, go to step S310; if no, go to step S309;
Step S309: record the search result;
Step S310: determine whether search start position 2 has reached the end of the document; if yes, go to step S312; if no, go to step S311;
Step S311: advance search start position 2 by one character, and return to step S306;
Step S312: determine whether search start position 1 has reached the end of the document; if yes, end; if no, advance search start position 1 by one string length and go to step S304.
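The nested search of steps S303 to S312 can be sketched in Python as follows. The function name `find_candidates` and the tuple layout of the results are hypothetical. One labeled assumption: search start position 1 is advanced by one character per outer iteration, following step (3) of Embodiment 1, whereas step S312 as written speaks of one string length.

```python
from typing import List, Tuple


def find_candidates(text: str, front_len: int, mid_len: int,
                    back_len: int) -> List[Tuple[int, int, str, str]]:
    """Return (position 1, position 2, synonym string 1, synonym string 2)
    for every pair of windows with equal front and back but different middle."""
    win = front_len + mid_len + back_len       # full window length (S302)
    last_start = len(text) - win               # last position where a window fits
    found = []
    for p1 in range(0, last_start + 1):        # S303 / S312 (assumed step: 1 char)
        front1 = text[p1:p1 + front_len]       # S304
        mid1 = text[p1 + front_len:p1 + front_len + mid_len]
        back1 = text[p1 + front_len + mid_len:p1 + win]
        for p2 in range(p1 + win, last_start + 1):   # S305, then S311
            front2 = text[p2:p2 + front_len]   # S306
            mid2 = text[p2 + front_len:p2 + front_len + mid_len]
            back2 = text[p2 + front_len + mid_len:p2 + win]
            # S307 and S308: same context, different middle
            if front1 == front2 and back1 == back2 and mid1 != mid2:
                found.append((p1, p2, mid1, mid2))   # S309
    return found
```

For example, with the made-up text `"call the parse unit now. call the parse node now."` and lengths 6, 4, and 4, the only pair reported is `"unit"` and `"node"`, because both sit between `"parse "` and `" now"`. The search is quadratic in the document length, so a practical tool would likely need indexing or a bounded search range for long documents.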
Step C: display the synonyms found in steps A and B so that the user can decide whether they should be replaced with a single term to keep the document's wording consistent; the display is shown in FIG. 5;
Step D: after the synonyms found in steps A and B are displayed, the user chooses either to ignore them or to replace them with a unified term.
If the user chooses to ignore them, the program does nothing;
If the user chooses to replace them, the synonyms are replaced with the unified term and the following processing is also performed:
If this group of synonyms and the unified term are already stored in the synonym library, the program makes no change to the synonym library;
If this group of synonyms is already stored in the synonym library but the unified term was newly entered this time, the program updates the corresponding record in the synonym library by adding the unified term to it;
If this group of synonyms has no record in the synonym library yet (it was newly discovered this time), the program adds a record for it to the synonym library;
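The three library-update cases of step D can be sketched in Python as below. The record layout (a dictionary with a `terms` set and a `unified` field) and the rule that a record matches when it contains every term of the group are assumptions made for illustration; the source does not specify how records are stored or matched.

```python
from typing import Dict, Iterable, List


def update_library(library: List[Dict], terms: Iterable[str], unified: str) -> str:
    """Apply the update rules after the user confirms a replacement.

    Returns "unchanged", "updated", or "added" to show which case applied.
    """
    group = set(terms)
    for record in library:
        if group <= record["terms"]:           # the group is already stored
            if record.get("unified") == unified:
                return "unchanged"             # case 1: nothing to do
            record["terms"].add(unified)       # case 2: new unified term
            record["unified"] = unified
            return "updated"
    # case 3: a newly discovered group
    library.append({"terms": group | {unified}, "unified": unified})
    return "added"
```

Returning the case name is only a convenience for testing and logging; a real implementation would also need to persist the library.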
The flow ends.
The application of the above method is illustrated with an example. Suppose the document to be checked is as shown in FIG. 6, and the records in the synonym library at this point are as shown in FIG. 7. Running the checking program then proceeds as follows:
Step 1: the checking program first takes the first record from the synonym library, which states that "数据包" (data packet) and "数据帧" (data frame) are synonymous expressions, and searches according to the flow of FIG. 2 (the synonym library record search flowchart). Both expressions are found in the document, so this search result is recorded;
Step 2: the program runs the flow of FIG. 3 (the flowchart for searching synonymous expressions within the document). Assuming the configured search conditions are a front-end string length of 5, a synonym string length of 3, and a back-end string length of 2, the program finds that the document contains the synonymous expressions "处理单元" (processing unit), "处理模块" (processing module), and "处理程序" (processing program). The search is illustrated in FIG. 8, where the front-end and back-end strings are marked with a shaded background;
Step 3: the program displays the results found in steps 1 and 2, as shown in FIG. 9;
Step 4: based on the displayed results, the user decides that the synonyms should all use the same term and enters the unified terms in the result display interface, as shown in FIG. 10;
Step 5: the user chooses to replace the synonyms with the unified terms, and the program automatically modifies the document and the synonym library. The updated document is shown in FIG. 11 (the modified content is marked with a shaded background), and the updated synonym library is shown in FIG. 12.
The program finishes: the document has been modified and the synonym library has been updated.
Embodiment 3
This embodiment provides a document error checking apparatus which, as shown in FIG. 13, includes:
a synonymous expression checking module 1310, configured to check for and record, according to the synonym library, the synonymous expressions that appear in the document; and
a synonymous expression processing module 1320, configured to change the recorded synonymous expressions to a unified term.
Optionally, in this embodiment, the synonymous expression checking module 1310 is further configured to check the document for synonymous expressions by sentence comparison and to record any synonymous expressions found that are not already in the synonym library.
The synonymous expression checking module 1310 includes:
an information acquisition submodule, configured to determine the sentence length and the sentence comparison mode according to configuration information;
a sentence comparison submodule, configured to determine the search start position to obtain a start sentence, and to compare the start sentence with every sentence after it to determine whether the start sentence and any later sentence contain synonymous expressions, where the sentences after the start sentence are all the sentences obtained by advancing the start position toward the end of the document one character at a time, beginning after the start sentence; and
a polling submodule, configured to advance the search start position by one character to obtain a new search start position and then trigger the sentence comparison submodule.
Optionally, the apparatus of this embodiment further includes:
a history modification checking module, configured to retrieve the history modification database that records past modifications, search the whole document for the modified content recorded in the history modification database, present the content found, and, according to the user's instruction, either modify it in the same way as before or ignore it.
Optionally, the apparatus of this embodiment further includes a document modification recording module. When the synonymous expression processing module 1320 changes the recorded synonymous expressions to a unified term, it triggers the document modification recording module to record the content that was modified and the content after modification in the history modification database.
In summary, the embodiments of the present invention introduce a synonym library and a sentence comparison scheme, which together make it possible to check and correct synonymous expressions in a document, and they solve the problem that the related art cannot detect whether wording is used consistently throughout a document.
The embodiments herein are described progressively. For parts that are the same or similar across embodiments, the embodiments may be referred to one another, and each embodiment focuses on how it differs from the others. In particular, because the apparatus embodiment is largely similar to the method embodiments, the relevant parts of the method embodiments may be consulted for details.
Those of ordinary skill in the art will understand that all or some of the steps of the above method may be carried out by a program instructing the relevant hardware, and that the program may be stored in a computer-readable storage medium such as a read-only memory, a magnetic disk, or an optical disc. Optionally, all or some of the steps of the above embodiments may also be implemented using one or more integrated circuits. Accordingly, each module or unit in the above embodiments may be implemented in hardware or as a software functional module. The embodiments of the present invention are not limited to any specific combination of hardware and software.
Industrial Applicability
The embodiments of the present invention introduce a synonym library and a sentence comparison scheme, which together make it possible to check and correct synonymous expressions in a document, and they solve the problem that the related art cannot detect whether wording is used consistently throughout a document.

Claims (15)

  1. A document error checking method, comprising:
    checking for and recording, according to a synonym library, synonymous expressions that appear in a document; and
    changing the recorded synonymous expressions to a unified term.
  2. The method of claim 1, wherein before the recorded synonymous expressions are changed to a unified term, the method further comprises:
    checking the document for synonymous expressions by sentence comparison, and recording any synonymous expressions found that are not in the synonym library.
  3. The method of claim 2, wherein checking the document for synonymous expressions by sentence comparison comprises:
    determining a sentence length and a sentence comparison mode according to configuration information;
    determining a search start position to obtain a start sentence, and comparing the start sentence with every sentence after it to determine whether the start sentence and any later sentence contain synonymous expressions, wherein the sentences after the start sentence are all the sentences obtained by advancing the start position toward the end of the document one character at a time, beginning after the start sentence; and
    advancing the search start position by one character to obtain a new search start position, and returning to the sentence comparison step.
  4. The method of claim 3, wherein the sentence comparison mode comprises:
    dividing each sentence into a front-end string + a synonym string + a back-end string; and
    comparing the strings at the same positions in two sentences, and, if the two sentences have identical front-end strings and identical back-end strings but different synonym strings, determining that the synonym strings of the two sentences are synonymous expressions.
  5. The method of any one of claims 2 to 4, further comprising: writing the synonymous expressions recorded through sentence comparison into the synonym library.
  6. The method of any one of claims 1 to 4, wherein before checking for and recording, according to the synonym library, the synonymous expressions that appear in the document, the method further comprises:
    retrieving a history modification database;
    searching the whole document for the modified content recorded in the history modification database; and
    presenting the content found and, according to the user's instruction, either modifying it in the same way as before or ignoring it.
  7. The method of claim 6, wherein when the recorded synonymous expressions are changed to a unified term, the method further comprises: recording the content that was modified and the content after modification in the history modification database.
  8. The method of any one of claims 1, 2, 3, 4, or 7, wherein changing the recorded synonymous expressions to a unified term comprises:
    presenting the recorded synonymous expressions; and
    changing, based on the user's modification instruction, the synonymous expressions that appear in the document to a unified term, wherein the unified term is a default one of the synonymous expressions or one specified by the user.
  9. A document error checking apparatus, comprising:
    a synonymous expression checking module, configured to check for and record, according to a synonym library, synonymous expressions that appear in a document; and
    a synonymous expression processing module, configured to change the recorded synonymous expressions to a unified term.
  10. The apparatus of claim 9, wherein the synonymous expression checking module is further configured to check the document for synonymous expressions by sentence comparison and to record any synonymous expressions found that are not in the synonym library.
  11. The apparatus of claim 10, wherein the synonymous expression checking module comprises:
    an information acquisition submodule, configured to determine a sentence length and a sentence comparison mode according to configuration information;
    a sentence comparison submodule, configured to determine a search start position to obtain a start sentence, and to compare the start sentence with every sentence after it to determine whether the start sentence and any later sentence contain synonymous expressions, wherein the sentences after the start sentence are all the sentences obtained by advancing the start position toward the end of the document one character at a time, beginning after the start sentence; and
    a polling submodule, configured to advance the search start position by one character to obtain a new search start position and then trigger the sentence comparison submodule.
  12. The apparatus of claim 9, 10, or 11, further comprising:
    a history modification checking module, configured to retrieve a history modification database, search the whole document for the modified content recorded in the history modification database, present the content found, and, according to the user's instruction, either modify it in the same way as before or ignore it.
  13. The apparatus of claim 12, further comprising a document modification recording module configured to record the content that was modified and the content after modification in the history modification database.
  14. The apparatus of any one of claims 9, 10, 11, or 13, wherein the synonymous expression processing module changing the recorded synonymous expressions to a unified term comprises:
    the synonymous expression processing module presenting the recorded synonymous expressions; and
    the synonymous expression processing module changing, based on the user's modification instruction, the synonymous expressions that appear in the document to a unified term, wherein the unified term is a default one of the synonymous expressions or one specified by the user.
  15. A computer-readable storage medium storing program instructions which, when executed, implement the method of any one of claims 1 to 8.
PCT/CN2015/091129 2015-07-16 2015-09-29 Document error search method and device WO2016131278A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510418321.8 2015-07-16
CN201510418321.8A CN106407188A (en) 2015-07-16 2015-07-16 Document error-checking method and device

Publications (1)

Publication Number Publication Date
WO2016131278A1 true WO2016131278A1 (en) 2016-08-25

Family

ID=56692153

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/091129 WO2016131278A1 (en) 2015-07-16 2015-09-29 Document error search method and device

Country Status (2)

Country Link
CN (1) CN106407188A (en)
WO (1) WO2016131278A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111339788B (en) 2020-02-18 2023-09-15 北京字节跳动网络技术有限公司 Interactive machine translation method, device, equipment and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110179034A1 (en) * 2010-01-20 2011-07-21 Sony Corporation Information processor, method of processing information, and program
CN102999483A (en) * 2011-09-16 2013-03-27 北京百度网讯科技有限公司 Method and device for correcting text
CN103164390A (en) * 2011-12-15 2013-06-19 富士通株式会社 Document processing method and document processing device
CN103678424A (en) * 2012-09-25 2014-03-26 北大方正集团有限公司 Document proofreading method and device

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6947947B2 (en) * 2001-08-17 2005-09-20 Universal Business Matrix Llc Method for adding metadata to data
CN103927375B (en) * 2002-09-30 2017-06-23 改进搜索有限责任公司 The flicker annotation callout of cross-language search result is highlighted
CN102541824B (en) * 2010-12-26 2015-10-21 上海量明科技发展有限公司 A kind of method and system in order to realize document amendment
US8856156B1 (en) * 2011-10-07 2014-10-07 Cerner Innovation, Inc. Ontology mapper
CN102937994A (en) * 2012-11-15 2013-02-20 北京锐安科技有限公司 Similar document query method based on stop words

Also Published As

Publication number Publication date
CN106407188A (en) 2017-02-15

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15882417

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15882417

Country of ref document: EP

Kind code of ref document: A1