CN110990593B - Citation falling empty detection method and device - Google Patents

Publication number
CN110990593B
Authority
CN
China
Prior art keywords
sequence number
document
detected
sequence
target object
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911298605.2A
Other languages
Chinese (zh)
Other versions
CN110990593A (en)
Inventor
李少明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New Founder Holdings Development Co ltd
Beijing Founder Electronics Co Ltd
Original Assignee
New Founder Holdings Development Co ltd
Beijing Founder Electronics Co Ltd
Application filed by New Founder Holdings Development Co ltd, Beijing Founder Electronics Co Ltd
Priority to CN201911298605.2A
Publication of CN110990593A
Application granted
Publication of CN110990593B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/38 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/382 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using citations
    • G06F16/383 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Library & Information Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

Embodiments of the application provide a method and a device for detecting dangling references (citations that "fall empty"). The method acquires a sequence number set corresponding to a document to be detected, where the sequence number set comprises the first sequence number, in the document to be detected, of each target object in the document. A second sequence number referenced in the text of the document to be detected is identified, and it is judged whether the second sequence number exists in the sequence number set. If the second sequence number does not exist in the sequence number set, prompt information is pushed to indicate that the reference to the second sequence number is dangling. Compared with the prior art, which relies on manual checking by the user, the method and device provided by the embodiments of the application can improve both the accuracy and the efficiency of dangling-reference detection.

Description

Citation falling empty detection method and device
Technical Field
The embodiments of the application relate to the field of computer technology, and in particular to a method and a device for detecting dangling references (citations that "fall empty").
Background
When editing documents such as books and journal articles, authors often refer to figures, tables, formulas, and annotations to explain the text and help the reader understand it. Figures, tables, formulas, and annotations are usually labeled with sequence numbers, and the text cites those numbers so that the reader can find the corresponding object. For example, the text cites "FIG. 1" so that the reader can find the figure labeled "FIG. 1".
However, when a sequence number cited in the text has no corresponding figure, table, formula, or annotation, the reference is dangling (it "falls empty"), and the reader cannot correctly relate the text to the figure, table, formula, or annotation. In the prior art, dangling references are found mainly through the experience and care of the user; this approach cannot find them reliably and is inefficient.
Disclosure of Invention
Embodiments of the application provide a method and a device for detecting dangling references, to solve the problems that dangling references cannot be found accurately and that detecting them is inefficient.
In a first aspect, an embodiment of the present application provides a dangling-reference detection method, the method including:
acquiring a sequence number set corresponding to a document to be detected, wherein the sequence number set comprises a first sequence number, in the document to be detected, of a target object in the document to be detected;
identifying a second sequence number referenced in the text of the document to be detected;
judging whether the second sequence number exists in the sequence number set; and
if the second sequence number does not exist in the sequence number set, pushing prompt information, wherein the prompt information is used for indicating that the reference to the second sequence number is dangling.
Optionally, the target object includes at least one of the following types of objects: figure, table, formula.
Optionally, the target objects are of at least two types, the document to be detected corresponds to at least two sequence number sets, and each sequence number set comprises the first sequence numbers of the target objects of one type in the document to be detected.
Optionally, the identifying a second sequence number referenced in the text of the document to be detected specifically includes:
identifying, in the text of the document to be detected, a phrase in a preset format, wherein the phrase comprises the second sequence number; and
extracting the second sequence number from the phrase in the preset format.
Optionally, the judging whether the second sequence number exists in the sequence number set includes:
judging whether the second sequence number exists in a first sequence number set, wherein the type of target object corresponding to the first sequence number set is the same as the type of target object denoted by the second sequence number, and the first sequence number set is one of the at least two sequence number sets.
In a second aspect, an embodiment of the present application provides a dangling-reference detection apparatus, including:
a first processing module, configured to acquire a sequence number set corresponding to a document to be detected, wherein the sequence number set comprises a first sequence number, in the document to be detected, of a target object in the document to be detected;
a second processing module, configured to identify a second sequence number referenced in the text of the document to be detected;
a judging module, configured to judge whether the second sequence number exists in the sequence number set; and
a pushing module, configured to push prompt information when the second sequence number does not exist in the sequence number set, wherein the prompt information is used for indicating that the reference to the second sequence number is dangling.
Optionally, the target object includes at least one of the following types of objects: figure, table, formula.
Optionally, the target objects are of at least two types, the document to be detected corresponds to at least two sequence number sets, and each sequence number set comprises the first sequence numbers of the target objects of one type in the document to be detected.
Optionally, the second processing module is specifically configured to identify, in the text of the document to be detected, a phrase in a preset format, wherein the phrase comprises the second sequence number, and to extract the second sequence number from the phrase in the preset format.
Optionally, the judging module is specifically configured to judge whether the second sequence number exists in a first sequence number set, wherein the type of target object corresponding to the first sequence number set is the same as the type of target object denoted by the second sequence number, and the first sequence number set is one of the at least two sequence number sets.
In a third aspect, an embodiment of the present application provides a dangling-reference detection apparatus, including: at least one processor and a memory;
the memory stores computer-executable instructions;
the at least one processor executes the computer-executable instructions stored in the memory, causing the apparatus to perform the method of any one of the first aspect.
In a fourth aspect, embodiments of the present application provide a computer-readable storage medium having stored thereon computer-executable instructions that, when executed by a processor, implement the method of any one of the first aspect.
According to the dangling-reference detection method and device provided by the embodiments of the application, the sequence number set corresponding to the document to be detected is first acquired; the second sequence number referenced in the text of the document to be detected is then identified, and it is judged whether the second sequence number exists in the sequence number set. If it exists, the reference is correct and is not dangling; if it does not exist, the reference is dangling, and prompt information is pushed to remind the user that the reference to the second sequence number is dangling. Compared with the prior art, which relies on manual checking by the user, the method and device provided by the embodiments of the application can improve both the accuracy and the efficiency of dangling-reference detection.
Drawings
To illustrate the embodiments of the present application or the technical solutions of the prior art more clearly, the drawings required for describing the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present application, and a person skilled in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a schematic diagram of a scenario in which a reference is dangling;
FIG. 2 is a schematic flow chart of a dangling-reference detection method according to an embodiment of the present application;
FIG. 3 is a schematic structural diagram of a dangling-reference detection apparatus according to an embodiment of the present application;
FIG. 4 is a schematic structural diagram of another dangling-reference detection apparatus according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
To make the notion of a dangling reference concrete, consider an example. FIG. 1 is a schematic diagram of a scenario in which a reference is dangling. As shown in FIG. 1, the figure shows a section of a document that includes text and a table. "Table 2", boxed in black above the table header, is the sequence number of the table, and "Table 2-9", boxed in black in the text, is the table sequence number cited by the text. In this example, the "Table 2-9" cited in the text refers to the table now labeled "Table 2" (originally labeled "Table 2-9"). When renumbering, the author changed only the sequence number on the table and forgot to change the number cited in the text, so the cited number no longer matches the table's actual number and the reference to "Table 2-9" is dangling.
At present, dangling references are found mainly through the experience and care of the user; this approach cannot find them reliably in the text and is inefficient.
In view of the above problems, embodiments of the present application provide a method and a device for detecting dangling references. The sequence number set corresponding to the document to be detected is first acquired; the second sequence number referenced in the text of the document to be detected is then identified, and it is judged whether the second sequence number exists in the sequence number set. If it exists, the reference is correct and is not dangling; if it does not exist, the reference is dangling, and prompt information is pushed to remind the user that the reference to the second sequence number is dangling. Compared with the prior art, which relies on manual checking by the user, the method and device provided by the embodiments of the application can improve both the accuracy and the efficiency of dangling-reference detection.
It should be understood that the document to be detected in the present application is a document whose text may contain referenced sequence numbers. The object corresponding to a referenced sequence number may be, for example, at least one of a figure, a table, a formula, an annotation, and the like. The application does not limit the format of the document to be detected; for example, it may be a document in Word, txt, or PDF format. In addition, the document to be detected may be an editable document or a non-editable document.
The technical solution of the dangling-reference detection method and device provided by the application is described below with reference to several specific embodiments. The following embodiments may be combined with one another, and the same or similar concepts or processes may not be repeated in some embodiments.
FIG. 2 is a schematic flow chart of a dangling-reference detection method according to an embodiment of the present application. The method of this embodiment may be executed by an electronic device (e.g., a terminal device or a server). As shown in FIG. 2, the method may include:
S101, acquiring a sequence number set corresponding to a document to be detected, wherein the sequence number set comprises a first sequence number, in the document to be detected, of a target object in the document to be detected.
The target object may include, for example, at least one of the following types of objects: figures, tables, formulas, annotations, and the like. When the target object is a figure, its first sequence number in the document to be detected may be the figure number; when it is a table, the table number; when it is a formula, the formula number; and when it is an annotation, the annotation number.
Taking a Chinese document as an example, the first sequence number may take forms including, but not limited to, the following:
First, Arabic numerals, such as "FIG. 1", "Table 2", "Formula 1", "Annotation (2)", and the like.
Second, Roman numerals, such as "FIG. I", "Table II", "Formula II", "Annotation I", and the like.
Third, English letters, such as "FIG. A", "Table B", "Formula C", "Annotation D", "Figure A", "Figure B", and the like.
Fourth, combinations of Arabic numerals and English letters, such as "FIG. 1A", "Table 3C", "Formula 2B", "Annotation 5A", "Figure 2B", "Figure A1", "Figure 1e", and the like.
Fifth, combinations of Arabic numerals, English letters, and symbols, such as "FIG. A-1", "Table B.1", "Formula A-2", "Annotation C.3", and the like.
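The five forms above can be approximated by a single regular expression. The following is a minimal sketch under stated assumptions, not the patent's implementation: the names `KEYWORDS`, `NUMBER`, `LABEL_RE`, and `parse_label` are hypothetical, and English keywords stand in for the Chinese ones.

```python
import re

# Hypothetical keyword list (assumption); a Chinese document would use its own keywords.
KEYWORDS = r"(?:FIG\.|Figure|Table|Formula|Annotation)"

# Number body: digits with an optional letter ("1", "1A"), Roman numerals ("II"),
# or a letter with optional digits ("A", "A1"), optionally extended by "-" or "."
# parts ("A-1", "B.1"). The lookahead rejects ordinary words such as "shows".
NUMBER = r"(?:\d+[A-Za-z]?|[IVXLC]+|[A-Za-z]\d*)(?:[-.]\d+[A-Za-z]?)*(?![A-Za-z0-9])"

LABEL_RE = re.compile(rf"({KEYWORDS})\s*\(?({NUMBER})\)?")

def parse_label(text):
    """Return (keyword, number) for the first label found in text, or None."""
    m = LABEL_RE.search(text)
    return (m.group(1), m.group(2)) if m else None
```

A real document would likely need a wider keyword list and stricter Roman-numeral handling; the pattern here only aims to cover the examples listed above.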
As for how the sequence number set corresponding to the document to be detected is acquired, in one possible implementation the sequence number set already exists.
In another possible implementation, the document to be detected is first searched for candidate sequence numbers in a preset format, for example "FIG. X", "Table X", "Formula X", and "Annotation X". Each candidate is then checked for whether the content of a figure, table, formula, or annotation exists immediately before or after its position: if so, the candidate is kept; if not, it is discarded. In this way the first sequence numbers of the target objects in the document to be detected are obtained and collected into the sequence number set corresponding to the document.
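The two-stage filtering described above can be sketched as follows. This is an illustration only: the document model (a list of `(kind, text)` blocks) and the names `CAPTION_RE`, `KIND_FOR`, and `collect_first_numbers` are assumptions, since the patent does not specify how document content is represented.

```python
import re

# Stage 1: candidate captions in a preset format at the start of a text block.
CAPTION_RE = re.compile(r"^(FIG\.|Table|Formula)\s*(\d+(?:[-.]\d+)*)")

# Hypothetical mapping from caption keyword to the block kind it should sit next to.
KIND_FOR = {"FIG.": "figure", "Table": "table", "Formula": "formula"}

def collect_first_numbers(blocks):
    """blocks: list of (kind, text) pairs. Return the set of confirmed labels."""
    numbers = set()
    for i, (kind, text) in enumerate(blocks):
        if kind != "text":
            continue
        m = CAPTION_RE.match(text)
        if not m:
            continue
        # Stage 2: keep the candidate only if an object of the matching
        # type is directly before or after it.
        wanted = KIND_FOR[m.group(1)]
        neighbours = blocks[max(i - 1, 0):i] + blocks[i + 1:i + 2]
        if any(k == wanted for k, _ in neighbours):
            numbers.add(f"{m.group(1)} {m.group(2)}")
    return numbers
```

In this sketch a line that merely starts with "FIG. 7" but has no figure next to it is discarded, which mirrors the second filtering step.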
When the target object is a figure, a table, a formula, or an annotation, its sequence number is usually located above, below, or to the right of the target object; it may also be located elsewhere relative to the target object, which is not limited by the embodiments of the present application.
It should be understood that, in this embodiment, when the target objects are of at least two types, the document to be detected may correspond to a single sequence number set that includes the first sequence numbers of all types of target objects in the document. Alternatively, the document to be detected may correspond to at least two sequence number sets, each of which includes the first sequence numbers of the target objects of one type in the document.
For example, suppose the document to be detected includes target objects of two types, figures and tables. Five figures are identified in the document, with first sequence numbers FIG. 1, FIG. 2, FIG. 3, FIG. 4, and FIG. 5; five tables are identified in the document, with first sequence numbers Table 1, Table 2, Table 3, Table 4, and Table 5. The sequence number set N corresponding to the document to be detected can then be expressed as:
N = {FIG. 1, FIG. 2, FIG. 3, FIG. 4, FIG. 5, Table 1, Table 2, Table 3, Table 4, Table 5}
Alternatively, the document to be detected corresponds to two sequence number sets, denoted sequence number set A and sequence number set B. Sequence number set A includes the first sequence numbers of all target objects of the figure type in the document, and sequence number set B includes the first sequence numbers of all target objects of the table type. Sequence number set A can be expressed as:
A = {FIG. 1, FIG. 2, FIG. 3, FIG. 4, FIG. 5}
Sequence number set B can be expressed as:
B = {Table 1, Table 2, Table 3, Table 4, Table 5}
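The two representations in this example, a single set N or one set per type, might be built as in the following sketch. The variable names and the `(type, number)` pair representation are assumptions made for illustration.

```python
from collections import defaultdict

# First sequence numbers as (type, number) pairs, mirroring the example:
# five figures and five tables.
labels = [("FIG.", str(i)) for i in range(1, 6)] + [("Table", str(i)) for i in range(1, 6)]

# Option 1: a single set N holding every label.
N = {f"{kind} {num}" for kind, num in labels}

# Option 2: one set per object type (sets A and B in the example).
by_type = defaultdict(set)
for kind, num in labels:
    by_type[kind].add(num)

A, B = by_type["FIG."], by_type["Table"]
```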
S102, identifying a second sequence number referenced in the text of the document to be detected.
The second sequence number takes the same forms as the first sequence number; see the description of the first sequence number above, which is not repeated here.
In one possible implementation, the text of the document to be detected is searched for keywords such as "FIG.", "Table", "Formula", or "Annotation". For each keyword found, it is then checked whether a number immediately follows the keyword; if so, the keyword together with the number is taken as a second sequence number referenced in the text. For example, five occurrences of the keyword "FIG." are first found in the text of the document to be detected; it is then checked, for each of the five occurrences, whether a number immediately follows, and the keywords and numbers are extracted to obtain "FIG. 1", "FIG. 2", "FIG. 3", "FIG. 4", and "FIG. 5".
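The keyword-then-number search can be sketched as below. This is a minimal illustration, assuming English keywords and purely numeric sequence numbers; `KEYWORDS`, `NUMBER_RE`, and `find_references` are hypothetical names.

```python
import re

KEYWORDS = ("FIG.", "Table", "Formula", "Annotation")  # assumed keyword list
NUMBER_RE = re.compile(r"\s*(\d+(?:[-.]\d+)*)")

def find_references(text):
    """Find each keyword, then keep it only if a number directly follows."""
    refs = []
    for kw in KEYWORDS:
        start = 0
        while (pos := text.find(kw, start)) != -1:
            end = pos + len(kw)
            # Look for a number at the position right after the keyword.
            m = NUMBER_RE.match(text, end)
            if m:
                refs.append(f"{kw} {m.group(1)}")
            start = end
    return refs
```

Results are grouped by keyword rather than by position in the text, which is enough for a membership check against the sequence number set.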
In another possible implementation, a phrase in a preset format is identified in the text of the document to be detected, the phrase including the second sequence number, and the second sequence number is extracted from the phrase in the preset format.
Still taking a Chinese document as an example, for phrases that include the sequence number of a figure, the phrase in the preset format may be, for example:
"as shown in Figure X", "it is shown in Figure X", "see Figure X", "(see Figure X)", "per Figure X", "in Figure X", "by Figure X", "from Figure X", "Figure X indicates", "known from Figure X", "enumerated by Figure X", and the like.
Here "Figure X" may also be replaced by any of the following: "Fig X", "Fig. X", etc. Full-width and half-width parentheses "()" are not distinguished.
The second sequence number extracted from the above preset formats is, for example, "FIG. X", "Figure X", etc.
For phrases that include the sequence number of a table, the phrase in the preset format may be, for example:
"Table X", "(Table X)", "as shown in Table X", "(as shown in Table X)", "it is shown in Table X", "(see Table X)", "per Table X", "in Table X", "by Table X", "from Table X", "Table X indicates", "known from Table X", "listed in Table X", and the like.
Here "Table X" may also be replaced by any of the following: "Tab X", "Tab.X", "TABLE X", "tab.X", etc. Full-width and half-width parentheses "()" are not distinguished.
For phrases that include the sequence number of a formula, the phrase in the preset format may be, for example:
"Formula X", "(Formula X)", "in Formula X", "Formula X is", "see Formula X", "it is shown in Formula X", "Formula X indicates", "from Formula X", and the like.
Full-width and half-width parentheses "()" are not distinguished.
The second sequence number extracted from the above preset formats is, for example, "Formula X", etc.
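Phrase-based extraction of this kind could be sketched with a small list of regular expressions. The patterns below cover only a few of the phrases listed above, and all names (`PAREN_MAP`, `LABEL`, `PHRASE_PATTERNS`, `extract_second_numbers`) are hypothetical; full-width parentheses are mapped to half-width first so that both forms match.

```python
import re

# Map full-width parentheses to half-width so both forms match the same pattern.
PAREN_MAP = str.maketrans({"（": "(", "）": ")"})

# A label: keyword (with the abbreviated variants) followed by a numeric sequence number.
LABEL = r"(?P<label>(?:Figure|Fig\.?|Table|Tab\.?|Formula)\s*\d+(?:[-.]\d+)*)"

# A few of the preset phrase formats (assumed subset, case-insensitive).
PHRASE_PATTERNS = [
    re.compile(rf"as shown in {LABEL}", re.IGNORECASE),
    re.compile(rf"\(see {LABEL}\)", re.IGNORECASE),
    re.compile(rf"\b(?:from|in|by|per) {LABEL}", re.IGNORECASE),
    re.compile(rf"{LABEL} (?:indicates|is)\b", re.IGNORECASE),
]

def extract_second_numbers(text):
    """Return the labels cited through any preset phrase, without duplicates."""
    text = text.translate(PAREN_MAP)
    found = []
    for pattern in PHRASE_PATTERNS:
        for m in pattern.finditer(text):
            label = m.group("label")
            if label not in found:
                found.append(label)
    return found
```

Because overlapping phrases such as "as shown in Figure X" and "in Figure X" match the same label, the sketch removes duplicates before returning.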
The embodiments of the application take a Chinese document to be detected as an example. When the document to be detected is in another language, such as English, Japanese, French, or German, the forms of the first and second sequence numbers and the way they are identified may be adapted to the language of the document; details are not repeated here.
S103, judging whether the second sequence number exists in the sequence number set.
When the second sequence number exists in the sequence number set, the reference is not dangling, and the current round of the flow ends.
When the second sequence number does not exist in the sequence number set, the reference is dangling, and step S104 is executed.
It should be understood that, when the target objects are of at least two types and the document to be detected corresponds to a single sequence number set that includes the first sequence numbers of all types of target objects in the document, it is judged whether the second sequence number exists in that sequence number set; if so, the current round of the flow ends, and if not, step S104 is executed.
Alternatively, when the target objects are of at least two types and the document to be detected corresponds to at least two sequence number sets, each of which includes the first sequence numbers of the target objects of one type in the document, it is judged whether the second sequence number exists in a first sequence number set, where the type of target object corresponding to the first sequence number set is the same as the type of target object denoted by the second sequence number, and the first sequence number set is one of the at least two sequence number sets.
Continuing with sequence number set A and sequence number set B from the example in step S101: if the second sequence number is "FIG. 1", it is of the figure type, the same type as sequence number set A (here, sequence number set A is the first sequence number set), so it is judged whether "FIG. 1" exists in sequence number set A. "FIG. 1" is in sequence number set A, so the reference is not dangling and the current round of the flow ends.
If instead the second sequence number is "Table 4-4", it is of the table type, the same type as sequence number set B (here, sequence number set B is the first sequence number set), so it is judged whether "Table 4-4" exists in sequence number set B. "Table 4-4" does not exist in sequence number set B, nor in the sequence number sets of other types, which indicates that the reference to "Table 4-4" is dangling, and step S104 is executed.
In this embodiment, when the target objects are of at least two types, the first sequence numbers are divided by target-object type into several sequence number sets, each storing the first sequence numbers of the target objects of one type, so it is only necessary to judge whether the second sequence number exists in the set whose type matches the type of target object denoted by the second sequence number. Dividing the sequence numbers into several smaller sets by type reduces the number of traversals in the judging process and can further improve detection efficiency.
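The type-matched lookup can be illustrated with the sets A and B from the example. This is a sketch only; `is_dangling` and `sets_by_type` are hypothetical names, and treating a label whose type has no set at all as dangling is an assumption of the sketch.

```python
def is_dangling(label, sets_by_type):
    """Return True if label (e.g. 'Table 4-4') is missing from the set of its own type."""
    kind, _, number = label.partition(" ")
    # Only the set whose type matches the cited label is consulted.
    return number not in sets_by_type.get(kind, set())

sets_by_type = {
    "FIG.": {"1", "2", "3", "4", "5"},    # sequence number set A
    "Table": {"1", "2", "3", "4", "5"},   # sequence number set B
}
```

With these sets, "FIG. 1" is found in set A, while "Table 4-4" is absent from set B and would be passed on to step S104.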
S104, if the second sequence number does not exist in the sequence number set, pushing prompt information, wherein the prompt information is used for indicating that the reference to the second sequence number is dangling.
Continuing with the example in step S103, when it is determined that "Table 4-4" does not exist in sequence number set B, prompt information is pushed; it may be presented to the user, for example, in a pop-up window. Optionally, in some embodiments, after all the second sequence numbers in the text of the document to be detected have been judged, the prompt information may be pushed as an Extensible Markup Language (XML) report that lists every second sequence number in the document whose reference is dangling. The content of the prompt information may be, for example, as shown in Table 1.
Table 1. Example of dangling-reference prompt information
No.   Result      Details                                               Category
1     Table 4-4   As can be seen from Table 4-4, …                      Dangling reference
2     FIG. 2-1    Both types of distributions are shown in FIG. 2-1.    Dangling reference
3     FIG. 3-4    As shown in FIG. 3-4, according to …                  Dangling reference
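An XML report carrying the columns of Table 1 might be produced as in the sketch below, using Python's standard `xml.etree.ElementTree`. The patent does not define a schema, so the element and attribute names (`report`, `item`, `number`, `result`, `detail`, `category`) and the function `build_report` are assumptions.

```python
import xml.etree.ElementTree as ET

def build_report(findings):
    """findings: list of (label, context_sentence) pairs for dangling references.

    Element and attribute names are hypothetical; no schema is given in the source.
    """
    root = ET.Element("report")
    for index, (label, context) in enumerate(findings, start=1):
        item = ET.SubElement(root, "item", number=str(index))
        ET.SubElement(item, "result").text = label
        ET.SubElement(item, "detail").text = context
        ET.SubElement(item, "category").text = "dangling reference"
    # encoding="unicode" makes tostring return a str rather than bytes.
    return ET.tostring(root, encoding="unicode")
```

Each `item` corresponds to one row of Table 1, so a reviewer can locate the sentence that contains the dangling reference.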
In connection with steps S101 to S104, description will be given by taking, as an example, detection of a reference void in the editing process of documents such as books, journal articles, and the like. Assuming that the document includes a graph and that the text refers to the sequence number of the graph, when an existing method is adopted to check whether the reference falls out, the detection is mainly carried out by the experience and the care of personnel, and the detection is easy to make mistakes and is not efficient. In the application, the graph in the document can be firstly identified to obtain a sequence number set comprising all the graph sequences, then the graph sequences referenced in the text of the document are identified, when the graph sequence referenced in a certain part does not exist in the sequence number set, the problem of reference blanking is indicated, and further, the prompt information for indicating that the graph sequence references blanking is pushed. After the user obtains the indication information, the problem can be quickly positioned and corrected, namely, the reference blank in the document to be detected can not be missed, and the detection accuracy and the detection efficiency are improved.
According to the reference blanking detection method provided by the embodiment of the application, firstly, the sequence set corresponding to the document to be detected is obtained, secondly, the second sequence referenced in the text of the document to be detected is identified, and whether the second sequence exists in the sequence set is judged. If the reference exists, the description reference is correct, and the problem of reference falling is avoided; if the second sequence number is not found, indicating that the reference is found out, and pushing the prompt information of the second sequence number reference is found out so as to remind the user that the second sequence number has the reference is found out. Compared with the prior art which relies on manual detection by a user, the reference falling space detection method provided by the embodiment of the application can improve the detection accuracy and the detection efficiency of the reference falling space.
In addition, the above examples are described for the case in which both the target object and a reference to it exist in the document to be detected. It should be understood that, when the text of the document to be detected contains no reference, so that no second sequence number is obtained by the method of step S102, the electronic device may push prompt information indicating that the document has no reference blanking.
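The three possible outcomes (no reference in the text, all references resolved, at least one empty reference) can be sketched as follows. The function name build_prompt and the wording of the messages are assumptions made for illustration; the description does not specify the content of the prompt information.

```python
def build_prompt(sequence_number_set, second_sequence_numbers):
    """Build the prompt text for the sequence numbers referenced in the text."""
    if not second_sequence_numbers:
        # No second sequence number was obtained: nothing can be empty.
        return "No reference blanking: the text contains no references."
    missing = sorted(set(second_sequence_numbers) - set(sequence_number_set))
    if not missing:
        return "No reference blanking: every reference resolves."
    numbers = ", ".join(str(n) for n in missing)
    return "Reference blanking for sequence number(s): " + numbers


print(build_prompt({1, 2}, []))
print(build_prompt({1, 2}, [2, 1]))
print(build_prompt({1, 2}, [3, 2, 5]))
```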
Those of ordinary skill in the art will appreciate that all or part of the steps of the method embodiments described above may be implemented by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the method embodiments described above. The storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks, or optical disks.
Fig. 3 is a schematic structural diagram of a reference blanking detection apparatus according to an embodiment of the present application. As shown in Fig. 3, the apparatus may include a first processing module 11, a second processing module 12, a judging module 13, and a pushing module 14, wherein:
the first processing module 11 is configured to obtain a sequence number set corresponding to a document to be detected, where the sequence number set includes a first sequence number of a target object in the document to be detected;
the second processing module 12 is configured to identify a second sequence number referenced in the text of the document to be detected;
a judging module 13, configured to judge whether the second sequence number exists in the sequence number set;
the pushing module 14 is configured to push a prompt message when the second sequence number does not exist in the sequence number set, where the prompt message is used to indicate that the second sequence number reference is empty.
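The division into four modules can be mirrored in software as one method per module. The following Python sketch is only one possible arrangement: the Document data class, the class and method names, the "Fig. N" reference format, and the notify callback are hypothetical and are not taken from the embodiment.

```python
import re
from dataclasses import dataclass


@dataclass
class Document:
    graph_numbers: list  # first sequence numbers of the graphs in the document
    text: str            # body text that may reference the graphs


class ReferenceBlankingDetector:
    def __init__(self, notify=print):
        # notify is an assumed callback that stands in for pushing prompts.
        self.notify = notify

    def acquire_sequence_number_set(self, document):
        """Role of the first processing module 11."""
        return set(document.graph_numbers)

    def identify_second_sequence_numbers(self, document):
        """Role of the second processing module 12 (assumed format 'Fig. N')."""
        return [int(n) for n in re.findall(r"Fig\.\s*(\d+)", document.text)]

    def exists(self, number, sequence_number_set):
        """Role of the judging module 13."""
        return number in sequence_number_set

    def push_prompt(self, number):
        """Role of the pushing module 14."""
        self.notify(f"The reference to sequence number {number} is empty")

    def run(self, document):
        sequence_number_set = self.acquire_sequence_number_set(document)
        blank = []
        for number in self.identify_second_sequence_numbers(document):
            if not self.exists(number, sequence_number_set):
                self.push_prompt(number)
                blank.append(number)
        return blank


detector = ReferenceBlankingDetector()
detector.run(Document(graph_numbers=[1, 2], text="See Fig. 1 and Fig. 4."))
```

Passing a different notify callback (for example, a function that writes to a user interface) changes how the prompt is delivered without touching the detection logic.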
Optionally, in some embodiments, the target object comprises at least one of the following types of objects: graph, table, formula.
Optionally, in some embodiments, the target object includes at least two types, the document to be detected corresponds to at least two sequence number sets, and each sequence number set includes the first sequence numbers of the target objects of one type in the document to be detected.
Optionally, in some embodiments, the first processing module 11 is specifically configured to identify, in the text of the document to be detected, a phrase in a preset format that includes the second sequence number, and to extract the second sequence number from the phrase.
Optionally, in some embodiments, the judging module 13 is specifically configured to judge whether the second sequence number exists in a first sequence number set, where the type of the target object corresponding to the first sequence number set is the same as the type of the target object represented by the second sequence number, and the first sequence number set is any one of the at least two sequence number sets.
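The optional embodiments above (one sequence number set per object type, phrases in a preset format, and a type-matched lookup) can be sketched together in Python. The preset formats "Fig. N", "Table N" and "Formula (N)", the dictionary layout, and the function name detect_by_type are assumptions chosen for illustration; the description does not fix the concrete formats.

```python
import re

# Assumed preset phrase formats, one per target object type.
PRESET_FORMATS = {
    "graph": re.compile(r"\bFig\.\s*(\d+)"),
    "table": re.compile(r"\bTable\s+(\d+)"),
    "formula": re.compile(r"\bFormula\s*\((\d+)\)"),
}


def detect_by_type(sequence_number_sets, text):
    """Return (type, number) pairs whose reference is empty.

    sequence_number_sets maps each object type to the set of first
    sequence numbers of the objects of that type in the document.
    """
    blank = []
    for object_type, pattern in PRESET_FORMATS.items():
        # Only the set of the same type as the reference is consulted.
        first_set = sequence_number_sets.get(object_type, set())
        for match in pattern.finditer(text):
            number = int(match.group(1))
            if number not in first_set:
                blank.append((object_type, number))
    return blank


sets = {"graph": {1, 2}, "table": {1}, "formula": {1, 2, 3}}
text = "Fig. 2 and Table 2 follow from Formula (3); compare Fig. 3."
print(detect_by_type(sets, text))
```

Note that "Table 2" is reported even though the number 2 exists in the graph set, because each reference is checked only against the set of its own type.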
The reference blanking detection device provided by the embodiment of the application can execute the actions of the electronic equipment in the embodiment of the method, and the implementation principle and the technical effect are similar, and are not repeated here.
Fig. 4 is a schematic structural diagram of another reference blanking detection apparatus according to an embodiment of the present application, as shown in fig. 4, where the apparatus includes: a memory 301 and at least one processor 302.
Memory 301 for storing program instructions.
The processor 302 is configured to implement the reference blanking detection method of the embodiments of the present application when executing the program instructions. For the specific implementation principle, refer to the above embodiments, which are not repeated here.
The reference blanking detection apparatus may further comprise an input/output interface 303.
The input/output interface 303 may include a separate output interface and a separate input interface, or may be a single interface that integrates input and output. The output interface is used for outputting data, and the input interface is used for acquiring input data. The output data is a general term for the outputs in the method embodiments, and the input data is a general term for the inputs in the method embodiments.
The present application also provides a readable storage medium in which execution instructions are stored. When the execution instructions are executed by at least one processor of the reference blanking detection apparatus, the reference blanking detection method in the above embodiments is implemented.
The present application also provides a program product comprising execution instructions stored in a readable storage medium. At least one processor of the reference blanking detection apparatus may read the execution instructions from the readable storage medium, and execution of the instructions by the at least one processor causes the reference blanking detection apparatus to implement the reference blanking detection method provided by the various embodiments described above.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the application.

Claims (10)

1. A reference blanking detection method, the method comprising:
acquiring a sequence number set corresponding to a document to be detected, wherein the sequence number set comprises a first sequence number of a target object in the document to be detected, and the target object comprises at least one of the following types of objects: graph, table, formula;
identifying a second sequence number referenced in the text of the document to be detected;
judging whether the second sequence number exists in the sequence number set or not;
and if the second sequence number does not exist in the sequence number set, pushing prompt information, wherein the prompt information is used for indicating that the second sequence number reference is empty.
2. The method according to claim 1, wherein the type of the target object comprises at least two items, the sequence number sets corresponding to the document to be detected are at least two sequence number sets, and each sequence number set comprises a first sequence number of the target object of the same type in the document to be detected.
3. The method according to claim 1, wherein said identifying the second sequence number referenced in the text of the document to be detected specifically comprises:
identifying phrases in a preset format in the text of the document to be detected, wherein the phrases comprise the second sequence number;
and extracting the second sequence number from the phrase comprising the preset format.
4. The method of claim 2, wherein said determining whether said second sequence number is present in said sequence number set comprises:
judging whether the second sequence number exists in a first sequence number set, wherein the type of a target object corresponding to the first sequence number set is the same as the type of the target object represented by the second sequence number, and the first sequence number set is any one of the at least two sequence number sets.
5. A reference blanking detection apparatus, the apparatus comprising:
the first processing module is used for acquiring a sequence number set corresponding to a document to be detected, wherein the sequence number set comprises a first sequence number of a target object in the document to be detected, and the target object comprises at least one of the following types of objects: graph, table, formula;
the second processing module is used for identifying a second sequence number referenced in the text of the document to be detected;
the judging module is used for judging whether the second sequence number exists in the sequence number set or not;
and the pushing module is used for pushing prompt information when the second sequence number does not exist in the sequence number set, and the prompt information is used for indicating that the second sequence number reference is empty.
6. The apparatus of claim 5, wherein the type of the target object comprises at least two items, the sequence number sets corresponding to the document to be detected are at least two sequence number sets, and each sequence number set comprises a first sequence number of the target object of the same type in the document to be detected.
7. The apparatus of claim 5, wherein the first processing module is specifically configured to identify a phrase in a preset format in the text of the document to be detected, the phrase comprising the second sequence number, and to extract the second sequence number from the phrase in the preset format.
8. The apparatus of claim 6, wherein the determining module is specifically configured to determine whether the second sequence number exists in a first sequence number set, a type of a target object corresponding to the first sequence number set is the same as a type of a target object represented by the second sequence number, and the first sequence number set is any one of the at least two sequence number sets.
9. A reference blanking detection apparatus, comprising: at least one processor and memory;
the memory stores computer-executable instructions;
the at least one processor executing computer-executable instructions stored in the memory to cause the apparatus to perform the method of any one of claims 1-4.
10. A computer readable storage medium having stored thereon computer executable instructions which, when executed by a processor, implement the method of any of claims 1-4.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911298605.2A CN110990593B (en) 2019-12-17 2019-12-17 Citation falling empty detection method and device

Publications (2)

Publication Number Publication Date
CN110990593A CN110990593A (en) 2020-04-10
CN110990593B true CN110990593B (en) 2023-09-19

Family

ID=70094334

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911298605.2A Active CN110990593B (en) 2019-12-17 2019-12-17 Citation falling empty detection method and device

Country Status (1)

Country Link
CN (1) CN110990593B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113505570B (en) * 2021-05-25 2024-04-12 北京北大方正电子有限公司 Reference is made to empty checking method, device, equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20010039489A (en) * 1999-10-01 2001-05-15 박종섭 Apparatus for storing information and apparatus and method for checking the validity of the data link using the apparatus in computer systems
CN107479910A (en) * 2017-07-07 2017-12-15 广州视源电子科技股份有限公司 Document restorative procedure, system, readable storage medium storing program for executing and computer equipment
CN109428741A (en) * 2017-08-22 2019-03-05 中兴通讯股份有限公司 A kind of detection method and device of network failure
CN109670092A (en) * 2019-01-07 2019-04-23 北京仁和汇智信息技术有限公司 XML document proofreading method and device
CN110309501A (en) * 2018-03-27 2019-10-08 北大方正集团有限公司 Cross reference method and apparatus

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8019769B2 (en) * 2008-01-18 2011-09-13 Litera Corp. System and method for determining valid citation patterns in electronic documents
US8266163B2 (en) * 2008-02-26 2012-09-11 International Business Machines Corporation Utilizing reference/ID linking in XML wrapper code generation
US9495334B2 (en) * 2012-02-01 2016-11-15 Adobe Systems Incorporated Visualizing content referenced in an electronic document

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
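Research on multi-oriented text detection and recognition methods in natural scenes; Sun Xu; China Master's Theses Full-text Database, Information Science and Technology; full text *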

Also Published As

Publication number Publication date
CN110990593A (en) 2020-04-10

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20230630

Address after: 3007, Hengqin International Financial Center Building, No. 58 Huajin Street, Hengqin New District, Zhuhai City, Guangdong Province, 519030

Applicant after: New founder holdings development Co.,Ltd.

Applicant after: BEIJING FOUNDER ELECTRONICS Co.,Ltd.

Address before: 100871, Beijing, Haidian District, Cheng Fu Road, No. 298, Zhongguancun Fangzheng building, 9 floor

Applicant before: PEKING UNIVERSITY FOUNDER GROUP Co.,Ltd.

Applicant before: BEIJING FOUNDER ELECTRONICS Co.,Ltd.

GR01 Patent grant