WO2021042542A1 - Table of contents storage method and apparatus, computer device and storage medium - Google Patents

Table of contents storage method and apparatus, computer device and storage medium

Info

Publication number
WO2021042542A1
Authority
WO
WIPO (PCT)
Prior art keywords
title
target text
positions
tree structure
headings
Prior art date
Application number
PCT/CN2019/117749
Other languages
French (fr)
Chinese (zh)
Inventor
苏智辉
侯丽
佘昊天
Original Assignee
平安科技(深圳)有限公司
Application filed by 平安科技(深圳)有限公司
Publication of WO2021042542A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G06F16/322 Trees
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • This application relates to the field of data processing technology, and in particular to a directory storage method, device, computer equipment, and storage medium.
  • POI: Poor Obfuscation Implementation
  • API: Application Programming Interface
  • Java programs can easily operate on Microsoft Office format files. For example, the full heading structure of a Word document can be extracted through POI, but the heading hierarchy extracted by POI cannot be saved directly in the Java program, which restricts development.
  • the embodiments of the present application provide a directory storage method, device, computer equipment and storage medium, aiming to solve the problem that the directory of Word text cannot be directly saved in the java program.
  • an embodiment of the present application provides a directory storage method, which includes: reading a target text and recording the positions of all paragraphs in the target text, wherein the target text includes headings of all levels and each heading is a paragraph; obtaining the heading names of the headings of all levels in the target text according to a preset format; obtaining the positions of the paragraphs corresponding to the headings of all levels, and determining the start and end positions of the headings of all levels according to the positions of the paragraphs corresponding to adjacent headings of the same level; and creating a tree structure object, and storing the heading names of the headings of all levels and the corresponding start and end positions in the nodes of the tree structure object to form a directory of the target text, wherein the tree structure object includes multiple nodes.
  • an embodiment of the present application also provides a directory storage device, which includes: a recording unit, used to read a target text and record the positions of all paragraphs in the target text, wherein the target text includes headings of all levels and each heading is a paragraph; an acquiring unit, used to obtain the heading names of the headings of all levels in the target text according to a preset format; a determining unit, used to obtain the positions of the paragraphs corresponding to the headings of all levels and to determine the start and end positions of the headings of all levels according to the positions of the paragraphs corresponding to adjacent headings of the same level; and a storage unit, used to create a tree structure object and to store the heading names of the headings of all levels and the corresponding start and end positions in the nodes of the tree structure object to form a directory of the target text, wherein the tree structure object includes multiple nodes.
  • an embodiment of the present application also provides a computer device, which includes a memory and a processor connected to the memory; the memory is used to store a computer program; the processor is used to run the computer program stored in the memory to perform the following steps: reading a target text and recording the positions of all paragraphs in the target text, wherein the target text includes headings of all levels and each heading is a paragraph; obtaining the heading names of the headings of all levels in the target text according to a preset format; obtaining the positions of the paragraphs corresponding to the headings of all levels, and determining the start and end positions of the headings of all levels according to the positions of the paragraphs corresponding to adjacent headings of the same level; and creating a tree structure object, and storing the heading names of the headings of all levels and the corresponding start and end positions in the nodes of the tree structure object to form a directory of the target text, wherein the tree structure object includes multiple nodes.
  • the embodiments of the present application also provide a computer-readable storage medium that stores a computer program; when the computer program is executed by a processor, the processor performs the following steps: reading a target text and recording the positions of all paragraphs in the target text, wherein the target text includes headings of all levels and each heading is a paragraph; obtaining the heading names of the headings of all levels in the target text according to a preset format; obtaining the positions of the paragraphs corresponding to the headings of all levels, and determining the start and end positions of the headings of all levels according to the positions of the paragraphs corresponding to adjacent headings of the same level; and creating a tree structure object, and storing the heading names of the headings of all levels and the corresponding start and end positions in the nodes of the tree structure object to form a directory of the target text, wherein the tree structure object includes multiple nodes.
  • FIG. 1 is a schematic flowchart of a directory storage method provided by an embodiment of the application
  • FIG. 2 is a schematic diagram of a sub-flow of a directory storage method provided by an embodiment of the application
  • FIG. 3 is a schematic diagram of a sub-flow of a directory storage method provided by an embodiment of this application.
  • FIG. 4 is a schematic diagram of a sub-flow of a directory storage method provided by an embodiment of the application.
  • FIG. 5 is a schematic diagram of a tree structure object of a directory storage method provided by an embodiment of the application.
  • FIG. 6 is a schematic flowchart of a directory storage method provided by another embodiment of this application.
  • FIG. 7 is a schematic block diagram of a directory storage device provided by an embodiment of the application.
  • FIG. 8 is a schematic block diagram of specific units of a directory storage device provided by an embodiment of the application.
  • FIG. 9 is a schematic block diagram of a directory storage device provided by another embodiment of the application.
  • FIG. 10 is a schematic block diagram of a computer device provided by an embodiment of the application.
  • FIG. 1 is a schematic flowchart of a directory storage method according to an embodiment of the application.
  • the directory storage method is applied to the terminal.
  • FIG. 1 is a schematic flowchart of a directory storage method provided by an embodiment of the present application. As shown in the figure, the method includes the following steps: S110-S140.
  • POI is a tool used by java to process documents in Microsoft Office format.
  • the target text refers to text in Word format. The target text contains multiple paragraphs and includes headings of various levels; each heading corresponds to a paragraph, that is, a heading is a paragraph. Specifically, the target text is read through POI, and each paragraph is marked as it is read, so as to record the position of each paragraph.
  • the step S110 may include steps S111-S112.
  • S111 Read the target text and divide the target text into paragraphs.
  • S112 Mark each paragraph in the target text by a serial number to determine the position of each paragraph.
  • the target text is read and divided into multiple paragraphs, which are read one by one from the top of the text to the bottom. As each paragraph is read, a mark is added to it; the mark is a serial number, specifically a sequence of Arabic numerals starting from 0.
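
The marking step can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class and method names (`ParagraphNumbering`, `NumberedParagraph`, `number`) are hypothetical, and the paragraph texts are taken as an already extracted list of strings. With Apache POI, such a list could for instance be built from `XWPFDocument.getParagraphs()` and `XWPFParagraph.getText()` for a .docx file, but that part is left out so the sketch runs with the standard library alone.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ParagraphNumbering {
    /** A paragraph paired with its serial number, i.e. its position in the text. */
    static final class NumberedParagraph {
        final int serial;
        final String text;

        NumberedParagraph(int serial, String text) {
            this.serial = serial;
            this.text = text;
        }
    }

    /** Marks paragraphs in top-to-bottom order with serial numbers starting from 0. */
    static List<NumberedParagraph> number(List<String> paragraphs) {
        List<NumberedParagraph> numbered = new ArrayList<>();
        for (int serial = 0; serial < paragraphs.size(); serial++) {
            numbered.add(new NumberedParagraph(serial, paragraphs.get(serial)));
        }
        return numbered;
    }

    public static void main(String[] args) {
        // Hypothetical paragraph texts; in practice they would come from the Word file.
        List<String> paragraphs =
                Arrays.asList("Chapter 1, Overview", "Some body text.", "Chapter 2, Details");
        for (NumberedParagraph p : number(paragraphs)) {
            System.out.println(p.serial + ": " + p.text);
        }
    }
}
```
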
  • S120 Acquire title names of titles at all levels in the target text according to a preset format.
  • the target text includes multiple levels of headings, for example, a first-level heading, a second-level heading, and a third-level heading, and there are multiple titles of the same level for each level of heading.
  • the preset format specifically refers to the format of the headings at each level, and each level can have several preset formats. The preset formats are set according to the fixed heading formats of Word text and commonly used heading formats. For example, a first-level heading is "The first chapter," or "one,"; a second-level heading is "first section," or "(one),"; and a third-level heading is "first section," or "(1),".
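  • the target text is traversed once for each heading level. First, it is traversed according to the preset format of the first-level heading to find the headings that conform to the first-level heading format, thereby obtaining the names of the first-level headings; next, it is traversed according to the preset format of the second-level heading to find the headings that conform to the second-level heading format, thereby obtaining the names of the second-level headings; finally, it is traversed according to the preset format of the third-level heading to find the headings that conform to the third-level heading format, thereby obtaining the names of the third-level headings.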
  • some texts may only have first-level headings and second-level headings. If there are no third-level headings, and no headings that conform to the third-level heading format can be found, then this step is ended and the next step is performed.
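
One way to picture this level-by-level traversal is with one regular expression per heading level. The sketch below is an assumption-laden illustration, not the formats of the application: the three patterns ("Chapter N,", "Section N,", "(N),") and the names `HeadingMatcher`, `Heading` and `findHeadings` are hypothetical. A level with no matching paragraph simply contributes nothing, which mirrors the case of a text without third-level headings.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

class HeadingMatcher {
    /** Hypothetical preset formats, one per heading level (index 0 is the first level). */
    static final List<Pattern> PRESET_FORMATS = Arrays.asList(
            Pattern.compile("Chapter \\d+,"),
            Pattern.compile("Section \\d+,"),
            Pattern.compile("\\(\\d+\\),"));

    /** A heading found in the text: its level (1-based), paragraph serial number and name. */
    static final class Heading {
        final int level;
        final int serial;
        final String name;

        Heading(int level, int serial, String name) {
            this.level = level;
            this.serial = serial;
            this.name = name;
        }
    }

    /** Traverses the paragraphs once per level and keeps those that start with that level's format. */
    static List<Heading> findHeadings(List<String> paragraphs) {
        List<Heading> headings = new ArrayList<>();
        for (int level = 1; level <= PRESET_FORMATS.size(); level++) {
            Pattern format = PRESET_FORMATS.get(level - 1);
            for (int serial = 0; serial < paragraphs.size(); serial++) {
                String text = paragraphs.get(serial);
                // lookingAt() anchors the match at the beginning of the paragraph.
                if (format.matcher(text).lookingAt()) {
                    headings.add(new Heading(level, serial, text));
                }
            }
        }
        return headings;
    }

    public static void main(String[] args) {
        List<String> paragraphs = Arrays.asList(
                "Chapter 1, Intro", "Body text.", "Section 1, Scope", "(1), Detail", "Chapter 2, End");
        for (Heading h : findHeadings(paragraphs)) {
            System.out.println("level " + h.level + " at " + h.serial + ": " + h.name);
        }
    }
}
```
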
  • S130 Obtain the positions of the corresponding paragraphs of the headings of each level, and determine the start and end positions of the headings of each level according to the positions of the adjacent paragraphs corresponding to the headings of the same level.
  • first, the serial number of the paragraph corresponding to each heading is obtained and used as that heading's start position. Then two adjacent headings of the same level are selected: the end position of the earlier heading equals the serial number of the later heading minus one, which gives the start and end positions of the earlier heading. Proceeding in the same way over the serial numbers of all adjacent headings of the same level yields the start and end positions of the headings at all levels.
  • the step S130 may include the steps S131-S132.
  • S131 Obtain the serial numbers of the paragraphs corresponding to all headings of the same level.
  • S132 Calculate the start and end positions of each title at the same level by using a preset formula according to the serial number of the corresponding paragraph of the adjacent title at the same level.
  • for example, the serial numbers of the first-level headings are 1, 35 and 60, the serial numbers of the second-level headings are 5 and 20, and the serial numbers of the third-level headings are 10, 15, 25 and 30.
  • the serial number of the paragraph corresponding to each heading is used as the start position of that heading. Then two adjacent headings of the same level are selected; the serial number of the earlier heading is denoted X and the serial number of the later heading is denoted P. According to the preset formula Y = P - 1, where Y is the end position of the earlier heading, the end position of the earlier heading is obtained, so its start and end positions are (X, Y). For example, if the serial numbers of the first-level headings are 1, 35 and 60, the adjacent headings with serial numbers 1 and 35 are selected: X = 1 and P = 35, so Y = 35 - 1 = 34 and the start and end positions of the first heading are (1, 34). The start and end positions of the other headings at all levels are obtained in the same way.
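
The calculation can be checked with a short Java sketch. The names `HeadingRanges` and `startEnd` are hypothetical. The text only defines the end position of a heading that has a following heading of the same level, so the `fallbackEnd` parameter used for the last heading is an assumption of this sketch (for example the serial number of the last paragraph of the text).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class HeadingRanges {
    /**
     * Computes (X, Y) for each heading of one level: X is the heading's own serial number
     * and Y = P - 1, where P is the serial number of the next heading of the same level.
     */
    static List<int[]> startEnd(List<Integer> serials, int fallbackEnd) {
        List<int[]> ranges = new ArrayList<>();
        for (int i = 0; i < serials.size(); i++) {
            int x = serials.get(i);
            // Assumption: the last heading has no next sibling, so it ends at fallbackEnd.
            int y = (i + 1 < serials.size()) ? serials.get(i + 1) - 1 : fallbackEnd;
            ranges.add(new int[] {x, y});
        }
        return ranges;
    }

    public static void main(String[] args) {
        // First-level serial numbers from the example; 80 is a hypothetical last paragraph.
        for (int[] r : startEnd(Arrays.asList(1, 35, 60), 80)) {
            System.out.println("(" + r[0] + ", " + r[1] + ")");
        }
    }
}
```

With the first-level serial numbers 1, 35 and 60, the first two ranges are (1, 34) and (35, 59), matching the worked example. For lower levels, passing the parent heading's end position as `fallbackEnd` is one possible way to keep a range inside its parent; the text does not specify this.
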
  • S140 Create a tree structure object, and store the title names of each level of title and the corresponding start and end positions in the nodes of the tree structure object to form a directory of the target text, wherein the tree structure object includes multiple nodes.
  • a tree structure object is created in a java program, and the tree structure object is used to store the title names and corresponding positions of the titles at all levels in the target text.
  • the tree structure object includes multiple nodes, and each node has a unique parent node and multiple child nodes. Through this feature of the tree structure object, the hierarchical structure of the title directory in the target text can be stored.
  • the step S140 may include the steps S141-S143.
  • S141 Store the file name of the target text in the root node of the tree structure object.
  • S142 Store the title name of the first-level title and the corresponding start and end positions in the parent node of the tree structure object.
  • S143 Store the title name of the secondary title and the corresponding start and end positions in the child nodes of the tree structure object.
  • the tree structure object includes a root node, parent nodes, and child nodes. The root node stores the file name of the target text so that the target text can be located. The root node is connected to multiple parent nodes; the heading name and the corresponding start and end positions of each first-level heading are stored in a parent node, and the number of parent nodes equals the number of first-level headings. Each parent node is connected to multiple child nodes; the heading name and the corresponding start and end positions of each second-level heading are stored in a child node, and the number of child nodes equals the number of second-level headings. If the target text has third-level headings, grandchild nodes are added: each child node is connected to multiple grandchild nodes, the heading name and the corresponding start and end positions of each third-level heading are stored in a grandchild node, and the number of grandchild nodes equals the number of third-level headings. In other words, there are as many node levels as there are heading levels, for example as shown in FIG. 5.
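
A tree of this shape might be modeled in Java roughly as follows. This is a minimal sketch, not the structure used in the application: the names `DirectoryTree`, `Node`, `add` and `print` are hypothetical, the file name and positions in `main` are illustrative values, and using -1 for the positions of the root node is a convention assumed here because the root only carries the file name.

```java
import java.util.ArrayList;
import java.util.List;

class DirectoryTree {
    /** One node of the tree: a name plus the start and end positions it covers. */
    static final class Node {
        final String name;
        final int start;
        final int end;
        final List<Node> children = new ArrayList<>();

        Node(String name, int start, int end) {
            this.name = name;
            this.start = start;
            this.end = end;
        }

        /** Adds a child node and returns it so that deeper levels can be attached to it. */
        Node add(String childName, int childStart, int childEnd) {
            Node child = new Node(childName, childStart, childEnd);
            children.add(child);
            return child;
        }
    }

    /** Appends the tree to out, one node per line, indented two spaces per level. */
    static void print(Node node, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(node.name).append(" (").append(node.start).append(", ")
                .append(node.end).append(")\n");
        for (Node child : node.children) {
            print(child, depth + 1, out);
        }
    }

    public static void main(String[] args) {
        // Root node: stores the file name; -1 means "no position" (assumed convention).
        Node root = new Node("report.docx", -1, -1);
        Node chapter = root.add("Chapter 1", 1, 34);      // parent node (first-level heading)
        Node section = chapter.add("Section 1", 5, 19);   // child node (second-level heading)
        section.add("(1)", 10, 14);                       // grandchild node (third-level heading)
        root.add("Chapter 2", 35, 59);

        StringBuilder out = new StringBuilder();
        print(root, 0, out);
        System.out.print(out);
    }
}
```

Because every node holds a list of children of the same type, adding a further heading level needs no new class, only one more call to `add` on the deeper node.
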
  • steps S150-S160 are further included.
  • S150 If a text extraction instruction for a target title is received, query the tree structure object according to the target title to obtain the start and end positions of the target title.
  • S160 Extract the text of the target title from the target text according to the start and end positions of the target title.
  • once the directory headings are stored in the Java program, further development becomes possible, such as the extraction of text content. To extract the text content of a certain passage, only the heading name field corresponding to that content needs to be provided. The tree structure object is traversed with the heading name field of the target heading to look for the same heading name. If the same heading name is found in a node of the tree structure object, the start and end positions stored in that node are obtained; these are the start and end positions of the target heading. The target text is then opened according to the file name stored in the root node of the tree structure object. According to the start and end positions of the target heading, the paragraph at the start position and the paragraph at the end position are first located in the target text, and the content of all paragraphs between the two positions is then extracted as the text of the target heading, thereby realizing the extraction of the text content.
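
The query and extraction steps could look like the following Java sketch. The names `TextExtractor`, `Node`, `find` and `extract` are hypothetical, the paragraphs are assumed to be available as a list indexed by serial number, and the extracted range is taken to include both the start and the end paragraph, which is an interpretation of "between the start and end positions" rather than something the text states.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class TextExtractor {
    /** Minimal tree node: heading name, start and end positions, and child nodes. */
    static final class Node {
        final String name;
        final int start;
        final int end;
        final List<Node> children = new ArrayList<>();

        Node(String name, int start, int end) {
            this.name = name;
            this.start = start;
            this.end = end;
        }
    }

    /** Depth-first traversal; returns the first node whose name equals title, or null. */
    static Node find(Node node, String title) {
        if (node.name.equals(title)) {
            return node;
        }
        for (Node child : node.children) {
            Node hit = find(child, title);
            if (hit != null) {
                return hit;
            }
        }
        return null;
    }

    /** Returns the paragraphs from the start to the end position, both inclusive (assumed). */
    static List<String> extract(List<String> paragraphs, Node heading) {
        return new ArrayList<>(paragraphs.subList(heading.start, heading.end + 1));
    }

    public static void main(String[] args) {
        List<String> paragraphs = Arrays.asList(
                "Chapter 1, A", "first body", "second body", "Chapter 2, B", "third body");
        Node root = new Node("report.docx", -1, -1);
        root.children.add(new Node("Chapter 1, A", 0, 2));
        root.children.add(new Node("Chapter 2, B", 3, 4));

        Node target = find(root, "Chapter 2, B");
        if (target != null) {
            System.out.println(extract(paragraphs, target));
        }
    }
}
```
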
  • the embodiment of the present application provides a directory storage method: reading a target text and recording the positions of all paragraphs in the target text, wherein the target text includes headings of all levels and each heading is a paragraph; obtaining the heading names of the headings of all levels in the target text according to a preset format; obtaining the positions of the paragraphs corresponding to the headings of all levels, and determining the start and end positions of the headings of all levels according to the positions of the paragraphs corresponding to adjacent headings of the same level; and creating a tree structure object, and storing the heading names of the headings of all levels and the corresponding start and end positions in the nodes of the tree structure object to form a directory of the target text, wherein the tree structure object includes multiple nodes. In this way, the directory of a Word text can be stored in a Java program.
  • FIG. 7 is a schematic block diagram of a directory storage device 200 provided by an embodiment of the present application. As shown in FIG. 7, corresponding to the above directory storage method, the present application also provides a directory storage device 200.
  • the directory storage device 200 includes a unit for executing the above-mentioned directory storage method, and the device can be configured in a desktop computer, a tablet computer, a laptop computer, and other terminals.
  • the directory storage device 200 includes: a recording unit 210, an acquiring unit 220, a determining unit 230, and a storage unit 240.
  • the recording unit 210 is configured to read a target text and record the positions of all paragraphs in the target text, wherein the target text includes headings of all levels, and each heading is a paragraph.
  • the recording unit 210 includes: a reading unit 211 and a marking unit 212.
  • the reading unit 211 is configured to read the target text and divide the target text into paragraphs.
  • the marking unit 212 is used to mark each paragraph in the target text by a serial number to determine the position of each paragraph.
  • the obtaining unit 220 is configured to obtain the title names of various levels of titles in the target text according to a preset format.
  • the determining unit 230 is configured to obtain the positions of the corresponding paragraphs of the headings of each level and determine the starting and ending positions of the headings of each level according to the positions of the corresponding paragraphs of the adjacent titles of the same level.
  • the determining unit 230 includes: an obtaining subunit 231 and a calculating unit 232.
  • the obtaining subunit 231 is used to obtain the serial numbers of the corresponding paragraphs of all headings at the same level.
  • the calculation unit 232 is configured to calculate the start and end positions of each title at the same level by using a preset formula according to the serial number of the corresponding paragraph of the adjacent title at the same level.
  • the storage unit 240 is configured to create a tree structure object, and store the title names of each level of title and the corresponding start and end positions in the nodes of the tree structure object to form a directory of the target text, wherein the tree structure object includes multiple nodes.
  • the storage unit 240 includes: a name storage unit 241, a primary storage unit 242 and a secondary storage unit 243.
  • the name storage unit 241 is configured to store the file name of the target text in the root node of the tree structure object.
  • the primary storage unit 242 is configured to store the title name of the first-level title and the corresponding start and end positions in the parent node of the tree structure object.
  • the secondary storage unit 243 is configured to store the title name of the secondary title and the corresponding start and end positions in the child nodes of the tree structure object.
  • the directory storage device 200 further includes: a query unit 250 and an extraction unit 260.
  • the query unit 250 is configured to, if a text extraction instruction of a target title is received, query the tree structure object according to the target title to obtain the start and end positions of the target title.
  • the extracting unit 260 is configured to extract the text of the target title from the target text according to the start and end positions of the target title.
  • the above-mentioned directory storage device may be implemented in the form of a computer program, and the computer program may run on the computer device as shown in FIG. 10.
  • the computer device 500 may be a terminal, where the terminal may be an electronic device with communication functions such as a smart phone, a tablet computer, a notebook computer, a desktop computer, a personal digital assistant, and a wearable device.
  • the computer device 500 includes a processor 502, a memory, and a network interface 505 connected through a system bus 501, where the memory may include a non-volatile storage medium 503 and an internal memory 504.
  • the non-volatile storage medium 503 can store an operating system 5031 and a computer program 5032.
  • the computer program 5032 includes program instructions. When the program instructions are executed, the processor 502 can execute a directory storage method.
  • the processor 502 is used to provide calculation and control capabilities to support the operation of the entire computer device 500.
  • the internal memory 504 provides an environment for the operation of the computer program 5032 in the non-volatile storage medium 503.
  • the network interface 505 is used for network communication with other devices.
  • the structure shown in FIG. 10 is only a block diagram of a part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device 500 to which the solution of the present application is applied.
  • the specific computer device 500 may include more or fewer components than shown in the figure, or combine certain components, or have a different component arrangement.
  • the processor 502 is configured to run a computer program 5032 stored in the memory, so as to implement the directory storage method of the embodiment of the present application.
  • the processor 502 may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on.
  • the general-purpose processor may be a microprocessor or any conventional processor.
  • the computer program includes program instructions, and the computer program can be stored in a storage medium, which is a computer-readable storage medium.
  • the program instructions are executed by at least one processor in the computer system to implement the process steps of the foregoing method embodiments.
  • the storage medium may be a computer-readable storage medium.
  • the storage medium stores a computer program, and when the computer program is executed by the processor, the processor executes the steps of the directory storage method described in the above embodiments.
  • the storage medium may be a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk, an optical disk, or any other computer-readable storage medium that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A table of contents storage method and apparatus, a computer device and a storage medium, applied to the field of data storage in data processing. The method comprises: reading a target text and recording positions of all paragraphs in the target text, wherein the target text comprises titles of various levels, and each title is a paragraph (S110); acquiring title names of the titles of various levels in the target text according to a preset format (S120); acquiring positions of the paragraphs corresponding to the titles of various levels, and determining, according to positions of the paragraphs corresponding to adjacent titles of the same level, start/end positions of the titles of various levels (S130); and creating a tree structure object, and storing, in nodes of the tree structure object, the title names of the titles of various levels and the corresponding start/end positions, to form a table of contents of the target text, wherein the tree structure object comprises a plurality of nodes (S140).

Description

Directory storage method, device, computer equipment and storage medium
This application claims priority to Chinese patent application No. 201910833398.X, filed with the Chinese Patent Office on September 4, 2019 and entitled "Directory storage method, device, computer equipment and storage medium", the entire contents of which are incorporated herein by reference.
Technical field
This application relates to the field of data processing technology, and in particular to a directory storage method, device, computer equipment, and storage medium.
Background
At present, POI (Poor Obfuscation Implementation) provides an API (Application Programming Interface) that lets Java programs read and write Microsoft Office format files, so Java programs can conveniently operate on such files. For example, the full directory heading structure of a Word document can be extracted through POI; however, the heading hierarchy extracted by POI cannot be saved directly in the Java program, which restricts development.
Summary of the invention
The embodiments of the present application provide a directory storage method, device, computer equipment and storage medium, aiming to solve the problem that the directory of a Word text cannot be saved directly in a Java program.
In a first aspect, an embodiment of the present application provides a directory storage method, which includes: reading a target text and recording the positions of all paragraphs in the target text, wherein the target text includes headings of all levels and each heading is a paragraph; obtaining the heading names of the headings of all levels in the target text according to a preset format; obtaining the positions of the paragraphs corresponding to the headings of all levels, and determining the start and end positions of the headings of all levels according to the positions of the paragraphs corresponding to adjacent headings of the same level; and creating a tree structure object, and storing the heading names of the headings of all levels and the corresponding start and end positions in the nodes of the tree structure object to form a directory of the target text, wherein the tree structure object includes multiple nodes.
In a second aspect, an embodiment of the present application also provides a directory storage device, which includes: a recording unit, used to read a target text and record the positions of all paragraphs in the target text, wherein the target text includes headings of all levels and each heading is a paragraph; an acquiring unit, used to obtain the heading names of the headings of all levels in the target text according to a preset format; a determining unit, used to obtain the positions of the paragraphs corresponding to the headings of all levels and to determine the start and end positions of the headings of all levels according to the positions of the paragraphs corresponding to adjacent headings of the same level; and a storage unit, used to create a tree structure object and to store the heading names of the headings of all levels and the corresponding start and end positions in the nodes of the tree structure object to form a directory of the target text, wherein the tree structure object includes multiple nodes.
In a third aspect, an embodiment of the present application also provides a computer device, which includes a memory and a processor connected to the memory; the memory is used to store a computer program; the processor is used to run the computer program stored in the memory to perform the following steps: reading a target text and recording the positions of all paragraphs in the target text, wherein the target text includes headings of all levels and each heading is a paragraph; obtaining the heading names of the headings of all levels in the target text according to a preset format; obtaining the positions of the paragraphs corresponding to the headings of all levels, and determining the start and end positions of the headings of all levels according to the positions of the paragraphs corresponding to adjacent headings of the same level; and creating a tree structure object, and storing the heading names of the headings of all levels and the corresponding start and end positions in the nodes of the tree structure object to form a directory of the target text, wherein the tree structure object includes multiple nodes.
In a fourth aspect, the embodiments of the present application also provide a computer-readable storage medium that stores a computer program; when the computer program is executed by a processor, the processor performs the following steps: reading a target text and recording the positions of all paragraphs in the target text, wherein the target text includes headings of all levels and each heading is a paragraph; obtaining the heading names of the headings of all levels in the target text according to a preset format; obtaining the positions of the paragraphs corresponding to the headings of all levels, and determining the start and end positions of the headings of all levels according to the positions of the paragraphs corresponding to adjacent headings of the same level; and creating a tree structure object, and storing the heading names of the headings of all levels and the corresponding start and end positions in the nodes of the tree structure object to form a directory of the target text, wherein the tree structure object includes multiple nodes.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of the table-of-contents storage method provided by an embodiment of the present application;
FIG. 2 is a schematic sub-flowchart of the table-of-contents storage method provided by an embodiment of the present application;
FIG. 3 is a schematic sub-flowchart of the table-of-contents storage method provided by an embodiment of the present application;
FIG. 4 is a schematic sub-flowchart of the table-of-contents storage method provided by an embodiment of the present application;
FIG. 5 is a schematic diagram of the tree-structure object of the table-of-contents storage method provided by an embodiment of the present application;
FIG. 6 is a schematic flowchart of the table-of-contents storage method provided by another embodiment of the present application;
FIG. 7 is a schematic block diagram of the table-of-contents storage apparatus provided by an embodiment of the present application;
FIG. 8 is a schematic block diagram of specific units of the table-of-contents storage apparatus provided by an embodiment of the present application;
FIG. 9 is a schematic block diagram of the table-of-contents storage apparatus provided by another embodiment of the present application; and
FIG. 10 is a schematic block diagram of the computer device provided by an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are some, rather than all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present application.
It should be understood that, when used in this specification and the appended claims, the terms "comprise" and "include" indicate the presence of the described features, integers, steps, operations, elements and/or components, but do not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or sets thereof.
It should also be understood that the terms used in this specification serve only to describe specific embodiments and are not intended to limit the present application. As used in this specification and the appended claims, the singular forms "a", "an" and "the" are intended to include the plural forms unless the context clearly indicates otherwise.
It should further be understood that the term "and/or" used in this specification and the appended claims refers to any combination, and all possible combinations, of one or more of the associated listed items, and includes those combinations.
Refer to FIG. 1, which is a schematic flowchart of the table-of-contents storage method provided by an embodiment of the present application. The method is applied in a terminal.
As shown in FIG. 1, the method includes steps S110 to S140.
S110. Read a target text and record the positions of all paragraphs in the target text, wherein the target text contains headings at various levels and each heading is one paragraph.
In an embodiment, for example the present embodiment, POI (Apache POI) is a Java library for processing documents in Microsoft Office formats. The target text is a text in Word format that contains multiple paragraphs and headings at various levels, and each heading corresponds to one paragraph, that is, one heading is one paragraph. Specifically, the target text is read through POI, and each paragraph is labeled as it is read, so that the position of each paragraph is recorded.
In an embodiment, for example the present embodiment, as shown in FIG. 2, step S110 may include steps S111 and S112.
S111. Read the target text and divide the target text into paragraphs.
S112. Label each paragraph in the target text with a sequence number to determine the position of each paragraph.
In an embodiment, for example the present embodiment, the target text that has been read is divided into multiple paragraphs. The target text is read paragraph by paragraph from the top of the text to the bottom, and each paragraph is labeled as it is read. The label is a sequence number, specifically consecutive Arabic numerals starting from 0. The details are as follows.
Figure PCTCN2019117749-appb-000001
Figure PCTCN2019117749-appb-000002
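Steps S111 and S112 can be illustrated with a minimal Java sketch. It assumes the paragraph texts have already been obtained as a list of strings (with Apache POI they could come from `XWPFDocument.getParagraphs()` and `XWPFParagraph.getText()`, which is not shown here). The class name `ParagraphIndexer` and the method `number` are hypothetical and are not taken from the application.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ParagraphIndexer {
    /**
     * Walks the paragraphs from top to bottom and labels each one with a
     * sequence number starting at 0. The returned map keeps reading order.
     */
    static Map<Integer, String> number(List<String> paragraphs) {
        Map<Integer, String> positions = new LinkedHashMap<>();
        for (int i = 0; i < paragraphs.size(); i++) {
            positions.put(i, paragraphs.get(i));
        }
        return positions;
    }

    public static void main(String[] args) {
        // Hypothetical paragraph texts standing in for a Word document.
        List<String> paragraphs = List.of("Preface", "Chapter 1", "Body text.");
        number(paragraphs).forEach((index, text) ->
                System.out.println(index + ": " + text));
    }
}
```

An ordered map is used here only so that a heading's sequence number can later be looked up directly; a plain list index would serve equally well.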
S120. Acquire, according to preset formats, the heading names of the headings at each level in the target text.
In an embodiment, for example the present embodiment, the target text contains headings at multiple levels, for example first-level, second-level and third-level headings, and each level contains multiple headings. The preset formats are the formats of the headings at each level, and several formats are preset for each level. They are set according to the fixed heading formats of Word text and commonly used heading formats; for example, a first-level heading begins with "第一章、" (Chapter One) or "一、", a second-level heading begins with "第一节、" (Section One) or "(一)、", and a third-level heading begins with "第一小节、" (Subsection One) or "(1)、". Specifically, the target text is first traversed according to the preset formats of first-level headings to find the headings that match a first-level format, which gives the heading names of the first-level headings; the target text is then traversed according to the preset formats of second-level headings to find the headings that match a second-level format, which gives the heading names of the second-level headings; finally, the target text is traversed according to the preset formats of third-level headings to find the headings that match a third-level format, which gives the heading names of the third-level headings. Note that some texts contain only first-level and second-level headings. If no heading matching a third-level format can be found, this step ends and the next step is performed.
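The format matching in step S120 could be sketched in Java with regular expressions, as below. The application lists only example formats, so the exact patterns are assumptions: Chinese numerals are limited to one through ten plus hundred, both ASCII and full-width parentheses are accepted, and the class name `HeadingMatcher` is hypothetical. The Chinese characters are written as Unicode escapes so that the source stays ASCII-only; the three patterns correspond to the forms "第一章、" or "一、", "第一节、" or "(一)、", and "第一小节、" or "(1)、".

```java
import java.util.regex.Pattern;

final class HeadingMatcher {
    // Chinese numerals one..ten and hundred (assumed range), as Unicode escapes.
    private static final String NUM =
            "[\u4e00\u4e8c\u4e09\u56db\u4e94\u516d\u4e03\u516b\u4e5d\u5341\u767e]+";
    private static final String SEP = "\u3001";      // ideographic comma
    private static final String OPEN = "[(\uff08]";  // ASCII or full-width "("
    private static final String CLOSE = "[)\uff09]"; // ASCII or full-width ")"

    private static final Pattern[] LEVELS = {
        // Level 1: "chapter N" form, or a bare numeral, followed by the comma.
        Pattern.compile("(\u7b2c" + NUM + "\u7ae0|" + NUM + ")" + SEP),
        // Level 2: "section N" form, or a parenthesised Chinese numeral.
        Pattern.compile("(\u7b2c" + NUM + "\u8282|" + OPEN + NUM + CLOSE + ")" + SEP),
        // Level 3: "subsection N" form, or a parenthesised Arabic number.
        Pattern.compile("(\u7b2c" + NUM + "\u5c0f\u8282|" + OPEN + "[0-9]+" + CLOSE + ")" + SEP),
    };

    /** Returns 1, 2 or 3 when the paragraph starts like a heading of that level, else 0. */
    static int levelOf(String paragraph) {
        String text = paragraph.trim();
        for (int i = 0; i < LEVELS.length; i++) {
            if (LEVELS[i].matcher(text).lookingAt()) {
                return i + 1;
            }
        }
        return 0;
    }
}
```

The application describes one pass over the text per level; classifying each paragraph once, as here, yields the same set of headings per level and is simply a more compact way to show the matching.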
S130. Acquire the positions of the paragraphs corresponding to the headings at each level, and determine the start and end positions of each heading from the positions of the paragraphs corresponding to adjacent headings of the same level.
In an embodiment, for example the present embodiment, the sequence number labeled for each heading is first obtained and taken as that heading's start position. Two adjacent headings of the same level are then selected; the end position of the earlier heading equals the sequence number of the later heading minus one, which gives the start and end positions of the earlier heading. Proceeding in the same way over all adjacent headings of the same level gives the start and end positions of the headings at every level.
In an embodiment, for example the present embodiment, as shown in FIG. 3, step S130 may include steps S131 and S132.
S131. Acquire the sequence numbers of the paragraphs corresponding to all headings of the same level.
S132. Calculate the start and end positions of each heading of that level with a preset formula, based on the sequence numbers of the paragraphs corresponding to adjacent headings of the same level.
In an embodiment, for example the present embodiment, after the heading names at each level have been obtained, the sequence numbers of the corresponding paragraphs are obtained first. For example, the first-level headings have sequence numbers 1, 35 and 60, the second-level headings have sequence numbers 5 and 20, and the third-level headings have sequence numbers 10, 15, 25 and 30. The sequence number of each heading's paragraph is taken as that heading's start position. Two adjacent headings of the same level are then selected; the sequence number of the earlier heading is denoted X, the sequence number of the later heading is denoted P, and the preset formula is applied:
Y = P - 1
where Y is the end position of the earlier heading. The formula gives the end position of the earlier heading, so its start and end positions are (X, Y). For example, with first-level headings at sequence numbers 1, 35 and 60, take the adjacent headings at 1 and 35: the sequence number of the earlier heading, 1, is its start position; the sequence number of the later heading is 35; the formula gives the end position 35 - 1 = 34; the start and end positions of the earlier heading are therefore (1, 34). The start and end positions of the other headings at every level are obtained in the same way.
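The calculation in steps S131 and S132 can be sketched in Java as follows. The application does not say how the end position of the last heading at a level is obtained, since it has no later neighbor; the `lastEnd` parameter below is an assumption of this sketch, as are the class name `HeadingRanges` and the value 80 used in the example.

```java
import java.util.ArrayList;
import java.util.List;

final class HeadingRanges {
    /**
     * Takes the ascending sequence numbers of all headings at one level and
     * returns a {start, end} pair for each, using Y = P - 1 for every heading
     * that has a later neighbor. lastEnd (an assumption) closes the final one.
     */
    static List<int[]> ranges(List<Integer> starts, int lastEnd) {
        List<int[]> result = new ArrayList<>();
        for (int i = 0; i < starts.size(); i++) {
            int x = starts.get(i);                       // start position X
            int y = (i + 1 < starts.size())
                    ? starts.get(i + 1) - 1              // Y = P - 1
                    : lastEnd;                           // no later same-level heading
            result.add(new int[] {x, y});
        }
        return result;
    }

    public static void main(String[] args) {
        // First-level headings from the example above; 80 is a made-up last paragraph number.
        for (int[] r : ranges(List.of(1, 35, 60), 80)) {
            System.out.println("(" + r[0] + ", " + r[1] + ")");
        }
        // prints (1, 34), (35, 59) and (60, 80), one per line
    }
}
```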
S140. Create a tree-structure object, and store the heading name and the corresponding start and end positions of each heading in nodes of the tree-structure object so as to form a table of contents of the target text, wherein the tree-structure object includes a plurality of nodes.
In an embodiment, for example the present embodiment, a tree-structure object is created in a Java program and is used to store the heading names and corresponding positions of the headings at each level in the target text. The tree-structure object includes a plurality of nodes; each node other than the root has a unique parent node and can have multiple child nodes. This property allows the tree-structure object to store the hierarchy of the headings in the target text.
In an embodiment, for example the present embodiment, as shown in FIG. 4, step S140 may include steps S141 to S143.
S141. Store the file name of the target text in the root node of the tree-structure object.
S142. Store the heading names and corresponding start and end positions of the first-level headings in parent nodes of the tree-structure object.
S143. Store the heading names and corresponding start and end positions of the second-level headings in child nodes of the tree-structure object.
In an embodiment, for example the present embodiment, the tree-structure object includes a root node, parent nodes and child nodes. The root node stores the file name of the target text so that the target text can be located. The root node is connected to multiple parent nodes; the heading names and corresponding start and end positions of the first-level headings are stored in the parent nodes, and the number of parent nodes corresponds to the number of first-level headings. Each parent node is connected to multiple child nodes; the heading names and corresponding start and end positions of the second-level headings are stored in the child nodes, and the number of child nodes corresponds to the number of second-level headings. If the target text also has third-level headings, grandchild nodes are added: each child node is connected to multiple grandchild nodes, the heading names and corresponding start and end positions of the third-level headings are stored in the grandchild nodes, and the number of grandchild nodes corresponds to the number of third-level headings. In other words, as many levels of nodes are added as there are levels of headings, so the number of node levels corresponds to the number of heading levels. An example is shown in FIG. 5.
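A tree-structure object of the kind described in steps S141 to S143 might look like the following Java sketch. The class name `TocNode`, its fields, the use of -1 for the root's positions, and the file and heading names in `main` are all hypothetical; the application does not disclose a concrete class.

```java
import java.util.ArrayList;
import java.util.List;

final class TocNode {
    final String title;  // file name for the root, heading name otherwise
    final int start;     // start position (sequence number); -1 for the root
    final int end;       // end position; -1 for the root
    final List<TocNode> children = new ArrayList<>();

    TocNode(String title, int start, int end) {
        this.title = title;
        this.start = start;
        this.end = end;
    }

    /** Creates a node one level below this one and returns it for chaining. */
    TocNode addChild(String title, int start, int end) {
        TocNode child = new TocNode(title, start, end);
        children.add(child);
        return child;
    }

    /** Appends an indented outline of this subtree to sb. */
    void render(StringBuilder sb, int depth) {
        sb.append("  ".repeat(depth)).append(title);
        if (start >= 0) {
            sb.append(" (").append(start).append(", ").append(end).append(")");
        }
        sb.append('\n');
        for (TocNode child : children) {
            child.render(sb, depth + 1);
        }
    }

    public static void main(String[] args) {
        TocNode root = new TocNode("report.docx", -1, -1);      // S141: file name in the root
        TocNode chapter = root.addChild("Chapter 1", 1, 34);    // S142: first-level heading
        TocNode section = chapter.addChild("Section 1", 5, 19); // S143: second-level heading
        section.addChild("Subsection 1", 10, 14);               // third level, grandchild node
        StringBuilder sb = new StringBuilder();
        root.render(sb, 0);
        System.out.print(sb);
    }
}
```

Because every level uses the same node type, deeper heading levels need no additional classes, which matches the statement that node levels are added as heading levels require.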
In an embodiment, for example the present embodiment, as shown in FIG. 6, the method further includes steps S150 and S160 after step S140.
S150. If a text extraction instruction for a target heading is received, query the tree-structure object according to the target heading to obtain the start and end positions of the target heading.
S160. Extract the text of the target heading from the target text according to the start and end positions of the target heading.
In an embodiment, for example the present embodiment, once the table-of-contents headings are stored in Java, further functionality can be built on top of them, such as extraction of text content. To extract the text content of a given passage, only the heading name corresponding to that content needs to be supplied. When a text extraction instruction for the target heading is received, the tree-structure object is traversed using the heading name of the target heading to find an identical heading name. If a node with the same heading name is found, the start and end positions stored in that node are obtained; these are the start and end positions of the target heading. The target text is then opened using the file name stored in the root node of the tree-structure object. The paragraph at the start position and the paragraph at the end position are located in the target text, and the content of all paragraphs between the start position and the end position is extracted as the text of the target heading, which completes the extraction of the text content.
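Steps S150 and S160 can be sketched in Java as a depth-first lookup followed by a slice of the paragraph list. The sketch makes several assumptions: paragraphs are held in memory as a list indexed by sequence number, the range is treated as inclusive at both ends (so the heading paragraph itself is part of the result), the first node with a matching name wins, and the names `TocLookup`, `Node`, `find` and `extract` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class TocLookup {
    /** Minimal stand-in for a node of the tree-structure object. */
    static final class Node {
        final String title;
        final int start;
        final int end;
        final List<Node> children = new ArrayList<>();

        Node(String title, int start, int end) {
            this.title = title;
            this.start = start;
            this.end = end;
        }
    }

    /** Depth-first search for the first node whose title equals the target; null if absent. */
    static Node find(Node node, String target) {
        if (node.title.equals(target)) {
            return node;
        }
        for (Node child : node.children) {
            Node hit = find(child, target);
            if (hit != null) {
                return hit;
            }
        }
        return null;
    }

    /** Returns the paragraphs from the node's start position to its end position, inclusive. */
    static List<String> extract(List<String> paragraphs, Node node) {
        return new ArrayList<>(paragraphs.subList(node.start, node.end + 1));
    }

    public static void main(String[] args) {
        List<String> paragraphs =
                List.of("Preface", "Chapter 1", "Text A", "Chapter 2", "Text B", "Text C");
        Node root = new Node("report.docx", -1, -1);
        root.children.add(new Node("Chapter 1", 1, 2));
        root.children.add(new Node("Chapter 2", 3, 5));

        Node target = find(root, "Chapter 2");
        System.out.println(extract(paragraphs, target));
        // prints [Chapter 2, Text B, Text C]
    }
}
```

Matching on the heading name alone cannot distinguish two headings with identical names; a real implementation would need some way to disambiguate them, which the application does not address.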
The embodiments of the present application present a table-of-contents storage method that reads a target text and records the positions of all paragraphs in the target text, wherein the target text contains headings at various levels and each heading is one paragraph; acquires, according to preset formats, the heading names of the headings at each level in the target text; acquires the positions of the paragraphs corresponding to the headings at each level and determines the start and end positions of each heading from the positions of the paragraphs corresponding to adjacent headings of the same level; and creates a tree-structure object and stores the heading name and the corresponding start and end positions of each heading in nodes of the tree-structure object so as to form a table of contents of the target text, wherein the tree-structure object includes a plurality of nodes. In this way the table of contents of a Word text can be stored in a Java program.
FIG. 7 is a schematic block diagram of a table-of-contents storage apparatus 200 provided by an embodiment of the present application. As shown in FIG. 7, corresponding to the above table-of-contents storage method, the present application further provides a table-of-contents storage apparatus 200. The apparatus 200 includes units for performing the above method and may be configured in a terminal such as a desktop computer, a tablet computer or a laptop computer. Specifically, referring to FIG. 7, the apparatus 200 includes a recording unit 210, an acquiring unit 220, a determining unit 230 and a storage unit 240.
The recording unit 210 is configured to read a target text and record the positions of all paragraphs in the target text, wherein the target text contains headings at various levels and each heading is one paragraph.
In an embodiment, for example the present embodiment, as shown in FIG. 8, the recording unit 210 includes a reading unit 211 and a labeling unit 212.
The reading unit 211 is configured to read the target text and divide the target text into paragraphs.
The labeling unit 212 is configured to label each paragraph in the target text with a sequence number to determine the position of each paragraph.
The acquiring unit 220 is configured to acquire, according to preset formats, the heading names of the headings at each level in the target text.
The determining unit 230 is configured to acquire the positions of the paragraphs corresponding to the headings at each level and to determine the start and end positions of each heading from the positions of the paragraphs corresponding to adjacent headings of the same level.
In an embodiment, for example the present embodiment, as shown in FIG. 8, the determining unit 230 includes an acquiring subunit 231 and a calculating unit 232.
The acquiring subunit 231 is configured to acquire the sequence numbers of the paragraphs corresponding to all headings of the same level.
The calculating unit 232 is configured to calculate the start and end positions of each heading of that level with a preset formula, based on the sequence numbers of the paragraphs corresponding to adjacent headings of the same level.
The storage unit 240 is configured to create a tree-structure object and store the heading name and the corresponding start and end positions of each heading in nodes of the tree-structure object so as to form a table of contents of the target text, wherein the tree-structure object includes a plurality of nodes.
In an embodiment, for example the present embodiment, as shown in FIG. 8, the storage unit 240 includes a name storage unit 241, a first-level storage unit 242 and a second-level storage unit 243.
The name storage unit 241 is configured to store the file name of the target text in the root node of the tree-structure object.
The first-level storage unit 242 is configured to store the heading names and corresponding start and end positions of the first-level headings in parent nodes of the tree-structure object.
The second-level storage unit 243 is configured to store the heading names and corresponding start and end positions of the second-level headings in child nodes of the tree-structure object.
In an embodiment, for example the present embodiment, as shown in FIG. 9, the table-of-contents storage apparatus 200 further includes a query unit 250 and an extraction unit 260.
The query unit 250 is configured to, if a text extraction instruction for a target heading is received, query the tree-structure object according to the target heading to obtain the start and end positions of the target heading.
The extraction unit 260 is configured to extract the text of the target heading from the target text according to the start and end positions of the target heading.
The above table-of-contents storage apparatus may be implemented in the form of a computer program that can run on a computer device as shown in FIG. 10.
Refer to FIG. 10, which is a schematic block diagram of a computer device provided by an embodiment of the present application. The computer device 500 may be a terminal, and the terminal may be an electronic device with a communication function, such as a smartphone, a tablet computer, a notebook computer, a desktop computer, a personal digital assistant or a wearable device.
Referring to FIG. 10, the computer device 500 includes a processor 502, a memory and a network interface 505 connected through a system bus 501, wherein the memory may include a non-volatile storage medium 503 and an internal memory 504.
The non-volatile storage medium 503 can store an operating system 5031 and a computer program 5032. The computer program 5032 includes program instructions which, when executed, cause the processor 502 to perform a table-of-contents storage method.
The processor 502 provides computing and control capabilities to support the operation of the entire computer device 500.
The internal memory 504 provides an environment for running the computer program 5032 stored in the non-volatile storage medium 503. When the computer program 5032 is executed by the processor 502, it causes the processor 502 to perform a table-of-contents storage method.
The network interface 505 is used for network communication with other devices. A person skilled in the art will understand that the structure shown in FIG. 10 is only a block diagram of the part of the structure related to the solution of the present application and does not limit the computer device 500 to which the solution is applied; a specific computer device 500 may include more or fewer components than shown, combine certain components, or arrange the components differently.
The processor 502 is configured to run the computer program 5032 stored in the memory so as to implement the table-of-contents storage method of the embodiments of the present application.
It should be understood that, in the embodiments of the present application, the processor 502 may be a central processing unit (CPU), and may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor or any conventional processor.
A person of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be carried out by a computer program instructing the relevant hardware. The computer program includes program instructions and can be stored in a storage medium, which is a computer-readable storage medium. The program instructions are executed by at least one processor in the computer system to implement the process steps of the above method embodiments.
Accordingly, the present application further provides a storage medium. The storage medium may be a computer-readable storage medium. The storage medium stores a computer program which, when executed by a processor, causes the processor to perform the steps of the table-of-contents storage method described in the above embodiments.
The storage medium may be any computer-readable storage medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a magnetic disk or an optical disc.
The above are only specific implementations of the present application, and the scope of protection of the present application is not limited to them. Any person skilled in the art can readily conceive of equivalent modifications or replacements within the technical scope disclosed in the present application, and such modifications or replacements shall fall within the scope of protection of the present application. The scope of protection of the present application shall therefore be determined by the scope of protection of the claims.

Claims (20)

  1. 一种目录存储方法,应用于Java中,包括:A directory storage method used in Java, including:
    读取目标文本并记录所述目标文本中所有段落的位置,其中,所述目标文本中包括有各级标题,且每个所述标题为一个段落;Reading the target text and recording the positions of all paragraphs in the target text, wherein the target text includes headings of all levels, and each heading is a paragraph;
    根据预设格式获取所述目标文本中各级标题的标题名称;Obtaining the title names of all levels of titles in the target text according to a preset format;
    获取各级标题对应段落的位置并根据相邻的同级标题对应段落的位置确定各级标题的起止位置;Obtain the positions of the corresponding paragraphs of the headings at all levels and determine the start and end positions of the headings at all levels according to the positions of the adjacent paragraphs of the headings at the same level;
    创建树形结构对象,将各级标题的所述标题名称以及对应的所述起止位置存储在所述树形结构对象的节点中以形成所述目标文本的目录,其中,所述树形结构对象包括有多个节点。A tree structure object is created, and the title names of the titles at all levels and the corresponding start and end positions are stored in the nodes of the tree structure object to form a directory of the target text, wherein the tree structure object Including multiple nodes.
  2. The table of contents storage method according to claim 1, wherein reading the target text and recording the positions of all paragraphs in the target text comprises:
    reading the target text and dividing the target text into paragraphs;
    marking each paragraph in the target text with a serial number to determine the position of each paragraph.
  3. The table of contents storage method according to claim 1, wherein obtaining the positions of the paragraphs corresponding to the headings of all levels and determining the start and end positions of the headings of all levels according to the positions of the paragraphs corresponding to adjacent headings of the same level comprises:
    obtaining the serial numbers of the paragraphs corresponding to all headings of the same level;
    calculating the start and end positions of each heading of the same level by a preset formula according to the serial numbers of the paragraphs corresponding to adjacent headings of the same level.
  4. The table of contents storage method according to claim 1, wherein storing the heading names of the headings of all levels and the corresponding start and end positions in the nodes of the tree structure object comprises:
    storing the file name of the target text in the root node of the tree structure object;
    storing the heading name and the corresponding start and end positions of each first-level heading in a parent node of the tree structure object;
    storing the heading name and the corresponding start and end positions of each second-level heading in a child node of the tree structure object.
  5. The table of contents storage method according to claim 1, further comprising, after storing the heading names of the headings of all levels and the corresponding start and end positions in the nodes of the tree structure object:
    if a text extraction instruction for a target heading is received, querying the tree structure object according to the target heading to obtain the start and end positions of the target heading;
    extracting the text of the target heading from the target text according to the start and end positions of the target heading.
  6. The table of contents storage method according to claim 3, wherein the preset formula is:
    Y = P - 1
    where Y is the end position of the earlier of two adjacent headings of the same level, and P is the serial number of the later of the two headings.
  7. The table of contents storage method according to claim 5, wherein querying the tree structure object according to the target heading to obtain the start and end positions of the target heading comprises:
    traversing the tree structure object according to the heading name field of the target heading to search for an identical heading name;
    if an identical heading name is found in a node of the tree structure object, taking the start and end positions stored in that node as the start and end positions of the target heading.
  8. A table of contents storage apparatus, applied in Java, comprising:
    a recording unit, configured to read a target text and record the positions of all paragraphs in the target text, wherein the target text includes headings of all levels and each heading is one paragraph;
    an obtaining unit, configured to obtain the heading names of the headings of all levels in the target text according to a preset format;
    a determining unit, configured to obtain the positions of the paragraphs corresponding to the headings of all levels and determine the start and end positions of the headings of all levels according to the positions of the paragraphs corresponding to adjacent headings of the same level;
    a storage unit, configured to create a tree structure object and store the heading names of the headings of all levels and the corresponding start and end positions in nodes of the tree structure object to form a table of contents of the target text, wherein the tree structure object includes a plurality of nodes.
  9. The table of contents storage apparatus according to claim 6, wherein the recording unit comprises:
    a reading unit, configured to read the target text and divide the target text into paragraphs;
    a marking unit, configured to mark each paragraph in the target text with a serial number to determine the position of each paragraph.
  10. The table of contents storage apparatus according to claim 6, wherein the determining unit comprises:
    an obtaining subunit, configured to obtain the serial numbers of the paragraphs corresponding to all headings of the same level;
    a calculation unit, configured to calculate the start and end positions of each heading of the same level by a preset formula according to the serial numbers of the paragraphs corresponding to adjacent headings of the same level.
  11. A computer device, comprising a memory and a processor connected to the memory, wherein the memory is configured to store a computer program, and the processor is configured to run the computer program stored in the memory to perform the following steps:
    reading a target text and recording the positions of all paragraphs in the target text, wherein the target text includes headings of all levels and each heading is one paragraph;
    obtaining the heading names of the headings of all levels in the target text according to a preset format;
    obtaining the positions of the paragraphs corresponding to the headings of all levels, and determining the start and end positions of the headings of all levels according to the positions of the paragraphs corresponding to adjacent headings of the same level;
    creating a tree structure object, and storing the heading names of the headings of all levels and the corresponding start and end positions in nodes of the tree structure object to form a table of contents of the target text, wherein the tree structure object includes a plurality of nodes.
  12. The computer device according to claim 11, wherein reading the target text and recording the positions of all paragraphs in the target text comprises:
    reading the target text and dividing the target text into paragraphs;
    marking each paragraph in the target text with a serial number to determine the position of each paragraph.
  13. The computer device according to claim 11, wherein obtaining the positions of the paragraphs corresponding to the headings of all levels and determining the start and end positions of the headings of all levels according to the positions of the paragraphs corresponding to adjacent headings of the same level comprises:
    obtaining the serial numbers of the paragraphs corresponding to all headings of the same level;
    calculating the start and end positions of each heading of the same level by a preset formula according to the serial numbers of the paragraphs corresponding to adjacent headings of the same level.
  14. The computer device according to claim 11, wherein storing the heading names of the headings of all levels and the corresponding start and end positions in the nodes of the tree structure object comprises:
    storing the file name of the target text in the root node of the tree structure object;
    storing the heading name and the corresponding start and end positions of each first-level heading in a parent node of the tree structure object;
    storing the heading name and the corresponding start and end positions of each second-level heading in a child node of the tree structure object.
  15. The computer device according to claim 11, wherein the steps further comprise, after storing the heading names of the headings of all levels and the corresponding start and end positions in the nodes of the tree structure object:
    if a text extraction instruction for a target heading is received, querying the tree structure object according to the target heading to obtain the start and end positions of the target heading;
    extracting the text of the target heading from the target text according to the start and end positions of the target heading.
  16. The computer device according to claim 13, wherein the preset formula is:
    Y = P - 1
    where Y is the end position of the earlier of two adjacent headings of the same level, and P is the serial number of the later of the two headings.
  17. The computer device according to claim 15, wherein querying the tree structure object according to the target heading to obtain the start and end positions of the target heading comprises:
    traversing the tree structure object according to the heading name field of the target heading to search for an identical heading name;
    if an identical heading name is found in a node of the tree structure object, taking the start and end positions stored in that node as the start and end positions of the target heading.
  18. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, causes the processor to perform the following steps:
    reading a target text and recording the positions of all paragraphs in the target text, wherein the target text includes headings of all levels and each heading is one paragraph;
    obtaining the heading names of the headings of all levels in the target text according to a preset format;
    obtaining the positions of the paragraphs corresponding to the headings of all levels, and determining the start and end positions of the headings of all levels according to the positions of the paragraphs corresponding to adjacent headings of the same level;
    creating a tree structure object, and storing the heading names of the headings of all levels and the corresponding start and end positions in nodes of the tree structure object to form a table of contents of the target text, wherein the tree structure object includes a plurality of nodes.
  19. The computer-readable storage medium according to claim 18, wherein the step of reading the target text and recording the positions of all paragraphs in the target text comprises:
    reading the target text and dividing the target text into paragraphs;
    marking each paragraph in the target text with a serial number to determine the position of each paragraph.
  20. The computer-readable storage medium according to claim 18, wherein the step of obtaining the positions of the paragraphs corresponding to the headings of all levels and determining the start and end positions of the headings of all levels according to the positions of the paragraphs corresponding to adjacent headings of the same level comprises:
    obtaining the serial numbers of the paragraphs corresponding to all headings of the same level;
    calculating the start and end positions of each heading of the same level by a preset formula according to the serial numbers of the paragraphs corresponding to adjacent headings of the same level.
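
The claimed steps can be illustrated with a minimal Java sketch. It is not the applicant's implementation. The class name TocSketch, the Node fields, and the method names are hypothetical, and the "preset format" is assumed here to be a numbered heading such as "1 Overview" or "1.2 Terms", which the claims do not specify. Paragraphs are numbered from 1, and a heading's end position follows Y = P - 1. Two further assumptions go beyond the claims: a heading is also closed by the next heading of a higher level, and a heading with no later sibling runs to the last paragraph.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative sketch only; all names and the heading format are assumptions. */
final class TocSketch {

    /** Tree node holding a heading name and its start/end paragraph numbers (1-based, inclusive). */
    static final class Node {
        final String name;
        final int level; // 0 = root (file name), 1 = first-level heading, 2 = second-level, ...
        final int start;
        int end;
        final List<Node> children = new ArrayList<>();

        Node(String name, int level, int start, int end) {
            this.name = name;
            this.level = level;
            this.start = start;
            this.end = end;
        }
    }

    // Assumed preset format: dotted section number, whitespace, then the heading name.
    private static final Pattern HEADING = Pattern.compile("^(\\d+(?:\\.\\d+)*)\\s+(.+)$");

    /** Builds the table-of-contents tree from paragraphs already split in document order. */
    static Node build(String fileName, List<String> paragraphs) {
        int last = paragraphs.size();
        Node root = new Node(fileName, 0, 1, last);
        List<Node> open = new ArrayList<>(); // path from the root to the most recent heading
        open.add(root);
        for (int i = 0; i < last; i++) {
            int p = i + 1; // serial number of this paragraph
            Matcher m = HEADING.matcher(paragraphs.get(i).trim());
            if (!m.matches()) {
                continue; // body paragraph, not a heading
            }
            int level = m.group(1).split("\\.").length;
            // Close every open heading of the same or a deeper level: Y = P - 1.
            while (open.size() > 1 && open.get(open.size() - 1).level >= level) {
                Node closed = open.remove(open.size() - 1);
                closed.end = p - 1;
            }
            Node node = new Node(m.group(2), level, p, last);
            open.get(open.size() - 1).children.add(node);
            open.add(node);
        }
        return root;
    }

    /** Depth-first traversal that returns the first node with an identical name, or null. */
    static Node find(Node node, String name) {
        if (node.name.equals(name)) {
            return node;
        }
        for (Node child : node.children) {
            Node hit = find(child, name);
            if (hit != null) {
                return hit;
            }
        }
        return null;
    }

    /** Returns the paragraphs covered by the named heading, or an empty list if it is not found. */
    static List<String> extract(Node root, List<String> paragraphs, String name) {
        Node hit = find(root, name);
        if (hit == null) {
            return new ArrayList<>();
        }
        return new ArrayList<>(paragraphs.subList(hit.start - 1, hit.end));
    }

    public static void main(String[] args) {
        List<String> paragraphs = Arrays.asList(
                "1 Overview", "Intro text.", "1.1 Scope", "Scope text.",
                "1.2 Terms", "Terms text.", "2 Method", "Method text.");
        Node root = build("example.txt", paragraphs);
        Node scope = find(root, "Scope");
        System.out.println(scope.start + "-" + scope.end); // prints 3-4
        System.out.println(extract(root, paragraphs, "Overview")); // paragraphs 1 through 6
    }
}
```

In this sketch the root node stores the file name, first-level headings become its children, and second-level headings become children of the first-level heading that precedes them, which mirrors the root, parent, and child arrangement recited in the claims. Looking a heading up by name and slicing the paragraph list by its stored range corresponds to the text extraction step.
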
PCT/CN2019/117749 2019-09-04 2019-11-13 Table of contents storage method and apparatus, computer device and storage medium WO2021042542A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910833398.XA CN110704573B (en) 2019-09-04 2019-09-04 Catalog storage method, catalog storage device, computer equipment and storage medium
CN201910833398.X 2019-09-04

Publications (1)

Publication Number Publication Date
WO2021042542A1 true WO2021042542A1 (en) 2021-03-11

Family

ID=69194321

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/117749 WO2021042542A1 (en) 2019-09-04 2019-11-13 Table of contents storage method and apparatus, computer device and storage medium

Country Status (2)

Country Link
CN (1) CN110704573B (en)
WO (1) WO2021042542A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114327892A (en) * 2021-12-28 2022-04-12 武汉天喻信息产业股份有限公司 FLASH resource management method, storage medium, electronic equipment and device

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113642320A (en) * 2020-04-27 2021-11-12 北京庖丁科技有限公司 Method, device, equipment and medium for extracting document directory structure

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1549157A (en) * 2003-05-17 2004-11-24 鸿富锦精密工业(深圳)有限公司 File browsing controlling system and method
US20090182754A1 (en) * 2008-01-16 2009-07-16 Hong Fu Jin Precision Industry(Shenzhen) Co., Ltd. System and method for parsing a text file
CN102486769A (en) * 2010-12-02 2012-06-06 北大方正集团有限公司 Document directory processing method and device
CN109918622A (en) * 2019-02-27 2019-06-21 中国地质大学(武汉) The method and system converted from Word document to LaTeX document are realized based on JAVA
CN109977366A (en) * 2017-12-27 2019-07-05 珠海金山办公软件有限公司 A kind of catalogue generation method and device

Family Cites Families (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3940491B2 (en) * 1998-02-27 2007-07-04 株式会社東芝 Document processing apparatus and document processing method
JP4343213B2 (en) * 2006-12-25 2009-10-14 株式会社東芝 Document processing apparatus and document processing method
JP5446877B2 (en) * 2008-01-11 2014-03-19 日本電気株式会社 Structure identification device
WO2010063070A1 (en) * 2008-12-03 2010-06-10 Ozmiz Pty. Ltd. Method and system for displaying data on a mobile terminal
US9286372B2 (en) * 2013-11-06 2016-03-15 Sap Se Content management with RDBMS
CN105630748A (en) * 2014-10-31 2016-06-01 富士通株式会社 Information processing device and information processing method
CN106326194B (en) * 2015-07-06 2019-03-29 北大方正集团有限公司 Catalogue generation method and device under a kind of shift scene applied to file format
CN105677764B (en) * 2015-12-30 2020-05-08 百度在线网络技术(北京)有限公司 Information extraction method and device
CN107357765B (en) * 2017-07-14 2018-11-09 北京神州泰岳软件股份有限公司 Word document flaking method and device
CN109558575B (en) * 2018-10-25 2024-03-29 平安科技(深圳)有限公司 Online form editing method, online form editing device, computer equipment and storage medium
CN110046236B (en) * 2019-03-20 2022-12-20 腾讯科技(深圳)有限公司 Unstructured data retrieval method and device
CN110196971A (en) * 2019-04-23 2019-09-03 平安科技(深圳)有限公司 Online document edit methods, device, terminal device and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114327892A (en) * 2021-12-28 2022-04-12 武汉天喻信息产业股份有限公司 FLASH resource management method, storage medium, electronic equipment and device
CN114327892B (en) * 2021-12-28 2024-05-03 武汉天喻信息产业股份有限公司 FLASH resource management method, storage medium, electronic equipment and device

Also Published As

Publication number Publication date
CN110704573B (en) 2023-12-22
CN110704573A (en) 2020-01-17

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 19944482; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 19944482; Country of ref document: EP; Kind code of ref document: A1)