CN107463676B - Text data storage method and device - Google Patents

Info

Publication number
CN107463676B
Authority
CN
China
Prior art keywords
node
target
binary tree
data structure
processed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710664232.0A
Other languages
Chinese (zh)
Other versions
CN107463676A (en)
Inventor
孔令威
范渊
吴鸣旦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
DBAPPSecurity Co Ltd
Original Assignee
DBAPPSecurity Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by DBAPPSecurity Co Ltd filed Critical DBAPPSecurity Co Ltd
Priority to CN201710664232.0A priority Critical patent/CN107463676B/en
Publication of CN107463676A publication Critical patent/CN107463676A/en
Application granted granted Critical
Publication of CN107463676B publication Critical patent/CN107463676B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G06F16/322 Trees

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a text data storage method and a text data storage device, which relate to the technical field of text processing. The method comprises the following steps: acquiring text data and the data to be stored in the text data at the current moment; reading the number of characters of the data to be stored, and reading the position of the data to be stored in the text data, wherein the position comprises a row number or a column number; and searching a pre-constructed binary tree data structure for the node corresponding to the position, and storing the number of characters as the node value of that node, wherein the binary tree data structure specifies the storage order of its nodes in advance according to the in-order traversal order of the binary tree. By storing the number of characters of each row or each column of the text data in the binary tree data structure, the invention alleviates the prior-art technical problem of low efficiency when locating a target character in a text.

Description

Text data storage method and device
Technical Field
The present invention relates to the technical field of text processing, and in particular, to a method and an apparatus for storing text data.
Background
The storage of text characters is essentially the storage of a sequence of consecutive characters. The traditional storage structures for a consecutive character sequence are the array and the linked list. With either structure, after a character is inserted or deleted, the characters following the modified position must be shifted, which changes the correspondence between each character and the line number on which it appears in the text.
A modern text editor, however, needs to maintain the correspondence between text characters and their line numbers continuously, because this allows target characters to be located quickly while the text is read and edited, which is of particular practical value in scenarios such as code editing. With the prior-art positioning method, under either storage structure, the target character is searched for from the beginning of the text, so positioning is inefficient for large files.
Disclosure of Invention
In view of the above, an object of the present invention is to provide a method and an apparatus for storing text data, so as to alleviate the technical problem of low efficiency in locating a target character in a text in the prior art.
In a first aspect, an embodiment of the present invention provides a method for storing text character numbers, including:
acquiring text data and data to be stored in the text data at the current moment;
reading the number of characters of the data to be stored, and reading the position of the data to be stored in the text data, wherein the position comprises: a row number or a column number;
searching a pre-constructed binary tree data structure for the node corresponding to the position, determining the number of characters as the node value of the corresponding node, and storing the number of characters, wherein the binary tree data structure specifies the storage order of the nodes in advance according to the in-order traversal order of the binary tree.
In combination with the first aspect, the embodiments of the present invention provide a first possible implementation manner of the first aspect, wherein,
reading the number of characters of the data to be stored, including:
counting the number of characters at each position in the text data;
sorting the character numbers according to the sequence of the positions to obtain a character number sequence;
and reading the character number of the data to be stored in the text data at the current moment from the character number sequence.
With reference to the first aspect, an embodiment of the present invention provides a second possible implementation manner of the first aspect, where searching for a node corresponding to the position in a pre-constructed binary tree data structure includes:
and searching a first target node from the binary tree data structure according to the storage sequence, and taking the first target node as the corresponding node, wherein the storage sequence of the first target node in the binary tree data structure is the same as the sequence position of the position in the text data.
With reference to the first aspect, an embodiment of the present invention provides a third possible implementation manner of the first aspect, where the method further includes:
acquiring an operation instruction sent by a user, executing corresponding operation on the binary tree data structure according to the operation instruction,
wherein the operation instruction comprises at least one of the following: a modify operation instruction, an insert operation instruction, and a delete operation instruction.
With reference to the third possible implementation manner of the first aspect, an embodiment of the present invention provides a fourth possible implementation manner of the first aspect, where performing, according to the operation instruction, a corresponding operation on the binary tree data structure includes:
acquiring a root node of the binary tree data structure, determining the root node as an initial node to be processed, repeatedly executing the following steps until a cycle stop condition is met,
a first comparison step, configured to compare the node number of the node to be processed at the current time with a target value, where the target value is a difference between a line number of a target line to be subjected to the operation in the text data and 1;
a second comparison step of comparing the node number of the left node of the node to be processed at the current time with the target value, in the case that the node number of the node to be processed at the current time is not less than the target value;
a first determination step, configured to determine, when the number of nodes of a left node of the to-be-processed node at the current time is greater than the target value, the left node of the to-be-processed node at the current time as a next-time to-be-processed node;
a second determining step, configured to determine, when the node number of the left node of the node to be processed at the current time is smaller than the target value, the right node of the node to be processed at the current time as the node to be processed at the next time, and to update the target value through an update formula, where the update formula is: $k_i = k_{i-1} - (n + 1)$, in which $n$ is the node number of the left node of the node to be processed at the next time, $k_i$ is the updated target value, and $k_{i-1}$ is the target value before updating;
and a returning step, configured to determine a node to be processed at a next time as a node to be processed at a current time, and return to perform the comparing step, where the next time is the next time of the current time, and the cycle stop condition is that the number of nodes of a left node of the node to be processed at the current time is smaller than the target value, where after a cycle is ended, the node to be processed at the current time is determined as a second target node, and the binary tree data structure is operated based on the second target node.
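The positioning loop above can be sketched as a descent guided by node numbers. This is a minimal illustration, not the patented implementation: the `Node` class, the `count` attribute, and the `locate` function are hypothetical names. The sketch also makes three assumptions that the text does not confirm: the loop ends when the node number of the left node equals the target value, the range check requires the node number to exceed the target value, and the update subtracts the node number of the left node of the node being left behind (the usual order-statistic rule).

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right
        # "node number": size of the subtree rooted at this node.
        self.count = 1 + count(left) + count(right)

def count(node):
    return node.count if node else 0

def locate(root, line_number):
    """Return the node for the given 1-based line, or None if out of range."""
    k = line_number - 1              # target value: line number minus 1
    node = root
    if k < 0 or count(node) <= k:    # assumed range check (stricter than the text)
        return None
    while True:
        left = count(node.left)
        if left > k:                 # target lies in the left subtree
            node = node.left
        elif left < k:               # target lies in the right subtree
            k -= left + 1            # k_i = k_(i-1) - (left node number + 1)
            node = node.right
        else:                        # assumed stop test: left node number == k
            return node

# Line lengths 8, 13, 0, 21, 5, 9, 2 stored in in-order positions 1..7.
root = Node(21, Node(13, Node(8), Node(0)), Node(9, Node(5), Node(2)))
print(locate(root, 1).value)   # 8
print(locate(root, 5).value)   # 5
print(locate(root, 8))         # None
```

Because each step moves one level down, the number of comparisons is bounded by the height of the tree rather than by the number of lines.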
With reference to the fourth possible implementation manner of the first aspect, the embodiment of the present invention provides a fifth possible implementation manner of the first aspect, where, in a case that the operation instruction is a modify operation instruction,
the operating the binary tree data structure based on the second target node includes:
acquiring the target numerical value and a modified character number carried in the modification operation instruction, wherein the modified character number is the number of characters after the target line is modified;
replacing the node value of the second target node with the modified number of characters.
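As a small illustration of the modify operation, the sketch below replaces the node value of the located node. The names `Node`, `nth_in_order`, and `modify` are hypothetical, and for brevity the target is found by a plain in-order walk rather than by the node-number comparison described earlier.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def nth_in_order(node, k):
    """Return the k-th node (0-based) of the in-order sequence, or None."""
    stack = []
    while stack or node:
        while node:                 # descend to the leftmost unvisited node
            stack.append(node)
            node = node.left
        node = stack.pop()
        if k == 0:
            return node
        k -= 1
        node = node.right
    return None

def modify(root, line_number, modified_count):
    """Replace the stored character number of a 1-based line."""
    target = nth_in_order(root, line_number - 1)   # the "second target node"
    if target is None:
        raise IndexError("line out of range")
    target.value = modified_count

root = Node(13, Node(8), Node(0))   # lines 1..3 hold 8, 13 and 0 characters
modify(root, 3, 17)                 # line 3 now has 17 characters
print(root.right.value)             # 17
```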
With reference to the fourth possible implementation manner of the first aspect, the embodiment of the present invention provides a sixth possible implementation manner of the first aspect, where, in a case that the operation instruction is an insert operation instruction,
the operating the binary tree data structure based on the second target node includes:
acquiring the target numerical value and the number of inserted characters carried in the insertion operation instruction, wherein the number of inserted characters is the number of characters of a text inserted before the target line,
sequentially determining the node value of each first node to be corrected as the node value of the corresponding second node to be corrected, and determining the number of inserted characters as the node value of the second target node;
wherein the second node to be corrected is the node that follows the first node to be corrected in the storage sequence of the binary tree data structure, and the first nodes to be corrected are the second target node and the nodes located after the second target node in the storage sequence.
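The insertion step can be pictured as shifting values one node forward along the storage sequence. The following is a sketch under stated assumptions: the tree is pre-built with spare empty nodes (value `None`) at the end of the in-order sequence, used nodes are contiguous from the start, and `Node`, `in_order`, and `insert_line` are hypothetical names.

```python
class Node:
    def __init__(self, value=None, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def in_order(node):
    """Yield nodes in storage order: left subtree, node, right subtree."""
    if node is not None:
        yield from in_order(node.left)
        yield node
        yield from in_order(node.right)

def insert_line(root, line_number, inserted_count):
    """Insert a line before `line_number` (1-based), shifting later values by one node."""
    nodes = list(in_order(root))
    used = sum(1 for n in nodes if n.value is not None)
    if used == len(nodes):
        raise OverflowError("no empty node left (growing the tree is not shown)")
    if not 1 <= line_number <= used + 1:
        raise IndexError("line out of range")
    # Walk backwards so each value is copied before it is overwritten.
    for i in range(used, line_number - 1, -1):
        nodes[i].value = nodes[i - 1].value
    nodes[line_number - 1].value = inserted_count

# Seven pre-built nodes; the first five hold the lengths of lines 1..5.
root = Node(21, Node(13, Node(8), Node(0)), Node(None, Node(5), Node(None)))
insert_line(root, 2, 4)     # a 4-character line is inserted before line 2
values = [n.value for n in in_order(root)]
print(values)               # [8, 4, 13, 0, 21, 5, None]
```

Only node values move; the shape of the tree is untouched, which is why the storage order can be fixed in advance.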
With reference to the fourth possible implementation manner of the first aspect, the embodiment of the present invention provides a seventh possible implementation manner of the first aspect, wherein, in a case that the operation instruction includes a delete operation instruction,
the operating the binary tree data structure based on the second target node includes:
acquiring the target numerical value carried in the deleting operation instruction;
sequentially taking the node value of each third node to be modified as the node value of the corresponding fourth node to be modified, and emptying the node value of the last node on which a storage operation was performed in the binary tree data structure, or deleting that last node;
wherein the third node to be modified is the node that follows the fourth node to be modified in the storage sequence, and the fourth nodes to be modified are the second target node and the nodes located after the second target node in the storage sequence.
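Deletion mirrors insertion: values are pulled one node back along the storage sequence and the last used node is emptied. The sketch below takes the "emptying" variant (it does not remove the node) and assumes used nodes are contiguous from the start of the in-order sequence; `Node`, `in_order`, and `delete_line` are hypothetical names.

```python
class Node:
    def __init__(self, value=None, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def in_order(node):
    """Yield nodes in storage order: left subtree, node, right subtree."""
    if node is not None:
        yield from in_order(node.left)
        yield node
        yield from in_order(node.right)

def delete_line(root, line_number):
    """Delete line `line_number` (1-based), pulling later values back by one node."""
    nodes = [n for n in in_order(root) if n.value is not None]
    if not 1 <= line_number <= len(nodes):
        raise IndexError("line out of range")
    # Walk forwards so each node receives the value of its successor.
    for i in range(line_number - 1, len(nodes) - 1):
        nodes[i].value = nodes[i + 1].value
    nodes[-1].value = None      # empty the last node that held a value

# Seven pre-built nodes; the first five hold the lengths of lines 1..5.
root = Node(21, Node(13, Node(8), Node(0)), Node(None, Node(5), Node(None)))
delete_line(root, 2)            # line 2 (13 characters) is removed
values = [n.value for n in in_order(root)]
print(values)                   # [8, 0, 21, 5, None, None, None]
```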
With reference to the first aspect, an embodiment of the present invention provides an eighth possible implementation manner of the first aspect, where after determining the number of characters as a node value of the corresponding node, the method further includes:
extracting a third target node, wherein the third target node is any one node in the binary tree data structure;
calculating the sum of the node values of all nodes of the subtree represented by the third target node to obtain the node sum of the third target node, wherein the third target node is the root node of the subtree;
and storing the node sum in association with the third target node to obtain storage information, wherein the storage information is used for locating characters in the text data.
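The summing step can be sketched as a post-order pass that stores a node sum in every node. How the sums are then used to locate a character is not spelled out at this point in the text, so `find_character` below is only one plausible use, labeled hypothetical: it maps a 0-based character offset to the node of the containing line and the column inside that line, assuming line terminators are not counted. `Node`, `total`, and `store_sums` are likewise illustrative names.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right
        self.total = None   # "node sum", filled in by store_sums

def store_sums(node):
    """Post-order pass: store in each node the sum of the values in its subtree."""
    if node is None:
        return 0
    node.total = node.value + store_sums(node.left) + store_sums(node.right)
    return node.total

def find_character(root, offset):
    """Hypothetical use of the sums: map a character offset to (node, column)."""
    node = root
    while node is not None:
        left_total = node.left.total if node.left else 0
        if offset < left_total:                  # character is in an earlier line
            node = node.left
        elif offset < left_total + node.value:   # character is in this line
            return node, offset - left_total
        else:                                    # character is in a later line
            offset -= left_total + node.value
            node = node.right
    return None

# Line lengths 8, 13, 0, 21, 5, 9, 2 stored in in-order positions 1..7.
root = Node(21, Node(13, Node(8), Node(0)), Node(9, Node(5), Node(2)))
store_sums(root)
print(root.total, root.left.total)   # 58 21
node, column = find_character(root, 30)
print(node.value, column)            # 21 9
```

With the sums stored, the descent skips whole subtrees at a time instead of adding up line lengths from the start of the text.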
In a second aspect, an embodiment of the present invention further provides a storage apparatus for text data, including:
the acquisition module is used for acquiring text data and data to be stored in the text data at the current moment;
a reading module, configured to read the number of characters of the data to be stored, and read the position of the data to be stored in the text data, where the position includes: a row number or a column number;
and a determining module, configured to search a pre-constructed binary tree data structure for the node corresponding to the position, determine the number of characters as the node value of the corresponding node, and store the number of characters, wherein the binary tree data structure specifies the storage order of the nodes in advance according to the in-order traversal order of the binary tree.
With reference to the second aspect, an embodiment of the present invention provides a first possible implementation manner of the second aspect, where the reading module includes:
a counting unit for counting the number of characters at each position in the text data;
the sorting unit is used for sorting the character numbers according to the sequence of the positions to obtain a character number sequence;
and the reading unit is used for reading the character number of the data to be stored in the text data at the current moment from the character number sequence.
With reference to the second aspect, an embodiment of the present invention provides a second possible implementation manner of the second aspect, where the determining module is configured to:
and searching a first target node from the binary tree data structure according to the storage sequence, and taking the first target node as the corresponding node, wherein the storage sequence of the first target node in the binary tree data structure is the same as the sequence position of the position in the text data.
With reference to the second aspect, an embodiment of the present invention provides a third possible implementation manner of the second aspect, where the text character number storage device further includes:
an operation module, configured to obtain an operation instruction sent by a user, perform a corresponding operation on the binary tree data structure according to the operation instruction,
wherein the operation instruction comprises at least one of the following: a modify operation instruction, an insert operation instruction, and a delete operation instruction.
With reference to the third possible implementation manner of the second aspect, an embodiment of the present invention provides a fourth possible implementation manner of the second aspect, where the operation module includes a positioning unit, and the positioning unit is configured to perform:
acquiring a root node of the binary tree data structure, determining the root node as an initial node to be processed, repeatedly executing the following steps until a cycle stop condition is met,
a first comparison step, configured to compare the node number of the node to be processed at the current time with a target value, where the target value is a difference between a line number of a target line to be subjected to the operation in the text data and 1;
a second comparison step of comparing the node number of the left node of the node to be processed at the current time with the target value, in the case that the node number of the node to be processed at the current time is not less than the target value;
a first determination step, configured to determine, when the number of nodes of a left node of the to-be-processed node at the current time is greater than the target value, the left node of the to-be-processed node at the current time as a next-time to-be-processed node;
a second determining step, configured to determine, when the node number of the left node of the node to be processed at the current time is smaller than the target value, the right node of the node to be processed at the current time as the node to be processed at the next time, and to update the target value through an update formula, where the update formula is: $k_i = k_{i-1} - (n + 1)$, in which $n$ is the node number of the left node of the node to be processed at the next time, $k_i$ is the updated target value, and $k_{i-1}$ is the target value before updating;
and a returning step, configured to determine a node to be processed at a next time as a node to be processed at a current time, and return to perform the comparing step, where the next time is the next time of the current time, and the cycle stop condition is that the number of nodes of a left node of the node to be processed at the current time is smaller than the target value, where after a cycle is ended, the node to be processed at the current time is determined as a second target node, and the binary tree data structure is operated based on the second target node.
With reference to the fourth possible implementation manner of the second aspect, the embodiment of the present invention provides a fifth possible implementation manner of the second aspect, wherein the operation module includes a modification unit, and the modification unit is configured to:
in the case where the operation instruction is a modify operation instruction,
the operating the binary tree data structure based on the second target node includes:
acquiring the target numerical value and a modified character number carried in the modification operation instruction, wherein the modified character number is the number of characters after the target line is modified;
replacing the node value of the second target node with the modified number of characters.
With reference to the fourth possible implementation manner of the second aspect, the embodiment of the present invention provides a sixth possible implementation manner of the second aspect, wherein the operation module includes an insertion unit, and the insertion unit is configured to:
in the case where the operation instruction is an insert operation instruction,
the operating the binary tree data structure based on the second target node includes:
acquiring the target numerical value and the number of inserted characters carried in the insertion operation instruction, wherein the number of inserted characters is the number of characters of a text inserted before the target line,
sequentially determining the node value of each first node to be corrected as the node value of the corresponding second node to be corrected, and determining the number of inserted characters as the node value of the second target node;
wherein the second node to be corrected is the node that follows the first node to be corrected in the storage sequence of the binary tree data structure, and the first nodes to be corrected are the second target node and the nodes located after the second target node in the storage sequence.
With reference to the fourth possible implementation manner of the second aspect, an embodiment of the present invention provides a seventh possible implementation manner of the second aspect, where the operation module includes a deleting unit, and the deleting unit is configured to:
in the case where the operation instruction includes a delete operation instruction,
the operating the binary tree data structure based on the second target node includes:
acquiring the target numerical value carried in the deleting operation instruction;
sequentially taking the node value of each third node to be modified as the node value of the corresponding fourth node to be modified, and emptying the node value of the last node on which a storage operation was performed in the binary tree data structure, or deleting that last node;
wherein the third node to be modified is the node that follows the fourth node to be modified in the storage sequence, and the fourth nodes to be modified are the second target node and the nodes located after the second target node in the storage sequence.
With reference to the second aspect, an embodiment of the present invention provides an eighth possible implementation manner of the second aspect, where the text character number storage device further includes: a summing module to: after determining the number of characters as the node value of the corresponding node,
extracting a third target node, wherein the third target node is any one node in the binary tree data structure;
calculating the sum of the node values of all nodes of the subtree represented by the third target node to obtain the node sum of the third target node, wherein the third target node is the root node of the subtree;
and storing the node sum in association with the third target node to obtain storage information, wherein the storage information is used for locating characters in the text data.
The embodiments of the present invention have the following beneficial effects. Text data and the data to be stored in the text data at the current moment are acquired; the number of characters of the data to be stored is read, together with the position of the data to be stored in the text data, wherein the position comprises a row number or a column number; the node corresponding to the position is then searched for in a pre-constructed binary tree data structure, and the number of characters is stored as the node value of that node, wherein the binary tree data structure specifies the storage order of the nodes in advance according to the in-order traversal order of the binary tree. Because the number of characters of the data to be stored is held in a binary tree data structure whose storage order follows the in-order traversal order, the number of characters of each row or each column of the text data is easier to look up and maintain, and a target character can be located from those per-row or per-column character counts, which alleviates the prior-art technical problem of low efficiency when locating a target character in a text.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a schematic diagram of an in-order traversal of a binary tree;
FIG. 2 is a flowchart of a method for storing text character numbers according to an embodiment of the present invention;
FIG. 3 is a flowchart of another method for storing text character numbers according to an embodiment of the present invention;
FIG. 4 is a flowchart of a positioning method according to an embodiment of the present invention;
FIG. 5 is a flowchart of a modification method according to an embodiment of the present invention;
FIG. 6 is a flowchart of an insertion method according to an embodiment of the present invention;
FIG. 7 is a flowchart of a deletion method according to an embodiment of the present invention;
FIG. 8 is a flowchart of a summing method according to an embodiment of the present invention;
FIG. 9 is a block diagram of a text data storage device according to a second embodiment of the present invention;
FIG. 10 is a block diagram of another structure of a text data storage device according to a second embodiment of the present invention;
FIG. 11 is a block diagram of an operation module according to a second embodiment of the present invention.
Reference numerals: 100 - acquisition module; 200 - reading module; 300 - determination module; 400 - operation module; 401 - positioning unit; 402 - modification unit; 403 - insertion unit; 404 - deletion unit.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
With the prior-art positioning method, under either the array or the linked-list storage structure, the target character is searched for from the beginning of the text, so positioning is inefficient for large files. Based on this, the text data storage method and device provided by the embodiments of the present invention can alleviate the prior-art technical problem of low efficiency when locating a target character in a text.
Before describing the embodiments of the present invention, the binary tree data structure is introduced as follows:
(1) the binary tree data structure stores data in the nodes, and the data stored by the target node is called a 'node value' of the target node;
(2) each node in the binary tree data structure has at most two child nodes, the child node located on the left side of the target node is called the "left node" of the target node, and the child node located on the right side is called the "right node" of the target node;
(3) the number of nodes contained in the subtree represented by the target node is called the "node number" of the target node, wherein the subtree represented by the target node is the subtree whose root node is the target node;
(4) the sum of the node values of all nodes contained in the subtree represented by the target node is called "node sum" of the target node;
(5) the traversal orders of the binary tree data structure include pre-order traversal, in-order traversal, and post-order traversal. The in-order traversal, which is the focus here, traverses the left subtree first, then the root node, and finally the right subtree; the left and right subtrees are themselves traversed recursively in the same order (left subtree, root node, right subtree). Referring to FIG. 1, the in-order traversal of the binary tree data structure shown in FIG. 1 is DBEAFCG.
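The definitions above can be made concrete with a short sketch. The tree shape is an assumption about FIG. 1 (A as root, B and C as its children, D to G as leaves), chosen because it reproduces the stated sequence DBEAFCG; the node values are arbitrary illustrative numbers, and `Node`, `in_order`, `node_number`, and `node_sum` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    label: str                       # name of the node as in FIG. 1 (assumed shape)
    value: int = 0                   # "node value"
    left: Optional["Node"] = None    # "left node"
    right: Optional["Node"] = None   # "right node"

def in_order(node):
    """Left subtree, then the node itself, then the right subtree."""
    if node is None:
        return []
    return in_order(node.left) + [node] + in_order(node.right)

def node_number(node):
    """Number of nodes in the subtree rooted at `node`."""
    return 0 if node is None else 1 + node_number(node.left) + node_number(node.right)

def node_sum(node):
    """Sum of the node values in the subtree rooted at `node`."""
    return 0 if node is None else node.value + node_sum(node.left) + node_sum(node.right)

tree = Node("A", 4,
            Node("B", 2, Node("D", 1), Node("E", 3)),
            Node("C", 6, Node("F", 5), Node("G", 7)))

order = "".join(n.label for n in in_order(tree))
print(order)                    # DBEAFCG
print(node_number(tree.left))   # 3
print(node_sum(tree.left))      # 6
```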
Example one
The embodiment of the invention provides a method for storing text character number, as shown in fig. 2, comprising the following steps:
Step S12, acquiring the text data and the data to be stored in the text data at the current time.
Step S14, reading the number of characters of the data to be stored, and reading the position of the data to be stored in the text data, the position including: a row number or a column number.
Specifically, in the case where the text data is organized in rows, the position is a row; and in the case where the text data is organized in columns, the position is a column.
Step S16, searching a pre-constructed binary tree data structure for the node corresponding to the position, and determining the number of characters as the node value of the corresponding node so as to store the number of characters, wherein the binary tree data structure specifies the storage order of the nodes in advance according to the in-order traversal order of the binary tree.
In the embodiment of the invention, the number of characters of the data to be stored is held in a binary tree data structure whose storage order is specified in advance according to the in-order traversal order of the binary tree. This makes the number of characters of each row or each column of the text data easier to look up and maintain, and a target character can be located from those per-row or per-column character counts, which alleviates the prior-art technical problem of low efficiency when locating a target character in a text.
In an optional implementation manner of the embodiment of the present invention, reading the number of characters of the data to be stored includes:
counting the number of characters at each position in the text data;
sorting the character numbers according to the sequence of the positions to obtain a character number sequence;
and reading the character number of the data to be stored in the text data at the current moment from the character number sequence.
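A minimal sketch of the character number sequence, assuming the positions are rows and that line terminators are not counted (the text does not say either way); `character_number_sequence` is a hypothetical name.

```python
def character_number_sequence(text):
    """Return the number of characters on each line, in line order."""
    # str.splitlines() drops the line terminators, so they are not counted.
    return [len(line) for line in text.splitlines()]

text = "def f():\n    return 42\n"
counts = character_number_sequence(text)
print(counts)       # [8, 13]
print(counts[1])    # 13, the character number of line 2
```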
In another optional implementation manner of the embodiment of the present invention, searching for a node corresponding to a position in a pre-constructed binary tree data structure includes:
searching the binary tree data structure for a first target node according to the storage order, and taking the first target node as the corresponding node, wherein the position of the first target node in the storage order of the binary tree data structure is the same as the ordinal position of the data to be stored in the text data. For example, given the in-order sequence DBEAFCG of the binary tree data structure shown in FIG. 1, when the data to be stored is the first row of the text data, the corresponding node is node D; when the data to be stored is the third row of the text data, the corresponding node is node E.
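The lookup by storage order can be sketched as follows. The tree is pre-constructed with empty nodes, and a character number is written into the node whose in-order position equals the row number. The balanced shape produced by `build` is an assumption, as are the names `Node`, `build`, `in_order`, and `store`; the linear walk is used only to keep the sketch short.

```python
class Node:
    def __init__(self):
        self.value = None    # character number of one row, once stored
        self.left = None
        self.right = None

def build(n):
    """Pre-construct a balanced binary tree with n empty nodes."""
    if n == 0:
        return None
    node = Node()
    left_size = (n - 1) // 2
    node.left = build(left_size)
    node.right = build(n - 1 - left_size)
    return node

def in_order(node):
    """Yield nodes in storage order: left subtree, node, right subtree."""
    if node is not None:
        yield from in_order(node.left)
        yield node
        yield from in_order(node.right)

def store(root, row_number, char_count):
    """Store char_count in the node whose in-order position is row_number (1-based)."""
    for position, node in enumerate(in_order(root), start=1):
        if position == row_number:
            node.value = char_count
            return node
    raise IndexError("no node for this row")

root = build(7)
for row_number, char_count in enumerate([8, 13, 0, 21, 5, 9, 2], start=1):
    store(root, row_number, char_count)

stored = [n.value for n in in_order(root)]
print(stored)        # [8, 13, 0, 21, 5, 9, 2]
print(root.value)    # 21, the root is fourth in the in-order sequence
```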
In another optional implementation manner of the embodiment of the present invention, as shown in fig. 3, the text character number data storage method further includes:
Step S18, obtaining the operation instruction sent by the user, and executing the corresponding operation on the binary tree data structure according to the operation instruction,
wherein the operation instruction comprises at least one of the following: a modify operation instruction, an insert operation instruction, and a delete operation instruction.
In the embodiment of the invention, after the character number of the text data is stored in the binary tree data structure, the operation instruction sent by the user is obtained, and the corresponding operation is executed on the binary tree data structure according to the operation instruction, so that the aim of maintaining the binary tree data structure in the process of modifying the text data is fulfilled.
Under the condition that the operation instructions respectively comprise modification operation instructions, insertion operation instructions and deletion operation instructions, the process of maintaining the binary tree data structure is as follows:
in another optional implementation manner of the embodiment of the present invention, as shown in fig. 4, in step S18, performing corresponding operations on the binary tree data structure according to the operation instruction includes: a positioning step, wherein the positioning step comprises:
step S41, acquiring a root node of the binary tree data structure, determining the root node as an initial node to be processed, repeatedly executing the following steps until a loop stop condition is satisfied,
step S42, a first comparison step, configured to compare the node number of the node to be processed at the current time with a target value, where the target value is a difference between the line number of the target line to be operated in the text data and 1.
Specifically, the positioning step is ended when the number of nodes of the node to be processed at the current time is smaller than the target value.
Step S43, a second comparison step, executed to compare the node number of the left node of the node to be processed at the current time with the target value, if the node number of the node to be processed at the current time is not less than the target value;
step S44, a first determining step, configured to determine, when the node number of the left node of the node to be processed at the current time is greater than the target value, the left node of the node to be processed at the current time as a node to be processed at the next time;
step S45, a second determining step, configured to determine, when the node number of the left node of the node to be processed at the current time is smaller than the target value, the right node of the node to be processed at the current time as the node to be processed at the next time, and update the target value through an update formula, where the update formula is: k_i = k_{i-1} - (node number of the left node of the node to be processed at the next time + 1), where k_i is the updated target value and k_{i-1} is the target value before updating;
and step S46, a returning step, configured to determine the node to be processed at the next time as the node to be processed at the current time, and return to perform the comparison steps, wherein the next time is the time immediately following the current time, and the loop stop condition is that the node number of the left node of the node to be processed at the current time is less than the target value.
After the loop is ended, step S47 is executed, the node to be processed at the current time is determined as the second target node, and the binary tree data structure is operated based on the second target node.
The embodiment of the invention provides the method for positioning the target row of the operation to be executed when the corresponding operation is executed on the binary tree data structure according to the operation instruction, so that the further operation can be implemented based on the positioned target row.
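The positioning loop of steps S41 to S47 resembles selection in an order-statistic tree. The following Python sketch is written under stated assumptions rather than as the definitive procedure: the "node number" of a node is taken to be the number of nodes in the subtree rooted at it, the loop stops when the left subtree holds exactly k nodes (the stop condition given for step S83), and the target value is reduced by the node number of the left subtree of the node being left, plus 1. The names `Node`, `size`, `build`, and `locate` are hypothetical.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value              # character count of one line
        self.left = left
        self.right = right
        # "Node number": how many nodes the subtree rooted here contains.
        self.size = 1 + size(left) + size(right)


def size(node):
    return node.size if node is not None else 0


def build(values):
    """Build a balanced tree whose in-order sequence equals `values`."""
    if not values:
        return None
    mid = len(values) // 2
    return Node(values[mid], build(values[:mid]), build(values[mid + 1:]))


def locate(root, line_number):
    """Return the node that stores the 1-based target line."""
    k = line_number - 1                 # target value
    if not 0 <= k < size(root):
        raise IndexError("target line is out of range")
    node = root
    while True:
        left = size(node.left)
        if left > k:                    # target lies in the left subtree
            node = node.left
        elif left < k:                  # skip left subtree and current node
            k -= left + 1
            node = node.right
        else:                           # exactly k nodes precede this node
            return node


root = build([5, 2, 0, 6, 9])
print(locate(root, 4).value)  # 6
```

For a balanced tree the loop visits one node per level, so the cost of locating a line grows with the height of the tree rather than with the number of lines.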
It should be emphasized that, optionally, after the second target node corresponding to the target line is located, the node value of the second target node may also be output to the user, so as to achieve the purpose of enabling the user to search for the number of characters of the target line. It should be noted that the "user" in the embodiment of the present invention includes a person or a device on the user side, the same applies below.
In another optional implementation manner of the embodiment of the present invention, as shown in fig. 5, in a case that the operation instruction is a modify operation instruction, the operating the binary tree data structure based on the second target node in step S47 includes:
step S51, acquiring a target numerical value and a modified character number carried in the modification operation instruction, wherein the modified character number is the number of characters after the target line is modified;
in step S52, the node value of the second target node is replaced with the modified character number.
Specifically, the procedure for maintaining the binary tree data structure given by steps S51 and S52 applies when a line of characters in the text data is modified (including deleting, adding, or replacing words) without increasing or decreasing the total number of lines of the text.
In another optional implementation manner of the embodiment of the present invention, as shown in fig. 6, in a case that the operation instruction is an insert operation instruction, the operating the binary tree data structure based on the second target node in step S47 includes:
step S61, operating the binary tree data structure based on the second target node, including:
step S62, obtaining the target numerical value and the number of inserted characters carried in the insertion operation instruction, wherein the number of inserted characters is the number of characters of the text inserted before the target line,
step S63, sequentially determining the node value in the first node to be corrected as the node value of the second node to be corrected, and determining the number of inserted characters as the node value of the second target node;
the second node to be corrected is a node next to the first node to be corrected in the binary tree data structure, the first node to be corrected is a second target node, and nodes located after the second target node in the storage sequence.
Specifically, the procedure for maintaining the binary tree data structure given by steps S61, S62, and S63 applies when a line of characters is inserted into the text data.
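The value shifting of step S63 can be sketched as below. The sketch rests on assumptions that the text does not state: the pre-constructed tree is assumed to contain at least one spare node, marked here with the value `None`, after the last stored line in storage order, and the shift is performed from the end backwards so that no value is overwritten before it has been copied. The names `Node`, `inorder`, and `insert_before` are hypothetical.

```python
class Node:
    def __init__(self, value=None, left=None, right=None):
        self.value = value          # character count, or None for a spare node
        self.left = left
        self.right = right


def inorder(node):
    """Yield nodes in storage order (in-order traversal)."""
    if node is not None:
        yield from inorder(node.left)
        yield node
        yield from inorder(node.right)


def insert_before(root, line_number, inserted_count):
    """Insert a character count before the 1-based target line."""
    nodes = list(inorder(root))
    used = sum(1 for n in nodes if n.value is not None)
    target = line_number - 1
    if used == len(nodes):
        raise ValueError("no spare node left in the pre-constructed tree")
    if not 0 <= target < used:
        raise IndexError("target line is out of range")
    # Copy each value into the next node, starting from the last stored line.
    for i in range(used, target, -1):
        nodes[i].value = nodes[i - 1].value
    nodes[target].value = inserted_count


# Storage order before the insertion: 5, 2, 6, spare, spare.
root = Node(6, Node(2, Node(5)), Node(None, Node(None)))
insert_before(root, 2, 4)
print([n.value for n in inorder(root)])  # [5, 4, 2, 6, None]
```

Only node values move; the shape of the tree is left unchanged in this sketch.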
In another optional implementation manner of the embodiment of the present invention, as shown in fig. 7, in a case that the operation instruction includes a delete operation instruction, the operating, based on the second target node, on the binary tree data structure in step S47 includes:
step S71, acquiring a target numerical value carried in the deleting operation instruction;
step S72, sequentially taking the node value in the third node to be modified as the node value of the fourth node to be modified, and emptying the node value in the last node executing the storage operation in the binary tree data structure, or deleting the last node executing the storage operation;
the third node to be modified is the next node after the fourth node to be modified, the fourth node to be modified is the second target node, and the nodes positioned after the second target node in the storage sequence.
Specifically, the procedure for maintaining the binary tree data structure given by steps S71 and S72 applies when a line of characters is deleted from the text data.
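Step S72 can be sketched in the same spirit. The sketch assumes that an emptied node is marked with the value `None`, and it chooses the "emptying" option rather than removing the last node; both are illustrative choices. The names `Node`, `inorder`, and `delete_line` are hypothetical.

```python
class Node:
    def __init__(self, value=None, left=None, right=None):
        self.value = value          # character count, or None for an empty node
        self.left = left
        self.right = right


def inorder(node):
    """Yield nodes in storage order (in-order traversal)."""
    if node is not None:
        yield from inorder(node.left)
        yield node
        yield from inorder(node.right)


def delete_line(root, line_number):
    """Remove the character count of the 1-based target line."""
    nodes = list(inorder(root))
    used = sum(1 for n in nodes if n.value is not None)
    target = line_number - 1
    if not 0 <= target < used:
        raise IndexError("target line is out of range")
    # Pull each following value one node forward, starting at the target.
    for i in range(target, used - 1):
        nodes[i].value = nodes[i + 1].value
    # Empty the last node that held a stored value.
    nodes[used - 1].value = None


# Storage order before the deletion: 5, 4, 2, 6, empty.
root = Node(2, Node(4, Node(5)), Node(None, Node(6)))
delete_line(root, 2)
print([n.value for n in inorder(root)])  # [5, 2, 6, None, None]
```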
In another optional implementation manner of the embodiment of the present invention, after determining the number of characters as a node value of the corresponding node, the method for storing text data further includes:
extracting a third target node, wherein the third target node is any one node in a binary tree data structure;
calculating the sum of the node values of all nodes of the subtree represented by the third target node to obtain the node sum of the third target node, wherein the third target node is the root node of the subtree;
and storing the node sum in correspondence with the third target node to obtain storage information, wherein the storage information is used for positioning characters of the text data.
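A minimal sketch of the node sum computation follows. It assumes that the node sum is stored as an attribute of the node itself, which is one possible form of the "storage information"; the names `Node`, `node_sum`, and `store_node_sums` are hypothetical.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value      # character count of one line
        self.left = left
        self.right = right
        self.node_sum = 0       # sum of values in the subtree rooted here


def store_node_sums(node):
    """Compute and store the node sum of every node; return the root's sum."""
    if node is None:
        return 0
    node.node_sum = (node.value
                     + store_node_sums(node.left)
                     + store_node_sums(node.right))
    return node.node_sum


# Storage order of the values: 5, 2, 0, 6, 9.
root = Node(0, Node(2, Node(5)), Node(9, Node(6)))
print(store_node_sums(root))   # 22
print(root.left.node_sum)      # 7
print(root.right.node_sum)     # 15
```

The node sum of the root equals the total number of characters in the text, and the node sum of any left subtree equals the number of characters in the lines that precede its parent within that subtree.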
Further, referring to fig. 8, when the user needs to obtain the total number of characters in the first n lines of the text data, the total number of characters in the first n lines may be pushed to the user by the following steps:
step S81, assigning an initial value to the sum of the numbers of the first n rows of characters: acquiring the root node of the binary tree data structure, determining the root node as the initial node to be processed, repeatedly executing the steps S82, S83, S84 and S85 until a loop stop condition is met, wherein,
step S82, configured to compare the node number of the node to be processed at the current time with a target value, where the target value is a difference between a line number of a target line to be operated in the text data and 1, that is, the target value k is n-1;
specifically, when the node number is smaller than the target value, an error message may be returned as the information to be displayed, and the step of pushing the sum of the numbers of characters of the first n lines to the user ends; when the node number is not smaller than the target value, step S83 is executed.
Step S83, comparing the node number of the left node of the node to be processed at the current time with the target value.
Wherein, when the node number of the left node is greater than the target value, step S84 is executed; when the node number of the left node is smaller than the target value, step S85 is executed; when the node number of the left node is equal to the target value, the loop is ended.
Step S84, determining the left node in the nodes to be processed at the current moment as the node to be processed at the next moment, determining the node to be processed at the next moment as the node to be processed at the current moment, and returning to execute the step S82;
step S85, determining the right node of the node to be processed at the current time as the node to be processed at the next time, and executing a first assigning step: sum_i = sum_{i-1} + (node sum of the left node of the node to be processed at the next time + node value of the node to be processed at the next time), and a target value update step: k_i = k_{i-1} - (node number of the left node of the node to be processed at the next time + 1), then determining the node to be processed at the next time as the node to be processed at the current time and returning to execute step S82, where k_i is the target value after the update in the target value update step, and k_{i-1} is the target value before the update in the target value update step;
wherein, after the loop is finished, step S86 is executed, and the step of pushing the sum of the number of the first n lines of characters to the user is finished, wherein,
step S86, configured to perform a second assigning step: sum_i = sum_{i-1} + (node sum of the left node of the node to be processed at the current time + node value of the node to be processed at the current time), and then end; sum_i may then be output to the user, where sum_i is the sum of the numbers of characters of the first n lines obtained at the current time in the loop, and sum_{i-1} is the sum of the numbers of characters of the first n lines obtained at the previous time.
The next time is a time next to the current time, and the previous time is a time previous to the current time. The loop stop condition, that is, the comparison result in step S83, is that the node number of the left node of the nodes to be processed at the present time is equal to the target value.
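Steps S81 to S86 can be sketched as a prefix-sum query over the tree. The sketch assumes that each node stores both its node number (subtree size) and its node sum, that the running sum starts at 0, and that on a move to the right the quantities added and subtracted are those of the node being left; this last point is the standard order-statistic formulation and is an assumption about the intended reading. All names are hypothetical.

```python
class Node:
    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right
        self.size = 1 + size(left) + size(right)               # node number
        self.node_sum = value + node_sum(left) + node_sum(right)


def size(node):
    return node.size if node is not None else 0


def node_sum(node):
    return node.node_sum if node is not None else 0


def build(values):
    """Build a balanced tree whose in-order sequence equals `values`."""
    if not values:
        return None
    mid = len(values) // 2
    return Node(values[mid], build(values[:mid]), build(values[mid + 1:]))


def sum_first_n(root, n):
    """Return the total number of characters in the first n lines."""
    k = n - 1                               # target value
    if not 0 <= k < size(root):
        raise IndexError("n is out of range")
    total, node = 0, root                   # S81: initial sum and node
    while True:
        left = size(node.left)              # S83: compare with target value
        if left > k:                        # S84: continue in left subtree
            node = node.left
            continue
        # S85 / S86: the left subtree and this node lie within the first n lines.
        total += node_sum(node.left) + node.value
        if left == k:                       # loop stop condition
            return total
        k -= left + 1
        node = node.right


root = build([5, 2, 0, 6, 9])
print(sum_first_n(root, 4))  # 13
```

The result is the character offset at which line n + 1 begins, which is what makes the stored sums useful for locating a target character.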
Example two
An embodiment of the present invention further provides a storage apparatus for text data, as shown in fig. 9, including:
an obtaining module 100, configured to obtain text data and data to be stored in the text data at the current time;
the reading module 200 is configured to read the number of characters of the data to be stored, and read the position of the data to be stored in the text data, where the position includes: the number of rows or columns;
a determining module 300, configured to search a node corresponding to the position in a pre-constructed binary tree data structure, and determine the number of characters as a node value of the corresponding node, so as to store the number of characters, where the binary tree data structure specifies a storage order of the nodes in advance according to a middle-order traversal order of the binary tree.
In the embodiment of the present invention, the obtaining module 100 obtains text data and data to be stored in the text data at the current time, then the reading module 200 reads the number of characters of the data to be stored, and reads the position of the data to be stored in the text data, and the determining module 300 stores the number of characters of the data to be stored through a binary tree data structure in which the storage order of nodes is specified in advance according to the middle-order traversal order of a binary tree, so that the search and maintenance of the number of characters of each row or each column in the text data are more convenient, and it is beneficial to locate a target character according to the number of characters of each row or each column in the text data, thereby alleviating the technical problem of low efficiency in locating the target character in the text in the prior art.
In an optional implementation manner of the embodiment of the present invention, the reading module 200 includes:
a counting unit for counting the number of characters at each position in the text data;
the sorting unit is used for sorting the character numbers according to the sequence of the positions to obtain a character number sequence;
and the reading unit is used for reading the character number of the data to be stored in the text data at the current moment from the character number sequence.
In an optional implementation manner of the embodiment of the present invention, the determining module 300 is configured to:
and searching a first target node from the binary tree data structure according to the storage sequence, and taking the first target node as a corresponding node, wherein the storage sequence of the first target node in the binary tree data structure is the same as the sequence position of the position in the text data.
In an optional implementation manner of the embodiment of the present invention, as shown in fig. 10, the text character number storage device further includes:
an operation module 400, configured to obtain an operation instruction sent by a user, perform a corresponding operation on the binary tree data structure according to the operation instruction,
wherein, the operation instruction at least comprises one of the following: modifying the operation instruction, inserting the operation instruction and deleting the operation instruction.
In an optional implementation manner of the embodiment of the present invention, referring to fig. 11, the operation module 400 includes a positioning unit 401, where the positioning unit 401 is configured to:
acquiring a root node of a binary tree data structure, determining the root node as an initial node to be processed, repeatedly executing the following steps until a cycle stop condition is met,
a first comparison step, configured to compare the node number of the node to be processed at the current time with a target value, where the target value is a difference between a line number of a target line to be operated in the text data and 1;
a second comparison step of comparing the node number of the left node of the node to be processed at the current time with the target value under the condition that the node number of the node to be processed at the current time is not less than the target value;
a first determination step, configured to determine, when the node number of a left node of the to-be-processed node at the current time is greater than a target value, the left node of the to-be-processed node at the current time as a next-time to-be-processed node;
a second determining step, configured to determine, when the node number of the left node of the to-be-processed node at the current time is smaller than the target value, the right node of the to-be-processed node at the current time as the to-be-processed node at the next time, and update the target value through an update formula, where the update formula is: k_i = k_{i-1} - (node number of the left node of the node to be processed at the next time + 1), where k_i is the updated target value and k_{i-1} is the target value before updating;
and a returning step, configured to determine the node to be processed at the next time as the node to be processed at the current time, and return to the comparison step, where the next time is the next time of the current time, and the cycle stop condition is that the number of nodes of the left node of the node to be processed at the current time is smaller than the target value, where after the cycle is ended, the node to be processed at the current time is determined as a second target node, and the binary tree data structure is operated based on the second target node.
In an alternative implementation of the embodiment of the present invention, referring to fig. 11, the operation module 400 includes a modification unit 402, where the modification unit 402 is configured to:
in the case where the operation instruction is a modify operation instruction,
operating on the binary tree data structure based on the second target node, comprising:
acquiring a target numerical value and a modified character number carried in a modification operation instruction, wherein the modified character number is the number of characters after a target line is modified;
and replacing the node value of the second target node with the modified character number.
In an optional implementation manner of the embodiment of the present invention, referring to fig. 11, the operation module 400 includes an inserting unit 403, where the inserting unit 403 is configured to:
in the case where the operation instruction is an insert operation instruction,
operating on the binary tree data structure based on the second target node, comprising:
acquiring a target numerical value and an inserted character number carried in an insertion operation instruction, wherein the inserted character number is the character number of a text inserted before a target line,
sequentially determining the node value of the first node to be corrected as the node value of the second node to be corrected, and determining the number of the inserted characters as the node value of the second target node;
the second node to be corrected is a node next to the first node to be corrected in the binary tree data structure, the first node to be corrected is a second target node, and nodes located after the second target node in the storage sequence.
In an optional implementation manner of the embodiment of the present invention, referring to fig. 11, the operation module 400 includes a deleting unit 404, where the deleting unit 404 is configured to:
in the case where the operation instruction includes a delete operation instruction,
operating on the binary tree data structure based on the second target node, comprising:
acquiring a target numerical value carried in a deleting operation instruction;
sequentially taking the node value in the third node to be modified as the node value of the fourth node to be modified, and emptying the node value in the last node which executes the storage operation in the binary tree data structure, or deleting the last node which executes the storage operation;
the third node to be modified is the next node after the fourth node to be modified, the fourth node to be modified is the second target node, and the nodes positioned after the second target node in the storage sequence.
In an optional implementation manner of the embodiment of the present invention, the text character number storage device further includes: a summing module to: after determining the number of characters as the node value of the corresponding node,
extracting a third target node, wherein the third target node is any one node in a binary tree data structure;
calculating the sum of the node values of all nodes of the subtree represented by the third target node to obtain the node sum of the third target node, wherein the third target node is the root node of the subtree;
and storing the node sum in correspondence with the third target node to obtain storage information, wherein the storage information is used for positioning characters of the text data.
Further, the summing module may be connected to a summing unit for obtaining the number of characters in the first n lines of the text data, the summing unit being configured to perform steps S81, S82, S83, S84, S85 and S86 in the first embodiment.
Further, the positioning unit 401, the modifying unit 402, the inserting unit 403, the deleting unit 404, and the summing unit in the text character number storage apparatus according to the embodiment of the present invention may be configured with a data transmission interface, and the data receiving port receives a corresponding operation instruction sent by a user, and then outputs data that the user wants to obtain to the user according to data included in the operation instruction.
The computer program product of the text storage method and apparatus provided in the embodiments of the present invention includes a computer-readable storage medium storing a program code, where instructions included in the program code may be used to execute the method described in the foregoing method embodiments, and specific implementation may refer to the method embodiments, and details are not described here.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In addition, in the description of the embodiments of the present invention, unless otherwise explicitly specified or limited, the terms "mounted," "coupled," and "connected" are to be construed broadly, e.g., as meaning a fixed connection, a removable connection, or an integral connection; a mechanical or an electrical connection; a direct connection, an indirect connection through intervening media, or an internal communication between two elements. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to the specific case.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In the description of the present invention, it should be noted that the terms "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner", "outer", etc., indicate orientations or positional relationships based on the orientations or positional relationships shown in the drawings, and are only for convenience of description and simplicity of description, but do not indicate or imply that the device or element being referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus, should not be construed as limiting the present invention.
Furthermore, the terms "first," "second," and "third" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance.
Finally, it should be noted that: the above-mentioned embodiments are only specific embodiments of the present invention, which are used for illustrating the technical solutions of the present invention and not for limiting the same, and the protection scope of the present invention is not limited thereto, although the present invention is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that: any person skilled in the art can modify or easily conceive the technical solutions described in the foregoing embodiments or equivalent substitutes for some technical features within the technical scope of the present disclosure; such modifications, changes or substitutions do not depart from the spirit and scope of the embodiments of the present invention, and they should be construed as being included therein. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A method for storing text data, comprising:
acquiring text data and data to be stored in the text data at the current moment;
reading the number of characters of the data to be stored, and reading the position of the data to be stored in the text data, wherein the position comprises: the number of rows or columns;
searching nodes corresponding to the positions in a pre-constructed binary tree data structure, determining the character numbers as node values of the corresponding nodes, and storing the character numbers, wherein the binary tree data structure specifies the storage sequence of the nodes in advance according to the middle-order traversal sequence of the binary tree;
and the pre-constructed binary tree data structure is a binary tree data structure constructed according to the text data.
2. The method of claim 1, wherein reading the number of characters of the data to be stored comprises:
counting the number of characters at each position in the text data;
sorting the character numbers according to the sequence of the positions to obtain a character number sequence;
and reading the character number of the data to be stored in the text data at the current moment from the character number sequence.
3. The method of claim 1, wherein searching for a node corresponding to the location in a pre-constructed binary tree data structure comprises:
and searching a first target node from the binary tree data structure according to the storage sequence, and taking the first target node as the corresponding node, wherein the storage sequence of the first target node in the binary tree data structure is the same as the sequence position of the position in the text data.
4. The method of claim 1, wherein after storing the number of characters in the corresponding node, the method further comprises:
acquiring an operation instruction sent by a user, executing corresponding operation on the binary tree data structure according to the operation instruction,
wherein the operation instruction at least comprises one of the following: modifying the operation instruction, inserting the operation instruction and deleting the operation instruction.
5. The method of claim 4, wherein performing the corresponding operation on the binary tree data structure according to the operation instruction comprises:
acquiring a root node of the binary tree data structure, determining the root node as an initial node to be processed, repeatedly executing the following steps until a cycle stop condition is met,
a first comparison step, configured to compare the node number of the node to be processed at the current time with a target value, where the target value is a difference between a line number of a target line to be subjected to the operation in the text data and 1;
a second comparison step of comparing the node number of the left node of the node to be processed at the current time with the target value, in the case that the node number of the node to be processed at the current time is not less than the target value;
a first determination step, configured to determine, when the number of nodes of a left node of the to-be-processed node at the current time is greater than the target value, the left node of the to-be-processed node at the current time as a next-time to-be-processed node;
a second determining step, configured to determine, when the node number of a left node of the to-be-processed node at the current time is smaller than the target value, a right node of the to-be-processed node at the current time as a next-time to-be-processed node, and update the target value through an update formula, where the update formula is: k_i = k_{i-1} - (node number of the left node of the node to be processed at the next time + 1), where k_i is the updated target value and k_{i-1} is the target value before updating;
and a returning step, configured to determine a node to be processed at a next time as a node to be processed at a current time, and return to perform the comparing step, where the next time is the next time of the current time, and the cycle stop condition is that the number of nodes of a left node of the node to be processed at the current time is smaller than the target value, where after a cycle is ended, the node to be processed at the current time is determined as a second target node, and the binary tree data structure is operated based on the second target node.
6. The method according to claim 5, wherein, in the case that the operation instruction is a modify operation instruction,
the operating the binary tree data structure based on the second target node includes:
acquiring the target numerical value and a modified character number carried in the modification operation instruction, wherein the modified character number is the number of characters after the target line is modified;
replacing the node value of the second target node with the modified number of characters.
7. The method according to claim 5, wherein, in the case where the operation instruction is an insert operation instruction,
the operating the binary tree data structure based on the second target node includes:
acquiring the target numerical value and the number of inserted characters carried in the insertion operation instruction, wherein the number of inserted characters is the number of characters of a text inserted before the target line,
sequentially determining the node value of a first node to be corrected as the node value of a second node to be corrected, and determining the number of the inserted characters as the node value of the second target node;
the second node to be corrected is a node next to the first node to be corrected in the binary tree data structure, the first node to be corrected is the second target node, and nodes located after the second target node in the storage sequence.
8. The method according to claim 5, wherein, in the case where the operation instruction includes a delete operation instruction,
the operating the binary tree data structure based on the second target node includes:
acquiring the target numerical value carried in the deleting operation instruction;
sequentially taking a node value in a third node to be modified as a node value of a fourth node to be modified, and emptying a node value in a last node which executes storage operation in the binary tree data structure, or deleting the last node which executes storage operation;
the third node to be modified is a node next to the fourth node to be modified, the fourth node to be modified is the second target node, and nodes located in the storage sequence after the second target node.
9. The method of claim 1, wherein after determining the number of characters as the node value of the corresponding node, the method further comprises:
extracting a third target node, wherein the third target node is any one node in the binary tree data structure;
calculating the sum of the node values of all nodes of the subtree represented by the third target node to obtain the node sum of the third target node, wherein the third target node is the root node of the subtree;
and storing the node sum of the subtree represented by the third target node in correspondence with the third target node to obtain storage information, wherein the storage information is used for locating characters of the text data.
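One way the stored subtree sums of claim 9 could be used to locate a character is sketched below. Everything in it is an assumption made for illustration: the `Node` class, the midpoint-based `build` function, and the `locate` function are hypothetical names, and the tree is built so that its in-order traversal matches the order of the lines. Each node stores its own character count (`value`) and the node sum of its subtree (`total`).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    value: int                      # character count of this line
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    total: int = 0                  # node sum of the subtree rooted here


def build(counts: List[int], lo: int = 0, hi: Optional[int] = None) -> Optional[Node]:
    """Build a balanced tree over counts[lo:hi]; in-order order equals line order."""
    if hi is None:
        hi = len(counts)
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    node = Node(counts[mid], build(counts, lo, mid), build(counts, mid + 1, hi))
    node.total = (node.value
                  + (node.left.total if node.left else 0)
                  + (node.right.total if node.right else 0))
    return node


def locate(root: Optional[Node], n: int, offset: int) -> Tuple[int, int]:
    """Map a 0-based character offset to (line index, column) using subtree sums."""
    lo, hi, node = 0, n, root
    while node is not None:
        mid = (lo + hi) // 2        # line index of this node, mirroring build()
        left_total = node.left.total if node.left else 0
        if offset < left_total:     # character lies in an earlier line
            node, hi = node.left, mid
        elif offset < left_total + node.value:
            return mid, offset - left_total
        else:                       # skip the left subtree and this line
            offset -= left_total + node.value
            node, lo = node.right, mid + 1
    raise IndexError("offset beyond end of text")
```

Because each step discards a whole subtree by consulting a single stored sum, the lookup visits one node per tree level instead of adding up line lengths one by one.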
10. A storage device of text data, comprising:
an acquisition module, configured to acquire text data and the data to be stored in the text data at the current moment;
a reading module, configured to read the number of characters of the data to be stored, and read the position of the data to be stored in the text data, where the position includes: the number of rows or columns;
a determining module, configured to search for the node corresponding to the position in a pre-constructed binary tree data structure, and determine the number of characters as the node value of the corresponding node, so as to store the number of characters, where the binary tree data structure specifies the storage order of its nodes in advance according to the in-order traversal order of a binary tree;
wherein the pre-constructed binary tree data structure is a binary tree data structure constructed according to the text data.
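The lookup performed by the determining module can be pictured with a minimal sketch. It assumes, for illustration only, that the pre-constructed binary tree is a complete binary tree held in an array with heap-style indices (root at 0, children of i at 2i+1 and 2i+2), and that a row position is a 0-based index; `inorder_indices` and `LineStore` are hypothetical names not taken from the patent.

```python
def inorder_indices(n):
    """Heap-style indices of an n-node complete binary tree in in-order order."""
    order, stack, i = [], [], 0
    while stack or i < n:
        while i < n:               # descend along left children
            stack.append(i)
            i = 2 * i + 1
        i = stack.pop()
        order.append(i)            # visit the node
        i = 2 * i + 2              # then move to its right child
    return order


class LineStore:
    """Stores per-row character counts in nodes ordered by in-order traversal."""

    def __init__(self, n):
        self.values = [0] * n              # node values, indexed by node
        self.order = inorder_indices(n)    # storage order fixed in advance

    def store(self, row, data):
        # Reading step: count the characters; determining step: find the node
        # for this row in the storage order and set its node value.
        self.values[self.order[row]] = len(data)

    def count(self, row):
        return self.values[self.order[row]]
```

With five nodes the storage order is nodes 3, 1, 4, 0, 2, so row 0 is stored in the leftmost leaf and the middle of the text sits near the root.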
CN201710664232.0A 2017-08-04 2017-08-04 Text data storage method and device Active CN107463676B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710664232.0A CN107463676B (en) 2017-08-04 2017-08-04 Text data storage method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710664232.0A CN107463676B (en) 2017-08-04 2017-08-04 Text data storage method and device

Publications (2)

Publication Number Publication Date
CN107463676A (en) 2017-12-12
CN107463676B (en) 2020-06-30

Family

ID=60548543

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710664232.0A Active CN107463676B (en) 2017-08-04 2017-08-04 Text data storage method and device

Country Status (1)

Country Link
CN (1) CN107463676B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108829872B (en) * 2018-06-22 2021-03-09 武汉轻工大学 Method, device, system and storage medium for rapidly processing lossless compressed file
CN110032366B (en) * 2019-04-19 2022-07-22 北京奇艺世纪科技有限公司 Code positioning method and device

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011221845A (en) * 2010-04-12 2011-11-04 Fujitsu Ltd Text processing apparatus, text processing method and text processing program
CN102831121A (en) * 2011-06-15 2012-12-19 阿里巴巴集团控股有限公司 Method and system for extracting webpage information
CN104750834A (en) * 2015-04-03 2015-07-01 浪潮通信信息系统有限公司 Rule storage method and matching method and device

Also Published As

Publication number Publication date
CN107463676A (en) 2017-12-12

Similar Documents

Publication Publication Date Title
US7882109B2 (en) Computer representation of a data tree structure and the associated encoding/decoding methods
CN110781960B (en) Training method, classification method, device and equipment of video classification model
CN107644070B (en) Data indexing method, data query method and electronic equipment
CN109522271B (en) Batch insertion and deletion method and device for B + tree nodes
CN107463676B (en) Text data storage method and device
CN106651972B (en) Binary image coding and decoding methods and devices
CN112347142B (en) Data processing method and device
CN111984732B (en) Method, node and blockchain network for implementing decentralization search on blockchain
CN111708921B (en) Number selection method, device, equipment and storage medium
CN109254962B (en) Index optimization method and device based on T-tree and storage medium
CN111611788B (en) Data processing method and device, electronic equipment and storage medium
CN112765976A (en) Text similarity calculation method, device and equipment and storage medium
CN106933934B (en) Data table connection method and device
CN106569986B (en) Character string replacing method and device
CN1768480B (en) Encoding device and method, decoding device and method
CN108376054B (en) Processing method and device for indexing identification data
CN113065419B (en) Pattern matching algorithm and system based on flow high-frequency content
KR20140031269A (en) Method and device for determining font
CN115344538A (en) Log processing method, device and equipment and readable storage medium
CN112883718B (en) Spelling error correction method and device based on Chinese character sound-shape similarity and electronic equipment
CN114385624A (en) Encoding method, encoding searching method, device, electronic equipment and storage medium
CN112579839B (en) Multi-mode matching method and device for large-scale features and storage medium
CN105095276B (en) Method and device for mining maximum repetitive sequence
CN110765079B (en) Table information searching method and device
CN113127861A (en) Rule hit detection method and device, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 310052 188 Lianhui street, Xixing street, Binjiang District, Hangzhou, Zhejiang Province

Applicant after: DBAPPSECURITY Ltd.

Address before: 310051 15th Floor, Zhongcai Building, No. 68, Binjiang District, Hangzhou City, Zhejiang Province

Applicant before: DBAPPSECURITY Co.,Ltd.

GR01 Patent grant