CN115878851A - Method and device for editing XML file, electronic equipment and storage medium - Google Patents


Info

Publication number
CN115878851A
Authority
CN
China
Prior art keywords
storage unit
edited
line record
editing
xml file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211524323.1A
Other languages
Chinese (zh)
Inventor
张志毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Uc Mobile China Co ltd
Original Assignee
Uc Mobile China Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Uc Mobile China Co ltd
Priority to CN202211524323.1A
Publication of CN115878851A
Legal status: Pending

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The application discloses a method and a device for editing an XML file, electronic equipment, and a storage medium, and relates to the technical field of computers. The method comprises the following steps: obtaining an original XML file to be edited; parsing the data stream of the original XML file with a parser to obtain the start position and the end position of a target storage unit related to the line record to be edited; sequentially reading the data stream of the original XML file one storage unit at a time, and judging whether each storage unit read is a target storage unit; if so, performing the following operations: writing the information in the target storage unit into a preset editable storage unit; editing the line record to be edited in the editable storage unit according to the editing requirement for that line record, to form an edited storage unit; and writing back the edited storage unit. This addresses the problems that editing an XML file easily causes memory overflow and that editing and saving take a long time.

Description

Method and device for editing XML file, electronic equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for editing an XML file, an electronic device, and a storage medium.
Background
To let people work and study anytime and anywhere, more and more browser apps on mobile terminals support browsing and editing Excel documents. However, because the virtual machine memory available to a browser app on a mobile terminal is limited, quickly editing and saving an Excel document with a very large data volume without memory overflow is very challenging. An Excel document is a combination of several files, and the core files that store its information are XML files.
The prior art usually edits and modifies XML files with the DOM method. This method loads the whole XML file, the core file that stores the information in an Excel document, into memory and then parses it into a DOM tree. Because the tree structure of the whole document stays resident in memory, operations such as deletion, modification, and rearrangement are very convenient. However, the DOM method usually occupies memory 5-10 times the size of the XML file. For a large file, reading everything at once consumes too much memory and easily causes memory overflow. On a mobile terminal, where an Excel file has to be read locally through a browser app, a mobile messaging app, or the like, this memory requirement is hard to meet, and the read very easily fails because of memory overflow.
The prior art also includes VTD-XML, a non-extractive XML parsing method that largely overcomes the excessive memory use of DOM and also offers fast parsing and traversal, XPath support, and incremental updating. VTD-XML reads the original XML file into memory in binary form without decoding it, then parses the position of each element on the binary byte array and records the related information; such a record is called a VTD (Virtual Token Descriptor). Subsequent traversal operates on the saved records. If XML content has to be extracted, the position and other information in the record is used to decode the original byte array and return a character string. Because the complete DOM tree is not stored, the memory occupied falls to 1.3-1.5 times the file size. However, when a mobile terminal app can provide only a few hundred megabytes (MB) of memory, even this level of memory use is unfriendly, and a very large XML file, such as one of hundreds of megabytes (MB) or on the order of gigabytes (GB), can still cause memory overflow and crash the program.
In addition, this method writes back very slowly when a very large Excel file is edited and saved, and the long wait degrades the user experience.
The prior art also includes the SAX parsing approach, which processes XML files in an event-driven rather than document-driven way. Here, event-driven means that the program runs on a callback mechanism: the XML file is parsed layer by layer from the outside in and is read and displayed sequentially, so the complete document information does not have to be held in memory.
Therefore, now that large numbers of Excel files need to be opened through a mobile phone browser or a mobile phone app, there is an urgent need for a method that occupies little memory and can edit Excel files, chiefly the XML files that serve as their core files for storing information.
Disclosure of Invention
In view of this, embodiments of the present application provide a method and an apparatus for editing an XML file, an electronic device, and a readable storage medium. The method provided by the embodiments of the application addresses two problems of the prior art: editing an XML file easily causes memory overflow, and editing and saving take a long time.
The application provides a method for editing an XML file, which comprises the following steps:
obtaining an original XML file to be edited;
parsing the data stream of the original XML file by using a parser to obtain the start position and the end position of a target storage unit related to the line record to be edited;
sequentially reading the data stream of the original XML file by taking a storage unit as a unit, and judging whether the read storage unit is a target storage unit or not according to the starting position and the ending position of the target storage unit; if not, performing write-back operation on the read content; if yes, the following operations are executed:
writing the information in the target storage unit into a preset editable storage unit;
editing the line record to be edited in the editable storage unit according to the editing requirement aiming at the line record to be edited to form an edited storage unit;
and writing back the edited storage unit.
Optionally, parsing the data stream of the original XML file by using a parser to obtain the start position and the end position of the target storage unit related to the line record to be edited includes:
reading the original XML file by using a parser, and encoding the read data into a character string;
reading characters in the character string, and identifying a start mark and an end mark of a line record in the character string;
when the editing operation is to modify the line record, the following operations are performed:
according to the start mark and the end mark of the identified line record, combining with the progress monitoring of a progress monitor, obtaining the start mark and the end mark of the line record to be modified corresponding to the original XML file;
taking the start position of a storage unit where the start mark of the line record to be modified is located as the start position of the target storage unit, and taking the end position of the storage unit where the end mark of the line record to be modified is located as the end position of the target storage unit;
when the editing operation is inserting a line record, the following operations are performed:
according to the identified start mark and end mark of the line record, combining the progress monitoring of a progress monitor to obtain the start mark and end mark of the previous line record corresponding to the line record to be inserted of the original XML file;
and taking the starting position of the storage unit where the starting mark of the previous line record is located as the starting position of the target storage unit, and taking the ending position of the storage unit where the ending mark of the previous line record is located as the ending position of the target storage unit.
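The positioning rule above can be sketched as follows. This is a minimal illustration, not the applicant's implementation: it assumes the line record is delimited by the marks `<row ...>` and `</row>`, that every row has an explicit end mark (self-closing rows are ignored), and that offsets in the decoded string equal byte offsets (true only for single-byte content). The names `find_row_marks`, `target_range`, and the 16-byte storage unit size are hypothetical.

```python
import re

ROW_START = re.compile(r"<row[\s>]")   # assumed start mark of a line record
ROW_END = "</row>"                     # assumed end mark of a line record

def find_row_marks(text):
    """Return [(start, end)] offsets of each line record; `end` is exclusive."""
    marks, pos = [], 0
    while (m := ROW_START.search(text, pos)):
        end = text.index(ROW_END, m.start()) + len(ROW_END)
        marks.append((m.start(), end))
        pos = end
    return marks

def target_range(marks, row_number, operation, block):
    """Block-aligned (start, end) of the storage units covering the record the
    edit depends on: the record itself for "modify", the previous record for
    "insert". `row_number` is 1-based; `end` is exclusive."""
    index = row_number - 1 if operation == "modify" else row_number - 2
    mark_start, mark_end = marks[index]
    start = (mark_start // block) * block   # round down to a unit boundary
    end = -(-mark_end // block) * block     # round up to a unit boundary
    return start, end

TEXT = ('<sheetData><row r="1"><c r="A1"><v>1</v></c></row>'
        '<row r="2"><c r="A2"><v>2</v></c></row></sheetData>')
MARKS = find_row_marks(TEXT)
MODIFY_RANGE = target_range(MARKS, 2, "modify", block=16)
INSERT_RANGE = target_range(MARKS, 2, "insert", block=16)
```

Rounding the start down and the end up means the target range may include bytes that belong to neighbouring records; the later editing step therefore has to locate the record again inside the editable storage unit.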
Optionally, the progress monitoring of the progress monitor includes:
counting the line records in order according to the identified start marks and end marks, incrementing the count for each line record, to obtain the sequence number of each line record;
determining the start mark and the end mark related to the line record to be edited according to the sequence number and the information about the position to be edited in the table, which is provided by the editing operation received through the operation interface; the editing operation includes an operation of modifying a line record and an operation of adding a line record.
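A progress monitor of this kind could look like the sketch below. It is an assumption-laden illustration: the class name `ProgressMonitor`, its methods, and the idea that the interface reports the edit position as a cell reference such as `B2` are all hypothetical, and the offsets fed in are sample values.

```python
import re

class ProgressMonitor:
    """Numbers line records in the order in which their marks are recognized."""

    def __init__(self):
        self.count = 0
        self.marks = {}   # sequence number -> [start offset, end offset]

    def on_row_start(self, offset):
        self.count += 1                       # next sequence number
        self.marks[self.count] = [offset, None]

    def on_row_end(self, offset):
        self.marks[self.count][1] = offset

    def resolve(self, cell_ref, operation):
        """Map an edit position from the interface (assumed form: 'B2') to the
        marks of the line record the edit depends on."""
        row = int(re.sub(r"[A-Za-z]", "", cell_ref))
        if operation == "insert":
            row -= 1          # an insertion is anchored on the previous record
        return self.marks[row]

monitor = ProgressMonitor()
for start, end in [(11, 50), (50, 89), (89, 128)]:   # sample offsets
    monitor.on_row_start(start)
    monitor.on_row_end(end)
```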
Optionally, sequentially reading the data stream of the original XML file by taking a storage unit as a unit, and judging whether the read storage unit is the target storage unit according to the start position and the end position of the target storage unit, includes:
sequentially reading a data stream of an original XML file, and recording size = size + n, wherein n is the size of the actually read data stream, and size is the size of the currently read total data;
judging whether size falls in the range (start position of the target storage unit - storage unit size, start position of the target storage unit); if so, judging that the read storage unit is the target storage unit;
continuing to read the data stream of the original XML file, judging whether the size is larger than the end position of the target storage unit, and if not, judging that the read storage unit is the target storage unit; if yes, the read storage unit is judged not to be the target storage unit.
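The running-total test can be sketched as below. This is one possible reading, not the exact boundary test of the text: the sketch treats the storage unit just read as spanning `[size - n, size)` and marks it as a target when that span lies inside a block-aligned range `[start, end)`. The function name `classify_blocks` and the sample values are hypothetical.

```python
import io

def classify_blocks(stream, start, end, block):
    """Yield (chunk, is_target) for each storage unit read from `stream`.
    `start` and `end` are block-aligned byte positions; `end` is exclusive."""
    size = 0                        # total bytes read so far
    while True:
        chunk = stream.read(block)
        n = len(chunk)              # bytes actually read this time
        if n == 0:
            break
        size += n                   # size = size + n
        yield chunk, (size - n >= start and size <= end)

data = bytes(range(64))
flags = [is_target for _, is_target in
         classify_blocks(io.BytesIO(data), start=16, end=48, block=16)]
```

Only the running total and the two boundary positions are kept in memory, so the cost of the test does not grow with the file size.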
Optionally, editing the line record to be edited in the editable storage unit according to the editing requirement for that line record, to form an edited storage unit, includes, if the editing modifies an existing line record:
performing character string segmentation on the information stored in the editable storage unit;
identifying the line number of the line record to be edited according to the character string, and determining the specific starting position and ending position of the line record to be edited in the character string by combining the starting mark and the ending mark of the line record;
taking out the information between the starting position and the ending position, and converting the information into element objects, wherein the element objects comprise cell objects;
according to the editing requirement, carrying out editing operation on the target cell which needs to be edited specifically;
and converting the edited target cell, together with the element object that contains it, back into a character string, and writing that string back to its original position in the character string in the editable storage unit, to form an edited storage unit.
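The modification steps can be illustrated with the sketch below, which uses Python's standard `xml.etree.ElementTree` as a stand-in for the element objects of the text. The function name `modify_row`, the `r` attributes, and the `<c>`/`<v>` cell layout are assumptions about the worksheet markup; the sketch also assumes the target cell already has a `<v>` child.

```python
import xml.etree.ElementTree as ET

def modify_row(buffer: str, row_number: int, cell_ref: str, value: str) -> str:
    """Edit one cell of one line record inside the editable storage unit."""
    # Locate the record by its number and its start/end marks.
    start = buffer.index(f'<row r="{row_number}"')
    end = buffer.index("</row>", start) + len("</row>")
    # Convert only this record into an element object holding cell objects.
    row = ET.fromstring(buffer[start:end])
    for cell in row.iter("c"):
        if cell.get("r") == cell_ref:
            cell.find("v").text = value      # edit the target cell
    # Serialize the record and put it back at its original position.
    edited = ET.tostring(row, encoding="unicode")
    return buffer[:start] + edited + buffer[end:]

# The buffer is block-aligned, so it may begin and end mid-record.
BUFFER = ('</c></row><row r="2"><c r="A2"><v>2</v></c>'
          '<c r="B2"><v>x</v></c></row><row r="3">')
RESULT = modify_row(BUFFER, 2, "B2", "42")
```

Only the single record is turned into objects; the fragments of neighbouring records at both ends of the buffer are carried through as plain text.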
Optionally, editing the line record to be edited in the editable storage unit according to the editing requirement for that line record, to form an edited storage unit, includes, if the editing inserts a new line record:
performing character string segmentation on the information stored in the editable storage unit;
determining the end position of the previous line record of the line record to be inserted according to the character string and the line number of the line record to be inserted in combination with the end mark of the line record;
and inserting a new serialized character string of the line record from the ending position of the previous line record, and sequentially placing the information originally at the position to the back to form an edited storage unit.
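The insertion case reduces to a string splice, as in the sketch below. The function name `insert_row` and the row markup are hypothetical; the sketch does not renumber the records that follow, a step the passage above does not describe.

```python
def insert_row(buffer: str, prev_row_number: int, new_row_xml: str) -> str:
    """Insert a serialized line record right after the previous record."""
    start = buffer.index(f'<row r="{prev_row_number}"')
    end = buffer.index("</row>", start) + len("</row>")   # end of previous record
    # Everything originally after that position is shifted back unchanged.
    return buffer[:end] + new_row_xml + buffer[end:]

BUFFER = '<row r="1"><c r="A1"><v>1</v></c></row><row r="2"></row>'
RESULT = insert_row(BUFFER, 1, '<row r="new"></row>')
```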
Optionally, the parser performs the parsing by using the SAX parsing method.
The present application further provides an apparatus for editing an XML file, the apparatus comprising:
an acquisition unit, configured to obtain an original XML file to be edited;
a parsing unit, configured to parse the data stream of the original XML file by using a parser and obtain the start position and the end position of a target storage unit related to the line record to be edited;
a judging unit, configured to sequentially read the data stream of the original XML file by using a storage unit as a unit, and judge whether the read storage unit is a target storage unit according to a start position and an end position of the target storage unit;
the writing unit is used for writing the information in the target storage unit into a preset editable storage unit when the judgment result of the judging unit is yes;
the editing unit is used for editing the line record to be edited in the editable storage unit according to the editing requirement aiming at the line record to be edited to form an edited storage unit;
a write-back unit, configured to perform a write-back operation on the read content when the judgment result of the judging unit is no, and to perform a write-back operation on the edited storage unit once it has been formed.
The present application also provides an electronic device, including:
a processor;
a memory;
the memory is used for storing a program of the method for editing the XML file, and the program performs the following operations when being read and executed by the processor:
obtaining an original XML file to be edited;
parsing the data stream of the original XML file by using a parser to obtain the start position and the end position of a target storage unit related to the line record to be edited;
sequentially reading the data stream of the original XML file by taking a storage unit as a unit, and judging whether the read storage unit is a target storage unit or not according to the starting position and the ending position of the target storage unit; if not, performing write-back operation on the read content; if yes, the following operations are executed:
writing the information in the target storage unit into a preset editable storage unit;
editing the line record to be edited in the editable storage unit according to the editing requirement aiming at the line record to be edited to form an edited storage unit;
and writing back the edited storage unit.
The present application further provides a computer-readable storage medium having computer-executable instructions stored therein, which when executed by a processor, perform the following operations:
obtaining an original XML file to be edited;
parsing the data stream of the original XML file by using a parser to obtain the start position and the end position of a target storage unit related to the line record to be edited;
sequentially reading the data stream of the original XML file by taking a storage unit as a unit, and judging whether the read storage unit is a target storage unit or not according to the starting position and the ending position of the target storage unit; if not, performing write-back operation on the read content; if yes, the following operations are executed:
writing the information in the target storage unit into a preset editable storage unit;
editing the line record to be edited in the editable storage unit according to the editing requirement aiming at the line record to be edited to form an edited storage unit;
and writing the edited storage unit back.
According to the technical solution provided by the embodiments of the application, after the original XML file to be edited is obtained, a parser parses the data stream of the original XML file to obtain the start position and the end position of the target storage unit related to the line record to be edited. The data stream of the original XML file is then read sequentially, one storage unit at a time, and each storage unit read is judged to be a target storage unit or not according to those start and end positions. If it is not, it is written back as read. If it is, the information in the target storage unit is written into a preset editable storage unit, the line record to be edited is edited there according to the editing requirement to form an edited storage unit, and the edited storage unit is written back. In other words, when the XML file is edited, only the target storage units related to the line record to be edited are extracted and written into the preset editable storage unit; the line record is edited in the editable storage unit, and the edited storage unit is then put back at its storage position in the XML file. The required target storage units can thus be extracted selectively, without building the tree structure of the XML file and without storing all of its information in memory. The method can therefore edit an XML file without a large amount of memory, saves both memory and time during editing, and addresses the prior-art problems that editing an XML file easily causes memory overflow and that editing and saving take a long time.
Drawings
To illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic structural diagram of an EXCEL file provided in an embodiment of the present application;
FIG. 2 is a flowchart of a method for editing an XML file according to an embodiment of the present application;
FIG. 3 is a schematic diagram of a partially truncated content of an EXCEL file provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of a user editing an XML file provided by an embodiment of the present application;
FIG. 5A is a diagram of an EXCEL form before editing by a user according to an embodiment of the present application;
FIG. 5B is a diagram illustrating an EXCEL form edited by a user according to an embodiment of the present application;
FIG. 6 is a schematic diagram illustrating a user editing an XML file according to an embodiment of the present application;
FIG. 7 is a block diagram of an apparatus for editing XML files provided by an embodiment of the present application;
FIG. 8 is a schematic logical structure diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The embodiment of the application provides a method and a device for editing an XML file, electronic equipment and a storage medium, so that memory and time for editing the XML file can be saved.
In order to enable those skilled in the art to better understand the technical solution of the present application, the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application. This application is capable of many other embodiments than those described below and it is therefore intended that all such other embodiments as may be obtained by those skilled in the art based upon the teachings herein be within the scope of this application without undue experimentation.
It should be noted that the terms "first," "second," "third," and the like in the claims, the description, and the drawings of the present application are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. The data so used is interchangeable under appropriate circumstances such that the embodiments of the application described herein are capable of operation in sequences other than those illustrated or otherwise described herein. Furthermore, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Before describing the claimed solution, the related background of the solution is first introduced.
XML (Extensible Markup Language) is a common data exchange format. It represents data independently of the computer platform, operating system, and programming language, and it is parsed in the same way in different language environments, although the syntax of the parsing code differs. An XML file is a simple way to store data that other software can read. A user can open, edit, and create an XML file in any text editor.
The underlying working data and saved data of an EXCEL file are stored in XML format, so editing and saving an EXCEL file requires operating on data stored in XML format. FIG. 1 is a schematic diagram of the structure of an EXCEL file. As shown in FIG. 1, the document data is stored in sheet.xml files: each worksheet (sheet) has a corresponding sheet.xml file, so a document with several worksheets has several sheet.xml files.
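As a small illustration of this layout, the sketch below lists the worksheet XML parts of a workbook. It assumes the modern `.xlsx` packaging, in which the workbook is a ZIP archive whose worksheets live under `xl/worksheets/`; the stand-in archive built here and the helper name `worksheet_parts` are hypothetical.

```python
import io
import zipfile

def worksheet_parts(xlsx_bytes: bytes):
    """Return the names of the worksheet XML files inside an .xlsx archive."""
    with zipfile.ZipFile(io.BytesIO(xlsx_bytes)) as zf:
        return sorted(name for name in zf.namelist()
                      if name.startswith("xl/worksheets/")
                      and name.endswith(".xml"))

# Build a tiny stand-in archive with two worksheets (not a valid workbook).
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("xl/workbook.xml", "<workbook/>")
    zf.writestr("xl/worksheets/sheet1.xml", "<worksheet/>")
    zf.writestr("xl/worksheets/sheet2.xml", "<worksheet/>")

parts = worksheet_parts(buf.getvalue())
```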
Currently, the XML file is parsed in the following ways: DOM parsing, VTD-XML parsing, SAX parsing.
In the DOM (Document Object Model) parsing mode, the XML file is read into memory to generate a Document object representing the whole file; each tag is then parsed into a corresponding object and a DOM tree is built, so that adding to, deleting from, and modifying the XML file become adding, deleting, and modifying nodes of the DOM tree. Because the tree structure of the whole document is resident in memory, functions such as deletion, modification, and rearrangement are convenient, but reading everything at once consumes a lot of memory and easily causes memory overflow. Given the limited virtual memory of an app on a mobile terminal, an XML file that occupies hundreds or thousands of megabytes easily causes memory overflow.
The VTD (Virtual Token Descriptor) parsing mode is a non-extractive XML parsing method. It largely overcomes the excessive memory use of DOM and provides fast parsing and traversal, XPath (XML Path Language) support, incremental updating, and so on. To avoid extraction, the original XML file is read into memory in binary form without decoding; the position of each element is then parsed on the binary byte array and the related information is recorded in a record called a VTD. Subsequent traversal operates on the stored records; if XML content has to be extracted, the position and other information in the record is used to decode the original byte array and return a character string.
Both of the above parsing methods support modifying the XML file, but each has its own advantages and disadvantages.
The DOM mode loads the whole XML file into memory at once and then builds and parses a DOM tree. This process can consume memory 5-10 times the size of the file. Given the limited memory of a mobile terminal app, such consumption easily causes memory overflow and crashes the app, so the DOM mode cannot be used to modify a large XML file in a mobile terminal app.
The VTD mode supports incremental updating of the XML file, writes modifications back faster than DOM, and occupies less memory than DOM, but it still needs memory 1.3-1.5 times the size of the XML file. For a file of 300 MB or more, the memory left available to the mobile terminal app shrinks accordingly, and memory overflow still cannot be avoided. In addition, VTD has to load the file into memory before parsing, which is time-consuming, so neither its overall speed nor its memory use meets the needs of a mobile terminal app that saves very large EXCEL documents.
Unlike the two methods above, the SAX (Simple API for XML) parsing method has a completely different set of advantages and disadvantages.
The SAX parsing mode is implemented with an event-driven XML API (Application Program Interface), where event-driven refers to a way of running a program that is based on a callback mechanism. When a SAX parser parses a file, it triggers a series of events in scanning order: when it scans the start or end of the document, or the start or end tag of an element, it calls the corresponding handler methods, which carry out the corresponding operations, until the whole document has been scanned. Because the SAX parser parses while it scans, it does not need to load the whole file, which avoids the memory overflow that loading a very large XML file would cause. However, this parsing mode can only read the file content forward in sequence and does not support modifying the file.
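The callback mechanism can be seen in a few lines using Python's standard `xml.sax` module, chosen here only for illustration. The handler name `RowCounter` and the sample markup are hypothetical; the parser calls the handler methods as it scans, and no document tree is built.

```python
import xml.sax

class RowCounter(xml.sax.ContentHandler):
    """Counts <row> elements as the parser streams through the document."""

    def __init__(self):
        super().__init__()
        self.rows = 0
        self.events = []

    def startDocument(self):                 # fired at the document start
        self.events.append("startDocument")

    def startElement(self, name, attrs):     # fired at each element start tag
        if name == "row":
            self.rows += 1

    def endDocument(self):                   # fired once scanning is finished
        self.events.append("endDocument")

SAMPLE = (b'<sheetData><row r="1"><c r="A1"><v>1</v></c></row>'
          b'<row r="2"/></sheetData>')

handler = RowCounter()
xml.sax.parseString(SAMPLE, handler)
```

The handler only observes events in forward order, which matches the limitation noted above: the interface offers no way to go back and change what has already been scanned.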
Therefore, among the existing XML parsing modes, neither DOM nor VTD solves the memory overflow problem or writes back fast enough, so neither is suitable for editing very large EXCEL documents in a mobile terminal app; the SAX parsing mode can only be used for browsing files, not for editing them.
In order to solve the above problems, the present application provides a method and an apparatus for editing an XML file. The method is based on the SAX analysis mode, but solves the problem that the file cannot be edited in the SAX analysis mode, and can avoid the problems of memory overflow, low write-back speed and the like.
The method comprises the following steps: obtaining an original XML file to be edited; parsing the data stream of the original XML file with a parser to obtain the start position and the end position of a target storage unit related to the line record to be edited; sequentially reading the data stream of the original XML file one storage unit at a time, and judging whether each storage unit read is a target storage unit according to the start position and the end position of the target storage unit; if not, performing a write-back operation; if so, writing the information in the target storage unit into a preset editable storage unit, editing the line record to be edited in the editable storage unit according to the editing requirement for that line record to form an edited storage unit, and writing back the edited storage unit. This technical solution addresses the prior-art problems that editing an XML file on a mobile terminal easily causes memory overflow and that editing and saving take a long time. The method for editing an XML file can be applied in a mobile terminal app.
For example, to edit an Excel file stored on a smartphone, a process of reading the file must first be started. This process includes many steps; in line with its core improvement, the present application focuses only on reading the XML file. First, a parser parses the data stream of the original XML file; during parsing, the related information is displayed on the interface, and the specific editing position is obtained from the interface, the editing position being recorded in units of line records. After parsing is finished, reading and writing back proceed together: for each storage unit read, it is determined whether its data relates to the line record to be edited. If so, the read content is edited according to the specific editing operation and written back after processing; if not, the read content is written back directly. Because reading and writing back proceed together, the complete document information does not need to be held in memory, which saves memory.
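The read-while-writing-back flow described above can be sketched as a single streaming copy. This is a minimal sketch under stated assumptions, not the applicant's implementation: `edit_stream`, its parameters, and the `edit` callback are hypothetical names, and `start` and `end` are taken to be storage-unit-aligned byte positions produced by an earlier parsing pass (`end` exclusive).

```python
import io

def edit_stream(src, dst, start, end, edit, block=4096):
    """Copy `src` to `dst` one storage unit at a time, passing only the
    units inside [start, end) through `edit` before writing them back."""
    size = 0
    editable = bytearray()                 # the "editable storage unit"
    while chunk := src.read(block):
        offset, size = size, size + len(chunk)
        if start <= offset and size <= end:
            editable += chunk              # target unit: hold it for editing
            continue
        if editable:                       # just left the target range
            dst.write(edit(bytes(editable)))
            editable.clear()
        dst.write(chunk)                   # non-target unit: write back as read
    if editable:                           # target range ran to end of file
        dst.write(edit(bytes(editable)))

source = io.BytesIO(b"aaaabbbbcccc")
output = io.BytesIO()
edit_stream(source, output, start=4, end=8, edit=bytes.upper, block=4)
```

At any moment the sketch holds one storage unit plus the editable buffer, so peak memory depends on the unit size and the size of the edited range rather than on the size of the file.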
The method, apparatus, electronic device, and computer-readable storage medium described herein are further described in detail with reference to the following specific embodiments and accompanying drawings. It should be understood that the specific embodiments described herein are merely illustrative of the application and do not limit the application.
The following describes a method for editing an XML file according to an embodiment of the present application in detail with reference to FIG. 2. FIG. 2 is a flowchart illustrating a method for editing an XML file according to an embodiment of the present application. It should be noted that the steps shown in the flowchart may be performed in a computer system, for example as a set of computer-executable instructions, and in some cases the steps may be performed in a logical order different from that shown in the flowchart.
As shown in fig. 2, the method for editing an XML file according to the embodiment of the present application includes the following steps:
Step S201, an original XML file to be edited is obtained.
In the introduction of the related background above, it has been introduced that XML-formatted files are used to save data in an EXCEL file, and thus the edit save operation for the EXCEL document is realized by editing the XML file.
In this step, the original XML file to be edited in the EXCEL document is obtained. If a plurality of sheets are contained in one EXCEL document, an original XML file to be edited is obtained among a plurality of XML files storing data.
For example, when a certain Excel file is opened on a certain smart mobile terminal, when a user selects a certain sheet, for example, sheet1, through a touch screen and inputs therein editing contents for a certain cell in a certain line record, at this time, the original XML file to be edited, which needs to be obtained, is the XML file related to the sheet1, for example, the sheet1.XML file in fig. 1.
It should be noted here that although the user can input the modification of the form on the interface through the touch screen, the process is only performed on the surface of the operation interface, and no change is made to the actual content inside the XML file; the touch screen operation is only equivalent to issuing a command for modifying a certain cell of a certain line record, and the modification of the content of the XML file record needs to be implemented according to the scheme provided by the embodiment. As for the principle and process of inputting relevant modification information in the form interface provided by the screen of the smart device, there are many perfect implementation schemes in the prior art, which are not discussed in the present application and are not described herein.
Step S202, using a parser to parse the data stream of the original XML file, and obtaining a start position and an end position of the target storage unit related to the record of the line to be edited.
That is, this step parses the data stream of the original XML file with a parser in order to locate the storage units that hold the line record to be edited.
The XML file is generally stored on a hard disk or another nonvolatile storage device. It may be read into memory as an input stream (InputStream), using a Buffer of fixed capacity as the storage unit. During this process the meaning of the information in memory is analyzed: specifically, the binary data is converted into a character string, and the character string is parsed according to the rules of the XML format.
In this step, the parser parses in an event-driven manner based on the SAX parser, and can parse the start tag and the end tag of the line record contained therein.
Fig. 3 shows an excerpt of the content of an XML file. In this schematic diagram, in addition to the parsed data, each line record has a start mark "<Row" and an end mark "</Row>", and each cell has a start mark "<Cell" and an end mark "</Cell>".
Referring to fig. 4, the first row shows the XML document contents as stored sequentially, the second row shows the XML document contents forming a data input stream arranged in sequence, and the third row shows the storage units (Buffers) that serve as information containers for the document data. In essence, the contents of the XML file are read in sequentially through an InputStream and held in fixed-capacity storage units in memory. Because the parser of this embodiment is a SAX parser, the XML file content is read into memory sequentially through several Buffers and is parsed while it is being read; parsing here is roughly equivalent to understanding the meaning of the information. A storage unit (Buffer) can be understood as a notional container for the input stream of file data, that is, a cache unit used to hold data temporarily in memory. Reading and parsing proceed sequentially from front to back. During this process, the onStart() method and the onEnd() method of the Element callback handler are used to obtain the start mark and the end mark of each line record, respectively, and the current reading position in the XML file data stream is obtained through a progress monitor (a progress-monitoring input stream).
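The event-driven scan described above can be sketched as follows. This is a minimal illustration in Python rather than the embodiment's own code: it feeds fixed-size buffers to the standard-library incremental SAX parser and notes which buffer was being fed when each row's start and end marks were reported. The element name `row`, the attribute `r`, and the names `RowMarkHandler` and `scan_rows` are assumptions made for the example. The buffer attribution is only approximate, because the underlying parser may report a mark later than the buffer in which it physically lies.

```python
import io
import xml.sax

BUFFER_SIZE = 8096  # example storage-unit size used in the text


class RowMarkHandler(xml.sax.ContentHandler):
    """Notes which buffer was being fed when each row mark was reported."""

    def __init__(self):
        super().__init__()
        self.current_buffer = 0  # 1-based number of the buffer fed most recently
        self.rows = []           # entries: [row_number, start_buffer, end_buffer]

    def startElement(self, name, attrs):
        # Assumption: line records are <row r="N"> elements.
        if name == "row":
            self.rows.append([attrs.get("r"), self.current_buffer, None])

    def endElement(self, name):
        if name == "row":
            self.rows[-1][2] = self.current_buffer


def scan_rows(stream, buffer_size=BUFFER_SIZE):
    """Read the stream buffer by buffer and parse while reading."""
    handler = RowMarkHandler()
    parser = xml.sax.make_parser()
    parser.setContentHandler(handler)
    while True:
        chunk = stream.read(buffer_size)
        if not chunk:
            break
        handler.current_buffer += 1  # plays the role of the progress monitor
        parser.feed(chunk)
    parser.close()
    return handler.rows


if __name__ == "__main__":
    sample = b'<sheetData><row r="1"><c r="A1"><v>1</v></c></row></sheetData>'
    print(scan_rows(io.BytesIO(sample), buffer_size=16))
```

Only the buffer numbers are kept, not the row contents, so memory use stays independent of the file size.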
For example, Row1 in fig. 4 denotes the first line record. Suppose the start position of this line record falls in Buffer2. Once the start mark is obtained, it is determined to lie in Buffer2, so the target storage unit begins with Buffer2, and the start position (startOffset) of the target storage unit is recorded as 8096 × 2, which is the end position of Buffer2; subtracting 8096 from this value gives the position after which the target storage unit begins. If the end mark of Row1 falls in Buffer3, the end position (endOffset) of the target storage unit is recorded as 8096 × 3, which is the end position of Buffer3. By analogy, for each line record, the end position of the Buffer containing its start mark is recorded as startOffset and the end position of the Buffer containing its end mark is recorded as endOffset, and the target storage unit can be expressed as (startOffset − 8096, endOffset]. The line records are arranged in ascending order of line sequence number.
The plurality of cache units temporarily store the analyzed data in the storage unit in sequence according to the line recording sequence. In this step, an 8096 byte buffer is used as an example of the memory unit, and memory units with other sizes may also be used.
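The offset bookkeeping above reduces to two multiplications and one subtraction. A minimal sketch, assuming 1-based buffer numbers as in fig. 4; the function name `target_range` is hypothetical:

```python
BUFFER_SIZE = 8096  # example storage-unit size; other sizes work the same way


def target_range(start_buffer, end_buffer, buffer_size=BUFFER_SIZE):
    """Return (startOffset - buffer_size, endOffset) for a line record.

    start_buffer: 1-based number of the buffer holding the start mark
    end_buffer:   1-based number of the buffer holding the end mark
    The range is read as half-open: (lower, upper].
    """
    start_offset = buffer_size * start_buffer  # end position of the first buffer
    end_offset = buffer_size * end_buffer      # end position of the last buffer
    return (start_offset - buffer_size, end_offset)
```

For Row1 (Buffer2 to Buffer3) this gives `(8096, 24288)`, and for a record spanning Buffer4 to Buffer6 it gives `(24288, 48576)`.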
The parsing process may be performed for each line record when the XML file is read into the memory, or may be performed for a line record to be edited according to the editing requirement.
According to the above process, in combination with the requirements for different edits to the XML, in this step, a parser is used to parse the data stream of the original XML file to obtain the start position and the end position of the target storage unit related to the record of the line to be edited, which may specifically include the following steps:
reading the original XML file using a parser (in this embodiment, specifically, a SAX parser), and encoding the read data into a character string;
reading characters in the character string, and identifying a start mark and an end mark of a line record in the character string;
when the editing operation is to modify the line record, the following operations are performed:
according to the start mark and the end mark of the identified line record, combining with the progress monitoring of a progress monitor, obtaining the start mark and the end mark of the line record to be modified corresponding to the original XML file;
and taking the start position of the storage unit where the start mark of the line record to be modified is located as the start position of the target storage unit, and taking the end position of the storage unit where the end mark of the line record to be modified is located as the end position of the target storage unit.
When the editing operation is inserting line recording, the following operations are executed:
according to the identified start mark and end mark of the line record, combining the progress monitoring of a progress monitor to obtain the start mark and end mark of the previous line record corresponding to the line record to be inserted of the original XML file;
and taking the starting position of the storage unit where the starting mark of the previous line record is located as the starting position of the target storage unit, and taking the ending position of the storage unit where the ending mark of the previous line record is located as the ending position of the target storage unit.
In the above steps, for the modify operation and the insert operation, the target storage unit needs to be selected in different ways to meet the requirements of the corresponding operations.
The progress monitoring of the progress monitor includes:
according to the identified start mark and end mark of the line record, sequentially arranging and progressively increasing the count of the line record to obtain the sequence number of each line record;
and determining a start mark and an end mark related to the row record to be edited according to the sequence number and information of the position to be edited in the table provided by the editing operation received by the operation interface. The editing operation comprises a line record modifying operation and a line record adding operation.
Specifically, an example is given below to explain how the start position and the end position of the target storage unit associated with the line record to be edited are acquired. Fig. 4 is a schematic diagram of editing an XML file according to an embodiment of the present application. As shown in fig. 4, the first line is the list of XML file line records parsed by SAX, with the line records arranged in ascending order of line sequence number; the second line is the XML file content; and the third line is a plurality of storage units (Buffers), each 8096 bytes in size in this scheme. Using the identified start marks and end marks of the line records together with the progress monitoring of the progress monitor, it is found that the start mark of Row1 falls in Buffer2, so the end position of the first storage unit (startOffset, in bytes) is recorded as 8096 × 2, and subtracting 8096 from this value gives the position after which the target storage unit of Row1 begins. If the end mark of Row1 falls in Buffer3, the end position of the last storage unit (endOffset) is recorded as 8096 × 3. Combining the two, the target storage unit of Row1 is Buffer2 to Buffer3, and its byte range can be expressed as (8096 × 2 − 8096, 8096 × 3]; that is, the data of Row1 is stored in Buffer2 to Buffer3. Similarly, the start mark of Row2 falls in Buffer4, so startOffset for Row2 is 8096 × 4, and subtracting 8096 gives the position after which the target storage unit of Row2 begins; the end mark of Row2 falls in Buffer6, so endOffset for Row2 is 8096 × 6. The target storage unit of Row2 is therefore Buffer4 to Buffer6, with byte range (8096 × 4 − 8096, 8096 × 6]. In general, for the N-th line record, RowN, the start mark and the end mark are identified in combination with the progress monitoring of the progress monitor: if the start mark of RowN falls in Buffer m, startOffset is recorded as 8096 × m, and subtracting 8096 gives the position after which the target storage unit of RowN begins; if the end mark of RowN falls in Buffer n, endOffset is recorded as 8096 × n. The target storage unit of RowN is then Buffer m to Buffer n, with byte range (8096 × m − 8096, 8096 × n].
In this step, a storage unit (i.e., a cache unit of a memory) corresponding to any line record may be obtained, and more specifically, a start byte position and an end byte position of any line record storage unit may be obtained. For a line record needing to be modified, the storage unit(s) where the line record is located can be taken as a target storage unit; for an insert row record, the memory cell of the row record preceding the row record to be inserted may be obtained and taken as the target memory cell; i.e. the meaning of the target memory location differs between the two editing operations.
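The difference between the two editing operations can be illustrated with a small lookup. This is a sketch under stated assumptions: `row_ranges` is a hypothetical dictionary from line sequence number to the byte range obtained during parsing, and "previous line record" is taken to mean the existing record with the largest sequence number below the one being inserted.

```python
def select_target(row_ranges, operation, row_number):
    """Pick the byte range of the target storage unit for an edit.

    row_ranges: {line sequence number: (lower, upper)} gathered while parsing
    operation:  "modify" or "insert"
    """
    if operation == "modify":
        # The target is the storage unit(s) holding the record itself.
        return row_ranges[row_number]
    if operation == "insert":
        # The target is the storage unit(s) holding the preceding record.
        earlier = [n for n in row_ranges if n < row_number]
        if not earlier:
            raise ValueError("no line record precedes the insertion point")
        return row_ranges[max(earlier)]
    raise ValueError("unsupported operation: %r" % (operation,))
```

With the ranges from the Row1 and Row2 example, modifying Row1 selects `(8096, 24288)`, while inserting a new Row3 selects the range of Row2, `(24288, 48576)`.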
For a user operating an EXCEL document, the behavior of editing the EXCEL by the user is to operate the cells, and the existing cell data may be modified, some cell data in a certain line may be added or deleted, and a certain line may be added or deleted. The delete cell is equal to the edit cell, and the addition and the edit are taken as examples in the following description. Namely, the editing operation comprises a line record modification operation and a line record newly-added operation. As shown in fig. 5A and 5B, schematic diagrams of the user's editing operation on EXCEL are shown. Fig. 5A is an EXCEL table schematic diagram before user editing, and fig. 5B is an EXCEL table schematic diagram after user editing.
The user operations performed by the user for the EXCEL sheet form include:
editing an existing cell A1 in Row1, and modifying the data of the cell A1 from 1 to 12;
adding a cell B2 in Row2, wherein the data of the newly added cell B2 is 22;
adding Row3 and a new cell A3, wherein the data of the new cell A3 is 3;
deleting the existing cell A4 in Row 4;
deleting the cell A10 in Row10, so that the row itself is also deleted.
In the step of analyzing the data stream of the original XML file and acquiring the starting position and the ending position of the target storage unit related to the line record to be edited, when the editing operation is to modify the line record, the following operations are executed: according to the identified start mark and end mark of the line record, combining the progress monitoring of a progress monitor to obtain the start mark and end mark of the line record to be modified corresponding to the original XML file; and taking the start position of the storage unit where the start mark of the line record to be modified is located as the start position of the target storage unit, and taking the end position of the storage unit where the end mark of the line record to be modified is located as the end position of the target storage unit.
For Row1 in the above example, reading a start mark of the line record to be edited, acquiring a storage unit where the start mark is located, and taking the storage unit as a starting unit of a target storage unit; reading an end mark of the line record to be modified, acquiring a storage unit where the end mark is positioned, and taking the storage unit as an end unit of a target storage unit; and further according to the aforementioned formula, the specific byte range corresponding to the target storage unit can be calculated.
Another editing operation is to insert a new line record, specifically, in the step of parsing the data stream of the original XML file to obtain the start position and the end position of the target storage unit related to the line record to be edited, when the editing is to insert a new line record, the method includes:
according to the identified start mark and end mark of the line record, combining the progress monitoring of a progress monitor to obtain the start mark and end mark of the previous line record corresponding to the line record to be inserted of the original XML file;
and taking the start position of the storage unit where the start mark of the previous line record is located as the start position of the target storage unit, and taking the end position of the storage unit where the end mark of the previous line record is located as the end position of the target storage unit.
For the new Row3 in the above example, the Row number of the new Row record is determined to be Row3, according to Row3, the previous Row record Row2 is determined to be the leading Row record, the start mark of the leading Row record Row2 is read, and the memory cell where the start mark is located is obtained to be the starting cell of the target memory cell; reading an end mark of the leading Row record Row2, acquiring a storage unit where the end mark is positioned, and taking the storage unit as an end unit of a target storage unit; further, the specific byte range corresponding to the target storage unit can be calculated according to the foregoing formula.
Note that the data of a line record does not necessarily begin at the start position of a Buffer. For example, configuration data related to fonts, font sizes and the like may occupy Buffer space before Row1. If, when the input data stream is read, the first data of Row1 is reached at 10000 bytes, then, because one Buffer holds 8096 bytes, the first byte of Row1 lies at byte 1905 of Buffer2 (10000 − 8096 + 1 = 1905). Consequently, the data of one line record may span several Buffers, and one Buffer may hold the data of several line records.
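The arithmetic in this example can be checked with integer division. The sketch below assumes one particular reading of the example, namely that 10000 bytes precede the first byte of Row1, and uses 1-based buffer numbers and byte positions; the function name `locate_first_byte` is hypothetical.

```python
def locate_first_byte(preceding_bytes, buffer_size=8096):
    """Return (buffer_number, byte_in_buffer), both 1-based, for the byte
    that follows `preceding_bytes` bytes of earlier data in the stream."""
    full_buffers, remainder = divmod(preceding_bytes, buffer_size)
    return (full_buffers + 1, remainder + 1)
```

Under that reading, `locate_first_byte(10000)` yields `(2, 1905)`, matching 10000 − 8096 + 1 = 1905.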
The method is used for obtaining the starting position and the ending position of the line record to be edited in the original XML file by utilizing SAX analysis and combining with a progress monitor, and is convenient for processing the line record to be edited subsequently so as to finish the editing work of the original XML file.
Step S203, sequentially reading the data stream of the original XML file by taking a storage unit as a unit, and judging whether the read storage unit is a target storage unit or not according to the starting position and the ending position of the target storage unit; if not, performing a write-back operation on the read content, and performing step S206-1; if yes, go to step S204.
In step S202, the start position and the end position of the target storage unit associated with the line record to be edited are obtained, which is used to determine the line record in the original XML file data stream that has not undergone editing operation and the line record that has undergone editing operation, and if it is determined that the read content is not associated with the target storage unit, step S206-1 is performed, i.e., a write-back operation is performed on the content read by the storage unit, and the line record that has not undergone editing operation is written back to the file in units of storage units. If the read content is determined to be related to the target storage unit, go to step S204.
In the previous step, obtaining a start position and an end position of a target storage unit related to a line record to be edited, in this step, sequentially reading the data stream of the original XML file in units of storage units, and determining whether the read storage unit is the target storage unit according to the start position and the end position of the target storage unit includes:
sequentially reading the data stream of the original XML file, and recording size = size + n, wherein n is the size of the actually read data stream, and the size is the size of the total data currently read;
judging whether size falls within the range (start position of the target storage unit − storage unit size, start position of the target storage unit]; if so, judging that the read storage unit is the target storage unit;
continuing to read the data stream of the original XML file, judging whether the size is larger than the end position of the target storage unit, and if not, judging that the read storage unit is the target storage unit; if yes, the read storage unit is judged not to be the target storage unit.
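The judgement in the steps above can be sketched as a running total compared against the recorded offsets. This is an illustration only: it assumes the test is made after each storage unit has been read in full, and the name `classify_buffers` is hypothetical.

```python
def classify_buffers(read_sizes, start_offset, end_offset, buffer_size=8096):
    """For each read of n bytes, update size = size + n and report whether
    the storage unit just read belongs to the target storage unit.

    start_offset / end_offset are the recorded end positions of the buffers
    holding the start mark and the end mark of the line record.
    """
    size = 0
    flags = []
    for n in read_sizes:
        size += n  # total amount of data read so far
        flags.append(start_offset - buffer_size < size <= end_offset)
    return flags
```

For a record stored in Buffer2 to Buffer3, `classify_buffers([8096] * 4, 8096 * 2, 8096 * 3)` marks only the second and third reads as belonging to the target storage unit.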
During actual editing, information is read from the file through a Buffer, and the amount of information read is tallied as reading proceeds so that the current read position can be determined from it. If a storage unit that has been read does not contain a line record that needs editing, that is, it is not a target storage unit, it is written back directly to the Excel file through a file output stream (OutputStream); if the read storage unit is judged to be a target storage unit, the operation of step S204 is performed.
The write-back to the Excel file can be actually performed in a new storage space (a storage space of a hard disk or other nonvolatile storage media) additionally opened, and after the write-back is completed, the original XML file is replaced by the edited XML file in the new storage space. The following write-back is also synonymous and will not be described further.
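One common way to realize "write to a new storage space, then replace the original" is a temporary file followed by a rename. The sketch below is an assumption about how this could be done, not the embodiment's implementation; `replace_with_edited` and the `write_edited` callback are hypothetical names.

```python
import os
import tempfile


def replace_with_edited(original_path, write_edited):
    """Write the edited content to a new file next to the original, then
    swap it in. `write_edited(src, dst)` is a caller-supplied function that
    copies src to dst while applying the edits."""
    directory = os.path.dirname(os.path.abspath(original_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as dst, open(original_path, "rb") as src:
            write_edited(src, dst)
        # Replace the original only after the write-back has completed.
        os.replace(tmp_path, original_path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

Creating the temporary file in the same directory keeps the final replacement on one file system, so the original stays intact if the write-back fails part-way.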
Because the file is read sequentially through Buffers and no structure describing the file is kept, the read position has to be determined from the amount of information read so far. Since the position of the target storage unit is already known from the previous step, counting the amount of information read reveals which of the information read belongs to the target storage unit.
The following describes a specific processing procedure by taking the determination procedure of Row1 in fig. 6 as an example. After the start position and the end position of a target storage unit related to a line record to be edited are obtained, the data stream of an original XML file is sequentially read by taking the storage unit as a unit, that is, data is read from the start of the original XML file, the reading process and the process of writing back the file are continuously performed, the size of the total data read is recorded as size, and the data is read by taking the data capacity which can be accommodated by the storage unit as a unit.
For example, in the actual reading process, reading proceeds sequentially in units of storage units. From the foregoing, the storage units occupied by Row1 are Buffer2 to Buffer3, so the target storage unit is Buffer2 to Buffer3 and the bytes occupied by the data lie in the interval (8096 × 2 − 8096, 8096 × 3]. When 8096 bytes have been read, Buffer1 is full and size = 8096. Since size = 8096 is not in the interval (8096 × 2 − 8096, 8096 × 3], the storage unit just read is not a target storage unit, and its data is written back to the EXCEL document through the file output stream (OutputStream).
Next, reading continues with the data of storage unit Buffer2. Suppose that after n = 1903 bytes the data preceding Row1 comes to an end; the amount actually read is n = 1903, so size = 8096 + 1903 = 9999. Reading then continues, and the start mark of Row1 is read at size = 8096 + 1904 = 10000. It is judged whether size = 10000 lies in the interval (8096 × 2 − 8096, 8096 × 3]; since it does, the storage unit being read is judged to be a target storage unit, that is, Buffer2 is a target storage unit.
Assuming instead that the amount of data n read each time is a fixed value equal to one storage unit, that is, 8096 bytes, then once all the data stored in Buffer2 has been read, size = 8096 + 8096 = 16192. This value lies inside the interval (8096 × 2 − 8096, 8096 × 3], which means that the data of the line record Row1 to be edited has not yet been read completely.
Reading continues with the next Buffer, Buffer3. Suppose the end mark of the Row1 line record is read when n = 3808; the data of Row1 has then been read completely, and size = size + n = 16192 + 3808 = 20000. Because the end mark of Row1 was previously located in Buffer3, the recorded end position of the target storage unit is 8096 × 3, that is, size = 24288, and size = 20000 lies within the byte range (8096 × 2 − 8096, 8096 × 3] of the target storage unit.
That is, in the process, it is determined that the storage unit Buffer1 is a non-target storage unit, and the data in the Buffer1 is written back to the EXCEL document. Meanwhile, judging that the obtained storage units Buffer2-Buffer3 are target storage units, obtaining specific starting positions and ending positions of Row1 data, and writing information of the target storage units into preset editable storage units in the following steps.
Through the steps, the original XML file is sequentially read by taking the storage unit as a unit, the read information of the non-target storage unit is written back to the EXCEL document, and meanwhile, the information of the target storage unit is also acquired.
Step S204, judging that the read storage unit is a target storage unit, and writing information in the target storage unit into a preset editable storage unit;
step S205, in the editable storage unit, according to the editing requirement aiming at the line record to be edited, editing the line record to be edited to form an edited storage unit;
and step S206-2, writing back the edited storage unit.
The steps S204, S205, and S206-2 are configured to, after determining that the read storage unit is a target storage unit, write information in the target storage unit into a preset editable storage unit, perform editing processing on a to-be-edited line record in the editable storage unit according to an editing requirement for the to-be-edited line record to form an edited storage unit, and finally write back the edited storage unit. The steps are key links for realizing sequential reading and simultaneously performing editing operation according to the technical scheme of the application.
Step S204 is configured to write the information related to the target storage unit into the preset editable storage unit once the read storage unit has been judged to be a target storage unit. The preset editable storage unit is constructed for this purpose and may be an editable information constructor (StringBuilder); that is, the target storage unit data of the line record to be edited is assembled in the editable information constructor.
As shown in fig. 6, the fourth line is all the line records to be edited, and the fifth line is an editable information constructor in which the first line record Row1 to be edited is constructed. Specifically, the editable information constructor includes data information in Buffer2 and Buffer 3.
In step S205, after writing the information in the target storage unit into the preset editable storage unit, the editable storage unit edits the record of the line to be edited according to the editing requirement for the record of the line to be edited, where when the editing is to modify the existing record of the line, the method includes:
performing character string segmentation on the information stored in the editable storage unit; identifying the line number of the line record to be edited according to the character string, and determining the specific starting position and ending position of the line record to be edited in the character string by combining the starting mark and the ending mark of the line record; taking out the information between the starting position and the ending position, and converting the information into element objects, wherein the element objects comprise cell objects; according to the editing requirement, carrying out editing operation on a specific target cell needing to be edited; and converting the target table cells after the editing operation and the contained element objects into character strings, and writing back the character strings to the original sequence positions in the character strings in the editable storage unit to form an edited storage unit.
When the editing of the line record to be edited is a modification of an existing line record, the information held in the editable storage unit, that is, the Buffer contents collected in the editable information constructor (StringBuilder), is segmented as a character string: the string is split using the start string "<row" and the end string "</row>" or "/>" (when a line record has no c tag identifying a cell, the end of the line is identified by the "/>" string). The line number of the line record to be edited is then identified from the character string, and the specific start position and end position of the line record within the character string are determined from its start mark and end mark. The character string between that start position and end position is extracted with substring and converted into element objects, which include cell objects. Next, according to the required modification of the existing line record, the corresponding cell is edited or newly added for the target cell. Finally, the edited target cell and the element objects containing it are converted back into a character string and written back to the original position in the character string held in the editable storage unit, forming the edited storage unit.
For example, in fig. 6 the target storage units of Row1 are Buffer2 and Buffer3, and the data of Buffer2 and Buffer3 is stored as a character string in the editable information constructor. According to the line number of the line record Row1 to be edited, and using the start mark and end mark of the line record, the specific start position and end position of Row1 within the character string are determined; the data of Row1 extends from byte 1905 of Buffer2 to byte 3808 of Buffer3. The information between the start position and the end position is taken out and converted into element objects; then, according to the user's modification of Row1, the modified cell element objects are converted into a character string and written back to the original position in the character string, yielding the edited editable information.
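The modify path (locate the row in the string, convert it to an element object, edit the cell, serialize, splice back) might look like the sketch below. It is an illustration, not the embodiment's code: it assumes rows are written as `<row r="N">...</row>` with `r` as the first attribute, cells as `<c r="A1">`, and values in a `<v>` child; these markup details and the name `modify_cell` are assumptions. Self-closing rows are not handled.

```python
import xml.etree.ElementTree as ET


def modify_cell(editable, row_number, cell_ref, new_value):
    """Edit (or add) one cell of one row inside the editable string."""
    start = editable.find('<row r="%s"' % row_number)
    if start == -1:
        raise ValueError("row %s not found" % row_number)
    end_mark = "</row>"
    end = editable.find(end_mark, start)
    if end == -1:
        raise ValueError("end mark of row %s not found" % row_number)
    end += len(end_mark)

    # Convert only this row's substring into an element object.
    row = ET.fromstring(editable[start:end])
    for cell in row.iter("c"):
        if cell.get("r") == cell_ref:
            value = cell.find("v")
            if value is None:
                value = ET.SubElement(cell, "v")
            value.text = str(new_value)
            break
    else:
        # The cell does not exist yet: add it to the row.
        new_cell = ET.SubElement(row, "c", {"r": cell_ref})
        ET.SubElement(new_cell, "v").text = str(new_value)

    # Serialize and write back at the original position in the string.
    return editable[:start] + ET.tostring(row, encoding="unicode") + editable[end:]
```

Only the substring of the one row is turned into objects; the rest of the editable string is carried over untouched.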
In step S205, after writing the information in the target storage unit into the preset editable storage unit, in the editable storage unit, according to the editing requirement for the line record to be edited, editing the line record to be edited, where when the editing is inserting a new line record, the method includes: performing character string segmentation on the information stored in the editable storage unit; determining the end position of the previous line record of the line record to be inserted according to the character string and the line number of the line record to be inserted in combination with the end mark of the line record; and inserting a new serialized character string of the line record from the end position of the previous line record, and sequentially placing the information originally at the position behind to form an edited storage unit.
When the editing of the line record to be edited is the insertion of a new line record, the information held in the editable storage unit, that is, the Buffer contents collected in the editable information constructor (StringBuilder), is segmented as a character string: the string is split using the start string "<row" and the end string "</row>" or "/>" (when a line record has no c tag identifying a cell, the end of the line is identified by the "/>" string). The end position of the line record preceding the line record to be inserted is determined from the line number of the line record to be inserted and the end mark of the line record. The serialized character string of the new line record is inserted starting at that end position; this serialized string is formed by converting the new line record data into an element object and then converting the element object into a character string. The information originally at that position is shifted back in sequence, forming the edited storage unit.
For example, in fig. 6, row3 is a newly added line record, and after the end position of the line record Row2 immediately preceding the line record Row3 to be inserted is determined by knowing that the memory locations occupied by Row2 are buffers 4 to 6, the serialized character string of Row3 is inserted thereafter. The information originally at this position is placed in the following order to form an edited storage unit.
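The insertion described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the application: a plain string stands in for the StringBuilder, the names `row_end_offsets` and `insert_row` are hypothetical, rows are counted relative to the buffer (1-based), and attribute values are assumed not to contain the ">" character.

```python
import re

# Start tag of a row; group 1 is "/>" for a row without <c> cells, ">" otherwise.
ROW_START = re.compile(r"<row\b[^>]*?(/>|>)")
ROW_CLOSE = "</row>"


def row_end_offsets(buffer):
    """Return, for each row in buffer, the offset just past its end mark."""
    ends = []
    pos = 0
    while True:
        match = ROW_START.search(buffer, pos)
        if match is None:
            break
        if match.group(1) == "/>":
            end = match.end()  # empty row: the start tag is also the end mark
        else:
            end = buffer.index(ROW_CLOSE, match.end()) + len(ROW_CLOSE)
        ends.append(end)
        pos = end
    return ends


def insert_row(buffer, new_row, after_row):
    """Insert the serialized new_row after the after_row-th row of buffer."""
    end = row_end_offsets(buffer)[after_row - 1]
    # Everything originally at this position is shifted behind the new row.
    return buffer[:end] + new_row + buffer[end:]
```

Calling `insert_row(buffer, serialized_row3, 2)` places the new row directly after the second row found in the buffer, which mirrors the Row2/Row3 case of fig. 6.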
No matter whether the editing is to modify the existing line record or insert a new line record, after the line record to be edited is edited to form an edited storage unit, step S206-2 is executed, that is, the edited storage unit is written back.
Step S206-2 is for writing back the data of the post-editing storage unit edited in step S205 into the EXCEL document.
The method for editing the XML file reads and writes back at the same time. The data stream of the original XML file is read sequentially, one storage unit at a time. During reading, each storage unit is checked against the obtained start position and end position of the target storage unit related to the line record to be edited. If the storage unit is not a target storage unit, its data is not processed and is written back directly to the EXCEL document, unit by unit. If it is a target storage unit, its information is written into the preset editable storage unit and edited there, and the data of the edited storage unit is then written back into the EXCEL document.
While line records to be edited remain unprocessed in the file, data continues to be read sequentially from the input file data stream. The data in the target storage units is written into the preset editable information constructor (StringBuilder), that is, the data of several storage units is edited together and the edited storage units are written back; non-target storage units are written back directly into the EXCEL document without editing. The whole process runs as sequential reading and writing: information is edited in the editable information constructor and written back, and reading and writing then continue with the following storage units. The data of the XML file is thus processed in order from front to back, and the editing effect is achieved during this sequential pass, until the whole data stream of the XML file has been processed.
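The sequential read-and-write-back pass can be sketched as follows. This is an illustrative sketch only: the function name `stream_edit`, its parameters, and the default unit size are assumptions, a `bytearray` stands in for the editable storage unit, and `edit` is any caller-supplied function that takes the collected target bytes and returns the edited bytes.

```python
import io


def stream_edit(src, dst, start, end, edit, unit_size=4096):
    """Copy src to dst one storage unit at a time, editing only bytes in [start, end)."""
    editable = bytearray()   # stands in for the preset editable storage unit
    offset = 0               # total number of bytes read so far
    while True:
        chunk = src.read(unit_size)
        if not chunk:
            break
        chunk_start, offset = offset, offset + len(chunk)
        if offset <= start or chunk_start >= end:
            dst.write(chunk)                  # non-target unit: written back untouched
            continue
        editable.extend(chunk)                # target unit: collected for editing
        if offset >= end:                     # last target unit has been read
            dst.write(edit(bytes(editable)))  # write back the edited storage unit
            editable.clear()
    if editable:                              # stream ended inside the target range
        dst.write(edit(bytes(editable)))


# Usage: upper-case the bytes of the two middle units, copy the rest unchanged.
source = io.BytesIO(b"aaaabbbbccccdddd")
target = io.BytesIO()
stream_edit(source, target, 4, 12, bytes.upper, unit_size=4)
```

Only the target units are ever held in memory at once, which is the property the method relies on to avoid loading the whole file.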
In summary, the method obtains an original XML file to be edited and parses its data stream with a parser to obtain the start position and end position of the target storage unit related to the line record to be edited. The data stream is then read sequentially, one storage unit at a time, and each storage unit read is checked against those positions. If it is not a target storage unit, the read content is written back. If it is, the information in the target storage unit is written into a preset editable storage unit, the line record to be edited is edited there according to the editing requirement to form an edited storage unit, and the edited storage unit is finally written back. In other words, when the XML file is edited, only the target storage units related to the line record to be edited are extracted, edited in the editable storage unit, and put back at their storage position in the XML file. Because only the required target storage units are extracted, the XML file can be edited without a large amount of memory, which saves both memory and editing time. The technical scheme provided by the application can therefore be applied to mobile terminal applications that need to save memory space and to edit and save EXCEL documents quickly.
Further, when the method for editing the XML file provided by the application is applied on an actual mobile terminal to save an EXCEL document, every line record to be edited is processed by the method in sequence. The first line record to be edited is selected and edited; once it has been processed, reading and writing back continue, and the next line record to be edited is selected, until the whole XML file has been written back into the EXCEL document and the editing of the whole EXCEL document is complete.
A second embodiment of the present application provides an apparatus for editing an XML file, which corresponds to the method for editing an XML file provided in the first embodiment of the present application and is therefore described only briefly here. For the implementation of the present embodiment, reference may be made to the first embodiment.
Please refer to fig. 7, which is a block diagram of an apparatus according to a second embodiment of the present application.
An apparatus for editing an XML file according to the second embodiment of the present application includes: an obtaining unit 701, a parsing unit 702, a determining unit 703, a writing unit 704, an editing unit 705, and a write-back unit 706.
An obtaining unit 701, configured to obtain an original XML file to be edited;
the parsing unit 702 is configured to parse the data stream of the original XML file by using a parser, and obtain a start position and an end position of a target storage unit related to a line record to be edited;
a determining unit 703, configured to sequentially read the data stream of the original XML file by using a storage unit as a unit, and determine whether the read storage unit is a target storage unit according to a start position and an end position of the target storage unit;
a writing unit 704, configured to write the information in the target storage unit into a preset editable storage unit when the determination result of the determining unit is yes;
an editing unit 705, configured to perform, in the editable storage unit, editing processing on the to-be-edited line record according to an editing requirement for the to-be-edited line record, so as to form an edited storage unit;
a write-back unit 706, configured to perform a write-back operation on the read content when the determination result of the determining unit is negative, and to perform the write-back operation on the edited storage unit once it has been formed.
Optionally, the parsing unit of the apparatus is configured to:
reading the original XML file by using a parser, and encoding the read data into a character string;
reading characters in the character string, and identifying a start mark and an end mark of a line record in the character string;
when the editing operation is to modify the line record, the following operations are performed:
according to the start mark and the end mark of the identified line record, combining with the progress monitoring of a progress monitor, obtaining the start mark and the end mark of the line record to be modified corresponding to the original XML file;
taking the start position of a storage unit where the start mark of the line record to be modified is located as the start position of the target storage unit, and taking the end position of the storage unit where the end mark of the line record to be modified is located as the end position of the target storage unit;
when the editing operation is to insert a line record, the following operations are executed:
according to the identified start mark and end mark of the line record, combining the progress monitoring of a progress monitor to obtain the start mark and end mark of the previous line record corresponding to the line record to be inserted of the original XML file;
and taking the starting position of the storage unit where the starting mark of the previous line record is located as the starting position of the target storage unit, and taking the ending position of the storage unit where the ending mark of the previous line record is located as the ending position of the target storage unit.
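The mapping from mark positions to storage-unit boundaries can be sketched as follows. This is a hedged illustration: the function name `target_unit_range` is hypothetical, offsets are assumed to be byte offsets into the data stream (the text speaks of positions in the decoded character string, which differ from byte offsets for multi-byte encodings), `mark_start` is the offset of the first byte of the start mark, and `mark_end` is the offset just past the end mark.

```python
def target_unit_range(mark_start, mark_end, unit_size):
    """Return (start, end) of the storage units that contain the two marks.

    start is the start position of the unit holding the start mark;
    end is the end position of the unit holding the end mark.
    """
    start = (mark_start // unit_size) * unit_size
    # Ceiling division: round mark_end up to the next unit boundary.
    end = -(-mark_end // unit_size) * unit_size
    return start, end
```

With a unit size of 1024, a row whose marks span offsets 2500 to 3100 yields the range (2048, 4096), i.e. the third and fourth storage units. For an insertion, the same computation would be applied to the marks of the preceding line record.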
Optionally, the parsing unit of the apparatus is further configured to perform the progress monitoring by:
counting the line records in order according to the identified start marks and end marks, incrementing the count for each line record, to obtain the sequence number of each line record;
and determining the start mark and the end mark related to the line record to be edited according to the sequence number and the information on the position to be edited in the table, as provided by the editing operation received through the operation interface.
Optionally, the determining unit of the apparatus is configured to:
sequentially reading the data stream of the original XML file, and recording size = size + n, wherein n is the size of the actually read data stream, and size is the size of the currently read total data;
judging whether the size is in the range (start position of the target storage unit - storage unit size, start position of the target storage unit); if so, judging that the read storage unit is the target storage unit;
continuing to read the data stream of the original XML file, judging whether the size is larger than the end position of the target storage unit, and if not, judging that the read storage unit is the target storage unit; if yes, the read storage unit is judged not to be the target storage unit.
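The running `size` counter can be illustrated with the following sketch. The interval bounds in the text above are ambiguous in translation, so this sketch does not reproduce them literally: it assumes the start and end positions of the target range are aligned to storage-unit boundaries and treats a unit as a target when it lies entirely within that range. The function name `classify_units` is hypothetical.

```python
def classify_units(total_len, unit_size, start, end):
    """Return one flag per storage unit: True if the unit is a target unit."""
    flags = []
    size = 0  # size of the total data read so far
    while size < total_len:
        n = min(unit_size, total_len - size)  # size of the data actually read
        unit_begin = size
        size = size + n
        # Assumed test: the unit just read lies within [start, end].
        flags.append(unit_begin >= start and size <= end)
    return flags
```

For a 16-byte stream read in 4-byte units with a target range of 4 to 12, the second and third units are flagged as targets and the first and last are not.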
Optionally, the editing unit of the apparatus is configured to:
if the editing is to modify an existing row record, the method includes:
performing character string segmentation on the information stored in the editable storage unit;
identifying the line number of the line record to be edited according to the character string, and determining the specific starting position and the specific ending position of the line record to be edited in the character string by combining the starting mark and the ending mark of the line record;
taking out the information between the starting position and the ending position, and converting the information into element objects, wherein the element objects comprise cell objects;
according to the editing requirement, carrying out editing operation on the target cell which needs to be edited specifically;
and converting the edited target cell and the contained element object into character strings, and writing back the character strings to the original sequence positions of the character strings in the editable storage unit to form an edited storage unit.
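The modification steps listed above can be sketched as follows. This is a simplified illustration, not the implementation of the application: the function name `modify_cell` is hypothetical, a plain string stands in for the editable storage unit, the row is assumed to carry its line number in an `r` attribute written first in the start tag, the row is assumed to be closed by "</row>", and the fragment is assumed to use no namespace prefixes.

```python
import xml.etree.ElementTree as ET


def modify_cell(buffer, row_number, cell_ref, new_value):
    """Replace the value of cell cell_ref in row row_number of buffer."""
    # Locate the specific start and end positions of the row in the string.
    start = buffer.index('<row r="%d"' % row_number)
    close = "</row>"
    end = buffer.index(close, start) + len(close)

    # Convert the row text into element objects (the row and its cells).
    row = ET.fromstring(buffer[start:end])
    for cell in row.findall("c"):
        if cell.get("r") == cell_ref:
            value = cell.find("v")
            if value is None:
                value = ET.SubElement(cell, "v")
            value.text = str(new_value)

    # Serialize the edited row and put it back at its original position.
    edited = ET.tostring(row, encoding="unicode")
    return buffer[:start] + edited + buffer[end:]
```

Only the one row is parsed into objects; the rest of the buffer is carried over as text, so the cost of the edit depends on the size of the row rather than the size of the sheet.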
Optionally, the editing unit of the apparatus is further configured to:
if the editing is inserting a new line record, the method includes:
performing character string segmentation on the information stored in the editable storage unit;
determining the ending position of the previous line record of the line record to be inserted according to the character string and the line number of the line record to be inserted in combination with the ending mark of the line record;
and inserting a new serialized character string of the line record from the end position of the previous line record, and sequentially placing the information originally at the position behind to form an edited storage unit.
Optionally, the parser of the apparatus parses using a SAX parsing method.
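SAX-style parsing with a running row count can be sketched with Python's standard `xml.sax` module. This is an illustrative sketch only: the class name `RowCounter` and the function `count_rows` are hypothetical, and the handler records only the sequence number and the `r` attribute of each row, not the stream positions that the method additionally needs.

```python
import xml.sax


class RowCounter(xml.sax.ContentHandler):
    """Counts <row> elements as they stream past, without building a tree."""

    def __init__(self):
        super().__init__()
        self.count = 0
        self.rows = []  # (sequence number, value of the r attribute)

    def startElement(self, name, attrs):
        if name == "row":
            self.count += 1  # increment the count for each line record
            self.rows.append((self.count, attrs.get("r")))


def count_rows(xml_bytes):
    """Parse xml_bytes event by event and return the rows seen, in order."""
    handler = RowCounter()
    xml.sax.parseString(xml_bytes, handler)
    return handler.rows
```

Because the handler reacts to events rather than holding the document, memory use stays flat however many rows the sheet contains.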
Please refer to fig. 8, which is a schematic diagram of an electronic device according to a third embodiment of the present application.
The electronic device includes:
a processor 801;
a memory 802;
the memory is used for storing a program of the method for editing the XML file, and the program performs the following operations when being read and executed by the processor:
obtaining an original XML file to be edited;
parsing the data stream of the original XML file by using a parser to obtain the start position and the end position of a target storage unit related to the line record to be edited;
sequentially reading the data stream of the original XML file by taking a storage unit as a unit, and judging whether the read storage unit is a target storage unit or not according to the starting position and the ending position of the target storage unit; if not, performing write-back operation on the read content; if yes, the following operations are executed:
writing the information in the target storage unit into a preset editable storage unit;
editing the line record to be edited in the editable storage unit according to the editing requirement aiming at the line record to be edited to form an edited storage unit;
and writing back the edited storage unit.
The fourth embodiment of the present application further provides a computer-readable storage medium, in which computer-executable instructions are stored, and when the computer-executable instructions are executed by a processor, the following operations are performed:
obtaining an original XML file to be edited;
parsing the data stream of the original XML file by using a parser to obtain the start position and the end position of a target storage unit related to the line record to be edited;
sequentially reading the data stream of the original XML file by taking a storage unit as a unit, and judging whether the read storage unit is a target storage unit or not according to the starting position and the ending position of the target storage unit; if not, performing write-back operation on the read content; if yes, the following operations are executed:
writing the information in the target storage unit into a preset editable storage unit;
editing the line record to be edited in the editable storage unit according to the editing requirement aiming at the line record to be edited to form an edited storage unit;
and writing back the edited storage unit.
In the above embodiments of the present application, the descriptions of the respective embodiments have respective emphasis, and for parts that are not described in detail in a certain embodiment, reference may be made to related descriptions of other embodiments.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
1. Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
2. As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
Although the present invention has been described with reference to the preferred embodiments, it is not intended to be limited thereto, and variations and modifications may be made by those skilled in the art without departing from the spirit and scope of the present invention.

Claims (10)

1. A method of editing an XML file, the method comprising:
obtaining an original XML file to be edited;
parsing the data stream of the original XML file by using a parser to obtain the start position and the end position of a target storage unit related to the line record to be edited;
sequentially reading the data stream of the original XML file by taking a storage unit as a unit, and judging whether the read storage unit is a target storage unit or not according to the starting position and the ending position of the target storage unit; if not, performing write-back operation on the read content; if yes, the following operations are executed:
writing the information in the target storage unit into a preset editable storage unit;
editing the line record to be edited in the editable storage unit according to the editing requirement aiming at the line record to be edited to form an edited storage unit;
and writing back the edited storage unit.
2. The method according to claim 1, wherein the parsing the data stream of the original XML file using the parser to obtain the start position and the end position of the target storage unit related to the line record to be edited includes:
reading the original XML file by using a parser, and encoding the read data into a character string;
reading characters in the character string, and identifying a start mark and an end mark of a line record in the character string;
when the editing operation is to modify the line record, the following operations are performed:
according to the start mark and the end mark of the identified line record, combining with the progress monitoring of a progress monitor, obtaining the start mark and the end mark of the line record to be modified corresponding to the original XML file;
taking the start position of a storage unit where the start mark of the line record to be modified is located as the start position of the target storage unit, and taking the end position of the storage unit where the end mark of the line record to be modified is located as the end position of the target storage unit;
when the editing operation is to insert a line record, the following operations are executed:
according to the identified start mark and end mark of the line record, combining the progress monitoring of a progress monitor to obtain the start mark and end mark of the previous line record corresponding to the line record to be inserted of the original XML file;
and taking the starting position of the storage unit where the starting mark of the previous line record is located as the starting position of the target storage unit, and taking the ending position of the storage unit where the ending mark of the previous line record is located as the ending position of the target storage unit.
3. The method of claim 2, wherein the progress monitoring of the progress monitor comprises:
counting the line records in order according to the identified start marks and end marks, incrementing the count for each line record, to obtain the sequence number of each line record;
determining a start mark and an end mark related to the row record to be edited according to the sequence number and information of the position to be edited in a table provided by editing operation received by an operation interface; the editing operation comprises a line record modifying operation and a line record adding operation.
4. The method according to claim 1, wherein the reading the data stream of the original XML file sequentially in units of storage units and determining whether the read storage unit is a target storage unit according to a start position and an end position of the target storage unit comprises:
sequentially reading the data stream of the original XML file, and recording size = size + n, wherein n is the size of the actually read data stream, and size is the size of the currently read total data;
judging whether the size is in the range (start position of the target storage unit - storage unit size, start position of the target storage unit); if so, judging that the read storage unit is the target storage unit;
continuing to read the data stream of the original XML file, judging whether the size is larger than the end position of the target storage unit, and if not, judging that the read storage unit is the target storage unit; if yes, the read storage unit is judged not to be the target storage unit.
5. The method according to claim 1, wherein in the editable storage unit, according to an editing requirement for the line record to be edited, the line record to be edited is edited to form an edited storage unit, and if the editing is to modify an existing line record, the method includes:
performing character string segmentation on the information stored in the editable storage unit;
identifying the line number of the line record to be edited according to the character string, and determining the specific starting position and the specific ending position of the line record to be edited in the character string by combining the starting mark and the ending mark of the line record;
taking out the information between the starting position and the ending position, and converting the information into element objects, wherein the element objects comprise cell objects;
according to the editing requirement, carrying out editing operation on the target cell which needs to be edited specifically;
and converting the edited target cell and the contained element object into character strings, and writing back the character strings to the original sequence positions of the character strings in the editable storage unit to form an edited storage unit.
6. The method according to claim 1, wherein in the editable storage unit, according to an editing request for the line record to be edited, the line record to be edited is edited to form an edited storage unit, and if the editing is to insert a new line record, the method includes:
performing character string segmentation on the information stored in the editable storage unit;
determining the ending position of the previous line record of the line record to be inserted according to the character string and the line number of the line record to be inserted in combination with the ending mark of the line record;
and inserting a new serialized character string of the line record from the end position of the previous line record, and sequentially placing the information originally at the position behind to form an edited storage unit.
7. The method of claim 1, wherein the parser parses using a SAX parsing method.
8. An apparatus for editing an XML file, the apparatus comprising:
an obtaining unit, configured to obtain an original XML file to be edited;
a parsing unit, configured to parse the data stream of the original XML file by using a parser and to obtain the start position and the end position of a target storage unit related to the line record to be edited;
a judging unit, configured to sequentially read the data stream of the original XML file by using a storage unit as a unit, and judge whether the read storage unit is a target storage unit according to a start position and an end position of the target storage unit;
the writing unit is used for writing the information in the target storage unit into a preset editable storage unit when the judgment result of the judging unit is yes;
the editing unit is used for editing the line record to be edited in the editable storage unit according to the editing requirement aiming at the line record to be edited to form an edited storage unit;
a write-back unit, configured to perform a write-back operation on the read content when the judgment result of the judging unit is negative, and to perform the write-back operation on the edited storage unit once it has been formed.
9. An electronic device, comprising: a processor, a memory, and computer program instructions stored on the memory and executable on the processor; the processor, when executing the computer program instructions, implements a method of editing an XML file as claimed in any one of claims 1 to 7 above.
10. A computer-readable storage medium having stored thereon computer-executable instructions for implementing a method of editing an XML document according to any one of claims 1 to 7 when executed by a processor.
CN202211524323.1A 2022-11-30 2022-11-30 Method and device for editing XML file, electronic equipment and storage medium Pending CN115878851A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211524323.1A CN115878851A (en) 2022-11-30 2022-11-30 Method and device for editing XML file, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211524323.1A CN115878851A (en) 2022-11-30 2022-11-30 Method and device for editing XML file, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115878851A true CN115878851A (en) 2023-03-31

Family

ID=85765106

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211524323.1A Pending CN115878851A (en) 2022-11-30 2022-11-30 Method and device for editing XML file, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115878851A (en)

Similar Documents

Publication Publication Date Title
CN1838111B (en) Method for editing file and recording modification mark
CN107391544B (en) Processing method, device and equipment of column type storage data and computer storage medium
CN110209387B (en) Method and device for generating top-level HDL file and computer readable storage medium
CN100338605C (en) Recording method for extendable mark language file repairing trace
CN114610957A (en) Data processing method, device, equipment and computer storage medium
CN103226510A (en) Method and device for analyzing vmcore file
CN113283228A (en) Document generation method and device, electronic equipment and storage medium
CN115878851A (en) Method and device for editing XML file, electronic equipment and storage medium
CN111858402A (en) Read-write data processing method and system based on cache
CN112765110B (en) PDF annotation data generation method, device, equipment and storage medium
CN115774745A (en) Extraction method and system for high-capacity Excel file data
CN114490848A (en) File analysis processing method and device, storage medium and electronic equipment
CN112988866A (en) Method and device for exporting excel file, electronic equipment and storage medium
CN111797063A (en) Streaming data processing method and system
CN115858685B (en) Online synchronization method and device for demand files, terminal and readable storage medium
CN111581921B (en) Text editing method and device, computer storage medium and terminal
CN112965822B (en) Method for optimizing memory performance of javaScript/typeScript program by using array pool
CN111427854B (en) Stack structure realizing method, device, equipment and medium for supporting storage batch data
CN110096682B (en) Method for realizing real-time cooperative processing of data in document based on modoc data structure
CN112445784B (en) Text structuring method, equipment and system
CN116341496A (en) Document typesetting method, device and equipment
CN116954622B (en) Method for associating abstract syntax tree with source code coordinates, electronic device and medium
CN112732816B (en) Data export method and system
CN110928804B (en) Garbage recycling optimization method, device, terminal equipment and machine-readable medium
CN113676186A (en) Estimation positioning method and system for character string

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination