CN110457526A - Unitized data analytic method based on xml document - Google Patents

Unitized data analytic method based on xml document

Info

Publication number
CN110457526A
Authority
CN
China
Prior art keywords
xml document
data
field
communication protocol
file
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201910702087.XA
Other languages
Chinese (zh)
Inventor
王升
王军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Tech University
Original Assignee
Nanjing Tech University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Tech University filed Critical Nanjing Tech University
Priority to CN201910702087.XA priority Critical patent/CN110457526A/en
Publication of CN110457526A publication Critical patent/CN110457526A/en
Withdrawn legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/80 Information retrieval; Database structures therefor; File system structures therefor of semi-structured data, e.g. markup language structured data such as SGML, XML or HTML

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Document Processing Apparatus (AREA)

Abstract

The invention proposes a generalized data parsing method based on an XML file. The characteristics of each field in a communication protocol are determined, and the protocol is written as a corresponding XML file. A target file is created according to the root element of the XML file, and the field name of each element in the XML file is written to the target file as the table header. Messages are saved in binary form; the message file to be parsed is selected, and the messages that match the description in the XML file are found according to its root element. The elements of the XML file are then read in sequence: for each element, data of the corresponding length is read from the message according to the element's length attribute, parsed according to its data type attribute, and saved to the target file, until the end tag of the root element is read. The invention removes the need to modify the parsing software for every protocol and improves generality: data can be parsed accurately as long as a valid XML file is written.

Description

Unitized data analytic method based on xml document
Technical field
The invention belongs to the field of data processing technology, and specifically concerns a generalized data parsing method based on an XML file.
Background
At present, both in China and abroad, data is mainly parsed by hard-coding each protocol: during parsing, the received binary file is converted into a human-readable file according to the length, name, type, and other information of each field as written in the code. With this approach, the software code usually has to be modified for every protocol that is parsed, which is tedious; moreover, when the device has no compilation environment, the modified code cannot be deployed. A communication system has to monitor many kinds of devices, and the communication protocols of different devices differ, so this single-protocol parsing approach is far from meeting the need.
Summary of the invention
The object of the invention is to propose a generalized data parsing method based on an XML file.
The technical solution of the invention is a generalized data parsing method based on an XML file, with the following specific steps:
Step 1: Analyze the communication protocol and determine the characteristics of each field in the communication protocol.
Step 2: Write the communication protocol as a corresponding XML file according to the determined field characteristics.
Step 3: Create a target file according to the root element of the XML file, and write the field name of each element in the XML file to the target file as the table header.
Step 4: Save messages in binary form, select the message file to be parsed, and find, according to the root element of the XML file, the messages that match the description in the XML file.
Step 5: Read the elements of the XML file in sequence. For each element, read data of the corresponding length from the message according to the element's length attribute, parse the data according to the data type attribute, and save the parsed data to the target file created in step 3, until the end tag of the root element is read.
Preferably, the characteristics of a field in step 1 include the field's name, length, type, and precision.
Preferably, the specific method of writing the communication protocol as a corresponding XML file in step 2 is:
Define the root element of the XML file. The attributes of the root element include the name of the target file to be generated, the information unit identifier, the source port number, and the destination port number.
Write each field of the communication protocol as one element of the XML file. Each element of the XML file includes the field's name, length, type, and precision.
Compared with the prior art, the invention has the following notable advantages. The user only needs to modify the attributes of the root element and of each element in the XML file according to their own communication protocol, and to add elements to the XML file according to the number of fields in the protocol; the program can then parse every conforming data frame according to the protocol description in the XML file. The invention removes the need to modify the parsing software for every protocol and improves generality: data can be parsed accurately as long as a valid XML file is written. The invention also removes the requirement for a compilation environment, which widens its practical range of application.
The invention is described in further detail below with reference to the accompanying drawings.
Brief description of the drawings
Fig. 1 is a flowchart of the generalized data parsing method based on an XML file.
Fig. 2 is a schematic diagram of the form of an element in the XML file used by the invention.
Fig. 3 is a flowchart of data parsing in the invention.
Detailed description
As shown in Fig. 1, a generalized data parsing method based on an XML file includes the following steps:
Step 1: Analyze the communication protocol and identify the characteristics of each field in it, such as the field's name, length (number of bytes), type, and precision. Common data types include unsigned integer, signed integer, signed floating point, unsigned floating point, and enumeration. For floating-point data, the precision must also be set. A flag is assigned to each data type so that the software can identify the data type of a field.
In certain embodiments, common communication protocols are analyzed, the data types are clearly distinguished, and a flag is set for each data type. For example, the flag of the unsigned integer type unsigned int is set to "UINT" and the flag of the int type is set to "INT". Flags are set for all data types that are used, or may be used, in communication protocols; because the parsing software can recognize every data type defined in this way, it can parse an arbitrary communication protocol.
Fixed attribute names are set for the name, length, data type, precision, and so on of each field, so that the XML file can be read accurately later. For example, the field name attribute is set to "name", the length to "length", the type to "style", and the precision to "precision"; these names can be chosen to suit one's own situation.
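The type flags described above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the table STRUCT_CODES and the function decode_field are hypothetical names, big-endian byte order is an assumption (the text does not specify one), and only the UINT, INT, FLOAT, and HEX flags are covered (FLOAT and HEX are taken from Embodiment 1).

```python
import struct

# Hypothetical lookup from (type flag, byte length) to a struct format code.
# Big-endian (">") is an assumption; the source does not state the byte order.
STRUCT_CODES = {
    ("UINT", 1): ">B", ("UINT", 2): ">H", ("UINT", 4): ">I",
    ("INT", 1): ">b", ("INT", 2): ">h", ("INT", 4): ">i",
    ("FLOAT", 4): ">f",
}


def decode_field(raw: bytes, style: str, precision: int = 0) -> str:
    """Turn one field's raw bytes into text according to its type flag."""
    if style == "HEX":
        # Hexadecimal output: print the bytes as upper-case hex digits.
        return raw.hex().upper()
    code = STRUCT_CODES.get((style, len(raw)))
    if code is None:
        raise ValueError(f"unsupported type/length: {style}/{len(raw)}")
    (value,) = struct.unpack(code, raw)
    if style == "FLOAT":
        # The precision attribute controls the number of decimal places.
        return f"{value:.{precision}f}"
    return str(value)
```

Supporting a further data type in this sketch means adding one entry to the table, which mirrors the idea that the software only needs to recognize a fixed set of flags.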
Step 2: According to the field characteristics determined in step 1, write the communication protocol as a corresponding XML file, defining the declaration, elements, and other content of the XML file. Specifically:
Define the root element of the XML file. The attributes of the root element include the target file name (for example, the .txt file generated after parsing), the information unit identifier, the source port number, and the destination port number. These attributes are used to find, among different messages, the messages that meet the requirements and are to be parsed.
Write each field of the communication protocol as one element of the XML file. Each element of the XML file includes attributes such as the field's name, length, type, and precision.
In certain embodiments, the format of the XML file is as follows:
<XML declaration>
<root element start tag>
<element 1>
<element 2>
……
<root element end tag>
Elements are added to the XML file according to the XML file format and the number of protocol fields; each XML file corresponds to one protocol.
As shown in Fig. 2, when the element is the root element, its attributes should include the target file name (for example, a .txt file), the information unit identifier, the source port number, the destination port number, and so on, and can be modified as needed. When the element is an ordinary element, its attributes should include the field's name, length, type, precision, and so on.
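A protocol description of this shape might look like the following sketch, loaded with Python's standard xml.etree.ElementTree module. The element names protocol and field, the field names, and all attribute values are invented for illustration; only the attribute names file, Identification, SID, DID, name, length, style, and precision come from the text.

```python
import xml.etree.ElementTree as ET

# Hypothetical protocol description: tag names and values are made up.
PROTOCOL_XML = """<?xml version="1.0"?>
<protocol file="file.txt" Identification="1" SID="10" DID="20">
    <field name="identification" length="1" style="UINT"/>
    <field name="sid" length="2" style="UINT"/>
    <field name="did" length="2" style="UINT"/>
    <field name="temperature" length="4" style="FLOAT" precision="2"/>
</protocol>
"""


def load_protocol(xml_text: str):
    """Return (root attributes, list of field descriptions) from the XML text."""
    root = ET.fromstring(xml_text)
    fields = [
        {
            "name": el.get("name"),
            "length": int(el.get("length")),
            "style": el.get("style"),
            # precision is optional here; 0 is an assumed default
            "precision": int(el.get("precision", "0")),
        }
        for el in root  # child elements in document order
    ]
    return dict(root.attrib), fields
```

Calling load_protocol(PROTOCOL_XML) yields the root attributes as a dictionary and one dictionary per field, in the order in which the fields appear in a message.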
Step 3: Read the XML file defined in step 2, create the target file according to the root element of the XML file, and write the name attribute of each element in the XML file to the target file as the table header, which makes the file easier to read later.
As shown in Fig. 3, the finished XML file is placed in a specified directory. When the parsing software runs, it first reads the XML file, creates the target file, and writes the header to it, so that during subsequent parsing the parsed data can be written directly to the target file.
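Step 3 can be sketched in Python as follows. The function name create_target_file is hypothetical, and the tab separator between column names is an assumption, since the text does not say how the header columns are delimited.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def create_target_file(xml_text: str, out_dir: Path) -> Path:
    """Create the target file named by the root element and write its header."""
    root = ET.fromstring(xml_text)
    # The root element's "file" attribute names the target file.
    target = out_dir / root.get("file")
    # One column per element, taken from each element's "name" attribute.
    header = "\t".join(el.get("name") for el in root)
    target.write_text(header + "\n", encoding="utf-8")
    return target
```

Parsed rows can later be appended to the returned path in the same column order.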
Step 4: Save the messages received from outside in binary form. Select the message file to be parsed and read it, then read the XML file again and, according to the information unit identifier, source port number, and destination port number attributes of the XML root element, find a message that matches the description in the XML file.
Step 5: Continue reading the XML file and read its elements in sequence. For each element, read the corresponding number of bytes from the message according to the element's length attribute, then parse the data according to the data type attribute and save the parsed data to the target file created in step 3. When the end tag of the root element is read, the parsing of this data frame is complete.
Continue reading the data in the message file, find the next frame that matches the description in the XML file, and repeat step 5 until all data in the message file has been read.
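The per-frame loop of step 5 can be sketched in Python as below. This is an illustration under assumptions rather than the patent's code: the names FORMATS and parse_frame are hypothetical, fields are passed as dictionaries with the attribute names used in the text, byte order is assumed to be big-endian, and only the UINT, INT, and FLOAT types are handled.

```python
import struct

# Assumed big-endian struct codes per type flag and byte length.
FORMATS = {
    "UINT": {1: ">B", 2: ">H", 4: ">I"},
    "INT": {1: ">b", 2: ">h", 4: ">i"},
    "FLOAT": {4: ">f"},
}


def parse_frame(frame: bytes, fields) -> list:
    """Walk the field descriptions in order and decode one frame into strings."""
    values, offset = [], 0
    for field in fields:
        length = field["length"]
        raw = frame[offset:offset + length]
        if len(raw) < length:
            raise ValueError("frame is shorter than the XML description")
        offset += length
        (value,) = struct.unpack(FORMATS[field["style"]][length], raw)
        if field["style"] == "FLOAT":
            precision = field.get("precision", 0)
            values.append(f"{value:.{precision}f}")
        else:
            values.append(str(value))
    return values
```

The returned list has one entry per element, so it can be written as one row under the header created in step 3.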
Embodiment 1
A generalized data parsing method based on an XML file includes the following steps:
Step 1: By analyzing the communication protocol, it is determined that each field of the protocol has characteristics such as name, length (number of bytes), and type. The lengths are 1, 2, or 4 bytes, and the types include unsigned integer (UINT), integer (INT), floating point (FLOAT), unsigned floating point (UFLOAT), enumeration (ENUM), hexadecimal output (HEX), and address (ADDRESS).
Step 2: Write the communication protocol as a corresponding XML file according to the determined field characteristics. According to the protocol, an XML file is written whose root element includes the attributes file, Identification, SID, and DID. The format is as follows:
Table 1. XML file format
Step 3: Create the target file file.txt according to the file attribute of the root element of the above XML file, and write the name attribute of each element in the XML file to file.txt as the table header, which makes the file easier to read later.
When a frame is saved, a fixed-length header is added in front of the message: "msg:" followed by a 2-byte message length. This header serves as the separator between two frames. The start of a frame is found by searching for "msg:", and the complete frame is then read according to the length that follows.
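The "msg:" framing can be sketched in Python as follows. Two points are assumptions, because the text does not settle them: the 2-byte length is taken to be big-endian, and it is taken to count only the message bytes after the header. The function names frame_message and iter_frames are hypothetical.

```python
import struct

MARKER = b"msg:"


def frame_message(payload: bytes) -> bytes:
    """Prefix one message with the marker and its 2-byte length when saving it."""
    return MARKER + struct.pack(">H", len(payload)) + payload


def iter_frames(data: bytes):
    """Yield each complete message found in the saved binary data."""
    pos = 0
    while True:
        start = data.find(MARKER, pos)
        body_start = start + len(MARKER) + 2
        if start < 0 or body_start > len(data):
            return  # no further header, or header cut off
        (length,) = struct.unpack(">H", data[start + len(MARKER):body_start])
        if body_start + length > len(data):
            return  # final frame is truncated
        yield data[body_start:body_start + length]
        # Skip past the body so that marker bytes inside it are not re-scanned.
        pos = body_start + length
```

Jumping over each body by its declared length keeps the search from mistaking payload bytes that happen to spell "msg:" for a new frame, as long as the stream starts at a frame boundary.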
Step 4: The format and length of elements 1 to 14 in the XML file are fixed, so the values of element 10, element 12, and element 13 can be extracted according to the element attributes in the XML file and compared with the root element attributes Identification, SID, and DID, respectively. If the values are not all equal, search for the "msg:" header again and start on a new frame.
Step 5: If the values are all equal, continue reading the elements of the XML file and perform the corresponding parsing according to the "length" and "style" attributes of each element until this frame has been parsed; the parsed data is saved to the target file specified by the file attribute of the root element. After one frame has been parsed, search for the "msg:" header again and start a new round of parsing. Parsing ends when the end of the message file is reached.
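The comparison in step 4 can be sketched in Python as below. The names field_value and frame_matches are hypothetical, and several details are assumptions: the key fields are read as unsigned big-endian integers, the root attribute values are decimal strings, and the caller supplies a mapping from root attribute name to the 0-based position of the field that carries it (for Embodiment 1 this would be positions 9, 11, and 12 for elements 10, 12, and 13).

```python
def field_value(frame: bytes, fields, index: int) -> int:
    """Read the unsigned integer stored in the field at 0-based position index."""
    # Field lengths are fixed, so the offset is the sum of the preceding lengths.
    offset = sum(f["length"] for f in fields[:index])
    length = fields[index]["length"]
    raw = frame[offset:offset + length]
    if len(raw) != length:
        raise ValueError("frame too short for this field")
    return int.from_bytes(raw, "big")


def frame_matches(frame: bytes, fields, root_attrs, key_positions) -> bool:
    """Check whether the frame's key fields equal the root element attributes."""
    try:
        return all(
            field_value(frame, fields, index) == int(root_attrs[attr])
            for attr, index in key_positions.items()
        )
    except ValueError:
        # A short frame or a non-numeric attribute counts as no match.
        return False
```

A frame for which frame_matches returns False is skipped, and the search resumes at the next "msg:" header.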

Claims (3)

1. A generalized data parsing method based on an XML file, characterized in that the specific steps are:
Step 1: analyze the communication protocol and determine the characteristics of each field in the communication protocol;
Step 2: write the communication protocol as a corresponding XML file according to the determined field characteristics;
Step 3: create a target file according to the root element of the XML file, and write the field name of each element in the XML file to the target file as the table header;
Step 4: save messages in binary form, select the message file to be parsed, and find, according to the root element of the XML file, the messages that match the description in the XML file;
Step 5: read the elements of the XML file in sequence; for each element, read data of the corresponding length from the message according to the element's length attribute, parse the data according to the data type attribute, and save the parsed data to the target file created in step 3, until the end tag of the root element is read.
2. The generalized data parsing method based on an XML file according to claim 1, characterized in that the characteristics of a field in step 1 include the field's name, length, type, and precision.
3. The generalized data parsing method based on an XML file according to claim 2, characterized in that the specific method of writing the communication protocol as a corresponding XML file in step 2 is:
define the root element of the XML file, the attributes of which include the name of the target file to be generated, the information unit identifier, the source port number, and the destination port number;
write each field of the communication protocol as one element of the XML file, each element of the XML file including the field's name, length, type, and precision.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910702087.XA CN110457526A (en) 2019-07-31 2019-07-31 Unitized data analytic method based on xml document

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910702087.XA CN110457526A (en) 2019-07-31 2019-07-31 Unitized data analytic method based on xml document

Publications (1)

Publication Number Publication Date
CN110457526A true CN110457526A (en) 2019-11-15

Family

ID=68484314

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910702087.XA Withdrawn CN110457526A (en) 2019-07-31 2019-07-31 Unitized data analytic method based on xml document

Country Status (1)

Country Link
CN (1) CN110457526A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20191115
