CN115334177B - Binary data message analysis method based on xml configuration file recursion realization - Google Patents
Binary data message analysis method based on xml configuration file recursion realization
- Publication number
- CN115334177B (application CN202210802565.6A)
- Authority
- CN
- China
- Prior art keywords
- message
- field
- structure object
- attribute
- refer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L69/00—Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
- H04L69/22—Parsing or analysis of headers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
- G06F8/42—Syntactic analysis
- G06F8/427—Parsing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/70—Software maintenance or management
- G06F8/71—Version control; Configuration management
Landscapes
- Engineering & Computer Science (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Computer Security & Cryptography (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Machine Translation (AREA)
Abstract
The embodiment of the application provides a binary data message parsing method implemented recursively on the basis of an XML configuration file. The method comprises: reading the XML configuration file and constructing a message structure object dictionary based on the feature fields read; obtaining a data message and retrieving the corresponding message structure object from the constructed dictionary; and traversing the field structures in the message structure object and parsing each one according to its attributes. Because the attributes of the fields in the XML file are extended, a dynamic message of any structure can be parsed completely through XML configuration. Compared with hard-coded parsing, the method greatly reduces the time needed to revise the message configuration when an interface document changes.
Description
Technical Field
The application relates to the field of message parsing, and in particular to a binary data message parsing method implemented recursively on the basis of an XML configuration file.
Background
At present, the rail transit industry uses two methods for parsing binary messages. The first is hard coding: the parsing logic of each message is written directly into the code. Its advantage is that the logic can be customized for each specific message, so parsing is efficient and flexible; its disadvantage is that whenever a message interface changes the code must be modified again, so changes are costly. The second is parsing based on an XML configuration file, which benefits from the good readability and extensibility of XML. The message structure is configured in the XML file, generally as a field name, a field length and a field type, where the field type is typically a basic data type such as unsigned integer, signed integer, ASCII string, address type or enumeration type. However, if a message segment contains complex types such as variable-length arrays or deeply nested structures, the prior art cannot meet the requirement. In addition, when a message is modified, the value of its CRC check field must be recalculated dynamically from the values of the other fields of the message; otherwise the modified message cannot be used by the service system.
Disclosure of Invention
The embodiment of the application provides a binary data message parsing method implemented recursively on the basis of an XML configuration file. It solves the problems that existing parsing methods based on an XML configuration file cannot handle dynamic arrays or complex structures and cannot dynamically update the value of a field. The method extends the attributes of the fields in the XML file, so that a dynamic message of any structure can be parsed completely through XML configuration. In addition, compared with hard-coded parsing, the method greatly reduces the time needed to revise the message configuration when an interface document changes.
Specifically, the binary data message parsing method based on recursive processing of an XML configuration file provided by the embodiment of the application comprises the following steps:
S1, reading an XML configuration file, and constructing a message structure object dictionary based on the feature fields read;
S2, acquiring a data message, and obtaining the message structure object corresponding to the data message from the constructed message structure object dictionary;
S3, traversing the field structures in the message structure object, and parsing based on the attributes of each field structure.
Optionally, S1 includes:
S11, reading the XML configuration file of the binary data message, and acquiring the feature fields in the XML configuration file;
S12, constructing the message structure object dictionary with the acquired feature fields as index keys and the message structure objects as values.
Optionally, S2 includes:
S21, acquiring a binary data message and extracting its features;
S22, obtaining the corresponding message structure object from the message structure object dictionary, using the extracted features as the index.
Optionally, S3 includes:
S31, traversing the field structure object list in the message structure object; if the list contains a field structure object that has not been traversed, taking out that field structure object and executing S32; otherwise, exiting this level of message parsing;
S32, evaluating the extracted field structure object based on its count attribute;
S33, evaluating the extracted field structure object based on its refer to attribute: if the extracted field structure object contains the refer to attribute, re-executing steps S2 to S3; if it does not contain the refer to attribute, re-executing step S31.
Optionally, S32 includes:
S321, if the field structure object contains a count attribute, judging whether the count attribute is a variable;
S322, if the count attribute is a variable, dynamically determining the value of the count attribute according to the results of the fields already parsed, and parsing the field containing the count attribute as an array;
S323, if the count attribute is a constant, parsing the field as a fixed-length array;
S324, if the field structure object does not contain a count attribute, parsing the field as a single element.
Optionally, S33 includes:
S331, if the field structure object contains a refer to attribute, judging whether the refer to attribute is an expression;
S332, if the refer to attribute is an expression, dynamically determining the referenced sub-message structure object according to the previous parsing results, and parsing the field as a sub-message;
S333, if the refer to attribute is not an expression, determining the referenced sub-message structure object directly from the value of the refer to attribute, and parsing the field as a sub-message;
S334, if the field structure object does not contain the refer to attribute, executing the step of parsing a basic type field.
Optionally, S334 further includes:
S3341, determining the length of the field according to length, determining the type of the field according to format, and parsing the value of the field;
S3342, if the format attribute of the field structure object is a CRC function expression, the expression is not used when the binary data message is parsed into a data dictionary; the value of the expression is calculated in real time when the data dictionary is serialized into a binary data message.
The beneficial effects are as follows:
The method solves the problems that existing parsing methods based on an XML configuration file cannot handle dynamic arrays or complex structures and cannot dynamically update field values. The method extends the attributes of the fields in the XML file, so that a dynamic message of any structure can be parsed completely through XML configuration. In addition, compared with hard-coded parsing, the method greatly reduces the time needed to revise the message configuration when an interface document changes.
Drawings
In order to more clearly illustrate the technical solutions of the present application, the drawings that are needed in the description of the embodiments will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a binary data message parsing method based on the recursive implementation of an xml configuration file according to an embodiment of the present application;
FIG. 2 is a flowchart illustrating an embodiment of the present application;
FIG. 3 is an XML configuration diagram of a binary data message according to an embodiment of the present application.
Detailed Description
In order to make the structure and advantages of the present application more apparent, the structure of the present application will be further described with reference to the accompanying drawings.
Specifically, as shown in FIG. 1, the binary data message parsing method based on recursive processing of an XML configuration file according to the embodiment of the present application includes:
S1, reading an XML configuration file, and constructing a message structure object dictionary based on the feature fields read;
S2, acquiring a data message, and obtaining the message structure object corresponding to the data message from the constructed message structure object dictionary;
S3, traversing the field structures in the message structure object, and parsing based on the attributes of each field structure.
In implementation, to solve the problems of the two existing message parsing methods, a binary data message parsing method implemented recursively on the basis of an XML configuration file is provided. The refer to extension attribute of a field references the configuration of another sub-message. Through this attribute the program knows whether the value of the field is a structure object: if refer to is the name of a sub-message, that is, it points to a sub-message configuration, that configuration is used to parse the value of the field. If refer to is an expression involving other sub-messages, the expression is evaluated during parsing, so that the sub-message used to complete the parsing is selected dynamically.
Note that the message parsing method provided by the application also indicates, through the count extension attribute of a field, whether the value of the field is an array. If count is a constant, the value of the field is a fixed-length array; if count is the name of another field, the value of the field is a variable-length array whose length is determined dynamically, during message parsing, from the value of the named field.
Specifically, step S1 of the foregoing method, which constructs the message structure object dictionary, includes:
S11, reading the XML configuration file of the binary data message, and acquiring the feature fields in the XML configuration file;
S12, constructing the message structure object dictionary with the acquired feature fields as index keys and the message structure objects as values.
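As a minimal sketch of steps S11 and S12, the dictionary construction might look as follows in Python. The element names (`messages`, `message`, `field`), the attribute spellings (`sid`, `did`, `type`) and the two example messages are assumptions made for illustration; the description only states that feature fields serve as index keys and message structure objects as values.

```python
import xml.etree.ElementTree as ET

# Hypothetical configuration: element and attribute names are assumptions.
CONFIG = """
<messages>
  <message name="Status" sid="1" did="2" type="16">
    <field name="speed" length="2" format="decimal"/>
    <field name="flags" length="1" format="binary"/>
  </message>
  <message name="Alarm" sid="1" did="2" type="17">
    <field name="code" length="4" format="hex"/>
  </message>
</messages>
"""

def build_structure_dictionary(xml_text):
    """S11/S12: map (sid, did, type) feature tuples to message structure objects."""
    dictionary = {}
    for msg in ET.fromstring(xml_text).iter("message"):
        # S11: read the feature fields of this message from the configuration.
        key = (int(msg.get("sid")), int(msg.get("did")), int(msg.get("type")))
        # S12: the value is the structure object (basic attributes + field list).
        dictionary[key] = {
            "name": msg.get("name"),
            "fields": [dict(f.attrib) for f in msg.findall("field")],
        }
    return dictionary

STRUCTURES = build_structure_dictionary(CONFIG)
```

Here each structure object is a plain dictionary; a real implementation could equally use dedicated classes for messages and fields.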
Optionally, step S2 of the foregoing method, which obtains the message structure object corresponding to a data message, includes:
S21, acquiring a binary data message and extracting its features;
S22, obtaining the corresponding message structure object from the message structure object dictionary, using the extracted features as the index.
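Steps S21 and S22 can be sketched as below. The header layout is an assumption (the source does not give one): SID, DID and type are taken to be the first three bytes of the message, one byte each. The function names are hypothetical.

```python
import struct

# Assumed header layout (not from the source): SID, DID and type, one byte each.
HEADER = struct.Struct(">BBB")

def extract_features(message: bytes):
    """S21: read the feature fields from the start of the binary message."""
    if len(message) < HEADER.size:
        raise ValueError("message shorter than the assumed 3-byte header")
    return HEADER.unpack_from(message, 0)

def lookup_structure(dictionary, message: bytes):
    """S22: use the extracted features as the index into the dictionary."""
    key = extract_features(message)
    if key not in dictionary:
        raise KeyError(f"no message structure configured for features {key}")
    return dictionary[key]

structures = {(1, 2, 16): {"name": "Status", "fields": []}}
found = lookup_structure(structures, bytes([1, 2, 16, 0x00, 0x2A]))
```

Raising an error for an unknown feature tuple is a design choice of this sketch; the description does not say how unconfigured messages are handled.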
Optionally, step S3 of the foregoing method, which parses based on the attributes of the field structure, includes:
S31, traversing the field structure object list in the message structure object; if the list contains a field structure object that has not been traversed, taking out that field structure object and executing S32; otherwise, exiting this level of message parsing;
S32, evaluating the extracted field structure object based on its count attribute;
S33, evaluating the extracted field structure object based on its refer to attribute: if the extracted field structure object contains the refer to attribute, re-executing steps S2 to S3; if it does not contain the refer to attribute, re-executing step S31.
In implementation, S32 includes:
S321, if the field structure object contains a count attribute, judging whether the count attribute is a variable;
S322, if the count attribute is a variable, dynamically determining the value of the count attribute according to the results of the fields already parsed, and parsing the field containing the count attribute as an array;
S323, if the count attribute is a constant, parsing the field as a fixed-length array;
S324, if the field structure object does not contain a count attribute, parsing the field as a single element.
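The count handling of S321 to S324 can be illustrated with a short sketch. It assumes that field attributes arrive as strings (as they would from XML), that a purely numeric count is a constant, that any other count names an earlier field, and that elements are unsigned big-endian integers; none of these details is fixed by the source, and the function names are hypothetical.

```python
def resolve_count(field, parsed):
    """Return None for a single element, otherwise the number of array elements."""
    count = field.get("count")
    if count is None:
        return None               # S324: no count attribute, single element
    if count.isdigit():
        return int(count)         # S323: constant, fixed-length array
    return int(parsed[count])     # S322: variable, value of an already parsed field

def parse_with_count(field, data, offset, parsed):
    """Decode one field as an integer or a list of integers; return (value, new offset)."""
    length = int(field["length"])
    n = resolve_count(field, parsed)

    def read_one(pos):
        # Assumption: unsigned big-endian integers.
        return int.from_bytes(data[pos:pos + length], "big")

    if n is None:
        return read_one(offset), offset + length
    values = [read_one(offset + i * length) for i in range(n)]
    return values, offset + n * length

# One count byte followed by that many 2-byte elements.
data = bytes([2, 0, 10, 0, 20])
parsed = {}
parsed["friendsCount"], off = parse_with_count(
    {"name": "friendsCount", "length": "1"}, data, 0, parsed)
ids, off = parse_with_count(
    {"name": "ids", "length": "2", "count": "friendsCount"}, data, off, parsed)
```

Because the variable count is looked up in the results parsed so far, the counting field has to appear before the array it describes.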
S33 includes:
S331, if the field structure object contains a refer to attribute, judging whether the refer to attribute is an expression;
S332, if the refer to attribute is an expression, dynamically determining the referenced sub-message structure object according to the previous parsing results, and parsing the field as a sub-message;
S333, if the refer to attribute is not an expression, determining the referenced sub-message structure object directly from the value of the refer to attribute, and parsing the field as a sub-message;
S334, if the field structure object does not contain the refer to attribute, executing the step of parsing a basic type field.
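The following sketch shows one way S331 to S333 could select the sub-message structure object. The source does not define the expression syntax, so the form `selector:value=Name,value=Name` used here is purely an assumption, as is the function name; the example names are borrowed from the Food example of FIG. 3.

```python
def resolve_refer_to(refer_to, parsed, structures):
    """Return the sub-message structure object that a refer to value points at.

    Assumed expression syntax (not from the source): "selector:value=Name,...".
    A value without ':' is treated as a plain sub-message name.
    """
    if ":" not in refer_to:
        return structures[refer_to]                 # S333: direct name
    selector, _, cases = refer_to.partition(":")     # S332: expression
    table = dict(case.split("=") for case in cases.split(","))
    # The selector value comes from a field parsed earlier in the same message.
    return structures[table[str(parsed[selector])]]

structures = {
    "Friend": {"name": "Friend", "fields": []},
    "Meet": {"name": "Meet", "fields": []},
    "Vegetable": {"name": "Vegetable", "fields": []},
}
direct = resolve_refer_to("Friend", {}, structures)
dynamic = resolve_refer_to("Type:1=Meet,2=Vegetable", {"Type": 2}, structures)
```

Once the structure object is resolved, the field is parsed by running the same traversal on the sub-message, which is where the recursion of the method comes from.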
S334 includes:
S3341, determining the length of the field according to length, determining the type of the field according to format, and parsing the value of the field;
S3342, if the format attribute of the field structure object is a CRC function expression, the expression is not used when the binary data message is parsed into a data dictionary; the value of the expression is calculated in real time when the data dictionary is serialized into a binary data message.
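Step S3341 can be sketched as a small decoder keyed on the format names hex, decimal, binary and ascii used in this description. How each format is rendered (hex and binary as strings, decimal as an unsigned big-endian integer) is an assumption of this sketch, and `decode_basic` is a hypothetical name.

```python
def decode_basic(field, data, offset):
    """S3341: read `length` bytes and interpret them according to `format`."""
    length = int(field["length"])
    raw = data[offset:offset + length]
    if len(raw) != length:
        raise ValueError(f"field {field['name']!r} runs past the end of the message")
    fmt = field["format"]
    if fmt == "hex":
        value = raw.hex()
    elif fmt == "decimal":
        value = int.from_bytes(raw, "big")   # assumption: unsigned, big-endian
    elif fmt == "binary":
        value = format(int.from_bytes(raw, "big"), f"0{8 * length}b")
    elif fmt == "ascii":
        value = raw.decode("ascii")
    else:
        raise ValueError(f"unsupported format {fmt!r}")
    return value, offset + length

data = b"Tom\x01\x00\x0a\xff\x05"
name, off = decode_basic({"name": "Name", "length": "3", "format": "ascii"}, data, 0)
age, off = decode_basic({"name": "age", "length": "2", "format": "decimal"}, data, off)
code, off = decode_basic({"name": "code", "length": "2", "format": "hex"}, data, off)
bits, off = decode_basic({"name": "bits", "length": "1", "format": "binary"}, data, off)
```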
In the above steps, the format extension attribute of a field indicates the type of the field. It may be hex (hexadecimal), decimal, binary or ascii (character string), or it may be an expression such as crc32(field1, field2, field3) or crc32(field1...), which indicates that after the message is modified the value of this field must be recalculated dynamically from the other fields of the message. Through the refer to and embed extension attributes, fields shared by several messages need to be configured only once and can then be referenced in multiple places; during parsing, the referenced sub-message configuration is embedded into the main message that references it.
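To illustrate the recalculation of a CRC field at serialization time, the sketch below rebuilds a message from a data dictionary and computes any field whose format is a `crc32(...)` expression from the encoded bytes of the fields it names. It assumes an explicit comma-separated field list, a 4-byte CRC field placed after the fields it covers, unsigned big-endian integers for the other fields, and the standard CRC-32 of Python's `zlib`; the source does not specify the CRC variant, and `serialize` is a hypothetical name.

```python
import zlib

def serialize(fields, values):
    """Serialize a data dictionary; crc32(...) fields are computed, not copied."""
    out = bytearray()
    encoded = {}
    for f in fields:
        length = int(f["length"])
        fmt = f["format"]
        if fmt.startswith("crc32(") and fmt.endswith(")"):
            # The stored value (if any) is ignored; the CRC is recomputed from
            # the bytes of the listed fields, which must already be encoded.
            names = [n.strip() for n in fmt[len("crc32("):-1].split(",")]
            covered = b"".join(encoded[n] for n in names)
            chunk = zlib.crc32(covered).to_bytes(length, "big")
        else:
            chunk = int(values[f["name"]]).to_bytes(length, "big")
        encoded[f["name"]] = chunk
        out += chunk
    return bytes(out)

fields = [
    {"name": "a", "length": "2", "format": "decimal"},
    {"name": "b", "length": "1", "format": "decimal"},
    {"name": "crc", "length": "4", "format": "crc32(a, b)"},
]
original = serialize(fields, {"a": 258, "b": 7})
modified = serialize(fields, {"a": 258, "b": 8})
```

Changing `b` changes the trailing four bytes automatically, which is the behaviour the description asks for when a message is modified.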
Based on the technical solution proposed above, a specific execution flow is shown in FIG. 2:
Step 1, reading the XML configuration file of the binary data message, and constructing a message structure object dictionary with feature fields of the message such as SID, DID and type as index keys and message structure objects as values. The message structure object includes the basic attributes of the message and a field structure object list; each field structure object holds the attributes of one field, including the number of bytes the field occupies (length), the field type (format), whether the field is an array and the array length (count), the message structure object of a referenced sub-message (refer to), and so on.
Step 2, acquiring a binary data message, and obtaining the corresponding message structure object from the message structure object dictionary according to the index.
Step 3, traversing the field structure object list in the message structure object; if the list contains field structure objects that have not been traversed, taking out one of them and entering the next step; otherwise, exiting this level of message parsing.
Step 4, if the field structure object contains the count attribute, judging whether the count attribute is a variable. If so, the value of the count attribute is determined dynamically from the parsing results of the previous fields. A field containing the count attribute is parsed as an array; otherwise, it is parsed as a single element.
Step 6, if the field structure object contains a refer to attribute, judging whether the refer to attribute is an expression. If so, the referenced sub-message structure object is determined dynamically from the previous parsing results. A field containing the refer to attribute is parsed as a sub-message, and the process returns to step 3 (the recursion); otherwise, the process proceeds to step 7.
Step 7, determining the length of the field according to length, determining the type of the field according to format, and parsing the value of the field. If the format attribute of the field structure object is a CRC function expression, the expression is not used when the binary data message is parsed into a data dictionary; instead, its value is calculated in real time when the data dictionary is serialized into a binary data message.
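The flow of steps 3 to 7 can be condensed into one recursive function. This is a simplified sketch under stated assumptions rather than the patented implementation: structure objects are plain dictionaries, the `refer_to` key holds a plain sub-message name (expressions are omitted), every basic field is an unsigned big-endian integer, and the message and field names are hypothetical.

```python
def parse_message(structure, data, offset, structures):
    """Parse one message level; return (parsed dictionary, new offset)."""
    parsed = {}
    for field in structure["fields"]:                 # step 3: traverse fields
        count = field.get("count")                    # step 4: count attribute
        if count is None:
            n = None                                  # single element
        elif isinstance(count, int):
            n = count                                 # constant: fixed-length array
        else:
            n = parsed[count]                         # variable: earlier field value
        items = []
        for _ in range(1 if n is None else n):
            if "refer_to" in field:                   # step 6: sub-message, recurse
                sub = structures[field["refer_to"]]
                value, offset = parse_message(sub, data, offset, structures)
            else:                                     # step 7: basic type field
                length = field["length"]
                value = int.from_bytes(data[offset:offset + length], "big")
                offset += length
            items.append(value)
        parsed[field["name"]] = items[0] if n is None else items
    return parsed, offset

structures = {
    "Person": {"fields": [
        {"name": "friendsCount", "length": 1},
        {"name": "friends", "refer_to": "Friend", "count": "friendsCount"},
    ]},
    "Friend": {"fields": [
        {"name": "age", "length": 1},
        {"name": "luckyNumber", "length": 2},
    ]},
}
message = bytes([2, 30, 0, 7, 25, 1, 0])
result, end = parse_message(structures["Person"], message, 0, structures)
```

Each recursive call returns the offset it reached, so the caller continues with the next field exactly where the sub-message ended.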
FIG. 3 is an XML configuration diagram of a binary data message; it replaces the configuration of an actual rail transit service message with an example from everyday life. The main message includes the fields Name, age, heightAndweight, luckyNumber, phoneNumbers, friendsCount, friends and CRCValue.
Line 5: heightAndweight embeds the Height and weight fields into the main message by embedding;
Line 7: phoneNumbers is a fixed-length array of 5 elements, each a 4-byte decimal number;
Line 9: friends is a dynamic variable-length array whose elements are structures of type Friend and whose length is the value of friendsCount;
Line 10: when luckyNumber or any field after it changes, the value of the CRCValue field is recalculated from luckyNumber and the fields after it;
Line 25: the value of Food is a structure determined dynamically by the value of Type; when Type equals 1 or 2, Food is parsed with the Meet or Vegetable sub-message configuration, respectively.
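Since the drawing itself is not reproduced in this text, the following is only a hypothetical reconstruction of what the main message configuration of FIG. 3 might look like, written as an XML string and inspected with Python. The element names, the attribute spellings `referTo` and `embed`, the sub-message name `HeightAndWeight`, and every length other than the 4-byte, 5-element phoneNumbers array are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical reconstruction; not the actual content of FIG. 3.
CONFIG = """
<messages>
  <message name="Person">
    <field name="Name" length="8" format="ascii"/>
    <field name="age" length="1" format="decimal"/>
    <field name="heightAndweight" referTo="HeightAndWeight" embed="true"/>
    <field name="luckyNumber" length="4" format="decimal"/>
    <field name="phoneNumbers" length="4" format="decimal" count="5"/>
    <field name="friendsCount" length="1" format="decimal"/>
    <field name="friends" referTo="Friend" count="friendsCount"/>
    <field name="CRCValue" length="4" format="crc32(luckyNumber...)"/>
  </message>
</messages>
"""

def classify(field):
    """Name the parsing mode that the attributes of one field select."""
    attrs = field.attrib
    if attrs.get("format", "").startswith("crc32("):
        return "crc"
    count = attrs.get("count")
    if count is not None:
        return "fixed array" if count.isdigit() else "variable array"
    if "referTo" in attrs:
        return "embedded sub-message" if attrs.get("embed") == "true" else "sub-message"
    return "basic"

person = ET.fromstring(CONFIG).find("message[@name='Person']")
kinds = {f.get("name"): classify(f) for f in person.findall("field")}
```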
The foregoing is illustrative of the present application and is not to be construed as limiting it; the scope of the present application is defined by the appended claims.
Claims (4)
1. A binary data message parsing method based on recursive processing of an XML configuration file, characterized by comprising the following steps:
S1, reading an XML configuration file, and constructing a message structure object dictionary based on the feature fields read;
S2, acquiring a data message, and obtaining the message structure object corresponding to the data message from the constructed message structure object dictionary;
S3, traversing the field structures in the message structure object, and parsing based on the attributes of each field structure;
wherein S3 comprises:
S31, traversing the field structure object list in the message structure object; if the list contains a field structure object that has not been traversed, taking out that field structure object and executing S32; otherwise, exiting this level of message parsing;
S32, evaluating the extracted field structure object based on its count attribute;
S33, evaluating the extracted field structure object based on its refer to attribute: if the extracted field structure object contains the refer to attribute, re-executing step S2; if it does not contain the refer to attribute, executing the step of parsing a basic type field;
wherein S32 comprises:
S321, if the field structure object contains a count attribute, judging whether the count attribute is a variable;
S322, if the count attribute is a variable, dynamically determining the value of the count attribute according to the results of the fields already parsed, and parsing the field containing the count attribute as an array;
S323, if the count attribute is a constant, parsing the field as a fixed-length array;
S324, if the field structure object does not contain a count attribute, parsing the field as a single element;
wherein S33 comprises:
S331, if the field structure object contains a refer to attribute, judging whether the refer to attribute is an expression;
S332, if the refer to attribute is an expression, dynamically determining the referenced sub-message structure object according to the previous parsing results, and parsing the field as a sub-message;
S333, if the refer to attribute is not an expression, determining the referenced sub-message structure object directly from the value of the refer to attribute, and parsing the field as a sub-message;
S334, if the field structure object does not contain the refer to attribute, executing the step of parsing a basic type field.
2. The binary data message parsing method based on recursive processing of an XML configuration file according to claim 1, wherein S1 includes:
S11, reading the XML configuration file of the binary data message, and acquiring the feature fields in the XML configuration file;
S12, constructing the message structure object dictionary with the acquired feature fields as index keys and the message structure objects as values.
3. The binary data message parsing method based on recursive processing of an XML configuration file according to claim 1, wherein S2 includes:
S21, acquiring a binary data message and extracting its features;
S22, obtaining the corresponding message structure object from the message structure object dictionary, using the extracted features as the index.
4. The binary data message parsing method based on recursive processing of an XML configuration file according to claim 1, wherein S334 further comprises:
S3341, determining the length of the field according to length, determining the type of the field according to format, and parsing the value of the field;
S3342, if the format attribute of the field structure object is a CRC function expression, the expression is not used when the binary data message is parsed into a data dictionary; the value of the expression is calculated in real time when the data dictionary is serialized into a binary data message.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210802565.6A CN115334177B (en) | 2022-07-07 | 2022-07-07 | Binary data message analysis method based on xml configuration file recursion realization |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210802565.6A CN115334177B (en) | 2022-07-07 | 2022-07-07 | Binary data message analysis method based on xml configuration file recursion realization |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115334177A (en) | 2022-11-11 |
CN115334177B (en) | 2023-12-05 |
Family
ID=83918626
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210802565.6A Active CN115334177B (en) | 2022-07-07 | 2022-07-07 | Binary data message analysis method based on xml configuration file recursion realization |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115334177B (en) |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101221556A (en) * | 2008-02-01 | 2008-07-16 | 中国建设银行股份有限公司 | Method and device for XML document analysis |
CN106095792A (en) * | 2016-05-27 | 2016-11-09 | 中国银联股份有限公司 | The method and apparatus generating database manipulation code |
CN109885569A (en) * | 2018-12-29 | 2019-06-14 | 天津南大通用数据技术股份有限公司 | Field extraction and structural method are carried out to XML data based on configuration file |
CN110457526A (en) * | 2019-07-31 | 2019-11-15 | 南京理工大学 | Unitized data analytic method based on xml document |
CN110719296A (en) * | 2019-10-25 | 2020-01-21 | 福建网能科技开发有限责任公司 | Method for automatically analyzing message data in terminal communication protocol |
CN112037074A (en) * | 2020-09-11 | 2020-12-04 | 中国银行股份有限公司 | Visualization-based data file analysis method and device |
CN113225320A (en) * | 2021-04-21 | 2021-08-06 | 南京理工大学 | Network message analysis method for keeping user configurable message format secret |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US9842090B2 (en) * | 2007-12-05 | 2017-12-12 | Oracle International Corporation | Efficient streaming evaluation of XPaths on binary-encoded XML schema-based documents |
Also Published As
Publication number | Publication date |
---|---|
CN115334177A (en) | 2022-11-11 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||