A method for restoring Java serialized file data
Technical field
The present invention relates to the field of information security technology, and in particular to a method for restoring Java serialized file data.
Background art
Parsing serialized files is a frequent task in the field of data parsing and recovery. Java serialization is a data serialization mechanism provided by Java itself. It allows a developer to store one or a series of class-based structured data items on a storage device in the form of a stream. This mechanism greatly simplifies the streaming of complex data structures.
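As a minimal illustration of the mechanism described above, the following sketch serializes a small object to a byte array with the standard `ObjectOutputStream`. The class `DemoPoint` and the helper `SerializeDemo.toBytes` are hypothetical names introduced only for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical sample class, used only for illustration.
class DemoPoint implements Serializable {
    private static final long serialVersionUID = 1L;
    int x = 3;
    int y = 4;
}

class SerializeDemo {
    // Serializes an object into its stream (byte array) form.
    static byte[] toBytes(Serializable obj) {
        try (ByteArrayOutputStream buf = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(obj);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Print the stream as hex so the leading identifiers can be inspected.
        StringBuilder hex = new StringBuilder();
        for (byte b : toBytes(new DemoPoint())) {
            hex.append(String.format("%02x ", b));
        }
        System.out.println(hex.toString().trim());
    }
}
```

In the standard stream format the output begins with a 4-byte header (magic number and version), followed by 0x73 (object) and 0x72 (class description), which are the structure control identifiers discussed later in this document.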
Existing methods for restoring Java serialized data have the following obvious shortcomings and inconveniences:
1. When the original data structure of the serialized data has been lost, Java's native deserialization mechanism cannot restore the serialized data to its original structure.
2. When the original data structure is unknown, the data can usually be parsed manually, but manual parsing becomes difficult and time-consuming as the data grows more complex.
3. Data sharing the same structure can be parsed by writing a script for that specific structure, but such a script is not general and cannot cope with the large amount of serialized data of unknown structure.
4. Neither manual parsing nor script parsing can restore the data in an arbitrary serialized file to structured in-memory objects that the system can call directly.
Summary of the invention
In view of the defects of the prior art, the present invention provides a method for restoring Java serialized file data that can effectively solve the above problems of the prior art.
A method for restoring Java serialized file data, comprising the following steps:
S1: analyze and record the identifiers in the Java serialized file data that relate to data type and structure;
S2: define an intermediate structure for storing the data type name, field name, and value of each node generated during parsing;
S3: traverse and parse the nodes in the serialized file according to a rule to obtain one top-level intermediate result, maintaining a list of class definition IDs during parsing;
S4: expand the intermediate result and convert it to a JSON string using the JSON tool built into Android;
S5: extract the class structure and generate a class template for restoring the data in memory;
S6: restore the complete serialized data into memory.
Preferably, in S1 there are ten data type identifiers, each identifying the data type of the item it modifies. They correspond respectively to the 8 primitive data types of Java, Java classes (objects), and arrays.
The specific meanings are: 0x42 denotes byte; 0x43 denotes char; 0x44 denotes double; 0x46 denotes float; 0x49 denotes int; 0x4A denotes long; 0x4C denotes object; 0x53 denotes short; 0x5A denotes boolean; 0x5B denotes array.
Preferably, the structure control identifiers in S1 mainly include the following:
0x71: indicates that the class has already been described; refer to the record in the description list;
0x72: marks the beginning of a class description;
0x73: marks the beginning of an object description;
0x74: indicates that the structure is a String;
0x75: marks the beginning of an array description;
0x77: indicates that the data that follows is block data;
0x78: marks the end of a class structure description;
0x70: indicates whether the class has a superclass.
Preferably, the detailed steps of S2 are as follows:
S201: analyze the data storage format to obtain the data structure, for use in designing the parsing algorithm;
Analysis of the class structure storage format: a class structure description is composed of four parts, namely the data type identifiers from S1, the structure control identifiers, data lengths, and data. The description is a byte stream with the structure: 0x72, class name length, class name, fingerprint and identifier, field count, field list, 0x78, reference indication;
Analysis of data meaning and data storage structure: the data immediately follows the end marker of the class structure description and is a byte string made up of a series of data descriptions joined end to end, with the structure: data type descriptor, data;
S202: define the intermediate structure;
The intermediate structure needs to store the class description number, type name, field name, and data.
Preferably, the detailed steps of S3 are as follows:
S301: traverse and parse the entire file recursively, building intermediate results by category, and finally obtain one overall intermediate result;
S302: when a class structure is described for the first time, store the intermediate structure converted from that class in a list, numbered from 0; when the already-described identifier 0x71 is encountered, retrieve and load the class description from the list according to the numeric ID that follows the identifier.
Preferably, the detailed steps of S5 are as follows:
S501: traverse the intermediate result depth-first to generate and parse nodes; if the outermost node is a class description, execute S502;
S503: create a temporary .java source file;
S504: use the class name from the class description as the class name in the source file;
S505: read the node names of the obtained intermediate result to obtain the field names;
S506: read the data type stored in each node of the obtained intermediate result and create the variables dynamically;
S507: count the nodes;
S508: write the field information to the temporary .java file according to data type and name;
S509: write the get and set methods to the source file;
S5010: save the source file under its own class name;
S5011: dynamically compile the source file into a .class file.
Preferably, the detailed steps of S6 are as follows:
S601: for data whose class template has been generated, take the outermost node;
S602: when the full serialized data needs to be restored directly, execute S603; when only part of the serialized data needs to be restored, execute S604 to S606;
S603: generate the class object directly using Java's built-in interface for reading serialized class data, and access the class object in the usual Java manner; end;
S604: generate the class object using reflection;
S605: traverse the data field part of the generated intermediate object;
S606: parse each data field item, and use reflection either to call the set method to assign the class's parameters or to call the get method to obtain the parameter values; end.
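The direct path in S603 relies on Java's native deserialization. The following is a minimal sketch of that path, assuming the class is already available on the classpath (for example after the template of S5 has been compiled). `Account` and `NativeRestore` are hypothetical names used only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical class standing in for a restored entity class.
class Account implements Serializable {
    private static final long serialVersionUID = 1L;
    String owner = "alice";
    int balance = 10;
}

class NativeRestore {
    // Produces sample serialized data for the demonstration.
    static byte[] serialize(Serializable obj) {
        try (ByteArrayOutputStream buf = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(obj);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    // Restores the object with Java's built-in reader (the S603 path).
    static Object restore(byte[] data) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A caller would cast the result, for example `Account a = (Account) NativeRestore.restore(bytes);`, and then access its fields as with any other Java object.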
Compared with the prior art, the present invention has the following advantages:
The data storage structure and storage format of Java serialized files stored in binary form are analyzed independently, and a structure named the "intermediate data structure" is defined to store information that must be recorded during parsing, such as data type names and field names.
Based on the results of analyzing the data storage structure and format, the intermediate data structure and an algorithm are designed to convert the serialized data: regardless of data type, the data is converted directly to JSON, with the intermediate data structure as the unit.
An algorithm is designed to convert the data structure embodied in the generated JSON into Java classes and to compile them dynamically.
Interfaces for generating and accessing objects of the dynamically compiled classes are implemented, and an algorithm is designed to fill the data in the JSON into the dynamically generated in-memory objects. In combination with Java's native deserialization method, Java serialized data can also be converted directly into in-memory objects.
Any standard Java serialized file can be parsed into cross-platform general data in JSON or XML format even when the original data structure is unknown.
For the Java platform itself, on the basis of parsing the serialized file into JSON or XML, the original entity classes are generated for the target platform, and the parsed data is made available to the target platform in the form of objects.
The entity classes restored during parsing on the Java platform can be dynamically compiled into .class files for use by the target platform in subsequent processing of the data.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in further detail below by way of examples.
A method for restoring Java serialized file data, comprising the following steps:
S1: analyze and record the identifiers in the Java serialized file data that relate to data type and structure;
S2: define an intermediate structure for storing the data type name, field name, and value of each node generated during parsing;
S3: traverse and parse the nodes in the serialized file according to a rule to obtain one top-level intermediate result, maintaining a list of class definition IDs during parsing;
S4: expand the intermediate result and convert it to JSON format (at this point the data has been recovered and can be used);
S5: extract the class structure and generate a class template for restoring the data in memory (this step and the next implement dynamic invocation of the restored data in an Android or Java system);
S6: restore the complete serialized data into memory.
The key steps above are described in detail below:
S1: analyze and record the identifiers in the Java serialized file data that relate to data type and structure control;
There are ten data type identifiers, each identifying the data type of the item it modifies. They correspond respectively to the 8 primitive data types of Java, Java classes (objects), and arrays.
The specific meanings are: 0x42 denotes byte; 0x43 denotes char; 0x44 denotes double; 0x46 denotes float; 0x49 denotes int; 0x4A denotes long; 0x4C denotes object; 0x53 denotes short; 0x5A denotes boolean; 0x5B denotes array.
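The ten data type identifiers are the ASCII codes of the type letters used in the standard Java serialization stream ('B', 'C', 'D', 'F', 'I', 'J', 'L', 'S', 'Z', '['); in that format long is written as 'J', which is 0x4A. A minimal lookup sketch follows; the class name `TypeCodes` and its method are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class TypeCodes {
    // Maps each data type identifier byte to a readable type name.
    private static final Map<Byte, String> NAMES = new LinkedHashMap<>();

    static {
        NAMES.put((byte) 'B', "byte");     // 0x42
        NAMES.put((byte) 'C', "char");     // 0x43
        NAMES.put((byte) 'D', "double");   // 0x44
        NAMES.put((byte) 'F', "float");    // 0x46
        NAMES.put((byte) 'I', "int");      // 0x49
        NAMES.put((byte) 'J', "long");     // 0x4A
        NAMES.put((byte) 'L', "object");   // 0x4C
        NAMES.put((byte) 'S', "short");    // 0x53
        NAMES.put((byte) 'Z', "boolean");  // 0x5A
        NAMES.put((byte) '[', "array");    // 0x5B
    }

    // Returns the type name for an identifier, rejecting unknown bytes.
    static String nameOf(byte code) {
        String name = NAMES.get(code);
        if (name == null) {
            throw new IllegalArgumentException(
                    String.format("unknown type identifier 0x%02x", code));
        }
        return name;
    }
}
```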
The structure control identifiers mainly include the following:
0x71: indicates that the class has already been described; refer to the record in the description list;
0x72: marks the beginning of a class description;
0x73: marks the beginning of an object description;
0x74: indicates that the structure is a String;
0x75: marks the beginning of an array description;
0x77: indicates that the data that follows is block data;
0x78: marks the end of a class structure description;
0x70: indicates whether the class has a superclass.
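These byte values coincide with the `TC_*` constants that the JDK exposes in `java.io.ObjectStreamConstants`. The sketch below maps each one to a short description. The description strings paraphrase the list above, and the class name `ControlCodes` is hypothetical.

```java
import java.io.ObjectStreamConstants;

class ControlCodes {
    // Returns a short description of a structure control identifier.
    static String describe(byte code) {
        switch (code) {
            case ObjectStreamConstants.TC_NULL:         // 0x70
                return "null reference / no superclass";
            case ObjectStreamConstants.TC_REFERENCE:    // 0x71
                return "reference to an earlier description";
            case ObjectStreamConstants.TC_CLASSDESC:    // 0x72
                return "class description start";
            case ObjectStreamConstants.TC_OBJECT:       // 0x73
                return "object start";
            case ObjectStreamConstants.TC_STRING:       // 0x74
                return "string";
            case ObjectStreamConstants.TC_ARRAY:        // 0x75
                return "array start";
            case ObjectStreamConstants.TC_BLOCKDATA:    // 0x77
                return "block data";
            case ObjectStreamConstants.TC_ENDBLOCKDATA: // 0x78
                return "end of class description";
            default:
                return String.format("unhandled identifier 0x%02x", code);
        }
    }
}
```

Using the JDK constants rather than literal numbers keeps a parser aligned with the stream format that `ObjectOutputStream` actually writes.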
S2: define an intermediate structure for storing the data type name, field name, and value of each node generated during parsing;
S201: analyze the data storage format to obtain the data structure, for use in designing the parsing algorithm;
Analysis of the class structure storage format:
A class structure description is composed of four parts, namely the data type identifiers from S1, the structure control identifiers, data lengths, and data. The description is a byte stream with the following structure:
0x72, class name length, class name, fingerprint and identifier, field count, field list (each entry has the structure: data type identifier, field name length, field name), 0x78, reference indication (0x70/0x76).
Here 0x70 indicates that there is no reference, that is, no superclass, while 0x76 indicates that a description of the parent class follows.
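The layout above can be walked with a `DataInputStream`, as in the following minimal sketch. It assumes that the "fingerprint" is the 8-byte serialVersionUID and the "identifier" is the 1-byte flags field of the standard stream format, and it only checks whether the byte after 0x78 is 0x70. `Sample` and `ClassDescParser` are hypothetical names, and class annotations and nested descriptions are not handled.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sample class whose description is parsed below.
class Sample implements Serializable {
    private static final long serialVersionUID = 42L;
    int id = 7;
    String name = "x";
}

class ClassDescParser {
    static final class ClassDesc {
        String className;
        long serialVersionUID;                          // assumed "fingerprint"
        int flags;                                      // assumed "identifier"
        final List<String> fields = new ArrayList<>();  // "<type code> <name>"
        boolean hasSuper;
    }

    static byte[] serialize(Serializable obj) {
        try (ByteArrayOutputStream buf = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(buf)) {
            out.writeObject(obj);
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Parses the first class description of a stream holding a single object.
    static ClassDesc parse(byte[] stream) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(stream))) {
            in.readShort();                  // stream magic number
            in.readShort();                  // stream version
            expect(in, 0x73);                // object start
            expect(in, 0x72);                // class description start
            ClassDesc d = new ClassDesc();
            d.className = in.readUTF();      // 2-byte length + class name
            d.serialVersionUID = in.readLong();
            d.flags = in.readUnsignedByte();
            int fieldCount = in.readUnsignedShort();
            for (int i = 0; i < fieldCount; i++) {
                char type = (char) in.readUnsignedByte();
                String name = in.readUTF();
                if (type == 'L' || type == '[') {
                    // Object and array fields also carry their type name,
                    // either as a new string (0x74) or a back reference (0x71).
                    int tag = in.readUnsignedByte();
                    if (tag == 0x74) in.readUTF();
                    else if (tag == 0x71) in.readInt();
                    else throw new IOException("unexpected field type tag " + tag);
                }
                d.fields.add(type + " " + name);
            }
            expect(in, 0x78);                // end of class description
            d.hasSuper = in.readUnsignedByte() != 0x70;
            return d;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void expect(DataInputStream in, int tag) throws IOException {
        int actual = in.readUnsignedByte();
        if (actual != tag) {
            throw new IOException(String.format("expected 0x%02x, found 0x%02x", tag, actual));
        }
    }
}
```

For example, `ClassDescParser.parse(ClassDescParser.serialize(new Sample()))` yields a description whose field list holds the `int` field `id` followed by the object field `name`.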
Analysis of data meaning and data storage structure:
The data immediately follows the end marker of the class structure description and is a byte string made up of a series of data descriptions joined end to end. The structure is as follows:
data type descriptor, data;
When the data type descriptor is 0x71, the type description is obtained directly from the list of recorded type descriptions according to the data stored in the two bytes that follow it.
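Reading a single value once its data type identifier is known can be sketched as follows. The sketch covers only the eight primitive types and assumes the big-endian encoding used by `DataInputStream`. Note that in the standard stream format the primitive field values of an object follow in field-list order and their types are taken from the class description. `PrimitiveReader` and its `sample` helper are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class PrimitiveReader {
    // Reads one primitive value whose type is given by a data type identifier.
    static Object read(DataInputStream in, char typeCode) {
        try {
            switch (typeCode) {
                case 'B': return in.readByte();
                case 'C': return in.readChar();
                case 'D': return in.readDouble();
                case 'F': return in.readFloat();
                case 'I': return in.readInt();
                case 'J': return in.readLong();
                case 'S': return in.readShort();
                case 'Z': return in.readBoolean();
                default:
                    throw new IllegalArgumentException("not a primitive type code: " + typeCode);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds sample data for the demonstration: an int followed by a double.
    static DataInputStream sample() {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(7);
            out.writeDouble(1.5);
            out.flush();
            return new DataInputStream(new ByteArrayInputStream(buf.toByteArray()));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```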
S202: define the intermediate structure;
The intermediate structure needs to store the class description number, type name, field name, and data;
In a Java serialized file, every field, class, and data object of a class must be stored using this structure.
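One possible shape for such an intermediate structure is sketched below. The class name `IntermediateNode`, the use of -1 for "no class description", and the child list for nested fields are assumptions made for illustration and are not taken from the source.

```java
import java.util.ArrayList;
import java.util.List;

class IntermediateNode {
    final int classDescId;   // number in the class description list, or -1 if none
    final String typeName;   // e.g. "int" or a class name
    final String fieldName;  // name of the field this node represents
    final Object data;       // value for leaf nodes, null for containers
    final List<IntermediateNode> children = new ArrayList<>();

    IntermediateNode(int classDescId, String typeName, String fieldName, Object data) {
        this.classDescId = classDescId;
        this.typeName = typeName;
        this.fieldName = fieldName;
        this.data = data;
    }

    // Attaches a nested node and returns this node to allow chaining.
    IntermediateNode add(IntermediateNode child) {
        children.add(child);
        return this;
    }

    // Counts this node and all of its descendants recursively.
    int countNodes() {
        int total = 1;
        for (IntermediateNode child : children) {
            total += child.countNodes();
        }
        return total;
    }
}
```

Because fields, classes, and objects all use the same node type, a recursive parser can build the whole result as a single tree rooted at the top-level node.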
S3: traverse and parse the nodes in the serialized file by recursive depth-first traversal to obtain one top-level intermediate result, maintaining a list of class definition IDs during parsing;
S301: traverse and parse the entire file recursively, building intermediate results by category, and finally obtain one overall intermediate result.
S302: when a class structure is described for the first time, store the intermediate structure converted from that class in a list, numbered from 0. When the already-described identifier 0x71 is encountered, retrieve and load the class description from the list according to the numeric ID that follows the identifier.
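The list maintained in S302 can be sketched as a small registry. In the standard stream format the value after 0x71 is a 4-byte handle that starts at `ObjectStreamConstants.baseWireHandle` (0x7E0000), so subtracting that base gives a zero-based number. The standard format also assigns handles to strings and objects, so using the result directly as an index into a class-only list is an assumption of this sketch. `ClassDescRegistry` is a hypothetical name.

```java
import java.io.ObjectStreamConstants;
import java.util.ArrayList;
import java.util.List;

class ClassDescRegistry {
    private final List<String> descriptions = new ArrayList<>();

    // Records a description seen for the first time; returns its number (from 0).
    int register(String classDescription) {
        descriptions.add(classDescription);
        return descriptions.size() - 1;
    }

    // Retrieves a previously recorded description by its number.
    String lookup(int id) {
        if (id < 0 || id >= descriptions.size()) {
            throw new IllegalArgumentException("no class description numbered " + id);
        }
        return descriptions.get(id);
    }

    // Converts a 4-byte wire handle to a zero-based number.
    static int idFromHandle(int handle) {
        return handle - ObjectStreamConstants.baseWireHandle;
    }
}
```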
S4: expand the intermediate result and convert it to a JSON string using the JSON tool built into Android.
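On Android this conversion would typically use the bundled `org.json` classes. To stay self-contained, the sketch below instead uses a small hand-written serializer, which is a substitute for illustration and not the tool named in S4. The `Node` type is a simplified, hypothetical stand-in for the intermediate structure.

```java
import java.util.ArrayList;
import java.util.List;

class JsonSketch {
    static final class Node {
        final String name;
        final Object value;         // used only by leaf nodes
        final List<Node> children;  // null for leaf nodes

        private Node(String name, Object value, List<Node> children) {
            this.name = name;
            this.value = value;
            this.children = children;
        }

        static Node leaf(String name, Object value) { return new Node(name, value, null); }

        static Node object(String name) { return new Node(name, null, new ArrayList<>()); }

        Node add(Node child) { children.add(child); return this; }
    }

    // Expands a node tree recursively into a JSON string.
    // Non-finite numbers (NaN, infinities) are not handled in this sketch.
    static String toJson(Node n) {
        if (n.children == null) {
            Object v = n.value;
            if (v == null) return "null";
            if (v instanceof Number || v instanceof Boolean) return v.toString();
            return quote(v.toString());
        }
        StringBuilder sb = new StringBuilder("{");
        for (int i = 0; i < n.children.size(); i++) {
            Node c = n.children.get(i);
            if (i > 0) sb.append(',');
            sb.append(quote(c.name)).append(':').append(toJson(c));
        }
        return sb.append('}').toString();
    }

    // Quotes a string and escapes the characters JSON requires.
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.append('"').toString();
    }
}
```

A tree with an `int` leaf `id` and a string leaf `name` is rendered as a JSON object keyed by field name, with nested objects handled by the same recursion.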
S5: extract the class structure and generate a class template for restoring the data in memory (this step applies when the outermost node is a custom class; system built-in classes and primitive types are skipped);
S501: traverse the intermediate result depth-first to generate and parse nodes; if the outermost node is a class description, execute S502;
S503: create a temporary .java source file;
S504: use the class name from the class description as the class name in the source file;
S505: read the node names of the obtained intermediate result to obtain the field names;
S506: read the data type stored in each node of the obtained intermediate result and create the variables dynamically;
S507: count the nodes;
S508: write the field information to the temporary .java file according to data type and name;
S509: write the get and set methods to the source file;
S5010: save the source file under its own class name;
S5011: dynamically compile the source file into a .class file.
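Steps S503 to S5011 can be sketched as follows, assuming the field names and types have already been collected into an ordered map. Compilation here uses the standard `javax.tools.JavaCompiler` API, which requires a JDK at run time and is generally not available on Android; that choice, and the names `TemplateGenerator`, `generateSource`, and `compile`, are assumptions of this sketch.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

class TemplateGenerator {
    // Builds the source text of a class with fields, getters, and setters.
    // The map goes from field name to Java type name, in declaration order.
    static String generateSource(String className, Map<String, String> fields) {
        StringBuilder sb = new StringBuilder();
        sb.append("public class ").append(className)
          .append(" implements java.io.Serializable {\n");
        for (Map.Entry<String, String> f : fields.entrySet()) {
            sb.append("    private ").append(f.getValue()).append(' ')
              .append(f.getKey()).append(";\n");
        }
        for (Map.Entry<String, String> f : fields.entrySet()) {
            String name = f.getKey();
            String type = f.getValue();
            String cap = name.substring(0, 1).toUpperCase() + name.substring(1);
            sb.append("    public ").append(type).append(" get").append(cap)
              .append("() { return ").append(name).append("; }\n");
            sb.append("    public void set").append(cap).append('(').append(type)
              .append(' ').append(name).append(") { this.").append(name)
              .append(" = ").append(name).append("; }\n");
        }
        return sb.append("}\n").toString();
    }

    // Saves the source under its own class name and compiles it to a .class file.
    static Path compile(Path dir, String className, String source) throws IOException {
        Path src = dir.resolve(className + ".java");
        Files.write(src, source.getBytes(StandardCharsets.UTF_8));
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("no system Java compiler; a JDK is required");
        }
        int status = compiler.run(null, null, null, "-d", dir.toString(), src.toString());
        if (status != 0) {
            throw new IllegalStateException("compilation failed with status " + status);
        }
        return dir.resolve(className + ".class");
    }
}
```

A caller might pass a temporary directory created with `Files.createTempDirectory` to `compile` and then load the resulting class with a `URLClassLoader` pointing at that directory.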
S6: restore the complete serialized data into memory;
S601: for data whose class template has been generated, take the outermost node;
S602: when the full serialized data needs to be restored directly, execute S603; when only part of the serialized data needs to be restored, execute S604 to S606;
S603: generate the class object directly using Java's built-in interface for reading serialized class data, and access the class object in the usual Java manner; end;
S604: generate the class object using reflection;
S605: traverse the data field part of the generated intermediate object;
S606: parse each data field item, and use reflection either to call the set method to assign the class's parameters or to call the get method to obtain the parameter values; end.
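The partial-restore path of S604 to S606 can be sketched with the standard reflection API. `Restored` stands in for a class produced in S5, and the map of field values stands in for the data fields of the intermediate object. Both, together with `ReflectiveFiller`, are hypothetical.

```java
import java.lang.reflect.Method;
import java.util.Map;

// Hypothetical stand-in for a generated class template.
class Restored {
    private int id;
    private String name;
    public int getId() { return id; }
    public void setId(int id) { this.id = id; }
    public String getName() { return name; }
    public void setName(String name) { this.name = name; }
}

class ReflectiveFiller {
    // Creates an instance by reflection and assigns each value through its setter.
    static Object fill(Class<?> type, Map<String, Object> values) {
        try {
            Object target = type.getDeclaredConstructor().newInstance();
            for (Map.Entry<String, Object> e : values.entrySet()) {
                String setter = "set" + capitalize(e.getKey());
                for (Method m : type.getMethods()) {
                    if (m.getName().equals(setter) && m.getParameterCount() == 1) {
                        m.invoke(target, e.getValue());
                        break;
                    }
                }
            }
            return target;
        } catch (ReflectiveOperationException ex) {
            throw new IllegalStateException(ex);
        }
    }

    // Reads a field value through its getter.
    static Object read(Object target, String field) {
        try {
            return target.getClass().getMethod("get" + capitalize(field)).invoke(target);
        } catch (ReflectiveOperationException ex) {
            throw new IllegalStateException(ex);
        }
    }

    private static String capitalize(String s) {
        return s.substring(0, 1).toUpperCase() + s.substring(1);
    }
}
```

Because only the listed fields are assigned, this path suits the case in S602 where just part of the serialized data needs to be restored.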
Those of ordinary skill in the art will understand that the embodiments described herein are intended to help the reader understand how the present invention is implemented, and that the protection scope of the present invention is not limited to these specific statements and embodiments. Based on the technical teachings disclosed herein, those of ordinary skill in the art can make various other specific variations and combinations that do not depart from the essence of the present invention, and such variations and combinations remain within the protection scope of the present invention.