WO2021097785A1 - Data serialization and data deserialization method, apparatus, and computer device - Google Patents

Data serialization and data deserialization method, apparatus, and computer device

Info

Publication number
WO2021097785A1
Authority
WO
WIPO (PCT)
Prior art keywords
target
data
function
serialization
definition information
Prior art date
Application number
PCT/CN2019/120134
Other languages
English (en)
French (fr)
Inventor
王嘉兴
李升林
Original Assignee
云图技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 云图技术有限公司
Priority to PCT/CN2019/120134
Publication of WO2021097785A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/448 Execution paradigms, e.g. implementations of programming paradigms

Definitions

  • This application relates to the field of computer technology, in particular to a data serialization and data deserialization method, device and computer equipment.
  • Serialization is the process of converting data structures scattered in memory into a contiguous byte stream when data needs to be stored or transmitted.
  • Commonly used serialization tools include protobuf, XML, JSON, etc.
  • These tools all use intrusive serialization methods: developers must use the data structures specified by the serialization tool, otherwise serialization cannot be achieved.
  • The embodiments of the present application provide a data serialization and data deserialization method, device, and computer equipment, which can be applied to the serialization and deserialization of data structures that are not specified by a serialization tool.
  • An embodiment of the application provides a data serialization method, including: obtaining target code, where the target code includes user-defined structure type definition information of the target data and a serialization function signature; determining the serialization function signature and the structure type definition information according to the target code; generating the target serialization function according to the serialization function signature and the structure type definition information; and obtaining the target data to be serialized and serializing the target data according to the target serialization function to obtain the corresponding target byte stream.
  • The target code includes intermediate code; correspondingly, obtaining the target code includes: obtaining the target source code, where the target source code includes the source code of the serialization function signature and the source code of the structure type definition information of the target data; and compiling the target source code to generate the intermediate code.
  • The structure type definition information of the target data includes the structure type definition information and pointer information of each of a plurality of structures; generating the target serialization function according to the serialization function signature and the structure type definition information includes: generating a write function for each of the structures according to the serialization function signature and the structure type definition information and pointer information of each structure; and generating the target serialization function according to the write functions of the structures.
  • Generating the write function of each structure according to the serialization function signature and the structure type definition information and pointer information of each structure includes generating the write function of the current structure in the following manner: obtaining the pointer information of the previous structure and determining the current structure according to that pointer information, where the first of the structures is determined according to the first parameter in the serialization function signature; and generating the write function corresponding to the current structure according to the structure type definition information and pointer information of the current structure.
  • The method further includes: in the process of generating the write function of each structure, recording each structure and its corresponding write function in a preset mapping table.
  • The target data is a ring structure including multiple nodes, and the structure type definition information includes data type information and pointer information; accordingly, serializing the target data according to the target serialization function to obtain the corresponding target byte stream includes: serializing each of the nodes in the target data according to the target serialization function to obtain the byte stream corresponding to each node, where, in the process of obtaining the byte stream corresponding to each node, each node and the first address of its byte stream are recorded in a preset memo, so that when the same node is encountered again, a pointer to the first address of that node's byte stream is generated.
  • The target byte stream includes a data header, an address tag segment, and a data segment, where the data header includes a version number, a check code, the length of the address tag segment, and the length of the data segment; the address tag segment includes a target address and a pointer level, indicating that the data at the target address is a pointer of that level; and the data segment includes the data information of the target data.
  • An embodiment of the present application also provides a data deserialization method, including: obtaining target code, where the target code includes user-defined target structure type definition information and a deserialization function signature; determining the target structure type definition information and the deserialization function signature according to the target code; generating the target deserialization function according to the target structure type definition information and the deserialization function signature; and obtaining the target byte stream to be deserialized and deserializing the target byte stream according to the target deserialization function to obtain the target data corresponding to the target byte stream, where the structure type definition information of the target data is the target structure type definition information.
  • The target code is intermediate code; correspondingly, obtaining the target code includes: obtaining the target source code, where the target source code includes the source code of the preset deserialization function signature and the source code of the target structure type definition information; and compiling the target source code to generate the intermediate code.
  • The target structure type definition information includes the structure type definition information and pointer information of each of a plurality of structures; generating the target deserialization function according to the deserialization function signature and the structure type definition information includes: generating a read function for each of the structures according to the deserialization function signature and the structure type definition information and pointer information of each structure; and generating the target deserialization function according to the read functions of the structures.
  • Generating the read function of each structure according to the deserialization function signature and the structure type definition information and pointer information of each structure includes generating the read function of the current structure in the following manner: obtaining the pointer information of the previous structure and determining the current structure according to that pointer information, where the first of the structures is determined according to the return type of the deserialization function signature; and generating the read function corresponding to the current structure according to the structure type definition information and pointer information of the current structure.
  • An embodiment of the present application also provides a data serialization device, including: an acquisition module, used to obtain target code, where the target code includes user-defined structure type definition information of the target data and a serialization function signature; a determination module, used to determine the serialization function signature and the structure type definition information according to the target code; a generation module, used to generate the target serialization function according to the serialization function signature and the structure type definition information; and a processing module, used to obtain the target data to be serialized and serialize the target data according to the target serialization function to obtain the corresponding target byte stream.
  • An embodiment of the present application also provides a computer device, including a processor and a memory for storing instructions executable by the processor. When the processor executes the instructions, the steps of the data serialization method described in any of the foregoing embodiments are implemented.
  • The embodiments of the present application also provide a computer-readable storage medium on which computer instructions are stored; when the instructions are executed, the steps of the data serialization method described in any of the foregoing embodiments are implemented.
  • In the data serialization method provided, target code including user-defined structure type definition information of the target data and a serialization function signature is obtained; the serialization function signature and the structure type definition information are determined according to the target code; the target serialization function is generated according to the serialization function signature and the structure type definition information; and the target data to be serialized is obtained and serialized according to the target serialization function to obtain the corresponding target byte stream.
  • In this solution, the user defines the serialization function signature and the structure type definition information of the target data. Both are then determined from the target code that contains them, and the target serialization function is generated from them. A target serialization function can therefore be generated for target data of any data structure type; the target data is then obtained and serialized by the target serialization function into the target byte stream. This realizes serialization of target data of any user-defined data structure type without converting the target data into a data structure of a specified type, which simplifies the data serialization process.
  • FIG. 1 shows a schematic diagram of an application scenario of a data serialization method and a data deserialization method in an embodiment of the present application.
  • FIG. 2 shows a flowchart of a data serialization method in an embodiment of the present application.
  • FIG. 3 shows a schematic diagram of the structure of target data including multiple structures in an embodiment of the present application.
  • FIG. 4 shows a schematic diagram of generating a target serialization function for the structure type of the target data in FIG. 3 in an embodiment of the present application.
  • FIG. 5 shows a schematic diagram of the structure of target data with a ring structure in an embodiment of the present application.
  • FIG. 6 shows a schematic diagram of the structure of target data with a ring structure in an embodiment of the present application.
  • FIG. 7 shows a schematic diagram of a storage format of a target byte stream generated in an embodiment of the present application.
  • FIG. 8 shows a schematic diagram of a storage format of a target byte stream generated in an embodiment of the present application.
  • FIG. 9 shows a flowchart of a data deserialization method in an embodiment of the present application.
  • FIG. 10 shows a schematic diagram of a data serialization device in an embodiment of the present application.
  • FIG. 11 shows a schematic diagram of a data deserialization device in an embodiment of the present application.
  • FIG. 12 shows a schematic diagram of a computer device in an embodiment of the present application.
  • FIG. 1 shows a schematic diagram of an application scenario of a data serialization method and a data deserialization method in an embodiment of the present application.
  • In this scenario, client A wants to send target data to client B.
  • Client A serializes the target data to obtain the corresponding target byte stream.
  • Client A sends the target byte stream obtained by serialization to client B.
  • Client B deserializes the target byte stream to obtain the corresponding target data.
  • The above application scenario is only exemplary.
  • Data serialization is also applicable when data needs to be stored and the data structures scattered in memory need to be converted into a byte stream.
  • Data deserialization is applicable when data stored on disk or received from the network needs to be used; the byte stream is then deserialized into a data structure.
  • FIG. 2 shows a flowchart of a data serialization method in an embodiment of the present application.
  • Although this application provides method operation steps or device structures as shown in the following embodiments or drawings, the method or device may include more or fewer operation steps or module units based on conventional or non-creative effort.
  • The execution order of these steps or the module structure of the device is not limited to the execution order or module structure shown in the description of the embodiments of this application and the drawings.
  • When the described method or module structure is applied to an actual device or terminal product, it can be executed sequentially or in parallel according to the method or module structure shown in the embodiments or drawings (for example, in a parallel-processor or multi-threaded processing environment, or even a distributed processing environment).
  • The data serialization method may include the following steps:
  • Step S201: Obtain the target code.
  • The target code includes user-defined structure type definition information of the target data and a serialization function signature.
  • The serialization function signature may include information about the serialization function, such as the function name, parameter types, number of parameters, parameter order, and the class and namespace where the parameters are located.
  • The structure type definition information of the target data includes the type and definition information of each structure of the target data.
  • The data structure types of the target data may include, but are not limited to, arrays, linked lists, trees, heaps, and hash tables.
  • Step S202: Determine the serialization function signature and the structure type definition information according to the target code.
  • Step S203: Generate a target serialization function according to the serialization function signature and the structure type definition information.
  • After the target code is obtained, the serialization function signature and the structure type definition information of the target data can be determined according to the target code.
  • The structure type definition information of the target data may include the definition information of each structure in the target data.
  • Step S204: Obtain the target data to be serialized, and serialize the target data according to the target serialization function to obtain the corresponding target byte stream.
  • After the target serialization function is generated, the target data to be serialized can be obtained and serialized according to the target serialization function to obtain the corresponding target byte stream.
  • The target data may be a data structure input by the user, data generated by the target code, or a data structure in memory, which is not limited in this application.
  • In the data serialization method of the above embodiment, the user defines the serialization function signature and the structure type definition information of the target data. Both are then determined from the target code that contains them, and the target serialization function is generated according to the serialization function signature and the structure type definition information of the target data.
  • In this way, a target serialization function can be generated for target data of any user-defined data structure type. The target data is then obtained and serialized into the target byte stream according to the target serialization function, which realizes serialization of target data of any user-defined data structure type without converting the target data into a data structure of a specified type and simplifies the data serialization process.
  • The target code may include intermediate code; correspondingly, obtaining the target code may include: obtaining the target source code, where the target source code includes the source code of the serialization function signature and the source code of the structure type definition information of the target data; and compiling the target source code to generate the intermediate code.
  • Intermediate code is an equivalent internal representation that is easy to translate into machine code and lies between high-level language code and machine code.
  • The above data serialization method can be carried out by a compiler that compiles the target code and converts it into machine code for execution.
  • The target source code may be code written by the user in advance, for example C, C++, or Java code. After the target source code is obtained, it can be compiled by the compiler front end to generate intermediate code, which facilitates processing.
  • The format of the intermediate code may include one of the following: LLVM (Low Level Virtual Machine), Wasm (WebAssembly), JVM (Java Virtual Machine), etc.
  • The compiler middle layer can determine the serialization function signature and the structure type definition information of the target data from the compiled intermediate code, generate the target serialization function according to them, and complete the serialization function in the intermediate code according to the target serialization function to obtain the completed intermediate code.
  • The compiler back end converts the completed intermediate code into machine code. After that, the target data is obtained, and the machine code is called to serialize the target data.
  • Converting the target source code into intermediate code makes it convenient for the compiler middle layer to analyze and process the code and generate the target serialization function.
  • The structure type definition information of the target data includes the structure type definition information and pointer information of each of multiple structures; generating the target serialization function according to the serialization function signature and the structure type definition information may include: generating a write function for each of the structures according to the serialization function signature and the structure type definition information and pointer information of each structure; and generating the target serialization function according to the write functions of the structures.
  • That is, when the target data contains multiple structures, a write function can be generated for each structure from the serialization function signature and that structure's type definition information and pointer information, and the target serialization function can then be generated from these write functions.
  • A write function writes the data in a structure into the binary byte packet.
  • Generating the write function of each structure according to the serialization function signature and the structure type definition information and pointer information of each structure may include generating the write function of the current structure in the following way: obtaining the pointer information of the previous structure and determining the current structure according to that pointer information, where the first structure is determined according to the first parameter in the serialization function signature; and generating the write function corresponding to the current structure according to the structure type definition information and pointer information of the current structure.
  • Specifically, the first structure is determined according to the first parameter in the serialization function signature, and the write function of the first structure is generated according to its structure type definition information and pointer information.
  • The current structure is then determined according to the pointer information of the first structure, and the write function corresponding to the current structure is generated according to the structure type definition information and pointer information of the current structure, and so on, until the write function corresponding to the last structure is generated.
  • The method may further include: in the process of generating the write function of each structure, recording each structure and its corresponding write function in a preset mapping table.
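  • The generation procedure described above can be sketched in C as a recursive walk over type descriptors. This is only an illustration under stated assumptions: `TypeInfo`, `MappingTable`, `generate_write_fn`, and `generate_for_fig3` are hypothetical names, and the "generated write function" is reduced to a recorded function name rather than emitted intermediate code. The type graph mirrors the four structures A, B, C, and D used in the example of this application.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_PTRS 4
#define MAX_TYPES 16

/* Simplified stand-in for structure type definition information (assumption). */
struct TypeInfo {
    const char *name;
    const struct TypeInfo *pointees[MAX_PTRS]; /* types reached through pointer fields */
    int n_pointees;
};

/* Preset mapping table: structure -> name of its generated write function. */
struct MapEntry { const struct TypeInfo *type; char fn_name[32]; };
struct MappingTable { struct MapEntry entries[MAX_TYPES]; int count; };

static const char *lookup(const struct MappingTable *t, const struct TypeInfo *ty) {
    for (int i = 0; i < t->count; i++)
        if (t->entries[i].type == ty) return t->entries[i].fn_name;
    return NULL;
}

/* Record a write function for `ty`, then follow its pointer information. */
static void generate_write_fn(struct MappingTable *t, const struct TypeInfo *ty) {
    if (lookup(t, ty) != NULL) return;   /* already generated: stop the recursion */
    assert(t->count < MAX_TYPES);
    struct MapEntry *e = &t->entries[t->count++];
    e->type = ty;
    snprintf(e->fn_name, sizeof e->fn_name, "write_%s", ty->name);
    for (int i = 0; i < ty->n_pointees; i++)
        generate_write_fn(t, ty->pointees[i]);
}

/* Builds the A/B/C/D type graph and returns how many write functions were recorded. */
static int generate_for_fig3(struct MappingTable *t) {
    static struct TypeInfo A = {"A", {NULL}, 0}, B = {"B", {NULL}, 0},
                           C = {"C", {NULL}, 0}, D = {"D", {NULL}, 0};
    A.pointees[0] = &B; A.pointees[1] = &D; A.n_pointees = 2;
    B.pointees[0] = &C; B.n_pointees = 1;
    C.pointees[0] = &A; C.n_pointees = 1;   /* cycle back to A */
    t->count = 0;
    generate_write_fn(t, &A);   /* A comes from the first parameter of the signature */
    return t->count;
}
```

  • Without the table lookup, the cycle from C back to A would make the recursion run forever; with it, each structure gets exactly one entry.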
  • FIG. 3 shows a schematic diagram of target data including multiple structures in an embodiment of the present application.
  • The target data includes structures A, B, C, and D.
  • Structure A includes two pointers that point to structure B and structure D, respectively.
  • The pointer of structure B points to structure C, and the pointer of structure C points to structure A.
  • The target code contains the definition of the target data and the serialization function signature.
  • A, B, C, and D are the structures contained in the target data.
  • char *packing_A(struct A *, unsigned *) is the serialization function signature.
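  • For orientation, user-defined target code of this kind might look like the following C sketch. Only the structure names, the pointer relationships, and the `packing_A` signature come from the description; the `value` fields, the parameter names, the reading of the second parameter as an output length, and the helper `is_fig3_shape` are assumptions added for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-defined types matching the description; field names are assumptions. */
struct A;   /* forward declaration, because C points back to A */

struct D { int value; };
struct C { int value; struct A *a; };               /* C -> A closes the cycle */
struct B { int value; struct C *c; };               /* B -> C */
struct A { int value; struct B *b; struct D *d; };  /* A -> B and A -> D */

/* Serialization function signature: declared by the user, body generated later.
 * The second parameter is assumed to receive the length of the byte stream. */
char *packing_A(struct A *root, unsigned *out_len);

/* Illustration only: checks that an object graph is wired as described. */
static int is_fig3_shape(const struct A *a) {
    assert(a != NULL);
    return a->b != NULL && a->d != NULL &&
           a->b->c != NULL && a->b->c->a == a;
}
```

  • The user writes only the declaration of `packing_A`; under the scheme described here, its body is what the compiler middle layer is expected to fill in.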
  • FIG. 4 shows a schematic diagram of generating a target serialization function for target data including four structures A, B, C, and D.
  • Generating the target serialization function according to the structure type definition information of the target data and the serialization function signature may specifically include: determining from the first parameter of packing_A that the type to be serialized is A, and, starting from A, recursively generating a write function for each structure.
  • During generation, a mapping table is used to record each structure and its corresponding function to prevent repeated generation.
  • In the generated target serialization function, write_A, write_B, and write_C call one another through indirect recursion.
  • Packing_State is a data type defined in the serialization library; it holds information used during serialization, including the current processing state, space, memory, and mapping table.
  • The write function and the write_package function are functions defined in the serialization library.
  • The write_package function handles the overall process, and the write function prevents a structure from being written repeatedly.
  • The generated target serialization function in this example is expressed in C language.
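  • A rough C sketch of what such generated functions could look like is given below. It is not the listing from the application: the layout of `Packing_State`, the helper `write_once` (standing in for the library's write function), `put_int`, and the `value` fields are all assumptions, and pointer fields are not encoded in the output. The sketch only shows how indirect recursion among write_A, write_B, and write_C terminates once already-written structures are tracked.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct A;
struct D { int value; };
struct C { int value; struct A *a; };
struct B { int value; struct C *c; };
struct A { int value; struct B *b; struct D *d; };

/* Hypothetical stand-in for the library's Packing_State. */
typedef struct {
    char buf[256];          /* output space */
    unsigned len;           /* bytes written so far */
    const void *seen[16];   /* structures that have already been written */
    int n_seen;
} Packing_State;

/* Returns 1 and records `obj` if it has not been written yet, otherwise 0. */
static int write_once(Packing_State *st, const void *obj) {
    if (obj == NULL) return 0;
    for (int i = 0; i < st->n_seen; i++)
        if (st->seen[i] == obj) return 0;
    assert(st->n_seen < 16);
    st->seen[st->n_seen++] = obj;
    return 1;
}

static void put_int(Packing_State *st, int v) {
    assert(st->len + sizeof v <= sizeof st->buf);
    memcpy(st->buf + st->len, &v, sizeof v);
    st->len += (unsigned)sizeof v;
}

static void write_A(Packing_State *st, const struct A *a);

static void write_D(Packing_State *st, const struct D *d) {
    if (write_once(st, d)) put_int(st, d->value);
}
static void write_C(Packing_State *st, const struct C *c) {
    if (write_once(st, c)) { put_int(st, c->value); write_A(st, c->a); }
}
static void write_B(Packing_State *st, const struct B *b) {
    if (write_once(st, b)) { put_int(st, b->value); write_C(st, b->c); }
}
static void write_A(Packing_State *st, const struct A *a) {
    if (write_once(st, a)) {
        put_int(st, a->value);
        write_B(st, a->b);
        write_D(st, a->d);
    }
}

/* Drives the whole process (the role attributed to write_package) and
 * returns a heap copy of the bytes; the caller frees it. */
char *packing_A(struct A *root, unsigned *out_len) {
    Packing_State st;
    memset(&st, 0, sizeof st);
    write_A(&st, root);
    char *out = malloc(st.len ? st.len : 1);
    if (out == NULL) { *out_len = 0; return NULL; }
    memcpy(out, st.buf, st.len);
    *out_len = st.len;
    return out;
}
```

  • Calling `packing_A` on a graph wired as A to B and D, B to C, and C back to A writes each structure's value exactly once, in the order A, B, C, D.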
  • The target data may also be an array. Because arrays are defined differently in different languages, the way arrays are serialized differs between languages. In languages such as Java, arrays have explicit types and lengths, so serialization only needs to write the length of the array first and then write the array elements one by one. In languages such as C/C++, it is impossible to determine how many elements a pointer points to. To solve this problem, a tagged structure can be defined that packages the array length, the first address of the array, and the tag information, making the definition of the array explicit.
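  • One possible shape for such a tagged structure in C is sketched below. The names `IntArray`, `INT_ARRAY_TAG`, and `pack_int_array`, the tag value, and the choice of `int` elements are assumptions made for illustration; the application does not specify them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define INT_ARRAY_TAG 0xA11A0001u   /* assumed marker value */

/* Hypothetical tagged array: bundles the first address with the element count. */
struct IntArray {
    unsigned tag;       /* marks this struct as an array wrapper */
    unsigned length;    /* number of elements */
    const int *first;   /* first address of the array */
};

/* Writes the length first, then the elements one by one.
 * Returns the number of bytes written, or 0 if `out` is too small. */
static size_t pack_int_array(const struct IntArray *arr,
                             unsigned char *out, size_t cap) {
    assert(arr != NULL && arr->tag == INT_ARRAY_TAG);
    size_t need = sizeof arr->length + (size_t)arr->length * sizeof(int);
    if (need > cap) return 0;
    memcpy(out, &arr->length, sizeof arr->length);
    size_t off = sizeof arr->length;
    for (unsigned i = 0; i < arr->length; i++) {
        memcpy(out + off, &arr->first[i], sizeof(int));
        off += sizeof(int);
    }
    return off;
}
```

  • Because the length travels with the pointer, a generated write function no longer has to guess how many elements follow the first address.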
  • The target data may be a ring structure that includes multiple nodes.
  • In this case, the structure type definition information includes data type information and pointer information; accordingly, serializing the target data according to the target serialization function to obtain the corresponding target byte stream may include: serializing each of the nodes in the target data according to the target serialization function to obtain the byte stream corresponding to each node, where, in the process of obtaining the byte stream corresponding to each node, each node and the first address of its byte stream are recorded in a preset memo, so that when the same node is encountered again, a pointer to the first address of that node's byte stream is generated.
  • FIG. 5 and FIG. 6 schematically show target data with two kinds of ring structure.
  • The first kind of ring structure in FIG. 5, such as a circular linked list, points in a circle. If it is not handled, the packing process (that is, the serialization process) cannot terminate.
  • The second kind of ring structure in FIG. 6 has repeated pointing. If it is not handled, node 3 will be written as two copies, and the data addresses obtained after unpacking will differ, which may cause errors.
  • To handle both cases, each of the nodes in the target data is serialized to obtain the byte stream corresponding to each node.
  • During this process, a preset memo records each node that has been written and its subscript in the sequence (that is, the first address of the corresponding byte stream).
  • When the same node is encountered again, a pointer to its subscript in the sequence is used instead of writing the node again.
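  • The memo idea can be illustrated with a circular singly linked list in C. The sketch below assumes a simple record format (a value plus the sequence index of the successor); `Node`, `Record`, `Memo`, `memo_find`, and `pack_list` are hypothetical names, and the linear memo lookup is chosen for brevity rather than speed.

```c
#include <assert.h>
#include <stddef.h>

struct Node { int value; struct Node *next; };

/* One serialized record: the value plus the sequence index of the
 * successor (-1 stands for NULL). This format is an assumption. */
struct Record { int value; int next_index; };

#define MAX_NODES 32

/* Preset memo: node address -> index of its record in the output sequence. */
struct Memo { const struct Node *nodes[MAX_NODES]; int count; };

static int memo_find(const struct Memo *m, const struct Node *n) {
    for (int i = 0; i < m->count; i++)
        if (m->nodes[i] == n) return i;
    return -1;
}

/* Serializes the list reachable from `head` and returns the number of
 * records written. Traversal stops when a node is met a second time. */
static int pack_list(const struct Node *head, struct Record *out, int cap) {
    struct Memo memo = { {NULL}, 0 };
    const struct Node *cur = head;
    while (cur != NULL && memo_find(&memo, cur) < 0) {
        assert(memo.count < cap && memo.count < MAX_NODES);
        memo.nodes[memo.count] = cur;
        out[memo.count].value = cur->value;
        out[memo.count].next_index = -1;
        memo.count++;
        cur = cur->next;
    }
    /* Replace each next pointer by the memoized index of its target. */
    for (int i = 0; i < memo.count; i++)
        out[i].next_index = memo_find(&memo, memo.nodes[i]->next);
    return memo.count;
}
```

  • For a six-node circular list, the traversal stops when node 1 is reached the second time, and the last record refers back to index 0 instead of duplicating node 1.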
  • The target byte stream may include a data header, an address tag segment, and a data segment, where the data header includes the version number, check code, address tag segment length, and data segment length; the address tag segment includes a target address and a pointer level, indicating that the data at the target address is a pointer of that level; and the data segment includes the data information of the target data.
  • That is, the storage format of the target byte stream generated by serializing the target data with the data serialization method provided in the embodiments of the present application may include three sections: a data header, an address tag segment, and a data segment. The version number, check code, address tag segment length, and data segment length are stored in the data header.
  • The address tag segment includes a target address and a pointer level, which indicate that the data at the target address is a pointer of that level.
  • The data information of the target data is stored in the data segment.
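  • The three-section layout can be pictured with the following C sketch. The field widths, the field order, the interpretation of the target address as an offset, and the names `DataHeader`, `AddrTag`, and `build_stream` are assumptions; the application only names the fields.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Data header; 32-bit fields are an assumption. */
struct DataHeader {
    uint32_t version;       /* version number */
    uint32_t check_code;    /* check code */
    uint32_t addr_tag_len;  /* length of the address tag segment, in bytes */
    uint32_t data_len;      /* length of the data segment, in bytes */
};

/* One element of the address tag segment. */
struct AddrTag {
    uint32_t target_addr;   /* where the pointer is stored (assumed to be an offset) */
    uint32_t pointer_level; /* 1 = pointer, 2 = pointer to pointer, ... */
};

/* Lays out header | address tags | data in `out`.
 * Returns the total number of bytes, or 0 if `out` is too small. */
static size_t build_stream(const struct AddrTag *tags, uint32_t n_tags,
                           const void *data, uint32_t data_len,
                           uint32_t check_code,
                           unsigned char *out, size_t cap) {
    assert(tags != NULL && data != NULL && out != NULL);
    struct DataHeader h = {1u, check_code,
                           n_tags * (uint32_t)sizeof *tags, data_len};
    size_t total = sizeof h + h.addr_tag_len + h.data_len;
    if (total > cap) return 0;
    memcpy(out, &h, sizeof h);
    memcpy(out + sizeof h, tags, h.addr_tag_len);
    memcpy(out + sizeof h + h.addr_tag_len, data, h.data_len);
    return total;
}
```

  • A reader of such a stream can use the two lengths in the header to locate the start of the address tag segment and of the data segment without scanning.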
  • FIG. 7 and FIG. 8 respectively show schematic diagrams of the storage formats of the target byte streams generated by serializing two kinds of target data.
  • In the first example, nodes 1, 2, 3, 4, 5, and 6 form a circular linked list in which the next pointer of node 6 points to node 1; the packet structure of the target byte stream obtained after serialization is shown in FIG. 7.
  • The address tag segment has only one element, which marks the data at address 28 as a level-1 pointer.
  • In the second example, each linked list has only one node and the next pointers all point to the same address, NULL; a level-3 pointer in the address tag segment is then used to point to the next field of node 3, and the next field of node 3 points to the field that previously pointed to NULL.
  • A target serialization function in another language can also be generated, thereby realizing cross-language communication.
  • The structure type definition information of the target data may itself be serialized.
  • The same structure type definition information can be represented in different ways.
  • By traversing the type definition information, a representation of the structure type definition information in a uniform format can be generated.
  • The serialized structure type definition information has the following uses: 1. Verification: the hash value of the structure type definition information is calculated and stored in the data header and in the deserialization function, so that before unpacking (deserialization) it can be checked whether the data type of the packet is consistent with the data type expected by the deserialization function. 2. Cross-language code generation: the serialized structure type definition information is passed to a code generation tool written in another language, which can use it to generate code in that language, thereby achieving cross-language communication.
  • The embodiments of the application also provide a data deserialization method. Specifically, as shown in FIG. 9, the data deserialization method provided by some embodiments of the present application may include the following steps:
  • Step S901: Obtain target code, where the target code includes user-defined target structure type definition information and a deserialization function signature.
  • The deserialization function signature may include information about the deserialization function, such as the function name, parameter types, number of parameters, parameter order, and the class and namespace where the parameters are located.
  • The target structure type definition information includes the type and definition information of each structure.
  • The data structure types of the target data may include, but are not limited to, arrays, linked lists, trees, heaps, and hash tables.
  • Step S902 Determine the target structure type definition information and the deserialization function signature according to the target code.
  • Step S903 Generate a target deserialization function according to the target structure type definition information and the deserialization function signature.
  • the deserialization function signature and target structure type definition information can be determined according to the target code.
  • the target structure type definition information may include the definition information of each structure in the finally generated target data.
  • Step S904 Obtain the target byte stream to be deserialized, and deserialize the target byte stream according to the target deserialization function to obtain the target data corresponding to the target byte stream, where the structure type definition information of the target data is the target structure type definition information.
  • The target byte stream to be deserialized can be obtained and deserialized according to the target deserialization function to obtain the target data corresponding to the target byte stream.
  • the target byte stream may be a byte stream obtained from the network or a byte stream read from a disk, which is not limited in this application.
  • the structure type definition information of the target data obtained after deserialization is the target structure type definition information.
  • The user can define the deserialization function signature and provide the target structure type definition information of the target data to be produced by deserialization. The deserialization function signature and the target structure type definition information are then determined from the target code that contains them, and the target deserialization function is generated from them, so a target deserialization function can be generated that turns the target byte stream into target data of any user-defined structure type. The target byte stream is then obtained and deserialized according to the target deserialization function to obtain the corresponding target data. The target byte stream is thus deserialized directly into target data of any user-defined data structure type, without converting data of a designated type obtained after deserialization into target data of the target structure type, which simplifies the data deserialization process.
  • The target code is intermediate code; correspondingly, obtaining the target code includes: obtaining the target source code, where the target source code includes the source code of the preset deserialization function signature and the source code of the target structure type definition information; and compiling the target source code to generate the intermediate code.
  • the above-mentioned data deserialization method can be completed by a compiler that compiles the target code and then converts it into machine code for execution.
  • the target source code may be code written by the user in advance, for example, it may be C language code, Java code, C++ language code, etc.
  • the target source code can be compiled through the front end of the compiler to generate intermediate code.
  • The format of the intermediate code may include one of the following: llvm (Low Level Virtual Machine), wasm (WebAssembly), JVM (Java Virtual Machine), etc.
  • The compiler middle layer can determine the deserialization function signature and the target structure type definition information from the compiled intermediate code, generate the target deserialization function from them, and use the target deserialization function to complete the deserialization function in the intermediate code, yielding the completed intermediate code.
  • the compiler backend converts the completed intermediate code into machine code. After that, obtain the target byte stream and call the machine code to perform deserialization of the target byte stream.
  • the target source code is converted into intermediate code, which is convenient for the middle layer of the compiler to perform analysis and processing to generate the target deserialization function.
  • The target structure type definition information includes the structure type definition information and pointer information of each of multiple structures; generating the target deserialization function according to the deserialization function signature and the structure type definition information includes: generating a read function for each of the multiple structures according to the deserialization function signature and the structure type definition information and pointer information of each structure; and generating the target deserialization function according to the read functions of the structures.
  • the target structure type definition information includes structure type definition information and pointer information of each structure in the multiple structures.
  • the read function of each structure can be generated according to the deserialization signature and the structure type definition information and pointer information of each structure in the multiple structures, and then the target deserialization function can be generated according to the read function of each structure.
  • The reading function here is a read function, which converts the data in the binary byte data packet into the target data defined in the target structure type definition information.
  • the corresponding target deserialization function can be generated for the target byte stream to be converted into target data containing multiple structures.
  • When the read function corresponding to the first structure is generated, the first structure is determined according to the return type of the deserialization function signature, and the read function of the first structure is generated according to the structure type definition information and pointer information of the first structure.
  • Next, the current structure can be determined according to the pointer information of the first structure, and the read function corresponding to the current structure can be generated according to the structure type definition information and pointer information of the current structure, and so on, until the read function corresponding to the last structure is generated.
  • In this way, the read function corresponding to each structure can be generated according to the deserialization function signature and the type definition information and pointer information of each structure, and the target deserialization function corresponding to the target byte stream is then generated according to the read functions.
  • Generating the read function of each of the multiple structures according to the deserialization function signature and the structure type definition information and pointer information of each structure includes generating the read function of the current structure among the multiple structures as follows: obtaining the pointer information of the previous structure and determining the current structure according to it, where the first of the multiple structures is determined according to the return type of the deserialization function signature; and generating the read function corresponding to the current structure according to the structure type definition information and pointer information of the current structure.
  • The method may further include: in the process of generating the read function of each of the multiple structures, recording each structure and its corresponding read function in a preset mapping table.
  • FIG. 3 shows a schematic diagram of target data including multiple structures in an embodiment of the present application.
  • the target data includes structures A, B, C, and D.
  • Structure A includes two pointers that point to structure B and structure D, respectively.
  • The pointer of structure B points to structure C, and the pointer of structure C points to structure A.
  • the target structure type definition information and the target code of the deserialization function signature can be as follows.
  • A, B, C, and D are the structures contained in the target data, and struct A*unpacking_A(char*) is the signature of the deserialization function.
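The listing referred to here did not survive extraction. A hypothetical reconstruction, consistent with the pointer relationships described for FIG. 3 and with the stated signature, might look like the following. The member names and the int value fields are assumptions; only the pointer relationships and the signature struct A *unpacking_A(char *) come from the text.

```c
/* Hypothetical reconstruction of the user-written target code.
 * Member names and the `value` fields are illustrative assumptions. */
struct B;
struct C;
struct D;

struct A { int value; struct B *b; struct D *d; };  /* A points to B and D */
struct B { int value; struct C *c; };               /* B points to C */
struct C { int value; struct A *a; };               /* C points back to A */
struct D { int value; };                            /* D has no pointers */

/* Signature only: the body is what the generator fills in. */
struct A *unpacking_A(char *);
```

The user writes only the type definitions and the bare declaration; the function body is left for the generation step described next.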
  • The target deserialization function is generated according to the target structure type definition information and the deserialization function signature. Specifically, the type A to be deserialized is obtained from the return type of unpacking_A, and a read function is generated recursively for each structure, starting from A.
  • During generation, a mapping table is used to record each structure and its corresponding function, which prevents repeated generation.
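The recursive generation with a mapping table can be sketched as a depth-first walk over the type graph. The sketch below is an assumption-laden illustration, not the generator from the application: struct type_info, struct gen_table, and generate_read_fn are hypothetical names, types are identified by array index, and "emitting" a function is reduced to recording the type in an order list.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TYPES 8
#define MAX_PTRS  4

/* Minimal description of one structure type: its name and the indices
 * of the structure types its pointer members refer to. */
struct type_info {
    const char *name;
    int ptr_targets[MAX_PTRS];
    int ptr_count;
};

/* Mapping table: which types already have a read function, and the
 * order in which the functions were emitted. */
struct gen_table {
    int generated[MAX_TYPES];
    int order[MAX_TYPES];
    int count;
};

/* Emit a read function for `type`, then recurse into pointed-to types.
 * The table entry is set before recursing, so a cycle such as
 * A -> B -> C -> A terminates. */
static void generate_read_fn(const struct type_info *types, int type,
                             struct gen_table *table)
{
    assert(type >= 0 && type < MAX_TYPES);
    if (table->generated[type])
        return;                             /* already generated: skip */
    table->generated[type] = 1;
    table->order[table->count++] = type;
    for (int i = 0; i < types[type].ptr_count; ++i)
        generate_read_fn(types, types[type].ptr_targets[i], table);
}
```

Marking a type as generated before following its pointers is the design choice that matters: with the A, B, C, D example, the walk reaches A again through C and stops there instead of looping.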
  • the generated target deserialization function is shown below.
  • UnPacking_State is a data type defined in the serialization library, which contains information during the deserialization process.
  • read and read_package are functions defined in the serialization library, read_package is used to handle the overall deserialization process, and read is used to prevent repeated allocation of structures.
  • the above program exemplarily shows the generated target deserialization function in C language. It is understandable that the generated target deserialization function can be in the format of other languages other than the C language, or in the intermediate code format, which is not limited in this application.
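Since the generated listing is not reproduced here, the following sketch only illustrates the idea that a read helper prevents repeated allocation. Everything in it is an assumption: struct unpack_state is a simplified stand-in for UnPacking_State, the record format (a value plus the index of the successor record) is invented for the example, and read_node is a hypothetical name, not the library's read function.

```c
#include <assert.h>
#include <stdlib.h>

struct node { int value; struct node *next; };

/* One serialized node: payload plus the index of the successor record
 * (-1 stands for NULL). This record format is an assumption. */
struct record { int value; int next; };

/* Simplified stand-in for the library's UnPacking_State. */
struct unpack_state {
    const struct record *records;
    int count;
    struct node **allocated;    /* allocated[i] != NULL once record i is built */
};

/* Build the node for record `idx`, reusing it if it was built before. */
static struct node *read_node(struct unpack_state *st, int idx)
{
    if (idx < 0)
        return NULL;
    assert(idx < st->count);
    if (st->allocated[idx] != NULL)
        return st->allocated[idx];          /* no repeated allocation */
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    st->allocated[idx] = n;                 /* register before recursing */
    n->value = st->records[idx].value;
    n->next = read_node(st, st->records[idx].next);
    return n;
}
```

Because each node is registered before its next pointer is followed, a record that refers back to an earlier one yields the already allocated node, so a circular list is rebuilt with the original sharing.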
  • The target byte stream may include a data header, an address tag segment, and a data segment; the data header includes a version number, a check code, the address tag segment length, and the data segment length; the address tag segment includes the target address and the pointer level, which indicate that the data at the target address is a pointer of that pointer level; the data segment includes the data information of the target data.
  • the storage format of the target byte stream may include three sections: a data header, an address mark section, and a data section. Among them, the version number, check code, address mark segment length and data segment length are stored in the data header.
  • the address tag segment includes a target address and a pointer level, and is used to indicate that the data type in the target address is a pointer of the pointer level.
  • the data information of the target data is stored in the data segment.
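The three-part storage format can be pictured with the C declarations below. This is a sketch only: the source lists which fields exist but not their widths or order, so the 32-bit fields, the names struct packet_header and struct address_tag, and the helper packet_size are all assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Data header. Field widths and order are assumptions; the source only
 * lists which fields the header contains. */
struct packet_header {
    uint32_t version;
    uint32_t check_code;
    uint32_t tag_segment_len;   /* bytes occupied by the address tag segment */
    uint32_t data_segment_len;  /* bytes occupied by the data segment */
};

/* One address tag entry: the value at `target_addr` in the data segment
 * is a pointer of level `pointer_level`. */
struct address_tag {
    uint32_t target_addr;
    uint32_t pointer_level;
};

/* Total packet size implied by a header: header, then tags, then data. */
static size_t packet_size(const struct packet_header *h)
{
    assert(h != NULL);
    return sizeof *h + h->tag_segment_len + h->data_segment_len;
}
```

For the Figure 7 case, the tag segment would hold a single struct address_tag with target_addr 28 and pointer_level 1.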
  • deserialization code in another language can be generated, thereby realizing cross-language communication.
  • an embodiment of the present application also provides a data serialization device, as described in the following embodiment. Since the problem-solving principle of the data serialization device is similar to that of the data serialization method, the implementation of the data serialization device can refer to the implementation of the data serialization method, and the repetition will not be repeated.
  • the term "unit” or "module” can be a combination of software and/or hardware that implements a predetermined function.
  • Although the devices described in the following embodiments are preferably implemented by software, implementation by hardware or by a combination of software and hardware is also possible and conceivable.
  • FIG. 10 is a structural block diagram of the data serialization device according to an embodiment of the present application. As shown in FIG. 10, it includes: an acquisition module 1001, a determination module 1002, a generation module 1003, and a processing module 1004. The structure is described below.
  • the obtaining module 1001 is used to obtain target code, where the target code includes user-defined structure type definition information and serialization function signatures of the target data.
  • the determining module 1002 is used to determine the serialization function signature and structure type definition information according to the target code.
  • the generating module 1003 is used to generate the target serialization function according to the serialization function signature and the structure type definition information.
  • the processing module 1004 is used to obtain the target data to be serialized, and serialize the target data according to the target serialization function to obtain the corresponding target byte stream.
  • The target code includes intermediate code; correspondingly, the acquisition module can be specifically used to: obtain the target source code, where the target source code includes the source code of the serialization function signature and the source code of the structure type definition information of the target data; and compile the target source code to generate the intermediate code.
  • The structure type definition information of the target data includes the structure type definition information and pointer information of each of multiple structures; the generation module can be specifically used to: generate a write function for each of the multiple structures according to the serialization function signature and the structure type definition information and pointer information of each structure; and generate the target serialization function according to the write functions of the structures.
  • Generating the write function of each of the multiple structures includes generating the write function of the current structure among the multiple structures as follows: obtaining the pointer information of the previous structure and determining the current structure according to it, where the first of the multiple structures is determined according to the first parameter in the serialization function signature; and generating the write function corresponding to the current structure according to the structure type definition information and pointer information of the current structure.
  • The device further includes a recording module, which may be specifically configured to: while the generating module generates the write function of each of the multiple structures, record each structure and its corresponding write function in a preset mapping table.
  • The target data is a ring structure that may include multiple nodes, and the structure type definition information includes data type information and pointer information; accordingly, the processing module may be specifically used to: serialize each of the multiple nodes in the target data according to the target serialization function to obtain the byte stream corresponding to each node.
  • In the process of generating the byte stream corresponding to each node, each node and the first address of its byte stream are recorded in a preset memo, so that when the same node is encountered again, a pointer to the first address of that node is generated.
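The memo technique for ring structures can be sketched on a circular singly linked list. The code is a minimal illustration under assumptions: the output is an array of records rather than the real byte stream, a record index stands in for the "first address" of a node's bytes, and struct memo, memo_find, and write_node are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 16

struct node { int value; struct node *next; };
struct record { int value; int next; };     /* next: record index, -1 for NULL */

/* Memo: nodes already written and the record index each one received. */
struct memo {
    const struct node *nodes[MAX_NODES];
    int count;
};

static int memo_find(const struct memo *m, const struct node *n)
{
    for (int i = 0; i < m->count; ++i)
        if (m->nodes[i] == n)
            return i;
    return -1;
}

/* Write `n` and everything reachable from it; returns the record index of `n`. */
static int write_node(const struct node *n, struct record *out, struct memo *m)
{
    if (n == NULL)
        return -1;
    int idx = memo_find(m, n);
    if (idx >= 0)
        return idx;                 /* seen before: emit a reference only */
    assert(m->count < MAX_NODES);
    idx = m->count++;
    m->nodes[idx] = n;              /* record before following the pointer */
    out[idx].value = n->value;
    out[idx].next = write_node(n->next, out, m);
    return idx;
}
```

The linear memo_find keeps the sketch short; a real implementation would more likely use a hash table keyed by node address.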
  • The target byte stream may include a data header, an address tag segment, and a data segment; the data header includes a version number, a check code, the address tag segment length, and the data segment length; the address tag segment includes the target address and the pointer level, which indicate that the data at the target address is a pointer of that pointer level; the data segment includes the data information of the target data.
  • an embodiment of the present application also provides a data deserialization device, as described in the following embodiment. Since the problem-solving principle of the data deserialization device is similar to that of the data deserialization method, the implementation of the data deserialization device can refer to the implementation of the data deserialization method, and the repetition will not be repeated.
  • the term "unit” or "module” can be a combination of software and/or hardware that implements a predetermined function.
  • Although the devices described in the following embodiments are preferably implemented by software, implementation by hardware or by a combination of software and hardware is also possible and conceivable.
  • FIG. 11 is a structural block diagram of a data deserialization device according to an embodiment of the present application. As shown in FIG. 11, it includes: an acquisition module 1101, a determination module 1102, a generation module 1103, and a processing module 1104. The structure is described below .
  • the obtaining module 1101 is used to obtain target code, where the target code includes user-defined target structure type definition information and deserialization function signatures.
  • the determining module 1102 is used to determine the target structure type definition information and the deserialization function signature according to the target code.
  • the generating module 1103 is used to generate the target deserialization function according to the target structure type definition information and the deserialization function signature.
  • The processing module 1104 is used to obtain the target byte stream to be deserialized, and deserialize the target byte stream according to the target deserialization function to obtain the target data corresponding to the target byte stream, where the structure type definition information of the target data is the target structure type definition information.
  • The target code is intermediate code; correspondingly, the acquisition module can be specifically used to: obtain the target source code, where the target source code includes the source code of the preset deserialization function signature and the source code of the target structure type definition information; and compile the target source code to generate the intermediate code.
  • The target structure type definition information includes the structure type definition information and pointer information of each of multiple structures; the generation module can be specifically used to: generate a read function for each of the multiple structures according to the deserialization function signature and the structure type definition information and pointer information of each structure; and generate the target deserialization function according to the read functions of the structures.
  • Generating the read function of each of the multiple structures according to the deserialization function signature and the structure type definition information and pointer information of each structure includes generating the read function of the current structure among the multiple structures as follows: obtaining the pointer information of the previous structure and determining the current structure according to it, where the first of the multiple structures is determined according to the return type of the deserialization function signature; and generating the read function corresponding to the current structure according to the structure type definition information and pointer information of the current structure.
  • The user can define the serialization function signature and the structure type definition information of the target data. The serialization function signature and the structure type definition information of the target data are then determined from the target code that contains them, and the target serialization function is generated from them, so a target serialization function can be generated for target data of any data structure type. The target data is then obtained and serialized according to the target serialization function to generate the corresponding target byte stream. Target data of any user-defined data structure type can thus be serialized without being converted into a data structure of a designated type, which simplifies the data serialization process.
  • the embodiment of the present application also provides a computer device.
  • The computer device may specifically include an input device 121, a processor 122, and a memory 123.
  • The memory 123 is used to store processor executable instructions.
  • When the processor 122 executes the instructions, the steps of the data serialization method or the data deserialization method described in any of the foregoing embodiments are implemented.
  • the input device may specifically be one of the main devices for information exchange between the user and the computer system.
  • the input device may include a keyboard, a mouse, a camera, a scanner, a light pen, a handwriting input board, a voice input device, etc.; the input device is used to input raw data and programs for processing these numbers into the computer.
  • the input device can also obtain and receive data transmitted from other modules, units, and devices.
  • the processor can be implemented in any suitable way.
  • The processor may take the form of, for example, a microprocessor or a processor together with a computer-readable medium, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, or an embedded microcontroller.
  • the memory may specifically be a memory device used to store information in modern information technology.
  • The memory can include multiple levels. In a digital system, anything that can store binary data can be a memory; in an integrated circuit, a circuit with a storage function but without a physical form is also called a memory, such as RAM or FIFO; in a system, a storage device in physical form is also called a memory, such as a memory stick or a TF card.
  • The embodiment of the present application also provides a computer storage medium based on a data serialization method or a data deserialization method.
  • The computer storage medium stores computer program instructions which, when executed, implement the steps of the data serialization method or data deserialization method described in any of the foregoing embodiments.
  • The above-mentioned storage medium includes, but is not limited to, random access memory (RAM), read-only memory (ROM), cache, hard disk drive (HDD), or memory card.
  • the memory can be used to store computer program instructions.
  • the network communication unit may be an interface set up in accordance with standards stipulated by the communication protocol and used for network connection communication.
  • The modules or steps of the embodiments of the present application described above can be implemented by a general computing device, and they can be concentrated on a single computing device or distributed among multiple computing devices.
  • They can be implemented with program code executable by the computing device, so that they can be stored in a storage device and executed by the computing device. In some cases, the steps shown or described can be performed in an order different from the one given here, or they can each be fabricated into an individual integrated circuit module, or several of the modules or steps can be fabricated into a single integrated circuit module. The embodiments of the present application are therefore not limited to any specific combination of hardware and software.

Abstract

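This application provides data serialization and data deserialization methods, apparatuses, and a computer device. The data serialization method includes: obtaining target code, where the target code includes user-defined structure type definition information of target data and a serialization function signature; determining the serialization function signature and the structure type definition information according to the target code; generating a target serialization function according to the serialization function signature and the structure type definition information; and obtaining the target data to be serialized and serializing it according to the target serialization function to obtain a corresponding target byte stream. The data serialization method can serialize user-defined target data whose data structure type is not a designated one, without converting the target data into a data structure of a designated type, which simplifies the data serialization process.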

Description

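Data serialization and data deserialization methods, apparatuses, and computer device
Technical Field
This application relates to the field of computer technology, and in particular to data serialization and data deserialization methods, apparatuses, and a computer device.
Background
Serialization is the process of converting data structures scattered in memory into a continuous byte stream when the data needs to be stored or transmitted. Commonly used serialization tools currently include protobuf, XML, and JSON. However, these tools all use an intrusive serialization approach: developers must use the data structures specified by the serialization tool, otherwise serialization is not possible.
Therefore, a serialization method applicable to non-designated data structures is urgently needed.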
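Summary
The embodiments of this application provide data serialization and data deserialization methods, apparatuses, and a computer device that are applicable to the serialization and deserialization of non-designated data structures.
An embodiment of this application provides a data serialization method, including: obtaining target code, where the target code includes user-defined structure type definition information of target data and a serialization function signature; determining the serialization function signature and the structure type definition information according to the target code; generating a target serialization function according to the serialization function signature and the structure type definition information; and obtaining the target data to be serialized and serializing it according to the target serialization function to obtain a corresponding target byte stream.
In one embodiment, the target code includes intermediate code; correspondingly, obtaining the target code includes: obtaining target source code, where the target source code includes the source code of the serialization function signature and the source code of the structure type definition information of the target data; and compiling the target source code to generate the intermediate code.
In one embodiment, the structure type definition information of the target data includes the structure type definition information and pointer information of each of multiple structures; generating the target serialization function according to the serialization function signature and the structure type definition information includes: generating a write function for each of the multiple structures according to the serialization function signature and the structure type definition information and pointer information of each structure; and generating the target serialization function according to the write functions of the structures.
In one embodiment, generating the write function for each of the multiple structures according to the serialization function signature and the structure type definition information and pointer information of each structure includes generating the write function of the current structure among the multiple structures as follows: obtaining the pointer information of the previous structure and determining the current structure according to it, where the first of the multiple structures is determined according to the first parameter in the serialization function signature; and generating the write function corresponding to the current structure according to the structure type definition information and pointer information of the current structure.
In one embodiment, the method further includes: while generating the write function of each of the multiple structures, recording each structure and its corresponding write function in a preset mapping table.
In one embodiment, the target data is a ring structure including multiple nodes, and the structure type definition information includes data type information and pointer information; correspondingly, serializing the target data according to the target serialization function to obtain the corresponding target byte stream includes: serializing each of the multiple nodes in the target data according to the target serialization function to obtain the byte stream corresponding to each node, where, while the byte stream corresponding to each node is generated, each node and the first address of its byte stream are recorded in a preset memo, so that when the same node is encountered again, a pointer to the first address of that node is generated.
In one embodiment, the target byte stream includes a data header, an address tag segment, and a data segment; the data header includes a version number, a check code, the address tag segment length, and the data segment length; the address tag segment includes a target address and a pointer level, indicating that the data at the target address is a pointer of that pointer level; and the data segment includes the data information of the target data.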
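An embodiment of this application further provides a data deserialization method, including: obtaining target code, where the target code includes user-defined target structure type definition information and a deserialization function signature; determining the target structure type definition information and the deserialization function signature according to the target code; generating a target deserialization function according to the target structure type definition information and the deserialization function signature; and obtaining the target byte stream to be deserialized and deserializing it according to the target deserialization function to obtain the target data corresponding to the target byte stream, where the structure type definition information of the target data is the target structure type definition information.
In one embodiment, the target code is intermediate code; correspondingly, obtaining the target code includes: obtaining target source code, where the target source code includes the source code of the preset deserialization function signature and the source code of the target structure type definition information; and compiling the target source code to generate the intermediate code.
In one embodiment, the target structure type definition information includes the structure type definition information and pointer information of each of multiple structures; generating the target deserialization function according to the deserialization function signature and the structure type definition information includes: generating a read function for each of the multiple structures according to the deserialization function signature and the structure type definition information and pointer information of each structure; and generating the target deserialization function according to the read functions of the structures.
In one embodiment, generating the read function for each of the multiple structures according to the deserialization function signature and the structure type definition information and pointer information of each structure includes generating the read function of the current structure among the multiple structures as follows: obtaining the pointer information of the previous structure and determining the current structure according to it, where the first of the multiple structures is determined according to the return type of the deserialization function signature; and generating the read function corresponding to the current structure according to the structure type definition information and pointer information of the current structure.
An embodiment of this application further provides a data serialization apparatus, including: an obtaining module, configured to obtain target code, where the target code includes user-defined structure type definition information of target data and a serialization function signature; a determining module, configured to determine the serialization function signature and the structure type definition information according to the target code; a generating module, configured to generate a target serialization function according to the serialization function signature and the structure type definition information; and a processing module, configured to obtain the target data to be serialized and serialize it according to the target serialization function to obtain a corresponding target byte stream.
An embodiment of this application further provides a computer device, including a processor and a memory for storing processor-executable instructions, where the processor, when executing the instructions, implements the steps of the data serialization method described in any of the above embodiments.
An embodiment of this application further provides a computer-readable storage medium storing computer instructions that, when executed, implement the steps of the data serialization method described in any of the above embodiments.
The embodiments of this application provide a data serialization method that: obtains target code including user-defined structure type definition information of target data and a serialization function signature; determines the serialization function signature and the structure type definition information according to the target code; generates a target serialization function according to the serialization function signature and the structure type definition information; and obtains the target data to be serialized and serializes it according to the target serialization function to obtain a corresponding target byte stream. In this solution, the user defines the serialization function signature and the structure type definition information of the target data. Both are then determined from the target code that contains them, and the target serialization function is generated from them, so a target serialization function can be generated for target data of any data structure type. The target data is then obtained and serialized according to the target serialization function to produce the target byte stream. Target data of any user-defined data structure type can thus be serialized without being converted into a data structure of a designated type, which simplifies the data serialization process.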
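Brief Description of the Drawings
The drawings described here provide a further understanding of this application and form a part of it; they do not limit this application. In the drawings:
FIG. 1 shows a schematic diagram of an application scenario of the data serialization method and the data deserialization method in an embodiment of this application;
FIG. 2 shows a flowchart of the data serialization method in an embodiment of this application;
FIG. 3 shows a schematic structural diagram of target data including multiple structures in an embodiment of this application;
FIG. 4 shows a schematic diagram of generating the target serialization function for the structure types of the target data in FIG. 3 in an embodiment of this application;
FIG. 5 shows a schematic structural diagram of target data whose structure is a ring structure in an embodiment of this application;
FIG. 6 shows a schematic structural diagram of target data whose structure is a ring structure in an embodiment of this application;
FIG. 7 shows a schematic diagram of the storage format of the generated target byte stream in an embodiment of this application;
FIG. 8 shows a schematic diagram of the storage format of the generated target byte stream in an embodiment of this application;
FIG. 9 shows a flowchart of the data deserialization method in an embodiment of this application;
FIG. 10 shows a schematic diagram of the data serialization apparatus in an embodiment of this application;
FIG. 11 shows a schematic diagram of the data deserialization apparatus in an embodiment of this application;
FIG. 12 shows a schematic diagram of the computer device in an embodiment of this application.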
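Detailed Description
The principle and spirit of this application are described below with reference to several exemplary implementations. It should be understood that these implementations are given only to enable those skilled in the art to better understand and implement this application, not to limit its scope in any way. Rather, they are provided so that this disclosure is thorough and complete and fully conveys its scope to those skilled in the art.
Those skilled in the art know that the implementations of this application may be realized as a system, apparatus, device, method, or computer program product. Accordingly, this disclosure may be embodied in the following forms: entirely hardware, entirely software (including firmware, resident software, microcode, and so on), or a combination of hardware and software.
Existing serialization tools can serialize only one or a few designated data structures; any other data structure must first be converted into a designated one. On the one hand, developers have to learn how to use the designated data structures. On the other hand, when an application must use its own structures, those structures have to be converted into the structures designated by the serialization tool before serialization and converted back after deserialization, which is cumbersome. Moreover, if a self-defined structure has private members that cannot be accessed or set, the conversion is impossible.
To address these problems, the embodiments of this application provide a data serialization method and a data deserialization method. FIG. 1 shows a schematic diagram of an application scenario of the two methods in an embodiment of this application. For example, data serialization and deserialization are needed when data is transmitted between different clients. As shown in FIG. 1, client A wants to send target data to client B. Client A serializes the target data to obtain the corresponding target byte stream and sends the target byte stream to client B. Client B deserializes the target byte stream to obtain the corresponding target data. This scenario is only an example: data serialization also applies when data needs to be stored and data structures scattered in memory must be converted into a byte stream, and data deserialization applies when data stored on disk or received from the network needs to be used and the byte stream must be deserialized into data structures.
FIG. 2 shows a flowchart of the data serialization method in an embodiment of this application. Although this application provides the method steps or apparatus structures shown in the following embodiments or drawings, the method or apparatus may include more or fewer steps or module units based on conventional or non-inventive effort. For steps or structures that have no logically necessary causal relationship, the execution order of the steps or the module structure of the apparatus is not limited to that described in the embodiments and shown in the drawings. When applied in an actual apparatus or terminal product, the method or module structure may be executed sequentially or in parallel according to the embodiments or drawings (for example, in a parallel-processor or multi-threaded environment, or even a distributed processing environment).
Specifically, as shown in FIG. 2, the data serialization method provided by some embodiments of this application may include the following steps: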
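Step S201: obtain target code.
The target code includes user-defined structure type definition information of the target data and a serialization function signature. The serialization function signature may include information about the serialization function, for example the function name, parameter types, number of parameters, parameter order, and the class and namespace in which the parameters are located. The structure type definition information of the target data includes the type and definition information of each structure of the target data. The data structure types of the target data may include, but are not limited to, arrays, linked lists, trees, heaps, and hash tables.
Step S202: determine the serialization function signature and the structure type definition information according to the target code.
Step S203: generate the target serialization function according to the serialization function signature and the structure type definition information.
After the target code is obtained, the serialization function signature and the structure type definition information of the target data can be determined according to the target code. The structure type definition information of the target data may include the definition information of each structure in the target data. Once both have been determined, the target serialization function can be generated according to the serialization function signature and the structure type definition information of the target data.
Step S204: obtain the target data to be serialized, and serialize the target data according to the target serialization function to obtain the corresponding target byte stream.
After the target serialization function is generated, the target data to be serialized can be obtained and serialized according to the target serialization function to obtain the corresponding target byte stream. The target data may be a data structure input by the user, data generated by the target code, or a data structure in memory; this application does not limit this.
In the data serialization method of the above embodiment, the user defines the serialization function signature and the structure type definition information of the target data. Both are then determined from the target code that contains them, and the target serialization function is generated from them, so a target serialization function can be generated for target data of any user-defined data structure type. The target data is then obtained and serialized according to the target serialization function to produce the target byte stream. Target data of any user-defined data structure type can thus be serialized without being converted into a data structure of a designated type, which simplifies the data serialization process.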
在本申请一些实施例中,目标代码可以包括中间码;相应的,获取目标代码,可以包括:获取目标源码,其中,目标源码包括序列化函数签名的源码以及目标数据的结构体类型定义信息的源码;对目标源码进行编译,生成中间码。
其中,中间码是一种易于翻译成机器码的等效内部表示代码,其介于高级语言代码和机器代码之间。具体地,上述数据序列化方法可以由编译器将目标代码进行编译处理后转换成机器码执行来完成的。目标源码可以为用户先行写出的代码,例如,可以为C语言代码、Java代码、C++语言代码等。在获得目标源码之后,为了便于处理,可以对通过编译器前端对目标源码进行编译,生成中间码。其中,中间码的格式可以包括以下之一:llvm(Low Level Virtual Machine,底层虚拟机)、wasm(WebAssembly)、JVM(Java Virtual Machine,Java虚拟机)等。编译器中间层可以根据编译得到的中间码确定序列化函数签名和目标数据的结构体类型定义信息,并根据序列化函数签名和目标数据的结构体类型定义信息生成目标序列化函数,并根据目标序列化函数将中间码中的序列化函数补全,得到补全后的中间码。编译器后端将补全后的中间码转换为机器码。之后,获取目标数据,并调用该机器码对目标数据进行序列化处理。通过上述方式,将目标源码转换为中间码,便于编译器中间层进行分析处理以生成目标序列化函数。
在本申请一些实施例中,目标数据的结构体类型定义信息包括多个结构体中各结构体的结构体类型定义信息和指针信息;根据序列化函数签名和结构体类型定义信息生成目标序列化函数,可以 包括:根据序列化函数签名和多个结构体中各结构体的结构体类型定义信息和指针信息生成多个结构体中各个结构体的写入函数;根据各个结构体的写入函数,生成目标序列化函数。
具体地,对于包括多个结构体的目标数据而言,目标数据的结构体类型定义信息包括多个结构体中各结构体的结构体类型定义信息和指针信息。可以根据序列化签名和多个结构体中各结构体的结构体类型定义信息和指针信息生成各结构体的写入函数,然后根据各结构体的写入函数,生成目标序列化函数。其中,写入函数为write函数,可以将结构体中的数据写入二进制字节数据包中。通过上述方式,可以针对包含多个结构体的目标数据生成对应的目标序列化函数。
进一步地,在本申请一些实施例中,根据序列化函数签名和多个结构体中各结构体的结构体类型定义信息和指针信息生成多个结构体中各个结构体的写入函数,可以包括:按照以下方式生成多个结构体中当前结构体的写入函数:获取上一个结构体的指针信息,并根据上一个结构体的指针信息确定出当前结构体,其中,多个结构体中的第一个结构体是根据序列化函数签名中的第一个参数确定的;根据当前结构体的结构体类型定义信息和指针信息,生成与当前结构体对应的写入函数。
在生成第一个结构体对应的写入函数时,根据序列化函数签名中的第一个参数确定第一结构体,并根据第一结构体的结构体类型定义信息和指针信息生成第一个结构体的写入函数。接着,可以根据第一个结构体的指针信息,确定当前结构体,并根据当前结构体的结构体类型定义信息和指针信息,生成与当前结构体对应的写入函数,以此类推,直接生成最后一个结构体对应的写入函数。通过上述方式,可以实现根据序列化函数签名和各结构体的类型定义信息和指针信息生成各结构体对应的写入函数,之后根据写入函数生成目标数据对应的目标序列化函数。
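In some embodiments of this application, the method may further include: while generating the write function of each of the multiple structs, recording each struct and its corresponding write function in a preset mapping table.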
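Specifically, in some cases the multiple structs form a ring. To avoid repeated writes, each struct and its corresponding write function can be recorded in a preset mapping table while the write functions are being generated, so that when the same struct is encountered again its write function is not generated a second time.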
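The mapping-table idea can be sketched in C as a recursive walk over struct types. This is only an illustration under stated assumptions: `TypeDesc`, `WriterMap`, and `generate_writers` are hypothetical names that do not appear in the application, and "generating" a write function is reduced here to recording its name.

```c
#include <stddef.h>
#include <string.h>

#define MAX_TYPES 16
#define MAX_PTRS 4

/* Hypothetical descriptor of one struct type: its name and the struct
 * types that its pointer fields refer to. */
typedef struct TypeDesc {
    const char *name;
    const struct TypeDesc *pointees[MAX_PTRS];
    size_t n_pointees;
} TypeDesc;

/* Mapping table: struct type -> name of its generated write function. */
typedef struct {
    const TypeDesc *type[MAX_TYPES];
    char fn_name[MAX_TYPES][32];
    size_t count;
} WriterMap;

/* Returns the slot of t in the table, or -1 if t has not been recorded. */
static int writer_map_find(const WriterMap *m, const TypeDesc *t) {
    for (size_t i = 0; i < m->count; i++)
        if (m->type[i] == t) return (int)i;
    return -1;
}

/* Records a write function for t and, recursively, for every struct type
 * reachable through its pointer fields. The type is recorded before the
 * recursion, so a cycle such as A -> B -> C -> A terminates. */
void generate_writers(WriterMap *m, const TypeDesc *t) {
    if (t == NULL || writer_map_find(m, t) >= 0 || m->count == MAX_TYPES)
        return;
    size_t slot = m->count++;
    m->type[slot] = t;
    strcpy(m->fn_name[slot], "write_");
    strncat(m->fn_name[slot], t->name,
            sizeof m->fn_name[slot] - strlen("write_") - 1);
    for (size_t i = 0; i < t->n_pointees; i++)
        generate_writers(m, t->pointees[i]);
}
```

Recording the type before descending into its pointer fields is the design choice that matters: checking the table only after the recursion returns would loop forever on a ring of types.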
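Please refer to FIG. 3, which is a schematic diagram of target data containing multiple structs in an embodiment of this application. As shown in FIG. 3, the target data includes structs A, B, C, and D. Struct A contains two pointers, which point to struct B and struct D respectively; the pointer of struct B points to struct C, and the pointer of struct C points to struct A. For example, the target code containing the definition of the target data and the serialization function signature may be as follows.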
Figure PCTCN2019120134-appb-000001
Figure PCTCN2019120134-appb-000002
Here, A, B, C, and D are the structs contained in the target data, and char*packing_A(struct A*,unsigned*) is the serialization function signature.
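The original listing survives only as an image placeholder, so the following C fragment is a hypothetical reconstruction based on the description of FIG. 3, not the applicant's code. The `value` fields, the pointer field names, and the helper `link_example` are assumptions; only the struct names and the `packing_A` signature come from the text.

```c
#include <stddef.h>

struct B;
struct C;
struct D;

/* A holds two pointers (to B and D), B points to C, C points back to A. */
struct A { int value; struct B *b; struct D *d; };
struct B { int value; struct C *c; };
struct C { int value; struct A *a; };
struct D { int value; };

/* Serialization function signature as quoted in the text. The user only
 * declares it; the body is what the compiler middle layer would fill in. */
char *packing_A(struct A *, unsigned *);

/* Hypothetical helper that wires four instances into the shape of FIG. 3. */
void link_example(struct A *a, struct B *b, struct C *c, struct D *d) {
    a->b = b;
    a->d = d;
    b->c = c;
    c->a = a;
}
```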
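Please refer to FIG. 4, which is a schematic diagram of generating the target serialization function for target data that includes the four structs A, B, C, and D. As shown in FIG. 4, generating the target serialization function from the struct type definition information of the target data and the serialization function signature may specifically include: obtaining the type A to be serialized from the first parameter of packing_A, and, starting from A, recursively generating a write function for each struct. During generation, a mapping table is needed to record each struct and its corresponding function in order to prevent duplicate generation. The generated target serialization function may be as follows.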
Figure PCTCN2019120134-appb-000003
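Here, write_A, write_B, and write_C call one another through indirect recursion. Packing_State is a data type defined in the serialization library; it holds information about the serialization process, which may include the current processing state, space, memory, and the mapping table. The write function and the write_package function are defined in the serialization library: write_package handles the overall process, and write prevents a struct from being written repeatedly. The program above shows the generated target serialization function in C by way of example.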
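Since the generated listing is not preserved in this text, the following C sketch only illustrates the shape described above: per-struct write functions that recurse indirectly, guarded so that no struct is written twice. The `Packing_State` shown here is a deliberately simplified stand-in, and `write_once` and `pack_A` are hypothetical substitutes for the library's write and write_package; the sketch copies only the integer payload and does not encode pointers.

```c
#include <stddef.h>
#include <string.h>

struct B; struct C; struct D;
struct A { int value; struct B *b; struct D *d; };
struct B { int value; struct C *c; };
struct C { int value; struct A *a; };
struct D { int value; };

/* Simplified stand-in for the library's Packing_State (assumption). */
typedef struct {
    const void *seen[16];   /* structs that have already been written */
    size_t n_seen;
    unsigned char buf[256]; /* output bytes */
    size_t len;
} Packing_State;

typedef void (*write_fn)(Packing_State *, const void *);

static void put_int(Packing_State *st, int v) {
    if (st->len + sizeof v <= sizeof st->buf) {
        memcpy(st->buf + st->len, &v, sizeof v);
        st->len += sizeof v;
    }
}

/* Guard in the role of the library's write: call fn only on first visit. */
static void write_once(Packing_State *st, const void *p, write_fn fn) {
    if (p == NULL) return;
    for (size_t i = 0; i < st->n_seen; i++)
        if (st->seen[i] == p) return;
    if (st->n_seen == 16) return;
    st->seen[st->n_seen++] = p;
    fn(st, p);
}

static void write_B(Packing_State *st, const void *p);
static void write_C(Packing_State *st, const void *p);
static void write_D(Packing_State *st, const void *p);

static void write_A(Packing_State *st, const void *p) {
    const struct A *a = p;
    put_int(st, a->value);
    write_once(st, a->b, write_B);
    write_once(st, a->d, write_D);
}
static void write_B(Packing_State *st, const void *p) {
    const struct B *b = p;
    put_int(st, b->value);
    write_once(st, b->c, write_C);
}
static void write_C(Packing_State *st, const void *p) {
    const struct C *c = p;
    put_int(st, c->value);
    write_once(st, c->a, write_A); /* closes the A -> B -> C -> A cycle */
}
static void write_D(Packing_State *st, const void *p) {
    const struct D *d = p;
    put_int(st, d->value);
}

/* Entry point in the role of write_package; returns structs written. */
size_t pack_A(Packing_State *st, const struct A *root) {
    st->n_seen = 0;
    st->len = 0;
    write_once(st, root, write_A);
    return st->n_seen;
}
```

With the four structs linked as in FIG. 3, `pack_A` visits A, B, C, and then D; the call from `write_C` back into A is stopped by the guard.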
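In some embodiments of this application, the target data is an array. Because languages define arrays differently, the way an array is serialized differs from language to language. In languages such as Java, an array has an explicit type and length, so serialization only needs to write the array length first and then write the elements one by one. In languages such as C and C++, there is no way to determine how many elements a pointer points to. To solve this problem, a tagged struct can be defined that wraps the array length, the array base address, and tag information together, making the definition of the array explicit.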
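A minimal C sketch of such a tagged wrapper follows. The field layout, the `TAGGED_ARRAY_MAGIC` value, and the function name `write_int_array` are assumptions made for illustration; the application does not specify them.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical marker value identifying a wrapped array. */
#define TAGGED_ARRAY_MAGIC 0xA77A7u

/* Wraps tag information, array length, and array base address together. */
typedef struct {
    unsigned tag;
    size_t   len;
    int     *data;
} TaggedIntArray;

/* Writes the length first, then the elements one by one (here as a single
 * copy). Returns the number of bytes written, or 0 if the tag is wrong or
 * the output buffer is too small. */
size_t write_int_array(const TaggedIntArray *arr, unsigned char *out, size_t cap) {
    size_t need = sizeof arr->len + arr->len * sizeof(int);
    if (arr->tag != TAGGED_ARRAY_MAGIC || need > cap)
        return 0;
    memcpy(out, &arr->len, sizeof arr->len);
    if (arr->len > 0)
        memcpy(out + sizeof arr->len, arr->data, arr->len * sizeof(int));
    return need;
}
```

Because the length travels with the pointer, a generated write function never has to guess how many elements follow the base address.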
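In some embodiments of this application, the target data is a ring structure that may include multiple nodes, and the struct type definition information includes data type information and pointer information. Correspondingly, serializing the target data according to the target serialization function to obtain the corresponding target byte stream may include: serializing each of the multiple nodes in the target data according to the target serialization function to obtain the byte stream corresponding to each node, where, while the byte stream for each node is being generated, each node and the start address of its byte stream are recorded in a preset memo, so that when the same node is encountered again a pointer to the start address of that node is generated.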
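By way of example, please refer to FIG. 5 and FIG. 6, which schematically show two kinds of ring-structured target data. The first ring structure in FIG. 5 (such as a circular linked list) contains a circular reference; if it is not handled, the packing process (that is, the serialization process) cannot terminate. The second ring structure in FIG. 6 contains a repeated reference; if it is not handled, node 3 is written as two copies, the data obtained after unpacking has different addresses, and errors may result. To address this, each of the multiple nodes in the target data is serialized according to the target serialization function to obtain the byte stream corresponding to each node. While the nodes are being written, a preset memo records each node that has already been written together with its index in the sequence (that is, the start address of its byte stream); when the same node is encountered again, a pointer to its index in the sequence is used. This solves both the non-termination of serialization for ring-structured target data and the repeated writing of certain nodes.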
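The memo technique can be sketched in C for the circular linked list case. The `Node`, `Record`, and `Memo` types and the function `pack_list` are hypothetical; in particular, encoding a pointer as the index of an already written node is one possible reading of "a pointer to its index in the sequence", not the format defined by the application.

```c
#include <stddef.h>

typedef struct Node { int value; struct Node *next; } Node;

/* Serialized form of one node: next is stored as an index into the output
 * sequence, or -1 for a null or unrecorded target. */
typedef struct { int value; int next_index; } Record;

#define MAX_NODES 32

/* Memo of nodes already written; a node's slot is its index in the output. */
typedef struct { const Node *node[MAX_NODES]; size_t count; } Memo;

static int memo_lookup(const Memo *m, const Node *n) {
    for (size_t i = 0; i < m->count; i++)
        if (m->node[i] == n) return (int)i;
    return -1;
}

/* Writes every node reachable from head exactly once and returns the number
 * of records produced. The walk stops as soon as it meets a node that is
 * already in the memo, so a circular list terminates. */
size_t pack_list(const Node *head, Record *out, size_t cap) {
    Memo memo;
    size_t limit = cap < MAX_NODES ? cap : MAX_NODES;
    const Node *cur = head;
    memo.count = 0;
    while (cur != NULL && memo_lookup(&memo, cur) < 0 && memo.count < limit) {
        memo.node[memo.count] = cur;
        out[memo.count].value = cur->value;
        out[memo.count].next_index = -1;
        memo.count++;
        cur = cur->next;
    }
    /* Replace each next pointer by the index of its target in the memo. */
    for (size_t i = 0; i < memo.count; i++)
        out[i].next_index = memo_lookup(&memo, memo.node[i]->next);
    return memo.count;
}
```

For a three-node circular list the last record points back to index 0 instead of triggering another pass over the list.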
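In some embodiments of this application, the target byte stream may include a data header, an address tag segment, and a data segment. The data header includes a version number, a check code, the length of the address tag segment, and the length of the data segment. The address tag segment includes a target address and a pointer level, and indicates that the data at the target address is a pointer of that pointer level. The data segment includes the data information of the target data.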
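Specifically, the storage format of the target byte stream generated by serializing the target data with the data serialization method provided in the embodiments of this application may consist of three segments: a data header, an address tag segment, and a data segment. The data header stores the version number, the check code, the length of the address tag segment, and the length of the data segment. The address tag segment includes a target address and a pointer level, and indicates that the data at the target address is a pointer of that pointer level. The data segment stores the data information of the target data.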
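The three-segment layout can be sketched in C as follows. The field widths (32-bit everywhere), the native byte order, and the names `PacketHeader`, `AddrTag`, `build_packet`, and `read_header` are all assumptions; the application lists the fields but not their encoding.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Data header: version, check code, and the byte lengths of both segments. */
typedef struct {
    uint32_t version;
    uint32_t checksum;
    uint32_t addr_tag_len;
    uint32_t data_len;
} PacketHeader;

/* One address tag: the data at target_addr is a pointer of pointer_level. */
typedef struct {
    uint32_t target_addr;
    uint32_t pointer_level;
} AddrTag;

/* Lays out header, address tag segment, and data segment back to back.
 * Returns the total size in bytes, or 0 if out is too small. */
size_t build_packet(uint32_t version, uint32_t checksum,
                    const AddrTag *tags, uint32_t n_tags,
                    const unsigned char *data, uint32_t data_len,
                    unsigned char *out, size_t cap) {
    PacketHeader h;
    h.version = version;
    h.checksum = checksum;
    h.addr_tag_len = n_tags * (uint32_t)sizeof(AddrTag);
    h.data_len = data_len;
    size_t total = sizeof h + h.addr_tag_len + h.data_len;
    if (total > cap) return 0;
    memcpy(out, &h, sizeof h);
    if (h.addr_tag_len) memcpy(out + sizeof h, tags, h.addr_tag_len);
    if (h.data_len) memcpy(out + sizeof h + h.addr_tag_len, data, h.data_len);
    return total;
}

/* Copies the header out of a packet; returns 1 if the packet is long enough
 * to hold both segments the header announces, 0 otherwise. */
int read_header(const unsigned char *in, size_t len, PacketHeader *h) {
    if (len < sizeof *h) return 0;
    memcpy(h, in, sizeof *h);
    return len >= sizeof *h + (size_t)h->addr_tag_len + h->data_len;
}
```

Storing both segment lengths in the header lets a reader locate the data segment without parsing the address tags first.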
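Please refer to FIG. 7 and FIG. 8, which show the storage format of the target byte stream generated by serializing two kinds of target data. Taking a circular linked list as an example, (1, 2, 3, 4, 5, 6) is a circular linked list in which the next pointer of node 6 points to node 1; the packet structure of the target byte stream obtained by serializing it is shown in FIG. 7. In FIG. 7, the address tag segment has only one element, which marks the data at address 28 as a level-1 pointer. As shown in FIG. 8, for linked lists (1), (2), and (3), each list has only one node and their next pointers all point to the same address NULL. In this case the address tag segment uses one level-3 pointer that points to the next field of node 3, and the next field of node 3 points to the field that previously pointed to NULL.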
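In some embodiments of this application, a target serialization function in another language can be generated from the language type obtained from the compiler interface, the struct type definition information of the target data, and the serialization function signature, thereby enabling cross-language communication.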
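In some embodiments of this application, the struct type definition information of the target data can itself be serialized. The same struct type definition information is represented differently in different source code and intermediate code; by traversing the type definition information, a representation in a unified format can be produced. For example, the struct type definition information can be serialized by bootstrapping, yielding a copy that is contiguous in memory. This contiguous struct type definition information has two uses. 1. Verification: a hash value of the struct type definition information is computed and stored in both the data header and the deserialization function, so that before unpacking (deserialization) it can be checked whether the data type of the packet matches the data type that the deserialization code unpacks. 2. Cross-language code generation: the serialized struct type definition information is passed to a code generation tool written in another language, which can use it to generate code in that language and thus enable cross-language communication.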
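The verification use can be sketched in C. The application does not name a hash function, so 32-bit FNV-1a is swapped in here purely for illustration; the canonical type string and the names `type_info_hash` and `type_info_matches` are likewise assumptions.

```c
#include <stdint.h>

/* 32-bit FNV-1a over a canonical, contiguous description of the struct type
 * definition information (represented here as a NUL-terminated string). */
uint32_t type_info_hash(const char *canonical) {
    uint32_t h = 2166136261u;                 /* FNV offset basis */
    for (const unsigned char *p = (const unsigned char *)canonical; *p; p++) {
        h ^= *p;
        h *= 16777619u;                       /* FNV prime */
    }
    return h;
}

/* Check performed before unpacking: the hash stored in the data header must
 * equal the hash of the type the deserialization function was built for. */
int type_info_matches(uint32_t header_hash, const char *expected_canonical) {
    return header_hash == type_info_hash(expected_canonical);
}
```

A packer would store `type_info_hash(...)` of its type description in the header's check field, and the unpacker would refuse any packet for which `type_info_matches` returns 0.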
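The embodiments of this application also provide a data deserialization method. Specifically, as shown in FIG. 9, the data deserialization method provided by some embodiments of this application may include the following steps: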
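Step S901: obtain target code, where the target code includes user-defined target struct type definition information and a deserialization function signature.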
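Specifically, the target code includes the user-defined struct type definition information of the target data and the deserialization function signature. The deserialization function signature may include information about the deserialization function, for example the function name, parameter types, number of parameters, parameter order, and the class and namespace to which the parameters belong. The target struct type definition information includes the type and definition information of each struct. The data structure type of the target data may include, but is not limited to, arrays, linked lists, trees, heaps, and hash tables.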
Step S902: determine the target struct type definition information and the deserialization function signature from the target code.
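Step S903: generate the target deserialization function from the target struct type definition information and the deserialization function signature.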
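After the target code is obtained, the deserialization function signature and the target struct type definition information can be determined from it. The target struct type definition information may include the definition information of each struct in the target data that is ultimately produced. Once both have been determined, the target deserialization function can be generated from the deserialization function signature and the target struct type definition information.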
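Step S904: obtain the target byte stream to be deserialized, and deserialize the target byte stream according to the target deserialization function to obtain the target data corresponding to the target byte stream, where the struct type definition information of the target data is the target struct type definition information.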
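After the target deserialization function has been generated, the target byte stream to be deserialized can be obtained and deserialized according to the target deserialization function to obtain the target data corresponding to the target byte stream. The target byte stream may be a byte stream obtained from the network or a byte stream read from disk; this application places no limitation on this. The struct type definition information of the target data obtained after deserialization is the target struct type definition information.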
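In the above solution, the user defines the deserialization function signature and provides the target struct type definition information of the target data to be produced by deserialization. Both are then determined from the target code that contains them, and the target deserialization function is generated from them. This yields a target deserialization function that turns the target byte stream into target data of any user-defined target data structure type. The target byte stream is then obtained and deserialized according to the target deserialization function to obtain the corresponding target data. The target byte stream is thus deserialized directly into target data of any user-defined data structure type, without first deserializing into data of a specified type and then converting it into target data of the target struct type, which simplifies the data deserialization process.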
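In some embodiments of this application, the target code is intermediate code. Correspondingly, obtaining the target code includes: obtaining target source code, where the target source code includes the source code of a preset deserialization function signature and the source code of the target struct type definition information; and compiling the target source code to generate the intermediate code.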
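Specifically, the data deserialization method above can be carried out by a compiler, which compiles the target code and converts it into machine code for execution. The target source code may be code written in advance by the user, for example C, Java, or C++ code. After the target source code is obtained, the compiler front end can compile it into intermediate code to make processing easier. The format of the intermediate code may be one of the following: llvm (Low Level Virtual Machine), wasm (WebAssembly), JVM (Java Virtual Machine), and so on. The compiler middle layer can determine the deserialization function signature and the target struct type definition information from the compiled intermediate code, generate the target deserialization function from them, and use the target deserialization function to complete the deserialization function in the intermediate code, yielding the completed intermediate code. The compiler back end converts the completed intermediate code into machine code. The target byte stream is then obtained, and the machine code is called to deserialize it. Converting the target source code into intermediate code in this way makes it easier for the compiler middle layer to analyze the code and generate the target deserialization function.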
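In some embodiments of this application, the target struct type definition information includes the struct type definition information and pointer information of each of multiple structs. Generating the target deserialization function from the deserialization function signature and the struct type definition information includes: generating a read function for each of the multiple structs from the deserialization function signature and the struct type definition information and pointer information of each struct; and generating the target deserialization function from the read functions of the structs.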
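Specifically, for target data that includes multiple structs, the target struct type definition information includes the struct type definition information and pointer information of each of those structs. A read function can be generated for each struct from the deserialization function signature and the struct type definition information and pointer information of each struct, and the target deserialization function can then be generated from these read functions. Here, the read function is a read function that converts the data in a binary byte packet into the target data defined by the target struct type definition information. In this way, a corresponding target deserialization function can be generated for a target byte stream that is to be converted into target data containing multiple structs.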
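When the read function for the first struct is generated, the first struct is determined from the return type of the deserialization function signature, and its read function is generated from its struct type definition information and pointer information. Next, the current struct is determined from the pointer information of the first struct, and the read function corresponding to the current struct is generated from the current struct's type definition information and pointer information, and so on until the read function for the last struct has been generated. In this way, a read function is generated for each struct from the deserialization function signature and each struct's type definition information and pointer information, and the target deserialization function for the target byte stream is then generated from these read functions.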
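In some embodiments of this application, generating a read function for each of the multiple structs from the deserialization function signature and the struct type definition information and pointer information of each struct includes generating the read function of the current struct as follows: obtain the pointer information of the previous struct and determine the current struct from it, where the first of the multiple structs is determined from the return type of the deserialization function signature; then generate the read function corresponding to the current struct from the struct type definition information and pointer information of the current struct.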
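In some embodiments of this application, the method may further include: while generating the read function of each of the multiple structs, recording each struct and its corresponding read function in a preset mapping table.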
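Specifically, in some cases the multiple structs form a ring. To avoid repeated reads, each struct and its corresponding read function can be recorded in a preset mapping table while the read functions are being generated, so that when the same struct is encountered again its read function is not generated a second time, which in turn prevents a struct from being read repeatedly.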
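Please refer to FIG. 3, which is a schematic diagram of target data containing multiple structs in an embodiment of this application. As shown in FIG. 3, the target data includes structs A, B, C, and D. Struct A contains two pointers, which point to struct B and struct D respectively; the pointer of struct B points to struct C, and the pointer of struct C points to struct A. For example, the target code containing the target struct type definition information and the deserialization function signature may be as follows.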
Figure PCTCN2019120134-appb-000004
Here, A, B, C, and D are the structs contained in the target data, and struct A*unpacking_A(char*) is the deserialization function signature.
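For the target struct type definition information above, generating the target deserialization function from the target struct type definition information and the deserialization function signature may specifically include: obtaining the type A to be deserialized from the return type of unpacking_A, and, starting from A, recursively generating a read function for each struct. During generation, a mapping table is needed to record each struct and its corresponding function in order to prevent duplicate generation. The generated target deserialization function is as follows.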
Figure PCTCN2019120134-appb-000005
Figure PCTCN2019120134-appb-000006
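Here, read_A, read_B, and read_C call one another through indirect recursion. UnPacking_State is a data type defined in the serialization library; it holds information about the deserialization process. read and read_package are functions defined in the serialization library: read_package handles the overall deserialization process, and read prevents a struct from being allocated repeatedly. The program above shows the generated target deserialization function in C by way of example. It should be understood that the generated target deserialization function may also be in a language other than C or in an intermediate code format; this application places no limitation on this.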
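The role of the allocation guard can be sketched in C for a linked list. The generated listing is not preserved here, so this is not the applicant's read_A; `Record`, `Node`, and `unpack_list` are hypothetical, and the index-based record format is an assumption. Each record index maps to exactly one node, so a node that is referenced several times (or by a cycle) is allocated only once.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct Node { int value; struct Node *next; } Node;

/* Serialized form of one node: next is an index into the record sequence,
 * or -1 for a null pointer. */
typedef struct { int value; int next_index; } Record;

/* Rebuilds the nodes described by recs. All nodes live in one allocation,
 * and record i always becomes node i, so every reference to index i resolves
 * to the same address. Returns the first node, or NULL on empty input or
 * allocation failure; release the result with free(). */
Node *unpack_list(const Record *recs, size_t n) {
    if (n == 0) return NULL;
    Node *nodes = calloc(n, sizeof *nodes);
    if (nodes == NULL) return NULL;
    for (size_t i = 0; i < n; i++) {
        int ni = recs[i].next_index;
        nodes[i].value = recs[i].value;
        /* Out-of-range indices are treated as null rather than trusted. */
        nodes[i].next = (ni >= 0 && (size_t)ni < n) ? &nodes[ni] : NULL;
    }
    return nodes;
}
```

Resolving indices through a single table is what keeps a circular list circular after unpacking: the last record's index 0 yields the address of the first node rather than a fresh copy.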
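In some embodiments of this application, the target byte stream may include a data header, an address tag segment, and a data segment. The data header includes a version number, a check code, the length of the address tag segment, and the length of the data segment. The address tag segment includes a target address and a pointer level, and indicates that the data at the target address is a pointer of that pointer level. The data segment includes the data information of the target data.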
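Specifically, the storage format of the target byte stream may consist of three segments: a data header, an address tag segment, and a data segment. The data header stores the version number, the check code, the length of the address tag segment, and the length of the data segment. The address tag segment includes a target address and a pointer level, and indicates that the data at the target address is a pointer of that pointer level. The data segment stores the data information of the target data.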
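In some embodiments of this application, deserialization code in another language can be generated from the language type obtained from the compiler interface, the target struct type definition information, and the deserialization function signature, thereby enabling cross-language communication.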
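Based on the same inventive concept, the embodiments of this application also provide a data serialization apparatus, as described in the following embodiments. Because the principle by which the data serialization apparatus solves the problem is similar to that of the data serialization method, the implementation of the apparatus can refer to the implementation of the method, and repeated details are omitted. As used below, the term "unit" or "module" may be a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the following embodiments is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated. FIG. 10 is a structural block diagram of the data serialization apparatus of an embodiment of this application. As shown in FIG. 10, it includes an obtaining module 1001, a determining module 1002, a generating module 1003, and a processing module 1004. The structure is described below.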
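The obtaining module 1001 is configured to obtain target code, where the target code includes user-defined struct type definition information of the target data and a serialization function signature.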
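The determining module 1002 is configured to determine the serialization function signature and the struct type definition information from the target code.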
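The generating module 1003 is configured to generate the target serialization function from the serialization function signature and the struct type definition information.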
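The processing module 1004 is configured to obtain the target data to be serialized and to serialize the target data according to the target serialization function to obtain the corresponding target byte stream.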
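In some embodiments of this application, the target code includes intermediate code. Correspondingly, the obtaining module may be specifically configured to: obtain target source code, where the target source code includes the source code of the serialization function signature and the source code of the struct type definition information of the target data; and compile the target source code to generate the intermediate code.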
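In some embodiments of this application, the struct type definition information of the target data includes the struct type definition information and pointer information of each of multiple structs. The generating module may be specifically configured to: generate a write function for each of the multiple structs from the serialization function signature and the struct type definition information and pointer information of each struct; and generate the target serialization function from the write functions of the structs.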
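In some embodiments of this application, generating a write function for each of the multiple structs from the serialization function signature and the struct type definition information and pointer information of each struct includes generating the write function of the current struct as follows: obtain the pointer information of the previous struct and determine the current struct from it, where the first of the multiple structs is determined from the first parameter in the serialization function signature; then generate the write function corresponding to the current struct from the struct type definition information and pointer information of the current struct.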
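In some embodiments of this application, the apparatus further includes a recording module, which may be specifically configured to: while the generating module generates the write function of each of the multiple structs, record each struct and its corresponding write function in a preset mapping table.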
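In some embodiments of this application, the target data is a ring structure that may include multiple nodes, and the struct type definition information includes data type information and pointer information. Correspondingly, the processing module may be specifically configured to: serialize each of the multiple nodes in the target data according to the target serialization function to obtain the byte stream corresponding to each node, where, while the byte stream for each node is being generated, each node and the start address of its byte stream are recorded in a preset memo, so that when the same node is encountered again a pointer to the start address of that node is generated.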
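In some embodiments of this application, the target byte stream may include a data header, an address tag segment, and a data segment. The data header includes a version number, a check code, the length of the address tag segment, and the length of the data segment. The address tag segment includes a target address and a pointer level, and indicates that the data at the target address is a pointer of that pointer level. The data segment includes the data information of the target data.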
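Based on the same inventive concept, the embodiments of this application also provide a data deserialization apparatus, as described in the following embodiments. Because the principle by which the data deserialization apparatus solves the problem is similar to that of the data deserialization method, the implementation of the apparatus can refer to the implementation of the method, and repeated details are omitted. As used below, the term "unit" or "module" may be a combination of software and/or hardware that implements a predetermined function. Although the apparatus described in the following embodiments is preferably implemented in software, an implementation in hardware, or in a combination of software and hardware, is also possible and contemplated. FIG. 11 is a structural block diagram of the data deserialization apparatus of an embodiment of this application. As shown in FIG. 11, it includes an obtaining module 1101, a determining module 1102, a generating module 1103, and a processing module 1104. The structure is described below.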
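The obtaining module 1101 is configured to obtain target code, where the target code includes user-defined target struct type definition information and a deserialization function signature.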
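The determining module 1102 is configured to determine the target struct type definition information and the deserialization function signature from the target code.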
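The generating module 1103 is configured to generate the target deserialization function from the target struct type definition information and the deserialization function signature.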
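The processing module 1104 is configured to obtain the target byte stream to be deserialized and to deserialize the target byte stream according to the target deserialization function to obtain the target data corresponding to the target byte stream, where the struct type definition information of the target data is the target struct type definition information.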
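In some embodiments of this application, the target code is intermediate code. Correspondingly, the obtaining module may be specifically configured to: obtain target source code, where the target source code includes the source code of a preset deserialization function signature and the source code of the target struct type definition information; and compile the target source code to generate the intermediate code.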
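In some embodiments of this application, the target struct type definition information includes the struct type definition information and pointer information of each of multiple structs. The generating module may be specifically configured to: generate a read function for each of the multiple structs from the deserialization function signature and the struct type definition information and pointer information of each struct; and generate the target deserialization function from the read functions of the structs.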
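In some embodiments of this application, generating a read function for each of the multiple structs from the deserialization function signature and the struct type definition information and pointer information of each struct includes generating the read function of the current struct as follows: obtain the pointer information of the previous struct and determine the current struct from it, where the first of the multiple structs is determined from the return type of the deserialization function signature; then generate the read function corresponding to the current struct from the struct type definition information and pointer information of the current struct.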
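From the above description it can be seen that the embodiments of this application achieve the following technical effects. The user defines the serialization function signature and the struct type definition information of the target data. Both are then determined from the target code that contains them, and the target serialization function is generated from them, so a target serialization function can be generated for target data of any data structure type. The target data is then obtained and serialized according to the target serialization function to generate the corresponding target byte stream. This achieves serialization of target data of any user-defined data structure type without converting the target data into a data structure of a specified type, which simplifies the data serialization process.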
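The implementations of this application also provide a computer device. For details, refer to FIG. 12, a schematic diagram of the structure of a computer device based on the data serialization method or data deserialization method provided in the embodiments of this application. The computer device may specifically include an input device 121, a processor 122, and a memory 123. The memory 123 is configured to store processor-executable instructions. When the processor 122 executes the instructions, it implements the steps of the data serialization method or data deserialization method described in any of the above embodiments.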
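In this implementation, the input device may specifically be one of the main devices for exchanging information between the user and the computer system. The input device may include a keyboard, mouse, camera, scanner, light pen, handwriting tablet, voice input device, and the like; it is used to input raw data and the programs that process the data into the computer. The input device may also obtain and receive data transmitted from other modules, units, and devices. The processor may be implemented in any suitable manner. For example, the processor may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, an embedded microcontroller, and so on. The memory may specifically be a memory device used in modern information technology to store information. The memory may comprise multiple levels: in a digital system, anything that can store binary data can be a memory; in an integrated circuit, a circuit with a storage function but no physical form is also called a memory, such as RAM or a FIFO; in a system, a storage device with a physical form is also called a memory, such as a memory module or a TF card.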
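In this implementation, the functions and effects specifically achieved by the computer device can be explained by reference to the other implementations and are not repeated here.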
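The implementations of this application also provide a computer storage medium based on the data serialization method or data deserialization method. The computer storage medium stores computer program instructions which, when executed, implement the steps of the data serialization method or data deserialization method described in any of the above embodiments.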
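In this implementation, the storage medium includes, but is not limited to, random access memory (RAM), read-only memory (ROM), cache, a hard disk drive (HDD), or a memory card. The memory may be used to store computer program instructions. The network communication unit may be an interface that is configured according to the standard specified by a communication protocol and is used for network connection and communication.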
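In this implementation, the functions and effects specifically achieved by the program instructions stored in the computer storage medium can be explained by reference to the other implementations and are not repeated here.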
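Obviously, those skilled in the art should understand that the modules or steps of the embodiments of this application described above can be implemented with a general-purpose computing device. They can be concentrated on a single computing device or distributed over a network of multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; in some cases the steps shown or described can be performed in a different order from the one given here, or the modules or steps can each be made into a separate integrated circuit module, or several of them can be made into a single integrated circuit module. The embodiments of this application are therefore not limited to any specific combination of hardware and software.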
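It should be understood that the above description is intended to be illustrative and not restrictive. Upon reading the above description, many implementations and many applications beyond the examples provided will be apparent to those skilled in the art. The scope of this application should therefore be determined not with reference to the above description but with reference to the claims, together with the full scope of equivalents to which those claims are entitled.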
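The above are merely preferred embodiments of this application and are not intended to limit it. For those skilled in the art, the embodiments of this application may be modified and varied in many ways. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of this application shall fall within the scope of protection of this application.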

Claims (14)

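  1. A data serialization method, characterized by comprising:
    obtaining target code, wherein the target code includes user-defined struct type definition information of target data and a serialization function signature;
    determining the serialization function signature and the struct type definition information from the target code;
    generating a target serialization function from the serialization function signature and the struct type definition information;
    obtaining the target data to be serialized, and serializing the target data according to the target serialization function to obtain a corresponding target byte stream.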
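  2. The method according to claim 1, characterized in that the target code includes intermediate code;
    correspondingly, obtaining the target code comprises:
    obtaining target source code, wherein the target source code includes source code of the serialization function signature and source code of the struct type definition information of the target data;
    compiling the target source code to generate the intermediate code.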
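  3. The method according to claim 1, characterized in that the struct type definition information of the target data includes struct type definition information and pointer information of each of multiple structs;
    generating the target serialization function from the serialization function signature and the struct type definition information comprises:
    generating a write function for each of the multiple structs from the serialization function signature and the struct type definition information and pointer information of each of the multiple structs;
    generating the target serialization function from the write functions of the structs.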
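  4. The method according to claim 3, characterized in that generating a write function for each of the multiple structs from the serialization function signature and the struct type definition information and pointer information of each of the multiple structs comprises:
    generating the write function of a current struct among the multiple structs as follows:
    obtaining pointer information of a previous struct, and determining the current struct from the pointer information of the previous struct, wherein the first of the multiple structs is determined from the first parameter in the serialization function signature;
    generating the write function corresponding to the current struct from the struct type definition information and pointer information of the current struct.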
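  5. The method according to claim 3, characterized by further comprising:
    while generating the write function of each of the multiple structs, recording each struct and its corresponding write function in a preset mapping table.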
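  6. The method according to claim 1, characterized in that the target data is a ring structure including multiple nodes, and the struct type definition information includes data type information and pointer information;
    correspondingly, serializing the target data according to the target serialization function to obtain the corresponding target byte stream comprises:
    serializing each of the multiple nodes in the target data according to the target serialization function to obtain a byte stream corresponding to each node, wherein, while the byte stream corresponding to each node is being generated, each node and the start address of its byte stream are recorded in a preset memo, so that when the same node is encountered again a pointer to the start address of that node is generated.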
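  7. The method according to claim 1, characterized in that the target byte stream includes a data header, an address tag segment, and a data segment;
    wherein the data header includes a version number, a check code, the length of the address tag segment, and the length of the data segment; the address tag segment includes a target address and a pointer level, and indicates that the data at the target address is a pointer of that pointer level; and the data segment includes the data information of the target data.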
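  8. A data deserialization method, characterized by comprising:
    obtaining target code, wherein the target code includes user-defined target struct type definition information and a deserialization function signature;
    determining the target struct type definition information and the deserialization function signature from the target code;
    generating a target deserialization function from the target struct type definition information and the deserialization function signature;
    obtaining a target byte stream to be deserialized, and deserializing the target byte stream according to the target deserialization function to obtain target data corresponding to the target byte stream, wherein the struct type definition information of the target data is the target struct type definition information.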
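  9. The method according to claim 8, characterized in that the target code is intermediate code;
    correspondingly, obtaining the target code comprises:
    obtaining target source code, wherein the target source code includes source code of a preset deserialization function signature and source code of the target struct type definition information;
    compiling the target source code to generate the intermediate code.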
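  10. The method according to claim 8, characterized in that the target struct type definition information includes struct type definition information and pointer information of each of multiple structs;
    generating the target deserialization function from the deserialization function signature and the struct type definition information comprises:
    generating a read function for each of the multiple structs from the deserialization function signature and the struct type definition information and pointer information of each of the multiple structs;
    generating the target deserialization function from the read functions of the structs.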
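  11. The method according to claim 10, characterized in that generating a read function for each of the multiple structs from the deserialization function signature and the struct type definition information and pointer information of each of the multiple structs comprises:
    generating the read function of a current struct among the multiple structs as follows:
    obtaining pointer information of a previous struct, and determining the current struct from the pointer information of the previous struct, wherein the first of the multiple structs is determined from the return type of the deserialization function signature;
    generating the read function corresponding to the current struct from the struct type definition information and pointer information of the current struct.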
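  12. A data serialization apparatus, characterized by comprising:
    an obtaining module, configured to obtain target code, wherein the target code includes user-defined struct type definition information of target data and a serialization function signature;
    a determining module, configured to determine the serialization function signature and the struct type definition information from the target code;
    a generating module, configured to generate a target serialization function from the serialization function signature and the struct type definition information;
    a processing module, configured to obtain the target data to be serialized and to serialize the target data according to the target serialization function to obtain a corresponding target byte stream.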
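  13. A computer device, characterized by comprising a processor and a memory for storing processor-executable instructions, wherein the processor, when executing the instructions, implements the steps of the method according to any one of claims 1 to 7.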
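  14. A computer-readable storage medium having computer instructions stored thereon, characterized in that the instructions, when executed, implement the steps of the method according to any one of claims 1 to 7.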
PCT/CN2019/120134 2019-11-22 2019-11-22 Data serialization and data deserialization methods, apparatus, and computer device WO2021097785A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/120134 WO2021097785A1 (zh) 2019-11-22 2019-11-22 Data serialization and data deserialization methods, apparatus, and computer device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2019/120134 WO2021097785A1 (zh) 2019-11-22 2019-11-22 Data serialization and data deserialization methods, apparatus, and computer device

Publications (1)

Publication Number Publication Date
WO2021097785A1 true WO2021097785A1 (zh) 2021-05-27

Family

ID=75980382

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/120134 WO2021097785A1 (zh) 2019-11-22 2019-11-22 Data serialization and data deserialization methods, apparatus, and computer device

Country Status (1)

Country Link
WO (1) WO2021097785A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107992624A (zh) * 2017-12-22 2018-05-04 百度在线网络技术(北京)有限公司 解析序列化数据的方法、装置、存储介质及终端设备
CN109117209A (zh) * 2018-07-23 2019-01-01 广州多益网络股份有限公司 序列化和反序列化方法及装置
US10318516B1 (en) * 2015-09-22 2019-06-11 Amazon Technologies, Inc. System for optimizing serialization of values
CN110275789A (zh) * 2019-06-24 2019-09-24 恒生电子股份有限公司 数据处理方法及装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19953161

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19953161

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 091222)
