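WO2021097785A1 - Data serialization and data deserialization methods, apparatus, and computer device - Google Patents
Data serialization and data deserialization methods, apparatus, and computer device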
- Publication number
- WO2021097785A1 (PCT/CN2019/120134, CN2019120134W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- target
- data
- function
- serialization
- definition information
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformation of program code
- G06F8/41—Compilation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/44—Arrangements for executing specific programs
- G06F9/448—Execution paradigms, e.g. implementations of programming paradigms
Definitions
- This application relates to the field of computer technology, in particular to a data serialization and data deserialization method, device and computer equipment.
- Serialization refers to the process of converting the scattered data structure in the memory into a continuous byte stream when data needs to be stored and transmitted.
- Commonly used serialization tools include protobuf, XML, JSON, etc.
- These tools all use intrusive serialization methods: developers must use the data structures specified by the serialization tool, otherwise serialization cannot be achieved.
- The embodiments of the present application provide a data serialization and data deserialization method, device, and computer equipment, which can be applied to the serialization and deserialization of non-designated data structures.
- An embodiment of the application provides a data serialization method, including: obtaining target code, where the target code includes user-defined structure type definition information of the target data and a serialization function signature; determining the serialization function signature and the structure type definition information according to the target code; generating the target serialization function according to the serialization function signature and the structure type definition information; and obtaining the target data to be serialized, and serializing the target data according to the target serialization function to obtain the corresponding target byte stream.
- The target code includes intermediate code; correspondingly, obtaining the target code includes: obtaining the target source code, where the target source code includes the source code of the serialization function signature and the source code of the structure type definition information of the target data; and compiling the target source code to generate the intermediate code.
- The structure type definition information of the target data includes structure type definition information and pointer information of each of a plurality of structures; generating the target serialization function according to the serialization function signature and the structure type definition information includes: generating a write function for each of the structures according to the serialization function signature and the structure type definition information and pointer information of each structure; and generating the target serialization function according to the write function of each structure.
- Generating the write function of each of the structures includes generating the write function of the current structure in the following manner: obtaining the pointer information of the previous structure, and determining the current structure according to the pointer information of the previous structure, where the first of the structures is determined according to the first parameter in the serialization function signature; and generating the write function corresponding to the current structure according to the structure type definition information and pointer information of the current structure.
- the method further includes: in the process of generating the write function of each structure in the plurality of structures, recording each structure and the corresponding write function of each structure in a preset mapping table.
- The target data is a ring structure including multiple nodes, and the structure type definition information includes data type information and pointer information; accordingly, serializing the target data according to the target serialization function to obtain the corresponding target byte stream includes: serializing each of the nodes in the target data according to the target serialization function to obtain the byte stream corresponding to each node, where, in the process of obtaining the byte stream corresponding to each node, each node and the first address of its byte stream are recorded in a preset memo, so that when the same node is encountered again, a pointer to the first address of that node's byte stream is generated.
- The target byte stream includes a data header, an address tag segment, and a data segment, where the data header includes a version number, a check code, the length of the address tag segment, and the length of the data segment; the address tag segment includes a target address and a pointer level, indicating that the data at the target address is a pointer of that pointer level; and the data segment includes the data information of the target data.
- An embodiment of the present application also provides a data deserialization method, including: obtaining target code, where the target code includes user-defined target structure type definition information and a deserialization function signature; determining the target structure type definition information and the deserialization function signature according to the target code; generating the target deserialization function according to the target structure type definition information and the deserialization function signature; and obtaining the target byte stream to be deserialized, and deserializing the target byte stream according to the target deserialization function to obtain the target data corresponding to the target byte stream, where the structure type definition information of the target data is the target structure type definition information.
- The target code is intermediate code; correspondingly, obtaining the target code includes: obtaining the target source code, where the target source code includes the source code of the preset deserialization function signature and the source code of the target structure type definition information; and compiling the target source code to generate the intermediate code.
- The target structure type definition information includes structure type definition information and pointer information of each of a plurality of structures; generating the target deserialization function according to the deserialization function signature and the structure type definition information includes: generating a read function for each of the structures according to the deserialization function signature and the structure type definition information and pointer information of each structure; and generating the target deserialization function according to the read function of each structure.
- Generating the read function of each of the structures includes generating the read function of the current structure in the following manner: obtaining the pointer information of the previous structure, and determining the current structure according to the pointer information of the previous structure, where the first of the structures is determined according to the return type of the deserialization function signature; and generating the read function corresponding to the current structure according to the structure type definition information and pointer information of the current structure.
- An embodiment of the present application also provides a data serialization device, including: an acquisition module, used to acquire target code, where the target code includes user-defined structure type definition information of the target data and a serialization function signature; a determination module, used to determine the serialization function signature and the structure type definition information according to the target code; a generation module, used to generate the target serialization function according to the serialization function signature and the structure type definition information; and a processing module, used to obtain the target data to be serialized, and serialize the target data according to the target serialization function to obtain the corresponding target byte stream.
- An embodiment of the present application also provides a computer device, including a processor and a memory for storing instructions executable by the processor; when the processor executes the instructions, the steps of the data serialization method described in any of the foregoing embodiments are implemented.
- the embodiments of the present application also provide a computer-readable storage medium on which computer instructions are stored, which when executed, implement the steps of the data serialization method described in any of the foregoing embodiments.
- In the embodiments of the present application, a data serialization method is provided: target code including user-defined structure type definition information of the target data and a serialization function signature is obtained; the serialization function signature and the structure type definition information are determined according to the target code; the target serialization function is generated according to the serialization function signature and the structure type definition information; and the target data to be serialized is obtained and serialized according to the target serialization function to obtain the corresponding target byte stream.
- In this solution, the user can define the serialization function signature and the structure type definition information of the target data; both are then determined from the target code that contains them, and the target serialization function is generated from them. A target serialization function can therefore be generated for target data of any data structure type; the target data is then obtained and serialized by the target serialization function into the target byte stream. This realizes the serialization of target data of any user-defined data structure type without converting the target data into a specified type of data structure, which simplifies the data serialization process.
- FIG. 1 shows a schematic diagram of an application scenario of a data serialization method and a data deserialization method in an embodiment of the present application
- Figure 2 shows a flowchart of a data serialization method in an embodiment of the present application
- FIG. 3 shows a schematic diagram of the structure of target data including multiple structures in an embodiment of the present application
- FIG. 4 shows a schematic diagram of generating a target serialization function for the structure type of the target data in FIG. 3 in an embodiment of the present application
- FIG. 5 shows a schematic diagram of the structure of target data with a ring structure in an embodiment of the present application
- FIG. 6 shows a schematic diagram of the structure of target data with a ring structure in an embodiment of the present application
- FIG. 7 shows a schematic diagram of a storage format of a target byte stream generated in an embodiment of the present application
- FIG. 8 shows a schematic diagram of a storage format of a target byte stream generated in an embodiment of the present application
- FIG. 9 shows a flowchart of a data deserialization method in an embodiment of the present application.
- FIG. 10 shows a schematic diagram of a data serialization device in an embodiment of the present application.
- FIG. 11 shows a schematic diagram of a data deserialization device in an embodiment of the present application.
- Fig. 12 shows a schematic diagram of a computer device in an embodiment of the present application.
- Fig. 1 shows a schematic diagram of an application scenario of a data serialization method and a data deserialization method in an embodiment of the present application.
- client A wants to send target data to client B.
- Client A serializes the target data to obtain the corresponding target byte stream.
- Client A sends the target byte stream obtained by serialization to client B.
- Client B deserializes the target byte stream to obtain the corresponding target data.
- The above application scenario is only exemplary.
- Data serialization is also applicable to scenarios where data needs to be stored and data structures scattered in memory need to be converted into a byte stream.
- Data deserialization is applicable when data stored on disk or received from the network needs to be used: the byte stream is deserialized into a data structure.
- Fig. 2 shows a flowchart of a data serialization method in an embodiment of the present application.
- Although this application provides method operation steps or device structures as shown in the following embodiments or drawings, the method or device may include more or fewer operation steps or module units based on conventional practice or without creative effort.
- The execution order of these steps or the module structure of the device is not limited to the execution order or module structure shown in the description of the embodiments of this application and the drawings.
- When the described method or module structure is applied to an actual device or terminal product, it can be executed sequentially or in parallel according to the method or module structure shown in the embodiments or drawings (for example, in a parallel-processor or multi-threaded processing environment, or even a distributed processing environment).
- the data serialization method may include the following steps:
- Step S201 Obtain the target code.
- the target code includes user-defined target data structure type definition information and serialization function signatures.
- the serialization function signature may include the information of the serialization function, for example, it may include information such as function name, parameter type, number of parameters, parameter order, and class and namespace where the parameters are located.
- the structure type definition information of the target data includes the type and definition information of each structure of the target data.
- the types of the data structure of the target data may include, but are not limited to: arrays, linked lists, trees, heaps, hash tables, and so on.
- Step S202 Determine the serialization function signature and structure type definition information according to the target code.
- Step S203 Generate a target serialization function according to the serialization function signature and structure type definition information.
- the serialization function signature and the structure type definition information of the target data can be determined according to the target code.
- the structure type definition information of the target data may include the definition information of each structure in the target data.
- Step S204 Obtain the target data to be serialized, and perform serialization processing on the target data according to the target serialization function to obtain the corresponding target byte stream.
- the target data to be serialized can be obtained, and the target data can be serialized according to the target serialization function to obtain the corresponding target byte stream.
- the target data may be a data structure input by the user, or data generated by the target code, or a data structure in the memory, which is not limited in this application.
- In the method of the above embodiment, the user can define the serialization function signature and the structure type definition information of the target data; both are then determined from the target code that contains them, and the target serialization function is generated according to the serialization function signature and the structure type definition information of the target data.
- A target serialization function can therefore be generated for target data of any user-defined data structure type; the target data is then obtained and serialized by the target serialization function into the target byte stream. This realizes the serialization of target data of any user-defined data structure type without converting the target data into a specified type of data structure, which simplifies the process of data serialization.
- The target code may include intermediate code; correspondingly, obtaining the target code may include: obtaining the target source code, where the target source code includes the source code of the serialization function signature and the source code of the structure type definition information of the target data; and compiling the target source code to generate the intermediate code.
- the intermediate code is an equivalent internal representation code that is easy to translate into machine code, which is between high-level language code and machine code.
- the above-mentioned data serialization method can be completed by a compiler that compiles the target code and converts it into machine code for execution.
- the target source code may be code written by the user in advance, for example, it may be C language code, Java code, C++ language code, etc. After obtaining the target source code, in order to facilitate processing, the target source code can be compiled through the front end of the compiler to generate intermediate code.
- The format of the intermediate code may include one of the following: LLVM (Low Level Virtual Machine) intermediate representation, wasm (WebAssembly), JVM (Java Virtual Machine) bytecode, etc.
- The compiler middle layer can determine the serialization function signature and the structure type definition information of the target data from the compiled intermediate code, generate the target serialization function according to them, and complete the serialization function in the intermediate code according to the target serialization function to obtain the completed intermediate code.
- the compiler backend converts the completed intermediate code into machine code. After that, the target data is obtained, and the machine code is called to serialize the target data.
- the target source code is converted into intermediate code, which is convenient for the middle layer of the compiler to analyze and process to generate the target serialization function.
- The structure type definition information of the target data includes the structure type definition information and pointer information of each of multiple structures; generating the target serialization function according to the serialization function signature and the structure type definition information can include: generating a write function for each of the structures according to the serialization function signature and the structure type definition information and pointer information of each structure; and generating the target serialization function according to the write function of each structure.
- the structure type definition information of the target data includes structure type definition information and pointer information of each structure in the multiple structures.
- the write function of each structure can be generated according to the serialization signature and the structure type definition information and pointer information of each structure in the multiple structures, and then the target serialization function can be generated according to the write function of each structure.
- The write function writes the data in a structure into the binary byte data packet.
- Generating the write function of each of the multiple structures according to the serialization function signature and the structure type definition information and pointer information of each structure may include generating the write function of the current structure in the following manner: obtaining the pointer information of the previous structure, and determining the current structure according to the pointer information of the previous structure, where the first structure is determined according to the first parameter in the serialization function signature; and generating the write function corresponding to the current structure according to the structure type definition information and pointer information of the current structure.
- That is, the first structure is determined according to the first parameter in the serialization function signature, and its write function is generated according to the structure type definition information and pointer information of the first structure.
- The next structure can then be determined according to the pointer information of the first structure, and its write function generated according to its own structure type definition information and pointer information, and so on, until the write function corresponding to the last structure is generated.
- The method may further include: in the process of generating the write function of each of the multiple structures, recording each structure and its corresponding write function in a preset mapping table.
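- The generation procedure above can be sketched in C as follows. This is only an illustration under stated assumptions: `TypeInfo`, `Map`, `map_find`, and `gen_write` are hypothetical names, and "generating" a write function is reduced here to recording its name in the mapping table.

```c
#include <stdio.h>
#include <stddef.h>

#define MAX_FIELDS 4
#define MAX_TYPES  16

/* Hypothetical compile-time description of one structure type. */
struct TypeInfo {
    const char *name;
    int n_ptr;                               /* number of pointer fields */
    const struct TypeInfo *ptr[MAX_FIELDS];  /* type each pointer refers to */
};

/* Mapping table: structure type -> name of its generated write function. */
struct MapEntry { const struct TypeInfo *type; char fn[32]; };
struct Map { struct MapEntry e[MAX_TYPES]; int n; };

/* Return the recorded function name for t, or NULL if none exists yet. */
static const char *map_find(const struct Map *m, const struct TypeInfo *t) {
    for (int i = 0; i < m->n; i++)
        if (m->e[i].type == t) return m->e[i].fn;
    return NULL;
}

/* "Generate" the write function of t, then follow its pointer information
   to the structures it refers to. The entry is recorded before recursing,
   so a cycle such as A -> B -> C -> A terminates. */
static void gen_write(struct Map *m, const struct TypeInfo *t) {
    if (t == NULL || map_find(m, t) != NULL || m->n == MAX_TYPES) return;
    struct MapEntry *e = &m->e[m->n++];
    e->type = t;
    snprintf(e->fn, sizeof e->fn, "write_%s", t->name);
    for (int i = 0; i < t->n_ptr; i++)
        gen_write(m, t->ptr[i]);
}
```

- Calling `gen_write` on the type of the first parameter of the serialization function signature visits every reachable structure type once, which is the role the description gives to the preset mapping table.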
- FIG. 3 shows a schematic diagram of target data including multiple structures in an embodiment of the present application.
- the target data includes structures A, B, C, and D.
- Structure A includes two pointers that point to structure B and structure D, respectively.
- The pointer of structure B points to structure C, and the pointer of structure C points to structure A.
- the definition of the target data and the target code of the serialization function signature can be as follows.
- A, B, C, and D are the structures contained in the target data
- char*packing_A (struct A*, unsigned*) is the serialization function signature.
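- The listing referred to here is not reproduced in this text. As a rough illustration only, the structures of FIG. 3 and the signature might be declared as follows in C. The field names and the `value` payloads are assumptions; only the pointer relationships and the `packing_A` signature come from the description.

```c
#include <stddef.h>

/* Hypothetical field names; only the pointer relationships
   (A -> B, A -> D, B -> C, C -> A) follow the description of FIG. 3. */
struct B;
struct C;
struct D;

struct A {
    int value;       /* assumed payload */
    struct B *b;     /* A points to B */
    struct D *d;     /* A points to D */
};

struct B {
    int value;       /* assumed payload */
    struct C *c;     /* B points to C */
};

struct C {
    int value;       /* assumed payload */
    struct A *a;     /* C points back to A, closing the cycle */
};

struct D {
    int value;       /* assumed payload; D holds no pointers */
};

/* Serialization function signature: declared by the user, with the body
   left to be generated. The second parameter is assumed here to receive
   the length of the returned byte stream. */
char *packing_A(struct A *, unsigned *);
```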
- FIG. 4 shows a schematic diagram of generating a target serialization function for target data including four structures A, B, C, and D.
- Generating the target serialization function according to the structure type definition information of the target data and the serialization function signature can specifically include: determining from the first parameter of packing_A that the type to be serialized is A, and, starting from A, recursively generating a write function for each structure.
- In this process, a mapping table is used to record each structure and its corresponding function to prevent repeated generation.
- the generated target serialization function can be as follows.
- write_A, write_B, and write_C have indirect recursive calls.
- Packing_State is a data type defined in the serialization library, which contains information in the serialization process, including the current processing state, space, memory, and mapping table.
- the write function and write_package function are functions defined in the serialization library.
- the write_package function is used to process the overall process, and the write function is used to prevent repeated writing of the structure.
- the above program exemplarily shows the generated target serialization function in C language.
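- The generated listing is not reproduced in this text. The following C sketch only illustrates the call structure described above, under several assumptions: the struct fields are hypothetical, `Packing_State` is a simplified stand-in for the library type, `write_once` stands in for the library's write function, and pointer encoding is omitted so that only the indirect recursion and the duplicate suppression are visible.

```c
#include <stddef.h>
#include <string.h>

struct A; struct B; struct C; struct D;
struct A { int value; struct B *b; struct D *d; };
struct B { int value; struct C *c; };
struct C { int value; struct A *a; };
struct D { int value; };

#define MAX_OBJS 32
#define BUF_SIZE 256

/* Simplified stand-in for Packing_State: output space plus a table of
   objects that have already been written. */
typedef struct {
    unsigned char buf[BUF_SIZE];
    size_t len;
    const void *seen[MAX_OBJS];
    size_t n_seen;
} Packing_State;

typedef void (*write_fn)(Packing_State *, const void *);

/* Append one int to the output buffer (silently dropped if full). */
static void put_int(Packing_State *st, int v) {
    if (st->len + sizeof v <= BUF_SIZE) {
        memcpy(st->buf + st->len, &v, sizeof v);
        st->len += sizeof v;
    }
}

/* Stand-in for the library's write function: call fn on obj only if obj
   has not been written before. */
static void write_once(Packing_State *st, const void *obj, write_fn fn) {
    if (obj == NULL) return;
    for (size_t i = 0; i < st->n_seen; i++)
        if (st->seen[i] == obj) return;      /* already written */
    if (st->n_seen == MAX_OBJS) return;      /* table full: sketch only */
    st->seen[st->n_seen++] = obj;
    fn(st, obj);
}

static void write_A(Packing_State *st, const void *p);
static void write_B(Packing_State *st, const void *p);
static void write_C(Packing_State *st, const void *p);
static void write_D(Packing_State *st, const void *p);

/* write_A, write_B and write_C call each other indirectly via write_once. */
static void write_A(Packing_State *st, const void *p) {
    const struct A *a = p;
    put_int(st, a->value);
    write_once(st, a->b, write_B);
    write_once(st, a->d, write_D);
}
static void write_B(Packing_State *st, const void *p) {
    const struct B *b = p;
    put_int(st, b->value);
    write_once(st, b->c, write_C);
}
static void write_C(Packing_State *st, const void *p) {
    const struct C *c = p;
    put_int(st, c->value);
    write_once(st, c->a, write_A);
}
static void write_D(Packing_State *st, const void *p) {
    const struct D *d = p;
    put_int(st, d->value);
}

/* Stand-in for write_package: drives the overall process from the root
   and returns the number of bytes written. */
static size_t write_package(Packing_State *st, const struct A *root) {
    st->len = 0;
    st->n_seen = 0;
    write_once(st, root, write_A);
    return st->len;
}
```

- With the cycle A, B, C, A, the call from `write_C` back to `write_A` is suppressed because A is already in the table, so each structure is written exactly once.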
- the target data is an array. Due to the differences in the definition of arrays in different languages, the serialization methods of arrays are different in different languages. For languages like Java, arrays have clear types and lengths. When serializing, you only need to write the length of the array first, and then write the array elements one by one. For languages such as C/C++, it is impossible to determine how many elements a pointer points to. In order to solve this problem, you can define a marked structure, package the array length, the first address of the array, and the marking information to clarify the definition of the array.
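- A marked array structure of this kind might look like the following C sketch. The names `IntArray`, `ARRAY_TAG`, and `write_int_array`, the field widths, and the marker value are all assumptions made for illustration; the description only says that the array length, the first address, and marking information are packaged together.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Arbitrary marker value, assumed for this sketch. */
#define ARRAY_TAG 0x41525259u

/* Hypothetical marked array: the tag identifies the structure as an array
   wrapper, and length states how many elements data refers to. */
struct IntArray {
    uint32_t tag;
    uint32_t length;
    int32_t *data;   /* first address of the array */
};

/* Write the length first, then the elements one by one.
   Returns the number of bytes written, or 0 if the tag is wrong or the
   output buffer is too small. */
static size_t write_int_array(const struct IntArray *arr,
                              unsigned char *out, size_t cap) {
    size_t need = sizeof arr->length + (size_t)arr->length * sizeof(int32_t);
    if (arr->tag != ARRAY_TAG || need > cap) return 0;
    memcpy(out, &arr->length, sizeof arr->length);
    size_t off = sizeof arr->length;
    for (uint32_t i = 0; i < arr->length; i++) {
        memcpy(out + off, &arr->data[i], sizeof(int32_t));
        off += sizeof(int32_t);
    }
    return off;
}
```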
- the target data is a ring structure that may include multiple nodes.
- The structure type definition information includes data type information and pointer information; accordingly, serializing the target data according to the target serialization function to obtain the corresponding target byte stream may include: serializing each of the multiple nodes in the target data according to the target serialization function to obtain the byte stream corresponding to each node, where, in the process of obtaining the byte stream corresponding to each node, each node and the first address of its byte stream are recorded in a preset memo, so that when the same node is encountered again, a pointer to the first address of that node's byte stream is generated.
- FIG. 5 and FIG. 6 schematically show the target data of two ring structures.
- The first type of ring structure, such as the circular linked list in FIG. 5, has circular pointing. If it is not handled, the packing process (that is, the serialization process) cannot terminate.
- The second type of ring structure, in FIG. 6, has repeated pointing. If it is not handled, node 3 will be written as two copies, and the data addresses obtained after unpacking will differ, which may cause errors.
- Serialization is performed on each of the multiple nodes in the target data to obtain the byte stream corresponding to each node.
- During this process, a preset memo is used to record each node that has been written and its subscript in the sequence (i.e., the first address of the corresponding byte stream).
- When the same node is encountered again, a pointer is used to point to its subscript in the sequence.
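- The memo can be illustrated with a circular singly linked list. The C sketch below is an assumption-laden example, not the format of the application: `Node`, `Record`, `Memo`, and `pack_list` are hypothetical names, and each node is written as a fixed-size record holding its value and the subscript of its successor.

```c
#include <stddef.h>
#include <stdint.h>

struct Node { int32_t value; struct Node *next; };

#define MAX_NODES 64
#define NO_NODE   (-1)

/* One serialized record: the payload plus the subscript of the successor. */
struct Record { int32_t value; int32_t next_index; };

/* Memo: nodes already written, in the order they were written, so that a
   node's position in this array is its subscript in the sequence. */
struct Memo { const struct Node *node[MAX_NODES]; int32_t count; };

/* Return the subscript of n in the memo, or NO_NODE if not yet written. */
static int32_t memo_find(const struct Memo *m, const struct Node *n) {
    for (int32_t i = 0; i < m->count; i++)
        if (m->node[i] == n) return i;
    return NO_NODE;
}

/* Serialize the list starting at head into out (capacity cap records).
   Returns the number of records written, or -1 if out is too small. */
static int32_t pack_list(const struct Node *head, struct Record *out,
                         int32_t cap) {
    struct Memo memo = { {0}, 0 };
    const struct Node *cur = head;
    while (cur != NULL && memo_find(&memo, cur) == NO_NODE) {
        if (memo.count == cap || memo.count == MAX_NODES) return -1;
        int32_t idx = memo.count;
        memo.node[memo.count++] = cur;
        out[idx].value = cur->value;
        if (cur->next == NULL) {
            out[idx].next_index = NO_NODE;
        } else {
            int32_t seen = memo_find(&memo, cur->next);
            /* Seen before: point back at its subscript. Otherwise the
               successor is the next record to be written. */
            out[idx].next_index = (seen != NO_NODE) ? seen : memo.count;
        }
        cur = cur->next;
    }
    return memo.count;
}
```

- The loop stops as soon as it reaches a node that is already in the memo, so a circular list terminates and no node is written twice.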
- The target byte stream may include a data header, an address tag segment, and a data segment, where the data header includes a version number, a check code, the address tag segment length, and the data segment length; the address tag segment includes a target address and a pointer level, indicating that the data at the target address is a pointer of that pointer level; and the data segment includes the data information of the target data.
- That is, the storage format of the target byte stream generated after the target data is serialized according to the data serialization method provided in the embodiment of the present application may include three sections: a data header, an address tag segment, and a data segment. The version number, check code, address tag segment length, and data segment length are stored in the data header.
- The address tag segment includes a target address and a pointer level, and indicates that the data at the target address is a pointer of that pointer level.
- the data information of the target data is stored in the data segment.
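- The three-section layout can be sketched in C as follows. The field widths, the field order inside the header, the interpretation of the target address as an offset into the data segment, and the names `StreamHeader`, `AddressTag`, and `build_stream` are assumptions; the description only names the fields.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Data header (field widths assumed). */
struct StreamHeader {
    uint32_t version;    /* version number */
    uint32_t checksum;   /* check code */
    uint32_t tag_len;    /* length of the address tag segment, in bytes */
    uint32_t data_len;   /* length of the data segment, in bytes */
};

/* One element of the address tag segment. */
struct AddressTag {
    uint32_t target_address;  /* assumed: offset of a pointer in the data segment */
    uint32_t pointer_level;   /* 1 for T*, 2 for T**, and so on */
};

/* Lay out header, address tag segment and data segment back to back.
   Returns the total size in bytes, or 0 if out is too small. */
static size_t build_stream(unsigned char *out, size_t cap,
                           uint32_t version, uint32_t checksum,
                           const struct AddressTag *tags, uint32_t n_tags,
                           const unsigned char *data, uint32_t data_len) {
    struct StreamHeader h = { version, checksum,
                              n_tags * (uint32_t)sizeof *tags, data_len };
    size_t total = sizeof h + h.tag_len + h.data_len;
    if (total > cap) return 0;
    memcpy(out, &h, sizeof h);
    if (h.tag_len) memcpy(out + sizeof h, tags, h.tag_len);
    if (h.data_len) memcpy(out + sizeof h + h.tag_len, data, h.data_len);
    return total;
}
```

- Because the header records both segment lengths, a reader can locate the data segment without parsing the address tag segment first.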
- FIG. 7 and FIG. 8 respectively show schematic diagrams of the storage formats of the target byte streams generated after the serialization of two target data.
- For example, 1, 2, 3, 4, 5, 6 is a circular linked list in which the next pointer of node 6 points to node 1; the data packet structure of the target byte stream obtained after serialization is shown in FIG. 7.
- The address tag segment has only one element, which marks the data at address 28 as a level-1 pointer.
- If each linked list has only one node and the next pointers point to the same address NULL, a level-3 pointer is used in the address tag segment to point to the next field of node 3; the next field of node 3 points to the field that previously pointed to NULL.
- the target serialization function of another language can be generated, thereby realizing cross-language communication.
- The structure type definition information of the target data may itself be serialized.
- The same structure type definition information can have different representations.
- By traversing the type definition information, a representation of the structure type definition information in a uniform format can be generated. The serialized structure type definition information has two uses:
- 1. Verification: a hash value of the structure type definition information is calculated and stored in the data header and in the deserialization function. Before unpacking (deserialization), it can be checked whether the data type of the packet is consistent with the data type expected by the deserialization function.
- 2. Cross-language code generation: the serialized structure type definition information is passed to a code generation tool written in another language, which can use it to generate code in that language, thereby achieving cross-language communication.
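- The verification use can be sketched in C as follows. The description does not name a hash function or a canonical text format for the type information; FNV-1a and the string form used in the usage note are assumptions chosen only for illustration, and `type_hash` and `types_match` are hypothetical names.

```c
#include <stdint.h>

/* 32-bit FNV-1a over a uniform-format description of the structure type.
   The choice of hash is an assumption of this sketch. */
static uint32_t type_hash(const char *canonical) {
    uint32_t h = 2166136261u;                 /* FNV offset basis */
    for (const unsigned char *p = (const unsigned char *)canonical; *p; p++) {
        h ^= *p;
        h *= 16777619u;                       /* FNV prime */
    }
    return h;
}

/* Compare the hash stored in the data header with the hash of the type the
   deserialization function expects. Returns nonzero if they agree. */
static int types_match(uint32_t header_hash, const char *reader_canonical) {
    return header_hash == type_hash(reader_canonical);
}
```

- A reader would call `types_match` with the hash read from the data header and its own type description, for example a string such as "struct A{int;B*;D*}", and refuse to unpack on a mismatch.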
- An embodiment of the present application also provides a data deserialization method. Specifically, as shown in FIG. 9, the data deserialization method provided by some embodiments of the present application may include the following steps:
- Step S901: Obtain target code, where the target code includes user-defined target structure type definition information and a deserialization function signature.
- The target code includes the user-defined type definition information of the target data structure and the deserialization function signature.
- The deserialization function signature may include information about the deserialization function, such as the function name, parameter types, number of parameters, parameter order, and the class and namespace in which the parameters are located.
- The target structure type definition information includes the type and definition information of each structure.
- The data structure types of the target data may include, but are not limited to, arrays, linked lists, trees, heaps, and hash tables.
- Step S902: Determine the target structure type definition information and the deserialization function signature according to the target code.
- Step S903: Generate a target deserialization function according to the target structure type definition information and the deserialization function signature.
- The deserialization function signature and the target structure type definition information can be determined according to the target code.
- The target structure type definition information may include the definition information of each structure in the target data that is finally generated.
- Step S904: Obtain the target byte stream to be deserialized, and deserialize the target byte stream according to the target deserialization function to obtain the target data corresponding to the target byte stream, where the structure type definition information of the target data is the target structure type definition information.
- The target byte stream to be deserialized can be obtained and then deserialized according to the target deserialization function to obtain the target data corresponding to the target byte stream.
- The target byte stream may be a byte stream obtained from the network or a byte stream read from a disk, which is not limited in this application.
- The structure type definition information of the target data obtained after deserialization is the target structure type definition information.
- The user can define the deserialization function signature and give the target structure type definition information of the target data to be generated by deserialization. The deserialization function signature and the target structure type definition information are then determined from the target code that contains them, and the target deserialization function is generated from both. In this way, a target deserialization function can be generated that turns the target byte stream into target data of any user-defined structure type. The target byte stream is then obtained and deserialized according to the target deserialization function to obtain the corresponding target data. The target byte stream is thus deserialized directly into target data of any user-defined data structure type, without converting data of a specified type obtained after deserialization into target data of the target structure type, which simplifies the process of data deserialization.
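- For a structure without pointers, the idea of deserializing directly into the user's own type can be sketched as below. The structure Sample, the function name unpacking_Sample and its parameter list, and the byte layout (host byte order, fields packed back to back) are assumptions made for illustration and are not taken from the application.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical user-defined structure; not taken from the patent. */
struct Sample {
    int32_t id;
    double  value;
};

/* Sketch of what a generated deserialization function could do for a
 * pointer-free structure: copy each field out of the byte stream straight
 * into the user's type, with no intermediate library-defined type.
 * Assumes host byte order and fields packed without padding. */
static int unpacking_Sample(const unsigned char *stream, size_t len,
                            struct Sample *out)
{
    if (len < sizeof out->id + sizeof out->value)
        return -1;                               /* stream too short */
    memcpy(&out->id, stream, sizeof out->id);
    memcpy(&out->value, stream + sizeof out->id, sizeof out->value);
    return 0;
}
```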
- The target code is intermediate code; correspondingly, obtaining the target code includes: obtaining target source code, where the target source code includes the source code of the preset deserialization function signature and the source code of the target structure type definition information; and compiling the target source code to generate the intermediate code.
- The above data deserialization method can be carried out by a compiler, which compiles the target code and then converts it into machine code for execution.
- The target source code may be code written by the user in advance, for example C, C++, or Java code.
- The target source code can be compiled by the compiler front end to generate the intermediate code.
- The format of the intermediate code may include one of the following: LLVM (Low Level Virtual Machine), wasm (WebAssembly), JVM (Java Virtual Machine), and so on.
- The compiler middle layer can determine the deserialization function signature and the target structure type definition information from the compiled intermediate code, generate the target deserialization function from them, and complete the deserialization function in the intermediate code according to the target deserialization function to obtain the completed intermediate code.
- The compiler back end converts the completed intermediate code into machine code. After that, the target byte stream is obtained and the machine code is called to deserialize the target byte stream.
- Converting the target source code into intermediate code makes it convenient for the compiler middle layer to analyze and process the code in order to generate the target deserialization function.
- The target structure type definition information includes the structure type definition information and pointer information of each of multiple structures; generating the target deserialization function according to the deserialization function signature and the structure type definition information includes: generating a read function for each of the multiple structures according to the deserialization function signature and the structure type definition information and pointer information of each structure; and generating the target deserialization function according to the read functions of the structures.
- The target structure type definition information includes the structure type definition information and pointer information of each of the multiple structures.
- The read function of each structure can be generated according to the deserialization function signature and the structure type definition information and pointer information of each structure, and the target deserialization function can then be generated according to the read functions of the structures.
- The read function converts the data in the binary byte data packet into the target data defined in the target structure type definition information.
- In this way, a corresponding target deserialization function can be generated for converting the target byte stream into target data containing multiple structures.
- To generate the read function corresponding to the first structure, the first structure is determined according to the return type of the deserialization function signature, and its read function is generated according to the structure type definition information and pointer information of the first structure.
- The current structure can then be determined according to the pointer information of the first structure, and the read function corresponding to the current structure can be generated according to the structure type definition information and pointer information of the current structure, and so on, until the read function corresponding to the last structure is generated. In this way, the read function of each structure can be generated according to the deserialization function signature and the type definition information and pointer information of each structure, and the target deserialization function corresponding to the target byte stream is then generated according to the read functions.
- Generating the read function of each of the multiple structures according to the deserialization function signature and the structure type definition information and pointer information of each structure includes generating the read function of the current structure among the multiple structures in the following manner: obtain the pointer information of the previous structure, and determine the current structure according to the pointer information of the previous structure, where the first of the multiple structures is determined according to the return type of the deserialization function signature; and generate a read function corresponding to the current structure according to the structure type definition information and pointer information of the current structure.
- The method may further include: in the process of generating the read function of each of the multiple structures, recording each structure and its corresponding read function in a preset mapping table.
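- The traversal described above (start from the structure named by the return type, follow pointer information to the next structure, and record each structure in a mapping table) can be sketched as follows. The TypeInfo and GenTable types and the function names are hypothetical; the sketch only records which structures were visited, where a real generator would emit a read function.

```c
#include <stddef.h>

#define MAX_PTRS  4
#define MAX_TYPES 16

/* Hypothetical compile-time description of one structure type. */
struct TypeInfo {
    const char            *name;
    size_t                 n_ptrs;               /* number of pointer fields */
    const struct TypeInfo *pointee[MAX_PTRS];    /* type each pointer targets */
};

/* Mapping table: structures whose read function was already generated. */
struct GenTable {
    const struct TypeInfo *done[MAX_TYPES];
    size_t                 count;
};

static int already_generated(const struct GenTable *t, const struct TypeInfo *ty)
{
    for (size_t i = 0; i < t->count; ++i)
        if (t->done[i] == ty)
            return 1;
    return 0;
}

/* Starting from the root type, follow pointer information to reach every
 * structure once. Recording a type before recursing is what stops the
 * traversal from looping when the types form a cycle. */
static void generate_read_functions(struct GenTable *t, const struct TypeInfo *ty)
{
    if (ty == NULL || already_generated(t, ty) || t->count == MAX_TYPES)
        return;
    t->done[t->count++] = ty;   /* a real generator would emit read_<name> here */
    for (size_t i = 0; i < ty->n_ptrs; ++i)
        generate_read_functions(t, ty->pointee[i]);
}
```

- Without the table lookup, a type graph in which one structure eventually points back to an earlier one would cause the generation to recurse without end.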
- FIG. 3 shows a schematic diagram of target data including multiple structures in an embodiment of the present application.
- The target data includes structures A, B, C, and D.
- Structure A includes two pointers that point to structure B and structure D, respectively.
- The pointer of structure B points to structure C, and the pointer of structure C points to structure A.
- The target code containing the target structure type definition information and the deserialization function signature can be as follows.
- A, B, C, and D are the structures contained in the target data, and struct A*unpacking_A(char*) is the signature of the deserialization function.
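- As a hedged illustration of target code with this shape, the declarations below follow the pointer relationships described for FIG. 3 and the stated signature. The field names and the int payload fields are assumptions; only the structure names, the pointer relationships, and the signature come from the description.

```c
/* Hypothetical reconstruction of target code matching FIG. 3.
 * Field names and int payloads are assumptions. */
struct B;
struct C;
struct D;

struct A {
    int       value;
    struct B *b;       /* A points to B ... */
    struct D *d;       /* ... and to D */
};

struct B { int value; struct C *c; };   /* B points to C */
struct C { int value; struct A *a; };   /* C points back to A, closing the ring */
struct D { int value; };

/* Deserialization function signature: the user only declares it; the body
 * is what the compiler middle layer is described as generating. */
struct A *unpacking_A(char *);
```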
- The target deserialization function is generated according to the target structure type definition information and the deserialization function signature. Specifically, the type A to be deserialized is obtained from the return type of unpacking_A, and a read function is generated recursively for each structure, starting from A.
- During generation, a mapping table is needed to record each structure and its corresponding function, to prevent repeated generation.
- The generated target deserialization function is shown below.
- UnPacking_State is a data type defined in the serialization library; it holds information used during the deserialization process.
- read and read_package are functions defined in the serialization library: read_package handles the overall deserialization process, and read prevents repeated allocation of structures.
- The above program shows the generated target deserialization function in C by way of example. It can be understood that the generated target deserialization function may be in a language other than C, or in intermediate code format, which is not limited in this application.
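- The role described for read (reusing a structure that has already been allocated instead of allocating it again) can be illustrated with a minimal sketch for a ring of nodes. The names UnpackState, read_node, and unpack_ring, the fixed 8-byte record layout, and the use of record indices in place of pointers are all assumptions made for this example; they are not the serialization library's actual API.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_RECORDS 16
#define RECORD_SIZE 8            /* int32 value + int32 index of next record */

struct Node { int32_t value; struct Node *next; };

/* Hypothetical stand-in for UnPacking_State: the stream plus a memo of the
 * objects already allocated for each record. */
struct UnpackState {
    const unsigned char *data;
    size_t               n_records;
    struct Node         *memo[MAX_RECORDS];
};

/* Returns the node stored in record idx, allocating it only on first use. */
static struct Node *read_node(struct UnpackState *st, int32_t idx)
{
    if (idx < 0 || (size_t)idx >= st->n_records)
        return NULL;                             /* treated as a null pointer */
    if (st->memo[idx] != NULL)
        return st->memo[idx];                    /* seen before: reuse it */

    struct Node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    st->memo[idx] = n;                           /* register before recursing */

    int32_t next_idx;
    memcpy(&n->value, st->data + (size_t)idx * RECORD_SIZE, 4);
    memcpy(&next_idx, st->data + (size_t)idx * RECORD_SIZE + 4, 4);
    n->next = read_node(st, next_idx);
    return n;
}

/* Entry point in the spirit of read_package: set up state, read the root. */
static struct Node *unpack_ring(const unsigned char *data, size_t n_records)
{
    struct UnpackState st = { data, n_records, { NULL } };
    if (n_records > MAX_RECORDS)
        return NULL;
    return read_node(&st, 0);
}
```

- Registering the new node in the memo before following its pointer is the step that lets a ring close on the original object rather than allocating a second copy.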
- The target byte stream may include a data header, an address tag segment, and a data segment. The data header includes the version number, check code, address tag segment length, and data segment length. The address tag segment includes a target address and a pointer level, which indicate that the data at the target address is a pointer of that pointer level. The data segment includes the data information of the target data.
- The storage format of the target byte stream may include three sections: a data header, an address tag segment, and a data segment. The version number, check code, address tag segment length, and data segment length are stored in the data header.
- The address tag segment includes a target address and a pointer level, and is used to indicate that the data at the target address is a pointer of that pointer level.
- The data information of the target data is stored in the data segment.
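- The three-section layout can be sketched as below. The application lists the fields but not their widths or order, so the 32-bit field sizes, the struct names, the reading of pointer level as the depth of indirection, and the function split_packet are assumptions.

```c
#include <stdint.h>
#include <string.h>

/* Field widths and order are assumptions; the patent lists only the fields. */
struct PacketHeader {
    uint32_t version;
    uint32_t check_code;        /* e.g. a hash of the type definition */
    uint32_t addr_tag_len;      /* length of the address tag segment, bytes */
    uint32_t data_len;          /* length of the data segment, bytes */
};

/* One entry of the address tag segment (assumed layout). */
struct AddrTag {
    uint32_t target_addr;       /* offset inside the data segment */
    uint32_t pointer_level;     /* assumed: 1 for T*, 2 for T**, ... */
};

/* Splits a byte stream into its three sections. Returns 0 on success and
 * -1 when the lengths in the header do not match the stream. */
static int split_packet(const unsigned char *stream, size_t len,
                        struct PacketHeader *hdr,
                        const unsigned char **addr_tags,
                        const unsigned char **data)
{
    if (len < sizeof *hdr)
        return -1;
    memcpy(hdr, stream, sizeof *hdr);
    size_t body = len - sizeof *hdr;
    if (hdr->addr_tag_len > body || hdr->data_len != body - hdr->addr_tag_len)
        return -1;
    if (hdr->addr_tag_len % sizeof(struct AddrTag) != 0)
        return -1;
    *addr_tags = stream + sizeof *hdr;
    *data      = *addr_tags + hdr->addr_tag_len;
    return 0;
}
```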
- Deserialization code in another language can be generated, thereby achieving cross-language communication.
- An embodiment of the present application also provides a data serialization device, as described in the following embodiment. Since the principle by which the data serialization device solves the problem is similar to that of the data serialization method, the implementation of the device can refer to the implementation of the method, and repeated details are omitted.
- The term "unit" or "module" can be a combination of software and/or hardware that implements a predetermined function.
- Although the devices described in the following embodiments are preferably implemented in software, implementation in hardware or in a combination of software and hardware is also possible and conceivable.
- FIG. 10 is a structural block diagram of the data serialization device according to an embodiment of the present application. As shown in FIG. 10, it includes an acquisition module 1001, a determination module 1002, a generation module 1003, and a processing module 1004. The structure is described below.
- The acquisition module 1001 is used to obtain target code, where the target code includes user-defined structure type definition information of the target data and a serialization function signature.
- The determination module 1002 is used to determine the serialization function signature and the structure type definition information according to the target code.
- The generation module 1003 is used to generate the target serialization function according to the serialization function signature and the structure type definition information.
- The processing module 1004 is used to obtain the target data to be serialized, and to serialize the target data according to the target serialization function to obtain the corresponding target byte stream.
- The target code includes intermediate code; correspondingly, the acquisition module can be specifically used to: obtain target source code, where the target source code includes the source code of the serialization function signature and the source code of the structure type definition information of the target data; and compile the target source code to generate the intermediate code.
- The structure type definition information of the target data includes the structure type definition information and pointer information of each of multiple structures; the generation module can be specifically used to: generate the write function of each of the multiple structures according to the serialization function signature and the structure type definition information and pointer information of each structure; and generate the target serialization function according to the write functions of the structures.
- Generating the write function of each of the multiple structures includes generating the write function of the current structure among the multiple structures in the following manner: obtain the pointer information of the previous structure, and determine the current structure according to the pointer information of the previous structure, where the first of the multiple structures is determined according to the first parameter in the serialization function signature; and generate the write function corresponding to the current structure according to the structure type definition information and pointer information of the current structure.
- The device further includes a recording module, which may be specifically configured to: record each structure and its corresponding write function in a preset mapping table while the generation module generates the write function of each of the multiple structures.
- The target data is a ring structure that may include multiple nodes, and the structure type definition information includes data type information and pointer information; accordingly, the processing module may be specifically used to: serialize each of the multiple nodes in the target data according to the target serialization function to obtain the byte stream corresponding to each node. In the process of generating the byte stream corresponding to each node, each node and the first address of the byte stream corresponding to that node are recorded in a preset memo, so that when the same node is encountered again, a pointer to the first address of that node is generated.
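- The memo technique for ring structures can be sketched as follows. The Node type, the Memo table, the 8-byte record layout, and the use of byte offsets as the recorded first address are assumptions for illustration; the application does not give a concrete layout.

```c
#include <stdint.h>
#include <string.h>

#define MAX_NODES   16
#define RECORD_SIZE 8            /* int32 value + int32 offset of next node */

struct Node { int32_t value; struct Node *next; };

/* Memo: each node already written and the first address (here: the byte
 * offset) of the record that holds it. */
struct Memo {
    const struct Node *node[MAX_NODES];
    int32_t            offset[MAX_NODES];
    size_t             count;
};

static int32_t memo_lookup(const struct Memo *m, const struct Node *n)
{
    for (size_t i = 0; i < m->count; ++i)
        if (m->node[i] == n)
            return m->offset[i];
    return -1;
}

/* Writes n and everything reachable from it into out; returns the offset of
 * n's record, or -1 for a null pointer or when out of space. A node met a
 * second time is not written again: its recorded offset is returned. */
static int32_t write_node(struct Memo *m, unsigned char *out, size_t cap,
                          size_t *used, const struct Node *n)
{
    if (n == NULL)
        return -1;
    int32_t seen = memo_lookup(m, n);
    if (seen >= 0)
        return seen;
    if (m->count == MAX_NODES || *used + RECORD_SIZE > cap)
        return -1;

    int32_t off = (int32_t)*used;
    m->node[m->count] = n;                       /* record before recursing */
    m->offset[m->count++] = off;
    *used += RECORD_SIZE;

    int32_t next_off = write_node(m, out, cap, used, n->next);
    memcpy(out + off, &n->value, 4);
    memcpy(out + off + 4, &next_off, 4);
    return off;
}
```

- For a three-node ring, the third record ends up holding the offset of the first record as its next field, so the cycle is preserved without writing any node twice.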
- The target byte stream may include a data header, an address tag segment, and a data segment. The data header includes the version number, check code, address tag segment length, and data segment length. The address tag segment includes a target address and a pointer level, which indicate that the data at the target address is a pointer of that pointer level. The data segment includes the data information of the target data.
- An embodiment of the present application also provides a data deserialization device, as described in the following embodiment. Since the principle by which the data deserialization device solves the problem is similar to that of the data deserialization method, the implementation of the device can refer to the implementation of the method, and repeated details are omitted.
- The term "unit" or "module" can be a combination of software and/or hardware that implements a predetermined function.
- Although the devices described in the following embodiments are preferably implemented in software, implementation in hardware or in a combination of software and hardware is also possible and conceivable.
- FIG. 11 is a structural block diagram of a data deserialization device according to an embodiment of the present application. As shown in FIG. 11, it includes an acquisition module 1101, a determination module 1102, a generation module 1103, and a processing module 1104. The structure is described below.
- The acquisition module 1101 is used to obtain target code, where the target code includes user-defined target structure type definition information and a deserialization function signature.
- The determination module 1102 is used to determine the target structure type definition information and the deserialization function signature according to the target code.
- The generation module 1103 is used to generate the target deserialization function according to the target structure type definition information and the deserialization function signature.
- The processing module 1104 is used to obtain the target byte stream to be deserialized, and to deserialize the target byte stream according to the target deserialization function to obtain the target data corresponding to the target byte stream, where the structure type definition information of the target data is the target structure type definition information.
- The target code is intermediate code; correspondingly, the acquisition module can be specifically used to: obtain target source code, where the target source code includes the source code of the preset deserialization function signature and the source code of the target structure type definition information; and compile the target source code to generate the intermediate code.
- The target structure type definition information includes the structure type definition information and pointer information of each of multiple structures; the generation module can be specifically used to: generate the read function of each of the multiple structures according to the deserialization function signature and the structure type definition information and pointer information of each structure; and generate the target deserialization function according to the read functions of the structures.
- Generating the read function of each of the multiple structures according to the deserialization function signature and the structure type definition information and pointer information of each structure includes generating the read function of the current structure among the multiple structures in the following manner: obtain the pointer information of the previous structure, and determine the current structure according to the pointer information of the previous structure, where the first of the multiple structures is determined according to the return type of the deserialization function signature; and generate a read function corresponding to the current structure according to the structure type definition information and pointer information of the current structure.
- The user can define the serialization function signature and the structure type definition information of the target data. The serialization function signature and the structure type definition information of the target data are then determined from the target code that contains them, and the target serialization function is generated from both, so a target serialization function can be generated for target data of any data structure type. The target data is then obtained and serialized according to the target serialization function to generate the corresponding target byte stream. In this way, target data of any user-defined data structure type can be serialized without first converting it into a data structure of a specified type, which simplifies the process of data serialization.
- An embodiment of the present application also provides a computer device.
- The computer device may specifically include an input device 121, a processor 122, and a memory 123.
- The memory 123 is used to store processor-executable instructions.
- When the processor 122 executes the instructions, the steps of the data serialization method or the data deserialization method described in any of the foregoing embodiments are implemented.
- The input device may specifically be one of the main devices for information exchange between the user and the computer system.
- The input device may include a keyboard, a mouse, a camera, a scanner, a light pen, a handwriting input board, a voice input device, and so on; the input device is used to input raw data, and the programs that process the data, into the computer.
- The input device can also obtain and receive data transmitted from other modules, units, and devices.
- The processor can be implemented in any suitable way.
- The processor may take the form of, for example, a microprocessor or processor and a computer-readable medium, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, or an embedded microcontroller.
- The memory may specifically be a memory device used to store information in modern information technology.
- The memory can include multiple levels. In a digital system, anything that can store binary data can be a memory; in an integrated circuit, a circuit with a storage function but without a physical form is also called a memory, such as RAM or a FIFO; in a system, storage devices in physical form are also called memories, such as memory sticks and TF cards.
- An embodiment of the present application also provides a computer storage medium based on the data serialization method or the data deserialization method.
- The computer storage medium stores computer program instructions which, when executed, implement the steps of the data serialization method or data deserialization method described in any of the foregoing embodiments.
- The above storage medium includes, but is not limited to, random access memory (RAM), read-only memory (ROM), cache, hard disk drive (HDD), or memory card.
- The memory can be used to store computer program instructions.
- The network communication unit may be an interface set up in accordance with the standards stipulated by a communication protocol and used for network connection and communication.
- The modules or steps of the embodiments of the present application described above can be implemented by a general-purpose computing device, and they can be concentrated on a single computing device or distributed among multiple computing devices.
- They can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device. In some cases, the steps shown or described can be performed in an order different from the one given here, or they can be fabricated into individual integrated circuit modules, or multiple modules or steps among them can be fabricated into a single integrated circuit module. In this way, the embodiments of the present application are not limited to any specific combination of hardware and software.
Claims (14)
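- A data serialization method, characterized by comprising: obtaining target code, wherein the target code includes user-defined structure type definition information of target data and a serialization function signature; determining the serialization function signature and the structure type definition information according to the target code; generating a target serialization function according to the serialization function signature and the structure type definition information; and obtaining the target data to be serialized, and serializing the target data according to the target serialization function to obtain a corresponding target byte stream.
- The method according to claim 1, characterized in that the target code includes intermediate code; correspondingly, obtaining the target code comprises: obtaining target source code, wherein the target source code includes source code of the serialization function signature and source code of the structure type definition information of the target data; and compiling the target source code to generate the intermediate code.
- The method according to claim 1, characterized in that the structure type definition information of the target data includes structure type definition information and pointer information of each of multiple structures; generating the target serialization function according to the serialization function signature and the structure type definition information comprises: generating a write function of each of the multiple structures according to the serialization function signature and the structure type definition information and pointer information of each of the multiple structures; and generating the target serialization function according to the write functions of the structures.
- The method according to claim 3, characterized in that generating the write function of each of the multiple structures according to the serialization function signature and the structure type definition information and pointer information of each of the multiple structures comprises: generating the write function of a current structure among the multiple structures in the following manner: obtaining pointer information of a previous structure, and determining the current structure according to the pointer information of the previous structure, wherein the first of the multiple structures is determined according to the first parameter in the serialization function signature; and generating a write function corresponding to the current structure according to the structure type definition information and pointer information of the current structure.
- The method according to claim 3, characterized by further comprising: in the process of generating the write function of each of the multiple structures, recording each structure and the write function corresponding to each structure in a preset mapping table.
- The method according to claim 1, characterized in that the target data is a ring structure including multiple nodes, and the structure type definition information includes data type information and pointer information; correspondingly, serializing the target data according to the target serialization function to obtain the corresponding target byte stream comprises: serializing each of the multiple nodes in the target data according to the target serialization function to obtain a byte stream corresponding to each node, wherein, in the process of generating the byte stream corresponding to each node, each node and the first address of the byte stream corresponding to each node are recorded in a preset memo, so that when the same node is encountered again, a pointer to the first address of that node is generated.
- The method according to claim 1, characterized in that the target byte stream includes a data header, an address tag segment, and a data segment; wherein the data header includes a version number, a check code, an address tag segment length, and a data segment length; the address tag segment includes a target address and a pointer level, and is used to indicate that the data type at the target address is a pointer of the pointer level; and the data segment includes data information of the target data.
- A data deserialization method, characterized by comprising: obtaining target code, wherein the target code includes user-defined target structure type definition information and a deserialization function signature; determining the target structure type definition information and the deserialization function signature according to the target code; generating a target deserialization function according to the target structure type definition information and the deserialization function signature; and obtaining a target byte stream to be deserialized, and deserializing the target byte stream according to the target deserialization function to obtain target data corresponding to the target byte stream, wherein the structure type definition information of the target data is the target structure type definition information.
- The method according to claim 8, characterized in that the target code is intermediate code; correspondingly, obtaining the target code comprises: obtaining target source code, wherein the target source code includes source code of a preset deserialization function signature and source code of the target structure type definition information; and compiling the target source code to generate the intermediate code.
- The method according to claim 8, characterized in that the target structure type definition information includes structure type definition information and pointer information of each of multiple structures; generating the target deserialization function according to the deserialization function signature and the structure type definition information comprises: generating a read function of each of the multiple structures according to the deserialization function signature and the structure type definition information and pointer information of each of the multiple structures; and generating the target deserialization function according to the read functions of the structures.
- The method according to claim 10, characterized in that generating the read function of each of the multiple structures according to the deserialization function signature and the structure type definition information and pointer information of each of the multiple structures comprises: generating the read function of a current structure among the multiple structures in the following manner: obtaining pointer information of a previous structure, and determining the current structure according to the pointer information of the previous structure, wherein the first of the multiple structures is determined according to the return type of the deserialization function signature; and generating a read function corresponding to the current structure according to the structure type definition information and pointer information of the current structure.
- A data serialization device, characterized by comprising: an acquisition module, used to obtain target code, wherein the target code includes user-defined structure type definition information of target data and a serialization function signature; a determination module, used to determine the serialization function signature and the structure type definition information according to the target code; a generation module, used to generate a target serialization function according to the serialization function signature and the structure type definition information; and a processing module, used to obtain the target data to be serialized, and to serialize the target data according to the target serialization function to obtain a corresponding target byte stream.
- A computer device, characterized by comprising a processor and a memory for storing processor-executable instructions, wherein the processor, when executing the instructions, implements the steps of the method according to any one of claims 1 to 7.
- A computer-readable storage medium having computer instructions stored thereon, characterized in that the instructions, when executed, implement the steps of the method according to any one of claims 1 to 7.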
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2019/120134 WO2021097785A1 (zh) | 2019-11-22 | 2019-11-22 | 数据序列化、数据反序列化方法、装置和计算机设备 |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/CN2019/120134 WO2021097785A1 (zh) | 2019-11-22 | 2019-11-22 | 数据序列化、数据反序列化方法、装置和计算机设备 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021097785A1 true WO2021097785A1 (zh) | 2021-05-27 |
Family
ID=75980382
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/120134 WO2021097785A1 (zh) | 2019-11-22 | 2019-11-22 | 数据序列化、数据反序列化方法、装置和计算机设备 |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2021097785A1 (zh) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107992624A (zh) * | 2017-12-22 | 2018-05-04 | 百度在线网络技术(北京)有限公司 | 解析序列化数据的方法、装置、存储介质及终端设备 |
CN109117209A (zh) * | 2018-07-23 | 2019-01-01 | 广州多益网络股份有限公司 | 序列化和反序列化方法及装置 |
US10318516B1 (en) * | 2015-09-22 | 2019-06-11 | Amazon Technologies, Inc. | System for optimizing serialization of values |
CN110275789A (zh) * | 2019-06-24 | 2019-09-24 | 恒生电子股份有限公司 | 数据处理方法及装置 |
-
2019
- 2019-11-22 WO PCT/CN2019/120134 patent/WO2021097785A1/zh active Application Filing
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111124551B (zh) | 数据序列化、数据反序列化方法、装置和计算机设备 | |
US11010681B2 (en) | Distributed computing system, and data transmission method and apparatus in distributed computing system | |
US9892144B2 (en) | Methods for in-place access of serialized data | |
JP6994071B2 (ja) | Protobufベースのプロジェクトのための包括的な検証手法 | |
US9836397B2 (en) | Direct memory access of dynamically allocated memory | |
US9141510B2 (en) | Memory allocation tracking | |
US20030126590A1 (en) | System and method for dynamic data-type checking | |
US10606614B2 (en) | Container-based language runtime using a variable-sized container for an isolated method | |
US20060271347A1 (en) | Method for generating commands for testing hardware device models | |
US7979761B2 (en) | Memory test device and memory test method | |
US8396904B2 (en) | Utilizing information from garbage collector in serialization of large cyclic data structures | |
JP2017174418A (ja) | モデルチェックのためのデータ構造抽象化 | |
US10083127B2 (en) | Self-ordering buffer | |
CN114144764A (zh) | 使用影子栈的栈跟踪 | |
JP7163966B2 (ja) | 変換方法、変換装置および変換プログラム | |
CN112000589A (zh) | 一种数据写入方法、数据读取方法、装置及电子设备 | |
US20110099166A1 (en) | Extending types hosted in database to other platforms | |
WO2021097785A1 (zh) | 数据序列化、数据反序列化方法、装置和计算机设备 | |
US7505997B1 (en) | Methods and apparatus for identifying cached objects with random numbers | |
JP7025104B2 (ja) | 情報処理装置、方法およびプログラム | |
US6883006B2 (en) | Additions on circular singly linked lists | |
US9697210B1 (en) | Data storage testing | |
CN116126429B (zh) | 一种非数据类型对象的引用持久化及其恢复的方法 | |
US20240036940A1 (en) | Method and system for acceleration or offloading utilizing a unified data pointer | |
AU776882B2 (en) | Generating optimized computer data field conversion routines |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19953161 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 19953161 Country of ref document: EP Kind code of ref document: A1 |
|
32PN | Ep: public notification in the ep bulletin as address of the adressee cannot be established |
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 091222) |