WO2023185401A1 - Data processing method, codec accelerator and related devices - Google Patents

Data processing method, codec accelerator and related devices

Info

Publication number
WO2023185401A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
target
processed
memory
information
Application number
PCT/CN2023/080127
Other languages
English (en)
French (fr)
Inventor
王睿
熊婕
秦涛
黄敬雷
李吉
史济源
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Publication of WO2023185401A1


Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 12/00 - Accessing, addressing or allocating within memory systems or architectures
    • G06F 12/02 - Addressing or allocation; Relocation
    • G06F 9/00 - Arrangements for program control, e.g. control units
    • G06F 9/06 - Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 - Multiprogramming arrangements
    • G06F 9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • Embodiments of the present application relate to the field of computers, and in particular, to a data processing method, a codec accelerator and related equipment.
  • Serialization refers to the process of converting scattered data structures in memory into a continuous byte stream arranged in a specific way when data needs to be transmitted or stored; deserialization performs the opposite function.
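  • As an illustrative sketch of this idea (the struct, field names and byte layout below are assumptions for illustration, not the patent's encoding), serializing a record with one fixed-length and one variable-length member into a contiguous stream might look like:

      #include <stdint.h>
      #include <string.h>

      /* A scattered in-memory structure: the fixed-length field lives in
       * the struct, the variable-length field is a separately stored string. */
      typedef struct {
          uint32_t id;        /* fixed-length member */
          const char *name;   /* variable-length member */
      } Record;

      /* Serialize into a contiguous byte stream: [id][name_len][name bytes].
       * Returns the number of bytes written; out must be large enough. */
      size_t serialize(const Record *r, uint8_t *out) {
          uint32_t len = (uint32_t)strlen(r->name);
          memcpy(out, &r->id, sizeof r->id);
          memcpy(out + 4, &len, sizeof len);
          memcpy(out + 8, r->name, len);
          return 8 + len;
      }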
  • In a conventional scheme, a memory allocation method of repeated allocation and expansion is used to perform serialization or deserialization. Specifically, a memory buffer is pre-allocated for serialization or deserialization; during execution, if the memory is insufficient, a larger memory buffer is requested, the data in the original memory buffer is copied into the larger buffer, and the original memory buffer is then released.
  • As a result, memory may be requested multiple times and the codec accelerator may be invoked repeatedly for serialization or deserialization, which increases the computational burden and wastes computing resources.
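  • For contrast, a minimal sketch of the repeated-expansion scheme described above (illustrative C, not code from the patent; the doubling policy is an assumption):

      #include <stdlib.h>
      #include <string.h>

      /* Conventional scheme: when the pre-allocated buffer runs out,
       * request a larger one, copy the old contents, release the old
       * buffer. Every overflow costs an extra allocation plus a full copy. */
      typedef struct { char *buf; size_t len; size_t cap; } GrowBuf;

      static int growbuf_append(GrowBuf *g, const void *data, size_t n) {
          if (g->len + n > g->cap) {
              size_t newcap = g->cap ? g->cap * 2 : 64;
              while (newcap < g->len + n) newcap *= 2;
              char *p = malloc(newcap);
              if (p == NULL) return -1;
              if (g->len) memcpy(p, g->buf, g->len); /* the costly copy */
              free(g->buf);                          /* release old buffer */
              g->buf = p;
              g->cap = newcap;
          }
          memcpy(g->buf + g->len, data, n);
          g->len += n;
          return 0;
      }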
  • Embodiments of the present application provide a data processing method, a codec accelerator and related devices.
  • First, the data to be processed is parsed to obtain a target template serial number and a target data distribution serial number.
  • According to the target data distribution information, the first memory overhead corresponding to the fixed-length fields in the data to be processed is determined.
  • According to the target processing template corresponding to the target template serial number and the target data distribution information, the second memory overhead corresponding to the variable-length fields in the data to be processed is determined.
  • Finally, according to the first memory overhead and the second memory overhead, the target memory space of the output data is determined. In this way, the memory space required for the output data can be calculated accurately and requested in a single memory application, which reduces the computing burden and saves computing resources.
  • the first aspect of the embodiment of the present application provides a data processing method, including:
  • after the codec accelerator obtains the data to be processed, it parses the data to be processed and obtains the target template serial number and target data distribution serial number corresponding to the data to be processed. According to the target template serial number, the target processing template corresponding to the data to be processed can be obtained.
  • the target processing template indicates the attribute information of the data to be processed, including the type, definition and other information of the data to be processed.
  • according to the target data distribution serial number, the target data distribution information corresponding to the data to be processed can be obtained.
  • the target data distribution information indicates the data distribution of the data to be processed, that is, the structure of the data to be processed; simply put, it is the arrangement of each part of the data within the data to be processed.
  • the target data distribution information can also be bound to the memory overhead of the corresponding fixed-length fields; therefore, according to the target data distribution information, the first memory overhead corresponding to the fixed-length fields in the data to be processed can be determined.
  • based on these two memory overheads, the codec accelerator can calculate the size of the target memory space used to store the output data obtained after encoding or decoding the data to be processed, and apply for the target memory space.
  • in the embodiments of the present application, according to the target data distribution information corresponding to the target data distribution serial number, the first memory overhead corresponding to the fixed-length fields in the data to be processed is determined; according to the target processing template corresponding to the target template serial number and the target data distribution information, the second memory overhead corresponding to the variable-length fields in the data to be processed is determined. Finally, according to the first memory overhead and the second memory overhead, the target memory space of the output data is determined. In this way, the memory space required for the output data can be calculated accurately and requested in a single memory application, which reduces the computing burden and saves computing resources.
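  • A minimal sketch of this single-allocation idea (the overhead values below are illustrative stand-ins for the first and second memory overheads; this is not the patented implementation):

      #include <stdio.h>
      #include <stdlib.h>

      int main(void) {
          /* first overhead: fixed-length fields, known from the target
           * data distribution information; second overhead: variable-
           * length fields, computed from the data to be processed. */
          size_t first_overhead  = 12;    /* bytes, illustrative */
          size_t second_overhead = 137;   /* bytes, illustrative */

          size_t target = first_overhead + second_overhead;
          void *out = malloc(target);     /* one application, no regrowth */
          if (out == NULL) return 1;
          printf("target memory space: %zu bytes\n", target);
          free(out);
          return 0;
      }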
  • before parsing the data to be processed, the codec accelerator can obtain the first address information indicating the storage location of the registration information, and obtain the initial registration information according to the first address information.
  • the registration information includes M processing templates and N data distribution information, where M and N are both positive integers.
  • the encoding and decoding accelerator will establish a correspondence between these M processing templates and N pieces of data distribution information, so that one processing template corresponds to at least one piece of data distribution information.
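  • One way to picture the registration state (a sketch under assumed field names; the patent does not prescribe this layout):

      #include <stddef.h>

      typedef struct ProcessingTemplate ProcessingTemplate; /* attribute info */
      typedef struct DataDistribution  DataDistribution;    /* member layout  */

      /* Registration information: M processing templates and N pieces of
       * data distribution information; each piece of distribution
       * information records which template it belongs to. */
      typedef struct {
          ProcessingTemplate *templates;        /* M entries */
          size_t              m;
          DataDistribution   *distributions;    /* N entries */
          size_t              n;
          size_t             *dist_to_template; /* index i -> template index */
      } RegistrationInfo;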
  • the encoding and decoding accelerator can also determine the memory overhead of each fixed-length field in each processing template based on a variety of methods.
  • for example, the codec accelerator can analyze the members in each processing template and calculate the memory overhead of each fixed-length field, thereby determining the memory overhead of each fixed-length field in each processing template.
  • the processor or terminal device can also directly indicate the memory overhead of each fixed-length field when registering the processing template, so that the encoding and decoding accelerator can directly obtain the memory overhead of each fixed-length field in the processing template.
  • the memory overhead of each fixed-length field in each processing template can also be determined in other ways, which are not limited here; for example, during configuration, the memory overhead of each fixed-length field in a processing template with a small number of members can be annotated directly.
  • since one processing template can include multiple types of members, one processing template can correspond to multiple pieces of data distribution information, which provides technical support for implementing serialization and/or deserialization in a non-invasive manner in the embodiments of the present application, that is, it improves the implementability of the technical solution.
  • non-intrusive means that the serialization and/or deserialization functions can be implemented without using specified input data structures. This is because the processing template in the registration information covers a variety of situations.
  • the encoding and decoding accelerator can determine the corresponding processing template and determine the processing method of the data to be processed.
  • the number of registered processing templates can also be reduced, thereby reducing memory consumption.
  • the fixed-length memory overhead corresponding to each piece of data distribution information can then be determined; that is, each piece of data distribution information is bound to its fixed-length memory overhead, so that in subsequent processing the codec accelerator can quickly determine the fixed-length memory overhead corresponding to the data to be processed based on its target data distribution information, improving processing efficiency.
  • the target data distribution information corresponding to the data to be processed is included in N pieces of data distribution information.
  • the encoding and decoding accelerator can directly determine the fixed-length memory overhead corresponding to the target data distribution information as the first memory overhead corresponding to the fixed-length field in the data to be processed.
  • the data distribution information is bound to the fixed-length memory overhead, so that the encoding and decoding accelerator does not need to perform calculations during data processing, thus saving computing resources.
  • the fixed-length memory overhead corresponding to the data to be processed can be directly and quickly determined based on the target data distribution information, which also improves processing efficiency.
  • in another implementation, before parsing the data to be processed, the codec accelerator can obtain the first address information indicating the storage location of the registration information, and obtain the registration information according to the first address information.
  • the registration information includes M processing templates, where M is a positive integer. Then determine the memory overhead of each fixed-length field included in each of the M processing templates. The way in which the encoding and decoding accelerator determines the memory overhead of each fixed-length field has been introduced above and will not be repeated here.
  • if the codec accelerator only determines the memory overhead of each fixed-length field in each processing template before parsing the data to be processed, then before determining the first memory overhead corresponding to the fixed-length fields in the data to be processed, the codec accelerator needs to determine at least one fixed-length field included in the data to be processed based on the target data distribution information and the target processing template, and then determine the first memory overhead from the memory overhead of each of the at least one fixed-length field.
  • the target data distribution information indicates the member distribution of the data to be processed
  • the target processing template indicates the attribute information of the data to be processed.
  • the attribute information of the data to be processed includes the type, processing method and other information of each data member in the data to be processed.
  • the encoding and decoding accelerator can determine the variable length fields included in the data to be processed based on the distribution of the data to be processed, the attribute information of the data to be processed, and the data to be processed, thereby calculating the second memory overhead corresponding to the variable length fields.
  • the encoding and decoding accelerator will accurately calculate the second memory overhead corresponding to the variable length field based on the information of the data to be processed, so that the calculation results are accurate and the accuracy of the technical solution of this application is improved.
  • before parsing the data to be processed, the codec accelerator obtains M address information and M template serial numbers corresponding to the M processing templates, and establishes a mapping relationship between the M address information and the M template serial numbers to obtain the first mapping table.
  • after the codec accelerator parses the data to be processed to obtain the target template serial number, it can query the first mapping table according to the target template serial number to obtain the second address information corresponding to the target processing template, and then obtain the target processing template from the memory according to the second address information.
  • in this way, the first mapping table indicating the mapping relationship between address information and template serial numbers is stored in the codec accelerator, and the target processing template is obtained from the memory by querying the first mapping table, without storing each processing template locally in the codec accelerator, which reduces the occupation of local resources.
  • before parsing the data to be processed, the codec accelerator also obtains N address information and N data distribution serial numbers corresponding to the N pieces of data distribution information, and establishes a mapping relationship between the N address information and the N data distribution serial numbers to obtain the second mapping table.
  • after the codec accelerator parses the data to be processed to obtain the target data distribution serial number, it can query the second mapping table according to the target data distribution serial number to determine the third address information corresponding to the target data distribution serial number, and then obtain the target data distribution information from the memory according to the third address information.
  • in this way, a second mapping table indicating the mapping relationship between address information and data distribution serial numbers is stored in the codec accelerator, and the target data distribution information is obtained from the memory by querying the second mapping table, without storing each piece of data distribution information locally in the codec accelerator, which reduces the occupation of local resources.
  • the codec accelerator can apply for the target memory space in a variety of ways: it can apply through a memory management accelerator external to the codec accelerator, or through other methods, for example through a memory management module inside the codec accelerator, which is not limited here.
  • the second aspect of the embodiment of the present application provides a coding and decoding accelerator, including:
  • the processing unit is used to parse the data to be processed and obtain the target template serial number and target data distribution serial number corresponding to the data to be processed.
  • the acquisition unit is used to obtain the target processing template corresponding to the data to be processed according to the target template serial number.
  • the acquisition unit is also used to obtain the target data distribution information corresponding to the data to be processed according to the target data distribution sequence number.
  • the processing unit is also used to:
  • determine, according to the target data distribution information, the first memory overhead corresponding to the fixed-length fields in the data to be processed;
  • calculate, according to the target data distribution information, the target processing template and the data to be processed, the second memory overhead corresponding to the variable-length fields in the data to be processed;
  • apply, according to the first memory overhead and the second memory overhead, for a target memory space, where the target memory space is used to store the output data obtained by encoding or decoding the data to be processed.
  • the encoding and decoding accelerator is used to perform the method described in the first aspect, and its beneficial effects are as shown in the first aspect, which will not be described again here.
  • the third aspect of the embodiment of the present application provides a data processing system.
  • the processing system includes a codec accelerator.
  • the codec accelerator is used to execute the method described in the first aspect.
  • the beneficial effects are as shown in the first aspect and will not be described again here.
  • the fourth aspect of the embodiments of the present application provides a computer-readable storage medium.
  • the computer-readable storage medium stores a program.
  • when a computer executes the program, the method of the first aspect is performed.
  • a fifth aspect of the embodiments of the present application provides a computer program product, which is characterized in that when the computer program product is executed on a computer, the computer executes the method of the first aspect.
  • Figure 1a is a schematic diagram of a system architecture applying the data processing method provided by the embodiment of the present application.
  • Figure 1b is a schematic diagram of another system architecture applying the data processing method provided by the embodiment of the present application.
  • Figure 2 is a schematic flow chart of the data processing method provided by the embodiment of the present application.
  • Figure 3 is a schematic diagram of the corresponding relationship between the processing template and data distribution information provided by the embodiment of the present application.
  • Figure 4 is a schematic structural diagram of the data processing system provided by the embodiment of the present application.
  • Figure 5 is a schematic diagram of the data processing method provided by the embodiment of the present application.
  • Figure 6 is a schematic structural diagram of a codec accelerator provided by an embodiment of the present application.
  • Figure 7 is another schematic structural diagram of a codec accelerator provided by an embodiment of the present application.
  • Embodiments of the present application provide a data processing method, a codec accelerator and related devices.
  • First, the data to be processed is parsed to obtain a target template serial number and a target data distribution serial number.
  • According to the target data distribution information, the first memory overhead corresponding to the fixed-length fields in the data to be processed is determined.
  • According to the target processing template corresponding to the target template serial number and the target data distribution information, the second memory overhead corresponding to the variable-length fields in the data to be processed is determined.
  • Finally, according to the first memory overhead and the second memory overhead, the target memory space of the output data is determined. In this way, the memory space required for the output data can be calculated accurately and requested in a single memory application, which reduces the computing burden and saves computing resources.
  • "At least one of a, b, or c" can mean: a; b; c; a and b; a and c; b and c; or a, b and c, where each of a, b and c can be singular or plural.
  • Metadata, also known as intermediary data or relay data, is data used to describe data. It mainly describes the attribute information of data and supports functions such as indicating storage locations, recording historical data, resource searching and file recording. In data-structure-related applications, metadata can describe the name of data, relationships between data, and so on.
  • in the related art, serialization/deserialization uses an intrusive method, that is, the original data needs to be moved and copied into a specified input data structure in order to realize the serialization/deserialization function, which is highly restrictive.
  • the non-invasive approach is the opposite.
  • the data processing method provided by the embodiment of this application registers multiple processing templates and data distribution types during configuration, can flexibly adapt to different situations, and supports serialization and deserialization in a non-invasive manner.
  • Figure 1a and Figure 1b are schematic diagrams of system architectures to which the data processing method provided by the embodiments of the present application is applied.
  • the processor 102 configures the registration information, stores the registration information in the memory 103, and sends the address information of the registration information to the codec accelerator 101.
  • this causes the codec accelerator 101 to obtain the registration information from the memory 103 according to the address information, which provides a basis for dynamic encoding and decoding.
  • the codec accelerator 101 accesses the memory 103 according to the description information of the data to be processed, obtains the registration information corresponding to the data to be processed, and processes the data to be processed based on the registration information to obtain output data.
  • the codec accelerator 101 can determine the memory overhead of the fixed-length fields and the variable-length fields in the data to be processed based on the registration information corresponding to the data to be processed and the data itself, thereby obtaining the target memory space required for the output data, and apply for the target memory space from the memory 103.
  • the codec accelerator 101 may directly apply for the target memory space to the memory 103.
  • the memory management module in the codec accelerator 101 may apply for the target memory space to the memory 103.
  • the codec accelerator 101 can also apply for the target memory space through other methods.
  • alternatively, the codec accelerator 101 applies for the target memory space to the memory management accelerator 104, and the memory management accelerator 104 in turn applies to the memory 103.
  • Figure 2 is a schematic flow chart of the data processing method provided by the embodiment of the present application, including the following steps:
  • the processor triggers the codec accelerator to process data in a variety of ways. Specifically, the processor can send data packets to the codec accelerator according to an agreed protocol; the codec accelerator parses the data packet, obtains the data to be processed, and triggers the processing of that data. The processor can also trigger the codec accelerator in other ways, which are not limited here. For example, the processor calls the interface of the codec accelerator and, based on the agreed protocol, writes the address information (including the first address and length) of the task description information to be processed into the register corresponding to the codec accelerator, thereby triggering the codec accelerator to process the task. The task description information to be processed describes the data to be processed; the codec accelerator parses it to obtain the data to be processed.
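  • As a hedged illustration of this register-write trigger (register offsets and names are invented for the sketch; the patent only states that address information is written to a register):

      #include <stdint.h>

      /* Hypothetical MMIO layout of the codec accelerator's doorbell. */
      #define ACCEL_REG_DESC_ADDR 0x00u  /* first address of description */
      #define ACCEL_REG_DESC_LEN  0x08u  /* length of description        */
      #define ACCEL_REG_DOORBELL  0x10u  /* write 1 to start processing  */

      static inline void mmio_write64(volatile uint8_t *base,
                                      uint32_t off, uint64_t val) {
          *(volatile uint64_t *)(base + off) = val;
      }

      void trigger_accelerator(volatile uint8_t *mmio,
                               uint64_t desc_addr, uint64_t desc_len) {
          mmio_write64(mmio, ACCEL_REG_DESC_ADDR, desc_addr);
          mmio_write64(mmio, ACCEL_REG_DESC_LEN,  desc_len);
          mmio_write64(mmio, ACCEL_REG_DOORBELL,  1);
      }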
  • after obtaining the data to be processed, the codec accelerator parses the data to be processed and obtains the target template serial number and target data distribution serial number corresponding to the data to be processed.
  • the target template serial number is used to indicate the target processing template corresponding to the data to be processed
  • the target data distribution serial number is used to indicate the target data distribution information corresponding to the data to be processed.
  • before parsing the data to be processed to obtain the target template serial number, the codec accelerator obtains the registration information.
  • the registration information includes M processing templates, and M is a positive integer.
  • the address information of the initial registration information will be stored in the register of the encoding and decoding accelerator.
  • the initial registration information is an encoding file.
  • the encoding and decoding accelerator will preprocess the initial registration information, calculate the memory space required for the decoded registration information, and apply for the memory space.
  • the codec accelerator parses the encoded file, stores the decoded registration information in the requested memory space, and stores the first address information (including the first address and length) corresponding to the decoded registration information in the mapping table inside the codec accelerator.
  • the codec accelerator preprocesses the initial registration information and can convert the initial registration information into a format that the codec accelerator can understand, so that subsequent processing can proceed smoothly.
  • the codec accelerator obtains the registration information from the memory based on the first address information and configures the first mapping table for the M processing templates in the registration information. Specifically, the codec accelerator sets M template serial numbers for the M processing templates and obtains the M address information corresponding to the M processing templates, then establishes a mapping relationship between the M address information and the M template serial numbers to obtain the first mapping table.
  • the address information includes the first address and length.
  • the first mapping table may be as shown in Table 1 (only the entry for template No. 0 is given in this example):

    Table 1
    Template serial number    Address information
    0                         0x10002000
    …                         …

  • the address information 0x10002000 in Table 1 indicates that the first address of processing template No. 0 is 0x10002000.
  • the first address is the memory address expressed in hexadecimal. It can be understood that after the initial registration information is decoded by the codec accelerator, the codec accelerator defines the length of each processing template in the decoded registration information; therefore, the first mapping table does not need to store the length of each processing template, only the mapping relationship between template serial numbers and address information.
  • once the first mapping table is established, after the codec accelerator parses the data to be processed to obtain the target template serial number, it can query the first mapping table based on the target template serial number to obtain the second address information corresponding to the target processing template, and then obtain the target processing template from the memory according to the second address information.
  • in this way, the first mapping table indicating the mapping relationship between address information and template serial numbers is stored in the codec accelerator, and the target processing template is obtained from the memory by querying the first mapping table, without storing each processing template locally in the codec accelerator, which reduces the occupation of local resources.
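  • A small sketch of this lookup (illustrative C; the table layout inside the accelerator is not specified by the patent, and the second mapping table for data distribution serial numbers works the same way):

      #include <stdint.h>
      #include <stddef.h>

      /* First mapping table: template serial number -> first address of
       * the decoded processing template in memory. Lengths are omitted
       * because the accelerator fixes each template's length at decode
       * time. */
      typedef struct { uint32_t serial; uint64_t addr; } MapEntry;

      static uint64_t lookup(const MapEntry *table, size_t n, uint32_t serial) {
          for (size_t i = 0; i < n; i++)
              if (table[i].serial == serial)
                  return table[i].addr;  /* e.g. the second address info */
          return 0;                      /* serial number not registered */
      }

      /* Example entry matching Table 1: template No. 0 at 0x10002000. */
      static const MapEntry first_map[] = { { 0, 0x10002000ULL } };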
  • according to the target data distribution serial number, the target data distribution information corresponding to the data to be processed is obtained.
  • the registration information obtained by the encoding and decoding accelerator also includes N data distribution information, where N is a positive integer.
  • the codec accelerator configures a second mapping table for the N pieces of data distribution information. Specifically, the codec accelerator sets N data distribution serial numbers for the N pieces of data distribution information and obtains the N address information corresponding to them, then establishes a mapping relationship between the N address information and the N data distribution serial numbers to obtain the second mapping table.
  • the second mapping table may be as shown in Table 2 (only the entry for data distribution information No. 0 is given in this example):

    Table 2
    Data distribution serial number    Address information
    0                                  0x08049324
    …                                  …

  • the address information 0x08049324 in Table 2 indicates that the first address of data distribution information No. 0 is 0x08049324.
  • the first address is the memory address expressed in hexadecimal. It can be understood that after the initial registration information is decoded by the codec accelerator, the codec accelerator defines the length of each piece of data distribution information in the decoded registration information; therefore, the second mapping table does not need to store the length of each piece of data distribution information, only the mapping relationship between data distribution serial numbers and address information.
  • it should be noted that Table 1 and Table 2 merely take the mapping relationship between template serial numbers and address information and the mapping relationship between data distribution serial numbers and address information as examples; this does not mean that such tables must exist in actual applications.
  • once the second mapping table is established, after the codec accelerator parses the data to be processed to obtain the target data distribution serial number, it can query the second mapping table based on the target data distribution serial number to obtain the third address information corresponding to the target data distribution information, and then obtain the target data distribution information from the memory according to the third address information.
  • in this way, a second mapping table indicating the mapping relationship between address information and data distribution serial numbers is stored in the codec accelerator, and the target data distribution information is obtained from the memory by querying the second mapping table, without storing each piece of data distribution information locally in the codec accelerator, which reduces the occupation of local resources.
  • according to the target data distribution information, the first memory overhead corresponding to the fixed-length fields in the data to be processed is determined.
  • the encoding and decoding accelerator may determine the first memory overhead of the fixed-length field in the data to be processed based on different methods. Different situations will be described below.
  • the encoding and decoding accelerator obtains registration information from the memory according to the first address information.
  • the registration information includes M processing templates and N data distribution information, where M and N are both positive integers.
  • the processing template indicates the attribute information of the data to be processed, including the type, name, length, comments and other information of each member in the data to be processed.
  • Data distribution information indicates the arrangement of each member in the data to be processed.
  • Each of the M processing templates corresponds to at least one piece of data distribution information.
  • assuming that processing template 1 includes five members, namely member 0 to member 4, the correspondence between processing template 1 and data distribution information can be as shown in Figure 3.
  • Figure 3 is a schematic diagram of the corresponding relationship between the processing template and data distribution information provided by the embodiment of the present application.
  • data distribution information 1 to data distribution information 3 all correspond to processing template 1, and the members in each piece of data distribution information are included in the members of processing template 1. That is to say, the members of processing template 1 are the complete set, and the members of each piece of data distribution information corresponding to processing template 1 are subsets of that complete set.
  • one processing template can correspond to multiple pieces of data distribution information, which provides technical support for implementing serialization and/or deserialization in a non-invasive manner in the embodiments of this application, that is, it improves the implementability of the technical solution.
  • non-intrusive means that the serialization and/or deserialization functions can be implemented without using specified input data structures. This is because the processing template in the registration information covers a variety of situations.
  • the encoding and decoding accelerator can determine the corresponding processing template and determine the processing method of the data to be processed.
  • the number of registered processing templates can also be reduced, thereby reducing memory consumption.
  • the codec accelerator also determines the memory overhead of each fixed-length field in each processing template.
  • the codec accelerator can determine the memory overhead of fixed-length fields in various ways. For example, if the memory overhead occupied by each fixed-length field member is directly indicated during configuration, the codec accelerator can obtain it directly. Or, if it is not specified during registration, the codec accelerator analyzes the members of each fixed-length field and calculates the memory overhead of each fixed-length field member. The codec accelerator can also determine the memory overhead based on other methods, which are not limited here. For example, the memory overhead of some fixed-length field members in a processing template is annotated during configuration, and the memory overhead of the remaining fixed-length field members is determined by the codec accelerator analyzing those members, thereby determining the memory overhead of each fixed-length field member in the processing template.
  • for processing template 1 shown in Figure 3, assume that among its five members, member 0, member 2 and member 3 are fixed-length fields and member 1 and member 4 are variable-length fields. The codec accelerator will then determine the memory overhead of member 0, member 2 and member 3.
  • after the codec accelerator determines the memory overhead of each fixed-length field, it determines the fixed-length memory overhead corresponding to each of the N pieces of data distribution information based on the memory overhead of each fixed-length field and the correspondence between the M processing templates and the N pieces of data distribution information.
  • in the embodiment shown in Figure 3, assume that the memory overheads corresponding to the fixed-length fields member 0, member 2 and member 3 in processing template 1 are 4 bytes, 8 bytes and 4 bytes respectively.
  • the codec accelerator can then determine that the fixed-length memory overheads corresponding to data distribution information 1, data distribution information 2 and data distribution information 3, all of which correspond to processing template 1, are 12 bytes, 0 bytes and 4 bytes respectively; that is, each piece of data distribution information is bound to its fixed-length memory overhead.
  • the encoding and decoding accelerator can directly determine the fixed-length memory overhead corresponding to the target data distribution information as the first memory overhead corresponding to the fixed-length field in the data to be processed. For example, assuming that the target data distribution information corresponding to the data to be processed is data distribution information 1 shown in Figure 3, the encoding and decoding accelerator can determine that the first memory overhead corresponding to the fixed-length field in the data to be processed is 12 bytes.
  • the data distribution information is bound to the fixed-length memory overhead, so that the encoding and decoding accelerator does not need to perform calculations during data processing, thus saving computing resources.
  • the fixed-length memory overhead corresponding to the data to be processed can be directly and quickly determined based on the target data distribution information, which also improves processing efficiency.
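  • A minimal sketch of this precomputation and binding (which members each distribution contains is an assumption chosen to reproduce the 12/0/4-byte figures above; Figure 3 itself is not reproduced here):

      #include <stdio.h>
      #include <stdbool.h>

      enum { MEMBERS = 5, DISTS = 3 };

      int main(void) {
          /* processing template 1: per-member fixed-length overhead in
           * bytes; 0 marks a variable-length member (members 1 and 4). */
          int fixed_size[MEMBERS] = { 4, 0, 8, 4, 0 };

          /* assumed member subsets of distributions 1..3 */
          bool dist[DISTS][MEMBERS] = {
              { true,  true, true,  false, false },  /* -> 4 + 8 = 12 */
              { false, true, false, false, true  },  /* -> 0          */
              { false, true, false, true,  true  },  /* -> 4          */
          };

          for (int d = 0; d < DISTS; d++) {
              int overhead = 0;
              for (int m = 0; m < MEMBERS; m++)
                  if (dist[d][m]) overhead += fixed_size[m];
              /* bind the result to the distribution info for later reuse */
              printf("data distribution information %d: %d bytes\n",
                     d + 1, overhead);
          }
          return 0;
      }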
  • in another implementation, after the codec accelerator obtains the target data distribution information and the target processing template corresponding to the data to be processed, it determines at least one fixed-length field corresponding to the data to be processed based on the target data distribution information and the target processing template, and then determines the first memory overhead corresponding to the fixed-length fields in the data to be processed based on the memory overhead of each of the at least one fixed-length field.
  • assume the target processing template is processing template 1 shown in Figure 3,
  • and the target data distribution information is data distribution information 3 shown in Figure 3,
  • where member 0, member 2 and member 3 are fixed-length fields with memory overheads of 4 bytes, 8 bytes and 4 bytes respectively, and the other members are variable-length fields.
  • the codec accelerator can then determine that member 0 and member 3 in the data to be processed are fixed-length fields, that is, it determines that the first memory overhead is 8 bytes.
  • according to the target data distribution information, the target processing template and the data to be processed, the second memory overhead corresponding to the variable-length fields in the data to be processed is calculated.
  • the target data distribution information indicates the member distribution of the data to be processed
  • the target processing template indicates the attribute information of the data to be processed.
  • the attribute information of the data to be processed includes the type, processing method and other information of each data member in the data to be processed.
  • variable-length fields need to be analyzed based on the specific content of the data to be processed to determine their corresponding memory overhead. Therefore, after the codec accelerator obtains the data to be processed, the target data distribution information and the target processing template, it queries the target processing template based on the target data distribution information to determine the type of each member in the data to be processed; for each member determined to be a variable-length field, it combines this with the data to be processed to determine that field's memory overhead, thereby determining the second memory overhead.
  • continuing the example in which the target processing template is processing template 1 shown in Figure 3 and the target data distribution information is data distribution information 3 shown in Figure 3, with the members other than the fixed-length fields being variable-length fields,
  • the codec accelerator determines the type of each member according to data distribution information 3 by querying processing template 1, determines that only member 1 is a variable-length field, and then, combined with the data to be processed, obtains the memory overhead corresponding to that variable-length field, that is, the second memory overhead.
  • the encoding and decoding accelerator will accurately calculate the second memory overhead corresponding to the variable length field based on the information of the data to be processed, so that the calculation results are accurate and the accuracy of the technical solution of this application is improved.
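  • An illustrative sketch of the variable-length pass (member 1 is assumed to be a length-prefixed string; the patent does not fix a concrete wire format):

      #include <stdio.h>
      #include <string.h>

      /* If member 1 serializes as a 4-byte length prefix followed by the
       * bytes themselves, its overhead depends on the actual content of
       * the data to be processed. */
      static size_t varlen_overhead(const char *member1) {
          return 4 + strlen(member1);  /* length prefix + payload */
      }

      int main(void) {
          const char *member1 = "hello, accelerator";
          printf("second memory overhead: %zu bytes\n",
                 varlen_overhead(member1));
          return 0;
      }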
  • according to the first memory overhead and the second memory overhead, the target memory space is applied for; the target memory space is used to store the output data obtained by encoding or decoding the data to be processed.
  • the codec accelerator can apply for the target memory space in a variety of ways: it can apply through the memory management accelerator external to the codec accelerator, or through other methods, for example through the memory management module inside the codec accelerator, which is not limited here.
  • optionally, in step 206, after the codec accelerator stores the output data into the target memory space, it also sends an interrupt request to the processor to inform the processor that the data processing is complete, so that the processor can obtain the output data from the memory.
  • the interrupt request can carry the address information of the target memory space.
  • the data processing system 400 includes a codec accelerator 401, a processor 402, a memory 403 and a memory management accelerator 404.
  • the data processing method provided by the embodiments of this application focuses on the design of the software and hardware interaction framework, and sets up interactive interfaces for different purposes between the codec accelerator, the processor, and the memory.
  • the codec accelerator 401 provides a registration interface to the processor 402; using the registration interface, the processor 402 can configure the interactive environment and the registration information before encoding/decoding is executed, thereby supporting the codec accelerator 401 in independently accessing the memory 403 in a non-invasive manner to obtain the data to be processed according to the registration information and the codec algorithm program specified by the registration information, achieving dynamic encoding and decoding.
  • the interactive environment includes input/output queue space, input/output space for registration information, etc.
  • the codec accelerator 401 also provides a write interface to the processor 402; the processor 402 can use the write interface to write the address information corresponding to the registration information and the address information corresponding to the description information of the task to be processed into the registers corresponding to the codec accelerator 401, for the codec accelerator 401 to subsequently access the memory 403 to obtain data.
  • the codec accelerator 401 also provides a trigger interface to the processor 402; the processor 402 can use the trigger interface to write the address information corresponding to the registration information or the address information corresponding to the description information of the task to be processed into the codec accelerator 401, triggering the codec accelerator 401 to start processing the registration information or the data to be processed corresponding to the task.
  • the codec accelerator 401 can also be provided with an interrupt notification interface. After processing of the registration information or of the data to be processed is completed, the codec accelerator 401 can send an interrupt request to the processor 402 through the interrupt notification interface to notify the processor 402 to obtain the registration receipt or the output data.
  • the codec accelerator 401 is also provided with a memory access interface, through which it can actively access the memory 403 to obtain the registration information, the description information of tasks to be processed, and so on. When the task to be processed is a batch task, the encoding/decoding output data and the output description information of each subtask in the batch task are written into the memory 403 after the batch encoding/decoding task is completed; when the task to be processed is a single task, the output data of the single task and the description information of the output data are written into the memory 403.
  • the codec accelerator 401 is also provided with an application memory interface.
  • based on the first memory overhead corresponding to the fixed-length fields, which is determined through preprocessing of the registration information, and the second memory overhead corresponding to the variable-length fields, which is determined during processing, the codec accelerator 401 accurately calculates the target memory space and applies for it to the memory management accelerator 404 through the memory application interface.
  • the data processing system may not include a memory management accelerator, and the memory management module inside the encoding and decoding accelerator implements the function of the memory management accelerator.
  • Figure 5 is a schematic diagram of the data processing method provided by the embodiment of the present application.
  • before encoding and decoding are performed, the registration information needs to be configured.
  • the processor (such as a central processing unit, CPU) needs to perform the following steps to complete the input of the registration information, which correspond to steps 501 and 502 in Figure 5.
  • the processor prepares the interactive environment and encapsulates the environment information and codec information as registration information.
  • preparing the interactive environment refers to preparing the software/hardware interactive environment, including the cache space Buffer1 in which the CPU stores registration input information, the cache space Buffer2 which stores batch encoding/decoding task description information, the cache space Buffer3 in which the Ser/DeSer accelerator stores registration output information, and the cache space Buffer4 which stores batch task output description information.
  • Buffer2 and Buffer4 may not be registered.
  • in the case where Buffer2 is not registered, when performing batch encoding/decoding tasks, the codec accelerator needs to be informed of the address information of the batch task description information each time it is notified to process the next batch.
  • in the case where Buffer4 is not registered, the output data can be stored directly at a specified address, and the batch task description information needs to carry the address information of the output data.
  • encapsulating the environment information means that, according to the registration information format given by the Ser/DeSer accelerator, the CPU writes the Buffer2 and Buffer4 cache space information (first address + length) into Buffer1.
  • Encapsulating the encoding and decoding information means that according to the registration information format given by the Ser/DeSer accelerator, the CPU needs to write the encoding and decoding algorithm information to be used (program entry, type, etc.) into Buffer1; according to the encoding format specified by the Ser/DeSer accelerator (such as JSON, etc.), encode the serialized input data structure Ser_InStruct, deserialized output data structure DeSer_OutStruct, serialization/deserialization mode description Schema, etc., and write their encoded data first address and length information into Buffer1.
  • the processing template in the registration information includes a serialized input data structure and/or a deserialized output data structure, and the data distribution information includes a serialization/deserialization mode description.
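  • A rough sketch of what the CPU might place in Buffer1 (field names and ordering are assumptions; the patent fixes the content of the registration information, not a binary layout):

      #include <stdint.h>

      /* (first address, length) pair used throughout the format */
      typedef struct { uint64_t addr; uint64_t len; } Region;

      /* Illustrative Buffer1 contents: environment info plus codec info */
      typedef struct {
          Region   buffer2;           /* batch task description space    */
          Region   buffer4;           /* batch output description space  */
          uint32_t algo_type;         /* codec algorithm type            */
          uint64_t algo_entry;        /* codec algorithm program entry   */
          Region   ser_in_struct;     /* encoded Ser_InStruct            */
          Region   deser_out_struct;  /* encoded DeSer_OutStruct         */
          Region   schema;            /* encoded Schema                  */
      } RegistrationInput;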
  • the processor triggers the codec accelerator to obtain registration information.
  • the CPU can call the Ser/DeSer accelerator interface and write the above registration information (first address and length) to the corresponding accelerator register to trigger the accelerator to process the registration information.
  • on the Ser/DeSer accelerator side, after the registration information is received and the accelerator is triggered by the CPU, the rule registration module is started and the following steps are performed to complete the information registration, corresponding to steps 503 to 505 in Figure 5.
  • the codec accelerator accesses buffer1 to obtain registration information.
  • the codec accelerator accesses Buffer1 to obtain the registration information and performs the following parsing:
  • write the Buffer2 and Buffer4 cache space information (first address + length) into the corresponding registers, then preprocess the encoded files such as Ser_InStruct, DeSer_OutStruct and Schema, and calculate the decoded space size.
  • the registration information obtained by the codec accelerator by accessing buffer1 is the initial registration information configured by the processor or terminal device, and the decoded file obtained by preprocessing is the registration information used in subsequent processing.
  • the codec accelerator applies for memory.
  • after the codec accelerator calculates the decoded space size, it applies for memory space through the memory application module.
  • this memory space stores the decoded registration information. The first address and length of the decoded data of Ser_InStruct, DeSer_OutStruct and Schema are then stored in the accelerator's internal mapping table, that is, the first address information is stored in the internal mapping table.
  • the codec accelerator also starts the storage calculation module to combine Ser_InStruct with Schema and DeSer_OutStruct with Schema respectively, calculates the fixed memory overhead (such as metadata fields, fixed-length fields, etc.) incurred when serializing and deserializing based on them, and caches the results.
  • the codec accelerator encapsulates the registration information receipt and writes it to buffer3.
  • the Ser/DeSer accelerator encapsulates the registration parsing results (such as the Schema parsing results, mapping table serial numbers, etc.) and either writes them directly into Buffer3, or uses the memory application module to apply for registration receipt space to store them and then writes the first address and length of the receipt space into Buffer3.
  • after writing to Buffer3, the accelerator can notify the CPU based on an interrupt to obtain the registration receipt, or wait for the CPU to obtain it actively by polling or scheduled queries.
  • the registration receipt fed back by the codec accelerator has multiple functions. On the one hand, it informs the CPU that registration is complete, so the CPU can release the relevant memory and reduce memory usage. On the other hand, the CPU can check the serial number carried in the registration receipt: if the serial number is ≥ 0, the registration is considered successful; if it is less than 0, the registration has failed and the client needs to register again, which improves the success rate of registration.
  • after registration is completed, the Ser/DeSer accelerator can be used for dynamic encoding and decoding.
  • to do so, the CPU software side creates a task and performs the following steps, corresponding to steps 506 and 507 in Figure 5.
  • the processor creates a task and writes to buffer2.
  • the CPU encapsulates the task information, that is, N pieces of subtask description information (each including the encoding/decoding algorithm type used by the task, the Ser_InStruct serial number, the DeSer_OutStruct serial number, the Schema serial number, and the first address and length of the data to be encoded/decoded), into the batch task description information Work Metadata (including the number of batch tasks, the subtask description information list, etc.), and writes it to Buffer2, as sketched below.
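  • A hedged sketch of this metadata (the field set follows the description above; names and layout are illustrative):

      #include <stdint.h>

      /* One subtask: which algorithm and registered entries to use, and
       * where the input data lives. */
      typedef struct {
          uint32_t algo_type;      /* encoding/decoding algorithm type  */
          uint32_t ser_in_seq;     /* Ser_InStruct serial number        */
          uint32_t deser_out_seq;  /* DeSer_OutStruct serial number     */
          uint32_t schema_seq;     /* Schema serial number              */
          uint64_t data_addr;      /* first address of data to process  */
          uint64_t data_len;       /* length of the data to process     */
      } SubtaskDesc;

      /* Batch task description (Work Metadata) written to Buffer2. */
      typedef struct {
          uint32_t    task_count;  /* number of subtasks in the batch   */
          SubtaskDesc subtasks[];  /* subtask description list          */
      } WorkMetadata;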
  • the processor triggers the encoding and decoding accelerator to perform encoding and decoding.
  • the processor can call the Ser/DeSer accelerator interface and write the first address and length of the batch task description information Work Metadata into the corresponding accelerator register to trigger the accelerator to process the task.
  • after the codec accelerator obtains the first address and length of the batch task description information Work Metadata, it starts the data acquisition module, accesses Buffer2, and obtains the Work Metadata. It then parses the Work Metadata and caches the description information of each subtask in the accelerator's internal receiving queue.
  • the encoding and decoding accelerator accesses memory to obtain the target processing template and target data distribution information.
  • the dynamic rule loading module schedules the multiple subtasks in sequence, parses the description information of each subtask, queries the codec accelerator's local mapping table, loads the corresponding encoding/decoding algorithm, Ser_InStruct, DeSer_OutStruct, Schema and other information to the execution units in the core processing module, and accesses the codec accelerator's local cache.
  • each execution unit in the core processing module obtains the encoding and decoding input data from the memory according to the address of the data to be encoded/decoded in the subtask description information, that is, the data to be processed.
  • the codec accelerator applies for memory.
  • for an encoding task, the copy distribution module notifies the core processing module to apply for the output memory required for copying and calls this module to copy the output data after encoding is completed; for a decoding task, the copy distribution module notifies the core processing module to perform the corresponding memory application and copying during the decoding process.
  • specifically, the storage calculation module is called to calculate the output memory overhead, that is, the target memory space, based on the fixed memory overhead cached during the registration phase, and the memory application module is used to apply for the target memory space.
  • the encoding and decoding accelerator will also use the specified algorithm to encode and decode, obtain the output data, and then store the output data into the target memory space.
  • the encoding and decoding accelerator encapsulates the output data description information and writes it into buffer4.
  • the output notification module encapsulates the output description information of each subtask (such as the DeSer_OutStruct serial number and the output data address and length) into the batch task output description information Output Metadata (including the number of batch tasks, the subtask description information list, etc.), writes it to Buffer4, and can notify the CPU based on an interrupt to obtain the Output Metadata, or wait for the CPU to obtain it actively by polling or scheduled queries.
  • Figure 6 is a schematic structural diagram of a codec accelerator 600 provided by an embodiment of the present application, including:
  • the processing unit 601 is used to parse the data to be processed and obtain the target template serial number and target data distribution serial number corresponding to the data to be processed.
  • the obtaining unit 602 is used to obtain the target processing template corresponding to the data to be processed according to the target template serial number.
  • the obtaining unit 602 is also used to obtain the target data distribution information corresponding to the data to be processed according to the target data distribution serial number.
  • the processing unit 601 is also configured to: determine, based on the target data distribution information, the first memory overhead corresponding to the fixed-length fields in the data to be processed; calculate, based on the target data distribution information, the target processing template and the data to be processed, the second memory overhead corresponding to the variable-length fields in the data to be processed; and apply, according to the first memory overhead and the second memory overhead, for a target memory space used to store the output data obtained by encoding or decoding the data to be processed.
  • the acquisition unit 602 is also configured to: acquire first address information, where the first address information indicates the storage location of registration information, the registration information includes M processing templates and N pieces of data distribution information, each of the M processing templates corresponds to at least one piece of data distribution information, and M and N are both positive integers; and obtain the M processing templates and the N pieces of data distribution information according to the first address information.
  • the processing unit 601 is also used to determine the memory overhead of each fixed-length field in each processing template, and to determine, based on the memory overhead of each fixed-length field and the correspondence between the M processing templates and the N pieces of data distribution information, the fixed-length memory overhead corresponding to each of the N pieces of data distribution information.
  • the target data distribution information is included in the N pieces of data distribution information.
  • the processing unit 601 is specifically used to determine the fixed-length memory overhead corresponding to the target data distribution information as the first memory overhead.
  • the obtaining unit 602 is also used to obtain first address information, where the first address information indicates the storage location of registration information, the registration information includes M processing templates, and M is a positive integer; and to obtain the M processing templates according to the first address information.
  • the processing unit 601 is also used to determine the memory overhead of each fixed-length field included in each of the M processing templates.
  • the processing unit 601 is specifically configured to determine at least one fixed-length field corresponding to the data to be processed according to the target data distribution information and the target processing template, and to determine the first memory overhead according to the memory overhead of each of the at least one fixed-length field.
  • the target data distribution information indicates the member distribution of the data to be processed, and the target processing template indicates the attribute information of the data to be processed.
  • the processing unit 601 is specifically configured to determine the variable-length fields included in the data to be processed based on the member distribution of the data to be processed, the attribute information of the data to be processed, and the data to be processed; and calculate the second memory overhead corresponding to the variable-length fields.
  • the obtaining unit 602 is also used to obtain M pieces of address information and M template serial numbers corresponding to the M processing templates.
  • the processing unit 601 is also used to establish a mapping relationship between the M pieces of address information and the M template serial numbers to obtain a first mapping table.
  • the acquisition unit 602 is specifically configured to determine the second address information corresponding to the target processing template from the first mapping table according to the target template serial number, and to obtain the target processing template from memory according to the second address information.
  • the obtaining unit 602 is also used to obtain N pieces of address information and N data distribution serial numbers corresponding to the N pieces of data distribution information.
  • the processing unit 601 is also used to establish a mapping relationship between the N pieces of address information and the N data distribution serial numbers to obtain a second mapping table.
  • the acquisition unit 602 is specifically configured to determine the third address information corresponding to the target data distribution information from the second mapping table according to the target data distribution serial number, and to obtain the target data distribution information from memory according to the third address information.
  • the processing unit 601 is specifically configured to apply for a target memory space through a memory management accelerator or a memory management module.
  • the codec accelerator 600 can perform the operations performed by the codec accelerator in the embodiments shown in Figures 1a to 5, which will not be described again here.
  • Figure 7 is another schematic structural diagram of a codec accelerator provided by an embodiment of the present application.
  • the codec accelerator 700 includes a rule registration module, a data acquisition module, a dynamic rule loading module, a core processing module, a copy distribution module, a storage calculation module, a storage application module, and an output notification module.
  • the rule registration module is used to: access memory to obtain the interactive environment, codec algorithms, and other configuration information registered by the CPU with the accelerator; parse and store this information into the applied memory space; and cache the memory addresses of the processing templates and of the data distribution information included in the registration information into the accelerator's local mapping tables for quick query and retrieval during encoding and decoding.
  • the data acquisition module is used to: based on the start address and length of the batch task description information written by the CPU into the accelerator registers, access the corresponding memory space of the registered interactive environment, obtain and parse the batch task description information, and cache each subtask description information entry into the local receive queue to await scheduling and processing by the accelerator.
  • the dynamic rule loading module is used to: schedule the multiple subtasks in the accelerator's local receive queue, parse the description information of each subtask, query the accelerator's local mapping tables, and load the required algorithms, data structures, templates, and other codec information into the execution units of the core processing module.
  • the core processing module is used to: have each execution unit fetch data from memory according to the address of the data to be encoded/decoded in the subtask description information and the loaded subtask codec information; call the storage calculation module to accurately calculate the output memory overhead and call the memory application module to apply for the output memory; and encode or decode with the loaded algorithm and store the result into the output space.
  • the storage calculation module is used to: during registration processing, calculate and cache, based on the registered codec information, the memory overhead of the fixed parts of the output data (including metadata, fixed-length fields, etc.), which reduces subsequent codec memory overhead calculation; during codec processing, query the locally cached fixed-part output overhead according to the subtask description information, accurately calculate the memory overhead of the variable-length fields in the output data based on the subtask's data to be encoded or decoded, and combine the two to determine the precise output memory requirement. Serialization is performed in the cache, which also avoids writing to memory multiple times and the resulting performance loss.
  • the storage application module is used to: when parsing the codec information during registration processing, or producing codec output during codec processing, interact with the external memory management accelerator to apply for output memory based on the memory overhead calculated by the storage calculation module.
  • the copy distribution module is used to: according to the subtask description information, if copying is required, notify the core processing module for an encoding task, use the storage calculation module to apply for the output memory required for copying, and perform the output data copy itself after encoding is completed; for a decoding task, notify the core processing module to perform the corresponding memory application and copying during decoding.
  • the output notification module is used to: when the batch task is completed, encapsulate the output description information of each subtask into the batch task output description information, write it into the corresponding memory space of the registered interactive environment, and send an interrupt to notify the CPU to fetch the output data, or wait for the CPU to obtain it actively by polling or scheduled queries.
  • the codec accelerator 700 can perform the operations performed by the codec accelerator in the embodiments shown in Figures 1a to 6, which will not be described again here.
  • the disclosed systems, devices and methods can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division; in actual implementation there may be other division methods, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the couplings or direct couplings or communication connections shown or discussed may be implemented through some interfaces, and the indirect couplings or communication connections between devices or units may be in electrical, mechanical, or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or they may be distributed to multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application can be integrated into one processing unit, each unit can exist physically alone, or two or more units can be integrated into one unit.
  • the above integrated units can be implemented in the form of hardware or software functional units.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium.
  • based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application.
  • the aforementioned storage media include media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

Embodiments of the present application disclose a data processing method, a codec accelerator, and related devices, which are used to reduce the computational burden and save computing resources. The method in the embodiments of the present application includes: parsing data to be processed to obtain a target template serial number and a target data distribution serial number corresponding to the data to be processed; determining, according to the target template serial number, a target processing template corresponding to the data to be processed; determining, according to the target data distribution serial number, target data distribution information corresponding to the data to be processed; determining, according to the target data distribution information, a first memory overhead corresponding to the fixed-length fields in the data to be processed; calculating, according to the target data distribution information, the target processing template, and the data to be processed, a second memory overhead corresponding to the variable-length fields in the data to be processed; and applying, according to the first memory overhead and the second memory overhead, for a target memory space used to store the output data obtained by encoding or decoding the data to be processed.

Description

一种数据处理方法、编解码加速器和相关设备
本申请要求于2022年03月28日提交中国国家知识产权局、申请号为CN202210312378.X、发明名称为“一种数据处理方法、编解码加速器和相关设备”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请实施例涉及计算机领域,尤其涉及一种数据处理方法、编解码加速器和相关设备。
背景技术
在计算机科学中,编码和解码发挥着重要作用,在计算机通信、存储等场景中,序列化和反序列化作为编解码的一种具体实现,得到了广泛的应用。其中,序列化是指当数据需要传输或者存储时,将内存中分散的数据结构转换为按照特定方式排布的、连续的字节流的过程,而反序列化的功能则与之相反。
发明内容
在一种数据处理方法中,采用多次分配扩容的内存分配方式,进行序列化或者反序列化。具体来说,预先分配一块内存缓冲区用于进行序列化或者反序列化,在执行过程中,如果出现内存不足的情况,会重新申请更大的内存缓冲区,并将原内存缓冲区中的数据集中拷贝到该更大的内存缓冲区中,再释放原内存缓冲区。
在这种方法中,可能会出现多次申请内存并再次调用编解码加速器进行序列化或者反序列化,增加了运算负担,浪费了算力资源。
本申请实施例提供了一种数据处理方法、编解码加速器和相关设备。在这种数据处理方法中,对待处理数据进行解析,得到目标模板序号和目标数据分布序号,根据目标数据分布序号对应的目标数据分布信息,确定待处理数据中定长字段对应的第一内存开销;根据目标模板序号对应的目标处理模板和目标数据分布信息,确定待处理数据中变长字段对应的第二内存开销。最后根据第一内存开销和第二内存开销,确定输出数据的目标内存空间。这样能够精确计算出输出数据所需要的内存空间,进行一次内存申请即可,降低了运算负担,节约了算力资源。
本申请实施例第一方面提供了一种数据处理方法,包括:
编解码加速器获取到待处理数据之后,会对待处理数据进行解析,得到待处理数据所对应的目标模板序号和目标数据分布序号。根据目标模板序号,能够获取到待处理数据对应的目标处理模板,该目标处理模板指示了待处理数据的属性信息,包括待处理数据的类型、定义等信息。根据目标数据分布信息,能够获取待处理数据对应的目标数据分布信息,该目标数据分布信息指示的是待处理数据的数据分布情况,也即待处理数据的结构体,简单来说,即为待处理数据中各部分数据的排布情况。目标数据分布信息还可以绑定该信息 所对应的定长字段的内存开销,因此根据目标数据分布信息,能够确定待处理数据中定长字段对应的第一内存开销。在得到第一内存开销和第二内存开销之后,编解码加速器根据这两个内存开销能够计算出用于存储编解码待处理数据之后得到的输出数据的目标内存空间的大小,并申请目标内存空间。
从以上技术方案可以看出,本申请实施例具有以下优点:
根据目标数据分布序号对应的目标数据分布信息,确定待处理数据中定长字段对应的第一内存开销;根据目标模板序号对应的目标处理模板和目标数据分布信息,确定待处理数据中变长字段对应的第二内存开销。最后根据第一内存开销和第二内存开销,确定输出数据的目标内存空间。这样能够精确计算出输出数据所需要的内存空间,进行一次内存申请即可,降低了运算负担,节约了算力资源。
在第一方面一些可选的实施例中,在解析待处理数据之前,编解码加速器能够获取到指示注册信息的存储位置的第一地址信息,并根据第一地址信息获取到初始注册信息。注册信息包括M个处理模板和N个数据分布信息,M和N均为正整数。编解码加速器会建立这M个处理模板与N个数据分布信息之间的对应关系,使得一个处理模板对应至少一个数据分布信息。编解码加速器还可以基于多种方式,确定每个处理模板中每个定长字段的内存开销。具体来说,在实际应用中,编解码加速器可以对处理模板中的成员进行分析,并计算各个模板中每个定长字段的内存开销,由此确定每个处理模板中每个定长字段的内存开销。也可以由处理器或者终端设备在注册处理模板时,直接注明每个定长字段的内存开销,使得编解码加速器可以直接获取处理模板中每个定长字段的内存开销。除此之外,还可以通过其他的方式确定每个处理模板中各个定长字段的内存开销,具体此处不做限定。例如,在配置过程中,注明成员数量少的处理模板中每个定长字段的内存开销,其他未注明内存开销的定长字段由编解码加速器在获取到处理模板之后进行计算,由此获取到每个处理模板中每个定长字段的内存开销。编解码加速器根据每个定长字段的内存开销和M个处理模板与N个数据分布信息之间的对应关系,能够确定出这N个数据分布信息中每个数据分布信息所对应的定长字段的内存开销。
本申请实施例中,建立注册信息所包括的M个处理模板和N个数据分布信息之间的对应关系,由于一个处理模板中可以包括多种类型的成员,因此一个处理模板可以对应多个数据分布信息,这为本申请实施例实现以非侵入式的方式进行序列化和/或反序列化提供了技术支持,也即提升了技术方案的可实现性。其中,非侵入式是指不需要使用指定的输入数据结构,便可以实现序列化和/或反序列化功能。这是因为,注册信息中的处理模板囊括了多种情况,编解码加速器根据待处理数据的数据分布信息,便能确定与之对应的处理模板,也就确定了待处理数据的处理方式。同时,如果一个处理模板可以对应多个数据分布信息,也可以减少注册的处理模板的数量,从而降低了内存消耗。另外,根据数据分布信息与处理模板之间的对应关系,和每个处理模板中每个定长字段的内存开销,能够确定的每个数据分布信息所对应的定长内存开销,也即将数据分布信息与定长内存开销进行绑定,使得编解码加速器在后续的处理过程中,能够根据待处理数据的目标数据分布信息快速确定出待处理数据对应的定长内存开销,提升了处理效率。
在第一方面一些可选的实施例中,待处理数据对应的目标数据分布信息包含于N个数据分布信息中。在数据分布信息与定长内存开销绑定的情况下,编解码加速器可以直接将目标数据分布信息对应的定长内存开销,确定为待处理数据中定长字段对应的第一内存开销。
本申请实施例中,将数据分布信息与定长内存开销进行绑定,使得编解码加速器在数据处理过程中,不需要再进行计算,节约了运算资源。同时,够根据目标数据分布信息直接快速确定出待处理数据对应的定长内存开销,也提升了处理效率。
在第一方面一些可选的实施例中,在解析待处理数据之前,编解码加速器能够获取指示注册信息的存储位置的第一地址信息,并根据第一地址信息获取注册信息。该注册信息包括M个处理模板,M为正整数。然后确定这M个处理模板中每个处理模板所包括的每个定长字段的内存开销。编解码加速器确定每个定长字段的内存开销的方式,在上文已经介绍过,此处不再赘述。
在第一方面一些可选的实施例中,如果编解码加速器在解析待处理数据之前,仅仅确定了每个处理模板中每个定长字段的内存开销,那么在确定待处理数据中定长字段对应的第一内存开销时,编解码加速器需要根据目标数据分布信息和目标处理模板,确定待处理数据中包括的至少一个定长字段,然后根据至少一个定长字段中每个定长字段的内存开销,确定第一内存开销。
本申请实施例中,确定第一内存开销的方式有多种,可以根据实际应用的需要选择,提升了本申请技术方案的灵活性。
在第一方面一些可选的实施例中,目标数据分布信息指示待处理数据的成员分布情况,目标处理模板指示待处理数据的属性信息。其中,待处理数据的属性信息包括待处理数据中各个数据成员的类型、处理方式等信息。编解码加速器根据待处理数据的分布情况、待处理数据的属性信息和待处理数据,能够确定待处理数据包括的变长字段,从而计算出变长字段对应的第二内存开销。
本申请实施例中,在处理过程中,编解码加速器会结合待处理数据自身的信息精确计算变长字段对应的第二内存开销,使得计算结果准确,提升了本申请技术方案的准确度。
在第一方面一些可选的实施例中,在解析待处理数据之前,编解码加速器会获取M个处理模板对应的M个地址信息和M个模板序号,并建立这M个地址信息和M个模板序号之间的映射关系,得到第一映射表。编解码加速器在解析待处理数据得到目标模板序号之后,就可以根据目标模板序号查询第一映射表,得到目标处理模板对应的第二地址信息。然后根据第二地址信息,从内存中获取目标处理模板。
本申请实施例中,在编解码加速器中存储指示地址信息和模板序号之间映射关系的第一映射表,通过查询第一映射表的方式,从内存中获取目标处理模板,并不在编解码加速器本地存储各个处理模板,减少了本地资源的占用。
在第一方面一些可选的实施例中,在解析待处理数据之前,编解码加速器还会获取N个数据分布信息对应的N个地址信息和N个数据分布序号,并建立这N个地址信息与N个数据分布序号之间的映射关系,得到第二映射表。编解码加速器在解析待处理数据得到目 标数据分布序号之后,就可以根据目标数据分布序号查询第二映射表,确定目标数据分布序号对应的第三地址信息。然后根据第三地址信息,从内存中获取目标数据分布信息。
本申请实施例中,在编解码加速器中存储指示地址信息和数据分布序号之间映射关系的第二映射表,通过查询第二映射表的方式,从内存中获取目标数据分布信息,并不在编解码加速器本地存储各个数据分布信息,减少了本地资源的占用。
在第一方面一些可选的实施例中,编解码加速器可以通过多种方式申请目标内存空间,可以通过编解码加速器外部的内存管理加速器,申请目标内存空间。除此之外,还可以通过其他的方式申请目标内存空间,例如,通过编解码加速器内部的内存管理模块,申请目标内存空间,具体此处不做限定。
本申请实施例中,编解码加速器申请目标内存空间的方式有多种,可以适应不同的情况,提升了本申请技术方案的灵活性。
A second aspect of the embodiments of the present application provides a codec accelerator, including:
a processing unit, configured to parse data to be processed to obtain a target template serial number and a target data distribution serial number corresponding to the data to be processed;
an acquisition unit, configured to obtain, according to the target template serial number, a target processing template corresponding to the data to be processed;
the acquisition unit being further configured to obtain, according to the target data distribution serial number, target data distribution information corresponding to the data to be processed;
the processing unit being further configured to:
determine, according to the target data distribution information, a first memory overhead corresponding to the fixed-length fields in the data to be processed;
calculate, according to the target data distribution information, the target processing template, and the data to be processed, a second memory overhead corresponding to the variable-length fields in the data to be processed;
apply, according to the first memory overhead and the second memory overhead, for a target memory space used to store the output data obtained by encoding or decoding the data to be processed.
The codec accelerator is configured to perform the method described in the first aspect; its beneficial effects are as described for the first aspect and are not repeated here.
A third aspect of the embodiments of the present application provides a data processing system, which includes a codec accelerator configured to perform the method described in the first aspect; its beneficial effects are as described for the first aspect and are not repeated here.
A fourth aspect of the embodiments of the present application provides a computer-readable storage medium storing a program; when a computer executes the program, the method of the first aspect is performed.
A fifth aspect of the embodiments of the present application provides a computer program product; when the computer program product is executed on a computer, the computer performs the method of the first aspect.
Brief Description of the Drawings
Figure 1a is a schematic diagram of a system architecture to which the data processing method provided by an embodiment of the present application is applied;
Figure 1b is a schematic diagram of another system architecture to which the data processing method provided by an embodiment of the present application is applied;
Figure 2 is a schematic flowchart of the data processing method provided by an embodiment of the present application;
Figure 3 is a schematic diagram of a correspondence between processing templates and data distribution information provided by an embodiment of the present application;
Figure 4 is a schematic structural diagram of a data processing system provided by an embodiment of the present application;
Figure 5 is a schematic diagram of the data processing method provided by an embodiment of the present application;
Figure 6 is a schematic structural diagram of a codec accelerator provided by an embodiment of the present application;
Figure 7 is another schematic structural diagram of a codec accelerator provided by an embodiment of the present application.
Detailed Description
Embodiments of the present application provide a data processing method, a codec accelerator, and related devices. In this data processing method, the data to be processed is parsed to obtain a target template serial number and a target data distribution serial number; the first memory overhead corresponding to the fixed-length fields in the data to be processed is determined according to the target data distribution information corresponding to the target data distribution serial number; and the second memory overhead corresponding to the variable-length fields in the data to be processed is determined according to the target processing template corresponding to the target template serial number and the target data distribution information. Finally, the target memory space for the output data is determined according to the first memory overhead and the second memory overhead. In this way, the memory space required by the output data can be calculated precisely and a single memory application suffices, which reduces the computational burden and saves computing resources.
The embodiments of the present application are described below with reference to the accompanying drawings. Those of ordinary skill in the art will appreciate that, as technology develops and new scenarios emerge, the technical solutions provided by the embodiments of the present application are equally applicable to similar technical problems.
The terms "first", "second", etc. in the specification, claims, and drawings of this application are used to distinguish similar objects and are not necessarily used to describe a specific order or sequence. It should be understood that terms used in this way are interchangeable where appropriate; this is merely the way objects with the same attributes are distinguished when describing the embodiments of this application. In addition, the terms "include" and "have" and any variations thereof are intended to cover non-exclusive inclusion, so that a process, method, system, product, or device that includes a series of units is not necessarily limited to those units, but may include other units that are not clearly listed or that are inherent to such processes, methods, products, or devices. In addition, "at least one" means one or more, and "multiple" means two or more. "And/or" describes the association relationship of associated objects and indicates that three relationships may exist; for example, "A and/or B" may indicate the cases where only A exists, both A and B exist, or only B exists, where A and B may be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects before and after it. "At least one of the following" or similar expressions refers to any combination of these items, including any combination of a single item or plural items. For example, at least one of a, b, or c may represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may each be single or multiple.
First, the proper terms and related concepts that may be involved in the embodiments of the present application are explained.
1. Metadata.
Metadata, also known as intermediary data or relay data, is data that describes data. It is mainly used to describe the attribute information of data, to support functions such as indicating storage locations, historical data, resource lookup, and file records. In data structure-related applications, metadata can describe the name, relationships, and so on of the data.
2. Intrusive and non-intrusive serialization/deserialization.
Traditional serialization/deserialization is intrusive, that is, the original data must be moved and copied using a specified input data structure before serialization/deserialization can be implemented, which is quite restrictive. The non-intrusive approach is the opposite. The data processing method provided by the embodiments of the present application registers multiple processing templates and data distribution types at configuration time, can flexibly adapt to different situations, and supports serialization and deserialization in a non-intrusive manner.
Next, please refer to Figures 1a and 1b, which are schematic diagrams of the system architectures to which the data processing method provided by the embodiments of the present application is applied.
Taking configuration of the registration information by the processor as an example, as shown in Figure 1a, the processor 102 configures the registration information, stores it in the memory 103, and sends the address information of the registration information to the codec accelerator 101, so that the codec accelerator 101 obtains the registration information from the memory 103 according to the address information, which lays the foundation for dynamic encoding and decoding. After obtaining the data to be processed, the codec accelerator 101 accesses the memory 103 according to the description information of the data to be processed, obtains the registration information corresponding to that data, and processes the data based on the registration information to obtain the output data. During processing, the codec accelerator 101 can determine, from the registration information corresponding to the data to be processed and the data itself, the memory overhead of the fixed-length fields and the variable-length fields, thereby obtaining the target memory space required by the output data, and applies to the memory 103 for that target memory space.
Optionally, as shown in Figure 1a, the codec accelerator 101 can apply to the memory 103 directly for the target memory space; specifically, the memory management module in the codec accelerator 101 can apply to the memory 103. In practical applications, the codec accelerator 101 can also apply for the target memory space in other ways; for example, as shown in Figure 1b, the codec accelerator 101 applies to the memory management accelerator 104 for the target memory space, and the memory management accelerator 104 then applies to the memory 103.
Next, please refer to Figure 2, which is a schematic flowchart of the data processing method provided by an embodiment of the present application, including the following steps:
201. Parse the data to be processed to obtain the target template serial number and the target data distribution serial number corresponding to the data to be processed.
The processor can trigger the codec accelerator to perform data processing in multiple ways. Specifically, the processor can send a data packet to the codec accelerator according to an agreed protocol; the codec accelerator parses the data packet, obtains the data to be processed, and starts processing it. The processor can also trigger the codec accelerator in other ways, which are not limited here. For example, the processor calls the interface of the codec accelerator and, based on the agreed protocol, writes the address information (including the start address and length) of the description information of the task to be processed into the corresponding register of the codec accelerator, thereby triggering the codec accelerator to process the task. The description information of the task to be processed describes the data to be processed; the codec accelerator parses this description information to obtain the data to be processed.
After obtaining the data to be processed, the codec accelerator parses it to obtain the target template serial number and the target data distribution serial number. The target template serial number indicates the target processing template corresponding to the data to be processed, and the target data distribution serial number indicates the target data distribution information corresponding to the data to be processed.
202. Obtain, according to the target template serial number, the target processing template corresponding to the data to be processed.
Before parsing the data to be processed to obtain the target template serial number, the codec accelerator obtains the registration information, which includes M processing templates, where M is a positive integer. Specifically, after the processor or terminal device configures the initial registration information, it stores the address information of the initial registration information into a register of the codec accelerator; the initial registration information is an encoded file. The codec accelerator preprocesses the initial registration information, calculates the memory space required by the decoded registration information, and applies for that memory space. The codec accelerator then parses the encoded file, stores the decoded registration information into the applied memory space, and stores the first address information (including the start address and length) corresponding to the decoded registration information into its internal mapping table. Preprocessing the initial registration information converts it into a format the codec accelerator can understand, which facilitates subsequent processing.
In the subsequent processing, the codec accelerator obtains the registration information from memory according to the first address information and configures the first mapping table for the M processing templates in the registration information. Specifically, the codec accelerator sets M template serial numbers for the M processing templates, obtains the M pieces of address information corresponding to them, and then establishes the mapping relationship between the M pieces of address information and the M template serial numbers to obtain the first mapping table. The address information includes a start address and a length.
Illustratively, the first mapping table may be as shown in Table 1:

Table 1
Template serial number    Address information
0                         0x10002000
...                       ...

Illustratively, the address information 0x10002000 in Table 1 indicates that the start address of processing template 0 is 0x10002000, where the start address is a memory address in hexadecimal. It can be understood that, after the original registration information has been decoded by the codec accelerator, the codec accelerator has defined the length of each processing template in the decoded registration information; therefore, the first mapping table does not need to store the length of each processing template, and only needs to store the mapping relationship between template serial numbers and address information.
Once the first mapping table has been established, after parsing the data to be processed to obtain the target template serial number, the codec accelerator can query the first mapping table with the target template serial number to obtain the second address information corresponding to the target processing template, and then obtain the target processing template from memory according to the second address information.
In the embodiments of the present application, the first mapping table indicating the mapping relationship between address information and template serial numbers is stored in the codec accelerator, and the target processing template is obtained from memory by querying the first mapping table; the processing templates themselves are not stored locally in the codec accelerator, which reduces the occupation of local resources.
203. Obtain, according to the target data distribution serial number, the target data distribution information corresponding to the data to be processed.
The registration information obtained by the codec accelerator also includes N pieces of data distribution information, where N is a positive integer. The codec accelerator configures a second mapping table for these N pieces of data distribution information. Specifically, the codec accelerator sets N data distribution serial numbers for the N pieces of data distribution information, obtains the N pieces of address information corresponding to them, and then establishes the mapping relationship between the N pieces of address information and the N data distribution serial numbers to obtain the second mapping table.
Illustratively, the second mapping table may be as shown in Table 2:

Table 2
Data distribution serial number    Address information
0                                  0x08049324
...                                ...

Illustratively, the address information 0x08049324 in Table 2 indicates that the start address of data distribution information 0 is 0x08049324, where the start address is a memory address in hexadecimal. It can be understood that, after the original registration information has been decoded by the codec accelerator, the codec accelerator has defined the length of each piece of data distribution information in the decoded registration information; therefore, the second mapping table does not need to store the length of each piece of data distribution information, and only needs to store the mapping relationship between data distribution serial numbers and address information.
Note that Tables 1 and 2 are only examples of the mapping relationship between template serial numbers and address information and between data distribution serial numbers and address information; they do not imply that such a table necessarily exists in practical applications.
Once the second mapping table has been established, after parsing the data to be processed to obtain the target data distribution serial number, the codec accelerator can query the second mapping table with the target data distribution serial number to obtain the third address information corresponding to the target data distribution information, and then obtain the target data distribution information from memory according to the third address information.
In the embodiments of the present application, the second mapping table indicating the mapping relationship between address information and data distribution serial numbers is stored in the codec accelerator, and the target data distribution information is obtained from memory by querying the second mapping table; the data distribution information itself is not stored locally in the codec accelerator, which reduces the occupation of local resources.
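For illustration only, the following is a minimal sketch of how such accelerator-local mapping tables (steps 202 and 203) might be kept and queried. The class and method names are hypothetical and not defined by this application; the two registered rows reproduce the examples in Tables 1 and 2:

```python
# Hypothetical sketch of the two local mapping tables in steps 202/203.
class MappingTables:
    def __init__(self):
        self.template_addr = {}      # template serial number -> start address
        self.distribution_addr = {}  # data distribution serial number -> start address

    def register_template(self, serial: int, addr: int):
        self.template_addr[serial] = addr

    def register_distribution(self, serial: int, addr: int):
        self.distribution_addr[serial] = addr

    def lookup_template(self, serial: int) -> int:
        # "second address information": where the target processing template lives
        return self.template_addr[serial]

    def lookup_distribution(self, serial: int) -> int:
        # "third address information": where the target data distribution info lives
        return self.distribution_addr[serial]

tables = MappingTables()
tables.register_template(0, 0x10002000)      # row shown in Table 1
tables.register_distribution(0, 0x08049324)  # row shown in Table 2
assert tables.lookup_template(0) == 0x10002000
assert tables.lookup_distribution(0) == 0x08049324
```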
204. Determine, according to the target data distribution information, the first memory overhead corresponding to the fixed-length fields in the data to be processed.
The codec accelerator can determine the first memory overhead of the fixed-length fields in different ways; the different cases are described below.
1) The data distribution information is bound to the fixed-length memory overhead.
The codec accelerator obtains the registration information from memory according to the first address information; the registration information includes M processing templates and N pieces of data distribution information, where M and N are both positive integers. A processing template indicates the attribute information of the data to be processed, including the type, name, length, annotations, and other information of each member. Data distribution information indicates the layout of the members in the data to be processed.
Each of the M processing templates corresponds to at least one piece of data distribution information. Illustratively, assuming that processing template 1 includes five members, member 0 through member 4, the correspondence between processing template 1 and the data distribution information may be as shown in Figure 3. Please refer to Figure 3, which is a schematic diagram of a correspondence between processing templates and data distribution information provided by an embodiment of the present application.
As shown in Figure 3, data distribution information 1 through data distribution information 3 all correspond to processing template 1, and the members in each piece of data distribution information are contained among the members of processing template 1; that is, the members of processing template 1 form the full set, and the members of each corresponding piece of data distribution information form subsets of that full set.
In the embodiments of the present application, one processing template can correspond to multiple pieces of data distribution information, which provides technical support for performing serialization and/or deserialization in a non-intrusive manner, that is, it improves the realizability of the technical solution. Non-intrusive means that serialization and/or deserialization can be implemented without using a specified input data structure. This is because the processing templates in the registration information cover a variety of cases, and the codec accelerator can determine the corresponding processing template from the data distribution information of the data to be processed, thereby determining how that data is to be processed. At the same time, if one processing template can correspond to multiple pieces of data distribution information, the number of registered processing templates can be reduced, which reduces memory consumption.
The codec accelerator also determines the memory overhead of each fixed-length field in each processing template.
The codec accelerator can determine the memory overhead of the fixed-length fields in multiple ways. For example, if the memory overhead occupied by each fixed-length field member is directly annotated at configuration time, the codec accelerator can obtain it directly. Alternatively, if it is not annotated at registration time, the codec accelerator analyzes each fixed-length field member and calculates its memory overhead. The codec accelerator can also determine the fixed-length fields in other ways, which are not limited here. For example, the memory overhead of some fixed-length field members of a processing template may be annotated during configuration, while the overhead of the remaining fixed-length field members is determined by the codec accelerator after analyzing them, thereby determining the memory overhead of every fixed-length field member in the processing template.
Taking processing template 1 shown in Figure 3 as an example, assume that among its five members, members 0, 2, and 3 are fixed-length fields and members 1 and 4 are variable-length fields. The codec accelerator then determines the memory overhead of members 0, 2, and 3.
After determining the memory overhead of each fixed-length field, the codec accelerator determines, based on the memory overhead of each fixed-length field and the correspondence between the M processing templates and the N pieces of data distribution information, the fixed-length memory overhead corresponding to each of the N pieces of data distribution information. In the embodiment shown in Figure 3, assume that the fixed-length field members 0, 2, and 3 of processing template 1 have memory overheads of 4 bytes, 8 bytes, and 4 bytes respectively. The codec accelerator can then determine that the fixed-length memory overheads corresponding to data distribution information 1, 2, and 3 of processing template 1 are 12 bytes, 0 bytes, and 4 bytes respectively; that is, the data distribution information is bound to the fixed-length memory overhead.
When the data distribution information is bound to the fixed-length memory overhead, the codec accelerator can directly determine the fixed-length memory overhead corresponding to the target data distribution information as the first memory overhead of the fixed-length fields in the data to be processed. For example, assuming the target data distribution information of the data to be processed is data distribution information 1 shown in Figure 3, the codec accelerator can determine that the first memory overhead is 12 bytes.
In the embodiments of the present application, binding the data distribution information to the fixed-length memory overhead means that the codec accelerator does not need to perform further calculation during data processing, which saves computing resources. At the same time, the fixed-length memory overhead of the data to be processed can be determined directly and quickly from the target data distribution information, which also improves processing efficiency.
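As a hedged illustration of this registration-time binding, the sketch below reproduces the figures just given (12, 0, and 4 bytes). The exact memberships of each data distribution in Figure 3 are not reproduced in this text, so the member lists below are hypothetical, chosen only to be consistent with those stated totals:

```python
# Fixed-length member sizes of processing template 1 (from the example above).
template1_fixed_sizes = {"member0": 4, "member2": 8, "member3": 4}

# Hypothetical member lists; variable-length members contribute 0 here.
distributions = {
    1: ["member0", "member2", "member1"],  # fixed members 0 and 2 -> 4 + 8 = 12 bytes
    2: ["member1", "member4"],             # no fixed members -> 0 bytes
    3: ["member3", "member4"],             # fixed member 3 -> 4 bytes
}

# Registration-time binding: precompute the fixed-length overhead per distribution.
fixed_overhead = {
    dist_id: sum(template1_fixed_sizes.get(m, 0) for m in members)
    for dist_id, members in distributions.items()
}
assert fixed_overhead == {1: 12, 2: 0, 3: 4}
```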
2) The data distribution information is not bound to the fixed-length memory overhead.
When the data distribution information is not bound to the fixed-length memory overhead, the codec accelerator, after obtaining the registration information, only needs to determine the memory overhead of each fixed-length field included in each of the M processing templates. The specific process has been described above under "1) The data distribution information is bound to the fixed-length memory overhead" and is not repeated here.
In this case, after obtaining the target data distribution information and the target processing template corresponding to the data to be processed, the codec accelerator determines at least one fixed-length field corresponding to the data to be processed according to the target data distribution information and the target processing template, and then determines the first memory overhead of the fixed-length fields according to the memory overhead of each of the at least one fixed-length field.
For example, assume the target processing template is processing template 1 shown in Figure 3, the target data distribution information is data distribution information 3 shown in Figure 3, and among the five members of processing template 1, members 0, 2, and 3 are fixed-length fields with memory overheads of 4 bytes, 8 bytes, and 4 bytes respectively, while the other members are variable-length fields. From processing template 1 and data distribution information 3, the codec accelerator can determine that members 0 and 3 in the data to be processed are fixed-length fields, that is, it determines the first memory overhead to be 8 bytes.
In the embodiments of the present application, there are multiple ways of determining the first memory overhead, which can be selected according to the needs of practical applications, improving the flexibility of the technical solution of the present application.
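For the unbound case, a minimal sketch that walks the target data distribution against the target processing template at request time; the function name and data layout are hypothetical, and the member list follows the 8-byte example in this paragraph:

```python
def first_memory_overhead(template_fixed_sizes: dict, distribution_members: list) -> int:
    # Unbound case: sum the overhead of each fixed-length member selected by
    # the target data distribution information; variable-length members add 0.
    return sum(template_fixed_sizes.get(m, 0) for m in distribution_members)

# Example from the text: members 0 and 3 are fixed-length -> 4 + 4 = 8 bytes.
assert first_memory_overhead({"member0": 4, "member2": 8, "member3": 4},
                             ["member0", "member3"]) == 8
```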
205. Calculate, according to the target data distribution information, the target processing template, and the data to be processed, the second memory overhead corresponding to the variable-length fields in the data to be processed.
The target data distribution information indicates the member distribution of the data to be processed, and the target processing template indicates the attribute information of the data to be processed, including the type, processing method, and other information of each data member.
The second memory overhead of a variable-length field can only be determined by analyzing the specific content of the data to be processed. Therefore, after obtaining the data to be processed, the target data distribution information, and the target processing template, the codec accelerator queries the target processing template according to the target data distribution information to determine the type of each member; where a member is a variable-length field, the codec accelerator combines the data to be processed itself to determine the memory overhead of each variable-length field, thereby determining the second memory overhead.
Illustratively, assume the target processing template is processing template 1 shown in Figure 3, the target data distribution information is data distribution information 3 shown in Figure 3, and among the five members of processing template 1, members 1 and 4 are variable-length fields while the other members are fixed-length fields. The codec accelerator queries processing template 1 according to data distribution information 3 to determine the type of each member, determines that only member 1 is a variable-length field, and then combines this with the data to be processed itself to obtain the memory overhead of that variable-length field, that is, the second memory overhead.
In the embodiments of the present application, during processing, the codec accelerator precisely calculates the second memory overhead of the variable-length fields by combining the information of the data to be processed itself, so that the calculation result is accurate, improving the accuracy of the technical solution of the present application.
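A simplified sketch of this step 205 calculation, under the assumption that a variable-length member's serialized size equals its payload length (a real encoding might add length prefixes or padding); the names and layout are hypothetical:

```python
def second_memory_overhead(template_types: dict, distribution_members: list, record: dict) -> int:
    # Only variable-length members need the actual payload: their size is
    # derived from the data to be processed itself (step 205).
    total = 0
    for m in distribution_members:
        if template_types[m] == "varlen":
            total += len(record[m])  # e.g. a bytes/string member
    return total

types = {"member0": "fixed", "member1": "varlen", "member3": "fixed"}
record = {"member0": 7, "member1": b"hello, world!", "member3": 9}
print(second_memory_overhead(types, ["member0", "member1", "member3"], record))  # 13
```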
206. Apply, according to the first memory overhead and the second memory overhead, for the target memory space, where the target memory space is used to store the output data obtained by encoding or decoding the data to be processed.
After obtaining the first memory overhead and the second memory overhead, the codec accelerator can determine the target memory overhead of the output data. Illustratively, assuming the first memory overhead is 8 bytes and the second memory overhead is 16 bytes, the codec accelerator determines the target memory overhead to be 8 + 16 = 24 bytes. The size of the target memory space is therefore 24 bytes, and this target memory space is used to store the output data obtained by encoding or decoding the data to be processed.
The codec accelerator can apply for the target memory space in multiple ways; it can apply through a memory management accelerator external to the codec accelerator, or in other ways, for example through a memory management module inside the codec accelerator, which is not limited here.
In the embodiments of the present application, there are multiple ways for the codec accelerator to apply for the target memory space, which can adapt to different situations and improves the flexibility of the technical solution of the present application.
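A short sketch of the single precise allocation this step enables, contrasted with the allocate-and-expand scheme described in the summary above; the variable names are illustrative only:

```python
first_overhead = 8           # fixed-length fields, from step 204
second_overhead = 16         # variable-length fields, from step 205
target_size = first_overhead + second_overhead   # 24 bytes, known exactly in advance

# One precise memory application; no re-request, copy, and release cycle as in
# the allocate-and-expand approach.
output_buffer = bytearray(target_size)
```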
In some optional embodiments, after step 206, once the codec accelerator has stored the output data into the target memory space, it also sends an interrupt request to the processor to inform it that the data processing is complete, so that the processor fetches the output data from memory. The interrupt request may carry the address information of the target memory space.
In the preceding description, the data processing method provided by the embodiments of the present application has been described with the codec accelerator as the executing entity. Next, the method is described from the perspective of the system. Please refer to Figure 4, which is a schematic structural diagram of a data processing system provided by an embodiment of the present application.
As shown in Figure 4, the data processing system 400 includes a codec accelerator 401, a processor 402, a memory 403, and a memory management accelerator 404. The data processing method provided by the embodiments of the present application focuses on the design of the software-hardware interaction framework, and sets up interaction interfaces for different purposes among the codec accelerator, the processor, and the memory.
The codec accelerator 401 provides a registration interface to the processor 402. The processor 402 can use this registration interface to configure the interactive environment and the registration information before encoding and decoding are executed, thereby enabling the codec accelerator 401 to autonomously access the memory 403 in a non-intrusive manner according to the registration information to obtain the data to be processed, and to implement dynamic codec features according to the codec algorithm program specified by the registration information. The interactive environment includes the input/output queue spaces, the input/output spaces for the registration information, and so on.
The codec accelerator 401 also provides a write interface to the processor 402. The processor 402 can use this write interface to write the address information corresponding to the registration information and the address information corresponding to the description information of the task to be processed into the corresponding registers of the codec accelerator 401, for later use by the codec accelerator 401 when accessing the memory 403 to fetch data.
The codec accelerator 401 also provides a trigger interface to the processor 402. After writing the address information corresponding to the registration information and the address information corresponding to the description information of the task to be processed into the codec accelerator 401, the processor 402 can use this trigger interface to trigger the codec accelerator 401 to start processing the registration information or the data to be processed corresponding to the task.
The codec accelerator 401 can also be provided with an interrupt notification interface. After the registration information or the data to be processed has been processed, the codec accelerator 401 can send an interrupt request to the processor 402 through this interface to notify the processor 402 to fetch the registration receipt or the output data.
The codec accelerator 401 is also provided with a memory access interface, through which it actively accesses the memory 403 to obtain the registration information, the description information of tasks to be processed, and so on. When the task to be processed is a batch task, it writes the codec output data of each subtask in the batch into the memory 403, and writes the output description information into the memory 403 after the codec batch task is completed. When the task to be processed is a single task, it writes the output data of the single task and the description information of that output data into the memory 403.
The codec accelerator 401 is also provided with a memory application interface. Based on the first memory overhead of the fixed-length fields determined by preprocessing the registration information, and the second memory overhead of the variable-length fields determined during processing, the codec accelerator 401 precisely calculates the target memory space and applies to the memory management accelerator 404 for it through the memory application interface.
In some optional embodiments, the data processing system may not include a memory management accelerator, and the functions of the memory management accelerator are implemented by a memory management module inside the codec accelerator.
Next, taking a data processing system that includes a memory management accelerator as an example, the data processing method provided by the embodiments of the present application is described in more detail. Please refer to Figure 5, which is a schematic diagram of the data processing method provided by an embodiment of the present application.
On a device configured with the codec accelerator provided by the embodiments of the present application (for example, a Serialization/DeSerialization, Ser/DeSer accelerator), taking the execution of a codec batch task as an example, the following steps can be performed:
First, before the codec task is executed, the registration information needs to be configured. On the processor (for example, central processing unit, CPU) software side, the following steps need to be performed to complete the input of the registration information, corresponding to steps 501 and 502 in Figure 5.
501. The processor prepares the interactive environment and encapsulates the environment information and the codec information as the registration information.
Preparing the interactive environment means preparing the software-hardware interaction environment, including the buffer space Buffer1 for the CPU to store the registration input information, the buffer space Buffer2 for storing the codec batch task description information, the buffer space Buffer3 for the Ser/DeSer accelerator to store the registration output information, and the buffer space Buffer4 for storing the output description information of the codec batch task.
In some optional embodiments, Buffer2 and Buffer4 may also not be registered. When Buffer2 is registered, the codec accelerator is told to process the next task when a codec task is triggered; when Buffer2 is not registered, the address information of the batch task description information needs to be given to the codec accelerator for a codec batch task. When Buffer4 is registered, the output data can be stored directly at the specified address; when Buffer4 is not registered, the batch task description information needs to carry the address information of the output data.
Encapsulating the environment information means that, in the registration information format given by the Ser/DeSer accelerator, the CPU needs to write the Buffer2 and Buffer4 buffer space information (start address + length) into Buffer1.
Encapsulating the codec information means that, in the registration information format given by the Ser/DeSer accelerator, the CPU needs to write the information of the codec algorithms to be used (program entry, type, etc.) into Buffer1; and, in the encoding format specified by the Ser/DeSer accelerator (such as JSON), encode the serialization input data structure Ser_InStruct, the deserialization output data structure DeSer_OutStruct, and the serialization/deserialization schema description Schema, and write the start address and length information of their encoded data into Buffer1. The processing templates in the registration information include the serialization input data structure and/or the deserialization output data structure, and the data distribution information includes the serialization/deserialization schema description.
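For illustration only, a minimal sketch of how a CPU-side driver might pack such registration input; the field names, the (address, length) entry layout, and the addresses are hypothetical assumptions, not a format defined by this application:

```python
import json
import struct

# Hypothetical CPU-side packing of registration information (step 501).
ser_in_struct = {"members": [
    {"name": "member0", "type": "int32"},    # fixed-length, 4 bytes
    {"name": "member1", "type": "string"},   # variable-length
]}
schema = {"layout": ["member0", "member1"]}

encoded_struct = json.dumps(ser_in_struct).encode()  # JSON, as in the text
encoded_schema = json.dumps(schema).encode()

def buffer1_entry(addr: int, data: bytes) -> bytes:
    # (start address, length) pair, little-endian 64-bit each -- an assumed layout
    return struct.pack("<QQ", addr, len(data))

# Assume the encoded blobs were placed at these (hypothetical) memory addresses.
buffer1 = buffer1_entry(0x20000000, encoded_struct) + buffer1_entry(0x20001000, encoded_schema)
```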
502. The processor triggers the codec accelerator to obtain the registration information.
The CPU can call the Ser/DeSer accelerator interface and write the above registration information (start address and length) into the corresponding accelerator registers, thereby triggering the accelerator to process the registration information.
On the Ser/DeSer accelerator side, after the registration information has been received and the accelerator has been triggered by the CPU, the rule registration module is started and the following steps are performed to complete the information registration, corresponding to steps 503 to 505 in Figure 5.
503. The codec accelerator accesses Buffer1 to obtain the registration information.
According to the start address and length of the registration information in its registers, the codec accelerator accesses Buffer1 to fetch the registration information and performs the following parsing:
First, the Buffer2 and Buffer4 buffer space information (start address + length) is stored into the corresponding registers, and the encoded files such as Ser_InStruct, DeSer_OutStruct, and Schema are preprocessed to calculate the space required after decoding. Note that the registration information the codec accelerator obtains from Buffer1 is the initial registration information configured by the processor or terminal device, and the decoded files obtained by preprocessing are the registration information used in subsequent processing.
504. The codec accelerator applies for memory.
After calculating the decoded space size, the codec accelerator applies for a memory space through the memory application module; this memory space stores the decoded registration information. The start addresses and lengths of the decoded data of Ser_InStruct, DeSer_OutStruct, and Schema are then stored into the accelerator's internal mapping tables; that is, the first address information is stored into the internal mapping tables.
The codec accelerator also starts the storage calculation module, which combines Ser_InStruct with Schema and DeSer_OutStruct with Schema respectively, calculates the fixed memory overhead (such as metadata fields and fixed-length fields) incurred when serializing or deserializing based on them, and caches it.
505. The codec accelerator encapsulates the registration receipt and writes it into Buffer3.
The Ser/DeSer accelerator encapsulates the registration parsing results (such as the parsing result of the Schema and the mapping table serial numbers) and either writes them directly into Buffer3, or uses the memory application module to apply for a registration receipt space to store them and then writes the start address and length of the receipt space into Buffer3. The CPU can be notified by an interrupt to fetch the registration receipt, or can obtain it actively by polling or scheduled queries.
The registration receipt fed back by the codec accelerator serves multiple purposes. On the one hand, it informs the CPU that registration is complete, so the CPU can release the related memory to reduce memory usage. On the other hand, the CPU can check the serial number fed back in the registration receipt: if the serial number is ≥ 0, the registration is considered successful; if it is less than 0, the registration has failed and the client needs to register again, which improves the registration success rate.
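As a small worked illustration of the receipt convention just described (a fed-back serial number ≥ 0 meaning success), assuming only that the CPU has already read the serial number out of the receipt:

```python
def registration_succeeded(receipt_serial: int) -> bool:
    # Convention from step 505: serial number >= 0 means the registration was
    # accepted; a negative value means the client must register again.
    return receipt_serial >= 0

assert registration_succeeded(3)
assert not registration_succeeded(-1)
```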
After registration is complete, the Ser/DeSer accelerator can be used for dynamic encoding and decoding. On the CPU software side, a task is created and the following steps are performed, corresponding to steps 506 and 507 in Figure 5.
506. The processor creates a task and writes it into Buffer2.
The CPU encapsulates the task information; that is, it encapsulates the N subtask description information entries (including the codec algorithm type used by the task, the Ser_InStruct serial number, the DeSer_OutStruct serial number, the Schema serial number, the start address and length of the data to be encoded or decoded, etc.) into the batch task description information Work Metadata (including the number of tasks in the batch, the subtask description information list, etc.) and writes it into Buffer2.
507. The processor triggers the codec accelerator to perform encoding or decoding.
The processor can call the Ser/DeSer accelerator interface and write the start address and length of the batch task description information Work Metadata into the corresponding accelerator registers, thereby triggering the accelerator to process the task.
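A minimal sketch of what the Work Metadata of step 506 might look like; the class and field names are hypothetical illustrations of the fields listed above, not a binary layout defined by this application:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class SubtaskDesc:
    # Fields listed in step 506 for each subtask description entry.
    algorithm: str            # codec algorithm type used by the task
    ser_in_struct_id: int     # Ser_InStruct serial number
    deser_out_struct_id: int  # DeSer_OutStruct serial number
    schema_id: int            # Schema serial number
    data_addr: int            # start address of the data to encode/decode
    data_len: int             # length of that data

@dataclass
class WorkMetadata:
    subtasks: List[SubtaskDesc] = field(default_factory=list)

    @property
    def batch_size(self) -> int:   # number of tasks in the batch
        return len(self.subtasks)

work = WorkMetadata([SubtaskDesc("serialize", 0, 0, 0, 0x30000000, 256)])
assert work.batch_size == 1
```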
508. Access memory to obtain the codec input.
After obtaining the start address and length of the batch task description information Work Metadata, the codec accelerator starts the data acquisition module, accesses Buffer2, and obtains the Work Metadata. It then parses the Work Metadata and caches each subtask description information entry into the accelerator's internal receive queue.
509. The codec accelerator accesses memory to obtain the target processing template and the target data distribution information.
The dynamic rule loading module schedules the multiple subtasks in order, parses each subtask description information entry, queries the local mapping tables of the codec accelerator, loads the corresponding codec algorithm, Ser_InStruct, DeSer_OutStruct, Schema, and other information to the execution units in the core processing module, and fetches them from memory into the local cache of the codec accelerator.
510. Fetch the codec input data.
Based on the cached Ser_InStruct, DeSer_OutStruct, Schema, and other information, each execution unit in the core processing module obtains the codec input data, that is, the data to be processed, from memory according to the address of the data to be encoded or decoded in the subtask description information.
511. The codec accelerator applies for memory.
If a subtask requires copying, then for an encoding task the copy distribution module notifies the core processing module to apply for the output memory required for copying and invokes the copy after encoding is completed; for a decoding task, the copy distribution module notifies the core processing module to perform the corresponding memory application and copying during decoding.
The storage calculation module is called to calculate the output memory overhead, that is, the target memory space, on the basis of the fixed memory overhead already cached during the registration phase, and the memory application module is used to apply for the target memory space.
512. Write the output data to the output space.
The codec accelerator also encodes or decodes using the specified algorithm to obtain the output data, and then stores the output data into the target memory space.
513. The codec accelerator encapsulates the output data description information and writes it into Buffer4.
After all N subtasks included in the batch task have been completed, the output notification module encapsulates the output description information of each subtask (such as the DeSer_OutStruct serial number, the output data address, the length, etc.) into the batch task output description information Output Metadata (including the number of tasks in the batch, the subtask description information list, etc.) and writes it into Buffer4; the CPU can be notified by an interrupt to fetch the Output Metadata, or can obtain it actively by polling or scheduled queries.
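Mirroring the Work Metadata sketch above, a hypothetical shape for the Output Metadata of step 513; again the names are illustrative only:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class SubtaskOutputDesc:
    # Fields listed in step 513 for each subtask output entry.
    deser_out_struct_id: int  # DeSer_OutStruct serial number
    output_addr: int          # output data address
    output_len: int           # output data length

@dataclass
class OutputMetadata:
    outputs: List[SubtaskOutputDesc] = field(default_factory=list)

out = OutputMetadata([SubtaskOutputDesc(0, 0x40000000, 24)])
assert len(out.outputs) == 1
```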
Next, the codec accelerator and related devices provided by the embodiments of the present application are described.
Please refer to Figure 6, which is a schematic structural diagram of a codec accelerator 600 provided by an embodiment of the present application, including:
a processing unit 601, configured to parse data to be processed to obtain a target template serial number and a target data distribution serial number corresponding to the data to be processed;
an acquisition unit 602, configured to obtain, according to the target template serial number, a target processing template corresponding to the data to be processed;
the acquisition unit 602 being further configured to obtain, according to the target data distribution serial number, target data distribution information corresponding to the data to be processed;
the processing unit 601 being further configured to: determine, according to the target data distribution information, a first memory overhead corresponding to the fixed-length fields in the data to be processed; calculate, according to the target data distribution information, the target processing template, and the data to be processed, a second memory overhead corresponding to the variable-length fields in the data to be processed; and apply, according to the first memory overhead and the second memory overhead, for a target memory space used to store the output data obtained by encoding or decoding the data to be processed.
In some optional embodiments, the acquisition unit 602 is further configured to: obtain first address information, where the first address information indicates the storage location of registration information, the registration information includes M processing templates and N pieces of data distribution information, each of the M processing templates corresponds to at least one piece of data distribution information, and M and N are both positive integers; and obtain the M processing templates and the N pieces of data distribution information according to the first address information.
The processing unit 601 is further configured to determine the memory overhead of each fixed-length field in each processing template, and to determine, according to the memory overhead of each fixed-length field and the correspondence between the M processing templates and the N pieces of data distribution information, the fixed-length memory overhead corresponding to each of the N pieces of data distribution information.
In some optional embodiments, the target data distribution information is included in the N pieces of data distribution information, and the processing unit 601 is specifically configured to determine the fixed-length memory overhead corresponding to the target data distribution information as the first memory overhead.
In some optional embodiments, the acquisition unit 602 is further configured to obtain first address information, where the first address information indicates the storage location of registration information, the registration information includes M processing templates, and M is a positive integer; and to obtain the M processing templates according to the first address information.
The processing unit 601 is further configured to determine the memory overhead of each fixed-length field included in each of the M processing templates.
In some optional embodiments, the processing unit 601 is specifically configured to determine, according to the target data distribution information and the target processing template, at least one fixed-length field corresponding to the data to be processed, and to determine the first memory overhead according to the memory overhead of each of the at least one fixed-length field.
In some optional embodiments, the target data distribution information indicates the member distribution of the data to be processed, and the target processing template indicates the attribute information of the data to be processed.
The processing unit 601 is specifically configured to determine, according to the member distribution of the data to be processed, the attribute information of the data to be processed, and the data to be processed, the variable-length fields included in the data to be processed, and to calculate the second memory overhead corresponding to them.
In some optional embodiments, the acquisition unit 602 is further configured to obtain M pieces of address information and M template serial numbers corresponding to the M processing templates.
The processing unit 601 is further configured to establish the mapping relationship between the M pieces of address information and the M template serial numbers to obtain the first mapping table.
The acquisition unit 602 is specifically configured to determine, according to the target template serial number, the second address information corresponding to the target processing template from the first mapping table, and to obtain the target processing template from memory according to the second address information.
In some optional embodiments, the acquisition unit 602 is further configured to obtain N pieces of address information and N data distribution serial numbers corresponding to the N pieces of data distribution information.
The processing unit 601 is further configured to establish the mapping relationship between the N pieces of address information and the N data distribution serial numbers to obtain the second mapping table.
The acquisition unit 602 is specifically configured to determine, according to the target data distribution serial number, the third address information corresponding to the target data distribution information from the second mapping table, and to obtain the target data distribution information from memory according to the third address information.
In some optional embodiments, the processing unit 601 is specifically configured to apply for the target memory space through a memory management accelerator or a memory management module.
The codec accelerator 600 can perform the operations performed by the codec accelerator in the embodiments shown in Figures 1a to 5, which are not described again here.
Please refer to Figure 7, which is another schematic structural diagram of a codec accelerator provided by an embodiment of the present application.
As shown in Figure 7, the codec accelerator 700 includes a rule registration module, a data acquisition module, a dynamic rule loading module, a core processing module, a copy distribution module, a storage calculation module, a storage application module, and an output notification module.
The rule registration module is used to: access memory to obtain the interactive environment, codec algorithms, and other configuration information registered by the CPU with the accelerator; parse and store it into the applied memory space; and cache the memory addresses of the processing templates and of the data distribution information included in the registration information into the accelerator's local mapping tables, for quick query and retrieval during encoding and decoding.
The data acquisition module is used to: based on the start address and length of the batch task description information written by the CPU into the accelerator registers, access the corresponding memory space of the registered interactive environment, obtain and parse the batch task description information, and cache each subtask description information entry into the local receive queue to await scheduling and processing by the accelerator.
The dynamic rule loading module is used to: schedule the multiple subtasks in the accelerator's local receive queue, parse the description information of each subtask, query the accelerator's local mapping tables, and load the required algorithms, data structures, templates, and other codec information into the execution units of the core processing module.
The core processing module is used to: have each execution unit fetch data from memory according to the address of the data to be encoded/decoded in the subtask description information and the loaded subtask codec information; call the storage calculation module to accurately calculate the output memory overhead and call the memory application module to apply for the output memory; and encode or decode with the loaded algorithm and store the result into the output space.
The storage calculation module is used to: during registration processing, calculate and cache, based on the registered codec information, the memory overhead of the fixed parts of the output data (including metadata, fixed-length fields, etc.), which reduces subsequent codec memory overhead calculation; during codec processing, query the locally cached fixed-part output overhead according to the subtask description information, accurately calculate the memory overhead of the variable-length fields in the output data based on the subtask's data to be encoded or decoded, and combine the two to determine the precise output memory requirement. Serialization is performed in the cache, which also avoids writing to memory multiple times and the resulting performance loss. The storage application module is used to: when parsing the codec information during registration processing, or producing codec output during codec processing, interact with the external memory management accelerator to apply for output memory based on the memory overhead calculated by the storage calculation module.
The copy distribution module is used to: according to the subtask description information, if copying is required, notify the core processing module for an encoding task, use the storage calculation module to apply for the output memory required for copying, and perform the output data copy itself after encoding is completed; for a decoding task, notify the core processing module to perform the corresponding memory application and copying during decoding.
The output notification module is used to: when the batch task is completed, encapsulate the output description information of each subtask into the batch task output description information, write it into the corresponding memory space of the registered interactive environment, and send an interrupt to notify the CPU to fetch the output data, or wait for the CPU to obtain it actively by polling or scheduled queries.
The codec accelerator 700 can perform the operations performed by the codec accelerator in the embodiments shown in Figures 1a to 6, which are not described again here.
Those skilled in the art can clearly understand that, for convenience and brevity of description, for the specific working processes of the systems, devices, and units described above, reference may be made to the corresponding processes in the foregoing method embodiments, which are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the device embodiments described above are only illustrative; for example, the division of the units is only a logical function division, and in actual implementation there may be other division methods, for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented. In addition, the couplings or direct couplings or communication connections shown or discussed may be implemented through some interfaces, and the indirect couplings or communication connections between devices or units may be in electrical, mechanical, or other forms.
The units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the various embodiments of this application. The aforementioned storage media include media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Claims (21)

  1. A data processing method, comprising:
    parsing data to be processed to obtain a target template serial number and a target data distribution serial number corresponding to the data to be processed;
    obtaining, according to the target template serial number, a target processing template corresponding to the data to be processed;
    obtaining, according to the target data distribution serial number, target data distribution information corresponding to the data to be processed;
    determining, according to the target data distribution information, a first memory overhead corresponding to fixed-length fields in the data to be processed;
    calculating, according to the target data distribution information, the target processing template, and the data to be processed, a second memory overhead corresponding to variable-length fields in the data to be processed;
    applying, according to the first memory overhead and the second memory overhead, for a target memory space, wherein the target memory space is used to store output data obtained by encoding or decoding the data to be processed.
  2. The method according to claim 1, wherein before the parsing of the data to be processed, the method further comprises:
    obtaining first address information, wherein the first address information indicates a storage location of registration information, the registration information comprises M processing templates and N pieces of data distribution information, each of the M processing templates corresponds to at least one piece of data distribution information, and M and N are both positive integers;
    obtaining the M processing templates and the N pieces of data distribution information according to the first address information;
    determining a memory overhead of each fixed-length field in each processing template;
    determining, according to the memory overhead of each fixed-length field and the correspondence between the M processing templates and the N pieces of data distribution information, a fixed-length memory overhead corresponding to each of the N pieces of data distribution information.
  3. The method according to claim 2, wherein the target data distribution information is included in the N pieces of data distribution information;
    the determining, according to the target data distribution information, of the first memory overhead corresponding to the fixed-length fields in the data to be processed comprises:
    determining the fixed-length memory overhead corresponding to the target data distribution information as the first memory overhead.
  4. The method according to claim 1, wherein before the parsing of the data to be processed, the method further comprises:
    obtaining first address information, wherein the first address information indicates a storage location of registration information, the registration information comprises M processing templates, and M is a positive integer;
    obtaining the M processing templates according to the first address information;
    determining a memory overhead of each fixed-length field included in each of the M processing templates.
  5. The method according to claim 4, wherein the determining, according to the target data distribution information, of the first memory overhead corresponding to the fixed-length fields in the data to be processed comprises:
    determining, according to the target data distribution information and the target processing template, at least one fixed-length field corresponding to the data to be processed;
    determining the first memory overhead according to the memory overhead of each of the at least one fixed-length field.
  6. The method according to any one of claims 1 to 5, wherein the target data distribution information indicates a member distribution of the data to be processed, and the target processing template indicates attribute information of the data to be processed;
    the calculating, according to the target data distribution information and the target processing template, of the second memory overhead corresponding to the variable-length fields in the data to be processed comprises:
    determining, according to the member distribution of the data to be processed, the attribute information of the data to be processed, and the data to be processed, the variable-length fields included in the data to be processed;
    calculating the second memory overhead corresponding to the variable-length fields.
  7. The method according to any one of claims 2 to 6, wherein the method further comprises:
    obtaining M pieces of address information and M template serial numbers corresponding to the M processing templates;
    establishing a mapping relationship between the M pieces of address information and the M template serial numbers to obtain the first mapping table;
    the obtaining, according to the target template serial number, of the target processing template corresponding to the data to be processed comprises:
    determining, according to the target template serial number, second address information corresponding to the target processing template from the first mapping table;
    obtaining the target processing template from memory according to the second address information.
  8. The method according to any one of claims 2 to 7, wherein the method further comprises:
    obtaining N pieces of address information and N data distribution serial numbers corresponding to the N pieces of data distribution information;
    establishing a mapping relationship between the N pieces of address information and the N data distribution serial numbers to obtain the second mapping table;
    the obtaining, according to the target data distribution serial number, of the target data distribution information corresponding to the data to be processed comprises:
    determining, according to the target data distribution serial number, third address information corresponding to the target data distribution information from the second mapping table;
    obtaining the target data distribution information from memory according to the third address information.
  9. The method according to any one of claims 1 to 8, wherein the applying for the target memory space comprises:
    applying for the target memory space through a memory management accelerator or a memory management module.
  10. A codec accelerator, comprising:
    a processing unit, configured to parse data to be processed to obtain a target template serial number and a target data distribution serial number corresponding to the data to be processed;
    an acquisition unit, configured to obtain, according to the target template serial number, a target processing template corresponding to the data to be processed;
    the acquisition unit being further configured to obtain, according to the target data distribution serial number, target data distribution information corresponding to the data to be processed;
    the processing unit being further configured to:
    determine, according to the target data distribution information, a first memory overhead corresponding to fixed-length fields in the data to be processed;
    calculate, according to the target data distribution information, the target processing template, and the data to be processed, a second memory overhead corresponding to variable-length fields in the data to be processed;
    apply, according to the first memory overhead and the second memory overhead, for a target memory space, wherein the target memory space is used to store output data obtained by encoding or decoding the data to be processed.
  11. The codec accelerator according to claim 10, wherein the acquisition unit is further configured to:
    obtain first address information, wherein the first address information indicates a storage location of registration information, the registration information comprises M processing templates and N pieces of data distribution information, each of the M processing templates corresponds to at least one piece of data distribution information, and M and N are both positive integers;
    obtain the M processing templates and the N pieces of data distribution information according to the first address information;
    the processing unit being further configured to:
    determine a memory overhead of each fixed-length field in each processing template;
    determine, according to the memory overhead of each fixed-length field and the correspondence between the M processing templates and the N pieces of data distribution information, a fixed-length memory overhead corresponding to each of the N pieces of data distribution information.
  12. The codec accelerator according to claim 11, wherein the target data distribution information is included in the N pieces of data distribution information;
    the processing unit is specifically configured to determine the fixed-length memory overhead corresponding to the target data distribution information as the first memory overhead.
  13. The codec accelerator according to claim 10, wherein the acquisition unit is further configured to:
    obtain first address information, wherein the first address information indicates a storage location of registration information, the registration information comprises M processing templates, and M is a positive integer;
    obtain the M processing templates according to the first address information;
    the processing unit being further configured to determine a memory overhead of each fixed-length field included in each of the M processing templates.
  14. The codec accelerator according to claim 13, wherein the processing unit is specifically configured to:
    determine, according to the target data distribution information and the target processing template, at least one fixed-length field corresponding to the data to be processed;
    determine the first memory overhead according to the memory overhead of each of the at least one fixed-length field.
  15. The codec accelerator according to any one of claims 10 to 14, wherein the target data distribution information indicates a member distribution of the data to be processed, and the target processing template indicates attribute information of the data to be processed;
    the processing unit being specifically configured to:
    determine, according to the member distribution of the data to be processed, the attribute information of the data to be processed, and the data to be processed, the variable-length fields included in the data to be processed;
    calculate the second memory overhead corresponding to the variable-length fields.
  16. The codec accelerator according to any one of claims 11 to 15, wherein the acquisition unit is further configured to obtain M pieces of address information and M template serial numbers corresponding to the M processing templates;
    the processing unit being further configured to establish a mapping relationship between the M pieces of address information and the M template serial numbers to obtain the first mapping table;
    the acquisition unit being specifically configured to:
    determine, according to the target template serial number, second address information corresponding to the target processing template from the first mapping table;
    obtain the target processing template from memory according to the second address information.
  17. The codec accelerator according to any one of claims 11 to 16, wherein the acquisition unit is further configured to obtain N pieces of address information and N data distribution serial numbers corresponding to the N pieces of data distribution information;
    the processing unit being further configured to establish a mapping relationship between the N pieces of address information and the N data distribution serial numbers to obtain the second mapping table;
    the acquisition unit being specifically configured to:
    determine, according to the target data distribution serial number, third address information corresponding to the target data distribution information from the second mapping table;
    obtain the target data distribution information from memory according to the third address information.
  18. The codec accelerator according to any one of claims 10 to 17, wherein the processing unit is specifically configured to apply for the target memory space through a memory management accelerator or a memory management module.
  19. A data processing system, comprising a codec accelerator configured to perform the method according to any one of claims 1 to 9.
  20. A computer-readable storage medium storing instructions which, when run on a computer, cause the computer to perform the method according to any one of claims 1 to 9.
  21. A computer program product which, when executed on a computer, causes the computer to perform the method according to any one of claims 1 to 9.
PCT/CN2023/080127 2022-03-28 2023-03-07 Data processing method, coding and decoding accelerator, and related device WO2023185401A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210312378.X 2022-03-28
CN202210312378.XA CN116860428A (zh) Data processing method, coding and decoding accelerator, and related device

Publications (1)

Publication Number Publication Date
WO2023185401A1 true WO2023185401A1 (zh) 2023-10-05

Family

ID=88199009

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/080127 WO2023185401A1 (zh) Data processing method, coding and decoding accelerator, and related device

Country Status (2)

Country Link
CN (1) CN116860428A (zh)
WO (1) WO2023185401A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102981884A (zh) * 2012-11-22 2013-03-20 用友软件股份有限公司 Serialization apparatus and serialization method
CN111813783A (zh) * 2020-07-27 2020-10-23 南方电网数字电网研究院有限公司 Data processing method and apparatus, computer device, and storage medium
WO2021012553A1 (zh) * 2019-07-25 2021-01-28 深圳壹账通智能科技有限公司 Data processing method and related device
CN113742056A (zh) * 2020-11-19 2021-12-03 北京沃东天骏信息技术有限公司 Data storage method, apparatus, device, and computer-readable storage medium
CN113886087A (zh) * 2021-10-19 2022-01-04 深圳市领创星通科技有限公司 Application memory management method, apparatus, device, and storage medium


Also Published As

Publication number Publication date
CN116860428A (zh) 2023-10-10

Similar Documents

Publication Publication Date Title
EP3667496B1 (en) Distributed computing system, data transmission method and device in distributed computing system
US8117615B2 (en) Facilitating intra-node data transfer in collective communications, and methods therefor
US7231638B2 (en) Memory sharing in a distributed data processing system using modified address space to create extended address space for copying data
Jia et al. Improving the performance of distributed tensorflow with RDMA
US7958274B2 (en) Heuristic status polling
US7827024B2 (en) Low latency, high bandwidth data communications between compute nodes in a parallel computer
US20210406068A1 (en) Method and system for stream computation based on directed acyclic graph (dag) interaction
US9251078B2 (en) Acquiring remote shared variable directory information in a parallel computer
US11556378B2 (en) Offloading execution of a multi-task parameter-dependent operation to a network device
  • WO2023169267A1 (zh) Data processing method based on network device, and network device
  • WO2023124543A1 (zh) Data processing method and data processing apparatus for big data
  • WO2023185401A1 (zh) Data processing method, coding and decoding accelerator, and related device
Bertolotti et al. Modular design of an open-source, networked embedded system
US20230153153A1 (en) Task processing method and apparatus
Lai et al. ShmStreaming: A shared memory approach for improving Hadoop streaming performance
Raghavan et al. Cornflakes: Zero-Copy Serialization for Microsecond-Scale Networking
  • JP2023544911A (ja) Method and apparatus for parallel quantum computing
Zhang et al. Modin OpenMPI compute engine
Tardieu et al. X10 for productivity and performance at scale
  • WO2022224409A1 (ja) Accelerator control system, accelerator control method, and accelerator control program
Tupinambá et al. Transparent and optimized distributed processing on gpus
  • CN116107954A (zh) Data processing method and related device
  • CN113901016A (zh) Computable storage system and service method for IO-intensive high-energy physics applications
Ideguchi et al. CHAOS-MCAPI: An Optimized Mechanism to Support Multicore Parallel Programming
Chen et al. C2AS: An agent-based distributed and parallel processing virtual machine

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23777779

Country of ref document: EP

Kind code of ref document: A1