CN110399415A - Data deserialization device and computing terminal - Google Patents

Data deserialization device and computing terminal Download PDF

Info

Publication number
CN110399415A
CN110399415A (application CN201910664325.2A)
Authority
CN
China
Prior art keywords
data
deserialization
task
interface
engine
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910664325.2A
Other languages
Chinese (zh)
Inventor
臧春峰
王斌
严大卫
陈芬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Dingxue Network Technology Co Ltd
Original Assignee
Jiangsu Dingxue Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Dingxue Network Technology Co Ltd filed Critical Jiangsu Dingxue Network Technology Co Ltd
Priority to CN201910664325.2A priority Critical patent/CN110399415A/en
Publication of CN110399415A publication Critical patent/CN110399415A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • G06F16/24568Data stream processing; Continuous queries
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25Integrating or interfacing systems involving database management systems
    • G06F16/258Data format conversion from or to a database

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a data deserialization device and a computing terminal. The device comprises a control unit and at least one acceleration unit controlled serially by the control unit. The acceleration unit consists of a data input module, a data-slicing engine, a deserialization engine cluster, a data fusion engine, and a data output module. The deserialization engine cluster is composed of several parallel Avro-based deserialization engines, which generate the deserialized data; the data fusion engine then splices the deserialized data in order according to the Schema file and outputs the result. The disclosed data deserialization device, and the computing terminal based on it, significantly reduce the CPU, memory, and computer-bus bandwidth resources that a cloud platform or data center must devote to deserialization.

Description

Data deserialization device and computing terminal
Technical field
The present invention relates to the field of computer technology, and in particular to a data deserialization device and a computing terminal based on such a device.
Background art
Serialization is the process of converting the state of a data object into a format that can be stored or transmitted — concretely, converting a data structure into a binary or text stream. The reverse process is deserialization: converting a data stream back into a data object, so that the data can be stored, transmitted, and otherwise operated on by a computer or computer cluster — concretely, converting a binary or text stream back into a data structure that is easy to process and read.
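As a minimal illustration of these two definitions, the sketch below serializes a small record into a binary stream and deserializes it back. The record layout (id, value, name) is a hypothetical example for illustration, not a format from the patent.

```python
# A minimal serialization/deserialization round trip: an (id, value, name)
# record is converted to a binary stream and back. The record layout is
# an illustrative assumption, not part of the patent.
import struct

def serialize(record: dict) -> bytes:
    """Convert the object state into a transmittable binary stream."""
    name = record["name"].encode("utf-8")
    return struct.pack("<qdI", record["id"], record["value"], len(name)) + name

def deserialize(stream: bytes) -> dict:
    """Convert the binary stream back into an easily handled structure."""
    rec_id, value, name_len = struct.unpack_from("<qdI", stream, 0)
    name = stream[struct.calcsize("<qdI"):][:name_len].decode("utf-8")
    return {"id": rec_id, "value": value, "name": name}

original = {"id": 42, "value": 3.5, "name": "sensor-a"}
assert deserialize(serialize(original)) == original
```

The round trip recovers the original structure exactly, which is the property a deserialization device must preserve at scale.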
Both serialization and deserialization, however, consume large amounts of computer resources — CPU, memory, and computer-bus bandwidth in particular. When optimizing the deserialization process for scenarios that deserialize massive volumes of data, such as cloud platforms, data centers, and big-data environments, it therefore becomes especially important to reduce the CPU, memory, and bus-bandwidth resources that these systems must allocate.
The Chinese invention patent of publication No. CN106610922A discloses a "deserialization method and device". In that prior art, path components of a JSON document are used as dictionary keys, adding dictionary functionality to the document path and thereby supporting deserialization of dictionaries. That prior art, however, still belongs to the object-based (Object) serialization and deserialization methods: deserializing the input serialized data depends on a database and on user-defined macros. As a result, its deserialization mechanism is inflexible in scenarios involving massive data exchange and communication, and consumes excessive hardware resources.
In view of this, it is necessary to improve the prior-art data deserialization techniques to solve the above problems.
Summary of the invention
An object of the present invention is to disclose a data deserialization device and a computing terminal that overcome the defects of the prior art, thereby reducing the CPU, memory, and computer-bus bandwidth resources that a cloud platform or data center must devote to deserialization.
To achieve the first object above, the present invention provides a data deserialization device, comprising: a control unit, and at least one acceleration unit controlled serially by the control unit;
the acceleration unit consists of a data input module, a data-slicing engine, a deserialization engine cluster, a data fusion engine, and a data output module; the deserialization engine cluster is composed of several parallel Avro-based deserialization engines;
the data-slicing engine determines the slicing granularity from the Schema file passed in by the data input module, and distributes the sliced serialized data round-robin among the deserialization engines, which generate the deserialized data; the data fusion engine then splices the deserialized data in order according to the Schema file and outputs the result.
As a further improvement of the present invention, the acceleration unit further comprises: a data input interface that receives serialized data and forwards it to the data input module, and a data output interface that receives the deserialized data sent by the data output module and forwards it;
the control unit further comprises: a task submission interface and a task output interface;
wherein the task submission interface and the data input interface respectively receive a task stream composed of task requests and a data stream composed of serialized data.
As a further improvement of the present invention, the control unit is configured with a task dispatching unit, which receives task requests serially and serially directs the deserialization engine cluster in the acceleration unit to perform Avro-based deserialization on the sliced serialized data in parallel.
As a further improvement of the present invention, the control unit further comprises: a task-queue cache unit and/or a completion-queue cache unit;
the task-queue cache unit stores the task requests sent from the task submission interface, and the task dispatching unit fetches them serially;
the completion-queue cache unit caches the results of the deserialization performed by the acceleration unit for the corresponding task requests.
As a further improvement of the present invention, the task submission interface and the data input interface are mutually independent, and are configured as FIFO queue interfaces and/or DMA control interfaces;
the task output interface and the data output interface are mutually independent, and are configured as FIFO queue interfaces and/or DMA control interfaces.
As a further improvement of the present invention, the task submission interface and the data input interface are jointly configured as a DMA control interface;
the task output interface and the data output interface are jointly configured as a DMA control interface.
As a further improvement of the present invention, the data deserialization device is configured as an ASIC chip or an FPGA chip.
As a further improvement of the present invention, the number of deserialization engines in the deserialization engine cluster is determined by the bandwidth of the data input interface or the data output interface.
As a further improvement of the present invention, the number of deserialization engines configured in the deserialization engine cluster is 2^n times the quotient of the bandwidth of the data input interface or data output interface and 4 bytes, where the parameter n takes a positive integer.
As a further improvement of the present invention, the control unit further comprises: an initialization unit that performs the initial configuration of the acceleration unit.
To achieve the second object above, the present invention further provides a computing terminal, comprising:
a source, a target, and the data deserialization device according to any of the above;
the data deserialization device performs deserialization on the data transmitted between the source and the target;
the source and the target include a processor, memory, storage, a cloud platform, a computer cluster, a data center, a physical machine, or a virtual machine.
Compared with the prior art, the beneficial effect of the present invention is that the disclosed data deserialization device, and the computing terminal based on it, significantly reduce the CPU, memory, and computer-bus bandwidth resources that a cloud platform or data center must devote to deserialization.
Description of the drawings
Fig. 1 is an overall block diagram of a data deserialization device of the present invention;
Fig. 2 is a detailed block diagram of a data deserialization device of the present invention;
Fig. 3 is a structural block diagram of the control unit in a variant of the data deserialization device of the present invention;
Fig. 4 is the data-format diagram of a Schema configuration-management command containing a cmd command;
Fig. 5 is the data-format diagram of the serialized data input to the data input module;
Fig. 6 is the data-format diagram of the deserialized data sent by the data output module;
Fig. 7 is an overall block diagram of a variant of the data deserialization device of the present invention;
Fig. 8 is an overall block diagram of another variant of the data deserialization device of the present invention;
Fig. 9 is an overall block diagram of a computing terminal of the present invention.
Specific embodiment
The present invention is described in detail below with reference to the embodiments shown in the accompanying drawings, but it should be stated that these embodiments do not limit the present invention: any functional, methodological, or structural equivalent transformation or substitution made by those of ordinary skill in the art based on these embodiments falls within the protection scope of the present invention.
Embodiment one:
Referring to Fig. 1, Fig. 2, and Fig. 4 to Fig. 6, this embodiment discloses a first specific embodiment of the data deserialization device (hereinafter "the device").
The data deserialization device 100 disclosed in this embodiment (hereinafter "device 100") uses a streaming data-processing method: after slicing, the serialized data is deserialized by a deserialization engine cluster 63 composed of multiple logically parallel Avro-based deserialization engines; after deserialization, the deserialized data is spliced according to the Schema file and output, forming a specific data object. Such a data object may be a readable document or a media file, may take any other form, and may also be any data and/or document form composed from the deserialized data.
The device 100 comprises a control unit 10 and at least one acceleration unit 60 controlled serially by the control unit 10. The acceleration unit 60 consists of a data input module 61 (Din_unit), a data-slicing engine 62 (Data_Scatter), a deserialization engine cluster 63 (Avro_unit), a data fusion engine 64 (Data_Gather), and a data output module 65 (Dout_unit). The deserialization engine cluster 63 is composed of several parallel Avro-based deserialization engines (i.e., deserialization engine 631, deserialization engine 632, ..., deserialization engine 63i shown in Fig. 2). The data-slicing engine 62 determines the slicing granularity from the Schema file passed in by the data input module 61, and distributes the sliced serialized data round-robin among the deserialization engines, which perform deserialization and generate the deserialized data; the data fusion engine 64 then splices the deserialized data in order according to the Schema file and outputs the result.
Deserialization is a bandwidth-hungry application with high bandwidth requirements: at a fixed clock frequency, the wider the data path, the higher the data transfer bandwidth. The deserialization device slices the data in units of the data transfer bandwidth, keeping the data path pipelined while also keeping the load of the deserialization engines balanced. This embodiment uses a data width of 32 bytes. The data-slicing engine 62 distributes data round-robin to 8 deserialization engines in the order 0-7; the data fusion engine 64, also in the order 0-7, reads each piece of deserialized data in turn from the 8 engines — each piece being the deserialization result of one data slice — and splices all the deserialized data together to obtain the deserialized data illustrated in Fig. 6, which is streamed out.
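The scatter/gather scheme above can be sketched in software. This is a minimal model under stated assumptions: the slicing engine deals fixed 32-byte slices round-robin to 8 engine queues, the fusion engine reads them back in the same 0-7 order, and each engine is stubbed as an identity transform (the real engines perform Avro deserialization).

```python
# Software sketch of the round-robin scatter/gather described in the text.
# Engine behavior is stubbed out (identity); slice width and engine count
# follow the embodiment (32-byte data width, engines 0..7).

SLICE_BYTES = 32          # data-path width used in this embodiment
NUM_ENGINES = 8           # engines numbered 0..7

def scatter(data: bytes):
    """Deal 32-byte slices round-robin into per-engine queues."""
    queues = [[] for _ in range(NUM_ENGINES)]
    slices = [data[i:i + SLICE_BYTES] for i in range(0, len(data), SLICE_BYTES)]
    for idx, chunk in enumerate(slices):
        queues[idx % NUM_ENGINES].append(chunk)
    return queues

def gather(queues):
    """Read engines in order 0-7, round by round, and splice the results."""
    out = bytearray()
    round_no = 0
    while any(round_no < len(q) for q in queues):
        for q in queues:
            if round_no < len(q):
                out += q[round_no]   # stub: engine output == its input slice
        round_no += 1
    return bytes(out)

data = bytes(range(256)) * 2          # 512 bytes -> 16 slices, 2 per engine
assert gather(scatter(data)) == data  # order is preserved end to end
```

Reading the engines in the same order they were fed is what lets the fusion engine emit a correctly ordered output stream without any per-slice sequence numbers.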
The Schema file determines the structure of the input serialized-data elements; the data types in a common Schema file include integers, long integers, floating-point numbers, strings, and so on. Serialization is the process of re-encoding the original data described in the Schema file, and deserialization restores the encoded data to the original data (see Fig. 6). The Schema file defines the layout of a table, and the serialized data is simply a set of records of its entries.
The device 100 sits logically in sequence between a source 1 and a target 2: it performs deserialization, in a streaming serial manner, on the data that the source 1 sends to it, forming specific data objects that are finally sent to the target 2. The source 1 and the target 2 are logically independent of each other, and each may be a physical machine, a virtual machine, a database, a storage device, or any other object capable of processing data and deserialized data; the source 1 may also be regarded as a processor and the target 2 as a memory. The data deserialization device 100 is configured as an ASIC chip, an FPGA chip, or another semiconductor device with programmable-logic functionality such as a PAL, GAL, or CPLD.
Referring to Fig. 1 and Fig. 2, the control unit 10 further comprises a task submission interface 20 and a task output interface 30, where the task submission interface 20 and the data input interface 40 respectively receive the task stream composed of task requests and the data stream composed of serialized data. Preferably, the acceleration unit 60 disclosed in this embodiment further comprises: a data input interface 40 that receives serialized data and forwards it to the data input module 61, and a data output interface 50 that receives the deserialized data sent by the data output module 65 and forwards it. When the source 1 issues a deserialization request to the device 100, it sends the task request and the serialized data to the task submission interface 20 and the data input interface 40, respectively. The control unit 10 receives the task request; after the deserialization of the serialized data corresponding to the task request is finished, the processing result is sent to the task output interface 30, while the acceleration unit 60 sends the deserialized data to the data output interface 50 for delivery to the target 2.
In this embodiment, the task submission interface 20 and the data input interface 40 are placed at the input end of the device 100, while the task output interface 30 and the data output interface 50 are placed at its output end, so that deserialization is handled with dual parallelism — task-parallel and data-parallel. This lowers the required operating frequency of the device 100 while also achieving higher data-throughput bandwidth. Lowering the required operating frequency not only relaxes the hardware requirements of the device 100, but also reduces the bandwidth occupied between the cloud platform's compute nodes and upper-layer components (such as Nova, Cinder, or a load balancer), improving the user experience of the cloud platform. Finally, the dual task-parallel/data-parallel processing mode is also easy to embed into, or port to, an operating-system kernel, so it can very conveniently deserialize the massive data of a cloud platform's or big-data platform's compute nodes in a distributed, concurrent fashion (such data may include structured, semi-structured, or unstructured data).
The task submission interface 20 and the data input interface 40 are mutually independent and configured as FIFO queue interfaces; the task output interface 30 and the data output interface 50 are likewise mutually independent and configured as FIFO queue interfaces. A FIFO queue interface is simple and efficient: with a simplified driver interface, it can dock efficiently with the common interfaces (HTTP/POST) of existing big-data platforms.
The control unit 10 is configured with a task dispatching unit 12, which receives task requests serially and serially directs the deserialization engine cluster 63 in the acceleration unit 60 to perform Avro-based deserialization on the sliced serialized data in parallel. The deserialization engine cluster 63 consists of deserialization engine 631, deserialization engine 632, ..., deserialization engine 63i, where the parameter i is an integer greater than or equal to two, so that the cluster 63 deserializes the input serialized data in parallel. Each of these deserialization engines is implemented on the basis of Avro. Avro's Schema file is in JSON format, and the programming languages it relies on may be Java, C++, or Python.
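Since Avro Schema files are plain JSON, a concrete one can be written and checked directly. The record and field names below are hypothetical illustrations (not from the patent), covering the common types mentioned earlier: integer, long, float, and string.

```python
# A hypothetical Avro schema for a table-like record. Avro schemas are
# plain JSON, so they can be built and validated in Python. Field names
# are illustrative assumptions, not from the patent.
import json

SCHEMA_JSON = """
{
  "type": "record",
  "name": "SensorRow",
  "fields": [
    {"name": "id",      "type": "long"},
    {"name": "channel", "type": "int"},
    {"name": "value",   "type": "float"},
    {"name": "label",   "type": "string"}
  ]
}
"""

schema = json.loads(SCHEMA_JSON)
assert schema["type"] == "record"
assert [f["name"] for f in schema["fields"]] == ["id", "channel", "value", "label"]
```

A schema of this shape is what the text calls "defining the layout of a table": each field is one table entry, and the serialized data is a stream of such records.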
When Avro writes data to a file, it stores the Schema file together with the actual data (which in this scenario may be either serialized or deserialized data); a reader can later process the data according to that Schema file, and if the reader uses a different Schema file, Avro provides a compatibility mechanism to resolve the mismatch. When Avro is used for RPC, the source 1 and target 2 in Fig. 1 first exchange Schema files through a handshake protocol before transmitting data, and reach agreement on Schema-file consistency. In that case, the source 1 of Fig. 1 is understood as the client and the target 2 as the server. Avro is a serialization framework whose design philosophy and programming model closely resemble Thrift; it is a top-level Apache project. Avro also provides an RPC mechanism, so it can be used for data storage and RPC interaction without generating extra API code. It is therefore well suited to cloud platforms and big-data platforms: it accesses them efficiently without occupying their computing resources, network resources, virtual bandwidth, or virtual IP addresses, and thus significantly reduces the CPU, memory, and computer-bus bandwidth resources that a cloud platform or data center must devote to deserialization.
Referring to Fig. 1 and Fig. 2, the control unit 10 receives the task requests and configuration information that the source 1 submits when requesting deserialization of serialized data. In this embodiment, to improve the control unit's capacity for receiving task requests, the control unit 10 further comprises: a task-queue cache unit 11 (RQ) and a completion-queue cache unit 13 (CQ). The task-queue cache unit 11 receives and stores the task requests sent from the task submission interface 20 and passes them serially to the task dispatching unit 12. The descriptor of a task request consists of fields such as cmd (command), valid (command-valid flag), out_dblk (output data-block size), scr_dlen (source-data length, in bytes), drq_id (data-processing-request number, software-defined), and rev (reserved). The descriptor format of a task request is shown in Table 1 below.
Table 1: descriptor format of a task request
The more detailed coded format of the cmd (command) field of Table 1 is shown in Table 2 below. In Table 2, Host → avro denotes a cmd sent by the task dispatching unit 12 to the acceleration unit 60, and avro → Host denotes a cmd fed back by the acceleration unit 60 to the task dispatching unit 12.
Table 2: coded format of the cmd (command) field
The two tables above uniformly define the opcodes for the configuration command schema_cfg sent to the acceleration unit 60, the task request DRQ, the task-request cancellation DRQ_NV, and the outputs of the acceleration unit 60, such as cancellation-finished NV_OVER, task processing result DRQ_CQ, and result-data end-of-transmission DATA_OVER. The coded format of the cmd command is not limited to the definitions listed in Table 2 and can be customized and extended according to software requirements.
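The table contents are not reproduced here, so the field widths in the sketch below are assumptions for illustration only; it shows how host software might pack the descriptor fields named above (cmd, valid, out_dblk, scr_dlen, drq_id, rev) into a fixed-size little-endian structure.

```python
# Hypothetical packing of the task-request descriptor. Field widths are
# NOT specified in the text reproduced here; the layout below (8-bit cmd,
# 8-bit valid, 16-bit out_dblk, 32-bit scr_dlen, 16-bit drq_id, 16-bit
# reserved) is an assumption for illustration.
import struct

DESC_FMT = "<BBHIHH"  # little-endian: cmd, valid, out_dblk, scr_dlen, drq_id, rev

def pack_descriptor(cmd, valid, out_dblk, scr_dlen, drq_id, rev=0):
    return struct.pack(DESC_FMT, cmd, valid, out_dblk, scr_dlen, drq_id, rev)

def unpack_descriptor(raw):
    cmd, valid, out_dblk, scr_dlen, drq_id, rev = struct.unpack(DESC_FMT, raw)
    return {"cmd": cmd, "valid": valid, "out_dblk": out_dblk,
            "scr_dlen": scr_dlen, "drq_id": drq_id, "rev": rev}

desc = pack_descriptor(cmd=0x01, valid=1, out_dblk=512, scr_dlen=4096, drq_id=7)
assert len(desc) == 12
assert unpack_descriptor(desc)["scr_dlen"] == 4096
```

A fixed-size descriptor of this kind is what a FIFO queue interface or DMA engine would transfer verbatim between host software and the control unit.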
The task-queue cache unit 11 stores the task requests sent from the task submission interface 20, and the task dispatching unit 12 fetches them serially. After the acceleration unit 60 finishes deserializing the input serialized data, it notifies the task dispatching unit 12 of the processing result (namely, the deserialized data generated by the deserialization engines); the task dispatching unit 12 sends the processing result to the completion-queue cache unit 13, which finally delivers the deserialization result to the task output interface 30 and responds to the target 2. The completion-queue cache unit 13 caches the results of the deserialization performed by the acceleration unit 60 for the corresponding task requests.
In this embodiment, configuring the task-queue cache unit 11 and the completion-queue cache unit 13 in the control unit 10 improves the control unit's ability to receive task requests and dispatch processing results in parallel. Preferably, referring to the control unit 10 shown in Fig. 3, the device 100 further comprises: an initialization unit 14 that performs the initial configuration of the acceleration unit 60. The initialization unit 14 is controlled by the task-queue cache unit 11, and the task requests for initial configuration are issued to the acceleration unit 60 through the initialization unit 14. In this embodiment, the initial configuration means writing the configuration information of the Schema file — such as the entry length and the specific data type of each entry — to the acceleration unit 60. After obtaining the initial-configuration data, the acceleration unit 60 slices the input data entry by entry according to the Schema file, distributes the sliced sub-serialized data together with the corresponding Schema-file fragments to the deserialization engine cluster 63 (avro_unit), and the cluster 63 deserializes the input data according to a deserialization algorithm (for example, a serialized binary-tree algorithm) to obtain the deserialized data.
Meanwhile in the present embodiment, task Dispatching Unit 12 is also responsible for multiple antitone sequences parallel in accelerator module 60 Change the task switching that engine carries out unserializing processing in unserializing treatment process.Task switching can draw unserializing Unserializing engine included in cluster 63 is held up to be monitored, and for by data cutting engine 62 to serialized data (i.e. Input data) cutting is formed by subsequenceization data, according to the data type of input data (such as signless long data, floating-point Data, Boolean type data), the quantity of data length, unserializing engine included in unserializing engine cluster 63, with true The fixed cutting granularity to serialized data.Simultaneously as matching in Schema file received by multiple unserializing engines It is identical to set administration order, therefore can serially carry out unserializing processing to a biggish serialized data, and does not need pair Thus multiple unserializing engines carry out repeating configuration, improve accelerator module 60 significantly and execute to biggish serialized data The treatment effeciency of unserializing data.Such technical effect not only has higher treatment effect to structural data, simultaneously also Can to unstructured data (such as Word document, ERP system data or financial system data) and semi-structured data The effect of (such as journal file, XML document, JSON document) unserializing processing with higher.
Referring to Fig. 4, which shows the position of the cmd command within the configuration-management command contained in the Schema file, as well as the length of the Schema configuration-management command: the cmd command is aligned within the corresponding Schema configuration-management command. Data in the format shown in Fig. 4 is written to the acceleration unit 60 by the task dispatching unit 12 in the control unit 10. As shown in Fig. 5, the serialized data (i.e., the input data) input to the data input module 61, including the cmd command and the configuration-management command, has a data length of 32 bytes. The data format of the output data processed by the deserialization engine cluster 63 (i.e., the deserialized data) is shown in Fig. 6, with the cmd command located at the tail of the output data. Meanwhile, after deserialization is finished, the acceleration unit 60 notifies the task dispatching unit 12 in the control unit 10 of the task processing result, which is finally delivered to the target 2 through the completion-queue cache unit 13 and the task output interface 30. Specifically, in this embodiment, the descriptor format of the task processing result is shown in Table 3 below:
Table 3: descriptor format of the task processing result
Although in this embodiment the device 100 is provided with a task submission interface 20, a data input interface 40, a task output interface 30, and a data output interface 50 so as to process tasks and data with dual parallelism, the applicant also points out that the task submission interface 20 may be merged with the data input interface 40, and the task output interface 30 with the data output interface 50; the merged unified interface may use either a FIFO queue interface or a DMA control interface. In this example, the data format of the Schema file issued by the task dispatching unit 12 to the acceleration unit 60 is shown in Table 4 below.
Table 4: data format of the Schema file issued to the acceleration unit 60
The number of deserialization engines in the deserialization engine cluster 63 is determined by the bandwidth of the data input interface or the data output interface. Specifically, in this embodiment, the number of deserialization engines configured in the cluster 63 is 2^n times the quotient of the bandwidth of the data input interface 40 or data output interface 50 and 4 bytes, where the parameter n takes a positive integer.
For example, assuming the bandwidth of the data input interface 40 is 128 bit (16 bytes), the deserialization engine cluster 63 can be configured with a minimum of four deserialization engines and a maximum of eight. Meanwhile, to balance the device's serialized-data input capability against its deserialized-data output capability during deserialization, the bandwidths of the data input interface 40 and the data output interface 50 are preferably configured identically, and the task dispatching unit 12 determines the number of deserialization engines required during deserialization from the specification of the input data in the task request and the slicing granularity.
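The engine-count example above can be written as a small calculation. This is a sketch under the reading that the base count is the interface bandwidth expressed in 4-byte units, and the maximum is a power-of-two multiple of it; `engine_count_range` and its `max_doublings` parameter are illustrative names, not from the patent.

```python
# Sketch of the engine-count rule: a 128-bit (16-byte) interface divided
# into 4-byte units gives a base of 4 engines, and the 2^n multiple from
# the text doubles it (here once) to 8 — matching the four-to-eight
# example in this embodiment.

def engine_count_range(bandwidth_bits: int, max_doublings: int = 1):
    base = bandwidth_bits // 8 // 4          # bandwidth in 4-byte units
    return base, base * (2 ** max_doublings)

lo, hi = engine_count_range(128)
assert (lo, hi) == (4, 8)   # matches the four-to-eight-engine example
```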
The data deserialization device disclosed in this embodiment significantly reduces the CPU, memory, and computer-bus bandwidth resources that a cloud platform or data center must devote to deserialization. At the same time, deserializing the input serialized data (i.e., the input data) no longer depends on a database or on user-defined macros, making deserialization more flexible in scenarios involving massive data exchange and communication.
Embodiment two:
Referring to Fig. 7, the present embodiment further discloses a variation of the data deserialization device, device 100a (referred to in this embodiment as "device 100a"). The main difference between the device 100a disclosed in this embodiment and the device 100 disclosed in Embodiment 1 is that, in this embodiment, the data input interface 40 of Embodiment 1 is replaced with a DMA control interface 40a, and the data output interface 50 of Embodiment 1 is replaced with a DMA control interface 50a. The task submission interface 20 is independent of the DMA control interface 40a and is configured as a FIFO queue interface. Likewise, the task output interface 30 is independent of the DMA control interface 50a and is configured as a FIFO queue interface. A DMA control interface can perform read/write operations on memory more efficiently, which is advantageous for building high-performance embedded algorithms and networks.
In particular, in this embodiment, multiple accelerator units in a logically parallel architecture, i.e., accelerator units 60_1 through 60_i (where the parameter i is a positive integer greater than or equal to 1), are connected between the DMA control interface 40a and the DMA control interface 50a; the multiple accelerator units are controlled simultaneously by the control unit 10, so as to receive the Schema file issued by the control unit 10.
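A software analogue of this control flow, under the stated assumption that every accelerator unit first receives the same Schema file and tasks are then dispatched serially, might look like the following (class and method names are illustrative, not from the patent):

```python
from queue import Queue

class AcceleratorUnit:
    def __init__(self, ident: int):
        self.ident = ident
        self.schema = None

    def load_schema(self, schema: dict) -> None:
        self.schema = schema           # each unit receives the same Schema file

    def process(self, task: bytes) -> str:
        assert self.schema is not None, "schema must be issued before any task"
        return f"unit{self.ident}:{len(task)}B"

class ControlUnit:
    """Issues one Schema to all units, then dispatches queued tasks serially."""
    def __init__(self, units: list[AcceleratorUnit]):
        self.units = units
        self.task_queue: Queue = Queue()   # FIFO task submission interface

    def issue_schema(self, schema: dict) -> None:
        for unit in self.units:            # broadcast to units 60_1..60_i
            unit.load_schema(schema)

    def run(self) -> list[str]:
        results, i = [], 0
        while not self.task_queue.empty():
            # serial dispatch, round robin over the parallel units
            results.append(self.units[i % len(self.units)].process(self.task_queue.get()))
            i += 1
        return results

units = [AcceleratorUnit(i) for i in range(1, 4)]       # accelerator units 60_1..60_3
ctrl = ControlUnit(units)
ctrl.issue_schema({"type": "record", "name": "Example"})
for payload in (b"ab", b"cdef", b"g"):
    ctrl.task_queue.put(payload)
print(ctrl.run())  # ['unit1:2B', 'unit2:4B', 'unit3:1B']
```

The point of the sketch is the ordering constraint: the schema broadcast must complete before any task is dispatched, and dispatch itself is serial even though the units process independently.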
For the parts of the technical solution of the device 100a disclosed in this embodiment that are the same as in Embodiment 1, refer to the description of Embodiment 1; details are not repeated here.
Embodiment three:
Referring to Fig. 8, the present embodiment further discloses a variation of the data deserialization device, device 100b (referred to in this embodiment as "device 100b"). The main difference between the device 100b disclosed in this embodiment and the device 100 of Embodiment 1 and/or the device 100a of Embodiment 2 is that, in this embodiment, only the task submission interface 20 is configured as a FIFO queue interface; the data input interface 40 of Embodiment 1 is configured as a DMA control interface 40a, the task output interface 30 of Embodiment 1 is configured as a DMA control interface 30c, and the data output interface 50 disclosed in Embodiment 1 is configured as a DMA control interface 50a. This improves how the parallel processing results (i.e., the deserialized data) of the tasks corresponding to deserialization requests in the device 100b are collected at the target side 2.
For the parts of the technical solution of the device 100b disclosed in this embodiment that are the same as in Embodiment 1 and/or Embodiment 2, refer to the description of Embodiment 1; details are not repeated here.
Embodiment four:
Referring to Fig. 9, the computing terminal 200 disclosed in this embodiment includes the data deserialization device disclosed in any one or more of Embodiments 1 to 3 above.
The computing terminal 200 comprises: a source side 1, a target side 2, and the data deserialization device 100 (or device 100a, or device 100b) disclosed in any one or more of Embodiments 1 to 3 above. The data deserialization device performs deserialization processing on the data transmitted between the source side 1 and the target side 2. The source side 1 and the target side 2 include a processor, a memory, a storage, a cloud platform, a computer cluster, a data center, a physical machine, or a virtual machine.
In this embodiment, the role of the processor 31 in the computing terminal 200 can be identified as the source side 1, and the role of the memory 32 in the computing terminal 200 can be identified as the target side 2. The processor 31, the memory 32, and the device 100 are coupled to a system bus 33.
For the data deserialization device 100 (or device 100a, or device 100b) disclosed in this embodiment, refer to Embodiments 1 to 3 above; details are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed system, device, and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative; the division into modules or units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual couplings, direct couplings, or communication connections displayed or discussed may be indirect couplings or communication connections through interfaces, devices, or units, and may be electrical, mechanical, or of other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of a given embodiment.
In addition, the functional units in the various embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the essence of the technical solution of the present invention, or the part contributing over the prior art, or all or part of the technical solution, may be embodied in the form of a software product; the software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) or a processor to execute all or part of the steps of the methods of the various embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), or a magnetic or optical disk.
The detailed descriptions listed above are merely specific illustrations of feasible embodiments of the invention and are not intended to limit its scope of protection; all equivalent implementations or modifications made without departing from the technical spirit of the invention shall fall within its scope of protection.
It is obvious to those skilled in the art that the invention is not limited to the details of the above exemplary embodiments, and that the invention can be realized in other specific forms without departing from its spirit or essential attributes. Therefore, from whatever point of view, the embodiments should be regarded as exemplary and non-restrictive; the scope of the invention is defined by the appended claims rather than by the above description, and all changes falling within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims shall not be construed as limiting the claims concerned.
In addition, it should be understood that although this specification is described in terms of embodiments, not every embodiment contains only one independent technical solution; this manner of description is adopted merely for clarity. Those skilled in the art should take the specification as a whole, and the technical solutions in the various embodiments may also be suitably combined to form other embodiments understandable to those skilled in the art.

Claims (11)

1. A data deserialization device (100), characterized by comprising: a control unit (10), and at least one accelerator unit (60) controlled in a serial fashion by the control unit (10);
the accelerator unit (60) is composed of a data input module (61), a data cutting engine (62), a deserialization engine cluster (63), a data fusion engine (64), and a data output module (65), the deserialization engine cluster (63) being formed of several parallel deserialization engines based on AVRO;
the data cutting engine (62) determines a cutting granularity from the Schema file passed in by the data input module (61) and distributes the serialized data stream after cutting among the several deserialization engines, so that the deserialization engines generate deserialized data, and the data fusion engine (64) splices the deserialized data in order according to the Schema file before output.
2. The data deserialization device according to claim 1, characterized in that the accelerator unit (60) further comprises: a data input interface (40) that receives serialized data and forwards it to the data input module (61), and a data output interface (50) that receives the deserialized data sent by the data output module (65) and forwards it;
the control unit (10) further comprises: a task submission interface (20) and a task output interface (30);
wherein the task submission interface (20) and the data input interface (40) respectively receive a task stream composed of task requests and a data stream composed of serialized data.
3. The data deserialization device according to claim 1, characterized in that the control unit (10) is configured with a task dispatching unit (12); the task dispatching unit (12) receives task requests in a serial fashion and, in a serial fashion, controls the deserialization engine cluster (63) in the accelerator unit (60) to concurrently perform AVRO-based deserialization processing on the serialized data after cutting.
4. The data deserialization device according to claim 3, characterized in that the control unit (10) further comprises: a task queue cache unit (11) and/or a completion queue cache unit (13);
the task queue cache unit (11) stores the multiple task requests sent from the task submission interface (20), the task requests being called in a serial fashion by the task dispatching unit (12);
the completion queue cache unit (13) caches the results of the task requests corresponding to the deserialization processing performed by the accelerator unit (60).
5. The data deserialization device according to claim 1, characterized in that the task submission interface (20) and the data input interface (40) are independent of each other and are configured as a FIFO queue interface and/or a DMA control interface;
the task output interface (30) and the data output interface (50) are independent of each other and are configured as a FIFO queue interface and/or a DMA control interface.
6. The data deserialization device according to claim 1, characterized in that the task submission interface (20) and the data input interface (40) are jointly configured as a DMA control interface;
the task output interface (30) and the data output interface (50) are jointly configured as a DMA control interface.
7. The data deserialization device according to any one of claims 1 to 6, characterized in that the data deserialization device is configured as an ASIC chip or an FPGA chip.
8. The data deserialization device according to claim 7, characterized in that the number of deserialization engines included in the deserialization engine cluster (63) is determined by the bandwidth of the data input interface or the data output interface.
9. The data deserialization device according to claim 8, characterized in that the number of deserialization engines configured in the deserialization engine cluster (63) is 2^n times the ratio of the bandwidth of the data input interface or the data output interface to 4 bytes, where the parameter n takes a positive integer value.
10. The data deserialization device according to claim 7, characterized in that the control unit (10) further comprises: an initialization unit (14) that performs initial configuration of the accelerator unit (60).
11. A computing terminal (200), characterized by comprising:
a source side (1), a target side (2), and the data deserialization device (100) according to any one of claims 1 to 10;
the data deserialization device performs deserialization processing on the data transmitted between the source side (1) and the target side (2);
the source side (1) and the target side (2) include a processor, a memory, a storage, a cloud platform, a computer cluster, a data center, a physical machine, or a virtual machine.
CN201910664325.2A 2019-07-23 2019-07-23 A kind of data unserializing device and computing terminal Pending CN110399415A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910664325.2A CN110399415A (en) 2019-07-23 2019-07-23 A kind of data unserializing device and computing terminal


Publications (1)

Publication Number Publication Date
CN110399415A true CN110399415A (en) 2019-11-01

Family

ID=68324807

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910664325.2A Pending CN110399415A (en) 2019-07-23 2019-07-23 A kind of data unserializing device and computing terminal

Country Status (1)

Country Link
CN (1) CN110399415A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3029581A1 (en) * 2013-07-30 2016-06-08 Fujitsu Limited Processing program, processing system, and processing method
CN109492657A (en) * 2018-09-18 2019-03-19 平安科技(深圳)有限公司 Handwriting samples digitizing solution, device, computer equipment and storage medium
CN110035103A (en) * 2018-01-12 2019-07-19 宁波中科集成电路设计中心有限公司 A kind of transferable distributed scheduling system of internodal data


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YANG, LI et al.: "A brief analysis of binary serialization and deserialization in .NET Framework", 《科技信息(科学教研)》 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111984679A (en) * 2020-07-02 2020-11-24 中科驭数(北京)科技有限公司 Access method, device, host, system and medium of hardware acceleration database
CN111984679B (en) * 2020-07-02 2021-06-04 中科驭数(北京)科技有限公司 Access method, device, host, system and medium of hardware acceleration database
CN113434149A (en) * 2021-07-07 2021-09-24 腾讯科技(深圳)有限公司 Application program generating and loading method, device and medium
CN113434149B (en) * 2021-07-07 2023-09-08 腾讯科技(深圳)有限公司 Application program generating and loading method, device and medium
WO2023084418A3 (en) * 2021-11-12 2023-07-13 Alpha Sanatorium Technologies Inc. Method and system for optimizing transmission of serialized data using dynamic, adaptive slicing and reduction of serialized data
GB2627090A (en) * 2021-11-12 2024-08-14 Alpha Sanatorium Tech Inc Method and system for optimizing transmission of serialized data using dynamic, adaptive slicing and reduction of serialized data


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191101