CN103746771A - Data format conversion method of channel coding and decoding based on GPP and SIMD technologies


Info

Publication number
CN103746771A
CN103746771A CN201310729424.7A CN201310729424A CN103746771A CN 103746771 A CN103746771 A CN 103746771A CN 201310729424 A CN201310729424 A CN 201310729424A CN 103746771 A CN103746771 A CN 103746771A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201310729424.7A
Other languages
Chinese (zh)
Other versions
CN103746771B (en)
Inventor
牛凯
丁忆南
贺志强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Application filed by Beijing University of Posts and Telecommunications
Priority to CN201310729424.7A
Publication of CN103746771A
Application granted
Publication of CN103746771B
Legal status: Active

Abstract

The invention relates to a data format conversion method for channel coding and decoding based on GPP and SIMD technologies. Before encoding, an input data stream A0A1A2...An-1 with a word length of n bytes is first packed into s SIMD-format vectors of length M, so that the data is laid out in parallel and suited to SIMD instructions; a parallel "mapping" operation is executed, then a parallel "AND" operation, and finally a parallel "select smaller value" operation, generating an output data stream B0B1B2...B8n-1 with a word length of 8n bytes. After decoding, an input data stream C0C1C2...C8n-1 with a word length of 8n bytes is first packed into 8s SIMD-format vectors of length M; a parallel "equality test" operation is executed, then a parallel "highest-bit combination" operation, generating an output data stream D0D1D2...Dn-1 with a word length of n bytes. By using SIMD parallel instructions, the method greatly accelerates data format conversion while preserving transmission performance and the correctness of encoding and decoding. The method is also low in cost, portable, easy to debug, and easy to upgrade.

Description

Data format conversion method for channel coding and decoding based on GPP and SIMD technologies
Technical field
The present invention relates to a data format conversion method for channel coding and decoding based on general-purpose processor (GPP) and single instruction, multiple data (SIMD) technologies, and belongs to the technical field of coding and decoding for communications.
Background
Channel codes such as Turbo codes, low-density parity-check (LDPC) codes, and convolutional codes have superior error-correcting performance and, over the past ten years, have been widely adopted in high-speed wireless communication standards (3G and 4G systems). For example, Long Term Evolution and LTE-Advanced (LTE/LTE-A) systems use Turbo codes and convolutional codes, and 802.11 systems use LDPC codes and convolutional codes.
Encoding and decoding such codes has high time complexity and consumes a large amount of computing time, yet emerging wireless communication standards target large data volumes. Traditional implementations are mostly based on hardware processing platforms, which have several problems: high cost, limited applicability of each platform, tedious debugging, long development cycles, and inconvenient program upgrades.
Over the past five years, software-defined radio built on GPP platforms has gradually matured. While it overcomes the above shortcomings of hardware platforms, software radio is bottlenecked by arithmetic speed. Reducing the computational complexity and latency introduced by channel coding and decoding has therefore become the main way to break the transmission-rate bottleneck of the communication system.
On a GPP platform, data is transferred and stored in units of bytes, so most computation uses the byte as its smallest unit. In a communication system, by contrast, information is stored and processed bit by bit: one bit represents one unit of information, which is referred to as the bit format. The most efficient channel coding and decoding algorithms currently implemented on GPP platforms all use the byte as the smallest computing unit, that is, one byte represents one bit of information, which is referred to as the byte format. Channel coding must therefore convert the data stream from bit format to byte format, and channel decoding must convert it back. This mutual conversion between bit format and byte format is an unavoidable data format conversion function at the front end of channel coding and at the back end of channel decoding.
The purpose of the data format conversion for channel coding is to convert a bit-format input data stream $A_0 A_1 A_2 \dots A_{n-1}$ with a word length of n bytes into a byte-format output data stream $B_0 B_1 B_2 \dots B_{8n-1}$ with a word length of 8n bytes. Every $A_g$ ($0 \le g \le n-1$) and $B_h$ ($0 \le h \le 8n-1$) is one byte, that is, 8 bits; the smaller its index, the earlier its conversion is completed. Here $A_g = (a_{8g} a_{8g+1} a_{8g+2} a_{8g+3} a_{8g+4} a_{8g+5} a_{8g+6} a_{8g+7})$, where each element a is one bit and a smaller index means a position closer to the low-order end of its byte (bytes are written low-order bit first). $B_h = (a_h 0000000)$, where $a_h$ is the h-th bit of input stream A and occupies the lowest bit of byte $B_h$.
The data format conversion for channel decoding performs the inverse of the above: it converts a byte-format input data stream $C_0 C_1 C_2 \dots C_{8n-1}$ with a word length of 8n bytes into a bit-format output data stream $D_0 D_1 D_2 \dots D_{n-1}$ with a word length of n bytes. Every $C_l$ ($0 \le l \le 8n-1$) and $D_e$ ($0 \le e \le n-1$) is one byte, that is, 8 bits; the smaller its index, the earlier its conversion is completed. Here $C_l = (d_l 0000000)$, where $d_l$ is the l-th bit of output stream D and occupies the lowest bit of byte $C_l$; and $D_e = (d_{8e} d_{8e+1} d_{8e+2} d_{8e+3} d_{8e+4} d_{8e+5} d_{8e+6} d_{8e+7})$, where each element d is one bit and a smaller index means a position closer to the low-order end of its byte.
The conventional method for the channel-coding data format conversion on a GPP uses shift and AND operations. Taking byte $A_0 = (a_0 a_1 a_2 a_3 a_4 a_5 a_6 a_7)$ as an example, converting it into the 8 bytes $B_0 B_1 B_2 B_3 B_4 B_5 B_6 B_7$ requires looping 8 times, each iteration computing $B_f = (A_0 \ll f) \,\&\, 1$, where f is the byte index and $0 \le f \le 7$. For f = 4, $B_4 = (A_0 \ll 4) \,\&\, 1 = (a_4 a_5 a_6 a_7 0000) \,\&\, (10000000) = (a_4 0000000)$. (Because bytes are written low-order bit first here, "$\ll f$" moves bits toward the low-order end, which corresponds to a right shift by f in conventional high-order-first notation.) Converting the input stream $A_0 A_1 A_2 \dots A_{n-1}$ into the output stream $B_0 B_1 B_2 \dots B_{8n-1}$ therefore requires repeating this operation n times, each iteration converting one input byte: the g-th iteration ($0 \le g \le n-1$) converts $A_g$ into $B_{8g} B_{8g+1} B_{8g+2} B_{8g+3} B_{8g+4} B_{8g+5} B_{8g+6} B_{8g+7}$. Each such iteration contains 8 sub-iterations that each convert one bit: the f-th sub-iteration ($0 \le f \le 7$) computes $B_{8g+f} = (A_g \ll f) \,\&\, 1$.
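A minimal Python sketch of this conventional loop, for illustration only. The function name `bits_to_bytes_scalar` is hypothetical, and the right shift reflects an interpretation of the low-order-bit-first notation in the worked example above; it is not code from the source.

```python
def bits_to_bytes_scalar(a: bytes) -> bytes:
    """Expand each input byte into 8 output bytes, one bit per byte, low-order bit first."""
    out = bytearray()
    for byte in a:              # outer loop: one iteration per input byte A_g
        for f in range(8):      # inner loop: one bit per sub-iteration
            # The text's "<< f" is written low-order bit first, so it is a right shift here.
            out.append((byte >> f) & 1)
    return bytes(out)


print(list(bits_to_bytes_scalar(bytes([0b00010011]))))  # [1, 1, 0, 0, 1, 0, 0, 0]
```

Each input byte costs eight shift-and-mask steps, which is the per-byte overhead the SIMD method is meant to reduce.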
The conventional method for the channel-decoding data format conversion on a GPP uses shift and XOR operations. Taking the 8 bytes $C_0 C_1 C_2 C_3 C_4 C_5 C_6 C_7$ as an example, where $C_q = (d_q 0000000)$, they are converted into one byte $D_0$ as follows: first set $D_0 = 0 = (00000000)$, then loop 8 times, each iteration computing $D_0 = D_0 \oplus (C_q \gg q)$, where q is the byte index and $0 \le q \le 7$. For q = 4, $D_0 = D_0 \oplus (C_4 \gg 4) = (d_0 d_1 d_2 d_3 0000) \oplus (0000 d_4 000) = (d_0 d_1 d_2 d_3 d_4 000)$. (In this low-order-bit-first notation, "$\gg q$" moves the bit toward the high-order end, which corresponds to a left shift by q in conventional notation.) Converting the input stream $C_0 C_1 C_2 \dots C_{8n-1}$ into the output stream $D_0 D_1 D_2 \dots D_{n-1}$ requires n such iterations, each converting 8 input bytes: the e-th iteration ($0 \le e \le n-1$) converts $C_{8e} C_{8e+1} C_{8e+2} C_{8e+3} C_{8e+4} C_{8e+5} C_{8e+6} C_{8e+7}$ into $D_e$. It first sets $D_e = 0$ and then runs 8 sub-iterations, each converting one bit: the q-th sub-iteration ($0 \le q \le 7$) computes $D_e = D_e \oplus (C_{8e+q} \gg q)$.
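The inverse loop can be sketched the same way. The function name `bytes_to_bits_scalar` is hypothetical, the left shift again follows the low-order-bit-first reading of the example, and the `& 1` mask is a defensive addition that the source does not describe.

```python
def bytes_to_bits_scalar(c: bytes) -> bytes:
    """Pack every 8 input bytes (each holding one bit in its lowest position) into 1 byte."""
    if len(c) % 8:
        raise ValueError("input length must be a multiple of 8")
    out = bytearray()
    for e in range(len(c) // 8):    # outer loop: one output byte D_e per iteration
        d = 0                       # D_e starts at zero
        for q in range(8):          # inner loop: fold in one bit per sub-iteration
            # The text's ">> q" is written low-order bit first, so it is a left shift here.
            d ^= (c[8 * e + q] & 1) << q
        out.append(d)
    return bytes(out)


print(list(bytes_to_bits_scalar(bytes([1, 1, 0, 0, 1, 0, 0, 0]))))  # [19]
```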
The drawback of both conventional methods is that each operation processes only one byte, so the shift, AND, and XOR operations are inefficient. Improving the efficiency of encoding and decoding and solving the processing-speed problem has therefore become a focus of attention for practitioners in the field.
Single instruction, multiple data (SIMD) is a technique in which one controller drives multiple processing elements that simultaneously perform the same operation on each element of a group of data (also called a "data vector"), achieving spatial parallelism. In microprocessors, SIMD means one controller driving several parallel processing elements; examples include Intel's MMX and SSE and AMD's 3DNow!.
Summary of the invention
In view of this, the object of the invention is to provide a data format conversion method for channel coding and decoding based on general-purpose processor (GPP) and single instruction, multiple data (SIMD) technologies. While preserving transmission performance and the correctness of encoding and decoding, the method redesigns the conversion algorithm to suit SIMD and uses SIMD parallel instructions to greatly accelerate conversion. Because the invention is implemented on GPP chips, it is low in cost, portable, simple to debug, and easy to upgrade.
To achieve the above object, the invention provides a data format conversion method for use before channel coding, based on GPP and SIMD technologies, characterized as follows. The input data stream $A_0 A_1 A_2 \dots A_{n-1}$, with a word length of n bytes, is first packed into s SIMD-format vectors of length M so that the data is laid out in parallel and suited to SIMD instructions. A parallel "mapping" operation then copies each byte of the input stream into 8 bytes, producing the first intermediate data stream $E_0 E_1 \dots E_{8n-1}$. A parallel AND operation on the first intermediate stream extracts each bit, producing the second intermediate data stream $F_0 F_1 \dots F_{8n-1}$. Finally, a parallel "select smaller value" operation on the second intermediate stream moves each bit to the lowest bit of its byte, generating the output data stream $B_0 B_1 B_2 \dots B_{8n-1}$ with a word length of 8n bytes. Here the byte length is $n = M \times s$, where the natural numbers M and s are the length and the number of the SIMD packed vectors, respectively.
The method comprises the following steps:
(1) Execute the parallel "mapping" SIMD instruction to replicate the data:
Apply the "mapping" SIMD instruction to each of the s SIMD packed vectors in turn. The X input of the "mapping" instruction is the packed data $A_{tM+0}, A_{tM+1}, \dots, A_{tM+M-1}$, and each X input takes part in 8 inner-loop iterations. Here t is the outer-loop index of the "mapping" instruction, with range [0, s-1], and u is the inner-loop index, with range [0, 7]. In the u-th inner iteration of the t-th outer iteration, the Y input of the "mapping" instruction is:
$$Y = \{\underbrace{\tfrac{M}{8}u, \dots, \tfrac{M}{8}u}_{8\ \text{times}},\ \underbrace{\tfrac{M}{8}u+1, \dots, \tfrac{M}{8}u+1}_{8\ \text{times}},\ \dots,\ \underbrace{\tfrac{M}{8}u+\tfrac{M}{8}-1, \dots, \tfrac{M}{8}u+\tfrac{M}{8}-1}_{8\ \text{times}}\}$$
After this "mapping" instruction completes, the resulting first intermediate data stream Z is:
$E_{8tM+uM+0}, E_{8tM+uM+1}, \dots, E_{8tM+uM+M-1}$
(2) Execute the parallel "AND" SIMD instruction to extract the bits:
Apply the parallel "AND" SIMD instruction to each of the 8s packed vectors in turn. The X input of the "AND" instruction is the first intermediate data stream Z: $E_{rM+0}, E_{rM+1}, \dots, E_{rM+M-1}$, and the Y input of the "AND" instruction is:
{1, 2, 4, 8, 16, 32, 64, 128, 1, 2, 4, 8, 16, 32, 64, 128, ..., 1, 2, 4, 8, 16, 32, 64, 128}, where r is the iteration index of the "AND" instruction, with range [0, 8s-1]. After this "AND" instruction completes, the resulting second intermediate data stream Z is: $F_{rM+0}, F_{rM+1}, \dots, F_{rM+M-1}$;
(3) Execute the parallel "select smaller value" SIMD instruction to move each significant bit to the lowest bit of its byte:
Apply the "select smaller value" SIMD instruction to each of the 8s SIMD packed vectors in turn. The X input of the "select smaller value" instruction is the packed data $F_{jM+0}, F_{jM+1}, \dots, F_{jM+M-1}$, and the Y input is set to the packed data {1, 1, ..., 1}, where j is the iteration index of the "select smaller value" instruction, with range [0, 8s-1]. After this instruction completes, the resulting final output data stream Z is: $B_{jM+0}, B_{jM+1}, \dots, B_{jM+M-1}$.
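The three steps can be traced for a single input byte. The following worked equations are a sketch under the index conventions above, assuming that the mask element applied at stream position 8g+f is $2^f$ (which holds whenever M is a multiple of 8), for $0 \le g \le n-1$ and $0 \le f \le 7$:

```latex
\begin{align*}
E_{8g+f} &= A_g && \text{(mapping: byte } A_g \text{ is copied eight times)}\\
F_{8g+f} &= E_{8g+f} \,\&\, 2^{f} = a_{8g+f}\cdot 2^{f} && \text{(AND: bit } f \text{ is isolated in place)}\\
B_{8g+f} &= \min\left(F_{8g+f},\, 1\right) = a_{8g+f} && \text{(select smaller value: result is 0 or 1)}
\end{align*}
```

The minimum with 1 stands in for a per-element variable shift: any nonzero $F_{8g+f}$ collapses to 1, and zero stays zero.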
To achieve the above object, the invention also provides a data format conversion method for use after channel decoding, based on GPP and SIMD technologies, characterized as follows. The input data stream $C_0 C_1 C_2 \dots C_{8n-1}$, with a word length of 8n bytes, is packed into 8s SIMD-format vectors of length M so that the data is laid out in parallel and suited to SIMD instructions. A parallel "equality test" operation moves each bit to the highest bit of its byte, producing the intermediate data stream $G_0 G_1 \dots G_{8n-1}$. A parallel "highest-bit combination" operation on this intermediate stream then generates the output data stream $D_0 D_1 D_2 \dots D_{n-1}$ with a word length of n bytes. Here $n = M \times s$, where the natural numbers M and s are the length and the number of the SIMD packed vectors, respectively.
The method comprises the following steps:
(1) Use the parallel "equality test" SIMD instruction to move each significant bit to the highest bit of its byte:
Apply the "equality test" SIMD instruction to each of the 8s SIMD packed vectors in turn. The X input of the "equality test" instruction is the packed data $C_{kM+0}, C_{kM+1}, \dots, C_{kM+M-1}$, and the Y input is the packed data {1, 1, ..., 1}, where k is the iteration index of the "equality test" instruction, with range [0, 8s-1]. After this instruction completes, the resulting intermediate data stream Z is: $G_{kM+0}, G_{kM+1}, \dots, G_{kM+M-1}$;
(2) Use the parallel "highest-bit combination" SIMD instruction to merge the highest bits of 8 consecutive bytes into 1 byte:
Apply the "highest-bit combination" SIMD instruction to each of the s SIMD packed groups in turn. The X input of the "highest-bit combination" instruction is the packed data $G_{8wM+0}, G_{8wM+1}, \dots, G_{8wM+8M-1}$, where w is the iteration index of the "highest-bit combination" instruction, with range [0, s-1]. After this instruction completes, the resulting final output data stream Z is: $D_{wM+0}, D_{wM+1}, \dots, D_{wM+M-1}$.
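For one output byte, the two decoding steps can be written out as follows. This is a sketch under the stated conventions, assuming each input byte $C_{8e+q}$ holds either 0 or 1, for $0 \le e \le n-1$ and $0 \le q \le 7$:

```latex
\begin{align*}
G_{8e+q} &= \begin{cases} 255, & C_{8e+q} = 1\\ 0, & \text{otherwise} \end{cases}
  && \text{(equality test against 1)}\\
D_e &= \sum_{q=0}^{7} 2^{q} \left\lfloor \frac{G_{8e+q}}{128} \right\rfloor
     = \sum_{q=0}^{7} 2^{q}\, d_{8e+q}
  && \text{(highest-bit combination)}
\end{align*}
```

Since 255 has its highest bit set and 0 does not, the equality test places each information bit in the highest position, where the combination step can collect it.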
The key innovation of the method is that it makes full use of the multi-core, multiprocessor nature of GPP chips to optimize channel coding and decoding for high speed and generality. Compared with the conventional shift-based method, processing is greatly accelerated. The pre-encoding conversion packs the input data stream into the SIMD data format and applies a mapping operation, so that the data is laid out in parallel and SIMD instructions can be applied, raising processing parallelism. The advantage of the post-decoding conversion is that it saves memory accesses and simplifies the conversion flow.
Another innovation of the invention is the use of SIMD technology on a GPP chip to process data in parallel and raise computational speed. Each SIMD instruction operates in parallel on two groups (or only one group) of packed data, each containing M data elements, $(X_0, X_1, \dots, X_{M-1})$ and $(Y_0, Y_1, \dots, Y_{M-1})$, so that every pair of data elements $X_i, Y_i$ ($0 \le i \le M-1$) undergoes the same operation simultaneously (in this method, one of: mapping, AND, select smaller value, equality test, highest-bit combination). The M results are in turn packed as data elements into one group $(Z_0, Z_1, \dots, Z_{M-1})$. Using SIMD instructions therefore markedly increases data processing speed on a GPP chip.
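The element-wise execution model described here can be imitated in plain Python. This is only a conceptual sketch: the helper `simd_apply` is a hypothetical name, and a real SIMD instruction performs all M element operations at once rather than looping.

```python
from typing import Callable, List, Sequence


def simd_apply(op: Callable[[int, int], int],
               x: Sequence[int], y: Sequence[int]) -> List[int]:
    """Apply the same operation to every pair (X_i, Y_i) and pack the results as Z."""
    if len(x) != len(y):
        raise ValueError("X and Y must contain the same number of elements")
    return [op(xi, yi) for xi, yi in zip(x, y)]


X = [3, 0, 128, 7]
Y = [1, 1, 1, 1]
print(simd_apply(min, X, Y))                  # [1, 0, 1, 1]
print(simd_apply(lambda a, b: a & b, X, Y))   # [1, 0, 0, 1]
```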
In short, the invention has good prospects for wide application.
Brief description of the drawings
Fig. 1 is a flow chart of the steps of the data format conversion method before channel coding according to the invention.
Fig. 2 is a schematic diagram of the operation of the "mapping" SIMD instruction.
Fig. 3 is a schematic diagram of the operation of the "AND" SIMD instruction.
Fig. 4 is a schematic diagram of the operation of the "select smaller value" SIMD instruction.
Fig. 5 is a flow chart of the steps of the data format conversion method after channel decoding according to the invention.
Fig. 6 is a schematic diagram of the operation of the "equality test" SIMD instruction.
Fig. 7 is a schematic diagram of the operation of the "highest-bit combination" SIMD instruction.
Detailed description
To make the objects, technical solutions, and advantages of the invention clearer, the invention is described in further detail below with reference to the accompanying drawings.
Single instruction, multiple data (SIMD) is a technique in which one controller drives multiple processing elements that simultaneously perform the same operation on each element of a group of data (also called a "data vector"), achieving spatial parallelism. In microprocessors, SIMD means one controller driving several parallel processing elements; examples include Intel's MMX and SSE and AMD's 3DNow!.
The SIMD technology adopting in the inventive method when carrying out every instruction, the SIMD encapsulation of data X to two groups of each self-contained M elements concurrently 0, X 1... X m-1and Y 0, Y 1... Y m-1execution comprises the various operations (while carrying out every SIMD instruction, also can only process one group of encapsulation of data) that judge whether to equate and choose highest order combination.And now, every couple of data element X iand Y icarry out same operation, wherein, i is the data sequence number in SIMD encapsulation of data simultaneously, and its span is [0, M-1]; Using a resulting M result of calculation as data element, be encapsulated in again one group of SIMD form encapsulation of data Z 0, Z 1... Z m-1in; Wherein, bit length P=64 * 2 of encapsulation of data p; When data element type is byte, corresponding Q=8, wherein, the length M of SIMD encapsulation of data depends on bit length P and the shared bit length Q of data element type of encapsulation of data, its computing formula is
Figure BDA0000446936920000071
wherein, bit length P=64 * 2 of encapsulation of data p, p is natural number; When data element type is word, corresponding Q=8, when data element type is byte, corresponding Q=16.
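As a quick numeric check, the element count can be computed directly. This sketch assumes the formula lost from the source at this point is M = P / Q, which agrees with the 128-bit embodiment that holds 16 bytes; the function name `simd_lanes` is hypothetical.

```python
def simd_lanes(p_exp: int, elem_bits: int) -> int:
    """Return M = P / Q for a packed vector of P = 64 * 2**p_exp bits and Q-bit elements."""
    reg_bits = 64 * 2 ** p_exp          # P = 64 * 2^p
    if reg_bits % elem_bits:
        raise ValueError("element width must divide the vector width")
    return reg_bits // elem_bits        # M = P / Q (assumed form of the missing formula)


print(simd_lanes(1, 8))    # 128-bit vector of bytes -> 16
print(simd_lanes(1, 16))   # 128-bit vector of 16-bit words -> 8
print(simd_lanes(0, 8))    # 64-bit vector of bytes -> 8
```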
The data format conversion method before channel coding, based on GPP and SIMD technologies, is as follows. The input data stream $A_0 A_1 A_2 \dots A_{n-1}$, with a word length of n bytes, is first packed into s SIMD-format vectors of length M so that the data is laid out in parallel and suited to SIMD instructions. A parallel "mapping" operation then copies each byte of the input stream into 8 bytes, producing the first intermediate data stream $E_0 E_1 \dots E_{8n-1}$. A parallel AND operation on the first intermediate stream extracts each bit, producing the second intermediate data stream $F_0 F_1 \dots F_{8n-1}$. Finally, a parallel "select smaller value" operation on the second intermediate stream moves each bit to the lowest bit of its byte, generating the output data stream $B_0 B_1 B_2 \dots B_{8n-1}$ with a word length of 8n bytes. Here the byte length is $n = M \times s$, where the natural numbers M and s are the length and the number of the SIMD packed vectors, respectively.
With reference to Fig. 1, the specific steps of the above method are as follows:
Step 1: execute the parallel "mapping" SIMD instruction to replicate the data:
Apply the "mapping" SIMD instruction to each of the s SIMD packed vectors in turn. The X input of the "mapping" instruction is the packed data $A_{tM+0}, A_{tM+1}, \dots, A_{tM+M-1}$, and each X input takes part in 8 inner-loop iterations. Here t is the outer-loop index of the "mapping" instruction, with range [0, s-1], and u is the inner-loop index, with range [0, 7]. In the u-th inner iteration of the t-th outer iteration, the Y input of the "mapping" instruction is:
$$Y = \{\underbrace{\tfrac{M}{8}u, \dots, \tfrac{M}{8}u}_{8\ \text{times}},\ \underbrace{\tfrac{M}{8}u+1, \dots, \tfrac{M}{8}u+1}_{8\ \text{times}},\ \dots,\ \underbrace{\tfrac{M}{8}u+\tfrac{M}{8}-1, \dots, \tfrac{M}{8}u+\tfrac{M}{8}-1}_{8\ \text{times}}\}$$
After this "mapping" instruction completes, the resulting first intermediate data stream Z is:
$E_{8tM+uM+0}, E_{8tM+uM+1}, \dots, E_{8tM+uM+M-1}$
Taking 128-bit SIMD packed data as an embodiment: the X input of the "mapping" SIMD instruction is 16 consecutive bytes of the A stream, $A_0 A_1 A_2 A_3 A_4 A_5 A_6 A_7 A_8 A_9 A_{10} A_{11} A_{12} A_{13} A_{14} A_{15}$. Over 8 iterations, the Y input of each "mapping" instruction is:
$Y = \{2u, 2u, 2u, 2u, 2u, 2u, 2u, 2u, 2u+1, 2u+1, 2u+1, 2u+1, 2u+1, 2u+1, 2u+1, 2u+1\}$, where $0 \le u \le 7$. The Z output of each "mapping" instruction, the intermediate E-stream data, is:
$E_{16u} E_{16u+1} E_{16u+2} E_{16u+3} E_{16u+4} E_{16u+5} E_{16u+6} E_{16u+7} E_{16u+8} E_{16u+9} E_{16u+10} E_{16u+11} E_{16u+12} E_{16u+13} E_{16u+14} E_{16u+15}$, in which the first 8 bytes all equal $A_{2u}$ and the last 8 bytes all equal $A_{2u+1}$. This completes the operation of copying each original byte into 8 bytes stored contiguously.
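The index vectors of this 128-bit embodiment can be generated and applied as follows. The sketch emulates the "mapping" instruction with a list lookup; the names `map_indices` and `simd_map` are hypothetical. On x86, a byte shuffle such as `_mm_shuffle_epi8` would be a plausible counterpart, but that is an assumption, since the source does not name a specific instruction.

```python
from typing import List, Sequence


def map_indices(u: int, m: int = 16) -> List[int]:
    """Y input for inner iteration u: each of m/8 source indices repeated 8 times."""
    per_pass = m // 8                       # source bytes consumed per iteration (2 when m = 16)
    return [per_pass * u + i // 8 for i in range(m)]


def simd_map(x: Sequence[int], y: Sequence[int]) -> List[int]:
    """Emulated mapping instruction: Z_i = X[Y_i]."""
    return [x[j] for j in y]


A = list(range(100, 116))                   # 16 sample bytes standing in for A_0..A_15
E: List[int] = []
for u in range(8):                          # 8 iterations cover all 16 source bytes
    E.extend(simd_map(A, map_indices(u)))

print(map_indices(3))  # [6, 6, 6, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 7, 7]
print(E[:16])          # eight copies of A_0 followed by eight copies of A_1
```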
Step 2: execute the parallel "AND" SIMD instruction to extract the bits:
Apply the parallel "AND" SIMD instruction to each of the 8s packed vectors in turn. The X input of the "AND" instruction is the first intermediate data stream Z: $E_{rM+0}, E_{rM+1}, \dots, E_{rM+M-1}$, and the Y input of the "AND" instruction is:
{1, 2, 4, 8, 16, 32, 64, 128, 1, 2, 4, 8, 16, 32, 64, 128, ..., 1, 2, 4, 8, 16, 32, 64, 128}, where r is the iteration index of the "AND" instruction, with range [0, 8s-1]. After this "AND" instruction completes, the resulting second intermediate data stream Z is: $F_{rM+0}, F_{rM+1}, \dots, F_{rM+M-1}$;
In the embodiment, the X input of the "AND" SIMD instruction is the E-stream data generated in step 1, the Y input is the packed data Y = {1, 2, 4, 8, 16, 32, 64, 128, 1, 2, 4, 8, 16, 32, 64, 128, ...}, and the Z output of the "AND" instruction is the intermediate F-stream data.
Step 3: execute the parallel "select smaller value" SIMD instruction to move each significant bit to the lowest bit of its byte:
Apply the "select smaller value" SIMD instruction to each of the 8s SIMD packed vectors in turn. The X input of the "select smaller value" instruction is the packed data $F_{jM+0}, F_{jM+1}, \dots, F_{jM+M-1}$, and the Y input is set to the packed data {1, 1, ..., 1}, where j is the iteration index of the "select smaller value" instruction, with range [0, 8s-1]. After this instruction completes, the resulting final output data stream Z is: $B_{jM+0}, B_{jM+1}, \dots, B_{jM+M-1}$.
In the embodiment, the X input of the "select smaller value" SIMD instruction is the F-stream data generated in step 2, the Y input is the packed data Y = {1, 1, 1, 1, ..., 1}, and the Z output of the instruction is the final B-stream data.
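Putting steps 1 to 3 together for the 128-bit case (M = 16) gives the following emulation. It is a sketch under assumptions: each 16-element list stands in for one packed vector, the three steps are fused per vector rather than run as three separate passes over the stream, and the name `encode_convert` is hypothetical. On x86, `_mm_and_si128` and `_mm_min_epu8` would be plausible counterparts for steps 2 and 3, though the source names no specific instructions.

```python
M = 16                                          # bytes per 128-bit packed vector
MASK = [1 << (i % 8) for i in range(M)]         # Y input of the AND step: 1, 2, 4, ..., 128, repeated
ONES = [1] * M                                  # Y input of the select-smaller-value step


def encode_convert(a: bytes) -> bytes:
    """Bit format -> byte format via emulated map, AND and min steps."""
    if len(a) % M:
        raise ValueError("input length must be a multiple of M")
    out = bytearray()
    for t in range(len(a) // M):                # outer loop over the s packed input vectors
        x = a[t * M:(t + 1) * M]
        for u in range(8):                      # inner loop: 8 mapping passes per input vector
            y = [(M // 8) * u + i // 8 for i in range(M)]
            e = [x[j] for j in y]                           # step 1: mapping (replicate bytes)
            f = [ei & mi for ei, mi in zip(e, MASK)]        # step 2: AND (isolate one bit each)
            b = [min(fi, oi) for fi, oi in zip(f, ONES)]    # step 3: min (nonzero becomes 1)
            out.extend(b)
    return bytes(out)


sample = bytes([0b00010011] + [0] * 15)
print(list(encode_convert(sample)[:8]))  # [1, 1, 0, 0, 1, 0, 0, 0]
```

Each inner pass yields 16 output bytes from three vector operations, compared with 16 shift-and-mask pairs in the scalar loop.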
With reference to Fig. 2, the "mapping" SIMD instruction operates as follows: for the two input groups of SIMD packed data $X_0, X_1, \dots, X_{M-1}$ and $Y_0, Y_1, \dots, Y_{M-1}$, after the "mapping" instruction has been applied in parallel to all M pairs of data elements, the output SIMD packed data is $Z_0, Z_1, \dots, Z_{M-1}$. The i-th element of the output is the value in $X_0, X_1, \dots, X_{M-1}$ found by using $Y_i$ as the index:
$$Z_i = X_{Y_i}$$
where i is the element index within the SIMD packed data, with range [0, M-1].
With reference to Fig. 3, the "AND" SIMD instruction operates as follows: for the two input groups of packed data $X_0, X_1, \dots, X_{M-1}$ and $Y_0, Y_1, \dots, Y_{M-1}$, after the "AND" instruction has been applied in parallel to all M pairs of data elements, the output SIMD packed data is $Z_0, Z_1, \dots, Z_{M-1}$, where $Z_i = X_i \,\&\, Y_i$ and i is the element index within the SIMD packed data, with range [0, M-1].
With reference to Fig. 4, the "select smaller value" SIMD instruction operates as follows: for the two input groups of packed data $X_0, X_1, \dots, X_{M-1}$ and $Y_0, Y_1, \dots, Y_{M-1}$, after the "select smaller value" instruction has been applied in parallel to all M pairs of data elements, the output SIMD packed data is $Z_0, Z_1, \dots, Z_{M-1}$, where $Z_i = \min(X_i, Y_i)$ and i is the element index within the SIMD packed data, with range [0, M-1].
The data format conversion method after channel decoding, based on GPP and SIMD technologies, is as follows. The input data stream $C_0 C_1 C_2 \dots C_{8n-1}$, with a word length of 8n bytes, is packed into 8s SIMD-format vectors of length M so that the data is laid out in parallel and suited to SIMD instructions. A parallel "equality test" operation moves each bit to the highest bit of its byte, producing the intermediate data stream $G_0 G_1 \dots G_{8n-1}$. A parallel "highest-bit combination" operation on this intermediate stream then generates the output data stream $D_0 D_1 D_2 \dots D_{n-1}$ with a word length of n bytes. Here $n = M \times s$, where the natural numbers M and s are the length and the number of the SIMD packed vectors, respectively.
With reference to Fig. 5, the specific steps of the data format conversion method after channel decoding are as follows:
Step 1, is used parallel " judging whether to equate " SIMD instruction, completes the operation that significant bit is displaced to the highest order of place byte:
Successively 8s SIMD encapsulation of data used to " judging whether to equate " SIMD instruction, the encapsulation of data of the X input of " judging whether to equate " SIMD instruction is C kM+0, C kM+1..., C kM+M-1, the encapsulation of data of the Y input of " judging whether to equate " SIMD instruction for 1,1 ..., 1}; Wherein, k is for carrying out the number of operations sequence number of " judging whether to equate " SIMD instruction, and its span is [0,8s-1]; Complete this and " judge whether to equate " after SIMD instruction, the intermediary data stream Z that output obtains is: G kM+0, G kM+1..., G kM+M-1.
Step 2, use parallel " choosing highest order combination " SIMD instruction to complete the operation of the highest order of each byte of 8 continuous bytes being merged into 1 byte:
Successively s SIMD encapsulation of data used to " choosing highest order combination " SIMD instruction, the encapsulation of data of the X input of " choosing highest order combination " SIMD instruction is G 8wM+0, G 8wM+1..., G 8wM+8M-1; Wherein, w is for carrying out the number of operations sequence number of " choosing highest order combination " SIMD instruction, and its span is [0, s-1]; Complete after this " choosing highest order combines " SIMD instruction, the final output stream Z obtaining is: D wM+0, D wM+1..., D wM+M-1.
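The two steps above can be sketched as a scalar model in Python. This is only an illustration under stated assumptions, not the patented implementation: the function name `post_decoding_convert` is hypothetical, each "SIMD register" is modelled as a plain list of $M$ byte values, and the bit taken from byte $8i+b$ is assumed to land in bit position $b$ of output byte $i$ (least significant bit first).

```python
def post_decoding_convert(c, M=16):
    """Scalar model of the post-decoding conversion (hypothetical helper).

    c: list of 8n byte values, each 0 or 1 (one information bit per byte).
    Returns n byte values with 8 bits packed into each byte.
    """
    if len(c) % (8 * M) != 0:
        raise ValueError("input length must be a multiple of 8*M")
    s = len(c) // (8 * M)
    ones = [1] * M  # Y input of the equality step: {1, 1, ..., 1}

    # Step 1: equality judgment on 8s packed items of M lanes each.
    # A lane equal to 1 becomes 255 (MSB set); any other value becomes 0.
    g = []
    for k in range(8 * s):
        x = c[k * M:(k + 1) * M]
        g.extend(255 if xi == yi else 0 for xi, yi in zip(x, ones))

    # Step 2: for each group of 8M bytes, merge the MSBs of 8 consecutive
    # bytes into one output byte (assumed order: byte 8i+b -> bit b).
    d = []
    for w in range(s):
        x = g[8 * w * M:8 * w * M + 8 * M]
        for i in range(M):
            z = 0
            for b in range(8):
                z |= (x[8 * i + b] & 0x80) >> (7 - b)
            d.append(z)
    return d


# One bit per byte, least significant bit first, for the byte values 0..15.
C = [(v >> b) & 1 for v in range(16) for b in range(8)]
D = post_decoding_convert(C)
```

With $M = 16$ and $s = 1$, the 128 input bytes collapse into 16 output bytes, which is the $8n \to n$ reduction described in the text.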
Taking 128-bit SIMD packing as an embodiment, the packed data at the X input of the "highest bit combination selection" SIMD instruction is the G stream data $G_0 G_1 G_2 G_3 G_4 G_5 G_6 G_7 G_8 G_9 G_{10} G_{11} G_{12} G_{13} G_{14} G_{15}$, and after the instruction completes, the output Z is the D stream data
$$D_0 D_1 = (d_0 d_1 d_2 d_3 d_4 d_5 d_6 d_7)(d_8 d_9 d_{10} d_{11} d_{12} d_{13} d_{14} d_{15}),$$
where $d_o$ is the most significant bit of byte $G_o$, $0 \le o \le 15$.
Referring to Fig. 6, the operation of the "equality judgment" SIMD instruction is as follows: for the two groups of input packed data $X_0, X_1, \ldots, X_{M-1}$ and $Y_0, Y_1, \ldots, Y_{M-1}$, the instruction is carried out in parallel on the $M$ pairs of data, and the output SIMD packed data is $Z_0, Z_1, \ldots, Z_{M-1}$, where $Z_i = (X_i == Y_i)\ ?\ 255 : 0$. The expression to the right of the "=" sign is the conditional operator of programming languages and means: if $X_i == Y_i$ holds, that is, $X_i$ equals $Y_i$, then $Z_i$ is assigned 255; otherwise $Z_i$ is assigned 0. Here $i$ is the index of the data element within the SIMD packed data and ranges over $[0, M-1]$.
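The lane-wise behaviour of this instruction can be modelled in a few lines of Python. The helper name `cmpeq_lanes` and the sample values are assumptions made for illustration. The text does not name a specific instruction set, but on x86 the SSE2 byte compare `_mm_cmpeq_epi8` has the same per-byte semantics (0xFF where equal, 0 otherwise).

```python
def cmpeq_lanes(x, y):
    """Lane-wise model of Z_i = (X_i == Y_i) ? 255 : 0 (hypothetical helper)."""
    if len(x) != len(y):
        raise ValueError("both packed operands must have the same length M")
    return [255 if xi == yi else 0 for xi, yi in zip(x, y)]


# M = 16 byte lanes, as in a 128-bit register; Y is the all-ones constant.
X = [1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1]
Y = [1] * 16
Z = cmpeq_lanes(X, Y)
```

Because 255 has its most significant bit set and 0 does not, comparing against 1 effectively moves each information bit from the lowest to the highest bit position of its byte.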
Referring to Fig. 7, the operation of the "highest bit combination selection" SIMD instruction is as follows: for the input packed data $X_0, X_1, \ldots, X_{8M-1}$, the instruction performs $M$ highest-bit combination operations in parallel, and the output packed data is $Z_0, Z_1, \ldots, Z_{M-1}$, where
$$Z_i = ((X_{8i} \mathbin{\&} \mathtt{0x80}) \gg 7) \oplus ((X_{8i+1} \mathbin{\&} \mathtt{0x80}) \gg 6) \oplus ((X_{8i+2} \mathbin{\&} \mathtt{0x80}) \gg 5) \oplus ((X_{8i+3} \mathbin{\&} \mathtt{0x80}) \gg 4) \oplus ((X_{8i+4} \mathbin{\&} \mathtt{0x80}) \gg 3) \oplus ((X_{8i+5} \mathbin{\&} \mathtt{0x80}) \gg 2) \oplus ((X_{8i+6} \mathbin{\&} \mathtt{0x80}) \gg 1) \oplus (X_{8i+7} \mathbin{\&} \mathtt{0x80})$$
In the formula, $\&$ is bitwise AND, $\gg$ is a right shift, $\oplus$ is bitwise exclusive OR, and $i$ is the index of the data element within the SIMD packed data, ranging over $[0, M-1]$.
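A scalar Python model may help make this operation concrete. It is a sketch under one explicit assumption: the most significant bit of $X_{8i+b}$ ends up in bit position $b$ of $Z_i$, which is the ordering that inverts the $1, 2, 4, \ldots, 128$ mask used before encoding. The name `msb_combine` is hypothetical. On x86, the SSE2 intrinsic `_mm_movemask_epi8` gathers byte MSBs in the same order, although the text does not state which instruction is used.

```python
def msb_combine(x):
    """Merge the MSBs of every 8 consecutive bytes into one byte.

    Assumed bit order: the MSB of x[8*i + b] becomes bit b of z[i].
    """
    if len(x) % 8 != 0:
        raise ValueError("input length must be a multiple of 8")
    out = []
    for i in range(len(x) // 8):
        z = 0
        for b in range(8):
            # Keep only the top bit, then shift it down to position b.
            z ^= (x[8 * i + b] & 0x80) >> (7 - b)
        out.append(z)
    return out


# 16 input bytes (a 128-bit embodiment) yield 2 output bytes D_0 and D_1.
G = [255, 0, 0, 255, 255, 255, 0, 0, 0, 255, 0, 255, 0, 0, 255, 255]
D = msb_combine(G)  # [0x39, 0xCA]
```

Since each term contributes to a different bit position, XOR and OR give the same result here; XOR is used only to mirror the formula.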
The invention has been tested in repeated trials. The parameters of the simulation experiment are as follows: on a [Figure BDA0000446936920000111] Core™ i7-3610QM CPU with a clock frequency of 2.3 GHz, the traditional shift algorithm and the method of the invention were compared on the time needed to process 655360 information bits in total, for both the data conversion before channel coding and the data conversion after channel decoding. For the data conversion before channel coding, the traditional look-up method takes 299 nanoseconds in total and the method of the invention takes 31 nanoseconds. For the data conversion after channel decoding, the traditional look-up method takes 317 nanoseconds and the method of the invention takes 73 nanoseconds. In terms of speed, the method of the invention is therefore significantly faster than the traditional shift algorithm.
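The size of the improvement follows directly from the reported timings. The short calculation below only divides the figures quoted above; the dictionary keys are labels chosen for illustration.

```python
# (traditional method, method of the invention), in the time unit used in the text
timings = {
    "pre-encoding": (299, 31),
    "post-decoding": (317, 73),
}

# Speedup factor = traditional time / new time, rounded to one decimal place.
speedups = {
    name: round(t_old / t_new, 1)
    for name, (t_old, t_new) in timings.items()
}
```

That is, roughly a 9.6-fold reduction in time for the conversion before channel coding and a 4.3-fold reduction for the conversion after channel decoding.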
In summary, the embodiment verifies the good performance of this data conversion method; the experiments were successful and the object of the invention has been achieved.
The foregoing describes only preferred embodiments of the invention and is not intended to limit the invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the invention shall fall within the scope of protection of the invention.

Claims (10)

1. A data format conversion method before channel coding based on general-purpose processor (GPP) and single instruction, multiple data (SIMD) technology, characterized in that: the input data stream $A_0 A_1 A_2 \ldots A_{n-1}$, with a length of $n$ bytes, is first packed into $s$ SIMD packed data items of length $M$, so that the data format is parallelized and suited to SIMD instructions; a parallel "mapping" operation is applied to it, which copies each byte of the input data stream into 8 bytes and yields the first intermediate data stream $E_0 E_1 \ldots E_{8n-1}$; a parallel "AND" operation is then applied to the first intermediate data stream $E_0 E_1 \ldots E_{8n-1}$, which extracts each information bit and yields the second intermediate data stream $F_0 F_1 \ldots F_{8n-1}$; finally, a parallel "smaller value selection" operation is applied to the second intermediate data stream $F_0 F_1 \ldots F_{8n-1}$, which moves each information bit to the least significant bit of its byte and produces the output data stream $B_0 B_1 B_2 \ldots B_{8n-1}$, with a length of $8n$ bytes; wherein $n = M \times s$, and the natural numbers $M$ and $s$ are the length and the number of the SIMD packed data items, respectively.
2. The method according to claim 1, characterized in that the method comprises the following operation steps:
(1) Execute the parallel "mapping" SIMD instruction to complete the data replication operation:
Apply the "mapping" SIMD instruction to the $s$ SIMD packed data items in turn. The SIMD packed data at the X input of the "mapping" SIMD instruction is $A_{tM+0}, A_{tM+1}, \ldots, A_{tM+M-1}$, and each X input takes part in the inner loop 8 times, where $t$ is the index of the outer-loop execution of the "mapping" SIMD instruction and ranges over $[0, s-1]$, and $u$ is the index of the inner-loop execution and ranges over $[0, 7]$. The SIMD packed data at the Y input of the "mapping" SIMD instruction in the $u$-th inner iteration of the $t$-th outer iteration is
$$\left\{ \underbrace{\tfrac{M}{8}u, \ldots, \tfrac{M}{8}u}_{8\ \text{times}},\ \underbrace{\tfrac{M}{8}u+1, \ldots, \tfrac{M}{8}u+1}_{8\ \text{times}},\ \ldots,\ \underbrace{\tfrac{M}{8}u+\tfrac{M}{8}-1, \ldots, \tfrac{M}{8}u+\tfrac{M}{8}-1}_{8\ \text{times}} \right\}$$
After this "mapping" SIMD instruction completes, the output Z is the first intermediate data stream
$E_{8tM+uM+0}, E_{8tM+uM+1}, \ldots, E_{8tM+uM+M-1}$;
(2) Execute the parallel "AND" SIMD instruction to complete the bit extraction operation:
Apply the parallel "AND" SIMD instruction to the $8s$ packed data items in turn. The SIMD packed data at the X input of the "AND" SIMD instruction is the first intermediate data stream Z: $E_{rM+0}, E_{rM+1}, \ldots, E_{rM+M-1}$, and the SIMD packed data at the Y input is
$\{1, 2, 4, 8, 16, 32, 64, 128, 1, 2, 4, 8, 16, 32, 64, 128, \ldots, 1, 2, 4, 8, 16, 32, 64, 128\}$, where $r$ is the index of the execution of the "AND" SIMD instruction and ranges over $[0, 8s-1]$. After this "AND" SIMD instruction completes, the output Z is the second intermediate data stream $F_{rM+0}, F_{rM+1}, \ldots, F_{rM+M-1}$;
(3) Execute the parallel "smaller value selection" SIMD instruction to move each significant bit to the least significant bit of its byte:
Apply the "smaller value selection" SIMD instruction to the $8s$ SIMD packed data items in turn. The SIMD packed data at the X input of the instruction is $F_{jM+0}, F_{jM+1}, \ldots, F_{jM+M-1}$, and the SIMD packed data at the Y input is set to $\{1, 1, \ldots, 1\}$, where $j$ is the index of the execution of the instruction and ranges over $[0, 8s-1]$. After this "smaller value selection" SIMD instruction completes, the output Z is the final output data stream $B_{jM+0}, B_{jM+1}, \ldots, B_{jM+M-1}$.
3. The method according to claim 1, characterized in that: when executing each instruction, the SIMD technology performs the operations, including mapping, AND and smaller value selection, in parallel on two groups of SIMD packed data $X_0, X_1, \ldots, X_{M-1}$ and $Y_0, Y_1, \ldots, Y_{M-1}$, each containing $M$ elements, so that every pair of data elements $X_i$ and $Y_i$ undergoes the same operation simultaneously, where $i$ is the index of the data element within the SIMD packed data and ranges over $[0, M-1]$; the $M$ results obtained are packed as data elements into one group of SIMD packed data $Z_0, Z_1, \ldots, Z_{M-1}$; the length $M$ of the SIMD packed data depends on the bit length $P$ of the packed data and the bit length $Q$ occupied by the data element type, and is computed as
$$M = \frac{P}{Q}$$
where the bit length of the packed data is $P = 64 \times 2^p$ and $p$ is a natural number; when the data element type is byte, $Q = 8$, and when the data element type is word, $Q = 16$.
4. The method according to claim 3, characterized in that: when executing each instruction, the SIMD technology may also process only one group of packed data according to the described method.
5. The method according to claim 3, characterized in that:
The operation of the "mapping" SIMD instruction is: for the two groups of input SIMD packed data $X_0, X_1, \ldots, X_{M-1}$ and $Y_0, Y_1, \ldots, Y_{M-1}$, the "mapping" SIMD instruction is carried out in parallel on the $M$ pairs of data, and the output SIMD packed data is $Z_0, Z_1, \ldots, Z_{M-1}$; the $i$-th element of the output SIMD packed data is the value found in $X_0, X_1, \ldots, X_{M-1}$ using $Y_i$ as the subscript,
$$Z_i = X_{Y_i}$$
where $i$ is the index of the data element within the SIMD packed data and ranges over $[0, M-1]$;
The operation of the "AND" SIMD instruction is: for the two groups of input packed data $X_0, X_1, \ldots, X_{M-1}$ and $Y_0, Y_1, \ldots, Y_{M-1}$, the "AND" SIMD instruction is carried out in parallel on the $M$ pairs of data, and the output SIMD packed data is $Z_0, Z_1, \ldots, Z_{M-1}$, where $Z_i = X_i \mathbin{\&} Y_i$ and $i$ is the index of the data element within the SIMD packed data, ranging over $[0, M-1]$;
The operation of the "smaller value selection" SIMD instruction is: for the two groups of input packed data $X_0, X_1, \ldots, X_{M-1}$ and $Y_0, Y_1, \ldots, Y_{M-1}$, the "smaller value selection" SIMD instruction is carried out in parallel on the $M$ pairs of data, and the output SIMD packed data is $Z_0, Z_1, \ldots, Z_{M-1}$, where $Z_i = \min(X_i, Y_i)$ and $i$ is the index of the data element within the SIMD packed data, ranging over $[0, M-1]$.
6. A data format conversion method after channel decoding based on GPP and SIMD, characterized in that: the input data stream $C_0 C_1 C_2 \ldots C_{8n-1}$, with a length of $8n$ bytes, is packed into $8s$ SIMD packed data items of length $M$, so that the data format is parallelized and suited to the SIMD instruction structure; a parallel "equality judgment" operation is applied to it, which moves each information bit to the most significant bit of its byte and yields the intermediate data stream $G_0 G_1 \ldots G_{8n-1}$; a parallel "highest bit combination selection" operation is then applied to this intermediate data stream $G_0 G_1 \ldots G_{8n-1}$, producing the output data stream $D_0 D_1 D_2 \ldots D_{n-1}$, with a length of $n$ bytes; wherein $n = M \times s$, and the natural numbers $M$ and $s$ are the length and the number of the SIMD packed data items, respectively.
7. The method according to claim 6, characterized in that the method comprises the following operation steps:
(1) Use the parallel "equality judgment" SIMD instruction to move each significant bit to the most significant bit of its byte:
Apply the "equality judgment" SIMD instruction to the $8s$ SIMD packed data items in turn. The packed data at the X input of the instruction is $C_{kM+0}, C_{kM+1}, \ldots, C_{kM+M-1}$, and the packed data at the Y input is $\{1, 1, \ldots, 1\}$, where $k$ is the index of the execution of the instruction and ranges over $[0, 8s-1]$. After the instruction completes, the output Z is the intermediate data stream $G_{kM+0}, G_{kM+1}, \ldots, G_{kM+M-1}$;
(2) Use the parallel "highest bit combination selection" SIMD instruction to merge the most significant bits of 8 consecutive bytes into 1 byte:
Apply the "highest bit combination selection" SIMD instruction to the $s$ groups of SIMD packed data in turn. The packed data at the X input of the instruction is $G_{8wM+0}, G_{8wM+1}, \ldots, G_{8wM+8M-1}$, where $w$ is the index of the execution of the instruction and ranges over $[0, s-1]$. After the instruction completes, the output Z is the final output data stream $D_{wM+0}, D_{wM+1}, \ldots, D_{wM+M-1}$.
8. The method according to claim 6, characterized in that: when executing each instruction, the SIMD technology performs the operations, including equality judgment and highest bit combination selection, in parallel on two groups of SIMD packed data $X_0, X_1, \ldots, X_{M-1}$ and $Y_0, Y_1, \ldots, Y_{M-1}$, each containing $M$ elements, so that every pair of data elements $X_i$ and $Y_i$ undergoes the same operation simultaneously, where $i$ is the index of the data element within the SIMD packed data and ranges over $[0, M-1]$; the $M$ results obtained are packed as data elements into one group of SIMD packed data $Z_0, Z_1, \ldots, Z_{M-1}$; the length $M$ of the SIMD packed data depends on the bit length $P$ of the packed data and the bit length $Q$ occupied by the data element type, and is computed as
$$M = \frac{P}{Q}$$
where the bit length of the packed data is $P = 64 \times 2^p$ and $p$ is a natural number; when the data element type is byte, $Q = 8$, and when the data element type is word, $Q = 16$.
9. The method according to claim 8, characterized in that: when executing each instruction, the SIMD technology may also process only one group of packed data according to the described method.
10. The method according to claim 8, characterized in that:
The operation of the "equality judgment" SIMD instruction is: for the two groups of input packed data $X_0, X_1, \ldots, X_{M-1}$ and $Y_0, Y_1, \ldots, Y_{M-1}$, the instruction is carried out in parallel on the $M$ pairs of data, and the output SIMD packed data is $Z_0, Z_1, \ldots, Z_{M-1}$, where $Z_i = (X_i == Y_i)\ ?\ 255 : 0$; the expression to the right of the "=" sign is the conditional operator of programming languages and means: if $X_i == Y_i$ holds, that is, $X_i$ equals $Y_i$, then $Z_i$ is assigned 255; otherwise $Z_i$ is assigned 0; here $i$ is the index of the data element within the SIMD packed data and ranges over $[0, M-1]$;
The operation of the "highest bit combination selection" SIMD instruction is: for the input packed data $X_0, X_1, \ldots, X_{8M-1}$, the instruction performs $M$ highest-bit combination operations in parallel, and the output packed data is $Z_0, Z_1, \ldots, Z_{M-1}$, where
$$Z_i = ((X_{8i} \mathbin{\&} \mathtt{0x80}) \gg 7) \oplus ((X_{8i+1} \mathbin{\&} \mathtt{0x80}) \gg 6) \oplus ((X_{8i+2} \mathbin{\&} \mathtt{0x80}) \gg 5) \oplus ((X_{8i+3} \mathbin{\&} \mathtt{0x80}) \gg 4) \oplus ((X_{8i+4} \mathbin{\&} \mathtt{0x80}) \gg 3) \oplus ((X_{8i+5} \mathbin{\&} \mathtt{0x80}) \gg 2) \oplus ((X_{8i+6} \mathbin{\&} \mathtt{0x80}) \gg 1) \oplus (X_{8i+7} \mathbin{\&} \mathtt{0x80})$$
In the formula, $\&$ is bitwise AND, $\gg$ is a right shift, $\oplus$ is bitwise exclusive OR, and $i$ is the index of the data element within the SIMD packed data, ranging over $[0, M-1]$.
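The pre-encoding conversion of claims 1 to 5 (mapping, AND, smaller value selection) can be sketched with a scalar Python model. This is an illustration under stated assumptions rather than the claimed implementation: the names `lane_count`, `shuffle_lanes` and `pre_encoding_convert` are hypothetical, registers are modelled as lists of byte values, and the three operations are applied register by register instead of as three separate passes over the stream, which yields the same output. On x86, `_mm_shuffle_epi8`, `_mm_and_si128` and `_mm_min_epu8` have matching lane-wise behaviour for these inputs, though the claims do not name an instruction set.

```python
def lane_count(P, Q):
    """Number of elements M in a packed item of P bits with Q-bit elements."""
    if P % Q != 0:
        raise ValueError("P must be a multiple of Q")
    return P // Q


def shuffle_lanes(x, y):
    """Model of the mapping operation: Z_i = X[Y_i]."""
    return [x[yi] for yi in y]


def pre_encoding_convert(a, M=16):
    """Expand n bytes into 8n bytes holding one bit (0 or 1) each."""
    if M % 8 != 0 or len(a) % M != 0:
        raise ValueError("M must be a multiple of 8 and divide len(a)")
    s = len(a) // M
    and_mask = [1, 2, 4, 8, 16, 32, 64, 128] * (M // 8)
    ones = [1] * M
    out = []
    for t in range(s):                      # outer loop over packed items
        x = a[t * M:(t + 1) * M]
        for u in range(8):                  # inner loop, 8 passes per item
            # Each index (M/8)*u + q is repeated 8 times, q = 0 .. M/8 - 1.
            idx = [(M // 8) * u + j // 8 for j in range(M)]
            e = shuffle_lanes(x, idx)                        # replicate bytes
            f = [ei & mi for ei, mi in zip(e, and_mask)]     # isolate one bit
            out.extend(min(fi, yi) for fi, yi in zip(f, ones))  # clamp to 0/1
    return out


M = lane_count(128, 8)        # 16 byte lanes in a 128-bit packed item
A = list(range(16))
B = pre_encoding_convert(A, M)
```

Each input byte contributes 8 output bytes, ordered from its least significant bit to its most significant bit, so `B` holds 128 values that are each 0 or 1.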
CN201310729424.7A 2013-12-26 2013-12-26 Data format conversion method of channel coding and decoding based on GPP and SIMD technologies Active CN103746771B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310729424.7A CN103746771B (en) 2013-12-26 2013-12-26 Data format conversion method of channel coding and decoding based on GPP and SIMD technologies

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310729424.7A CN103746771B (en) 2013-12-26 2013-12-26 Data format conversion method of channel coding and decoding based on GPP and SIMD technologies

Publications (2)

Publication Number Publication Date
CN103746771A true CN103746771A (en) 2014-04-23
CN103746771B CN103746771B (en) 2017-04-12

Family

ID=50503765

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310729424.7A Active CN103746771B (en) 2013-12-26 2013-12-26 Data format conversion method of channel coding and decoding based on GPP and SIMD technologies

Country Status (1)

Country Link
CN (1) CN103746771B (en)

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8570393B2 (en) * 2007-11-30 2013-10-29 Cognex Corporation System and method for processing image data relative to a focus of attention within the overall image
RU2011115796A (en) * 2011-04-22 2012-10-27 ЭлЭсАй Корпорейшн (US) DEVICE (OPTIONS) AND METHOD FOR APPROXIMATION WITH DOUBLE ACCURACY OPERATIONS WITH SINGLE ACCURACY
CN103294621B (en) * 2013-05-08 2016-04-06 中国人民解放军国防科学技术大学 Supported data presses the vectorial access method of mould restructuring

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108780394A (en) * 2015-12-29 2018-11-09 英特尔公司 Hardware device and method for transform coding format
CN108780394B (en) * 2015-12-29 2023-07-18 英特尔公司 Hardware apparatus and method for converting encoding format
CN114581281A (en) * 2020-11-30 2022-06-03 北京君正集成电路股份有限公司 Optimization method based on first layer 4bit convolution calculation

Also Published As

Publication number Publication date
CN103746771B (en) 2017-04-12

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant