CN114063977A - Universal digital signal processing device, method and system - Google Patents

Universal digital signal processing device, method and system

Info

Publication number
CN114063977A
Authority
CN
China
Prior art keywords
operator
data
vector
module
memory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010763369.3A
Other languages
Chinese (zh)
Inventor
龙红星
陈山枝
王卫兵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chenxin Technology Co ltd
Chen Core Technology Co ltd
Original Assignee
Chenxin Technology Co ltd
Chen Core Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chenxin Technology Co ltd, Chen Core Technology Co ltd filed Critical Chenxin Technology Co ltd
Priority to CN202010763369.3A priority Critical patent/CN114063977A/en
Publication of CN114063977A publication Critical patent/CN114063977A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/57Arithmetic logic units [ALU], i.e. arrangements or devices for performing two or more of the operations covered by groups G06F7/483 – G06F7/556 or for performing logical operations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Complex Calculations (AREA)

Abstract

The embodiment of the invention discloses a universal digital signal processing device, method and system. The device comprises an instruction module, an operator calculation module and a serial calculation module. The instruction module reads an instruction from the memory and generates an operator control instruction and a serial control instruction; the operator control instruction is sent to the operator calculation module, and the serial control instruction is sent to the serial calculation module. The operator calculation module receives the operator control instruction, reads first data from the memory according to the operator control instruction, performs vector calculation on the first data to obtain a vector calculation result, and writes the vector calculation result into the memory. The serial calculation module receives the serial control instruction, reads second data from the memory according to the serial control instruction, performs non-vector calculation on the second data to obtain a non-vector calculation result, and writes the non-vector calculation result into the memory. The embodiment of the invention can achieve a better balance of time efficiency, energy efficiency and flexibility of the signal processor, and optimizes the architecture of the existing universal digital signal processor.

Description

Universal digital signal processing device, method and system
Technical Field
The embodiment of the invention relates to the field of signal processing, in particular to a universal digital signal processing device, method and system.
Background
Fields such as wireless communication, digital image processing and radar involve a large number of digital signal processing tasks. Digital signal processing comprises three basic operations: multiplication, addition and shift, and the various algorithms can be written as combinations of these three operations. Digital signal processing is characterized by the same operation being applied to a large amount of data, simple control logic, and very high input and output data throughput.
Most existing general digital signal processors are based on the von Neumann architecture, the Harvard architecture, or an architecture evolved from them. They still adopt an instruction-data execution mode and do not take into account the data-flow-driven nature of digital signal processing, so their energy efficiency and time efficiency on digital signal processing tasks are not high.
Disclosure of Invention
Embodiments of the present invention provide a general digital signal processing device, method and system, which exploit the characteristics of data-flow driving to achieve a better balance of time efficiency, energy efficiency and flexibility of a signal processor, and optimize the architecture of the existing general digital signal processor.
In a first aspect, an embodiment of the present invention provides a general digital signal processing apparatus, including: an instruction module, an operator calculation module and a serial calculation module;
the instruction module is used for reading an instruction from a memory, generating an operator control instruction or a serial control instruction according to the instruction, sending the operator control instruction to the operator calculation module, and sending the serial control instruction to the serial calculation module;
the operator calculation module is used for receiving the operator control instruction sent by the instruction module, reading first data from a memory according to the operator control instruction, performing vector calculation on the first data to obtain a vector calculation result, and writing the vector calculation result into the memory;
the serial calculation module is used for receiving the serial control instruction sent by the instruction module, reading second data from a memory according to the serial control instruction, performing non-vector calculation on the second data to obtain a non-vector calculation result, and writing the non-vector calculation result into the memory.
In a second aspect, an embodiment of the present invention further provides a method for general digital signal processing, where the method is performed by the general digital signal processing apparatus provided in the first aspect, and the method includes:
the instruction module reads an instruction from a memory, generates an operator control instruction or a serial control instruction according to the instruction, sends the operator control instruction to the operator calculation module, and sends the serial control instruction to the serial calculation module;
the operator calculation module receives an operator control instruction sent by the instruction module, reads first data from a memory according to the operator control instruction, performs vector calculation on the first data to obtain a vector calculation result, and writes the vector calculation result into the memory;
the serial calculation module receives the serial control instruction sent by the instruction module, reads second data from a memory according to the serial control instruction, performs non-vector calculation on the second data to obtain a non-vector calculation result, and writes the non-vector calculation result into the memory.
In a third aspect, an embodiment of the present invention further provides a general digital signal processing system, which includes a memory, at least two data access buses, and the general digital signal processing apparatus provided in the first aspect.
According to the technical scheme of the embodiment of the invention, the operator calculation module completes the core calculation tasks of an algorithm, which allows the basic operator forms to be unified and flexibly extended. Direct memory access provides direct data interaction with the memory and eliminates the instruction-data path bottleneck. Because the operator calculation module is data-flow driven, the instruction receiving and executing processes of the serial calculation module and the operator calculation module do not affect each other. As a result, the time efficiency, energy efficiency and flexibility of the signal processor can be well balanced, and the existing general digital signal processor architecture is optimized.
Drawings
Fig. 1 is a schematic structural diagram of a general digital signal processing apparatus according to a first embodiment of the present invention.
Fig. 2 is a schematic structural diagram of a general digital signal processing apparatus according to a second embodiment of the present invention.
Fig. 3A is a schematic diagram illustrating an operation of a butterfly operator according to a second embodiment of the present invention.
Fig. 3B is a schematic diagram illustrating an enhanced butterfly operator according to a second embodiment of the present invention.
Fig. 4 is a flowchart of a general digital signal processing method according to a third embodiment of the present invention.
Fig. 5 is a schematic structural diagram of a general digital signal processing system according to a fourth embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting of the invention.
It should be further noted that, for the convenience of description, only some but not all of the relevant aspects of the present invention are shown in the drawings. Before discussing exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart may describe the operations (or steps) as a sequential process, many of the operations can be performed in parallel, concurrently or simultaneously. In addition, the order of the operations may be re-arranged. The process may be terminated when its operations are completed, but may have additional steps not included in the figure. The processes may correspond to methods, functions, procedures, subroutines, and the like.
Example one
Fig. 1 is a schematic structural diagram of a general digital signal processing apparatus according to a first embodiment of the present invention. As shown in fig. 1, the apparatus includes: instruction module 11, operator calculation module 12 and serial calculation module 13.
The instruction module 11 is configured to read an instruction from a memory, generate an operator control instruction or a serial control instruction according to the instruction, send the operator control instruction to the operator calculation module 12, and send the serial control instruction to the serial calculation module 13.
The memory stores instructions and data to be calculated. The instruction module 11 reads an instruction from the memory and decodes it to generate an operator control instruction or a serial control instruction. The operator control instruction is sent to the operator calculation module 12 and controls the operator calculation module 12 to read the first data, among the data to be calculated, from the memory. The serial control instruction is sent to the serial calculation module 13 and controls the serial calculation module 13 to read the second data, among the data to be calculated, from the memory.
The operator calculating module 12 is configured to receive the operator control instruction sent by the instruction module 11, read first data from a memory according to the operator control instruction, perform vector calculation on the first data to obtain a vector calculation result, and write the vector calculation result into the memory.
The operator calculation module 12 completes the core calculation tasks of an algorithm and performs vector calculation using various vector signal analysis basic operators. The first data is the data, stored in the memory, on which the core calculation tasks of the algorithm operate. Vector calculation in the operator calculation module 12 is data-flow driven: once the first data has been read from the memory by direct memory access, vector calculation of all received first data can be completed without any further instruction from the instruction module 11, and each vector signal analysis basic operator can perform vector calculation on multiple items of first data in parallel. The vector calculation result is the calculation result required by the core calculation tasks of the algorithm.
The serial calculation module 13 is configured to receive the serial control instruction sent by the instruction module 11, read second data from a memory according to the serial control instruction, perform non-vector calculation on the second data to obtain a non-vector calculation result, and write the non-vector calculation result into the memory.
The serial calculation module 13 performs tasks such as simple scalar calculation, program flow control, and general direct memory access (GDMA) configuration. The serial calculation module 13 includes a serial calculation unit, a register group, and a read-write (LD/ST) unit. The serial calculation unit performs non-vector calculation and may be a simple, self-developed reduced instruction set computer (RISC) processor, or a simple processor based on the RISC-V or ARM architecture. The register group includes at least one register for storing the second data read from the memory by the read-write unit and the non-vector calculation result written by the serial calculation unit. The read-write unit receives the serial control instruction sent by the instruction module 11 and, according to the serial control instruction, either reads one item of second data from the memory and writes it into a register, or reads a non-vector calculation result from a register and writes it into the memory. The second data is the data, stored in the memory, on which tasks such as simple scalar calculation, program flow control and direct memory access configuration operate. One serial control instruction controls the serial calculation module 13 to perform non-vector calculation on one item of second data. The non-vector calculation result is the calculation result required by tasks such as simple scalar calculation, program flow control and direct memory access configuration.
The embodiment provides a general digital signal processing device, which utilizes an operator computing module to complete a core computing task in an algorithm, realizes unified and flexible expansion of a basic operator form, utilizes direct memory access to realize direct data interaction with a memory, eliminates a path bottleneck of instruction-data, utilizes the characteristic of data flow driving to ensure that instruction receiving and executing processes of a serial computing module and the operator computing module are not influenced mutually, can achieve better balance of time efficiency, energy efficiency and flexibility of a signal processor, and optimizes the structure of the conventional general digital signal processor.
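The division of work among the three modules can be illustrated with a minimal software sketch. It is not the patented hardware: the `Instruction` fields, the operation names `scale2` and `inc`, and the list-based memory are all hypothetical, and the two calculation modules run one after another here, whereas in the described device they operate independently of each other.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Instruction:
    kind: str    # "operator" or "serial" (hypothetical encoding)
    op: str      # hypothetical operation name
    src: int     # start address of the input data in memory
    length: int  # number of elements to read
    dst: int     # start address for the result


def operator_module(mem: List[float], ins: Instruction) -> None:
    """Vector path: read a block of first data, compute, write the block back."""
    data = mem[ins.src:ins.src + ins.length]
    if ins.op == "scale2":  # hypothetical vector operation: multiply by 2
        result = [2 * x for x in data]
    else:
        raise ValueError(f"unknown operator: {ins.op}")
    mem[ins.dst:ins.dst + len(result)] = result


def serial_module(mem: List[float], ins: Instruction) -> None:
    """Scalar path: read one item of second data, compute, write it back."""
    value = mem[ins.src]
    if ins.op == "inc":  # hypothetical scalar operation: add 1
        mem[ins.dst] = value + 1
    else:
        raise ValueError(f"unknown serial op: {ins.op}")


def instruction_module(mem: List[float], program: List[Instruction]) -> None:
    """Decode each instruction and route it to the matching module."""
    for ins in program:
        if ins.kind == "operator":
            operator_module(mem, ins)
        elif ins.kind == "serial":
            serial_module(mem, ins)
        else:
            raise ValueError(f"unknown instruction kind: {ins.kind}")


mem = [1, 2, 3, 0, 0, 0, 0]
program = [
    Instruction("operator", "scale2", src=0, length=3, dst=3),
    Instruction("serial", "inc", src=0, length=1, dst=6),
]
instruction_module(mem, program)
print(mem)
```

In this sketch the operator path moves a whole block per instruction while the serial path moves a single item, which mirrors the first-data and second-data distinction in the text.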
In an optional implementation manner of the foregoing embodiment, the instruction module 11 and the serial computation module 13 are connected to the memory through a first data access bus, and the operator computation module 12 is connected to the memory through a second data access bus;
or, the instruction module 11 is connected to the memory through a first data access bus, and the serial computation module 13 and the operator computation module 12 are connected to the memory through a second data access bus;
or, the instruction module 11 is connected to the memory through a first data access bus, the operator calculation module 12 is connected to the memory through a second data access bus, and the serial calculation module 13 is connected to the memory through a third data access bus.
The first data access bus can be used for data interaction between the instruction module 11 and the memory, and the second data access bus can be used for data interaction between the operator calculation module 12 and the memory. The amount of data exchanged between the serial calculation module 13 and the memory is small, so this interaction may be carried by the first data access bus or the second data access bus, or by a dedicated third data access bus.
The above embodiments provide a general digital signal processing apparatus with multiplexed data access buses, and data interaction between a serial computing module and a memory is combined into a first data access bus or a second data access bus, so that the architecture of the general digital signal processing apparatus is further optimized, and the integration complexity of the general digital signal processing apparatus in a system is reduced.
Example two
Fig. 2 is a schematic structural diagram of a general digital signal processing apparatus according to a second embodiment of the present invention. As shown in fig. 2, on the basis of the first embodiment, the operator calculation module 12 includes: an operator calculation unit 121 and a direct memory access controller 122.
The operator calculating unit 121 is configured to perform vector calculation on the first data and obtain a vector calculation result.
The operator calculation unit 121 starts to perform vector calculation after receiving the first data, and is driven by the data stream without any instruction sent by the instruction module 11.
Optionally, the operator calculating unit 121 includes at least three vector signal analysis basic operators, which are used for performing vector calculation of the corresponding types on the first data and obtaining a vector calculation result.
The at least three vector signal analysis basic operators in the operator calculation unit 121 cover the basic operations of signal processing, and their combination can perform all of the core calculation tasks of an algorithm. Each vector signal analysis basic operator is a general operator.
Optionally, the at least three vector signal analysis basic operators include: a vector scaling operator, a vector point multiplication operator and a vector multiplication operator.
The combination of the vector scaling operator, the vector point multiplication operator and the vector multiplication operator can complete all calculation tasks required by signal processing.
Optionally, the at least three vector signal analysis basic operators further include at least one of: a matrix multiplication operator, a convolution operator, a correlation operator, a butterfly operator, a polynomial operator, a lookup table operator, an accumulation operator, a comparison operator, and a tensor operator.
The matrix multiplication operator can be optimized for 2 × 2 and 4 × 4 matrices, which reduces the number of accesses to the external memory. For example, the matrix multiplication operator expression for multiplying 2 × 2 matrices is:
Figure BDA0002613698740000071
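The equation image is not reproduced in this text. For reference, the standard product of two 2 × 2 matrices can be written as follows; the symbols a, b and c are assumed names and may differ from the notation in the original figure.

```latex
\begin{bmatrix} c_{11} & c_{12} \\ c_{21} & c_{22} \end{bmatrix}
=
\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}
\begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}
=
\begin{bmatrix}
a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\
a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22}
\end{bmatrix}
```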
The convolution operator can realize signal filtering. For example, a finite impulse response (FIR) filter is a common convolution operation, and the convolution operator expression that realizes the FIR filter is:
Figure BDA0002613698740000081
where h(k) are the filter coefficients, which are often symmetric. To support higher-order FIR filters with a fixed number of multipliers, special structures are often adopted. For example, when K is even, the convolution operator expression is:
Figure BDA0002613698740000082
When K is odd, the convolution operator expression is:
Figure BDA0002613698740000083
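The equation images for the FIR expressions are not reproduced in this text. The sketch below uses the standard textbook form y(n) = Σ h(k)·x(n−k) for k = 0 … K−1 and the usual folded structure for symmetric coefficients, in which samples that share a coefficient are added before multiplying. This is an assumption about what the figures show; the function names and the zero-padding convention are hypothetical.

```python
from typing import Sequence


def _sample(x: Sequence[float], i: int) -> float:
    """Return x[i], treating samples outside the signal as zero."""
    return x[i] if 0 <= i < len(x) else 0.0


def fir_direct(x: Sequence[float], h: Sequence[float], n: int) -> float:
    """Direct form: one multiplication per coefficient (K multiplications)."""
    return sum(h[k] * _sample(x, n - k) for k in range(len(h)))


def fir_folded(x: Sequence[float], h: Sequence[float], n: int) -> float:
    """Folded form, valid only when h is symmetric: h[k] == h[K-1-k].

    Samples sharing a coefficient are added first, so an even K needs K/2
    multiplications and an odd K needs (K+1)/2.
    """
    K = len(h)
    acc = 0.0
    for k in range(K // 2):
        acc += h[k] * (_sample(x, n - k) + _sample(x, n - (K - 1 - k)))
    if K % 2 == 1:
        mid = (K - 1) // 2  # unpaired centre tap for odd K
        acc += h[mid] * _sample(x, n - mid)
    return acc


x = [1.0, 2.0, 3.0, 4.0, 5.0]
h_even = [1.0, 2.0, 2.0, 1.0]  # symmetric, K = 4
h_odd = [1.0, 3.0, 1.0]        # symmetric, K = 3
print(fir_direct(x, h_even, 3), fir_folded(x, h_even, 3))
print(fir_direct(x, h_odd, 2), fir_folded(x, h_odd, 2))
```

Halving the multiplication count is what lets a fixed number of multipliers serve a filter of roughly twice the order.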
The expression of the correlation operator is:
Figure BDA0002613698740000084
The butterfly operator is shown in fig. 3A. For example, two vectors a = [a(0), a(1), …, a(N-1)] and b = [b(0), b(1), …, b(N-1)] are input, and the butterfly operator outputs two new vectors [a(0)+b(0), a(1)+b(1), …, a(N-1)+b(N-1)] and [a(0)-b(0), a(1)-b(1), …, a(N-1)-b(N-1)].
An enhanced form of the butterfly operator is to add two normalized multiplication coefficients C0 and C1 at the output, as shown in fig. 3B.
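A minimal sketch of both butterfly forms follows. It assumes that C0 scales the sum output and C1 scales the difference output; fig. 3B is not reproduced here, so that assignment and the function name are assumptions.

```python
from typing import List, Sequence, Tuple


def butterfly(
    a: Sequence[float],
    b: Sequence[float],
    c0: float = 1,
    c1: float = 1,
) -> Tuple[List[float], List[float]]:
    """Element-wise butterfly: returns (c0 * (a + b), c1 * (a - b)).

    With c0 = c1 = 1 this is the basic form; other values give the
    enhanced form (assumed placement of the coefficients).
    """
    if len(a) != len(b):
        raise ValueError("input vectors must have the same length")
    upper = [c0 * (x + y) for x, y in zip(a, b)]
    lower = [c1 * (x - y) for x, y in zip(a, b)]
    return upper, lower


print(butterfly([1, 2, 3], [4, 5, 6]))
print(butterfly([1, 2], [3, 4], c0=2, c1=0.5))
```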
The expression of the polynomial operator is:
y = a_{K-1} x^{K-1} + a_{K-2} x^{K-2} + … + a_0
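One common way to evaluate such a polynomial with only multiplications and additions is Horner's rule, sketched below. The source does not say how the polynomial operator is implemented, so this is an illustration rather than the patented method; the function names and the coefficient ordering (a_0 first) are assumptions.

```python
from typing import List, Sequence


def poly_eval(coeffs: Sequence[float], x: float) -> float:
    """Evaluate a_0 + a_1*x + ... + a_{K-1}*x^(K-1) by Horner's rule.

    coeffs is ordered [a_0, a_1, ..., a_{K-1}] (assumed convention).
    """
    acc = 0
    for a in reversed(coeffs):
        acc = acc * x + a
    return acc


def poly_eval_vector(coeffs: Sequence[float], xs: Sequence[float]) -> List[float]:
    """Apply the same polynomial to every element of a vector."""
    return [poly_eval(coeffs, x) for x in xs]


print(poly_eval([1, 2, 3], 2))
print(poly_eval_vector([1, 2, 3], [0, 1, 2]))
```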
The lookup table operator mainly has two forms: finding the corresponding elements in a table according to an index, and finding the indices of the non-zero elements of a vector.
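The two lookup forms can be sketched as follows. The function names are hypothetical, and the second form is read here as returning the positions of the non-zero elements of a given vector, which is an interpretation of the translated text.

```python
from typing import List, Sequence


def lookup_by_index(table: Sequence[float], indices: Sequence[int]) -> List[float]:
    """Form 1: fetch table[i] for every index i in the index vector."""
    return [table[i] for i in indices]


def nonzero_indices(vec: Sequence[float]) -> List[int]:
    """Form 2: return the positions of the elements that are not 0."""
    return [i for i, v in enumerate(vec) if v != 0]


print(lookup_by_index([10, 20, 30, 40], [3, 0, 2]))
print(nonzero_indices([0, 5, 0, -1]))
```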
The expression of the accumulation operator is:
Figure BDA0002613698740000091
The comparison operator compares corresponding elements of the vector a and the vector b, and its expression is:
Figure BDA0002613698740000092
Tensor operators are used for high-dimensional digital signal processing.
The operator calculating unit 121 may further include any other general vector signal analysis basic operator capable of implementing a signal processing calculation task, so as to implement operator extension. Optionally, at least two of the vector signal analysis basic operators included in the operator calculation unit 121 may share hardware through hardware multiplexing.
This implementation allows the basic operator forms to be unified and flexibly extended, improves the efficiency of the general digital signal processing device for certain algorithm applications, and keeps the logic simple.
The direct memory access controller 122 is configured to receive the operator control instruction, read the first data from a memory according to the operator control instruction, write the first data into the operator computing unit 121, and read the vector computing result from the operator computing unit 121, and write the vector computing result into the memory.
The direct memory access controller 122 exchanges data directly with the memory based on slave-mode general direct memory access (GDMA). After receiving the operator control instruction, the direct memory access controller 122 performs an addressing operation on the first data in the memory, reads the first data, and writes it into the operator calculation unit 121. When the operator calculation unit 121 completes the vector calculation and obtains a vector calculation result, the direct memory access controller 122 automatically performs an addressing operation for the vector calculation result and writes the vector calculation result into the memory.
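The read, compute and write-back sequence can be modelled in software as below. This is a minimal sketch under stated assumptions: the class names, the `run` signature and the list-based memory are hypothetical, and a real GDMA transfer would involve bus transactions that are not modelled here.

```python
from typing import Callable, List, Optional, Sequence


class OperatorUnit:
    """Data-flow-driven unit: it computes as soon as data is written to it."""

    def __init__(self, fn: Callable[[List[float]], List[float]]) -> None:
        self.fn = fn
        self.result: Optional[List[float]] = None

    def write(self, data: Sequence[float]) -> None:
        # Arrival of data triggers the calculation; no instruction is needed.
        self.result = self.fn(list(data))


class DmaController:
    """Moves data between memory and the operator unit for one control instruction."""

    def __init__(self, memory: List[float], unit: OperatorUnit) -> None:
        self.memory = memory
        self.unit = unit

    def run(self, src: int, length: int, dst: int) -> None:
        data = self.memory[src:src + length]           # address and read first data
        self.unit.write(data)                          # feed the operator unit
        result = self.unit.result                      # result is ready after write
        self.memory[dst:dst + len(result)] = result    # write the result back


memory = [1, 2, 3, 4, 0, 0, 0, 0]
unit = OperatorUnit(lambda v: [3 * x for x in v])  # hypothetical vector scaling by 3
DmaController(memory, unit).run(src=0, length=4, dst=4)
print(memory)
```

The operator unit never sees an instruction in this model; only the controller does, which is the point of the data-flow-driven design described above.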
Optionally, for every two consecutive vector signal analysis basic operators, the direct memory access controller 122 is configured to write data to the latter operator after the former operator has completed its vector calculation.
The vector signal analysis basic operators start vector calculation on the data after receiving the data written by the direct memory access controller 122, and all the vector signal analysis basic operators perform vector calculation in series.
Optionally, the direct memory access controller 122 is configured to write all data that needs to perform the same type of vector calculation in the first data into a vector signal analysis basic operator of a corresponding type, so that the vector signal analysis basic operator performs vector calculation on the data at the same time.
The embodiment provides a general digital signal processing device, which utilizes an operator computing module to complete a core computing task in an algorithm, realizes unified and flexible expansion of a basic operator form, utilizes direct memory access to realize direct data interaction with a memory, eliminates a path bottleneck of instruction-data, utilizes the characteristic of data flow driving, ensures that instruction receiving and executing processes of a serial computing module and the operator computing module are not influenced mutually, can achieve better balance of time efficiency, energy efficiency and flexibility of a signal processor, and optimizes the architecture of the existing general digital signal processor; vector calculation is carried out on a plurality of data simultaneously by using a vector signal analysis basic operator included in the operator calculation unit, parallel calculation of a large amount of data is realized, and the working efficiency of the general digital signal processing device is further improved.
Example three
Fig. 4 is a flowchart of a general digital signal processing method according to a third embodiment of the present invention. The present embodiment is applicable to processing general digital signals by serial calculation and operator calculation, and the method can be executed by the general digital signal processing apparatus provided in the embodiments of the present invention. As shown in fig. 4, the general digital signal processing method specifically includes:
Step 301, the instruction module reads an instruction from a memory, generates an operator control instruction or a serial control instruction according to the instruction, sends the operator control instruction to the operator calculation module, and sends the serial control instruction to the serial calculation module.
Step 302, the operator calculation module receives the operator control instruction sent by the instruction module, reads first data from a memory according to the operator control instruction, performs vector calculation on the first data to obtain a vector calculation result, and writes the vector calculation result into the memory.
Step 303, the serial calculation module receives the serial control instruction sent by the instruction module, reads second data from a memory according to the serial control instruction, performs non-vector calculation on the second data to obtain a non-vector calculation result, and writes the non-vector calculation result into the memory.
The step 302 and the step 303 are respectively executed by the operator calculation module and the serial calculation module, the execution processes are not affected by each other, and the execution sequence is not limited.
In this embodiment, the operator calculation module completes the core calculation tasks of an algorithm, which allows the basic operator forms to be unified and flexibly extended. Direct memory access provides direct data interaction with the memory and eliminates the instruction-data path bottleneck. Because the operator calculation module is data-flow driven, the instruction receiving and executing processes of the serial calculation module and the operator calculation module do not affect each other. The time efficiency, energy efficiency and flexibility of the signal processor can therefore be well balanced, and the architecture of the existing general digital signal processor is optimized.
In an optional implementation manner of this embodiment, the instruction module and the serial computation module are connected to the memory through a first data access bus, and the operator computation module is connected to the memory through a second data access bus;
or, the instruction module is connected with the memory through a first data access bus, and the serial computation module and the operator computation module are connected with the memory through a second data access bus;
or, the instruction module is connected to the memory through a first data access bus, the operator calculation module is connected to the memory through a second data access bus, and the serial calculation module is connected to the memory through a third data access bus.
The above embodiments provide a data access bus multiplexing method for processing a universal digital signal, and data interaction between a serial computing module and a memory is combined into a first data access bus or a second data access bus, so that the architecture of a universal digital signal processing apparatus is further optimized, and the integration complexity of the universal digital signal processing apparatus in a system is reduced.
In an optional implementation manner of this embodiment, the operator calculating module includes: the operator computing unit and the direct memory access controller; the operator calculation unit is used for carrying out vector calculation on the first data and obtaining a vector calculation result; the direct memory access controller is used for receiving the operator control instruction, reading the first data from a memory according to the operator control instruction, writing the first data into the operator computing unit, and reading the vector computing result from the operator computing unit, and writing the vector computing result into the memory. Step 302 specifically includes:
the direct memory access controller receives the operator control instruction sent by the instruction module, reads first data from a memory according to the operator control instruction and writes the first data into the operator calculation unit;
the operator calculation unit carries out vector calculation on the first data to obtain a vector calculation result;
the direct memory access controller writes the vector calculation result into the memory.
Optionally, the operator calculation unit includes at least three vector signal analysis basic operators, and the vector calculation of the first data by the operator calculation unit to obtain a vector calculation result includes: and analyzing a basic operator by using at least three vector signals, and performing vector calculation of a corresponding type on the first data to obtain a vector calculation result.
Optionally, the at least three vector signal analysis basic operators include: vector scaling operators, vector point multipliers and vector multipliers.
Optionally, the at least three vector signal analysis basic operators further include: at least one of a matrix multiplier, a convolution operator, a correlation operator, a butterfly operator, a polynomial operator, a look-up table operator, an accumulation operator, a comparison operator, and a tensor operator.
This implementation unifies the form of the basic operators and allows them to be flexibly extended, greatly improves the efficiency of the general digital signal processing device for certain algorithm applications, and keeps the logic simple.
Optionally, reading the first data from the memory and writing the first data into the operator calculation unit according to the operator control instruction includes: for every two vector signal analysis basic operators, the direct memory access controller writes data into the latter vector signal analysis basic operator after the former vector signal analysis basic operator has completed its vector calculation.
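The ordering rule in the preceding paragraph can be sketched as below. This is a simplified, single-threaded model under stated assumptions: the `BasicOperator` class, its `busy` flag, the `feed_in_order` function and the `negate` operator are hypothetical, and the source does not say whether the later operator receives fresh data from memory (as modelled here) or the earlier operator's output.

```python
from typing import Callable, List, Optional, Tuple

Vector = List[float]


class BasicOperator:
    """Hypothetical basic operator with a busy flag."""

    def __init__(self, name: str, fn: Callable[[Vector], Vector]) -> None:
        self.name = name
        self.fn = fn
        self.busy = False
        self.result: Vector = []

    def write(self, data: Vector) -> None:
        # Writing data starts the calculation; busy is cleared when it finishes.
        self.busy = True
        self.result = self.fn(data)
        self.busy = False


def feed_in_order(jobs: List[Tuple[BasicOperator, Vector]]) -> List[str]:
    """Write data to each operator only after the previous one has finished."""
    log: List[str] = []
    previous: Optional[BasicOperator] = None
    for operator, data in jobs:
        if previous is not None and previous.busy:
            raise RuntimeError("previous operator has not completed its calculation")
        log.append("write " + operator.name)
        operator.write(data)
        log.append("done " + operator.name)
        previous = operator
    return log


scale = BasicOperator("scale", lambda v: [2.0 * x for x in v])
negate = BasicOperator("negate", lambda v: [-x for x in v])
log = feed_in_order([(scale, [1.0, 2.0]), (negate, [3.0, 4.0])])
```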
Optionally, reading the first data from the memory and writing the first data into the operator calculation unit according to the operator control instruction further includes: the direct memory access controller writes all data in the first data that require the same type of vector calculation into the vector signal analysis basic operator of the corresponding type, so that this operator performs vector calculation on those data simultaneously.
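Grouping data by calculation type before handing it to an operator can be sketched as follows. The tagging of each vector with a type string, the function names and the two example operators are assumptions made for illustration; simultaneous hardware execution is modelled here simply as one call per operator over its whole batch.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Vector = List[float]
Batch = List[Vector]


def group_by_type(first_data: List[Tuple[str, Vector]]) -> Dict[str, Batch]:
    # Collect all vectors that need the same type of vector calculation.
    groups: Dict[str, Batch] = defaultdict(list)
    for op_type, vector in first_data:
        groups[op_type].append(vector)
    return dict(groups)


def dispatch(
    first_data: List[Tuple[str, Vector]],
    operators: Dict[str, Callable[[Batch], Batch]],
) -> Dict[str, Batch]:
    # Each operator receives its entire batch in a single call.
    return {
        op_type: operators[op_type](batch)
        for op_type, batch in group_by_type(first_data).items()
    }


# Hypothetical operators that process a whole batch at once.
operators: Dict[str, Callable[[Batch], Batch]] = {
    "scale": lambda batch: [[2.0 * x for x in v] for v in batch],
    "square": lambda batch: [[x * x for x in v] for v in batch],
}

first_data = [("scale", [1.0, 2.0]), ("square", [3.0]), ("scale", [4.0])]
results = dispatch(first_data, operators)
```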
This embodiment uses the operator calculation module to complete the core calculation tasks of an algorithm and unifies the form of the basic operators while allowing flexible extension. Direct memory access provides direct data interaction with the memory, which eliminates the instruction-data path bottleneck. Because execution is data-flow driven, the instruction receiving and execution processes of the serial calculation module and the operator calculation module do not affect each other. As a result, a better balance among time efficiency, energy efficiency and flexibility of the signal processor can be achieved, and the architecture of existing general digital signal processors is optimized. In addition, each vector signal analysis basic operator in the operator calculation unit performs vector calculation on multiple data simultaneously, which enables parallel calculation over large amounts of data and further improves the working efficiency of the general digital signal processing device.
Example four
Fig. 5 is a schematic structural diagram of a general digital signal processing system according to a fourth embodiment of the present invention. As shown in fig. 5, the system includes: a memory 51, at least two data access buses 52 and a general digital signal processing device 53 provided by the first or second embodiment of the invention. The general digital signal processing system can execute the general digital signal processing method provided by the third embodiment of the present invention, and the specific implementation corresponds to the implementation in the foregoing embodiments.
The memory 51 is used for storing instructions, data to be calculated and calculation results. The memory 51 is connected to an instruction module in the general digital signal processing device 53 through a first data access bus, so that the instruction module reads instructions from the memory 51. The memory 51 is connected with the operator calculation module in the general digital signal processing device 53 through a second data access bus, so that the operator calculation module reads the first data in the data to be calculated from the memory 51 and writes the vector calculation result into the memory 51. The memory 51 is connected with the serial computing module in the general digital signal processing device 53 through the first data access bus, the second data access bus or the third data access bus, so that the serial computing module reads the second data in the data to be computed from the memory 51 and writes the non-vector computing result into the memory 51.
The at least two data access buses 52 include a first data access bus and a second data access bus. The first data access bus can be used for data interaction between the instruction module in the general digital signal processing device 53 and the memory 51, and the second data access bus can be used for data interaction between the operator calculation module in the general digital signal processing device 53 and the memory 51. Because the amount of data exchanged between the serial calculation module in the general digital signal processing device 53 and the memory 51 is small, this data interaction can be carried over the first data access bus or the second data access bus. The at least two data access buses 52 may further include a third data access bus, in which case the data interaction between the serial calculation module and the memory 51 may instead be carried over the third data access bus.
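The permitted bus assignments can be captured in a small configuration check. This is only a sketch: the `BusAssignment` class, its field names and the numbering of buses as 1, 2 and 3 are assumptions introduced for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class BusAssignment:
    """Hypothetical record of which data access bus each module uses."""

    instruction_bus: int = 1  # instruction module uses the first bus
    operator_bus: int = 2     # operator calculation module uses the second bus
    serial_bus: int = 1       # serial calculation module: first, second or third bus

    def __post_init__(self) -> None:
        if self.instruction_bus != 1:
            raise ValueError("instruction module must use the first bus")
        if self.operator_bus != 2:
            raise ValueError("operator calculation module must use the second bus")
        if self.serial_bus not in (1, 2, 3):
            raise ValueError("serial calculation module must use bus 1, 2 or 3")

    def buses_required(self) -> List[int]:
        # Distinct buses the system has to provide for this assignment.
        return sorted({self.instruction_bus, self.operator_bus, self.serial_bus})
```

For instance, `BusAssignment(serial_bus=2).buses_required()` yields two buses, while `BusAssignment(serial_bus=3).buses_required()` yields three, matching the optional third bus.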
The general digital signal processing device 53 includes an instruction module, an operator calculation module, and a serial calculation module. The instruction module is configured to read an instruction from the memory 51 through the first data access bus, generate an operator control instruction or a serial control instruction according to the instruction, send the operator control instruction to the operator calculation module, and send the serial control instruction to the serial calculation module. The operator calculation module is configured to receive the operator control instruction sent by the instruction module, read first data from the memory 51 through a second data access bus according to the operator control instruction, perform vector calculation on the first data to obtain a vector calculation result, and write the vector calculation result into the memory 51. The serial calculation module is configured to receive the serial control instruction sent by the instruction module, read second data from the memory 51 through the first data access bus, the second data access bus, or the third data access bus according to the serial control instruction, perform non-vector calculation on the second data to obtain a non-vector calculation result, and write the non-vector calculation result into the memory 51.
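The routing role of the instruction module can be sketched as below. The instruction format (a dictionary with a `kind` field taking the values `vector` or `scalar`) and the use of two queues are assumptions; the source only states that an instruction yields either an operator control instruction or a serial control instruction.

```python
from collections import deque
from typing import Deque, Dict, List

Instruction = Dict[str, object]


def instruction_module(
    program: List[Instruction],
    operator_queue: Deque[Instruction],
    serial_queue: Deque[Instruction],
) -> None:
    """Decode each instruction and send the control instruction to one module."""
    for instruction in program:
        # "Decoding" here just strips the hypothetical kind field.
        control = {k: v for k, v in instruction.items() if k != "kind"}
        if instruction["kind"] == "vector":
            operator_queue.append(control)   # operator control instruction
        elif instruction["kind"] == "scalar":
            serial_queue.append(control)     # serial control instruction
        else:
            raise ValueError("unknown instruction kind")


operator_queue: Deque[Instruction] = deque()
serial_queue: Deque[Instruction] = deque()
program: List[Instruction] = [
    {"kind": "scalar", "name": "S1", "src": 0},
    {"kind": "vector", "name": "P1", "src": 8, "length": 4},
]
instruction_module(program, operator_queue, serial_queue)
```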
In one specific implementation of the general digital signal processing system, when the system starts a data processing task, the instruction module in the general digital signal processing device 53 reads an instruction S1 from the memory 51 through the first data access bus of the at least two data access buses 52, decodes it to obtain a serial control instruction S1, and sends the serial control instruction S1 to the serial calculation module in the general digital signal processing device 53. The serial calculation module receives the serial control instruction S1, reads one item of second data from the memory 51 through the first data access bus, the second data access bus or the third data access bus according to the serial control instruction S1, performs the corresponding non-vector calculation on it to obtain a non-vector calculation result, and writes the non-vector calculation result into the memory 51. After sending the serial control instruction S1 to the serial calculation module, the instruction module reads the next instruction P1 from the memory 51, decodes it to obtain an operator control instruction P1, and sends the operator control instruction P1 to the operator calculation module in the general digital signal processing device 53. The operator calculation module receives the operator control instruction P1, reads a plurality of first data from the memory 51 through the second data access bus of the at least two data access buses 52 according to the operator control instruction P1, performs the corresponding vector calculation on the plurality of first data to obtain a vector calculation result, and writes the vector calculation result into the memory 51. The operator calculation module can perform vector calculation on the plurality of first data in parallel and, once the first data have been read, can complete the vector calculation without any further instruction from the instruction module, so the calculation processes of the operator calculation module and the serial calculation module do not affect each other. This flow repeats until the instruction module has read all the instructions in the memory 51 that correspond to the data processing task and the operator calculation module and the serial calculation module have written all calculation results into the memory 51, at which point the data processing task is complete.
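The S1/P1 flow can be mimicked with two worker threads fed by separate queues, which shows how each module proceeds once it has its control instruction. This is a software analogy under stated assumptions, not the hardware design: the dictionary used as memory, the key names, the `+ 1.0` scalar calculation and the doubling vector calculation are all hypothetical.

```python
import queue
import threading
from typing import Dict

# Hypothetical shared memory holding one scalar and one vector.
memory: Dict[str, object] = {"x": 3.0, "v": [1.0, 2.0, 3.0]}
memory_lock = threading.Lock()


def serial_module(inbox: queue.Queue) -> None:
    # Handles serial control instructions: one non-vector calculation each.
    while True:
        ctrl = inbox.get()
        if ctrl is None:  # sentinel: no more instructions
            break
        with memory_lock:
            value = memory[ctrl["src"]]
        result = value + 1.0
        with memory_lock:
            memory[ctrl["dst"]] = result


def operator_module(inbox: queue.Queue) -> None:
    # Handles operator control instructions: one vector calculation each.
    while True:
        ctrl = inbox.get()
        if ctrl is None:
            break
        with memory_lock:
            data = list(memory[ctrl["src"]])
        result = [2.0 * x for x in data]
        with memory_lock:
            memory[ctrl["dst"]] = result


serial_q: queue.Queue = queue.Queue()
operator_q: queue.Queue = queue.Queue()
workers = [
    threading.Thread(target=serial_module, args=(serial_q,)),
    threading.Thread(target=operator_module, args=(operator_q,)),
]
for worker in workers:
    worker.start()

# The instruction module issues S1, then P1, without waiting for S1 to finish.
serial_q.put({"src": "x", "dst": "s_out"})    # serial control instruction S1
operator_q.put({"src": "v", "dst": "v_out"})  # operator control instruction P1
serial_q.put(None)
operator_q.put(None)
for worker in workers:
    worker.join()
```

The task counts as complete only after both workers have written their results back, which corresponds to the completion condition stated at the end of the flow.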
This embodiment improves data processing efficiency: the operator calculation module spends a high proportion of its time on the core calculation tasks of the algorithm, which improves the energy efficiency ratio of the calculation.
This embodiment provides a general digital signal processing system that uses the operator calculation module to complete the core calculation tasks of an algorithm and unifies the form of the basic operators while allowing flexible extension. Direct memory access provides direct data interaction with the memory, which eliminates the instruction-data path bottleneck. Because execution is data-flow driven, the instruction receiving and execution processes of the serial calculation module and the operator calculation module do not affect each other, so a better balance among time efficiency, energy efficiency and flexibility of the signal processor can be achieved, and the architecture of existing general digital signal processors is optimized. Merging the data interaction between the serial calculation module and the memory onto the first data access bus or the second data access bus further optimizes the architecture of the general digital signal processing device and reduces the complexity of integrating it into a system.
It is to be noted that the foregoing is only illustrative of the preferred embodiments of the present invention and the technical principles employed. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, although the present invention has been described in greater detail by the above embodiments, the present invention is not limited to the above embodiments, and may include other equivalent embodiments without departing from the spirit of the present invention, and the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

1. A general-purpose digital signal processing apparatus, comprising: an instruction module, an operator calculation module and a serial calculation module;
the instruction module is used for reading an instruction from a memory, generating an operator control instruction or a serial control instruction according to the instruction, sending the operator control instruction to the operator calculation module, and sending the serial control instruction to the serial calculation module;
the operator calculation module is used for receiving the operator control instruction sent by the instruction module, reading first data from a memory according to the operator control instruction, performing vector calculation on the first data to obtain a vector calculation result, and writing the vector calculation result into the memory;
the serial calculation module is used for receiving the serial control instruction sent by the instruction module, reading second data from a memory according to the serial control instruction, performing non-vector calculation on the second data to obtain a non-vector calculation result, and writing the non-vector calculation result into the memory.
2. The apparatus of claim 1, wherein the operator calculation module comprises: an operator calculation unit and a direct memory access controller;
the operator calculation unit is used for performing vector calculation on the first data to obtain a vector calculation result;
the direct memory access controller is used for receiving the operator control instruction, reading the first data from a memory according to the operator control instruction, writing the first data into the operator calculation unit, reading the vector calculation result from the operator calculation unit, and writing the vector calculation result into the memory.
3. The apparatus of claim 2, wherein the operator calculation unit comprises: at least three vector signal analysis basic operators, used for performing vector calculation of corresponding types on the first data to obtain the vector calculation result.
4. The apparatus of claim 3, wherein the at least three vector signal analysis basic operators comprise: vector scaling operators, vector point multipliers and vector multipliers.
5. The apparatus of claim 4, wherein the at least three vector signal analysis basic operators further comprise: at least one of a matrix multiplier, a convolution operator, a correlation operator, a butterfly operator, a polynomial operator, a look-up table operator, an accumulation operator, a comparison operator, and a tensor operator.
6. The apparatus of claim 3, wherein the direct memory access controller is configured to, for every two of the vector signal analysis basic operators, write data into the latter vector signal analysis basic operator after the former vector signal analysis basic operator has completed its vector calculation.
7. The apparatus according to claim 3, wherein the direct memory access controller is configured to write all data of the first data, which need to perform vector calculation of a same type, into a vector signal analysis basic operator of a corresponding type, so that the vector signal analysis basic operator performs vector calculation on the data simultaneously.
8. The apparatus of claim 1, wherein the instruction module and the serial computation module are connected to the memory via a first data access bus, and the operator computation module is connected to the memory via a second data access bus;
or, the instruction module is connected with the memory through a first data access bus, and the serial computation module and the operator computation module are connected with the memory through a second data access bus.
9. A general-purpose digital signal processing method, performed by the general-purpose digital signal processing apparatus of any one of claims 1 to 8, the method comprising:
the instruction module reads an instruction from a memory, generates an operator control instruction or a serial control instruction according to the instruction, sends the operator control instruction to the operator calculation module, and sends the serial control instruction to the serial calculation module;
the operator calculation module receives an operator control instruction sent by the instruction module, reads first data from a memory according to the operator control instruction, performs vector calculation on the first data to obtain a vector calculation result, and writes the vector calculation result into the memory;
the serial calculation module receives the serial control instruction sent by the instruction module, reads second data from a memory according to the serial control instruction, performs non-vector calculation on the second data to obtain a non-vector calculation result, and writes the non-vector calculation result into the memory.
10. A general-purpose digital signal processing system, comprising a memory and at least two data access buses, characterized by further comprising the general-purpose digital signal processing apparatus of any one of claims 1 to 8.
CN202010763369.3A 2020-07-31 2020-07-31 Universal digital signal processing device, method and system Pending CN114063977A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010763369.3A CN114063977A (en) 2020-07-31 2020-07-31 Universal digital signal processing device, method and system

Publications (1)

Publication Number Publication Date
CN114063977A true CN114063977A (en) 2022-02-18

Family

ID=80227919

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010763369.3A Pending CN114063977A (en) 2020-07-31 2020-07-31 Universal digital signal processing device, method and system

Country Status (1)

Country Link
CN (1) CN114063977A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 201206 Shanghai Pudong New Area free trade pilot area 1258 moon 3 building fourth floor A406 room

Applicant after: Chen core technology Co.,Ltd.

Applicant after: Chenxin Technology Co.,Ltd.

Address before: 201206 Shanghai Pudong New Area free trade pilot area 1258 moon 3 building fourth floor A406 room

Applicant before: Chen core technology Co.,Ltd.

Applicant before: Chenxin Technology Co.,Ltd.