CN113076277B - Method, device, computer storage medium and terminal for realizing pipeline scheduling - Google Patents
Method, device, computer storage medium and terminal for realizing pipeline scheduling
- Publication number
- CN113076277B (CN202110331636.4A)
- Authority
- CN
- China
- Prior art keywords
- pipeline
- pipeline stage
- data
- stage
- running
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/38—Information transfer, e.g. on bus
- G06F13/42—Bus transfer protocol, e.g. handshake; Synchronisation
- G06F13/4282—Bus transfer protocol, e.g. handshake; Synchronisation on a serial bus, e.g. I2C bus, SPI bus
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/76—Architectures of general purpose stored program computers
- G06F15/78—Architectures of general purpose stored program computers comprising a single central processing unit
- G06F15/7867—Architectures of general purpose stored program computers comprising a single central processing unit with reconfigurable architecture
- G06F15/7871—Reconfiguration support, e.g. configuration loading, configuration switching, or hardware OS
- G06F15/7878—Reconfiguration support, e.g. configuration loading, configuration switching, or hardware OS for pipeline reconfiguration
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- Advance Control (AREA)
- Power Sources (AREA)
Abstract
The invention discloses a method, a device, a computer storage medium and a terminal for realizing pipeline scheduling. When the pipeline stages are not aligned, the running time information of each pipeline stage with which the pipeline processes data is acquired, and the running clock frequency of more than one pipeline stage is adjusted according to the acquired running time information so as to align the pipeline stages. The pipeline is thereby scheduled automatically and in real time, its power consumption is reduced, and its transmission bandwidth and operating efficiency are improved.
Description
Technical Field
The present disclosure relates to, but is not limited to, chip processing technologies, and in particular, to a method, an apparatus, a computer storage medium, and a terminal for implementing pipeline scheduling.
Background
In communication between a host and an encryption and decryption chip, scheduling the data path from the communication interface to the encryption and decryption processing as a pipeline can, with the same hardware resources, greatly improve the system bandwidth and achieve high-speed data transmission and processing.
For each pipeline stage to run at maximum efficiency, the running times of the pipeline stages must be the same or approximately the same. If the running times differ, they have to be equalized by inserting idle waits, which prolongs the processing time of the chip and wastes resources.
Take as an example an encryption and decryption chip that uses a Serial Peripheral Interface (SPI) as its communication interface. FIG. 1 is a schematic diagram of a related-art system for implementing data processing of the encryption and decryption chip. As shown in FIG. 1, the system comprises the SPI, a four-channel Direct Memory Access (DMA) module and an encryption and decryption module for encryption and decryption processing. The four-channel DMA comprises: a first EDMAC, a second EDMAC, a first Static Random Access Memory (SRAM) sending module, a second SRAM sending module, a first SRAM receiving module and a second SRAM receiving module. The first EDMAC is connected to the SPI and receives the data to be operated on; it transmits the data to the first SRAM sending module through sending channel one, the data is then sent to the second EDMAC through sending channel three, and the second EDMAC sends it to the encryption and decryption module. Similarly, the first EDMAC transmits data to be operated on to the second SRAM sending module through sending channel two, the data is sent to the second EDMAC through sending channel four, and the second EDMAC sends it to the encryption and decryption module. The DMA preprocesses the data to be operated on during data transmission. After the encryption and decryption module has encrypted or decrypted the received preprocessed data, the second EDMAC transmits the result to the first SRAM receiving module through receiving channel three, and the result is sent to the first EDMAC through receiving channel one, which completes the return of the encrypted and decrypted data. Similarly, the second EDMAC transmits the encrypted and decrypted data to the second SRAM receiving module through receiving channel four, and the data is sent to the first EDMAC through receiving channel two, which completes the return of the encrypted and decrypted data.
Based on the system of FIG. 1, an external host transmits the data to be operated on to the encryption and decryption chip through the SPI, and the chip preprocesses the data and then performs encryption and decryption processing. In the related art, the data is generally preprocessed by an algorithm such as the Chinese national hash algorithm SM3, and encrypted and decrypted by an algorithm such as the asymmetric encryption and decryption algorithm SM2. The size of the data to be operated on generally meets the data length requirement of the encryption and decryption processing; for example, digest signature verification, message signature verification and long-data signature verification correspond to different lengths ranging from one hundred to several hundred bytes. The data to be operated on is preprocessed, transferred by DMA, and then encrypted and decrypted by SM2. To process large data volumes at high speed, several pipeline stages running SM2 generally have to work simultaneously. If three pipeline stages running SM2 are set and named SM2_0, SM2_1 and SM2_2 (an underscore followed by a number), then SM2_0, SM2_1 and SM2_2 each perform one part of the encryption and decryption processing. After SM2 has encrypted and decrypted the preprocessed data, the result is transmitted back to the external host through the DMA and the SPI, which completes one round of data processing by the encryption and decryption chip. Apart from pipeline stages SM2_0, SM2_1 and SM2_2, the related art generally implements the SM3 preprocessing with one pipeline stage.
When a pipeline runs, a pipeline whose stages all have equal execution time is called an aligned pipeline, and an aligned pipeline achieves the maximum speedup ratio. A pipeline whose stages take unequal time is called a non-aligned pipeline. To keep a non-aligned pipeline running normally, the related art generally inserts idle waits into the pipeline stages that take less time, so part of the operating efficiency of the pipeline is lost. Taking a pipeline with four pipeline stages as an example, FIG. 2 is a working schematic diagram of an aligned pipeline. In FIG. 2, S1, S2, S3 and S4 denote the four pipeline stages, and each row of S1, S2, S3 and S4, ordered by time, represents one complete data encryption and decryption pass of the encryption and decryption chip. The S1 entries arranged vertically along the space axis of the pipeline are continuous in time and represent the SM3 preprocessing of the different passes; the S2 entries arranged vertically along the space axis are likewise continuous in time and represent the encryption and decryption, by SM2_0, SM2_1 and SM2_2, of the preprocessed data of the row in which each S2 is located. The four pipeline stages in the figure have the same execution time, and the running time of each pipeline stage is ΔT.
FIG. 3 is a working schematic diagram of a non-aligned pipeline. As shown in FIG. 3, the running time of S1 is ΔT, which is greater than the running times of S2, S3 and S4. To keep the pipeline running normally, S2 to S4 all have to be padded with idle waits so that their running time is extended to ΔT; the pipeline stages are thereby aligned and the pipeline runs normally. Because the running times of the pipeline stages differ, their operation rates do not match. Aligning the stages by inserting idle waits reduces the transmission bandwidth available for scheduling the encryption and decryption chip, and the system power consumption increases because the encryption and decryption modules run faster than that bandwidth requires. If the encryption and decryption module is set to a slower operation rate, or the preprocessing module is not configured at its optimal operation rate, the data transmission bandwidth of the encryption and decryption chip is reduced.
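The cost of aligning a pipeline with idle waits can be made concrete with a small calculation. The following is a minimal sketch and not part of the patent: the function names `idle_padding` and `utilization` and the sample stage times are assumptions chosen for illustration. It pads every stage to the running time of the slowest stage and reports how much of the padded time is useful work.

```python
def idle_padding(stage_times_us):
    """Return (cycle_time, idle_per_stage) for idle-wait alignment.

    stage_times_us: running time of each pipeline stage in microseconds.
    The slowest stage sets the cycle time; every other stage idles for the
    difference (hypothetical helper, for illustration only).
    """
    cycle = max(stage_times_us)
    idle = [cycle - t for t in stage_times_us]
    return cycle, idle


def utilization(stage_times_us):
    """Fraction of the padded pipeline time spent doing useful work."""
    cycle, _ = idle_padding(stage_times_us)
    return sum(stage_times_us) / (cycle * len(stage_times_us))


# Illustrative values: S1 is the slow stage, S2..S4 finish early.
cycle, idle = idle_padding([150, 80, 80, 80])
print(cycle, idle)                                 # 150 [0, 70, 70, 70]
print(round(utilization([150, 80, 80, 80]), 2))    # 0.65
```

In an aligned pipeline every idle entry is zero and the utilization is 1.0, which is the case FIG. 2 depicts.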
Disclosure of Invention
The following is a summary of the subject matter described in detail herein. This summary is not intended to limit the scope of the claims.
The embodiment of the invention provides a method, a device, a computer storage medium and a terminal for realizing pipeline scheduling, which can reduce the power consumption of a pipeline and improve the transmission bandwidth and the operation efficiency of the pipeline.
The embodiment of the invention provides a method for realizing pipeline scheduling, which comprises the following steps:
When the pipeline stages are not aligned, acquiring the running time information of each pipeline stage of pipeline operation data;
and adjusting the running clock frequency of more than one pipeline stage according to the acquired running time information of each pipeline stage so as to realize pipeline stage alignment.
On the other hand, the embodiment of the invention also provides a computer storage medium, wherein a computer program is stored in the computer storage medium, and the method for realizing pipeline scheduling is realized when the computer program is executed by a processor.
In still another aspect, an embodiment of the present invention further provides a terminal, including: a memory and a processor, the memory storing a computer program; wherein,
The processor is configured to execute the computer program in the memory;
the computer program, when executed by the processor, implements a method of implementing pipeline scheduling as described above.
In still another aspect, an embodiment of the present invention further provides an apparatus for implementing pipeline scheduling, including: an acquisition unit and an adjustment unit; wherein,
The acquisition unit is configured to: when the pipeline stages are not aligned, acquiring the running time information of each pipeline stage of pipeline operation data;
the adjusting unit is configured to: and adjusting the running clock frequency of more than one pipeline stage according to the acquired running time information of each pipeline stage so as to realize pipeline stage alignment.
When the pipeline stages are not aligned, the embodiment of the invention acquires the running time information of each pipeline stage with which the pipeline processes data, and adjusts the running clock frequency of more than one pipeline stage according to the acquired running time information so as to align the pipeline stages. The pipeline is thereby scheduled automatically and in real time, its power consumption is reduced, and its transmission bandwidth and operating efficiency are improved.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification; they illustrate the application and do not limit it.
FIG. 1 is a schematic diagram of a related art system for implementing encryption and decryption chip data processing;
FIG. 2 is a schematic diagram of the operation of an alignment pipeline;
FIG. 3 is a schematic diagram of the operation of a non-aligned pipeline;
FIG. 4 is a flow chart of a method of implementing pipeline scheduling according to an embodiment of the present invention;
FIG. 5 is a block diagram of an apparatus for implementing pipeline scheduling according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of the encryption and decryption system of an application example of the present invention;
FIG. 7 is a schematic diagram of the operation of the system in an application example of the present invention;
FIG. 8 is another schematic diagram of the operation of the system in an application example of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, embodiments of the present application will be described in detail hereinafter with reference to the accompanying drawings. It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be arbitrarily combined with each other.
The steps illustrated in the flowcharts of the figures may be performed in a computer system, for example as a set of computer-executable instructions. Also, while a logical order is depicted in the flowcharts, in some cases the steps depicted or described may be performed in a different order than presented herein.
The inventor has found that a technician sets the running clock frequency of each pipeline stage according to the operation performance of that pipeline stage, thereby controlling its running time. Because services differ, the code stream lengths of the data sent by the external host to the encryption and decryption chip differ, yet the running clock frequencies of the pipeline stages are generally not adjusted once the pipeline is running. When the pipeline stages contained in the pipeline are not aligned, they are aligned by inserting idle waits, which increases system power consumption and reduces the data transmission bandwidth. If the pipeline stages are not aligned by inserting idle waits, a technician has to write the redetermined running clock frequencies of the pipeline stages into the system manually, which gives poor real-time behaviour and impairs the operating efficiency of the pipeline.
FIG. 4 is a flowchart of a method for implementing pipeline scheduling according to an embodiment of the present invention, as shown in FIG. 4, including:
Step 401, when the pipeline stages are not aligned, acquiring the running time information of each pipeline stage of pipeline operation data;
step 402, adjusting the operation clock frequency of more than one pipeline stage according to the obtained operation time information of each pipeline stage so as to realize pipeline stage alignment.
When the pipeline stages are not aligned, the embodiment of the invention acquires the running time information of each pipeline stage with which the pipeline processes data, and adjusts the running clock frequency of more than one pipeline stage according to the acquired running time information so as to align the pipeline stages. The pipeline is thereby scheduled automatically and in real time, its power consumption is reduced, and its transmission bandwidth and operating efficiency are improved.
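Steps 401 and 402 can be pictured as a small control routine. The sketch below is only an illustration under stated assumptions: it assumes that the running time of a stage is inversely proportional to its clock frequency, that a target stage time is supplied by the caller, and that a 1-microsecond tolerance counts as aligned. The names `is_aligned`, `rescale_frequencies` and `schedule` are hypothetical and do not come from the patent.

```python
def is_aligned(times_us, tolerance_us=1.0):
    """Stages count as aligned when their running times differ by at most the tolerance."""
    return max(times_us.values()) - min(times_us.values()) <= tolerance_us


def rescale_frequencies(times_us, freqs_mhz, target_us):
    """Scale each stage clock so that its running time becomes target_us.

    Assumes running time is inversely proportional to clock frequency.
    """
    return {s: freqs_mhz[s] * times_us[s] / target_us for s in times_us}


def schedule(times_us, freqs_mhz, target_us):
    """Step 401: inspect the running times; step 402: adjust clocks if misaligned."""
    if is_aligned(times_us):
        return dict(freqs_mhz)
    return rescale_frequencies(times_us, freqs_mhz, target_us)


# Illustrative values only: S1 runs too slowly, S2 runs faster than needed.
new_freqs = schedule({"S1": 150, "S2": 80}, {"S1": 100, "S2": 100}, target_us=120)
print(new_freqs["S1"])            # 125.0
print(round(new_freqs["S2"], 2))  # 66.67
```

Here the slow stage is sped up and the fast stage is slowed down, so neither stage has to idle; how the target time is chosen is left open in this sketch.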
In one illustrative example, embodiments of the present invention determine whether the pipeline stages are aligned according to whether idle waits are set while the pipeline is running.
In an exemplary embodiment, the method adjusts the running clock frequency of more than one pipeline stage according to the acquired running time information of each pipeline stage, and comprises the following steps:
determining the running time information of each pipeline stage according to the data quantity transmitted by an external host;
determining a pipeline stage for performing operation processing on the data according to the determined running time information of each pipeline stage;
traversing and determining more than one group of pipeline running clock frequencies in the operation rate range of the pipeline stage;
Selecting one of more than one group of pipeline running clock frequencies determined by traversal according to preset performance, power consumption and area (PPA) information as the pipeline running clock frequency for realizing pipeline stage alignment;
Wherein each set of pipeline running clock frequencies includes: determined running clock frequencies of more than one pipeline stage under which pipeline stage alignment for the operation processing of the data can be achieved.
The data amount transmitted by the external host is the data amount of the pipeline operation processing data.
PPA in the embodiment of the invention refers to the performance, power consumption and area of the encryption and decryption chip, as commonly understood by a person skilled in the art. In one illustrative example, an embodiment of the present invention may select one set of pipeline running clock frequencies, based on part of the PPA information, as the pipeline running clock frequencies for achieving pipeline stage alignment. A set of pipeline running clock frequencies refers to running clock frequencies of the pipeline stages under which the pipeline stages contained in the pipeline are aligned.
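The traversal and PPA-based selection described above can be sketched as an enumeration over candidate clock frequencies. This is a hedged illustration rather than the patented procedure: the cycle counts, the candidate frequency lists, the timing model (running time in microseconds equals cycles divided by frequency in MHz) and the cost function are all assumptions, and `aligned_frequency_sets` and `select_by_ppa` are hypothetical names.

```python
from itertools import product


def aligned_frequency_sets(cycles, candidates_mhz, tolerance_us=1.0):
    """Enumerate clock-frequency sets whose per-stage running times match.

    cycles: clock cycles each operation needs per pipeline stage (assumed model:
            running time in microseconds = cycles / frequency in MHz).
    candidates_mhz: allowed frequencies per operation, i.e. its operation-rate range.
    """
    names = list(cycles)
    result = []
    for combo in product(*(candidates_mhz[n] for n in names)):
        times = [cycles[n] / f for n, f in zip(names, combo)]
        if max(times) - min(times) <= tolerance_us:
            result.append(dict(zip(names, combo)))
    return result


def select_by_ppa(freq_sets, cost):
    """Pick the set with the lowest cost; the cost function stands in for preset PPA information."""
    return min(freq_sets, key=cost)


# Hypothetical numbers chosen so that several aligned sets exist.
cycles = {"sm3": 15000, "sm2": 12000}
candidates = {"sm3": [75, 100, 125], "sm2": [60, 80, 100, 150]}
sets_found = aligned_frequency_sets(cycles, candidates)
print(sets_found)
# [{'sm3': 75, 'sm2': 60}, {'sm3': 100, 'sm2': 80}, {'sm3': 125, 'sm2': 100}]

# Using the sum of frequencies as a crude power proxy selects the slowest aligned set.
print(select_by_ppa(sets_found, cost=lambda s: sum(s.values())))
# {'sm3': 75, 'sm2': 60}
```

A cost function that weights performance more heavily would instead favour the fastest aligned set; the choice of weighting is what the preset PPA information would express.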
In one illustrative example, an embodiment of the present invention determines a pipeline stage for arithmetic processing of data, comprising:
Determining total duration information of each operation processing of the data according to the running time information of each pipeline stage;
determining the number of pipeline stages required by each operation processing on the data according to the determined total duration information;
Wherein the operation processing includes: preprocessing and/or encryption and decryption processing.
It should be noted that, for a given operation process on the data, the embodiment of the present invention can determine, by invoking a pipeline stage, the time the pipeline stage requires at its highest operation rate. Once the total duration of that operation process has been determined, the minimum number of pipeline stages required for it can be determined from the quotient of the total duration and the time required at the highest operation rate. In an exemplary embodiment, the encryption and decryption processing of the embodiment of the present invention includes any one of the following: encryption processing, decryption processing, and combined encryption and decryption processing.
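The quotient rule for the minimum number of pipeline stages can be written in a few lines. This is a minimal sketch: the text only mentions the quotient, so rounding it up to a whole number of stages is an assumption made here, and `min_stage_count` is a hypothetical name.

```python
import math


def min_stage_count(total_duration_us, fastest_stage_time_us):
    """Minimum number of pipeline stages for one operation process.

    total_duration_us: total duration of the operation process.
    fastest_stage_time_us: time one pipeline stage needs at its highest operation rate.
    The quotient is rounded up (assumption) because a fractional stage cannot exist.
    """
    return math.ceil(total_duration_us / fastest_stage_time_us)


print(min_stage_count(240, 80))   # 3
print(min_stage_count(240, 100))  # 3
print(min_stage_count(240, 120))  # 2
```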
In an exemplary embodiment, encryption and decryption chips based on pipeline processing generally differ only in their service scenarios, so when data is processed on the same pipeline, the number of pipeline stages required for a given operation process generally changes only by one, up or down. For example, if a technician has determined that three pipeline stages are used for encryption and decryption processing when the pipeline stages are aligned, then when misalignment is detected the number of pipeline stages determined for encryption and decryption processing lies between 2 and 4. Assume that the pipeline includes a first operation process and a second operation process, where the first operation process uses one pipeline stage and the second operation process uses three pipeline stages. Let the running time of the first operation process be X and the running time of each pipeline stage of the second operation process be Y. When 2X is greater than 3Y and the operation rate of the pipeline stage of the first operation process can be adjusted, the embodiment of the invention can determine that the first operation process uses one pipeline stage and the second operation process uses two pipeline stages, and adjust the operation rates by adjusting the running clock frequencies, so that pipeline stage alignment is achieved when 2X equals 2Y. For ease of understanding, assume that when the pipeline stages are not aligned the initial value of X is 150 microseconds and the initial value of Y is 80 microseconds. Within the operation rate range of the pipeline stages, the running time of the first operation process is adjusted to 120 microseconds and the running time of the second operation process is adjusted to 240 microseconds, with the second operation process running on two pipeline stages.
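The numbers in this example can be checked with a short calculation. The reading below is an interpretation rather than a statement from the patent: it assumes that the 240 microseconds is the total for the second operation process, split evenly over its two pipeline stages, and the primed symbols X' and Y' are introduced here for the adjusted values.

```latex
\begin{align*}
\text{before: } & X = 150\,\mu\text{s},\quad Y = 80\,\mu\text{s},\quad 2X = 300\,\mu\text{s} > 3Y = 240\,\mu\text{s} \\
\text{after: }  & X' = 120\,\mu\text{s},\quad Y' = \frac{240\,\mu\text{s}}{2} = 120\,\mu\text{s} \\
\text{aligned: } & 2X' = 2Y' = 240\,\mu\text{s}
\end{align*}
```

Under this reading each pipeline stage takes 120 microseconds after adjustment, so no stage needs an idle wait and one of the three original encryption and decryption pipeline stages is no longer required.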
In one illustrative example, an embodiment of the present invention traverses determining more than one set of pipeline running clock frequencies, comprising:
according to the number of the pipeline stages required by each operation processing of the data, more than one group of pipeline stage combinations capable of realizing pipeline stage alignment of the operation processing of the data are determined;
and determining the running clock frequency of each pipeline stage within the operation rate range of the pipeline stage for each group of pipeline stage combination capable of realizing pipeline stage alignment for carrying out operation processing on data.
In one illustrative example, the operation clock frequencies of the pipeline stages for performing the same operation process are the same, i.e., the operation rates (required operation times) of the pipeline stages for performing the same operation process are the same.
In addition, once the number of pipeline stages required for each operation process under pipeline stage alignment has been determined, the embodiment of the invention can determine the running clock frequency of each pipeline stage by adjusting, step by step, the operation rate of the pipeline stages that perform each operation process.
In an exemplary embodiment, the embodiment of the present invention may set a minimum operation rate of the pipeline stage according to power consumption, transmission bandwidth, and the like, and set a maximum operation rate of the pipeline stage according to operation performance of the pipeline stage. And determining the operation rate range of the pipeline stage according to the determined minimum operation rate and the determined maximum operation rate.
In an exemplary embodiment, after determining the number of pipeline stages required for each operation process on the data, the embodiment of the present invention further includes: adding a pipeline stage by adjusting its running clock frequency from zero to non-zero, or removing a pipeline stage by adjusting its running clock frequency from non-zero to zero.
The embodiment of the invention also provides a computer storage medium, wherein a computer program is stored in the computer storage medium, and the method for realizing pipeline scheduling is realized when the computer program is executed by a processor.
The embodiment of the invention also provides a terminal, which comprises: a memory and a processor, the memory storing a computer program; wherein,
The processor is configured to execute the computer program in the memory;
The computer program, when executed by a processor, implements a method of implementing pipeline scheduling as described above.
FIG. 5 is a block diagram of an apparatus for implementing pipeline scheduling according to an embodiment of the present invention, as shown in FIG. 5, including: an acquisition unit and an adjustment unit; wherein,
The acquisition unit is configured to: when the pipeline stages are not aligned, acquiring the running time information of each pipeline stage of pipeline operation data;
the adjusting unit is configured to: and adjusting the running clock frequency of more than one pipeline stage according to the acquired running time information of each pipeline stage so as to realize pipeline stage alignment.
When the pipeline stages are not aligned, the embodiment of the invention acquires the running time information of each pipeline stage with which the pipeline processes data, and adjusts the running clock frequency of more than one pipeline stage according to the acquired running time information so as to align the pipeline stages. The pipeline is thereby scheduled automatically and in real time, its power consumption is reduced, and its transmission bandwidth and operating efficiency are improved.
In an exemplary embodiment, the adjusting unit of the embodiment of the present invention is configured to:
determining the running time information of each pipeline stage according to the data quantity transmitted by an external host;
determining a pipeline stage for performing operation processing on the data according to the determined running time information of each pipeline stage;
traversing and determining more than one group of pipeline running clock frequencies in the operation rate range of the pipeline stage;
Selecting one of more than one group of pipeline running clock frequencies determined by traversal according to preset performance, power consumption and area (PPA) information as the pipeline running clock frequency for realizing pipeline stage alignment;
Wherein each set of pipeline running clock frequencies includes: determined running clock frequencies of more than one pipeline stage under which pipeline stage alignment for the operation processing of the data can be achieved.
In an exemplary embodiment, the adjusting unit of the embodiment of the present invention is configured to determine a pipeline stage for performing an operation processing on data, and includes:
Determining total duration information of each operation processing of the data according to the running time information of each pipeline stage;
determining the number of pipeline stages required by each operation processing on the data according to the determined total duration information;
Wherein the operation processing includes: preprocessing and/or encryption and decryption processing.
In one illustrative example, the adjusting unit of the embodiment of the present invention is configured to traverse and determine more than one set of pipeline running clock frequencies, including:
according to the number of the pipeline stages required by each operation processing of the data, more than one group of pipeline stage combinations capable of realizing pipeline stage alignment of the operation processing of the data are determined;
and determining the running clock frequency of each pipeline stage within the operation rate range of the pipeline stage for each group of pipeline stage combination capable of realizing pipeline stage alignment for carrying out operation processing on data.
Application example
FIG. 6 is a schematic diagram of the encryption and decryption system of an application example of the present invention. As shown in FIG. 6, the system includes: an SPI interface, a preprocessing module, three encryption and decryption modules and a device for implementing pipeline scheduling. The preprocessing module comprises, connected in series: a first DMA, a module running the SM3 algorithm and a second DMA, where the first DMA transfers data from the SPI interface to the module running the SM3 algorithm, and the second DMA transfers the data processed by the encryption and decryption modules to the SPI interface. The three encryption and decryption modules are three parallel pipeline branches running the SM2 algorithm, denoted SM2_00, SM2_10 and SM2_20. The three pipeline branches work independently, each processing one of three portions of data to be encrypted and decrypted after preprocessing by the preprocessing module; that is, the throughput of the serially connected preprocessing module can keep up with the three independent encryption and decryption modules. If the device for implementing pipeline scheduling in the system shown in FIG. 6 does not work, then when the external host performs long-data signature verification, one frame of data issued by the host takes 150 microseconds to pass through the preprocessing module when the system clock and the module running the SM3 algorithm both run at their usual rates, and the encryption and decryption processing of one frame of data by a single encryption and decryption module takes about 240 microseconds.
If the system is not adaptively adjusted, SM2_00, SM2_10 and SM2_20 operate in parallel. If the operation processing performed by the preprocessing module is implemented by one pipeline stage S1, and the operation processing performed by each encryption and decryption module is implemented by three pipeline stages S2, S3 and S4, the pipeline schedule of the application example system of the present invention is as shown in fig. 7. To show the operating time of each component of fig. 6 clearly, the operating time of the pipeline stage of the preprocessing module is labeled SPI+DMA+SM3; the working durations of pipeline stages S2, S3 and S4 of the SM2_00 pipeline branch are denoted SM2_01, SM2_02 and SM2_03, respectively; those of the SM2_10 pipeline branch are denoted SM2_11, SM2_12 and SM2_13; and those of the SM2_20 pipeline branch are denoted SM2_21, SM2_22 and SM2_23. The total time of pipeline stage S1 is 150 microseconds, and the running time of each of pipeline stages S2, S3 and S4 in every encryption and decryption pipeline branch comprises 80 microseconds of working time and 70 microseconds of idle waiting. As shown in fig. 7, the three encryption and decryption modules run relatively fast and therefore incur idle waiting time, so the data throughput of the system is not optimal; moreover, the pipeline stages in all three encryption and decryption modules are in the working state, so the power consumption of the encryption and decryption chip is relatively high and the resource utilization is low.
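The timing figures quoted for the unadjusted schedule can be checked with a short calculation. The sketch below only restates the numbers given above (150 microseconds for S1; 80 microseconds of work and 70 microseconds of idle waiting per encryption and decryption stage); the constant names and the `utilization` helper are introduced here for illustration.

```python
# Figures from the unadjusted schedule described above.
S1_PERIOD_US = 150
SM2_STAGE_WORK_US = 80
SM2_STAGE_IDLE_US = 70
SM2_STAGES_PER_BRANCH = 3


def utilization(work_us, idle_us):
    """Fraction of a stage's time slot spent doing useful work."""
    return work_us / (work_us + idle_us)


# Each encryption stage occupies a slot as long as the preprocessing stage.
sm2_stage_slot_us = SM2_STAGE_WORK_US + SM2_STAGE_IDLE_US
# Useful work done by one branch on one frame, summed over its stages.
sm2_branch_work_us = SM2_STAGES_PER_BRANCH * SM2_STAGE_WORK_US
sm2_stage_utilization = utilization(SM2_STAGE_WORK_US, SM2_STAGE_IDLE_US)

print(sm2_stage_slot_us, sm2_branch_work_us, round(sm2_stage_utilization, 3))
```

The three working intervals of one branch add up to the roughly 240 microseconds that a single encryption and decryption module needs per frame, while each stage is busy for only a little over half of its slot.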
When the device for realizing pipeline scheduling in the system shown in fig. 6 is started, the running clock frequency of each pipeline stage is adjusted to realize pipeline stage alignment. The running clock frequencies adjusted by the device for realizing pipeline scheduling include: the running clock frequency of each encryption and decryption pipeline stage (sm2_clk), the running clock of the DMA (sys_clk), and the clock on which SM3 runs (sm3_clk). In one illustrative example, the device for realizing pipeline scheduling also controls the number of pipeline stages used for encryption and decryption processing by switching integrated clock gating (ICG) cells on and off. The application example can configure each pipeline stage at a suitable running clock frequency obtained by frequency division of the clock source. The ICG can be taken directly from the standard cell library of the fabrication process; when an ICG is configured as not enabled, it leaves the pipeline stage without a clock, so the pipeline stage is in the off state and power consumption is saved.
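The combination of clock division and clock gating described above can be modeled in software as follows. This is a behavioral sketch only, not register-level code for any real chip: the `StageClock` class, the `configure_branch` helper, the 600 MHz source and the divider value are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class StageClock:
    """Behavioral model of one pipeline stage's divided, gated clock."""
    source_mhz: float
    divider: int = 1
    icg_enabled: bool = True

    @property
    def frequency_mhz(self) -> float:
        # A gated-off stage receives no clock at all.
        return self.source_mhz / self.divider if self.icg_enabled else 0.0


def configure_branch(clocks, active_stages, divider):
    """Enable the first `active_stages` stages and gate off the rest."""
    for index, clock in enumerate(clocks):
        clock.divider = divider
        clock.icg_enabled = index < active_stages


# Hypothetical branch with three stages fed from an assumed 600 MHz source.
branch = [StageClock(source_mhz=600.0) for _ in range(3)]
configure_branch(branch, active_stages=2, divider=6)
print([clock.frequency_mhz for clock in branch])
```

Keeping the divider and the gate as separate fields mirrors the text: the divider sets the running frequency of an active stage, and the gate decides whether the stage is clocked at all.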
The application example realizes pipeline stage alignment by starting the device for realizing pipeline scheduling. By increasing the running clock frequency of the SM3 operation and the running clock frequency of the DMA, the running time of pipeline stage S1 is shortened to 120 microseconds; each encryption and decryption module enables two pipeline stages, each of which takes 120 microseconds, so that complete encryption and decryption of the data passes through 3 pipeline stages. Fig. 8 is a schematic diagram of another mode of system operation of the application example of the present invention. As shown in fig. 8, after the device for realizing pipeline scheduling is started, all pipeline stages take the same time, namely 120 microseconds. The number of parallel encryption and decryption modules changes from three to two (SM2_00 and SM2_10), and the number of pipeline stages in each encryption and decryption module changes from three to two; the working durations of the two pipeline stages in SM2_00 are denoted SM2_01 and SM2_02, and those of the two pipeline stages in SM2_10 are denoted SM2_11 and SM2_12. The dynamic power consumption of the encryption and decryption chip is thereby reduced. In addition, when the SM2 algorithm runs relatively slowly, the application example adjusts the SM2 algorithm to run at a higher clock frequency and reduces the running clock frequency of the operation processing module, so that the pipeline stages contained in the pipeline are in an aligned state, the working efficiency of the system is improved, and hardware resources are reasonably matched during high-speed data transmission in the encryption and decryption chip system.
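The stage counts in the aligned schedule follow from the figures given above, as the sketch below shows. The 240 microsecond total per encryption and decryption module and the 120 microsecond stage period come from the text; the `stages_needed` helper and the variable names are assumptions made for illustration.

```python
import math


def stages_needed(total_work_us, stage_period_us):
    """Smallest number of equal-length stages that covers the total work."""
    return math.ceil(total_work_us / stage_period_us)


SM2_TOTAL_WORK_US = 240   # encryption/decryption time per frame, one module
ALIGNED_PERIOD_US = 120   # stage time after alignment

stages_per_module = stages_needed(SM2_TOTAL_WORK_US, ALIGNED_PERIOD_US)
# Idle time left in a module once its stages are sized to the period.
idle_per_module_us = stages_per_module * ALIGNED_PERIOD_US - SM2_TOTAL_WORK_US

# Clocked encryption stages: three modules of three stages before alignment,
# two modules of `stages_per_module` stages after alignment.
clocked_stages_before = 3 * 3
clocked_stages_after = 2 * stages_per_module

print(stages_per_module, idle_per_module_us,
      clocked_stages_before, clocked_stages_after)
```

Fewer clocked stages with no idle slots is consistent with the reduction in dynamic power consumption that the application example reports.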
"One of ordinary skill in the art will appreciate that all or some of the steps, systems, functional modules/units in the apparatus, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all of the components may be implemented as software executed by a processor, such as a digital signal processor or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as known to those skilled in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. "
Claims (8)
1. A method of implementing pipelined scheduling, comprising:
when the pipeline stages are not aligned, acquiring the running time information of each pipeline stage of pipeline operation data;
adjusting the running clock frequency of more than one pipeline stage according to the acquired running time information of each pipeline stage so as to realize pipeline stage alignment;
wherein the adjusting the running clock frequency of more than one pipeline stage according to the acquired running time information of each pipeline stage comprises: determining the running time information of each pipeline stage according to the data quantity transmitted by an external host; determining a pipeline stage for performing operation processing on the data according to the determined running time information of each pipeline stage; traversing and determining more than one group of pipeline running clock frequencies within the operation rate range of the pipeline stage; selecting, according to preset performance, power consumption and area (PPA) information, one of the more than one group of pipeline running clock frequencies determined through traversal as the pipeline running clock frequency for realizing pipeline stage alignment; wherein each group of the pipeline running clock frequencies comprises: the determined running clock frequencies of more than one pipeline stage with which pipeline stage alignment for the operation processing of the data can be realized.
2. The method of claim 1, wherein the determining a pipeline stage for arithmetic processing of the data comprises:
determining total duration information of each operation processing item of the data according to the running time information of each pipeline stage;
determining the number of pipeline stages required by each operation processing on the data according to the determined total duration information;
wherein the arithmetic processing includes: preprocessing and/or encryption and decryption processing.
3. The method of claim 2, wherein the traversing determines more than one set of pipeline operating clock frequencies, comprising:
determining, according to the number of pipeline stages required by each operation processing of the data, more than one group of pipeline stage combinations capable of realizing pipeline stage alignment for the operation processing of the data;
and for each group of pipeline stage combinations capable of realizing pipeline stage alignment for the operation processing of the data, determining the running clock frequency of each pipeline stage within the operation rate range of the pipeline stage.
4. A computer storage medium having a computer program stored therein, which when executed by a processor, implements the method of implementing pipelined scheduling of any one of claims 1-3.
5. A terminal, comprising: a memory and a processor, the memory storing a computer program; wherein,
the processor is configured to execute the computer program in the memory;
the method of implementing pipelined scheduling as claimed in any one of claims 1 to 3 is implemented when the computer program is executed by the processor.
6. An apparatus that implements pipelined scheduling, comprising: an acquisition unit and an adjustment unit; wherein,
the acquisition unit is configured to: when the pipeline stages are not aligned, acquire the running time information of each pipeline stage of pipeline operation data;
the adjustment unit is configured to: determine the running time information of each pipeline stage according to the data quantity transmitted by an external host; determine a pipeline stage for performing operation processing on the data according to the determined running time information of each pipeline stage; traverse and determine more than one group of pipeline running clock frequencies within the operation rate range of the pipeline stage; and select, according to preset performance, power consumption and area (PPA) information, one of the more than one group of pipeline running clock frequencies determined through traversal as the pipeline running clock frequency for realizing pipeline stage alignment, so as to realize pipeline stage alignment; wherein each group of the pipeline running clock frequencies comprises: the determined running clock frequencies of more than one pipeline stage with which pipeline stage alignment for the operation processing of the data can be realized.
7. The apparatus of claim 6, wherein the adjustment unit is configured to determine a pipeline stage for performing an arithmetic process on the data, comprising:
determining total duration information of each operation processing item of the data according to the running time information of each pipeline stage;
determining the number of pipeline stages required by each operation processing on the data according to the determined total duration information;
wherein the arithmetic processing includes: preprocessing and/or encryption and decryption processing.
8. The apparatus of claim 7, wherein the adjustment unit is configured to traverse a determination of more than one set of pipeline operating clock frequencies, comprising:
determining, according to the number of pipeline stages required by each operation processing of the data, more than one group of pipeline stage combinations capable of realizing pipeline stage alignment for the operation processing of the data;
and for each group of pipeline stage combinations capable of realizing pipeline stage alignment for the operation processing of the data, determining the running clock frequency of each pipeline stage within the operation rate range of the pipeline stage.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110331636.4A CN113076277B (en) | 2021-03-26 | 2021-03-26 | Method, device, computer storage medium and terminal for realizing pipeline scheduling |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113076277A (en) | 2021-07-06 |
CN113076277B (en) | 2024-05-03 |
Family
ID=76611328
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110331636.4A Active CN113076277B (en) | 2021-03-26 | 2021-03-26 | Method, device, computer storage medium and terminal for realizing pipeline scheduling |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113076277B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114978473B (en) * | 2022-05-07 | 2024-03-01 | 海光信息技术股份有限公司 | SM3 algorithm processing method, processor, chip and electronic equipment |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH09223010A (en) * | 1996-02-16 | 1997-08-26 | Toshiba Corp | Microprocessor and its processing method |
JP2004062281A (en) * | 2002-07-25 | 2004-02-26 | Nec Micro Systems Ltd | Pipeline processor and pipeline operation control method |
JP2007094669A (en) * | 2005-09-28 | 2007-04-12 | Yokogawa Electric Corp | Pipeline processor |
CN101002169A (en) * | 2004-05-19 | 2007-07-18 | Arc国际(英国)公司 | Microprocessor architecture |
CN201374690Y (en) * | 2009-02-12 | 2009-12-30 | 苏州通创微芯有限公司 | Pipeline analog-to-digital converter |
CN101861585A (en) * | 2007-10-06 | 2010-10-13 | 阿克西斯半导体有限公司 | Method and apparatus for real time signal processing |
CN102692563A (en) * | 2012-05-18 | 2012-09-26 | 大唐微电子技术有限公司 | Clock frequency detector |
CN105739948A (en) * | 2014-12-12 | 2016-07-06 | 超威半导体(上海)有限公司 | Self-adaptive adjustable assembly line and method for adaptively adjusting assembly line |
CN109582367A (en) * | 2017-09-28 | 2019-04-05 | 刘欣 | A kind of processor structure with assembly line time division multiplexing dispatching device |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TWI230855B (en) * | 2002-01-05 | 2005-04-11 | Via Tech Inc | Transmission line circuit structure saving power consumption and operating method thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||