CN113076277B - Method, device, computer storage medium and terminal for realizing pipeline scheduling - Google Patents

Publication number
CN113076277B
CN113076277B CN202110331636.4A CN202110331636A CN113076277B CN 113076277 B CN113076277 B CN 113076277B CN 202110331636 A CN202110331636 A CN 202110331636A CN 113076277 B CN113076277 B CN 113076277B
Authority
CN
China
Prior art keywords
pipeline
pipeline stage
data
stage
running
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110331636.4A
Other languages
Chinese (zh)
Other versions
CN113076277A (en
Inventor
赵红敏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Datang Microelectronics Technology Co Ltd
Original Assignee
Datang Microelectronics Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Datang Microelectronics Technology Co Ltd
Priority to CN202110331636.4A
Publication of CN113076277A
Application granted
Publication of CN113076277B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F13/00 Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
    • G06F13/38 Information transfer, e.g. on bus
    • G06F13/42 Bus transfer protocol, e.g. handshake; Synchronisation
    • G06F13/4282 Bus transfer protocol, e.g. handshake; Synchronisation on a serial bus, e.g. I2C bus, SPI bus
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F15/78 Architectures of general purpose stored program computers comprising a single central processing unit
    • G06F15/7867 Architectures of general purpose stored program computers comprising a single central processing unit with reconfigurable architecture
    • G06F15/7871 Reconfiguration support, e.g. configuration loading, configuration switching, or hardware OS
    • G06F15/7878 Reconfiguration support, e.g. configuration loading, configuration switching, or hardware OS for pipeline reconfiguration
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Advance Control (AREA)
  • Power Sources (AREA)

Abstract

The invention discloses a method, a device, a computer storage medium and a terminal for realizing pipeline scheduling, wherein when pipeline stages are not aligned, the method, the device and the terminal acquire the running time information of each pipeline stage of pipeline operation data; and adjusting the running clock frequency of more than one pipeline stage according to the acquired running time information of each pipeline stage so as to realize pipeline stage alignment. The automatic scheduling of the pipeline in real time is realized, the power consumption of the pipeline is reduced, and the transmission bandwidth and the operation efficiency of the pipeline are improved.

Description

Method, device, computer storage medium and terminal for realizing pipeline scheduling
Technical Field
The present disclosure relates to, but is not limited to, chip processing technologies, and in particular, to a method, an apparatus, a computer storage medium, and a terminal for implementing pipeline scheduling.
Background
In the communication between the host and the encryption and decryption chip, the data from the communication interface to the encryption and decryption processing is scheduled in a pipeline mode on the premise of the same hardware resource, so that the system bandwidth can be greatly improved, and the high-speed transmission processing of the data is realized.
For every pipeline stage in a pipeline to run at maximum efficiency, the running times of the pipeline stages must be the same or approximately the same. If the running times differ, idle waiting has to be inserted so that every pipeline stage occupies the same time slot, which lengthens the processing time of the chip and wastes resources.
Take an encryption and decryption chip that uses a Serial Peripheral Interface (SPI) as its communication interface as an example. FIG. 1 is a schematic diagram of a system for implementing data processing of the encryption and decryption chip in the related art. As shown in FIG. 1, the system comprises the SPI, a four-channel Direct Memory Access (DMA) module, and an encryption and decryption module that performs the encryption and decryption processing. The four-channel DMA includes: a first EDMAC, a second EDMAC, a first Static Random Access Memory (SRAM) sending module, a second SRAM sending module, a first SRAM receiving module and a second SRAM receiving module.
The first EDMAC is connected to the SPI and receives the data to be operated on. The first EDMAC transmits the data to the first SRAM sending module through sending channel one; the data is then sent to the second EDMAC through sending channel three and forwarded by the second EDMAC to the encryption and decryption module. Similarly, the first EDMAC transmits the data to the second SRAM sending module through sending channel two; the data is then sent to the second EDMAC through sending channel four and forwarded by the second EDMAC to the encryption and decryption module. The DMA preprocesses the data to be operated on during transmission.
After the encryption and decryption module has encrypted or decrypted the received preprocessed data, the second EDMAC transmits the result to the first SRAM receiving module through receiving channel three, and the result is sent to the first EDMAC through receiving channel one, which completes the return of the encrypted and decrypted data. Similarly, the second EDMAC transmits the result to the second SRAM receiving module through receiving channel four, and the result is sent to the first EDMAC through receiving channel two, which likewise completes the return.
Based on the system of FIG. 1, an external host transmits the data to be operated on to the encryption and decryption chip through the SPI, and the chip preprocesses the data and then performs encryption and decryption processing. In the related art, the data is generally preprocessed with algorithms including the Chinese national hash algorithm SM3, and is encrypted and decrypted with algorithms including the asymmetric algorithm SM2. The size of the data to be operated on generally meets the data length requirement of the encryption and decryption processing; for example, digest signature verification, message signature verification and long-data signature verification each correspond to a different length on the order of one hundred to several hundred bytes.
The data to be operated on is preprocessed, transferred by DMA, and then encrypted and decrypted by SM2. To process large volumes of data at high speed, several pipeline stages running SM2 generally have to work simultaneously. If three pipeline stages running SM2 are provided, they are denoted SM2_0, SM2_1 and SM2_2, and each performs part of the encryption and decryption processing. After SM2 has processed the preprocessed data, the result is returned to the external host through the DMA and the SPI, which completes one round of data processing by the chip. Apart from the pipeline stages SM2_0, SM2_1 and SM2_2, the related art generally implements the SM3 preprocessing with one pipeline stage.
When a pipeline runs, a pipeline in which every pipeline stage has the same execution time is called an aligned pipeline, and an aligned pipeline achieves the maximum speedup. A pipeline whose stages take unequal time is called a non-aligned pipeline. To keep a non-aligned pipeline running correctly, the related art generally inserts idle waiting into the pipeline stages that take less time, which sacrifices part of the operating efficiency of the pipeline.
Take a pipeline with four pipeline stages as an example. FIG. 2 is a schematic diagram of the operation of an aligned pipeline. In FIG. 2, S1, S2, S3 and S4 denote the four pipeline stages, and each row of S1, S2, S3 and S4, ordered in time, represents one complete data encryption and decryption pass of the chip. The S1 blocks arranged vertically in pipeline space are continuous in time and represent SM3 preprocessing for successive encryption and decryption passes. The S2 blocks arranged vertically in pipeline space are likewise continuous in time and indicate that the preprocessed data of the row in which each S2 lies is encrypted and decrypted by SM2_0, SM2_1 and SM2_2. The four pipeline stages in the figure have the same execution time, and the running time of each pipeline stage is ΔT.
FIG. 3 is a schematic diagram of the operation of a non-aligned pipeline. As shown in FIG. 3, the running time of S1 is ΔT, which is greater than the running times of S2, S3 and S4. To keep the pipeline running correctly, idle waiting has to be added to each of S2 to S4 so that their running time is extended to ΔT; the pipeline stages are thereby aligned and normal operation of the pipeline is ensured.
When the running times of the pipeline stages differ, the operation rates of the stages do not match. Aligning the stages by idle waiting reduces the transmission bandwidth available for scheduling the encryption and decryption chip, and the system power consumption also rises because the encryption and decryption module runs faster than that transmission bandwidth requires. If the encryption and decryption module is set to a slower operation rate, or the preprocessing module is not configured at its optimal operation rate, the data transmission bandwidth of the encryption and decryption chip is reduced.
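The cost of aligning a non-aligned pipeline by idle waiting can be illustrated with a minimal Python sketch. The function names and the sample running times are illustrative assumptions and do not come from the related art; the sketch only shows that every stage is stretched to the running time of the slowest stage.

```python
def idle_padding(stage_times_us):
    """Return the idle wait each stage needs so all stages match the slowest one."""
    slot = max(stage_times_us)  # the pipeline advances at the pace of its slowest stage
    return [slot - t for t in stage_times_us]


def is_aligned(stage_times_us, tolerance_us=0.0):
    """A pipeline is aligned when all stage running times are (approximately) equal."""
    return max(stage_times_us) - min(stage_times_us) <= tolerance_us


# Hypothetical running times in microseconds for stages S1..S4.
times = [100, 60, 60, 60]
print(idle_padding(times))  # idle wait per stage
print(is_aligned(times))    # False for this non-aligned example
```

The idle waits returned here are exactly the time that the faster stages spend clocked but doing no useful work.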
Disclosure of Invention
The following is a summary of the subject matter described in detail herein. This summary is not intended to limit the scope of the claims.
The embodiment of the invention provides a method, a device, a computer storage medium and a terminal for realizing pipeline scheduling, which can reduce the power consumption of a pipeline and improve the transmission bandwidth and the operation efficiency of the pipeline.
The embodiment of the invention provides a method for realizing pipeline scheduling, which comprises the following steps:
When the pipeline stages are not aligned, acquiring the running time information of each pipeline stage of pipeline operation data;
and adjusting the running clock frequency of more than one pipeline stage according to the acquired running time information of each pipeline stage so as to realize pipeline stage alignment.
On the other hand, the embodiment of the invention also provides a computer storage medium, wherein a computer program is stored in the computer storage medium, and the method for realizing pipeline scheduling is realized when the computer program is executed by a processor.
In still another aspect, an embodiment of the present invention further provides a terminal, including: a memory and a processor, the memory storing a computer program; wherein,
The processor is configured to execute the computer program in the memory;
the computer program, when executed by the processor, implements a method of implementing pipeline scheduling as described above.
In still another aspect, an embodiment of the present invention further provides an apparatus for implementing pipeline scheduling, including: an acquisition unit and an adjustment unit; wherein,
The acquisition unit is configured to: when the pipeline stages are not aligned, acquiring the running time information of each pipeline stage of pipeline operation data;
the adjusting unit is configured to: and adjusting the running clock frequency of more than one pipeline stage according to the acquired running time information of each pipeline stage so as to realize pipeline stage alignment.
When the pipeline stages are not aligned, the running time information of each pipeline stage of pipeline operation data is acquired; and adjusting the running clock frequency of more than one pipeline stage according to the acquired running time information of each pipeline stage so as to realize pipeline stage alignment. The automatic scheduling of the pipeline in real time is realized, the power consumption of the pipeline is reduced, and the transmission bandwidth and the operation efficiency of the pipeline are improved.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate and do not limit the application.
FIG. 1 is a schematic diagram of a related art system for implementing encryption and decryption chip data processing;
FIG. 2 is a schematic diagram of the operation of an alignment pipeline;
FIG. 3 is a schematic diagram of the operation of a non-aligned pipeline;
FIG. 4 is a flow chart of a method of implementing pipeline scheduling according to an embodiment of the present invention;
FIG. 5 is a block diagram of an apparatus for implementing pipeline scheduling according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of the encryption and decryption system applied to the present invention;
FIG. 7 is a schematic diagram illustrating the operation of a system in accordance with an exemplary embodiment of the present invention;
FIG. 8 is a schematic diagram of another system operation for an example of the application of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present application more apparent, embodiments of the present application will be described in detail hereinafter with reference to the accompanying drawings. It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be arbitrarily combined with each other.
The steps illustrated in the flowchart of the figures may be performed in a computer system, such as a set of computer-executable instructions. Also, while a logical order is depicted in the flowchart, in some cases, the steps depicted or described may be performed in a different order than presented herein.
The inventor found that technicians set the running clock frequency of each pipeline stage according to the operation performance of that stage, and thereby control its running time. Because services differ, the code stream length of the data that the external host sends to the encryption and decryption chip varies, yet the running clock frequency of the pipeline stages is generally not adjusted once the pipeline is running. When the pipeline stages of the pipeline are not aligned, alignment is achieved by inserting idle waiting, which increases system power consumption and reduces data transmission bandwidth. If alignment is not achieved by idle waiting, a technician has to write the re-determined running clock frequencies of the pipeline stages into the system manually, which gives poor real-time performance and impairs the operating efficiency of the pipeline.
FIG. 4 is a flowchart of a method for implementing pipeline scheduling according to an embodiment of the present invention, as shown in FIG. 4, including:
Step 401, when the pipeline stages are not aligned, acquiring the running time information of each pipeline stage of pipeline operation data;
step 402, adjusting the operation clock frequency of more than one pipeline stage according to the obtained operation time information of each pipeline stage so as to realize pipeline stage alignment.
When the pipeline stages are not aligned, the running time information of each pipeline stage of pipeline operation data is acquired; and adjusting the running clock frequency of more than one pipeline stage according to the acquired running time information of each pipeline stage so as to realize pipeline stage alignment. The automatic scheduling of the pipeline in real time is realized, the power consumption of the pipeline is reduced, and the transmission bandwidth and the operation efficiency of the pipeline are improved.
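Steps 401 and 402 can be sketched in Python as follows. This is a minimal illustration and not the patented implementation: it assumes that the running time of a stage is inversely proportional to its clock frequency, and it picks the running time of the slowest stage as the alignment target. The function names, units (microseconds, MHz) and the choice of target are all assumptions.

```python
def align_by_frequency(times_us, freqs_mhz, target_us):
    """Scale each stage clock so that its running time becomes target_us.

    Assumption: running time is inversely proportional to clock frequency,
    i.e. the cycle count of a stage does not change with the frequency.
    """
    return [f * t / target_us for t, f in zip(times_us, freqs_mhz)]


def schedule(times_us, freqs_mhz, tolerance_us=0.0):
    # Step 401: act only when the pipeline stages are not aligned.
    if max(times_us) - min(times_us) <= tolerance_us:
        return list(freqs_mhz)
    # Step 402: adjust the clocks. Here faster stages are slowed down to the
    # slowest stage's time instead of idling; other targets are possible.
    return align_by_frequency(times_us, freqs_mhz, max(times_us))


print(schedule([150, 75], [100, 100]))  # second stage clock is halved
```

Slowing the faster stage replaces its idle wait with a lower clock, which is the mechanism the text credits for the power saving.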
In one illustrative example, an embodiment of the present invention determines whether the pipeline stages are aligned according to whether idle waiting is set while the pipeline is running.
In an exemplary embodiment, the method adjusts the running clock frequency of more than one pipeline stage according to the acquired running time information of each pipeline stage, and comprises the following steps:
determining the running time information of each pipeline stage according to the data quantity transmitted by an external host;
determining a pipeline stage for performing operation processing on the data according to the determined running time information of each pipeline stage;
traversing and determining more than one group of pipeline running clock frequencies in the operation rate range of the pipeline stage;
Selecting one of more than one group of pipeline running clock frequencies determined by traversal according to preset performance, power consumption and area (PPA) information as the pipeline running clock frequency for realizing pipeline stage alignment;
Wherein each group of pipeline running clock frequencies includes: the determined running clock frequencies of more than one pipeline stage with which pipeline stage alignment can be achieved for the operation processing of the data.
The data amount transmitted by the external host is the data amount of the pipeline operation processing data.
The PPA of the embodiment of the invention refers to the performance, power consumption and area of the encryption and decryption chip, as commonly understood by those skilled in the art. In one illustrative example, an embodiment of the present invention may select a group of pipeline running clock frequencies as the pipeline running clock frequency for achieving pipeline stage alignment based on only part of the PPA information. A group of pipeline running clock frequencies refers to the running clock frequencies of the pipeline stages that allow the pipeline stages contained in the pipeline to be aligned.
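The traversal and the PPA-based selection can be sketched as follows. The sketch is an illustration under stated assumptions: each stage has a fixed, known cycle count; the allowed frequencies form a small discrete set; and the PPA criterion is supplied by the caller as a cost function. None of these names or values come from the patent.

```python
from itertools import product


def candidate_sets(cycles, freq_options_mhz, tolerance_us=1e-9):
    """Traverse all frequency combinations and keep those that align the stages.

    cycles[i] is the assumed clock-cycle count of stage i; the running time in
    microseconds is cycles[i] divided by the frequency in MHz.
    """
    aligned = []
    for freqs in product(freq_options_mhz, repeat=len(cycles)):
        times = [c / f for c, f in zip(cycles, freqs)]
        if max(times) - min(times) <= tolerance_us:
            aligned.append(freqs)
    return aligned


def pick_by_ppa(candidates, cost):
    """Select one candidate group using a caller-supplied PPA cost (lower is better)."""
    return min(candidates, key=cost)


groups = candidate_sets([12000, 6000], [50, 100, 200])
print(groups)                    # every group that aligns the two stages
print(pick_by_ppa(groups, sum))  # lowest total frequency as a crude power proxy
```

Using the sum of frequencies as the cost favours low power; a cost based on the resulting stage time would favour performance instead, which mirrors the statement that only part of the PPA information may be used.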
In one illustrative example, an embodiment of the present invention determines a pipeline stage for arithmetic processing of data, comprising:
Determining total duration information of each operation processing of the data according to the running time information of each pipeline stage;
determining the number of pipeline stages required by each operation processing on the data according to the determined total duration information;
Wherein the operation processing includes: preprocessing and/or encryption and decryption processing.
It should be noted that, for one operation processing of the data, an embodiment of the present invention may call a pipeline stage to determine the duration required when that pipeline stage runs at its highest operation rate. Once the total duration of the operation processing has been determined, the minimum number of pipeline stages required for it can be determined from the quotient of the total duration and the duration required at the highest operation rate. In an exemplary embodiment, the encryption and decryption processing of the embodiment of the present invention includes any one of the following: encryption processing, decryption processing, and encryption and decryption processing.
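The quotient rule can be written as a one-line calculation. The sketch below reads "the duration required at the highest operation rate" as the time one pipeline stage occupies at its highest rate, and rounds the quotient up so that a fractional result still yields a whole number of stages; both the reading and the rounding are assumptions, as are the names and sample values.

```python
import math


def min_stage_count(total_duration_us, duration_at_max_rate_us):
    """Minimum number of pipeline stages for one operation processing.

    Assumption: the quotient is rounded up, since a stage cannot be split.
    """
    if duration_at_max_rate_us <= 0:
        raise ValueError("duration at the highest rate must be positive")
    return math.ceil(total_duration_us / duration_at_max_rate_us)


print(min_stage_count(240, 80))   # 3
print(min_stage_count(240, 120))  # 2
```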
In an exemplary embodiment, encryption and decryption chips based on pipeline processing generally differ only in their service scenarios, so for operation processing of data on the same pipeline, the number of pipeline stages required for the same operation processing generally changes by only one in either direction. For example, if a technician has determined that three pipeline stages are used for encryption and decryption processing when the pipeline stages are aligned, then the number of pipeline stages determined for encryption and decryption processing when misalignment is detected lies between 2 and 4.
Assume that the pipeline performs a first operation processing and a second operation processing, where the first operation processing uses one pipeline stage and the second operation processing uses three pipeline stages. Let the running time of the first operation processing be X and the running time of each pipeline stage of the second operation processing be Y. When 2X is greater than 3Y and the operation rate of the pipeline stage of the first operation processing is adjustable, the embodiment of the invention can determine that the first operation processing uses one pipeline stage and the second operation processing uses two pipeline stages; the operation rates are adjusted by adjusting the running clock frequencies, and the pipeline stages are aligned when 2X equals 2Y, that is, when X equals Y. For ease of understanding, assume that when the pipeline stages are not aligned the initial value of X is 150 microseconds and the initial value of Y is 80 microseconds. Within the operation rate range of the pipeline stages, the running time of the first operation processing is adjusted to 120 microseconds and the total running time of the second operation processing is adjusted to 240 microseconds, with the second operation processing run on two pipeline stages of 120 microseconds each.
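The numbers in this example can be checked with a short Python sketch. The helper names are illustrative, and the clock multiplier assumes that running time is inversely proportional to clock frequency, which the text does not state explicitly.

```python
def per_stage_times(first_op_us, second_op_total_us, second_op_stages):
    """Running time of each stage: one stage for the first operation,
    and equal shares of the second operation across its stages."""
    per_stage = second_op_total_us / second_op_stages
    return [first_op_us] + [per_stage] * second_op_stages


def clock_scale(old_time_us, new_time_us):
    """Clock multiplier that changes a stage's running time
    (assumes time is inversely proportional to frequency)."""
    return old_time_us / new_time_us


before = per_stage_times(150, 3 * 80, 3)  # X = 150, Y = 80, three stages
after = per_stage_times(120, 240, 2)      # X = 120, two stages of 120 each
print(before)                  # stage times differ, so idle waits are needed
print(after)                   # all stage times equal, so the pipeline is aligned
print(clock_scale(150, 120))   # multiplier for the first operation's clock
```

Before the adjustment 2X = 300 exceeds 3Y = 240; afterwards every stage takes 120 microseconds and no idle wait remains.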
In one illustrative example, an embodiment of the present invention traverses determining more than one set of pipeline running clock frequencies, comprising:
according to the number of the pipeline stages required by each operation processing of the data, more than one group of pipeline stage combinations capable of realizing pipeline stage alignment of the operation processing of the data are determined;
and determining the running clock frequency of each pipeline stage within the operation rate range of the pipeline stage for each group of pipeline stage combination capable of realizing pipeline stage alignment for carrying out operation processing on data.
In one illustrative example, the operation clock frequencies of the pipeline stages for performing the same operation process are the same, i.e., the operation rates (required operation times) of the pipeline stages for performing the same operation process are the same.
In addition, after determining the number of pipeline stages required for performing the operation processing under the condition that the pipeline stage alignment is set, the embodiment of the invention can determine the operation clock frequency of each pipeline stage by gradually adjusting the operation rate of the pipeline stage for performing each operation processing.
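Gradual adjustment of the operation rate can be sketched as a stepwise search. The following Python sketch is one possible reading, not the patented procedure: it assumes a fixed cycle count per stage, a fixed frequency step, and lower and upper frequency bounds, and it stops as soon as another step would no longer bring the running time closer to the target. All names and values are hypothetical.

```python
def step_to_target(cycles, start_mhz, target_us, step_mhz, min_mhz, max_mhz):
    """Step the clock up or down until the running time is as close to
    target_us as the step size and the frequency bounds allow."""
    freq = start_mhz
    while True:
        time_us = cycles / freq
        if time_us > target_us and freq + step_mhz <= max_mhz:
            nxt = freq + step_mhz  # stage too slow: raise the clock
        elif time_us < target_us and freq - step_mhz >= min_mhz:
            nxt = freq - step_mhz  # stage too fast: lower the clock
        else:
            return freq            # on target or at a bound
        # Stop if the next step would not reduce the error (overshoot).
        if abs(cycles / nxt - target_us) >= abs(time_us - target_us):
            return freq
        freq = nxt


# A stage of 12000 cycles reaches 120 us at 100 MHz.
print(step_to_target(12000, 150, 120, 10, 50, 200))
```

Because the error shrinks strictly on every accepted step, the loop always terminates.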
In an exemplary embodiment, the embodiment of the present invention may set a minimum operation rate of the pipeline stage according to power consumption, transmission bandwidth, and the like, and set a maximum operation rate of the pipeline stage according to operation performance of the pipeline stage. And determining the operation rate range of the pipeline stage according to the determined minimum operation rate and the determined maximum operation rate.
In an exemplary embodiment, after determining the number of pipeline stages required for each operation processing of the data, the embodiment of the present invention further includes: adding a pipeline stage by adjusting the running clock frequency of that pipeline stage from zero to non-zero, or removing a pipeline stage by adjusting the running clock frequency of that pipeline stage from non-zero to zero.
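Treating a pipeline stage whose running clock frequency is zero as switched off, the change in stage count can be sketched as follows. The function names, the choice to enable the first stages in the list, and the frequency values are illustrative assumptions.

```python
def active_stages(freqs_mhz):
    """Count the stages that currently receive a non-zero clock."""
    return sum(1 for f in freqs_mhz if f > 0)


def set_stage_count(freqs_mhz, wanted, on_freq_mhz):
    """Return new clock settings with exactly `wanted` stages running.

    The first `wanted` stages are kept on (or switched on at on_freq_mhz);
    the remaining stages get a zero clock and are therefore off.
    """
    if not 0 <= wanted <= len(freqs_mhz):
        raise ValueError("wanted stage count is out of range")
    result = list(freqs_mhz)
    for i in range(len(result)):
        if i < wanted:
            if result[i] == 0:
                result[i] = on_freq_mhz  # zero -> non-zero adds a stage
        else:
            result[i] = 0                # non-zero -> zero removes a stage
    return result


print(set_stage_count([100, 100, 100], 2, 100))  # third stage switched off
```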
The embodiment of the invention also provides a computer storage medium, wherein a computer program is stored in the computer storage medium, and the method for realizing pipeline scheduling is realized when the computer program is executed by a processor.
The embodiment of the invention also provides a terminal, which comprises: a memory and a processor, the memory storing a computer program; wherein,
The processor is configured to execute the computer program in the memory;
The computer program, when executed by a processor, implements a method of implementing pipeline scheduling as described above.
FIG. 5 is a block diagram of an apparatus for implementing pipeline scheduling according to an embodiment of the present invention, as shown in FIG. 5, including: an acquisition unit and an adjustment unit; wherein,
The acquisition unit is configured to: when the pipeline stages are not aligned, acquiring the running time information of each pipeline stage of pipeline operation data;
the adjusting unit is configured to: and adjusting the running clock frequency of more than one pipeline stage according to the acquired running time information of each pipeline stage so as to realize pipeline stage alignment.
When the pipeline stages are not aligned, the running time information of each pipeline stage of pipeline operation data is acquired; and adjusting the running clock frequency of more than one pipeline stage according to the acquired running time information of each pipeline stage so as to realize pipeline stage alignment. The automatic scheduling of the pipeline in real time is realized, the power consumption of the pipeline is reduced, and the transmission bandwidth and the operation efficiency of the pipeline are improved.
In an exemplary embodiment, the adjusting unit of the embodiment of the present invention is configured to:
determining the running time information of each pipeline stage according to the data quantity transmitted by an external host;
determining a pipeline stage for performing operation processing on the data according to the determined running time information of each pipeline stage;
traversing and determining more than one group of pipeline running clock frequencies in the operation rate range of the pipeline stage;
Selecting one of more than one group of pipeline running clock frequencies determined by traversal according to preset performance, power consumption and area (PPA) information as the pipeline running clock frequency for realizing pipeline stage alignment;
Wherein each group of pipeline running clock frequencies includes: the determined running clock frequencies of more than one pipeline stage with which pipeline stage alignment can be achieved for the operation processing of the data.
In an exemplary embodiment, the adjusting unit of the embodiment of the present invention is configured to determine a pipeline stage for performing an operation processing on data, and includes:
Determining total duration information of each operation processing of the data according to the running time information of each pipeline stage;
determining the number of pipeline stages required by each operation processing on the data according to the determined total duration information;
Wherein the operation processing includes: preprocessing and/or encryption and decryption processing.
In one illustrative example, the adjusting unit of the embodiment of the present invention is configured to traverse and determine more than one group of pipeline running clock frequencies, including:
according to the number of the pipeline stages required by each operation processing of the data, more than one group of pipeline stage combinations capable of realizing pipeline stage alignment of the operation processing of the data are determined;
and determining the running clock frequency of each pipeline stage within the operation rate range of the pipeline stage for each group of pipeline stage combination capable of realizing pipeline stage alignment for carrying out operation processing on data.
Application example
Fig. 6 is a schematic diagram of the encryption and decryption system applied to the embodiment of the present invention, as shown in fig. 6, the system includes: the system comprises an SPI interface, a preprocessing module, three encryption and decryption modules and a device for realizing pipeline scheduling; the pretreatment module comprises a group of serial connection: the system comprises a first DMA, a module for running an SM3 algorithm and a second DMA, wherein the first DMA is used for transmitting data from an SPI interface to the module for running the SM3 algorithm, and the second DMA is used for transmitting the data processed by the encryption and decryption module to the SPI interface; the three encryption and decryption modules are three parallel pipeline branches running the SM2 algorithm, and the three pipeline branches are respectively expressed as: sm2_00, sm2_20, and sm2_30; the three pipeline branches work independently and respectively process three parts of data to be encrypted and decrypted after being preprocessed by the preprocessing module; namely, the operation efficiency of the serially connected preprocessing modules can meet the working efficiency of three independent encryption and decryption modules. If the device for realizing pipeline scheduling in the system shown in fig. 6 does not work, when the external host performs long data signature verification, a frame of data is issued to pass through the preprocessing module, 150 microseconds is used when the system clock and the module running the SM3 algorithm are both at the common rate, and the encryption and decryption processing of a frame of data is performed by a single encryption and decryption module, which is about 240 microseconds. 
If the system is not adaptively adjusted, SM2_00, SM2_10 and SM2_20 operate in parallel. If the operation processing performed by the preprocessing module is implemented by one pipeline stage S1, and the operation processing performed by each encryption and decryption module is implemented by three pipeline stages S2, S3 and S4, the pipeline schedule of the system of this application example is as shown in Fig. 7. To show the working time of each component of Fig. 6 clearly, the working time of the pipeline stage of the preprocessing module is labeled SPI+DMA+SM3; the working times of pipeline stages S2, S3 and S4 of the SM2_00 pipeline branch are labeled SM2_01, SM2_02 and SM2_03 respectively; those of the SM2_10 pipeline branch are labeled SM2_11, SM2_12 and SM2_13 respectively; and those of the SM2_20 pipeline branch are labeled SM2_21, SM2_22 and SM2_23 respectively. The total time of pipeline stage S1 is 150 microseconds, and the running time of each of pipeline stages S2, S3 and S4 in each encryption and decryption pipeline branch comprises 80 microseconds of working time and 70 microseconds of idle waiting. As shown in Fig. 7, the three encryption and decryption modules run relatively fast and therefore incur idle time, so the data throughput rate of the system is not optimal; moreover, the pipeline stages in all three encryption and decryption modules are in a working state, so the power consumption of the encryption and decryption chip is relatively high and the resource utilization rate is low.
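The idle time in the unaligned schedule follows directly from the stage durations given above. The short sketch below reproduces that arithmetic; the function name `unaligned_profile` is hypothetical, and the model assumes that every stage advances in lock-step with the slowest stage, which is consistent with 80 microseconds of work plus 70 microseconds of waiting inside a 150 microsecond slot.

```python
def unaligned_profile(stage_work_us):
    """Return the slot length and the per-stage idle time and utilization.

    Assumption: the slot length equals the working time of the slowest stage.
    """
    slot_us = max(stage_work_us.values())
    profile = {}
    for name, work_us in stage_work_us.items():
        profile[name] = {
            "idle_us": slot_us - work_us,        # time spent waiting in each slot
            "utilization": work_us / slot_us,    # fraction of the slot doing work
        }
    return slot_us, profile


# Working times taken from the application example (microseconds).
slot_us, profile = unaligned_profile({"S1": 150, "S2": 80, "S3": 80, "S4": 80})
for name, info in profile.items():
    print(name, info["idle_us"], round(info["utilization"], 3))
```

Under this model each encryption and decryption stage is busy for only 80 of every 150 microseconds, a little over half of the slot, which is the low resource utilization the paragraph describes.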
When the device for realizing pipeline scheduling in the system shown in Fig. 6 is started, the running clock frequency of each pipeline stage is adjusted to realize pipeline stage alignment. The running clock frequencies adjusted by the device for realizing pipeline scheduling include: the running clock frequency (sm2_clk) of each pipeline stage of the encryption and decryption processing, the running clock (sys_clk) of the DMA, and the clock (sm3_clk) on which SM3 runs. In one illustrative example, the device for realizing pipeline scheduling also controls the number of pipeline stages used for encryption and decryption processing by switching integrated clock gating (ICG) cells on and off. In this application example, each pipeline stage can be configured to a suitable running clock frequency obtained by dividing the clock source. The ICG cell can be instantiated directly from the standard cell library of the fabrication process; when an ICG is configured as not enabled, the pipeline stage it controls receives no clock and is in an off state, which saves power.
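The combination of clock division and clock gating can be modeled in a few lines. This is only a behavioral sketch: the class `StageClock`, the 600 MHz source, the divider values and the per-stage clock names such as `sm2_clk_stage3` are assumptions made for illustration; only `sys_clk`, `sm3_clk` and `sm2_clk` are named in the text.

```python
from dataclasses import dataclass


@dataclass
class StageClock:
    """One divided, gateable clock feeding a pipeline stage (hypothetical model)."""
    name: str
    source_mhz: float
    divider: int
    icg_enable: bool = True

    def frequency_mhz(self):
        # A disabled ICG blocks the clock, so the stage is effectively off.
        if not self.icg_enable:
            return 0.0
        return self.source_mhz / self.divider


SOURCE_MHZ = 600.0  # hypothetical clock source, not from the patent
clocks = [
    StageClock("sys_clk", SOURCE_MHZ, 4),
    StageClock("sm3_clk", SOURCE_MHZ, 4),
    StageClock("sm2_clk_stage1", SOURCE_MHZ, 6),
    StageClock("sm2_clk_stage2", SOURCE_MHZ, 6),
    StageClock("sm2_clk_stage3", SOURCE_MHZ, 6, icg_enable=False),  # gated off
]
active = [c.name for c in clocks if c.frequency_mhz() > 0]
print(active)
```

Gating the third encryption and decryption stage removes it from the set of clocked stages without changing the divider settings of the others.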
This application example realizes pipeline stage alignment by starting the device for realizing pipeline scheduling. By increasing the running clock frequency of the SM3 module and the running clock frequency of the DMA, the running time of pipeline stage S1 is shortened to 120 microseconds; each encryption and decryption module enables two pipeline stages, each taking 120 microseconds, so that one pass through 3 pipeline stages completes the encryption and decryption of the data. Fig. 8 is a schematic diagram of another operation of the system of this application example. After the device for realizing pipeline scheduling is started, the running times of the pipeline stages are equal, all being 120 microseconds; the number of parallel encryption and decryption modules is reduced to two, SM2_00 and SM2_10, and the number of pipeline stages in each encryption and decryption module is reduced from three to two. The working times of the two pipeline stages in SM2_00 are labeled SM2_01 and SM2_02, and those of the two pipeline stages in SM2_10 are labeled SM2_11 and SM2_12. The dynamic power consumption of the encryption and decryption chip is thereby reduced. In addition, when the SM2 algorithm runs relatively slowly, this application example adjusts the SM2 algorithm to run at a higher operation frequency and reduces the running clock frequency of the operation processing module, so that the pipeline stages contained in the pipeline are in an aligned state; this improves the working efficiency of the system and realizes reasonable matching of hardware resources in high-speed data transmission of the encryption and decryption chip system.
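The aligned configuration can be checked with simple arithmetic on the figures quoted in this example. The sketch below assumes that the running time of stage S1 scales inversely with its clock frequency and that the 240 microsecond encryption and decryption time is unchanged; the function name `align_to_slot` is hypothetical.

```python
def align_to_slot(pre_us, enc_total_us, slot_us):
    """Return the S1 clock scale factor and the stage count per module.

    Assumptions: S1 time is inversely proportional to its clock frequency,
    and the encryption/decryption work divides evenly into slot-sized stages.
    """
    clock_scale = pre_us / slot_us
    stages, remainder = divmod(enc_total_us, slot_us)
    if remainder:
        raise ValueError("encryption time is not a whole number of slots")
    return clock_scale, stages


# 150 us preprocessing, 240 us encryption/decryption, 120 us target slot.
clock_scale, enc_stages = align_to_slot(150, 240, 120)
active_before = 3 * 3           # three modules with three stages each (Fig. 7)
active_after = 2 * enc_stages   # two modules with two stages each (Fig. 8)
print(clock_scale, enc_stages, active_before, active_after)
```

Under these assumptions the SM3 and DMA clocks need to run 1.25 times faster, each module needs two stages, and the number of clocked encryption and decryption stages falls from nine to four, which is in line with the reduced dynamic power consumption described above.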
One of ordinary skill in the art will appreciate that all or some of the steps, systems, functional modules/units in the apparatus, and methods disclosed above may be implemented as software, firmware, hardware, and suitable combinations thereof. In a hardware implementation, the division between the functional modules/units mentioned in the above description does not necessarily correspond to the division of physical components; for example, one physical component may have multiple functions, or one function or step may be performed cooperatively by several physical components. Some or all of the components may be implemented as software executed by a processor, such as a digital signal processor or microprocessor, or as hardware, or as an integrated circuit, such as an application specific integrated circuit. Such software may be distributed on computer readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). The term computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data, as known to those skilled in the art. Computer storage media includes, but is not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, Digital Versatile Disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium which can be used to store the desired information and which can be accessed by a computer. Furthermore, as is well known to those of ordinary skill in the art, communication media typically embodies computer readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media.

Claims (8)

1. A method of implementing pipelined scheduling, comprising:
when the pipeline stages are not aligned, acquiring running time information of each pipeline stage of the pipeline operating on data;
adjusting the running clock frequency of more than one pipeline stage according to the acquired running time information of each pipeline stage so as to realize pipeline stage alignment;
wherein the adjusting the running clock frequency of more than one pipeline stage according to the acquired running time information of each pipeline stage comprises: determining the running time information of each pipeline stage according to the data quantity transmitted by an external host; determining the pipeline stages for performing operation processing on the data according to the determined running time information of each pipeline stage; determining, through traversal, more than one group of pipeline running clock frequencies within the operation rate range of the pipeline stages; selecting, according to preset performance, power consumption and area (PPA) information, one of the more than one group of pipeline running clock frequencies determined through traversal as the pipeline running clock frequency for realizing pipeline stage alignment; wherein each group of the pipeline running clock frequencies comprises: the determined running clock frequencies of more than one pipeline stage with which pipeline stage alignment for the operation processing of the data can be realized.
2. The method of claim 1, wherein the determining the pipeline stages for performing operation processing on the data comprises:
determining total duration information of each item of operation processing on the data according to the running time information of each pipeline stage;
determining the number of pipeline stages required by each item of operation processing on the data according to the determined total duration information;
wherein the operation processing includes: preprocessing and/or encryption and decryption processing.
3. The method of claim 2, wherein the determining, through traversal, more than one group of pipeline running clock frequencies comprises:
determining, according to the number of pipeline stages required by each item of operation processing on the data, more than one group of pipeline stage combinations capable of realizing pipeline stage alignment for the operation processing of the data;
and, for each group of pipeline stage combinations capable of realizing pipeline stage alignment for the operation processing of the data, determining the running clock frequency of each pipeline stage within the operation rate range of that pipeline stage.
4. A computer storage medium having a computer program stored therein, which when executed by a processor, implements the method of implementing pipelined scheduling of any one of claims 1-3.
5. A terminal, comprising: a memory and a processor, the memory storing a computer program; wherein,
the processor is configured to execute the computer program in the memory;
the computer program, when executed by the processor, implements the method of implementing pipelined scheduling of any one of claims 1-3.
6. An apparatus that implements pipelined scheduling, comprising: an acquisition unit and an adjustment unit; wherein,
the acquisition unit is configured to: when the pipeline stages are not aligned, acquire running time information of each pipeline stage of the pipeline operating on data;
the adjustment unit is configured to: determine the running time information of each pipeline stage according to the data quantity transmitted by an external host; determine the pipeline stages for performing operation processing on the data according to the determined running time information of each pipeline stage; determine, through traversal, more than one group of pipeline running clock frequencies within the operation rate range of the pipeline stages; and select, according to preset performance, power consumption and area (PPA) information, one of the more than one group of pipeline running clock frequencies determined through traversal as the pipeline running clock frequency for realizing pipeline stage alignment, so as to realize pipeline stage alignment; wherein each group of the pipeline running clock frequencies comprises: the determined running clock frequencies of more than one pipeline stage with which pipeline stage alignment for the operation processing of the data can be realized.
7. The apparatus of claim 6, wherein the adjustment unit being configured to determine the pipeline stages for performing operation processing on the data comprises:
determining total duration information of each item of operation processing on the data according to the running time information of each pipeline stage;
determining the number of pipeline stages required by each item of operation processing on the data according to the determined total duration information;
wherein the operation processing includes: preprocessing and/or encryption and decryption processing.
8. The apparatus of claim 7, wherein the adjustment unit being configured to determine, through traversal, more than one group of pipeline running clock frequencies comprises:
determining, according to the number of pipeline stages required by each item of operation processing on the data, more than one group of pipeline stage combinations capable of realizing pipeline stage alignment for the operation processing of the data;
and, for each group of pipeline stage combinations capable of realizing pipeline stage alignment for the operation processing of the data, determining the running clock frequency of each pipeline stage within the operation rate range of that pipeline stage.
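The selection step recited in claims 1 and 6, choosing one group of running clock frequencies according to preset PPA information, could look like the sketch below. The claims do not state how the PPA information is applied, so the weighted-sum cost, the field names, the candidate values and the weights are all assumptions made for illustration.

```python
def select_by_ppa(candidates, weights):
    """Pick the candidate with the lowest weighted PPA cost (hypothetical model).

    Each candidate is a dict with illustrative fields:
      stage_us        aligned stage time, standing in for performance
      power_mw        estimated power of the configuration
      enabled_stages  number of clocked stages, standing in for area in use
    """
    def cost(candidate):
        return (
            weights["performance"] * candidate["stage_us"]
            + weights["power"] * candidate["power_mw"]
            + weights["area"] * candidate["enabled_stages"]
        )

    return min(candidates, key=cost)


# Made-up candidate groups, all already aligned at the same stage time.
candidates = [
    {"name": "A", "stage_us": 60.0, "power_mw": 40.0, "enabled_stages": 3},
    {"name": "B", "stage_us": 60.0, "power_mw": 55.0, "enabled_stages": 4},
    {"name": "C", "stage_us": 60.0, "power_mw": 48.0, "enabled_stages": 5},
]
weights = {"performance": 1.0, "power": 1.0, "area": 2.0}
best = select_by_ppa(candidates, weights)
print(best["name"])
```

When several groups achieve the same aligned stage time, as here, the power and area terms decide the outcome; different preset weights would favor a different group.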
CN202110331636.4A 2021-03-26 2021-03-26 Method, device, computer storage medium and terminal for realizing pipeline scheduling Active CN113076277B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110331636.4A CN113076277B (en) 2021-03-26 2021-03-26 Method, device, computer storage medium and terminal for realizing pipeline scheduling

Publications (2)

Publication Number Publication Date
CN113076277A CN113076277A (en) 2021-07-06
CN113076277B true CN113076277B (en) 2024-05-03

Family

ID=76611328

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110331636.4A Active CN113076277B (en) 2021-03-26 2021-03-26 Method, device, computer storage medium and terminal for realizing pipeline scheduling

Country Status (1)

Country Link
CN (1) CN113076277B (en)

Also Published As

Publication number Publication date
CN113076277A (en) 2021-07-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant