CN103500107B - Hardware optimization method for CPU - Google Patents

Publication number
CN103500107B
Authority
CN
China
Prior art keywords
burst
cpu
areas
data
hardware
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201310450768.4A
Other languages
Chinese (zh)
Other versions
CN103500107A (en)
Inventor
朱钟琦
曾田
阮航
王炜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan Lingjiu Microelectronics Co ltd
709th Research Institute of CSSC
Original Assignee
709th Research Institute of CSIC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 709th Research Institute of CSIC
Priority to CN201310450768.4A
Publication of CN103500107A
Application granted
Publication of CN103500107B
Active
Anticipated expiration

Landscapes

  • Memory System Of A Hierarchy Structure (AREA)

Abstract

A hardware optimization method for a CPU comprises the following steps: (1) designing the system hardware, the system comprising a master controller, a storage controller, an external bus interface, and other modules; (2) setting the burst regions, where burst-capable and non-burst regions in memory can be set simply through a burst address editor; (3) using burst mode, which can be enabled when the data length is greater than 32 bits. The advantages of the method are as follows: it is fully compatible with the x86 instruction set, and the burst read-write regions are fully configurable, allowing segment-by-segment optimization of data reads and writes; in terms of hardware design, no peripheral circuitry is added and the number of CPU logic gates increases very little, so system cost is not affected; instruction execution time is greatly shortened and the number of CPU bus accesses is reduced; and the method can be extended to CPU designs for other CISC or RISC instruction sets, giving it a wide range of application.

Description

A hardware optimization method for a CPU
Technical field
The present invention relates to the field of processor hardware design, and in particular to the hardware optimization of data transfer in a CPU (central processing unit).
Background
The data transfer unit has always been an important component of a CPU, and optimizing it is one of the main focuses of performance optimization in processor design. In domestic processor designs, the data transfer unit is optimized mainly by improving the execution efficiency of the cache, resolving read-write dependencies, and adding components such as a DMAC (direct memory access controller); optimization of the execution flow of the data transfer instructions themselves is rarely addressed. Taking the data transfer instructions of the 32-bit x86 instruction set as an example, the execution flow follows the traditional element-by-element scheme: data is passed from the source address to the destination address one byte, word, or doubleword at a time, in order. Such instructions occupy the bus heavily, stalling the CPU pipeline and increasing the bus bandwidth load.
Taking 32-bit x86 instructions as an example, Fig. 1 and Fig. 2 are the flow charts of REP MOVS (string move instruction), a typical memory-to-memory data transfer instruction, and POPA (pop instruction), a typical memory-to-register transfer instruction, respectively. For REP MOVS, assuming ECX is 100, the instruction must repeat steps 1 to 5 100 times to complete. Such instructions therefore execute inefficiently and impose a high bus bandwidth load.
Through a detailed analysis of the data transfer instructions of the x86 instruction set, this work proposes a hardware optimization method for the continuous data transfer instructions among them. It reduces the execution cycles of data transfer instructions and the number of CPU bus accesses (write accesses in particular), effectively increasing the instruction execution speed of the CPU.
Summary of the invention
The invention provides a hardware optimization method for data transfer instructions, intended to improve their execution efficiency and thereby the performance of the processor. Its address parameters are simple to set: they can be set just once at system initialization (for example, in BIOS setup), or changed while the system is running according to customer needs. The performance gain is significant: optimized data transfer instructions generally show an efficiency improvement of more than 100%.
A hardware optimization method for a CPU according to the invention comprises:
(1) Design the system hardware. The system consists of modules such as a master controller, a storage controller, and an external bus interface. The master controller is the main part of the system: a CPU that can perform storage operations in different ways for different memory access addresses. Specifically, the burst-capable address regions of this CPU can be set by a BIOS program or by setting internal register values; memory accesses to a burst-capable region are executed by the optimized instruction scheme, and accesses to a non-burst region are executed by the original instruction scheme. The storage controller receives data and control information from the master controller and performs read and write operations on memory; it supports both burst and non-burst reads and writes. The external bus interface handles data communication between the inside and outside of the CPU. The information after hardware optimization includes the source address, destination address, data length, control information, and data of the transfer.
(2) Set the burst regions. Using the burst address editor, the burst-capable regions and non-burst regions in memory can be set simply. Both are configurable regions determined by the user's memory allocation. They need not be contiguous: the system's memory addresses can be divided into multiple burst-capable regions and multiple non-burst regions. Whether burst mode is used can be configured in hardware or in software: externally, a separate switch turns burst mode on or off; internally, a specific register value determines whether burst mode is enabled. Besides using the burst editor, the user can also set the burst-capable regions through a specific program. The regions are usually set once by the BIOS at system initialization, and can be reset while a program is running as needed. When the source and destination addresses of an executing data transfer instruction lie within a burst region, the optimized instruction scheme is enabled.
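The region check described in step (2) can be sketched in software as follows. This is a minimal illustrative model, not the patented hardware: the class and method names are hypothetical, region bounds are assumed to be inclusive, and the sketch assumes that the whole transferred range (not just its first address) must fall inside a single burst-capable region, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BurstRegion:
    lower: int  # lowest address in the region (assumed inclusive)
    upper: int  # highest address in the region (assumed inclusive)


class BurstRegionTable:
    """Hypothetical model of a set of non-contiguous burst-capable regions."""

    def __init__(self, enabled=True):
        self.enabled = enabled  # models the global burst on/off switch
        self.regions = []

    def add(self, lower, upper):
        if lower > upper:
            raise ValueError("lower bound must not exceed upper bound")
        self.regions.append(BurstRegion(lower, upper))

    def contains(self, addr, length=1):
        # Assumption: the range [addr, addr + length) must fit in one region.
        last = addr + length - 1
        return any(r.lower <= addr and last <= r.upper for r in self.regions)

    def use_optimized(self, src, dst, length):
        # Optimized scheme only if burst mode is on and both ends qualify.
        return (self.enabled
                and self.contains(src, length)
                and self.contains(dst, length))
```

With two regions configured, a transfer whose source and destination both fall inside burst-capable regions selects the optimized scheme, while any other transfer falls back to the original scheme.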
(3) Use burst mode. Burst mode can be enabled when the data length is greater than 32 bits. Burst mode does not depend on whether the cache is enabled, and supports memory reads and writes of at most one cache line of data. This one-cache-line limit is not fixed: the cache line size is chosen as the maximum in this work because it is convenient for exchanging data with the cache module when the cache is enabled, and there is no hard requirement on the maximum number of bytes a burst can actually take. When data is transferred between the inside and outside of the CPU, a memory read can bring one cache line of data at once into the burst register of the data transfer unit, after which operations such as register assignment are performed; a memory write first stages the data to be written in the burst register of the data transfer unit and then writes it to memory in one go. Enabling the cache helps burst reads and writes considerably and can effectively improve CPU execution efficiency.
Burst reads and writes reduce the number of CPU bus accesses and shorten the traditional instruction execution flow. For specific instructions (such as REP MOVSB), the optimized instruction flow can be reduced to 1/16 of the traditional flow (depending on the number of bytes in a cache line).
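As a rough check of the 1/16 figure, the sketch below counts bus accesses under two assumptions that are not stated in the text: the traditional flow issues one read and one write per element, and a burst moves one 16-byte cache line. The function names are illustrative only.

```python
from fractions import Fraction


def traditional_accesses(count):
    # Assumption: one bus read plus one bus write per transferred element.
    return 2 * count


def burst_accesses(count, elem_size, line_bytes=16):
    # Elements that fit in one burst, e.g. 16 for bytes with a 16-byte line.
    per_burst = line_bytes // elem_size
    bursts = -(-count // per_burst)  # ceiling division
    # Each burst is one read burst plus one write burst.
    return 2 * bursts


def access_ratio(count, elem_size, line_bytes=16):
    return Fraction(burst_accesses(count, elem_size, line_bytes),
                    traditional_accesses(count))
```

For a byte transfer whose count is a multiple of 16, the ratio is exactly 1/16; a count such as 100 needs a final partial burst, so the ratio is slightly higher.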
The optimization method is not limited to the x86 instruction set; it also applies to other CISC or RISC instruction sets. Taking RISC as an example, the widely used ARM instruction set has an instruction, LDMIA, that transfers multiple register values, up to the values of 16 general-purpose registers. ARM performs the load/store operations one by one, incrementing the memory address by the word size after each. With the burst read-write and assignment scheme of this patent, the values can be read all at once and then assigned to the 16 general-purpose registers all at once, an obvious performance improvement.
The advantages of the hardware optimization method for a CPU according to the invention are as follows. It is fully compatible with the x86 instruction set, so operating systems and application software need no changes;
The burst read-write regions of the invention are fully configurable: users can divide the burst read-write regions according to their own needs and optimize data reads and writes segment by segment;
In terms of hardware design, no peripheral circuitry is added and the number of CPU logic gates increases very little, so system cost is not affected;
Further, through in-depth analysis of the data transfer instructions, this work optimizes the execution flow of some x86 data transfer instructions, greatly shortening their execution time and reducing the number of CPU bus accesses. For specific instructions (such as REP MOVSB), the number of bus accesses can be reduced to 1/16 (depending on the number of bytes in a cache line). Because bus accesses become more concentrated, the bandwidth load on the bus also drops substantially;
The method can be extended to CPU designs for other CISC or RISC instruction sets, so its range of application is wide.
Brief description of the drawings:
Fig. 1 is the traditional execution flow chart of REP MOVS, a typical memory-to-memory data transfer instruction.
Fig. 2 is the traditional execution flow chart of POPA, a typical memory-to-register transfer instruction.
Fig. 3 compares the general execution flow of a data transfer instruction before and after optimization.
Fig. 4 is the execution flow chart of REP MOVS after optimization.
Fig. 5 is the execution flow chart of POPA after optimization.
Detailed description of the embodiments:
As shown in Fig. 1 to Fig. 5, the steps of the hardware optimization method for a CPU are as follows:
1. Configure the burst regions through the burst address editor
Assign values to two of the CPU's general-purpose registers so that they hold the lower and upper addresses of the burst region. Then set a reserved bit of EFLAGS to 1, which assigns the two addresses to the CPU's burst configuration register group. Then clear that reserved bit of EFLAGS to 0, ready for the next burst region configuration. Repeat this step to configure multiple burst regions.
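The configuration sequence can be modeled in software as below. This is only a sketch: the text does not say which two general-purpose registers or which reserved EFLAGS bit are used, so EAX, EBX, and bit 15 are hypothetical choices, and latching the addresses on the 0-to-1 transition of the bit is likewise an assumption.

```python
BURST_CFG_BIT = 1 << 15  # hypothetical choice of reserved EFLAGS bit


class CpuModel:
    """Toy model of the burst configuration register group."""

    def __init__(self):
        self.regs = {"EAX": 0, "EBX": 0}  # hypothetical register pair
        self.eflags = 0
        self.burst_cfg = []  # latched (lower, upper) address pairs

    def write_eflags(self, value):
        was_set = bool(self.eflags & BURST_CFG_BIT)
        now_set = bool(value & BURST_CFG_BIT)
        self.eflags = value
        # Assumption: the pair is latched when the bit goes from 0 to 1.
        if now_set and not was_set:
            self.burst_cfg.append((self.regs["EAX"], self.regs["EBX"]))


def configure_burst_region(cpu, lower, upper):
    cpu.regs["EAX"] = lower                         # lower address
    cpu.regs["EBX"] = upper                         # upper address
    cpu.write_eflags(cpu.eflags | BURST_CFG_BIT)    # set bit: latch the pair
    cpu.write_eflags(cpu.eflags & ~BURST_CFG_BIT)   # clear bit: ready again
```

Calling `configure_burst_region` once per region mirrors the "repeat this step" instruction in the text; each call leaves the reserved bit cleared so the next call can latch a new pair.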
2. Accelerate memory-to-memory transfer instructions through hardware optimization
Taking the REP MOVS instruction as an example, the hardware first determines whether the addresses DS:ESI and ES:EDI both lie in a burst region. If they do, the optimized logic is executed; otherwise the original logic is executed. The optimized logic checks whether ECX is greater than burst_num (depending on the operand size M, burst_num can be 4, 8, 16, and so on). If ECX is greater than burst_num, N bytes of data (N being the number of bytes in a cache line) are fetched from DS:ESI in one read burst and staged in the internal burst register, then written to ES:EDI in one write burst; N is added to or subtracted from ESI and EDI (depending on DF), and burst_num is subtracted from ECX. If ECX is less than burst_num, ECX*M bytes of data are fetched from DS:ESI in one read burst and staged in the internal burst register, then written to ES:EDI in one write burst; ECX*M is added to or subtracted from ESI and EDI, and ECX is set to 0.
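The optimized REP MOVS logic can be simulated as follows. This is a behavioral sketch under stated assumptions, not the hardware design: memory is a flat `bytearray` with no segmentation, the burst-region check is assumed to have passed, burst_num is taken to be N / M (consistent with the values 4, 8, and 16 for a 16-byte line), and the case where ECX equals burst_num, which the text leaves open, is treated as a full burst. The function name and return tuple are illustrative.

```python
def rep_movs_burst(mem, esi, edi, ecx, m, n=16, df=0):
    """Copy ecx elements of m bytes each, one burst per loop iteration.

    Returns (esi, edi, ecx, bursts), where bursts counts read/write pairs.
    """
    burst_num = n // m  # assumption: elements per burst = line size / M
    bursts = 0
    while ecx > 0:
        # Full burst if enough elements remain, otherwise a final short one.
        count = burst_num if ecx >= burst_num else ecx
        nbytes = count * m
        if df == 0:
            src_lo, dst_lo = esi, edi
        else:
            # DF=1: ESI/EDI point at the highest element of the run.
            src_lo = esi - (count - 1) * m
            dst_lo = edi - (count - 1) * m
        burst_reg = bytes(mem[src_lo:src_lo + nbytes])  # one read burst
        mem[dst_lo:dst_lo + nbytes] = burst_reg         # one write burst
        step = nbytes if df == 0 else -nbytes
        esi += step
        edi += step
        ecx -= count
        bursts += 1
    return esi, edi, ecx, bursts
```

With ECX = 100 and byte operands, the model finishes in 7 read/write burst pairs (six full bursts of 16 bytes and one of 4) instead of 100 element-by-element iterations.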
3. Accelerate register-memory transfer instructions through hardware optimization
Taking the POPA/PUSHA instructions as an example, the hardware first determines whether SS:ESP lies in a burst region. If it does, the optimized logic is executed; otherwise the original logic is executed. In the optimized logic, for POPA, N bytes of stack data are read and staged in the internal burst register, then assigned in turn to the CPU's general-purpose registers such as DI, SI, and BP. For PUSHA, the values of the CPU's general-purpose registers such as DI, SI, and BP are combined and assigned to the internal burst register, then written out in one go by a write burst.
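The POPA/PUSHA path can be sketched in the same style. The model below uses the 32-bit register names and the standard x86 frame layout (EDI at the lowest address, the saved ESP slot ignored on pop); the flat `bytearray` memory, the 16-byte burst size, and the function names are assumptions made for illustration, and the burst-region check on SS:ESP is assumed to have passed.

```python
import struct

# Order in which a 32-bit POPA consumes the frame, lowest address first.
POPA_ORDER = ("EDI", "ESI", "EBP", "ESP_SKIP", "EBX", "EDX", "ECX", "EAX")
FRAME_BYTES = 32  # eight 32-bit slots


def popa_burst(mem, regs, n=16):
    """Read the frame in bursts of n bytes, then assign the registers."""
    esp = regs["ESP"]
    burst_reg = bytearray()
    bursts = 0
    for off in range(0, FRAME_BYTES, n):
        size = min(n, FRAME_BYTES - off)
        burst_reg += mem[esp + off:esp + off + size]  # one read burst
        bursts += 1
    values = struct.unpack("<8I", bytes(burst_reg))
    for name, value in zip(POPA_ORDER, values):
        if name != "ESP_SKIP":  # POPA discards the saved ESP slot
            regs[name] = value
    regs["ESP"] = esp + FRAME_BYTES
    return bursts


def pusha_burst(mem, regs, n=16):
    """Combine the registers into the burst register, then write in bursts."""
    esp = regs["ESP"]
    values = [esp if name == "ESP_SKIP" else regs[name] for name in POPA_ORDER]
    burst_reg = struct.pack("<8I", *values)
    new_esp = esp - FRAME_BYTES
    bursts = 0
    for off in range(0, FRAME_BYTES, n):
        size = min(n, FRAME_BYTES - off)
        mem[new_esp + off:new_esp + off + size] = burst_reg[off:off + size]
        bursts += 1  # one write burst
    regs["ESP"] = new_esp
    return bursts
```

With a 16-byte burst the 32-byte frame needs two bursts in each direction, compared with eight separate stack accesses in the traditional flow.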
Beyond the embodiments above, the method of the invention can be applied to other types of data transfer to produce many further embodiments, which are not enumerated here.

Claims (5)

1. A hardware optimization method for a CPU, characterized in that it comprises the following steps:
(1) designing the system hardware: the system comprises a master controller, a storage controller, and an external bus interface module;
(2) setting the burst regions: through a burst address editor, the burst-capable regions and non-burst regions in memory are set simply; the burst-capable regions and non-burst regions are configurable regions determined by the user's memory allocation; they need not be contiguous, and the system's memory addresses are divided into multiple burst-capable regions and multiple non-burst regions; whether burst mode is used can be configured in hardware or in software: externally, a separate switch turns burst mode on or off; internally, a specific register value determines whether burst mode is enabled; besides using the burst editor, the user can also set the burst-capable regions by setting internal register values; the regions are usually set once by the BIOS at system initialization, or reset while a program is running as needed; when the source address and destination address of an executing data transfer instruction lie within a burst region, the optimized instruction scheme is enabled;
(3) using burst mode: burst mode can be enabled when the data length is greater than 32 bits; burst mode does not depend on whether the cache is enabled, and supports memory reads and writes of at most one cache line of data.
2. The hardware optimization method for a CPU according to claim 1, characterized in that: the system comprises a master controller, a storage controller, and an external bus interface module; the master controller is the main part of the system and is a CPU that can perform storage operations in different ways for different memory access addresses; specifically, the burst-capable address regions of this CPU can be set by a BIOS program or by setting internal register values; memory accesses to a burst-capable region are executed by the optimized instruction scheme, and accesses to a non-burst region are executed by the original instruction scheme; the storage controller receives data and control information from the master controller and performs read and write operations on memory, supporting both burst and non-burst reads and writes; the external bus interface handles data communication between the inside and outside of the CPU;
the information after hardware optimization includes: the source address information, destination address information, data length information, control information, and data information of the data transfer.
3. The hardware optimization method for a CPU according to claim 1, characterized in that: using burst mode means that burst mode can be enabled when the data length is greater than 32 bits; burst mode does not depend on whether the cache is enabled, and supports memory reads and writes of at most one cache line of data; when data is transferred between the inside and outside of the CPU, a memory read brings one cache line of data at once into the burst register of the data transfer unit, after which register assignment is performed; a memory write first stages the data to be written in the burst register of the data transfer unit and then writes it to memory in one go.
4. A hardware optimization method for a CPU, characterized in that it comprises the following steps:
(1) designing the system hardware: the system comprises a master controller, a storage controller, and an external bus interface module;
(2) setting the burst regions: through a burst address editor, the burst-capable regions and non-burst regions in memory are set simply; the burst-capable regions and non-burst regions are configurable regions determined by the user's memory allocation; they need not be contiguous, and the system's memory addresses are divided into multiple burst-capable regions and multiple non-burst regions; whether burst mode is used can be configured in hardware or in software: externally, a separate switch turns burst mode on or off; internally, a specific register value determines whether burst mode is enabled; besides using the burst editor, the user can also set the burst-capable regions by setting internal register values; the regions are usually set once by the BIOS at system initialization, or reset while a program is running as needed; when the source address and destination address of an executing data transfer instruction lie within a burst region, the optimized instruction scheme is enabled;
(3) using burst mode: burst mode can be enabled when the data length is greater than 32 bits; burst mode does not depend on whether the cache is enabled, and there is no hard requirement on the maximum number of bytes a burst can actually take.
5. The hardware optimization method for a CPU according to claim 4, characterized in that: the system comprises a master controller, a storage controller, and an external bus interface module; the master controller is the main part of the system and is a CPU that can perform storage operations in different ways for different memory access addresses; specifically, the burst-capable address regions of this CPU can be set by a BIOS program or by setting internal register values; memory accesses to a burst-capable region are executed by the optimized instruction scheme, and accesses to a non-burst region are executed by the original instruction scheme; the storage controller receives data and control information from the master controller and performs read and write operations on memory, supporting both burst and non-burst reads and writes; the external bus interface handles data communication between the inside and outside of the CPU;
the information after hardware optimization includes: the source address information, destination address information, data length information, control information, and data information of the data transfer.
CN201310450768.4A 2013-09-29 2013-09-29 Hardware optimization method for CPU Active CN103500107B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310450768.4A CN103500107B (en) 2013-09-29 2013-09-29 Hardware optimization method for CPU

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310450768.4A CN103500107B (en) 2013-09-29 2013-09-29 Hardware optimization method for CPU

Publications (2)

Publication Number Publication Date
CN103500107A CN103500107A (en) 2014-01-08
CN103500107B true CN103500107B (en) 2017-05-17

Family

ID=49865322

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310450768.4A Active CN103500107B (en) 2013-09-29 2013-09-29 Hardware optimization method for CPU

Country Status (1)

Country Link
CN (1) CN103500107B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109324982B (en) * 2017-07-31 2023-06-27 上海华为技术有限公司 Data processing method and data processing device
CN109426528A (en) * 2017-09-05 2019-03-05 东软集团股份有限公司 Realize the method, apparatus and storage medium, program product of software version selection

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101114259A (en) * 2006-07-27 2008-01-30 杭州晟元芯片技术有限公司 Program code memory bank in processor piece based on FLASH structure and method for realizing execution in code piece
CN101425044A (en) * 2008-11-06 2009-05-06 西安交通大学 Write-through cache oriented SDRAM read-write method
CN103207843A (en) * 2013-04-15 2013-07-17 山东大学 Data line width dynamically-configurable cache structure design method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7945840B2 (en) * 2007-02-12 2011-05-17 Micron Technology, Inc. Memory array error correction apparatus, systems, and methods

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
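Design and Verification of the Instruction Cache of the JX Microprocessor; Zhang Hanlin; China Master's Theses Full-text Database, Information Science and Technology; 2006-03-15; pp. 38-39 *
Performance Measurement Model of SDRAM Controllers for Logic Design; Pan Guangrong et al.; Application Research of Computers; 2009-09-30; Vol. 26, No. 9; pp. 3432-3435 *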

Also Published As

Publication number Publication date
CN103500107A (en) 2014-01-08

Similar Documents

Publication Publication Date Title
JP5927263B2 (en) Method and memory for communication between host computer system and memory
US8365111B2 (en) Data driven logic simulation
CN1952917A (en) Memory controller and data processing system with the same
CN104238957B (en) SPI controller, SPI flash memory and its access method and access control method
US20160179388A1 (en) Method and apparatus for providing programmable nvm interface using sequencers
CN104965676B (en) A kind of access method of random access memory, device and control chip
CN106776458B (en) Communication device and communication method between DSPs (digital Signal processors) based on FPGA (field programmable Gate array) and HPI (high Performance Integrated interface)
CN106557442A (en) A kind of chip system
CN103500107B (en) Hardware optimization method for CPU
CN102789424B (en) External extended DDR2 (Double Data Rate 2) read-write method on basis of FPGA (Field Programmable Gate Array) and external extended DDR2 particle storage on basis of FPGA
CN106980587A (en) A kind of universal input output timing processor and sequential input and output control method
CN207008602U (en) A kind of storage array control device based on Nand Flash memorizer multichannel
CN105788636A (en) EMMC controller based on parallel multichannel structure
CN107544937A (en) A kind of coprocessor, method for writing data and processor
CN104683265B (en) High-capacity accurate packet counting method for 100G interface
JP2018507489A (en) TIGERSHARC DSP boot management chip and method
CN206975631U (en) A kind of universal input output timing processor
CN105893036A (en) Compatible accelerator extension method for embedded system
CN106571156B (en) A kind of interface circuit and method of high-speed read-write RAM
CN106547716B (en) A kind of expansion bus configuration system and method towards low pin number
CN106909523A (en) large-scale data transmission method and system
CN106445879A (en) SoC architecture with high cost performance
CN106708755A (en) PCIE interface realization method and apparatus
CN104298616A (en) Method of initializing data block, cache memory and terminal
CN207067979U (en) A kind of high speed SWD protocol conversion interface circuits

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CP03 Change of name, title or address
CP03 Change of name, title or address

Address after: 430000, No.1, Canglong North Road, Fenghuang Industrial Park, Donghu New Technology Development Zone, Wuhan City, Hubei Province

Patentee after: No. 709 Research Institute of China Shipbuilding Corp.

Address before: 430074 No. 718, Luoyu Road, Hongshan District, Wuhan City, Hubei Province

Patentee before: NO.709 RESEARCH INSTITUTE OF CHINA SHIPBUILDING INDUSTRY Corp.

TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20220805

Address after: 430000 No. 1 Baihe Road, Guandong Industrial Park, Donghu New Technology Development Zone, Wuhan City, Hubei Province

Patentee after: Wuhan lingjiu Microelectronics Co.,Ltd.

Address before: 430000, No.1, Canglong North Road, Fenghuang Industrial Park, Donghu New Technology Development Zone, Wuhan City, Hubei Province

Patentee before: No. 709 Research Institute of China Shipbuilding Corp.