[background technology]
In traditional applications, an MMU (memory management unit) is generally used, when the CPU accesses the storage unit, to perform the virtual address mapping and management required by the operating system.
With the development of technology, graphics and images have become increasingly important in consumer portable electronic devices, and the graphics and image processing circuits needed to produce high-quality visual effects have grown more complex as applications have developed. Before an MMU is added to a graphics and image processing circuit, all data and commands are accessed directly by physical address, which causes several problems. First, when the graphics and image processing circuit needs to allocate a block of memory as a frame buffer during processing, that block must be physically contiguous if there is no virtual address mapping, so the system has to find one contiguous physical region for the circuit to use. If the image resolution is high, the required memory is correspondingly large, and when memory is tight the system may have great difficulty finding a contiguous physical region of that size. Second, without an MMU the addresses of the data and commands required by the circuit are all physical addresses, whereas operating system software sees only virtual addresses. Software must therefore map virtual addresses to physical addresses before it can configure the graphics and image processing circuit, which makes the circuit more complicated to use. Adding an MMU to the graphics and image processing circuit has therefore become a development trend.
As graphics and image processing performance improves, the demands that an SoC (system on chip) places on bus width and efficiency increase accordingly. The AXI bus, as a high-performance, high-bandwidth, low-latency on-chip bus, has gradually become a bus interface commonly used by graphics and image processing circuits in SoC chips, so an MMU designed around this advanced bus interface is necessary. AXI (Advanced eXtensible Interface) is a bus protocol and is the most important part of the AMBA (Advanced Microcontroller Bus Architecture) 3.0 specification proposed by ARM. It is an on-chip bus oriented toward high performance, high bandwidth and low latency. Its address/control phase is separate from its data phase; it supports unaligned data transfers; a burst transfer needs only the start address; it has separate read and write data channels; and it supports out-of-order access, in which different ID numbers distinguish different commands and data so that out-of-order transfers can be realized.
Earlier MMU circuits were usually designed around the characteristics of operating systems, whereas the data stream of a graphics and image processing circuit has characteristics of its own. Directly applying the structure of a general-purpose MMU to a graphics and image processing circuit therefore cannot improve mapping efficiency effectively, and designing an MMU for the characteristics of graphics and image processing data streams is worthwhile work.
In addition, the address resolution speed and transfer efficiency of existing MMUs are insufficient, and existing MMU circuits are not flexible or configurable enough.
[summary of the invention]
The technical problem to be solved by the present invention is to provide a configurable MMU circuit dedicated to image processing which, by targeting the characteristics of graphics and image processing data streams, can improve the hit rate of the TLB (translation lookaside buffer, also called the page table buffer) and improve the efficiency and speed of transfers.
The present invention is realized as follows: a configurable MMU circuit dedicated to image processing comprises an original command parsing unit, TLBs equal in number to the operating areas of the graphics and image processing unit, a TLB hit statistics unit, a TLB update control unit, a command arbitration unit, an AXI master interface, a read data channel processing unit and a configuration information register, each TLB corresponding to one operating area. The original command parsing unit receives signals from the graphics and image processing unit and is connected to each TLB and to the command arbitration unit. Each TLB is connected to the command arbitration unit, and the command arbitration unit is connected to the AXI master interface. The AXI master interface is connected to the storage unit through the AXI bus and is also connected to the read data channel processing unit. The read data channel processing unit is connected to the command arbitration unit, to the TLB update control unit and to the graphics and image processing unit. The configuration information register receives and stores configuration information from the CPU and passes it to each unit.
Further, each TLB is connected to the TLB hit statistics unit.
Further, the TLBs adopt a parameterized design, so that the depth of a TLB can be adjusted by changing its depth parameter.
Further, the different operating areas of the graphics and image processing unit use different IDs, and each TLB accesses the bus using the bus ID of its corresponding operating area.
The present invention has the following advantages:
1. The MMU of the present invention is designed for the characteristics of graphics and image processing data streams and assigns a different TLB to each contiguous operating area, which improves the TLB hit rate;
2. The present invention is designed around the AXI bus. On the AXI bus, commands with different IDs have no ordering dependency and can therefore be processed out of order and in parallel. Because different TLBs access the bus with different bus IDs, the parallelism available between different IDs significantly improves the efficiency and speed of transfers;
3. The TLB hit statistics unit of the present invention records the hit rate of each TLB, so the hit behaviour of each TLB can be queried in real time. Because the TLBs adopt a parameterized design, the designer can adjust the depth of a TLB simply by setting its depth parameter. During simulation, the designer can therefore use the per-TLB statistics from the TLB hit statistics unit to learn the access characteristics of each image area and keep adjusting the TLB depths until the best balance between area and efficiency is reached.
[embodiment]
Embodiments of the present invention are described in detail below with reference to Fig. 1 to Fig. 4.
As shown in Fig. 1, when the MMU 1 of the present invention operates, the AXI master interface on the MMU 1 is connected to the storage unit 2 through the AXI bus, and the MMU 1 is also connected to the graphics and image processing unit 3. The CPU 4 is connected to the MMU 1 and is responsible for configuring it. The configuration information comprises the page table, the physical storage address of the page table in the storage unit, and the ID used by each operating area of the graphics and image processing unit. The storage unit 2 stores all data, including the page table information and the graphics and image data to be processed. The graphics and image processing unit 3 processes the graphics and image data and writes the results back to the storage unit; it interacts with the outside through an AXI master interface. The MMU 1 sits between the graphics and image processing unit 3 and the AXI bus and is responsible for mapping and managing virtual addresses to physical addresses when the graphics and image processing unit 3 accesses data.
A TLB stores a number of page table entries (entries of the table that translates virtual addresses to physical addresses) and is also known as a fast table. Because the page table itself is stored in main memory, looking it up is expensive, and the TLB was introduced for this reason. When the MMU maps a virtual address to a physical address it needs page table information, so it first checks whether the required page table information is in the TLB. On a hit, the page table information in the TLB is used directly for the mapping and the page table in memory does not need to be accessed, which greatly reduces the time the MMU spends accessing memory. On a miss, the address mapping must be performed by accessing the page table in memory, which increases the time the MMU spends accessing memory.
As shown in Fig. 2, the MMU 1 of the present invention comprises: an original command parsing unit 11, a plurality of TLBs 12 corresponding to the operating areas of the graphics and image processing unit (three in this embodiment: TLB1, TLB2 and TLB3), a TLB hit statistics unit 13, a TLB update control unit 14, a command arbitration unit 15, an AXI master interface 16, a read data channel processing unit 17 and a configuration information register 18. The original command parsing unit 11 receives signals from the graphics and image processing unit 3 and is connected to each TLB 12 and to the command arbitration unit 15. Each TLB 12 is connected to the command arbitration unit 15 and also to the TLB hit statistics unit 13. The command arbitration unit 15 is connected to the AXI master interface 16. The AXI master interface 16 is connected to the storage unit 2 through the AXI bus and is also connected to the read data channel processing unit 17. The read data channel processing unit 17 is connected to the command arbitration unit 15, to the TLB update control unit 14 and to the graphics and image processing unit 3. The configuration information register 18 receives and stores configuration information from the CPU 4 and passes it to each unit. The TLBs 12 adopt a parameterized design, so that the depth of a TLB 12 can be adjusted by changing its depth parameter. The number of TLBs 12 equals the number of operating areas of the graphics and image processing unit, each TLB 12 corresponds to one operating area, different operating areas use different IDs, and each TLB 12 accesses the bus using the bus ID of its corresponding operating area.
The working principle of the MMU of the present invention is as follows:
1. First, the CPU 4 configures the MMU 1. The configuration information from the CPU 4 is stored in the configuration information register 18, which then passes it to all internal units for their use. After the CPU 4 has finished configuring the MMU 1, the graphics and image processing unit 3 starts working;
2. Once the graphics and image processing unit 3 has started working, it sends read and write commands to the MMU 1. The original command parsing unit 11 uses the distinct ID number of each operating area to determine which operating area a command reads or writes, and then performs a page table lookup in the TLB 12 corresponding to the target operating area that it has identified. At the same time, all command signals are sent directly to the command arbitration unit 15 and held in a buffer, where they wait until the command arbitration unit 15 has completed address mapping and are then sent out together;
3. After completing the lookup, the TLB 12 sends the hit-or-miss result to the command arbitration unit 15 and to the TLB hit statistics unit 13. On a hit, it also sends the page table information obtained to the command arbitration unit 15, which uses it to map the virtual address to a physical address. On a miss, it requests the command arbitration unit 15 to look up the required page table information in the storage unit 2 in order to complete the virtual-to-physical mapping, and the page table contents of the TLB 12 are updated once that information has been obtained;
4. After the TLB 12 has passed the hit-or-miss result to the command arbitration unit 15, the command arbitration unit 15 decides what to do next according to whether the lookup hit. On a hit, it uses the page table information passed from the TLB 12 to perform address mapping and, once mapping is complete, sends the physical address to the AXI master interface 16 together with the other bus command signals held in the buffer. On a miss, the page table information must be looked up in the storage unit 2: the command arbitration unit 15 sends a page table read command to the AXI master interface 16, performs address mapping once the corresponding page table information has returned, and then sends the physical address to the AXI master interface 16 together with the other bus command signals held in the buffer;
5. When the TLB 12 misses, the command arbitration unit 15 sends a page table query read command to look up the page table information in the storage unit 2. When the read data for that command returns, it arrives at the read data channel processing unit 17, which determines from the ID number whether the read data is page table query data or data obtained by a read command issued by the graphics and image processing unit 3. If the read data is page table query data, that is, page table information, it is forwarded to the command arbitration unit 15 and to the TLB update control unit 14;
6. After the page table information obtained from the storage unit 2 reaches the command arbitration unit 15, the command arbitration unit 15 uses it to perform address mapping and, once mapping is complete, sends the physical address to the AXI master interface 16 together with the other bus command signals held in the buffer, thereby completing the mapping of one command;
7. After the page table information obtained from the storage unit 2 reaches the TLB update control unit 14, the TLB update control unit 14 updates the TLB 12 that missed on this lookup, using a round-robin replacement policy;
8. After the MMU 1 receives a command for an operating area, it accepts no further command for that operating area until the address mapping of that command is complete; only then does it begin to accept the next bus command from the graphics and image processing unit for that operating area. For example, after the MMU 1 receives a command for the operating area corresponding to TLB1, it accepts no further command for the operating area corresponding to TLB1 until address mapping has finished. Between different operating areas, however, the MMU always accepts commands in parallel: after receiving a command for the target area corresponding to TLB1, it can still immediately accept commands for the target areas corresponding to TLB2 or TLB3. Each operating area corresponds to one ID number, so the commands of each operating area carry a distinct ID. On the AXI bus, commands with different IDs have no ordering dependency and can therefore be processed out of order and in parallel, and returned read data need only be identified by its ID to determine which operating area's page table information it belongs to. Using a dedicated ID number for the operations of each operating area therefore greatly improves transfer parallelism and efficiency.
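The steps above can be sketched as a small behavioural model. This is an illustrative sketch only, not the circuit itself: the class and method names (`RegionTlb`, `MmuModel`, `handle`), the 4 KiB page size and the dictionary used as a page table are all assumptions made for the example, and the model processes commands one at a time, so it does not capture the parallelism between operating areas.

```python
PAGE_SHIFT = 12  # assumption: 4 KiB pages; the description does not state a page size


class RegionTlb:
    """One TLB per operating area, refilled in round-robin order."""

    def __init__(self, depth):
        self.slots = [None] * depth  # each slot holds (virtual page, physical page) or None
        self.next_victim = 0         # round-robin replacement pointer

    def lookup(self, vpn):
        for slot in self.slots:
            if slot is not None and slot[0] == vpn:
                return slot[1]
        return None

    def refill(self, vpn, ppn):
        self.slots[self.next_victim] = (vpn, ppn)
        self.next_victim = (self.next_victim + 1) % len(self.slots)


class MmuModel:
    """Hypothetical sequential model of the command flow in steps 2 to 7."""

    def __init__(self, page_table, id_to_depth):
        self.page_table = page_table  # stands in for the page table held in the storage unit
        self.tlbs = {bus_id: RegionTlb(depth) for bus_id, depth in id_to_depth.items()}
        self.page_table_reads = 0     # number of misses that needed a page table read

    def handle(self, bus_id, vaddr):
        tlb = self.tlbs[bus_id]       # the command ID selects the operating area's TLB
        vpn = vaddr >> PAGE_SHIFT
        offset = vaddr & ((1 << PAGE_SHIFT) - 1)
        ppn = tlb.lookup(vpn)
        if ppn is None:               # miss: read the page table, then update the TLB
            self.page_table_reads += 1
            ppn = self.page_table[vpn]
            tlb.refill(vpn, ppn)
        return bus_id, (ppn << PAGE_SHIFT) | offset


page_table = {0x10: 0x80, 0x11: 0x93, 0x20: 0x55}
mmu = MmuModel(page_table, {1: 2, 2: 2})
first = mmu.handle(1, 0x10004)   # miss in the TLB for ID 1
second = mmu.handle(1, 0x10008)  # hit: same page, same operating area
third = mmu.handle(2, 0x20000)   # miss in the separate TLB for ID 2
```

In this sketch the second access reuses the entry loaded by the first, so only two page table reads are counted for the three commands.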
The TLB hit statistics unit 13 records the hit rate of each TLB 12, and because the TLBs 12 adopt a parameterized design, the designer can adjust the depth of a TLB simply by setting the depth parameter of that TLB 12. During simulation, the designer can therefore use the per-TLB statistics from the TLB hit statistics unit 13 to learn the access characteristics of each image area and keep adjusting the TLB depths until the best balance between area and efficiency is reached. This makes it very convenient to tune the TLB depths during simulation to suit the access characteristics of each image area.
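As a rough illustration of this tuning loop, the sketch below replays one access trace against several TLB depths and reports the hit rate for each. The function name `hit_rate`, the trace of virtual page numbers and the candidate depths are made up for the example; a real flow would take the trace from simulation of the actual image processing unit.

```python
def hit_rate(trace_vpns, depth):
    """Replay a trace of virtual page numbers against a round-robin TLB of the given depth."""
    if not trace_vpns:
        return 0.0
    slots = [None] * depth
    victim = 0
    hits = 0
    for vpn in trace_vpns:
        if vpn in slots:
            hits += 1
        else:
            slots[victim] = vpn             # miss: replace entries in turn
            victim = (victim + 1) % depth
    return hits / len(trace_vpns)


# Hypothetical trace: an operating area that alternates between two pages at a time.
trace = [0, 1, 0, 1, 2, 3, 2, 3]
rates = {depth: hit_rate(trace, depth) for depth in (1, 2, 4)}
```

For this particular trace a depth of 1 never hits, while depths of 2 and 4 hit equally often, so the smaller depth of 2 would already give the best area for the same efficiency.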
The number of TLBs in the MMU circuit can also be adjusted easily. It is determined mainly by how many operating areas the graphics and image processing unit 3 has. As shown in Fig. 3, a relatively complex graphics and image processing circuit may superimpose three source images onto one destination image area. The three source images and the destination image are usually stored at different start addresses in the storage unit, but the data within each image is contiguous, so this case requires four TLBs.
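The Fig. 3 case can be written down as a small configuration sketch. The `Region` type, the `plan_tlbs` helper, the bus ID values and the depth of 4 are assumptions chosen for illustration; the description only fixes that each operating area has its own ID and its own TLB.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    bus_id: int     # ID used on the bus by this operating area (value is hypothetical)
    tlb_depth: int  # depth parameter of the TLB serving this area (value is hypothetical)


def plan_tlbs(regions):
    """Return one TLB description per operating area, keyed by bus ID."""
    ids = [r.bus_id for r in regions]
    if len(set(ids)) != len(ids):
        raise ValueError("each operating area needs its own bus ID")
    return {r.bus_id: r for r in regions}


# Three source images blended into one destination image, as in Fig. 3.
regions = [
    Region("source0", 0, 4),
    Region("source1", 1, 4),
    Region("source2", 2, 4),
    Region("destination", 3, 4),
]
tlb_plan = plan_tlbs(regions)
```

The plan contains four entries, one per operating area, which matches the four TLBs needed in this case.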
As shown in Fig. 4, a virtual address comprises a virtual page number and an in-page offset. Each TLB entry comprises a valid bit, a tag and a physical page base address. The equals sign in the figure denotes the logic that determines whether the tag of a TLB entry equals the virtual page number of the virtual address.
The process of address mapping from page table information is as follows. After a virtual address is obtained, its virtual page number is compared with every tag in the TLB, and each comparison result is passed to a logical AND unit (1 if equal, 0 if not). Each result is then ANDed with the valid bit of the corresponding TLB entry. If the result is 1, the TLB hits; otherwise the TLB misses.
If the TLB hits, the physical page base address of the matching tag is combined directly with the in-page offset of the virtual address to form the physical address, which completes the mapping. If the TLB misses, the page table information must be read from the external storage unit to obtain the mapping from the virtual page number to the physical page base address, so that the virtual address can be mapped to a physical address.
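The comparison and combination described for Fig. 4 can be sketched in software as follows. This is a minimal sketch under stated assumptions: the names `TlbEntry` and `translate` are invented for the example, the page size of 4 KiB is assumed because the description does not give one, and the loop stands in for comparisons that hardware would perform on all entries at once.

```python
from dataclasses import dataclass
from typing import List, Optional

PAGE_SHIFT = 12  # assumption: 4 KiB pages


@dataclass
class TlbEntry:
    valid: bool = False
    tag: int = 0             # virtual page number held by this entry
    phys_page_base: int = 0  # page-aligned physical base address


def translate(entries: List[TlbEntry], vaddr: int) -> Optional[int]:
    """Return the physical address on a TLB hit, or None on a miss."""
    vpn = vaddr >> PAGE_SHIFT
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    for entry in entries:
        equal = 1 if entry.tag == vpn else 0    # the "equals" logic of Fig. 4
        hit = equal & (1 if entry.valid else 0)  # AND with the entry's valid bit
        if hit:
            return entry.phys_page_base | offset  # base address combined with in-page offset
    return None  # miss: the page table in the storage unit must be read


entries = [
    TlbEntry(True, 0x12345, 0x80000000),
    TlbEntry(False, 0x00042, 0x90000000),  # tag would match, but the entry is invalid
]
hit_addr = translate(entries, 0x12345ABC)
miss_addr = translate(entries, 0x00042010)
```

The second lookup misses even though a tag matches, because the valid bit of that entry is 0.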
The above is merely a preferred embodiment of the present invention and does not limit the scope in which the invention may be practised; equivalent changes and modifications made in accordance with the claims and the description of the present invention shall still fall within the scope covered by the present invention.