CN110245094A - Block-level cache prefetching optimization method and system based on deep learning - Google Patents

Block-level cache prefetching optimization method and system based on deep learning

Info

Publication number
CN110245094A
Authority
CN
China
Prior art keywords
data
memory
prediction
module
block
Prior art date
Legal status
Granted
Application number
CN201910526384.3A
Other languages
Chinese (zh)
Other versions
CN110245094B (en)
Inventor
周可
王桦
石星
何铭健
张
冉忞玮
Current Assignee
Huazhong University of Science and Technology
Original Assignee
Huazhong University of Science and Technology
Priority date
Filing date
Publication date
Application filed by Huazhong University of Science and Technology
Priority to CN201910526384.3A
Publication of CN110245094A
Application granted
Publication of CN110245094B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0862 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches with prefetch
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0866 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches for peripheral storage systems, e.g. disk cache
    • G06F12/0868 Data transfer between cache memory and other subsystems, e.g. storage devices or host systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention discloses a block-level cache prefetching optimization method based on deep learning, comprising: obtaining I/O data in units of bytes from a test data set and converting it into I/O data in units of blocks; judging whether the converted I/O data hits in the cache, and if not, performing sequential prediction on the converted I/O data to obtain multiple storage blocks; storing the converted I/O data in an I/O queue in memory and judging whether the queue is full; and if so, inputting all I/O data in the queue into a trained Seq2Seq model to obtain predicted I/O data, from which the corresponding multiple storage blocks are obtained. The invention mines the correlations among I/Os with deep learning methods, completes I/O sequence prediction with an LSTM-based Seq2Seq model, and finally combines I/O sequence prediction with sequential prediction to complete cache prefetching and improve the cache hit rate.

Description

Block-level cache prefetching optimization method and system based on deep learning
Technical field
The invention belongs to the technical field of computer storage, and more particularly relates to a block-level cache prefetching optimization method and system based on deep learning.
Background art
In the big data era, the performance of storage systems is becoming more and more important. To further improve storage system performance, block-level accesses to local disks need to be prefetched.
Current prefetching methods for block-level access mainly include fixed prefetching algorithms, sequential prefetching algorithms, prefetching algorithms based on application hints, and prefetching algorithms based on data mining. Among them, the fixed prefetching algorithm is relatively simple to implement: it merely prefetches the n data items following the currently accessed data. The sequential prefetching algorithm is an improvement on the fixed prefetching algorithm; it automatically adjusts the timing and amount of prefetching according to the access characteristics of the data. The prefetching algorithm based on application hints provides an interface to upper-layer applications and completes prefetching according to their instructions. The prefetching algorithm based on data mining completes prefetching by mining historical access data; for example, the C-Miner method mines correlations among data blocks by finding frequent subsequences and translates them into association rules to drive prefetching, while the Block2Vec method mines correlations among storage blocks with deep learning and completes prefetching by computing the cosine distance between storage-block vectors.
However, the above prefetching methods for block-level access have defects that cannot be ignored. The fixed and sequential prefetching algorithms exploit only the locality principle of data access, so both their prefetching efficiency and prefetching accuracy are relatively low. The prefetching algorithm based on application hints requires providing upper-layer applications with an interface for conveying prefetch instructions, which is not supported in many systems. The prefetching algorithm based on data mining has a rather limited ability to mine data access characteristics, and its prefetching accuracy varies widely across different access patterns.
Summary of the invention
In view of the above defects or improvement requirements of the prior art, the present invention provides a block-level cache prefetching optimization method based on deep learning. Its purpose is to mine the correlations among I/Os with deep learning methods, complete I/O sequence prediction with an LSTM-based Seq2Seq model, and finally combine I/O sequence prediction with sequential prediction to complete cache prefetching and improve the cache hit rate.
To achieve the above object, according to one aspect of the present invention, a block-level cache prefetching optimization method based on deep learning is provided, comprising:
(1) obtaining I/O data in units of bytes from a test data set, and converting the I/O data into I/O data in units of blocks;
(2) judging whether the converted I/O data hits in the cache; if so, the process ends; otherwise, proceeding to step (3);
(3) performing sequential prediction on the converted I/O data to obtain multiple storage blocks;
(4) storing the converted I/O data in an I/O queue in memory, and judging whether the I/O queue is full; if so, proceeding to step (5); otherwise, returning to step (1);
(5) inputting all I/O data in the in-memory I/O queue into a trained LSTM-based Seq2Seq model to obtain predicted I/O data, and obtaining the corresponding multiple storage blocks according to the predicted I/O data;
(6) replacing the storage blocks in the cache with the storage blocks obtained in steps (3) and (5) according to a cache replacement algorithm.
Preferably, the I/O data converted in step (1) is jointly represented by its logical block address LBA and its number of storage blocks BlockNum.
Preferably, obtaining the corresponding multiple storage blocks according to the predicted I/O data means first parsing the I/O data to obtain its logical block address LBA and its block count BlockNum, from which all storage blocks covered by the predicted I/O data are obtained.
Preferably, judging whether I/O data hits in the cache means checking whether all storage blocks of the I/O data hit in the cache; if so, the I/O data is considered a cache hit; otherwise, it is a cache miss.
Preferably, performing sequential prediction on I/O data means predicting the storage blocks about to be accessed according to the logical address of the last storage block in the I/O data.
Preferably, the LSTM-based Seq2Seq model used in step (5) is obtained by training through the following steps:
(5-1) obtaining a training data set, and converting the I/O data in units of bytes in the training data set into I/O data in units of blocks;
(5-2) segmenting all I/O data obtained after the conversion in step (5-1) to obtain multiple segments of I/O data, and storing all the segmented I/O data;
(5-3) processing all the segmented I/O data with a word-vector generation algorithm to obtain the vector corresponding to each I/O;
(5-4) performing non-overlapping sequence sampling on all the segmented I/O data to obtain multiple sequence samples;
(5-5) repeatedly training the LSTM-based Seq2Seq model on the multiple sequence samples obtained in step (5-4), using the vectors corresponding to the segmented I/O data obtained in step (5-3), until a preset number of iterations is reached.
Preferably, step (5-3) specifically uses the continuous bag-of-words model of the Word2vec algorithm, and the vector dimension is 50.
Preferably, the cache replacement algorithm used in step (6) may be the LRU algorithm, the FIFO algorithm, or the ARC algorithm.
According to another aspect of the present invention, a block-level cache prefetching optimization system based on deep learning is provided, comprising:
a first module for obtaining I/O data in units of bytes from a test data set and converting the I/O data into I/O data in units of storage blocks;
a second module for judging whether the converted I/O data hits in the cache; if so, the process ends; otherwise, control passes to the third module;
a third module for performing sequential prediction on the converted I/O data to obtain multiple storage blocks;
a fourth module for storing the converted I/O data in an I/O queue in memory and judging whether the I/O queue is full; if so, control passes to the fifth module; otherwise, control returns to the first module;
a fifth module for inputting all I/O data in the in-memory I/O queue into a trained LSTM-based Seq2Seq model to obtain predicted I/O data, and obtaining the corresponding multiple storage blocks according to the predicted I/O data;
a sixth module for replacing the storage blocks in the cache with the storage blocks obtained by the third and fifth modules according to a cache replacement algorithm.
In general, compared with the prior art, the above technical solutions conceived by the present invention can achieve the following beneficial effects:
(1) The present invention solves the problem of low prefetching efficiency and low prefetching accuracy in existing methods: because steps (3) to (6) combine sequential prediction with LSTM-based Seq2Seq prediction, both the locality principle of data access and the correlation principle of data are fully exploited, so the efficiency and accuracy of prefetching are both improved.
(2) The present invention solves the problem in existing methods of having to provide upper-layer applications with an interface for conveying prefetch instructions: because the whole process of steps (1) to (6) is isolated from upper-layer applications and requires no instructions from them, no such interface needs to be provided.
(3) The present invention solves the problem in existing methods that prediction accuracy varies widely across access patterns: because steps (2) to (5) combine sequential prediction with LSTM-based Seq2Seq prediction by detecting cache hit conditions, the method adapts to different access patterns, and its prefetching accuracy performs well under all of them.
(4) The present invention performs I/O sequence prediction with an LSTM-based Seq2Seq model rather than by computing the cosine distance between vectors, which not only improves prediction efficiency but also reduces memory overhead.
(5) By training I/O vectors with the IO2Vec technique, the present invention not only mines the correlations among I/Os but also achieves I/O dimensionality reduction. Compared with simple one-hot vectors, the I/O vectors trained by IO2Vec greatly reduce the time and memory consumption of model training and improve the accuracy of the model.
(6) In cache tests, the prefetching algorithm designed in the present invention improves the hit rate by 10%-30% on average relative to a cache without prefetching, and by 3%-10% relative to a cache using only the sequential prefetching algorithm.
Brief description of the drawings
Fig. 1 is a flow chart of the block-level cache prefetching optimization method based on deep learning according to the present invention.
Detailed description of the embodiments
In order to make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is further elaborated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present invention and are not intended to limit it. In addition, the technical features involved in the various embodiments of the present invention described below can be combined with each other as long as they do not conflict.
As shown in Fig. 1, the present invention provides a block-level cache prefetching optimization method based on deep learning, comprising the following steps:
(1) Obtain input/output (I/O) data in units of bytes from a test data set, and convert the I/O data into I/O data in units of blocks;
Specifically, the storage block size of the converted I/O data in this step is 4096 KB.
The converted I/O data is jointly represented by its logical block address (LBA) and its number of storage blocks (BlockNum).
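For illustration, the following minimal Python sketch (not part of the patent; the block size constant, function name, and tuple representation are assumptions) shows how a byte-level request could be converted into the (LBA, BlockNum) block-level representation described above:

```python
BLOCK_SIZE = 4096  # assumed block size in bytes

def to_block_io(offset: int, length: int) -> tuple[int, int]:
    """Convert a byte-level request (offset, length) into (LBA, BlockNum)."""
    lba = offset // BLOCK_SIZE                  # first block touched
    last = (offset + length - 1) // BLOCK_SIZE  # last block touched
    return lba, last - lba + 1                  # BlockNum = blocks spanned

# Example: a 10000-byte read starting at byte offset 6000 spans blocks 1..3.
print(to_block_io(6000, 10000))  # -> (1, 3)
```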
(2) Judge whether the converted I/O data hits in the cache; if so, the process ends; otherwise, proceed to step (3);
Specifically, judging whether I/O data hits in the cache means checking whether all storage blocks of the I/O data hit in the cache; if so, the I/O data is considered a cache hit; otherwise, it is a cache miss.
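A minimal sketch of this hit test, assuming the cache is modeled as a set of resident block addresses (an illustrative simplification, not the patent's data structure):

```python
def io_hits_in_cache(cache: set[int], lba: int, block_num: int) -> bool:
    """The I/O hits only if every block it spans is resident in the cache."""
    return all(lba + i in cache for i in range(block_num))
```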
(3) Perform sequential prediction on the converted I/O data to obtain multiple storage blocks;
Specifically, performing sequential prediction on I/O data means predicting the storage blocks about to be accessed according to the logical address of the last storage block in the I/O data (4 to 16 blocks are predicted in the present invention, determined by the result of sequentiality detection). For example, if the logical address of the last storage block in the current I/O data is 10, the predicted logical addresses are 11, 12, 13, and so on, each corresponding to a storage block.
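The sequential-prediction step can be sketched as follows (illustrative only; the default prefetch depth of 4 used here stands in for the 4-16 value chosen by sequentiality detection):

```python
def sequential_predict(lba: int, block_num: int, depth: int = 4) -> list[int]:
    """Predict the `depth` block addresses following the last block of the I/O."""
    last = lba + block_num - 1
    return [last + i for i in range(1, depth + 1)]

# The last block of the current I/O is 10, so blocks 11, 12, 13, 14 are predicted.
print(sequential_predict(8, 3))  # -> [11, 12, 13, 14]
```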
(4) Store the converted I/O data in an I/O queue in memory, and judge whether the I/O queue is full; if so, proceed to step (5); otherwise, return to step (1);
(5) Input all I/O data in the in-memory I/O queue into a trained sequence-to-sequence (Seq2Seq) model based on long short-term memory (LSTM) to obtain predicted I/O data, and obtain the corresponding multiple storage blocks according to the predicted I/O data;
Specifically, obtaining the corresponding multiple storage blocks according to the predicted I/O data means first parsing the I/O data to obtain its logical block address (LBA) and its block count (BlockNum), from which all storage blocks covered by the predicted I/O data can be obtained.
The LSTM-based Seq2Seq model used in this step is obtained by training through the following steps:
(5-1) Obtain a training data set, and convert the I/O data in units of bytes in the training data set into I/O data in units of blocks;
(5-2) Segment all I/O data obtained after the conversion in step (5-1) to obtain multiple segments of I/O data, and store all the segmented I/O data;
Specifically, the segmentation gap in this step is 10 milliseconds;
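A sketch of this segmentation step, assuming each trace record carries a millisecond timestamp and that a new segment begins whenever the inter-arrival gap exceeds 10 ms (the record layout is an illustrative assumption):

```python
GAP_MS = 10.0  # segmentation gap from the patent: 10 milliseconds

def segment_trace(trace: list[tuple[float, int, int]]) -> list[list[tuple[float, int, int]]]:
    """Split (timestamp_ms, lba, block_num) records into segments on >10 ms gaps."""
    segments, current = [], []
    for record in trace:
        if current and record[0] - current[-1][0] > GAP_MS:
            segments.append(current)  # gap exceeded: close the current segment
            current = []
        current.append(record)
    if current:
        segments.append(current)
    return segments
```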
(5-3) Process all the segmented I/O data with a word-vector generation algorithm (word2vec) to obtain the vector corresponding to each I/O;
In this step, the continuous bag-of-words (CBOW) model of the Word2vec algorithm is used, and the vector dimension is 50.
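For illustration, I/O embeddings of this kind could be trained with the gensim library (an assumed choice; the patent specifies the CBOW model and the dimension of 50, not a library), treating each segment as a sentence of I/O tokens:

```python
from gensim.models import Word2Vec  # gensim 4.x API assumed

# Each segment becomes a "sentence"; each I/O is a token such as "LBA:BlockNum".
segments = [["100:4", "104:8", "200:1"], ["104:8", "200:1", "300:2"]]

model = Word2Vec(
    sentences=segments,
    vector_size=50,  # 50-dimensional vectors, as specified
    sg=0,            # sg=0 selects the continuous bag-of-words (CBOW) model
    window=2,
    min_count=1,
)
vec = model.wv["104:8"]  # 50-dimensional vector for one I/O token
```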
(5-4) Perform non-overlapping sequence sampling on all the segmented I/O data to obtain multiple sequence samples;
Specifically, the sequence length used in sequence sampling is 3;
Non-overlapping sampling means that the I/O data of two consecutive samples do not intersect.
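A sketch of non-overlapping sampling with the sequence length of 3 given above (the token representation is assumed):

```python
def sample_sequences(tokens: list[str], seq_len: int = 3) -> list[list[str]]:
    """Consecutive samples share no I/O tokens: the window advances by seq_len."""
    return [tokens[i:i + seq_len]
            for i in range(0, len(tokens) - seq_len + 1, seq_len)]

print(sample_sequences(["a", "b", "c", "d", "e", "f", "g"]))
# -> [['a', 'b', 'c'], ['d', 'e', 'f']]  (the trailing 'g' is discarded)
```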
(5-5) Repeatedly train the LSTM-based Seq2Seq model on the multiple sequence samples obtained in step (5-4), using the vectors corresponding to the segmented I/O data obtained in step (5-3), until the preset number of iterations is reached;
Specifically, the preset number of iterations in this step is 1 to 1000, preferably 500.
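The patent does not disclose the network code; the following compact PyTorch sketch (the framework, hidden size, and loss are assumptions) illustrates one way an LSTM-based encoder-decoder could map a window of 50-dimensional I/O vectors to the vector of the predicted next I/O:

```python
import torch
import torch.nn as nn

class Seq2SeqIO(nn.Module):
    def __init__(self, dim: int = 50, hidden: int = 128):
        super().__init__()
        self.encoder = nn.LSTM(dim, hidden, batch_first=True)
        self.decoder = nn.LSTM(dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, dim)  # project back to I/O-vector space

    def forward(self, src: torch.Tensor, tgt: torch.Tensor) -> torch.Tensor:
        _, state = self.encoder(src)       # encode the input I/O sequence
        dec_out, _ = self.decoder(tgt, state)
        return self.out(dec_out)           # predicted I/O vector(s)

model = Seq2SeqIO()
src = torch.randn(8, 3, 50)  # batch of 8 input sequences of length 3
tgt = torch.randn(8, 1, 50)  # decoder input (e.g., a start-of-sequence vector)
pred = model(src, tgt)       # -> (8, 1, 50)
loss = nn.functional.mse_loss(pred, torch.randn(8, 1, 50))
loss.backward()              # one step of the repeated training loop
```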
(6) Replace the storage blocks in the cache with the storage blocks obtained in steps (3) and (5) according to a cache replacement algorithm.
Specifically, the cache replacement algorithm used in the present invention may be the least recently used (LRU) algorithm, the first-in-first-out (FIFO) algorithm, or the adaptive replacement cache (ARC) algorithm.
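To illustrate how prefetched blocks from steps (3) and (5) would be installed under LRU (one of the three algorithms named above), here is a minimal LRU cache sketch; the capacity and interface are illustrative assumptions:

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.blocks: "OrderedDict[int, None]" = OrderedDict()

    def access(self, block: int) -> bool:
        """Return True on hit; on miss, install the block."""
        if block in self.blocks:
            self.blocks.move_to_end(block)  # refresh recency
            return True
        self.install(block)
        return False

    def install(self, block: int) -> None:
        """Insert a demand-fetched or prefetched block, evicting the LRU victim."""
        self.blocks[block] = None
        self.blocks.move_to_end(block)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)  # evict least recently used
```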
In summary, the present invention achieves the beneficial effects (1) to (6) enumerated above.
The present invention applies deep learning methods to cache prefetching optimization, and its effectiveness has been experimentally verified, which is of innovative significance in this technical field.
As will be readily appreciated by those skilled in the art, the foregoing is merely a description of preferred embodiments of the present invention and is not intended to limit it; any modifications, equivalent substitutions, and improvements made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (9)

1. A block-level cache prefetching optimization method based on deep learning, characterized by comprising the following steps:
(1) obtaining I/O data in units of bytes from a test data set, and converting the I/O data into I/O data in units of blocks;
(2) judging whether the converted I/O data hits in the cache; if so, the process ends; otherwise, proceeding to step (3);
(3) performing sequential prediction on the converted I/O data to obtain multiple storage blocks;
(4) storing the converted I/O data in an I/O queue in memory, and judging whether the I/O queue is full; if so, proceeding to step (5); otherwise, returning to step (1);
(5) inputting all I/O data in the in-memory I/O queue into a trained LSTM-based Seq2Seq model to obtain predicted I/O data, and obtaining the corresponding multiple storage blocks according to the predicted I/O data;
(6) replacing the storage blocks in the cache with the storage blocks obtained in steps (3) and (5) according to a cache replacement algorithm.
2. The block-level cache prefetching optimization method according to claim 1, characterized in that the I/O data converted in step (1) is jointly represented by its logical block address LBA and its number of storage blocks BlockNum.
3. The block-level cache prefetching optimization method according to claim 2, characterized in that obtaining the corresponding multiple storage blocks according to the predicted I/O data means first parsing the I/O data to obtain its logical block address LBA and its block count BlockNum, from which all storage blocks covered by the predicted I/O data are obtained.
4. The block-level cache prefetching optimization method according to claim 1, characterized in that judging whether I/O data hits in the cache means checking whether all storage blocks of the I/O data hit in the cache; if so, the I/O data is considered a cache hit; otherwise, it is a cache miss.
5. The block-level cache prefetching optimization method according to claim 1, characterized in that performing sequential prediction on I/O data means predicting the storage blocks about to be accessed according to the logical address of the last storage block in the I/O data.
6. The block-level cache prefetching optimization method according to claim 1, characterized in that the LSTM-based Seq2Seq model used in step (5) is obtained by training through the following steps:
(5-1) obtaining a training data set, and converting the I/O data in units of bytes in the training data set into I/O data in units of blocks;
(5-2) segmenting all I/O data obtained after the conversion in step (5-1) to obtain multiple segments of I/O data, and storing all the segmented I/O data;
(5-3) processing all the segmented I/O data with a word-vector generation algorithm to obtain the vector corresponding to each I/O;
(5-4) performing non-overlapping sequence sampling on all the segmented I/O data to obtain multiple sequence samples;
(5-5) repeatedly training the LSTM-based Seq2Seq model on the multiple sequence samples obtained in step (5-4), using the vectors corresponding to the segmented I/O data obtained in step (5-3), until a preset number of iterations is reached.
7. The block-level cache prefetching optimization method according to claim 6, characterized in that step (5-3) specifically uses the continuous bag-of-words model of the word2vec algorithm, and the vector dimension is 50.
8. The block-level cache prefetching optimization method according to claim 1, characterized in that the cache replacement algorithm used in step (6) may be the LRU algorithm, the FIFO algorithm, or the ARC algorithm.
9. A block-level cache prefetching optimization system based on deep learning, characterized by comprising:
a first module for obtaining I/O data in units of bytes from a test data set and converting the I/O data into I/O data in units of storage blocks;
a second module for judging whether the converted I/O data hits in the cache; if so, the process ends; otherwise, control passes to the third module;
a third module for performing sequential prediction on the converted I/O data to obtain multiple storage blocks;
a fourth module for storing the converted I/O data in an I/O queue in memory and judging whether the I/O queue is full; if so, control passes to the fifth module; otherwise, control returns to the first module;
a fifth module for inputting all I/O data in the in-memory I/O queue into a trained LSTM-based Seq2Seq model to obtain predicted I/O data, and obtaining the corresponding multiple storage blocks according to the predicted I/O data;
a sixth module for replacing the storage blocks in the cache with the storage blocks obtained by the third and fifth modules according to a cache replacement algorithm.
CN201910526384.3A 2019-06-18 2019-06-18 Block-level cache prefetching optimization method and system based on deep learning Active CN110245094B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910526384.3A CN110245094B (en) 2019-06-18 2019-06-18 Block-level cache prefetching optimization method and system based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910526384.3A CN110245094B (en) 2019-06-18 2019-06-18 Block-level cache prefetching optimization method and system based on deep learning

Publications (2)

Publication Number Publication Date
CN110245094A true CN110245094A (en) 2019-09-17
CN110245094B CN110245094B (en) 2020-12-29

Family

ID=67887848

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910526384.3A Active CN110245094B (en) 2019-06-18 2019-06-18 Block-level cache prefetching optimization method and system based on deep learning

Country Status (1)

Country Link
CN (1) CN110245094B (en)


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106681990A (en) * 2015-11-05 2017-05-17 华中科技大学 Method for reading caching data under mobile cloud storage environment
CN105786717A (en) * 2016-03-22 2016-07-20 华中科技大学 DRAM (dynamic random access memory)-NVM (non-volatile memory) hierarchical heterogeneous memory access method and system adopting software and hardware collaborative management
CN106021128A (en) * 2016-05-31 2016-10-12 东南大学—无锡集成电路技术研究所 Data prefetcher based on correlation of strides and data and prefetching method of data prefetcher
US20180315158A1 (en) * 2017-04-28 2018-11-01 Intel Corporation Programmable coarse grained and sparse matrix compute hardware with advanced scheduling
CN109639760A * 2018-11-02 2019-04-16 西北工业大学 Caching policy method in D2D networks based on deep reinforcement learning
US20190130101A1 (en) * 2018-12-27 2019-05-02 Li Chen Methods and apparatus for detecting a side channel attack using hardware performance counters
CN109788566A * 2019-01-18 2019-05-21 南京邮电大学 Network resource allocation method based on deep reinforcement learning

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI763168B (en) * 2019-12-30 2022-05-01 大陸商上海商湯智能科技有限公司 Data processing method and apparatus, computer device, storage medium
CN114065947A (en) * 2021-11-15 2022-02-18 深圳大学 Data access speculation method and device, storage medium and electronic equipment
CN116822657A (en) * 2023-08-25 2023-09-29 之江实验室 Method and device for accelerating model training, storage medium and electronic equipment
CN116822657B (en) * 2023-08-25 2024-01-09 之江实验室 Method and device for accelerating model training, storage medium and electronic equipment

Also Published As

Publication number Publication date
CN110245094B (en) 2020-12-29

Similar Documents

Publication Publication Date Title
CN103488463B Suppressing update of the branch history register by loop-ending branches
CN110245094A (en) A kind of block grade cache prefetching optimization method and system based on deep learning
US6611910B2 (en) Method for processing branch operations
US8490065B2 (en) Method and apparatus for software-assisted data cache and prefetch control
US7958317B2 (en) Cache directed sequential prefetch
EP1150213B1 (en) Data processing system and method
JP3618385B2 (en) Method and system for buffering data
CN101002178A (en) System, apparatus and method for issuing predictions from an inventory to access a memory
US20030079088A1 (en) Prefetching mechanism for data caches
EP1271308A2 (en) Apparatus and method for branch prediction based on history table
CN104657286B Hierarchical storage method and device
US20070150660A1 (en) Inserting prefetch instructions based on hardware monitoring
US7047362B2 (en) Cache system and method for controlling the cache system comprising direct-mapped cache and fully-associative buffer
CN111324556B (en) Method and system for prefetching a predetermined number of data items into a cache
CN102707933B (en) Method and apparatus for managing return stack
CN102163144A (en) Hardware data pre-fetching method of embedded processor
CN104572026B Data processing method and device for prefetching
US20210124586A1 (en) Apparatus and method for handling incorrect branch direction predictions
CN107844380A Multi-core cache WCET analysis method supporting instruction prefetching
Feng et al. Dynamic access distance driven cache replacement
CN117609110A (en) Caching method, cache, electronic device and readable storage medium
US11561796B2 (en) Linked miss-to-miss instruction prefetcher
CN108874690A Data prefetching implementation method and processor
CN115934170A (en) Prefetching method and device, prefetching training method and device, and storage medium
CN102662720A (en) Optimization method of compiler of multi-issue embedded processor

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant