CN105512185A - Cache sharing method based on operation sequence - Google Patents

Cache sharing method based on operation sequence

Info

Publication number
CN105512185A
CN105512185A (application CN201510830806.8A)
Authority
CN
China
Prior art keywords: job, dfscache, cache, new, access
Prior art date
Legal status
Granted
Application number
CN201510830806.8A
Other languages
Chinese (zh)
Other versions
CN105512185B (en)
Inventor
何晓斌 (He Xiaobin)
魏巍 (Wei Wei)
王红艳 (Wang Hongyan)
Current Assignee
Wuxi Jiangnan Computing Technology Institute
Original Assignee
Wuxi Jiangnan Computing Technology Institute
Priority date
Filing date
Publication date
Application filed by Wuxi Jiangnan Computing Technology Institute filed Critical Wuxi Jiangnan Computing Technology Institute
Priority to CN201510830806.8A (granted as CN105512185B)
Publication of CN105512185A
Application granted
Publication of CN105512185B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/172 Caching, prefetching or hoarding of files
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems

Abstract

The invention provides a cache sharing method based on operation sequence (job timing). The method comprises the following steps: before a JOB is submitted for execution, it declares the amount of DFS Cache resources it will need during its run; the system allocates the corresponding DFS Cache resources to each JOB and starts the JOB; each JOB runs for multiple rounds, during which it may access the DFS Cache many times, so that the system can collect the time intervals between each JOB's accesses to the DFS Cache; once the intervals at which a JOB accesses its cache resources stabilize, they are recorded and a DFS Cache sharing allocation algorithm is started; thereafter the storage management system reads and writes JOB data according to the DFS Cache access pattern of all running JOBs, and restarts the sharing allocation algorithm whenever it determines that a JOB's data accesses exceed their time window.

Description

A cache sharing method based on job timing
Technical field
The present invention relates to the field of computer technology, and in particular to a cache sharing method based on job timing.
Background art
High-performance computers are very large systems, and the volume of concurrent data accesses during task execution runs into the tens of thousands, so the demands placed on the performance of the distributed file system are very high. For this reason, the distributed file system is generally configured with dedicated acceleration (cache) resources on its servers to handle read and write requests against massive data. The capacity of these cache resources is very small compared with the storage capacity of the distributed file system itself, but their performance may be several times higher, and their cost is also very high. Therefore, although exclusively allocating cache resources to a job for the whole of its run is simple, it is unreasonable.
A JOB is an application that runs on the computational resources of a high-performance computer and generally executes some scientific computing task. At certain phases of the computation it writes data to the distributed file system. The amount of data a problem outputs at one time is very large, often tens of TB and even hundreds or thousands of TB, so the performance requirements on the storage resources of the distributed file system are very high.
When a JOB starts in high-performance computing, the system generally allocates it the necessary DFS (Distributed File System: a software-implemented shared storage space that integrates a large number of storage server resources and cache resources, runs on the storage servers, and provides high-performance, highly concurrent data read/write support for the high-performance computer) cache resources to accelerate the JOB's data I/O. During the JOB's execution, these DFS cache resources are usually allocated to it fixedly. Because a JOB's data accesses come in phases, there are gaps between accesses, generally longer than ten minutes; the cache resource therefore sits idle during these gaps, which indirectly wastes it.
More specifically, when a JOB starts running, the system allocates DFSCache resources for it. DFSCache is a distributed file system cache: the servers on which the distributed file system runs are generally equipped with resources dedicated to accelerating distributed file access, such as SSDs and memory, and the distributed file system schedules these cache resources to support accelerated data access for computational tasks.
Thus, while the JOB runs, this resource is monopolized by it. A JOB's data accesses come in phases: after one phase of reading and writing completes, the JOB can only begin the next phase of I/O after finishing a certain computation. Since DFSCache resources in HPC are typically high-performance and expensive, this exclusivity wastes resources.
Summary of the invention
The technical problem to be solved by the invention is to remedy the above defect in the prior art by providing a cache sharing method based on job timing that enables DFSCache to be shared among JOBs.
According to the invention, there is provided a cache sharing method based on job timing, comprising:
First step: before a JOB is submitted for execution, declare the amount of DFSCache resources the JOB will need during its run;
Second step: the system allocates the corresponding DFSCache resources to each JOB and starts the JOB;
Third step: each JOB runs for multiple rounds, during which it accesses the DFSCache repeatedly, so that the system collects the time intervals between each JOB's accesses to the DFSCache;
Fourth step: when the intervals at which a JOB accesses its cache resources stabilize, record them and start the DFSCache sharing allocation algorithm.
Preferably, the cache sharing method based on job timing further comprises:
Fifth step: while JOBs execute, the storage management system reads and writes JOB data according to the DFSCache access pattern of all running JOBs; if the storage management system determines that a JOB's data accesses exceed their time window, return to the fourth step to restart the DFSCache sharing allocation algorithm;
Sixth step: after a JOB completes, release the DFSCache resources occupied by that job.
Preferably, the DFSCache sharing allocation algorithm comprises: establishing a table of the access intervals of all JOBs in the system, and determining whether there is a DFSCache with an idle period during which the JOB that owns it can share it with other JOBs; and, when such a DFSCache exists, judging whether its cache space has any capacity remaining.
Preferably, the DFSCache sharing allocation algorithm further comprises: if the idle period and the remaining cache space of this DFSCache satisfy the requirements of a newly started JOB, allocating the idle period and the remaining cache space of this DFSCache directly to the newly started JOB.
Preferably, the DFSCache sharing allocation algorithm further comprises: if the idle period and the remaining cache space of the DFSCache are insufficient to satisfy the requirements of the newly started JOB, letting the newly started JOB use the remaining cache space first and then allocating new resources to it.
Preferably, the DFSCache sharing allocation algorithm further comprises: if the DFSCache has no cache space remaining, allocating a new DFSCache to the newly started JOB, which uses the newly allocated DFSCache exclusively until another JOB starts.
Preferably, the time interval is regarded as stable when it is a constant value or exceeds a particular value.
The invention resolves the drawback of fixed allocation of DFS cache resources to JOBs. By collecting, at the storage layer, information about the gaps in jobs' accesses to the DFS cache, and scheduling jobs according to those gaps, different jobs can reuse the same DFS cache resources during one another's access gaps, improving the overall utilization of the system.
Brief description of the drawings
A more complete understanding of the invention and of its attendant advantages and features will be more readily obtained by reference to the following detailed description when considered in conjunction with the accompanying drawing, in which:
Fig. 1 schematically shows a flowchart of the cache sharing method based on job timing according to a preferred embodiment of the invention.
It should be noted that the drawing serves to illustrate the invention and does not limit it; it is not necessarily drawn to scale; and identical or similar elements are denoted by identical or similar reference numerals throughout.
Detailed description of the embodiments
To make the content of the invention clear and understandable, it is described in detail below in conjunction with specific embodiments and the accompanying drawing.
High-performance computing (HPC) systems integrate large-scale computational resources and storage resources to process ultra-large-scale problems. Such a system integrates tens of thousands of central processing units computing in parallel and writes massive amounts of data into storage resources built on a distributed file system, so the requirements it places on the concurrency and performance of the storage resources are very high.
In HPC, applications use computational and storage resources to solve scientific problems by submitting jobs (JOBs). For a JOB with a large data output, the system allocates DFSCache resources when the JOB starts running; to keep JOB data management simple and to guarantee JOB data security, this resource is monopolized by the JOB for the duration of its run. A JOB's data accesses come in phases: after one phase of reading and writing completes, the JOB can only begin the next phase of I/O after finishing a certain computation. Since DFSCache resources in HPC are high-performance and expensive, this causes resource waste.
The present invention schedules jobs according to the time gaps in their DFSCache accesses, so that jobs share DFSCache resources. After jobs are submitted, each is allocated separate cache resources during the first rounds of its run while the system collects the intervals between its accesses to the cache. Once reasonably stable interval values have been collected, they are sorted to obtain the order in which the JOBs access the DFSCache; the system can then apply a data scheduling strategy that lets different jobs share access to the DFSCache during one another's gaps.
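The interval-collection and ranking procedure described above can be sketched as follows. This is a minimal illustration; the class and method names and the minute-based timeline are our own assumptions, not taken from the patent.

```python
from collections import defaultdict

class AccessIntervalCollector:
    """Collects each JOB's DFSCache access timestamps during its first
    rounds and derives the gaps between successive accesses, as in the
    interval-collection phase described above."""

    def __init__(self):
        self.timestamps = defaultdict(list)  # job_id -> [access times, in minutes]

    def record_access(self, job_id, t):
        self.timestamps[job_id].append(t)

    def intervals(self, job_id):
        # Gaps between consecutive accesses of one JOB.
        ts = sorted(self.timestamps[job_id])
        return [b - a for a, b in zip(ts, ts[1:])]

    def access_order(self):
        # Sort JOBs by first access time to obtain the order in which
        # they touch the DFSCache, approximating the ranking step above.
        return sorted(self.timestamps, key=lambda j: min(self.timestamps[j]))

# JOB A accesses at t = 0, 20, 40 minutes (a stable 20-minute cycle);
# JOB B accesses at t = 10, 30, 50, so B can reuse A's cache in A's gaps.
collector = AccessIntervalCollector()
for t in (0, 20, 40):
    collector.record_access("A", t)
for t in (10, 30, 50):
    collector.record_access("B", t)
print(collector.intervals("A"))   # [20, 20]
print(collector.access_order())   # ['A', 'B']
```

With intervals and ordering in hand, the scheduler knows that B's accesses fall entirely inside A's 20-minute gaps, which is exactly the multiplexing opportunity the method exploits.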
Fig. 1 schematically shows a flowchart of the cache sharing method based on job timing according to a preferred embodiment of the invention.
As shown in Fig. 1, the cache sharing method based on job timing according to the preferred embodiment of the invention comprises:
First step S1: before a JOB is submitted for execution, declare the amount of DFSCache resources the JOB will need during its run;
Second step S2: the system (for example, in a default manner) allocates the corresponding DFSCache resources to each JOB and starts the JOB;
Third step S3: each JOB runs for multiple rounds, during which it accesses the DFSCache repeatedly, so that the system collects the time intervals between each JOB's accesses to the DFSCache;
Fourth step S4: when the interval between a JOB's rounds of cache accesses stabilizes ("stabilizes" meaning the interval is a constant value or exceeds a particular value), record the interval and start the DFSCache sharing allocation algorithm;
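The stability test of step S4 (the interval is a constant value or exceeds a particular value) might be expressed as the following sketch. The parameter names and the tolerance refinement are our assumptions; the patent states only the two conditions.

```python
def intervals_stable(intervals, min_samples=3, tolerance=0, threshold=None):
    """Return True once a JOB's recorded cache-access intervals have
    'stabilized' in the sense of step S4: the recent intervals are
    (near-)constant, or each exceeds a given threshold."""
    if len(intervals) < min_samples:
        return False  # not enough rounds observed yet
    recent = intervals[-min_samples:]
    # Condition 1: the interval is (near-)constant.
    is_constant = max(recent) - min(recent) <= tolerance
    # Condition 2: every recent interval exceeds a particular value.
    above_threshold = threshold is not None and all(i > threshold for i in recent)
    return is_constant or above_threshold

print(intervals_stable([18, 20, 20, 20]))            # True: constant tail
print(intervals_stable([5, 9, 14]))                  # False: still drifting
print(intervals_stable([12, 15, 18], threshold=10))  # True: all above threshold
```

Once this predicate fires for every running JOB, the recorded intervals can be handed to the sharing allocation algorithm.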
Specifically, the DFSCache sharing allocation algorithm comprises:
Establishing a table of the access intervals of all JOBs in the system, and determining whether there is a DFSCache with an idle period during which the JOB that owns it can share it with other JOBs; when such a DFSCache exists, judging whether its cache space has any capacity remaining;
Further, if the idle period and the remaining cache space of this DFSCache satisfy the requirements of a newly started JOB, the idle period and the remaining cache space of this DFSCache are allocated directly to the newly started JOB;
On the other hand, if the idle period and the remaining cache space of the DFSCache are insufficient to satisfy the requirements of the newly started JOB, the newly started JOB first uses the remaining cache space and new resources are then allocated to it;
If the DFSCache has no cache space remaining, a new DFSCache is allocated to the newly started JOB, which uses the newly allocated DFSCache exclusively until another JOB starts.
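The three allocation cases just described can be sketched as a single decision routine. This is an illustrative reading, not the patent's implementation: the data structure, the units, and the return shape are all our assumptions.

```python
from dataclasses import dataclass

@dataclass
class DFSCache:
    capacity: int   # total cache space (e.g. GB)
    used: int = 0   # space already held by the owning JOB(s)
    idle: int = 0   # length of the owner's access gap (e.g. minutes)

    @property
    def free_space(self):
        return self.capacity - self.used

def allocate_for_new_job(caches, idle_needed, space_needed, new_capacity):
    """Returns (grants, new_cache): grants is a list of (cache, space)
    pairs taken from existing caches; new_cache is a freshly allocated
    DFSCache used exclusively by the new JOB until another JOB starts,
    or None when no new cache is needed."""
    for cache in caches:
        if cache.idle >= idle_needed and cache.free_space > 0:
            if cache.free_space >= space_needed:
                # Case 1: idle period and remaining space both suffice;
                # share the existing DFSCache directly.
                cache.used += space_needed
                return [(cache, space_needed)], None
            # Case 2: remaining space is insufficient; use what is left,
            # then allocate new resources for the shortfall.
            granted = cache.free_space
            cache.used = cache.capacity
            shortfall = DFSCache(capacity=new_capacity, used=space_needed - granted)
            return [(cache, granted)], shortfall
    # Case 3: no shareable space at all; allocate a new, exclusive DFSCache.
    return [], DFSCache(capacity=new_capacity, used=space_needed)

# Case 1: an existing cache with a 15-minute gap and 60 GB free.
caches = [DFSCache(capacity=100, used=40, idle=15)]
grants, extra = allocate_for_new_job(caches, idle_needed=10, space_needed=50, new_capacity=100)
print(len(grants), extra is None, caches[0].used)   # 1 True 90
```

The routine scans the interval table in order, which matches the patent's description of first looking for a cache with a usable idle period before falling back to a fresh, exclusively held DFSCache.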
Furthermore, fifth step S5: while JOBs execute, the storage management system reads and writes JOB data according to the DFSCache access pattern of all running JOBs; if the storage management system determines that a JOB's data accesses exceed their time window, return to the fourth step S4 to restart the DFSCache sharing allocation algorithm;
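The time-window check that sends control back to step S4 might look like the following sketch; the window representation (start plus length) is our assumption, since the patent does not fix one.

```python
def exceeds_time_window(window_start, window_length, access_time):
    """True when a JOB's data access falls outside the time window the
    sharing schedule assigned to it, i.e. the condition that triggers
    rerunning the DFSCache sharing allocation algorithm (step S4)."""
    return not (window_start <= access_time <= window_start + window_length)

# The storage management system checks each observed access against the
# schedule produced by the sharing allocation algorithm.
schedule = {"A": (0, 5), "B": (20, 5)}          # job -> (window start, length), in minutes
observed = [("A", 3), ("B", 21), ("B", 31)]     # (job, access time)
violators = {job for job, t in observed if exceeds_time_window(*schedule[job], t)}
print(violators)   # {'B'}: B drifted, so restart the sharing allocation algorithm
```

Here JOB B's access at t = 31 falls outside its [20, 25] window, modeling the drift that invalidates the recorded intervals and forces a fresh allocation round.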
Sixth step S6: after a JOB completes, the time slot and DFSCache resources occupied by that job are released.
The invention exploits the time intervals between different JOBs' cache accesses to share storage cache resources, and this sharing is transparent to the users' JOBs. Its advantage is that multiple JOBs in high-performance computing share the precious cache resources, overcoming the shortcoming of the traditional exclusive usage pattern and improving the utilization of the system's cache resources.
In addition, it should be noted that, unless otherwise indicated, terms such as "first", "second" and "third" in the specification are used only to distinguish components, elements, steps and the like from one another, and not to denote any logical or ordinal relationship among them.
It will be understood that although the invention has been disclosed above by way of preferred embodiments, these embodiments are not intended to limit it. Any person of ordinary skill in the art may, without departing from the scope of the technical solution of the invention, use the content disclosed above to make many possible variations and modifications to the technical solution, or amend it into equivalent embodiments. Any simple modification, equivalent variation or refinement of the above embodiments made in accordance with the technical essence of the invention, without departing from the content of the technical solution, still falls within the scope of protection of the technical solution of the invention.

Claims (8)

1. A cache sharing method based on job timing, characterized by comprising:
First step: before a JOB is submitted for execution, declaring the amount of DFSCache resources the JOB will need during its run;
Second step: the system allocating the corresponding DFSCache resources to each JOB and starting the JOB;
Third step: each JOB running for multiple rounds, during which it accesses the DFSCache repeatedly, so that the system collects the time intervals between each JOB's accesses to the DFSCache;
Fourth step: when the intervals at which a JOB accesses its cache resources stabilize, recording them and starting the DFSCache sharing allocation algorithm.
2. The cache sharing method based on job timing according to claim 1, characterized by further comprising:
Fifth step: while JOBs execute, the storage management system reading and writing JOB data according to the DFSCache access pattern of all running JOBs, and, if the storage management system determines that a JOB's data accesses exceed their time window, returning to the fourth step to restart the DFSCache sharing allocation algorithm.
3. The cache sharing method based on job timing according to claim 1 or 2, characterized by further comprising:
Sixth step: after a JOB completes, releasing the DFSCache resources occupied by that job.
4. The cache sharing method based on job timing according to claim 1 or 2, characterized in that the DFSCache sharing allocation algorithm comprises: establishing a table of the access intervals of all JOBs in the system, and determining whether there is a DFSCache with an idle period during which the JOB that owns it can share it with other JOBs; and, when such a DFSCache exists, judging whether its cache space has any capacity remaining.
5. The cache sharing method based on job timing according to claim 4, characterized in that the DFSCache sharing allocation algorithm further comprises: if the idle period and the remaining cache space of this DFSCache satisfy the requirements of a newly started JOB, allocating the idle period and the remaining cache space of this DFSCache directly to the newly started JOB.
6. The cache sharing method based on job timing according to claim 5, characterized in that the DFSCache sharing allocation algorithm further comprises: if the idle period and the remaining cache space of the DFSCache are insufficient to satisfy the requirements of the newly started JOB, letting the newly started JOB use the remaining cache space first and then allocating new resources to it.
7. The cache sharing method based on job timing according to claim 6, characterized in that the DFSCache sharing allocation algorithm further comprises: if the DFSCache has no cache space remaining, allocating a new DFSCache to the newly started JOB, which uses the newly allocated DFSCache exclusively until another JOB starts.
8. The cache sharing method based on job timing according to claim 1 or 2, characterized in that the time interval is regarded as stable when it is a constant value or exceeds a particular value.
CN201510830806.8A 2015-11-24 2015-11-24 A cache sharing method based on job timing Active CN105512185B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510830806.8A CN105512185B (en) 2015-11-24 2015-11-24 A cache sharing method based on job timing


Publications (2)

Publication Number Publication Date
CN105512185A 2016-04-20
CN105512185B CN105512185B (en) 2019-03-26

Family

ID=55720167

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510830806.8A Active CN105512185B (en) A cache sharing method based on job timing

Country Status (1)

Country Link
CN (1) CN105512185B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050165735A1 (en) * 2003-10-23 2005-07-28 Microsoft Corporation Persistent caching directory level support
CN101395586A * 2006-03-02 2009-03-25 NXP B.V. Method and apparatus for dynamic resizing of cache partitions based on the execution phase of tasks
CN102137125A * 2010-01-26 2011-07-27 Fudan University Method for processing cross task data in distributive network system
CN102207830A * 2011-05-27 2011-10-05 Hangzhou MacroSAN Technology Co., Ltd. Cache dynamic allocation management method and device
CN102231121A * 2011-07-25 2011-11-02 North China University of Technology Memory mapping-based rapid parallel extraction method for big data file
CN102546751A * 2011-12-06 2012-07-04 Huazhong University of Science and Technology Hierarchical metadata cache control method of distributed file system
CN103279429A * 2013-05-24 2013-09-04 Inspur Electronic Information Industry Co., Ltd. Application-aware distributed global shared cache partition method




Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant