CN109002543A - Method and apparatus for file storage - Google Patents
- Publication number
- CN109002543A (application CN201810819557.6A)
- Authority
- CN
- China
- Prior art keywords
- file
- caching
- stored
- access
- released
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention discloses a method and apparatus for file storage, relating to the field of computer technology. One specific embodiment of the method includes: obtaining a file to be stored and determining its size; when the size of the file to be stored does not exceed a preset threshold, saving the file to a cache; and determining the files to be released from the cache according to a cache replacement policy, merging the files to be released, and saving the merged file to a distributed file system. By storing files using both a cache and a distributed file system, this embodiment keeps file storage reliable while effectively reducing the memory usage of the manager node, reducing disk I/O, and improving the upload and read speed of files.
Description
Technical field
The present invention relates to the field of computer technology, and in particular to a method and apparatus for file storage.
Background
A running system generates many files, most of which are small files, for example files smaller than 50 MB. Take the financial system of an e-commerce platform as an example: the financial system performs a large number of reconciliation operations, and reconciliation statement files may come from many channels, such as WeChat, Tenpay, online banking, UnionPay, direct connections to domestic banks, and overseas banks. Accordingly, there are many reconciliation statement files, and most of them are small files. Storing a massive number of small files must both guarantee their reliability and take file access efficiency into account. At present, reconciliation statement files are mainly stored in the following ways:
1) stored on a single-point server;
2) stored in a cache;
3) stored in a distributed file system.
In the course of realizing the present invention, the inventors found that the prior art has at least the following problems:
1) when reconciliation statement files are stored on a single-point server, reliability is difficult to guarantee;
2) when reconciliation statement files are stored in a cache, the cache capacity requirement is high and reliability is difficult to guarantee;
3) when reconciliation statement files are stored in a distributed file system, reliability is guaranteed but small files are stored inefficiently. In a distributed file system, the manager (NameNode) maintains the mapping from file paths to data blocks and the mapping from data blocks to workers (DataNodes), monitors DataNode heartbeats, maintains the number of data block replicas, and so on. When a large number of small files are stored in the distributed file system, the NameNode consumes most of its memory performing these operations, which lowers storage efficiency and limits file access speed. Conversely, when small files are merged into one large file for storage, the access efficiency of the reconciliation statement files is reduced.
Summary of the invention
In view of this, embodiments of the present invention provide a method and apparatus for file storage that store files using both a cache and a distributed file system. While keeping file storage reliable, they effectively reduce the memory usage of the manager node, reduce disk I/O, and improve the upload and read speed of files.
To achieve the above object, according to one aspect of the embodiments of the present invention, a method for file storage is provided.
A method for file storage comprises: obtaining a file to be stored and determining the size of the file to be stored; when the size of the file to be stored does not exceed a preset threshold, saving the file to be stored to a cache; and determining files to be released in the cache according to a cache replacement policy, merging the files to be released, and saving the merged file to a distributed file system.
Optionally, the method further comprises: when the size of the file to be stored exceeds the preset threshold, saving the file to be stored directly to the distributed file system.
Optionally, before the files to be released in the cache are determined according to the cache replacement policy, the method further comprises: determining that the total amount of files in the cache has reached a predetermined threshold.
Optionally, determining the files to be released in the cache according to the cache replacement policy comprises: calculating the access heat of the files in the cache; and determining files whose access heat is less than a preset access heat threshold as the files to be released.
Optionally, the access heat is calculated from the access count of the file and the number of days since the file was generated.
According to another aspect of the embodiments of the present invention, an apparatus for file storage is provided.
An apparatus for file storage comprises: a size determination module, configured to obtain a file to be stored and determine the size of the file to be stored; a file cache module, configured to save the file to be stored to a cache when the size of the file to be stored does not exceed a preset threshold; and a cache release module, configured to determine files to be released in the cache according to a cache replacement policy, merge the files to be released, and save the merged file to a distributed file system.
Optionally, the file cache module is further configured to: when the size of the file to be stored exceeds the preset threshold, save the file to be stored directly to the distributed file system.
Optionally, the apparatus further comprises a total amount determination module, configured to: before the files to be released in the cache are determined according to the cache replacement policy, determine that the total amount of files in the cache has reached a predetermined threshold.
Optionally, the cache release module is further configured to: calculate the access heat of the files in the cache; and determine files whose access heat is less than a preset access heat threshold as the files to be released.
Optionally, the access heat is calculated from the access count of the file and the number of days since the file was generated.
According to yet another aspect of the embodiments of the present invention, an electronic device for file storage is provided.
An electronic device for file storage comprises: one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the file storage method provided by the embodiments of the present invention.
According to still another aspect of the embodiments of the present invention, a computer-readable medium is provided.
A computer-readable medium stores a computer program which, when executed by a processor, implements the file storage method provided by the embodiments of the present invention.
One embodiment of the above invention has the following advantages or beneficial effects: a file to be stored is obtained and its size is determined; when the size does not exceed a preset threshold, the file is saved to a cache; the files to be released in the cache are determined according to a cache replacement policy, merged, and saved to a distributed file system. This realizes file storage based on both a cache and a distributed file system. While keeping file storage reliable, it effectively reduces the memory usage of the manager node, reduces disk I/O, and improves the upload and read speed of files.
Further effects of the above optional implementations are described below in conjunction with specific embodiments.
Brief description of the drawings
The drawings are provided for a better understanding of the present invention and do not unduly limit it. In the drawings:
Fig. 1 is a schematic diagram of the main steps of the file storage method according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the file storage principle of an embodiment of the present invention;
Fig. 3 is a schematic diagram comparing file upload speed under the technical solution of the present invention with file upload speed under the prior art;
Fig. 4 is a schematic diagram of the main modules of the file storage apparatus according to an embodiment of the present invention;
Fig. 5 is an exemplary system architecture diagram to which embodiments of the present invention can be applied;
Fig. 6 is a schematic structural diagram of a computer system suitable for implementing a terminal device or server of an embodiment of the present invention.
Detailed description
Exemplary embodiments of the present invention are described below with reference to the drawings. They include various details of the embodiments to aid understanding, and these details should be regarded as merely exemplary. Those of ordinary skill in the art should therefore recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present invention. Likewise, for clarity and conciseness, descriptions of well-known functions and structures are omitted from the following description.
To solve the problems in the prior art, the present invention provides a storage method for massive numbers of small files. Files are stored using a cache (for example, one implemented with the distributed caching system memcache or a Redis database) combined with a distributed file system, which improves the access efficiency of massive small files while guaranteeing the reliability of file storage.
A large file can be stored directly in the distributed file system, which relieves pressure on cache capacity and guarantees the reliability of the file. A small file is first stored in the cache to improve its access efficiency; once a certain policy is satisfied, multiple small files are merged into one large file, which is then stored in the distributed file system. This guarantees the reliability of the files, reduces the memory overhead caused by storing small files directly in the distributed file system, and improves the storage efficiency of small files.
In general, accesses concentrate on recently generated files, so newer files are treated as hot files, and the small files among the hot files are held temporarily in the cache. Once the cache replacement policy is satisfied, the small files are merged into a large file and saved to the distributed file system. The basic idea of the cache replacement policy is to release the least recently used small files from the cache, merge them, and save the result to the distributed file system. After merging into a large file, only one store operation is needed to save the data to the distributed file system, which reduces disk I/O (Input/Output) and improves the cache hit rate and file access efficiency.
Fig. 1 is a schematic diagram of the main steps of the file storage method according to an embodiment of the present invention. As shown in Fig. 1, the file storage method of the embodiment mainly includes the following steps S101 to S103.
Step S101: obtain a file to be stored, and determine the size of the file to be stored;
Step S102: when the size of the file to be stored does not exceed a preset threshold, save the file to be stored to a cache;
Step S103: determine the files to be released in the cache according to a cache replacement policy, merge the files to be released, and save the merged file to a distributed file system.
According to one embodiment of the present invention, the file storage method may further include:
when the size of the file to be stored exceeds the preset threshold, saving the file to be stored directly to the distributed file system, which relieves pressure on cache capacity and guarantees the reliability of the file.
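The size check of steps S101 and S102, together with the large-file branch just described, can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `choose_destination` and the labels `"cache"` and `"dfs"` are hypothetical, and the 50 MB value is the example size this description gives for a small file.

```python
# Hypothetical sketch: route a file by size. All names are illustrative only.
SIZE_THRESHOLD_BYTES = 50 * 1024 * 1024  # example threshold (50 MB) from the description


def choose_destination(size_bytes: int, threshold: int = SIZE_THRESHOLD_BYTES) -> str:
    """Return "cache" when the size does not exceed the threshold, else "dfs"."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    return "cache" if size_bytes <= threshold else "dfs"


if __name__ == "__main__":
    print(choose_destination(3 * 1024 * 1024))    # a 3 MB file goes to the cache
    print(choose_destination(200 * 1024 * 1024))  # a 200 MB file goes to the DFS
```

A file exactly at the threshold is treated as small, matching the "does not exceed" wording of step S102.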
According to one embodiment of the present invention, before the files to be released in the cache are determined according to the cache replacement policy, it may also be determined that the total amount of files in the cache has reached a predetermined threshold. Files are then released only when the total amount of files in the cache reaches the predetermined threshold. This avoids failed saves or lost files caused by insufficient remaining cache space, and it also avoids releasing files from the cache too frequently, which reduces the number of operations.
In one embodiment of the present invention, determining the files to be released in the cache according to the cache replacement policy specifically includes:
first, calculating the access heat of the files in the cache;
then, determining files whose access heat is less than a preset access heat threshold as the files to be released.
By calculating the access heat of each file and releasing the files with lower access heat according to the configured access heat threshold, the cache keeps the newest hot files, which improves the access efficiency of hot files.
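The two steps above amount to a filter over the cached files. The sketch below assumes the access heat of each file has already been computed and is supplied as a dictionary; the function name `select_files_to_release` and the sample file names are hypothetical.

```python
from typing import Dict, List


def select_files_to_release(heat_by_file: Dict[str, float],
                            heat_threshold: float) -> List[str]:
    """Return the names of cached files whose access heat is strictly below the threshold."""
    return sorted(name for name, heat in heat_by_file.items() if heat < heat_threshold)


if __name__ == "__main__":
    cached = {"a.csv": 9.0, "b.csv": 2.5, "c.csv": 7.0}
    print(select_files_to_release(cached, heat_threshold=7.0))  # ['b.csv']
```

A file whose heat equals the threshold stays in the cache, since the description releases only files whose heat is less than the threshold.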
In one embodiment of the present invention, the access heat is calculated from the access count of the file and the number of days since the file was generated. Using both the access count and the number of days since generation as inputs lets the access heat reflect the typical pattern of file accesses.
Other calculation methods may also be used to compute the access heat of a file. In another embodiment of the present invention, if only the access count is used as the input, the ratio of a file's access count to the total access count of all files can be used as the heat of the file. Depending on the business scenario, different factors can be chosen flexibly to calculate the access heat, and the present invention does not limit this.
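The ratio-based alternative can be sketched as follows. The description does not spell out the denominator in detail, so this sketch assumes it is the sum of the access counts of all files currently tracked; the function name `heat_by_share` is hypothetical.

```python
from typing import Dict


def heat_by_share(access_counts: Dict[str, int]) -> Dict[str, float]:
    """Heat of each file = its access count / total access count of all files (assumption)."""
    total = sum(access_counts.values())
    if total == 0:
        # No accesses recorded yet: every file has zero heat.
        return {name: 0.0 for name in access_counts}
    return {name: count / total for name, count in access_counts.items()}


if __name__ == "__main__":
    print(heat_by_share({"a.csv": 1, "b.csv": 3}))  # {'a.csv': 0.25, 'b.csv': 0.75}
```

Unlike a heat value that also uses the file's age, this ratio ignores how old a file is, so it only suits scenarios where the access count alone is a good signal.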
The file storage method of the present invention is described below with reference to a specific embodiment.
Fig. 2 is a schematic diagram of the file storage principle of an embodiment of the present invention. Take the storage of reconciliation statement files of a financial system as an example. When a reconciliation statement file needs to be stored, its size is obtained first; the size is then used to judge whether the file is a small file, and the file is timestamped. When judging the size of the file, assume the preset threshold is 50 MB (it can be adjusted dynamically as system business changes); a file larger than 50 MB is then a large file, and a file no larger than 50 MB is a small file.
When the file to be stored is judged to be a large file, it is stored directly in the distributed file system. When the file to be stored is judged to be a small file, it is first stored in the cache so that users can read it at high speed. Then, when the total amount of files in the cache reaches a predetermined threshold (for example, 80% of the total cache capacity, adjustable as needed), the files to be released from the cache are determined according to the cache replacement policy, merged, and stored in the distributed file system. This frees the cache so that it can hold newer hot files, and it guarantees the reliability of the files released from the cache.
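The flow just described (route by size, cache small files, release and merge once the cache fills to a set fraction) can be modelled with a toy in-memory class. Everything here is a hypothetical sketch: `HybridStore`, its attributes, and the `pick_release` callback are illustrative names, plain dictionaries stand in for the real cache and distributed file system, and the "oldest two" policy in the demo is a placeholder rather than the patent's replacement policy.

```python
from typing import Callable, Dict, List


class HybridStore:
    """Toy model: `cache` and `dfs` are plain dicts standing in for real systems."""

    def __init__(self, capacity_bytes: int, size_threshold: int, fill_ratio: float = 0.8):
        self.capacity_bytes = capacity_bytes
        self.size_threshold = size_threshold
        self.fill_ratio = fill_ratio          # e.g. release once the cache is 80% full
        self.cache: Dict[str, bytes] = {}
        self.dfs: Dict[str, bytes] = {}

    def cached_bytes(self) -> int:
        return sum(len(data) for data in self.cache.values())

    def store(self, name: str, data: bytes,
              pick_release: Callable[[List[str]], List[str]]) -> str:
        if len(data) > self.size_threshold:
            self.dfs[name] = data             # large file: bypass the cache
            return "dfs"
        self.cache[name] = data               # small file: cache first
        if self.cached_bytes() >= self.capacity_bytes * self.fill_ratio:
            self._release(pick_release(list(self.cache)))
        return "cache"

    def _release(self, names: List[str]) -> None:
        if not names:
            return
        merged = b"".join(self.cache.pop(name) for name in names)
        self.dfs[f"merged-{len(self.dfs)}"] = merged   # one write for many small files


if __name__ == "__main__":
    store = HybridStore(capacity_bytes=100, size_threshold=50)
    oldest_two = lambda names: names[:2]      # placeholder policy, not the patent's
    for name in ("a", "b", "c"):
        store.store(name, b"x" * 30, oldest_two)
    print(sorted(store.cache), sorted(store.dfs))  # ['c'] ['merged-0']
```

The merged blob here carries no index, so individual files could not be read back from it; a real system would need to record where each file sits inside the merged file.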
In a financial system, a reconciliation statement file is accessed heavily on the day it is generated, and the access count drops sharply over time. The files to be released from the cache are determined as follows. First, the access heat of each file in the cache is calculated: whenever a file access request is received, the access count N of that file is incremented by 1, and then, using the number of days D since the file was generated, the access heat G = N/D is calculated to represent the heat of the file. Next, according to the access heat threshold set for the business scenario, files whose access heat is less than the threshold are released from the cache. Finally, the files released from the cache are merged into one file and stored in the distributed file system, which both ensures the fault tolerance of the files and reduces disk I/O.
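The release procedure above can be sketched as follows. The names `CachedFile`, `record_access`, `access_heat`, and `release_cold_files` are hypothetical. Two assumptions are labeled in the code: the generation day counts as day 1 (so D = 1 on the first day), and the merged file is accompanied by an offset index so that individual files can still be located, a detail the description does not specify.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Tuple


@dataclass
class CachedFile:
    data: bytes
    created: date
    access_count: int = 0            # N in the description


def record_access(f: CachedFile) -> None:
    """Increment N each time an access request for the file arrives."""
    f.access_count += 1


def access_heat(f: CachedFile, today: date) -> float:
    """G = N / D, where D counts the generation day as day 1 (assumption)."""
    days = max((today - f.created).days + 1, 1)
    return f.access_count / days


def release_cold_files(cache: Dict[str, CachedFile], today: date,
                       heat_threshold: float) -> Tuple[bytes, Dict[str, Tuple[int, int]]]:
    """Remove files with G below the threshold from the cache and merge them.

    Returns the merged bytes and an index of name -> (offset, length).
    The index is an illustrative assumption, not part of the description.
    """
    cold = [name for name, f in cache.items() if access_heat(f, today) < heat_threshold]
    merged = bytearray()
    index: Dict[str, Tuple[int, int]] = {}
    for name in cold:
        data = cache.pop(name).data
        index[name] = (len(merged), len(data))
        merged.extend(data)
    return bytes(merged), index


if __name__ == "__main__":
    cache = {
        "f1": CachedFile(b"aaa", date(2018, 7, 20), access_count=8),
        "f2": CachedFile(b"bbbb", date(2018, 7, 20), access_count=16),
    }
    blob, index = release_cold_files(cache, date(2018, 7, 23), heat_threshold=3.0)
    print(index, sorted(cache))  # {'f1': (0, 3)} ['f2']
```

In the demo, both files are on their fourth day, so their heats are 8/4 = 2 and 16/4 = 4; only the first falls below the threshold of 3 and is merged out of the cache.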
The access heat threshold depends on the business scenario. For example, suppose that among 1000 files, 30% have an access heat of about 3, 20% have an access heat of about 7, and 50% have an access heat of about 9. If files with an access heat below 7 are all rarely used, the access heat threshold can be set to 7, and files with an access heat below 7 are released from the cache. If only files with an access heat below 3 are rarely used, the access heat threshold can be set to 3, and files with an access heat below 3 are released from the cache.
The calculation of access heat is illustrated below with Tables 1 and 2. Table 1 shows, for three files (File 1, File 2, and File 3), the cumulative access count on each of the four days after generation.
Table 1
Day | File 1 | File 2 | File 3
First day | 1 | 5 | 10
Second day | 3 | 12 | 10
Third day | 6 | 15 | 10
Fourth day | 8 | 16 | 10
The access heat of each file, calculated with the formula G = N/D, is shown in Table 2.
Table 2
Day | File 1 | File 2 | File 3
First day | 1 | 5 | 10
Second day | 1.5 | 6 | 5
Third day | 2 | 5 | 3.3
Fourth day | 2 | 4 | 2.5
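The values in Table 2 can be reproduced from the cumulative counts in Table 1 with a few lines of code. This is a small check of the arithmetic; the names `CUMULATIVE_ACCESSES` and `heat_per_day` are hypothetical, and results are rounded to one decimal place as in the table.

```python
from typing import Dict, List

# Cumulative access counts per day, copied from Table 1.
CUMULATIVE_ACCESSES: Dict[str, List[int]] = {
    "File 1": [1, 3, 6, 8],
    "File 2": [5, 12, 15, 16],
    "File 3": [10, 10, 10, 10],
}


def heat_per_day(cumulative: List[int]) -> List[float]:
    """Apply G = N / D for D = 1, 2, ..., rounded to one decimal place."""
    return [round(n / day, 1) for day, n in enumerate(cumulative, start=1)]


if __name__ == "__main__":
    for name, counts in CUMULATIVE_ACCESSES.items():
        print(name, heat_per_day(counts))
    # File 1 [1.0, 1.5, 2.0, 2.0]
    # File 2 [5.0, 6.0, 5.0, 4.0]
    # File 3 [10.0, 5.0, 3.3, 2.5]
```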
As Tables 1 and 2 show, File 1 has 1 access by the first day, giving an access heat of 1; its cumulative access count by the second day is 3, giving an access heat of 1.5; the third and fourth days follow in the same way. Table 2 shows that the access heat of File 1 changes little. For File 2, the cumulative access count reaches 12 by the second day, with as many as 7 accesses on the second day alone; its second-day access heat in Table 2 is 6, which reflects that File 2 is accessed heavily on the second day. For File 3, the first-day access count reaches 10, but there are no accesses on the remaining three days, so its access heat gradually decreases.
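The per-day figures quoted above (for example, 7 accesses to File 2 on the second day) follow from differencing the cumulative counts in Table 1. A minimal sketch, with the hypothetical helper name `daily_from_cumulative`:

```python
from typing import List


def daily_from_cumulative(cumulative: List[int]) -> List[int]:
    """Convert cumulative access counts into per-day access counts."""
    if not cumulative:
        return []
    return [cumulative[0]] + [b - a for a, b in zip(cumulative, cumulative[1:])]


if __name__ == "__main__":
    print(daily_from_cumulative([5, 12, 15, 16]))   # File 2: [5, 7, 3, 1]
    print(daily_from_cumulative([10, 10, 10, 10]))  # File 3: [10, 0, 0, 0]
```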
Tables 1 and 2 show that the access heat calculated with the formula G = N/D is based on the access count and fully accounts for the pattern that a file is accessed heavily on the day it is generated and much less over time, so it better matches the actual access heat of the file.
Below, file storage using the technical solution of the present invention is compared with the prior-art approach of uploading files directly to the distributed file system.
1. Comparison of disk I/O when uploading files to the distributed file system
The same 1000 small files are uploaded to the distributed file system in two ways: directly, and using the cache + merge-upload storage method of the present invention. The disk I/O of the two ways is then compared. Assuming the physical block size of the disk is 512 bytes, the total numbers of blocks read (Blk_read) and blocks written (Blk_wrtn) are shown in Table 3.
Table 3
Upload method | Read | Write
Direct upload | 2.01M Blk_read | 4.12M Blk_wrtn
Cache + merge upload | 0.13M Blk_read | 2.07M Blk_wrtn
Here, "2.01M Blk_read" means that the number of physical blocks read from disk is 2.01M, where "M" means mega and 1M = 1024. The other totals of blocks read (Blk_read) and blocks written (Blk_wrtn) in Table 3 are read the same way.
Table 3 shows that holding small files temporarily in the cache and merging them before uploading to the distributed file system effectively reduces disk I/O, because holding small files in the cache avoids the disk writes that would otherwise occur before merging and the disk reads that would occur during merging.
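The relative reduction implied by Table 3 can be worked out directly from its four figures. The sketch below only does that arithmetic; the names `TABLE_3` and `reduction_percent` are hypothetical, and the block counts are used in the table's own units of M blocks.

```python
# Block counts from Table 3, in units of M blocks.
TABLE_3 = {
    "direct": {"read": 2.01, "write": 4.12},
    "cache_merge": {"read": 0.13, "write": 2.07},
}


def reduction_percent(before: float, after: float) -> float:
    """Percentage by which `after` is lower than `before`, to one decimal place."""
    return round((before - after) / before * 100, 1)


if __name__ == "__main__":
    direct, merged = TABLE_3["direct"], TABLE_3["cache_merge"]
    print(reduction_percent(direct["read"], merged["read"]))    # 93.5
    print(reduction_percent(direct["write"], merged["write"]))  # 49.8
    total_before = direct["read"] + direct["write"]
    total_after = merged["read"] + merged["write"]
    print(reduction_percent(total_before, total_after))         # 64.1
```

Reads fall far more than writes, which is consistent with the explanation that merging in the cache removes the read-back of small files from disk.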
2. Comparison of file upload speed
The same small files are uploaded to the distributed file system in two ways: directly, and using the cache + merge-upload storage method of the present invention. The upload speed of the two ways is then compared.
Fig. 3 is a schematic diagram comparing file upload speed under the technical solution of the present invention with file upload speed under the prior art. In Fig. 3, trend line 1 is the time-consumption curve for uploading files with the prior art; according to trend line 1, uploading small files directly to the distributed file system with the prior art takes 485.7 ms on average. Trend line 2 is the time-consumption curve for uploading files with the technical solution of the present invention; according to trend line 2, uploading these small files with the cache + merge-upload storage method of the present invention takes only 61.6 ms on average, a significant improvement in upload speed.
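The two averages read from Fig. 3 can be turned into a speed-up factor and a relative time saving. This is only arithmetic on the reported figures; the constant and function names are hypothetical.

```python
DIRECT_UPLOAD_MS = 485.7      # average from trend line 1 (prior art)
CACHE_MERGE_UPLOAD_MS = 61.6  # average from trend line 2 (cache + merge upload)


def speedup(before_ms: float, after_ms: float) -> float:
    """How many times faster the second approach is, to one decimal place."""
    return round(before_ms / after_ms, 1)


def time_saved_percent(before_ms: float, after_ms: float) -> float:
    """Share of the original upload time that is saved, to one decimal place."""
    return round((1 - after_ms / before_ms) * 100, 1)


if __name__ == "__main__":
    print(speedup(DIRECT_UPLOAD_MS, CACHE_MERGE_UPLOAD_MS))             # 7.9
    print(time_saved_percent(DIRECT_UPLOAD_MS, CACHE_MERGE_UPLOAD_MS))  # 87.3
```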
During upload, instead of having the distributed file system receive the file and write it directly to disk, the file is received by the cache, so the upload does not have to wait for the comparatively slow disk write, and upload speed improves markedly. Files are merged and uploaded to the distributed file system only when the cache reaches the preset threshold, which greatly reduces the number of slow disk writes and substantially improves upload speed.
3. Comparison of file retrieval speed
With the technical solution of the present invention, small files are first stored in the cache. Testing shows that retrieving a file from the distributed file system takes about 500 ms on average, while retrieving a file from the cache takes less than 100 ms on average, so retrieving a file from the cache is much faster than retrieving it from the distributed file system. After small files are released from the cache, merged, and saved to the distributed file system, retrieving them takes the same time as retrieving files directly from the distributed file system. Therefore, given that files are accessed more frequently when newly generated, the caching mechanism of the present invention greatly improves file retrieval speed when large numbers of files are read frequently.
4. Comparison of distributed file system memory usage
Before files are uploaded to the distributed file system, the memory usage of the manager (NameNode) node in the distributed file system is 6%. After 10,000 small files are uploaded with each approach: when the prior art uploads the files directly to the distributed file system, the memory usage of the NameNode node rises to 33%; when the cache + merge-upload storage method of the present invention is used, the memory usage of the NameNode node rises only to 12%. In a distributed file system, memory usage grows with the number of files, so merging files before storing them effectively reduces memory usage. In addition, the file merge operation is performed outside the NameNode node, which effectively avoids consuming the memory of the NameNode node.
The four comparisons above show that the cache + merge-upload storage method of the present invention keeps file storage reliable while effectively reducing the memory usage of the NameNode node, reducing disk I/O, and improving the upload and read speed of files.
Fig. 4 is a schematic diagram of the main modules of the file storage apparatus according to an embodiment of the present invention. As shown in Fig. 4, the file storage apparatus 400 of the embodiment mainly includes a size determination module 401, a file cache module 402, and a cache release module 403.
The size determination module 401 is configured to obtain a file to be stored and determine the size of the file to be stored;
the file cache module 402 is configured to save the file to be stored to a cache when the size of the file to be stored does not exceed a preset threshold;
the cache release module 403 is configured to determine the files to be released in the cache according to a cache replacement policy, merge the files to be released, and save the merged file to a distributed file system.
According to one embodiment of the present invention, the file cache module 402 may be further configured to:
when the size of the file to be stored exceeds the preset threshold, save the file to be stored directly to the distributed file system.
According to one embodiment of the present invention, the file storage apparatus 400 may further include a total amount determination module (not shown in the figure), configured to:
before the files to be released in the cache are determined according to the cache replacement policy, determine that the total amount of files in the cache has reached a predetermined threshold.
According to one embodiment of the present invention, the cache release module 403 may be further configured to:
calculate the access heat of the files in the cache;
determine files whose access heat is less than a preset access heat threshold as the files to be released.
According to the technical solution of the embodiment of the present invention, the access heat is calculated from the access count of the file and the number of days since the file was generated.
According to the technical solution of the embodiment of the present invention, a file to be stored is obtained and its size is determined; when the size does not exceed a preset threshold, the file is saved to a cache; the files to be released in the cache are determined according to a cache replacement policy, merged, and saved to a distributed file system. This realizes file storage based on both a cache and a distributed file system. While keeping file storage reliable, it effectively reduces the memory usage of the manager node, reduces disk I/O, and improves the upload and read speed of files.
Fig. 5 shows an exemplary system architecture 500 to which the file storage method or file storage apparatus of the embodiments of the present invention can be applied.
As shown in Fig. 5, the system architecture 500 may include terminal devices 501, 502, and 503, a network 504, and a server 505. The network 504 is the medium that provides communication links between the terminal devices 501, 502, 503 and the server 505. The network 504 may include various connection types, such as wired or wireless communication links or fiber optic cables.
Users may use the terminal devices 501, 502, 503 to interact with the server 505 through the network 504 to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 501, 502, 503, such as shopping applications, web browsers, search applications, instant messaging tools, email clients, and social platform software (examples only).
The terminal devices 501, 502, 503 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, laptop computers, and desktop computers.
The server 505 may be a server that provides various services, for example a back-end management server (example only) that supports shopping websites browsed by users on the terminal devices 501, 502, 503. The back-end management server may analyze and otherwise process received data such as information query requests, and feed the processing results (for example, targeted push information or product information; examples only) back to the terminal devices.
It should be noted that the file storage method provided by the embodiments of the present invention is generally executed by the server 505; accordingly, the file storage apparatus is generally arranged in the server 505.
It should be understood that the numbers of terminal devices, networks, and servers in Fig. 5 are merely illustrative. There may be any number of terminal devices, networks, and servers, as required by the implementation.
Referring now to Fig. 6, it shows a schematic structural diagram of a computer system 600 suitable for implementing a terminal device or server of an embodiment of the present invention. The terminal device or server shown in Fig. 6 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present invention.
As shown in Fig. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processing according to a program stored in read-only memory (ROM) 602 or a program loaded from a storage section 608 into random access memory (RAM) 603. The RAM 603 also stores the various programs and data required for the operation of the system 600. The CPU 601, ROM 602, and RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input section 606 including a keyboard, a mouse, and the like; an output section 607 including a cathode ray tube (CRT) or liquid crystal display (LCD), a speaker, and the like; a storage section 608 including a hard disk and the like; and a communication section 609 including a network interface card such as a LAN card or a modem. The communication section 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a magnetic disk, optical disc, magneto-optical disk, or semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read from it can be installed into the storage section 608 as needed.
In particular, according to the disclosed embodiments of the present invention, the process described above with
reference to the flowchart may be implemented as a computer software program. For example, the disclosed embodiments
include a computer program product comprising a computer program carried on a computer-readable medium, the computer
program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer
program may be downloaded and installed from a network through the communication section 609, and/or installed from
the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the
above-described functions defined in the system of the present invention are performed.
It should be noted that the computer-readable medium described in the present invention may be a computer-readable
signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium
may be, for example but without limitation, an electrical, magnetic, optical, electromagnetic, infrared, or
semiconductor system, apparatus, or device, or any combination of the above. More specific examples of
computer-readable storage media include, but are not limited to: an electrical connection having one or more wires,
a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable
programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory
(CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the
present invention, a computer-readable storage medium may be any tangible medium that contains or stores a program,
where the program can be used by or in connection with an instruction execution system, apparatus, or device. Also in
the present invention, a computer-readable signal medium may include a data signal propagated in baseband or as part
of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take various
forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the
above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable
storage medium, and such a medium can send, propagate, or transmit a program for use by or in connection with an
instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be
transmitted over any suitable medium, including but not limited to wireless, wire, optical cable, RF, and the like,
or any suitable combination of the above.
The flowcharts and block diagrams in the drawings illustrate the architecture, functions, and operations of possible
implementations of systems, methods, and computer program products according to various embodiments of the present
invention. In this regard, each box in a flowchart or block diagram may represent a module, a program segment, or a
portion of code, and that module, program segment, or portion of code contains one or more executable instructions
for implementing the specified logical function. It should also be noted that, in some alternative implementations,
the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two
boxes shown in succession may in fact be executed substantially in parallel, and they may sometimes be executed in
the reverse order, depending on the functions involved. It should also be noted that each box in a block diagram or
flowchart, and any combination of boxes in a block diagram or flowchart, may be implemented by a dedicated
hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware
and computer instructions.
The units or modules described in the embodiments of the present invention may be implemented in software or in
hardware. The described units or modules may also be provided in a processor, which may be described, for example, as
follows: a processor includes a size determination module, a file cache module, and a cache release module. The names
of these units or modules do not, in some cases, limit the units or modules themselves. For example, the size
determination module may also be described as "a module for obtaining a file to be stored and determining the size of the file to be stored".
In another aspect, the present invention also provides a computer-readable medium. The computer-readable medium may
be included in the device described in the above embodiments, or it may exist separately without being assembled into
that device. The computer-readable medium carries one or more programs which, when executed by the device, cause the
device to: obtain a file to be stored and determine the size of the file to be stored; when the size of the file to
be stored does not exceed a preset threshold, save the file to be stored in a cache; and determine the files to be
released in the cache according to a cache replacement policy, merge the files to be released, and save the merged result in a distributed file system.
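The size-based routing in the steps above can be sketched as follows. This is a minimal illustration, not the patented implementation: the class name `FileStore`, the in-memory dictionaries standing in for the cache and the distributed file system, and the 50 MB default are all assumptions made for the example (the text only speaks of a "preset threshold").

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical default; the source only refers to a "preset threshold".
SMALL_FILE_THRESHOLD = 50 * 1024 * 1024  # 50 MB


@dataclass
class FileStore:
    """Toy model: dictionaries stand in for the cache and the distributed file system."""
    threshold: int = SMALL_FILE_THRESHOLD
    cache: Dict[str, bytes] = field(default_factory=dict)
    dfs: Dict[str, bytes] = field(default_factory=dict)

    def store(self, name: str, data: bytes) -> str:
        """Route a file by size and report where it was placed."""
        size = len(data)  # step 1: determine the size of the file to be stored
        if size <= self.threshold:
            # step 2: a file no larger than the threshold goes to the cache
            self.cache[name] = data
            return "cache"
        # a file above the threshold is saved directly to the distributed file system
        self.dfs[name] = data
        return "dfs"
```

A file whose size equals the threshold is treated as small here, matching the wording "does not exceed".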
According to the technical solution of the embodiments of the present invention, a file to be stored is obtained and
its size is determined; when the size of the file to be stored does not exceed a preset threshold, the file is saved
in a cache; and the files to be released in the cache are determined according to a cache replacement policy, merged,
and saved in a distributed file system. This realizes a file storage scheme based on a cache and a distributed file
system which, while keeping file storage reliable, effectively reduces the memory usage of the management node,
reduces disk I/O, and improves the upload speed and read speed of files.
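The merging step can be illustrated with a short sketch. This passage does not specify the merged format, so the layout below (one concatenated blob plus an offset index) is an assumption chosen for simplicity, and the function names `merge_files` and `read_from_merged` are hypothetical.

```python
from typing import Dict, Tuple

# Index entry: (offset of the file inside the merged blob, length in bytes)
Index = Dict[str, Tuple[int, int]]


def merge_files(files: Dict[str, bytes]) -> Tuple[bytes, Index]:
    """Concatenate released files into one blob and record where each one starts."""
    blob = bytearray()
    index: Index = {}
    for name, data in files.items():
        index[name] = (len(blob), len(data))
        blob.extend(data)
    return bytes(blob), index


def read_from_merged(blob: bytes, index: Index, name: str) -> bytes:
    """Recover a single original file from the merged blob."""
    offset, length = index[name]
    return blob[offset:offset + length]
```

Storing one merged object in place of many small ones is what would lower the number of entries the management node has to track; the index is needed so that individual files remain readable after merging.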
The specific embodiments described above do not limit the scope of protection of the present invention. Those skilled
in the art should understand that various modifications, combinations, sub-combinations, and substitutions may occur
depending on design requirements and other factors. Any modification, equivalent substitution, or improvement made
within the spirit and principles of the present invention shall be included within the scope of protection of the
present invention.
Claims (12)
1. A file storage method, characterized by comprising:
obtaining a file to be stored, and determining the size of the file to be stored;
when the size of the file to be stored does not exceed a preset threshold, saving the file to be stored in a cache;
determining the files to be released in the cache according to a cache replacement policy, merging the files to be
released, and saving the merged result in a distributed file system.
2. The method according to claim 1, characterized by further comprising:
when the size of the file to be stored exceeds the preset threshold, saving the file to be stored directly in the
distributed file system.
3. The method according to claim 1, characterized in that, before determining the files to be released in the cache
according to the cache replacement policy, the method further comprises:
determining that the total amount of files in the cache has reached a predetermined threshold.
4. The method according to claim 1, characterized in that determining the files to be released in the cache
according to the cache replacement policy comprises:
calculating the access heat of the files in the cache;
determining the files whose access heat is less than a preset access heat threshold as the files to be released.
5. The method according to claim 4, characterized in that the access heat is calculated from the access count of a
file and the number of days since the file was generated.
6. A file storage apparatus, characterized by comprising:
a size determination module, configured to obtain a file to be stored and determine the size of the file to be stored;
a file cache module, configured to save the file to be stored in a cache when the size of the file to be stored
does not exceed a preset threshold;
a cache release module, configured to determine the files to be released in the cache according to a cache
replacement policy, merge the files to be released, and save the merged result in a distributed file system.
7. The apparatus according to claim 6, characterized in that the file cache module is further configured to:
when the size of the file to be stored exceeds the preset threshold, save the file to be stored directly in the
distributed file system.
8. The apparatus according to claim 6, characterized by further comprising a total amount determination module, configured to:
before the files to be released in the cache are determined according to the cache replacement policy, determine
that the total amount of files in the cache has reached a predetermined threshold.
9. The apparatus according to claim 6, characterized in that the cache release module is further configured to:
calculate the access heat of the files in the cache;
determine the files whose access heat is less than a preset access heat threshold as the files to be released.
10. The apparatus according to claim 9, characterized in that the access heat is calculated from the access count
of a file and the number of days since the file was generated.
11. An electronic device for file storage, characterized by comprising:
one or more processors;
a storage device, configured to store one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors
implement the method according to any one of claims 1 to 5.
12. A computer-readable medium on which a computer program is stored, characterized in that the program, when
executed by a processor, implements the method according to any one of claims 1 to 5.
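The release selection of claims 3 to 5 can be sketched as follows. The claims state only that the access heat is calculated from a file's access count and the number of days since it was generated; the specific formula used here (accesses per day, counting the creation day as day 1) is an assumption for illustration, as are the names `CachedFile`, `access_heat`, and `select_files_to_release`, and the reading of "total amount of files" as total bytes.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class CachedFile:
    name: str
    size: int          # bytes
    access_count: int  # number of accesses recorded so far
    created: date      # day the file was generated


def access_heat(f: CachedFile, today: date) -> float:
    """Assumed formula: accesses per day of existence (creation day counts as day 1)."""
    days = (today - f.created).days + 1
    return f.access_count / days


def select_files_to_release(files: List[CachedFile], today: date,
                            total_threshold: int,
                            heat_threshold: float) -> List[CachedFile]:
    """Return the cold files, but only once the cache total has reached its threshold."""
    total = sum(f.size for f in files)
    if total < total_threshold:
        return []  # claim 3: nothing is released until the total reaches the threshold
    # claim 4: files whose heat is below the preset heat threshold are released
    return [f for f in files if access_heat(f, today) < heat_threshold]
```

Under this assumed formula, an old file with few accesses has low heat and is released first, while a recently created, frequently read file stays in the cache.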
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810819557.6A CN109002543A (en) | 2018-07-24 | 2018-07-24 | A kind of method and apparatus of file storage |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109002543A (en) | 2018-12-14 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20181214 |