CN106547911A - Access method and system for massive numbers of small files - Google Patents
- Publication number
- CN106547911A (application CN201611054238.8A)
- Authority
- CN
- China
- Prior art keywords
- feature value
- business file
- directory structure
- storage
- data code
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/13—File access structures, e.g. distributed indices
Abstract
The present invention relates to an access method and system for massive numbers of small files. The method comprises the following steps: compressing the business files corresponding to the acquired business data to obtain a compressed file package; extracting a feature value of the business data according to a preset extraction rule; when the feature value is not stored in the file index store, encoding the feature value according to a preset encoding rule to obtain a data code, and storing the correspondence between the feature value and the data code in the file index store; generating a corresponding directory structure according to the data code; decompressing the compressed file package to obtain the business files; storing the business files under the storage path of the directory structure; and reading the business files. The beneficial effects of the invention are: massive numbers of small files are stored and read efficiently, system integration is simple, batch data is written and read efficiently, and the system can be scaled elastically in several dimensions.
Description
Technical field
The present invention relates to the field of data management, and in particular to an access method and system for massive numbers of small files.
Background
When an existing data storage management system contains both relational and non-relational data, the non-relational data is usually stored on network attached storage (Network Attached Storage, NAS) and the relational data in a relational database; the file paths and file names on the NAS are also stored in the relational database.
When the number of files becomes very large, data access with this approach becomes slower and slower. Extending the NAS requires creating a new file system, so the storage cannot be extended smoothly and storage space is wasted. If another system needs to access the file data in the storage, it must go through a dedicated interface provided by this system; if it cannot supply the necessary parameters, it cannot use the file resources.
In addition, the number of files in the storage generally corresponds to the data paths and file names recorded in the database. Once the data volume reaches the order of hundreds of millions, the read performance of the storage drops sharply, which severely affects file access efficiency.
Summary of the invention
The technical problem to be solved by the present invention is to address the deficiencies of the prior art by providing a high-performance method and system for storing and reading massive numbers of small files.
The technical solution by which the present invention solves the above technical problem is as follows:
An access method for massive numbers of small files, comprising the following steps:
Step 1, compressing the business files corresponding to the acquired business data to obtain a compressed file package;
Step 2, extracting the feature value of the business data according to a preset extraction rule;
Step 3, when the feature value is not stored in the file index store, encoding the feature value according to a preset encoding rule to obtain a data code, and storing the correspondence between the feature value and the data code in the file index store;
Step 4, generating a corresponding directory structure according to the data code;
Step 5, decompressing the compressed file package to obtain the business files;
Step 6, storing the business files under the storage path of the directory structure;
Step 7, reading the business files.
The beneficial effects of the present invention are as follows: by extracting the feature value of the business data and generating a data code corresponding to that feature value, the invention establishes a mapping between feature value and data code, and thereby establishes and stores a mapping between business data and file storage. A corresponding directory structure is generated from the data code and the business files are stored under that directory structure, so that massive numbers of small files can be stored and read efficiently. The invention has the advantages that system integration is simple, batch data is written and read efficiently, and the system can be scaled elastically in several dimensions.
On the basis of the above technical solution, the present invention can also be improved as follows.
Further, before step 5, the method also includes:
Step 8, when the feature value is already stored in the file index store, obtaining, according to the feature value, the existing directory structure corresponding to the data code.
Further, the extraction rule is: selecting typical fields in the business data and concatenating them in order into a character string, which is used as the feature value of the business data.
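The extraction rule can be sketched as a small function that concatenates selected fields in a fixed order. This is a minimal illustration only: the function name and the field names (`country`, `doc_number`, `doc_type`, `pub_date`) are hypothetical, and the sample values are taken from the patent-document example given later in the description.

```python
from typing import Mapping, Sequence


def build_feature_value(record: Mapping[str, str], fields: Sequence[str]) -> str:
    """Concatenate the selected fields, in the given order, into one string."""
    missing = [name for name in fields if name not in record]
    if missing:
        raise KeyError(f"record is missing fields: {missing}")
    return "".join(record[name] for name in fields)


# Hypothetical field names; the order of this tuple is the concatenation order.
PATENT_FIELDS = ("country", "doc_number", "doc_type", "pub_date")

record = {
    "country": "CN",
    "doc_number": "3092258",
    "doc_type": "D0",
    "pub_date": "19981202",
    "title": "not a selected field, so it is ignored",
}
feature_value = build_feature_value(record, PATENT_FIELDS)
print(feature_value)
```

Keeping the field list short and fixed means the same business record always yields the same feature value, which is what allows it to serve as a lookup key in the file index store.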
Further, step 6 includes:
Step 6.1, acquiring the file entity and directory structure of the business file, and storing them to a data node; when a business file already exists under the directory structure, overwriting the existing business file;
Step 6.2, generating the metadata of the business file according to its file entity and directory structure;
Step 6.3, acquiring the metadata and storing it in a metadata node.
The beneficial effects of this further scheme are: by storing the file entity and directory structure of the acquired business file to a data node, and storing the generated metadata to a metadata node, massive numbers of small files can be stored efficiently and batch data can be written efficiently; the available storage space can be increased by adding nodes, without changing the existing file system.
Further, step 7 comprises the following steps:
Step 7.1, acquiring an HTTP request and directly reading the file entity of the business file; if what is acquired is the feature value, performing step 7.2;
Step 7.2, matching the data code according to the feature value;
Step 7.3, obtaining, according to the data code, the directory structure under which the business file is stored;
Step 7.4, appending the directory of the business file to the directory structure;
Step 7.5, adding the HTTP access address to the directory structure to obtain an HTTP request;
Step 7.6, performing load balancing on the HTTP request;
Step 7.7, performing a security check on the load-balanced HTTP request;
Step 7.8, reading the business file according to the HTTP request that has passed the security check.
The beneficial effects of this further scheme are: by matching the feature value to the data code, the storage path of a file can be obtained quickly; by generating an HTTP request, the stored file can be accessed directly, which makes reading files simple and efficient. At the same time, load balancing is applied according to concurrency requirements, access control is applied to each read service, and files are read in read-only mode, so that the system as a whole achieves read/write separation. Writing, storage and reading can be adjusted flexibly according to the read/write ratio of the data, to meet different application requirements.
Further, the metadata nodes and the data nodes can each be scaled elastically and independently, providing high-performance read/write capability. The read module and the write module can use the storage module as if it were a local disk. Increasing the number of metadata nodes raises the number of files and folders that can be written, and increasing the number of data nodes raises the available space; the system can be extended without changing the existing file system.
Another technical solution by which the present invention solves the above technical problem is as follows:
An access system for massive numbers of small files, including a write client, a write module, a storage module and a read module connected in sequence, wherein:
The write client is used to compress the business files corresponding to the acquired business data to obtain a compressed file package;
The write module is used to extract the feature value of the business data according to a preset extraction rule; when the feature value is not stored in the file index store, to encode the feature value according to a preset encoding rule to obtain a data code, store the correspondence between the feature value and the data code in the file index store, and generate a corresponding directory structure according to the data code; when the feature value is already stored in the file index store, to obtain, according to the feature value, the existing directory structure corresponding to the data code; and to decompress the compressed file package to obtain the business files;
The storage module is used to store the business files under the storage path of the directory structure;
The read module is used to read the business files.
Further, the extraction rule is: selecting typical fields in the business data and concatenating them in order into a character string, which is used as the feature value of the business data.
Further, the storage module includes a metadata node, a data node, a unified storage management console and a storage client agent, wherein:
The metadata node is used to store the metadata of the business files; the metadata includes file attributes, file storage locations, file fragmentation information and the like;
The data node is used to store the business files and, when a business file already exists under the directory structure, to overwrite the existing business file;
The unified storage management console is connected to the metadata node, the data node and the storage client agent, and is used to control and manage them;
The storage client agent is connected to the read module and to the write module, and is used to transfer the business files.
Further, the read module is specifically used to: acquire an HTTP request and directly read the file entity of the business file according to that request; or acquire the feature value of the business data to be read, match the data code according to the feature value, obtain from the data code the directory structure under which the business file is stored, append the directory of the business file to the directory structure, add the HTTP access address to the directory structure to obtain an HTTP request, and read the business file according to that HTTP request.
Further, the access system also includes a read distribution module, which includes:
A load balancing unit, connected to the read module, used to perform load balancing on the HTTP request;
An access control unit, connected to the load balancing unit and to the storage module, used to perform a security check on the load-balanced HTTP request.
Further, the metadata nodes and the data nodes can each be scaled elastically and independently, providing high-performance read/write capability. The read module and the write module can use the storage module as if it were a local disk. Increasing the number of metadata nodes raises the number of files and folders that can be written, and increasing the number of data nodes raises the available space; the system can be extended without changing the existing file system.
Additional aspects and advantages of the present invention will be set forth in part in the following description, and in part will become apparent from that description or be learned through practice of the invention.
Description of the drawings
Fig. 1 is a schematic flowchart of an access method for massive numbers of small files provided by an embodiment of the present invention;
Fig. 2 is a structural block diagram of an access system for massive numbers of small files provided by another embodiment of the present invention;
Fig. 3 shows a write method for massive numbers of small files provided by another embodiment of the present invention;
Fig. 4 shows a storage method for massive numbers of small files provided by another embodiment of the present invention;
Fig. 5 shows a read method for massive numbers of small files provided by another embodiment of the present invention.
Detailed description
The principles and features of the present invention are described below with reference to the accompanying drawings. The examples serve only to explain the invention and are not intended to limit its scope.
Fig. 1 is a schematic flowchart of an access method for massive numbers of small files provided by an embodiment of the present invention. The access method comprises the following steps:
S110, compressing the business files corresponding to the acquired business data to obtain a compressed file package;
S120, extracting the feature value of the business data according to a preset extraction rule;
S130, when the feature value is not stored in the file index store, encoding the feature value according to a preset encoding rule to obtain a data code, and storing the correspondence between the feature value and the data code in the file index store;
S140, generating a corresponding directory structure according to the data code;
S150, decompressing the compressed file package to obtain the business files;
S160, storing the business files under the storage path of the directory structure;
S170, reading the business files.
The access method provided in the above embodiment extracts the feature value of the business data and generates a data code corresponding to that feature value, establishing a mapping between feature value and data code and thereby establishing and storing a mapping between business data and file storage. A corresponding directory structure is generated from the data code and the business files are stored under that directory structure, so that massive numbers of small files can be stored and read efficiently. The method has the advantages that system integration is simple, batch data is written and read efficiently, and the system can be scaled elastically in several dimensions.
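The compression in S110 and the decompression in S150 form a simple round trip. The sketch below illustrates this with ZIP archives held in memory; the patent does not name a compression format, so the choice of ZIP and the function names are assumptions made for illustration.

```python
import io
import zipfile
from typing import Dict


def compress_files(files: Dict[str, bytes]) -> bytes:
    """Pack named business files into one in-memory ZIP package (as in S110)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, payload in files.items():
            archive.writestr(name, payload)
    return buffer.getvalue()


def decompress_package(package: bytes) -> Dict[str, bytes]:
    """Unpack the package back into its individual files (as in S150)."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}
```

Sending one package instead of many small files reduces the number of transfers between the write client and the write module, which is one plausible reason for compressing before transmission.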
In another embodiment, Fig. 2 shows a structural block diagram of an access system for massive numbers of small files provided by another embodiment of the present invention. The system includes a write client 210, a write module 220, a storage module 230, a read module 240 and a read distribution module 250 connected in sequence, wherein:
The write client 210 includes:
Client API 211 (Application Programming Interface), used to acquire business data and the corresponding business files, and to compress the business files to obtain a compressed package;
FTP connection pool 212 (File Transfer Protocol), connected to the client API 211 and to the FTP unit 222 in the write module 220, used to acquire the compressed package of business files and send it to the FTP unit 222;
Redis connection pool 213 (Redis, a storage system), connected to the client API 211 and to the Redis file index store 221 in the write module 220, used to acquire the business data and send it to the Redis file index store 221.
The write module 220 includes:
Redis file index store 221, connected to the storage client agent 234 in the storage module 230, used to extract the feature value of the business data according to a preset extraction rule; when the feature value is not stored in the Redis file index store 221, to encode the feature value according to a preset encoding rule to obtain a data code, store the correspondence between the feature value and the data code in the Redis file index store 221, and generate a corresponding directory structure according to the data code; and when the feature value is already stored in the Redis file index store 221, to obtain, according to the feature value, the existing directory structure corresponding to the data code;
FTP unit 222, connected to the decompression unit 223, used to send the compressed package of business files to the corresponding directory structure according to the index result of the Redis file index store 221;
Decompression unit 223, connected to the storage client agent 234, used to decompress the compressed package to obtain the business files.
The storage module 230 includes:
Metadata node 231, used to store the metadata of the business files; the metadata includes file attributes, file storage locations, file fragmentation information and the like;
Data node 232, used to store the business files and, when a business file already exists under the directory structure, to overwrite the existing business file;
Unified storage management console 233, connected to the metadata node 231, the data node 232 and the storage client agent 234, used to control and manage them, and to manage in a unified way the storage hardware resources, running status and parameter configuration of the entire storage module 230;
Storage client agent 234, connected to the read module 240 and to the write module 220, used to transfer the business files.
The read module 240 includes:
HTTP client 241, connected to the load balancing unit 251 in the read distribution module 250, used by the server side to read business files according to HTTP requests;
Browser 242, connected to the load balancing unit 251 in the read distribution module 250, used by the client side to read business files according to HTTP requests.
The read distribution module 250 includes:
Load balancing unit 251, connected to multiple access control units 252, used to perform load balancing on HTTP requests;
Access control unit 252, connected to the storage client agent 234, used to perform a security check, implemented in Lua, on the load-balanced HTTP requests, so that HTTP requests read files in read-only mode and the system as a whole achieves read/write separation.
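The interplay of the load balancing unit and the access control units can be sketched as follows. The description does not state the balancing algorithm or the exact rule of the security check (which it says is implemented in Lua), so the round-robin policy, the rule of allowing only GET and HEAD, and all names below are assumptions chosen to illustrate read-only access.

```python
import itertools
from typing import Iterator, Sequence

# Assumption: "read-only" is modelled as allowing only these HTTP methods.
READ_ONLY_METHODS = frozenset({"GET", "HEAD"})


def is_read_only(method: str) -> bool:
    """Return True when the HTTP method cannot modify stored files."""
    return method.upper() in READ_ONLY_METHODS


class RoundRobinBalancer:
    """Hands requests to access control units in rotation (assumed policy)."""

    def __init__(self, units: Sequence[str]) -> None:
        if not units:
            raise ValueError("at least one access control unit is required")
        self._cycle: Iterator[str] = itertools.cycle(units)

    def pick(self) -> str:
        return next(self._cycle)


def dispatch(balancer: RoundRobinBalancer, method: str, path: str) -> str:
    """Pick a unit, then reject any request that is not read-only."""
    unit = balancer.pick()
    if not is_read_only(method):
        raise PermissionError(f"{method} is not allowed on the read path")
    return f"{unit} serves {path}"
```

Because the read path rejects every modifying request, all writes have to go through the write module, which is the read/write separation the description refers to.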
Further, the extraction rule is: selecting typical fields in the business data and concatenating them in order into a character string, which is used as the feature value of the business data.
Further, the metadata nodes 231 and the data nodes 232 can each be scaled elastically and independently, providing high-performance read/write capability. The read module 240 and the write module 220 can use the storage module 230 as if it were a local disk. Increasing the number of metadata nodes 231 raises the number of files and folders that can be written, and increasing the number of data nodes 232 raises the available space; the system can be extended without changing the existing file system.
Further, the read module 240 can also acquire the feature value of the business data to be read, match the data code according to the feature value, obtain from the data code the directory structure under which the business file is stored, append the directory of the business file to the directory structure, and add the HTTP access address to the directory structure to obtain an HTTP request.
In another embodiment, Fig. 3 shows a write method for massive numbers of small files provided by another embodiment of the present invention. The write method comprises the following steps:
S301, compressing the business files corresponding to the acquired business data to obtain a compressed file package;
S302, extracting the feature value of the business data according to a preset extraction rule, that is, selecting typical fields and concatenating them in a fixed order into a character string that serves as the business feature value of the data: business feature value = field 1 + field 2 + ... + field n, where n should not be too large. For example, from the bibliographic items of a patent document, the country, document number, document type and publication date can be chosen, which together uniquely identify a patent document, i.e.: feature value of a patent document = country + document number + document type + publication date;
S303, when the feature value is not stored in the file index store, performing step S304; otherwise performing S305;
S304, encoding the feature value according to a preset encoding rule to obtain a data code, storing the correspondence between the feature value and the data code in the file index store, and generating a corresponding directory structure according to the data code. Before business files are stored, the encoding rule for the data code and the corresponding directory rule on the storage must be designed, taking into account the directory depth and how evenly the stored data is dispersed, so that files sit at a reasonable depth in the storage and each directory holds a reasonable number of folders or files. This lets the storage perform at its best and avoids extreme data distributions that would hit storage parameter limits, for example the maximum number of files that can be written in a single folder, or the maximum directory depth;
S305, obtaining, according to the feature value, the existing directory structure corresponding to the data code;
S306, decompressing the compressed file package to obtain the business files;
S307, storing the business files under the storage path of the directory structure.
Further, the extraction rule is: selecting typical fields in the business data and concatenating them in order into a character string, which is used as the feature value of the business data.
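The branch in S303 to S305 amounts to a get-or-create lookup in the file index store. The sketch below uses a plain dictionary as a stand-in for the Redis file index store, and `toy_encode` is a purely hypothetical placeholder, since the patent does not disclose its preset encoding rule.

```python
import hashlib
from typing import Callable, MutableMapping, Tuple


def toy_encode(feature_value: str) -> str:
    """Hypothetical stand-in for the undisclosed preset encoding rule."""
    return hashlib.sha1(feature_value.encode("utf-8")).hexdigest()[:16].upper()


def get_or_create_code(
    index: MutableMapping[str, str],
    feature_value: str,
    encode: Callable[[str], str] = toy_encode,
) -> Tuple[str, bool]:
    """Return (data_code, created); encode only when the feature value is new."""
    existing = index.get(feature_value)
    if existing is not None:
        return existing, False      # S305: reuse the code already in the index
    code = encode(feature_value)    # S304: apply the encoding rule
    index[feature_value] = code     # S304: store feature value -> data code
    return code, True
```

With several concurrent writers, the check and the store would need to happen atomically; with a real Redis store, a set-if-absent operation would be one way to achieve that, though the patent does not say how this case is handled.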
In another embodiment, Fig. 4 shows a storage method for massive numbers of small files provided by another embodiment of the present invention. The storage method comprises the following steps:
S401, acquiring the file entity and directory structure of the business file;
S402, storing the file entity and directory structure of the business file to a data node;
S403, in the data node, if a business file already exists under the storage directory, performing S404; otherwise, performing S405;
S404, storing the business file under the corresponding directory structure of the data node, overwriting the existing business file;
S405, storing the business file under the corresponding directory structure of the data node;
S406, generating the metadata of the business file according to its file entity and directory structure;
S407, acquiring the metadata and storing it in a metadata node.
The storage method provided in the above embodiment stores the file entity and directory structure of the acquired business file to a data node and stores the generated metadata to a metadata node, so that massive numbers of small files can be stored efficiently and batch data can be written efficiently.
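Steps S401 to S407 can be illustrated on a single machine, with a local directory standing in for a data node and a dictionary standing in for a metadata node. This is only a sketch under those assumptions; the function name and the metadata fields recorded here are hypothetical.

```python
import os
import tempfile
from typing import Dict


def store_business_file(
    data_root: str,
    directory: str,
    file_name: str,
    payload: bytes,
    metadata_store: Dict[str, dict],
) -> str:
    """Write the file under data_root/directory, replacing any existing copy,
    then record simple metadata for it. Returns the path that was written."""
    target_dir = os.path.join(data_root, directory.strip("/"))
    os.makedirs(target_dir, exist_ok=True)          # create the directory structure
    target = os.path.join(target_dir, file_name)
    overwritten = os.path.exists(target)            # S403: is a file already there?
    with open(target, "wb") as handle:              # S404/S405: "wb" replaces old content
        handle.write(payload)
    key = f"/{directory.strip('/')}/{file_name}"
    metadata_store[key] = {                         # S406/S407: generate and keep metadata
        "size": len(payload),
        "location": target,
        "overwritten": overwritten,
    }
    return target


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        meta: Dict[str, dict] = {}
        store_business_file(root, "/CND0/1998/1202/", "QLYQS.HTML", b"v1", meta)
        store_business_file(root, "/CND0/1998/1202/", "QLYQS.HTML", b"v2", meta)
        print(meta)
```

Keeping metadata separate from file content mirrors the split between metadata nodes and data nodes, which is what lets the two be scaled independently.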
In another embodiment, Fig. 5 shows a read method for massive numbers of small files provided by another embodiment of the present invention. Taking a patent document as an example, the claims of the document are read; the claims are stored as QLYQS.HTML. Assume that the feature value of patent publication data is: country + document number + document type + publication date. For the lapsed patent with application number CN97324977.3, the country is China (CN), the document number is 3092258, the document type is design (D0) and the publication date is 1998.12.02; the access address of the file reading service is http://file.patent.com/H. The read method then comprises the following steps:
S501, acquiring the feature value CN3092258D019981202;
S502, matching the data code according to the feature value; the data code corresponding to the feature value is CND019981202000000000030920FCE16JOL00127;
S503, splitting the data code by position to obtain the directory structure, which is /CND0/1998/1202/00000000003092/0FCE16JOL00127/;
S504, appending the claims to be read (/QLYQS/QLYQS.HTML) to the directory structure, obtaining CND0/1998/1202/00000000003092/0FCE16JOL00127/QLYQS/QLYQS.HTML;
S505, adding the HTTP access address to the directory structure to obtain the HTTP request: http://file.patent.com/H/CND0/1998/1202/00000000003092/0FCE16JOL00127/QLYQS/QLYQS.HTML;
S506, performing load balancing on the HTTP request;
S507, performing a security check on the load-balanced HTTP request;
S508, reading the business file according to the HTTP request that has passed the security check.
The above embodiment allows file resources in the storage system to be accessed through simple rules. Accessing file resources is simple and convenient, and the accessing system has no dependency on the file storage system, which improves the adaptability of the file storage system.
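Steps S503 to S505 can be sketched as two small string functions. The segment widths (4, 4, 4, 14, 14) are inferred from the single worked example above and are an assumption, not a rule stated in the patent; the function names are likewise hypothetical.

```python
from typing import Sequence

# Assumption: widths inferred from the one example data code and its directory.
SEGMENT_WIDTHS = (4, 4, 4, 14, 14)


def code_to_directory(code: str, widths: Sequence[int] = SEGMENT_WIDTHS) -> str:
    """Split a data code by position into a directory path (as in S503)."""
    if len(code) != sum(widths):
        raise ValueError(f"expected {sum(widths)} characters, got {len(code)}")
    parts = []
    position = 0
    for width in widths:
        parts.append(code[position:position + width])
        position += width
    return "/" + "/".join(parts) + "/"


def build_read_url(base_url: str, directory: str, relative_path: str) -> str:
    """Join access address, directory structure and file path (as in S504-S505)."""
    return (
        base_url.rstrip("/")
        + "/" + directory.strip("/")
        + "/" + relative_path.lstrip("/")
    )
```

Because the URL is derived purely from the data code and the relative file path, a reading system only needs the index lookup and these rules; it does not need any interface of the storage system itself.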
In the several embodiments provided in this application, it should be understood that the disclosed system and method can be implemented in other ways. For example, the system embodiment described above is only illustrative: the division into modules is only a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple units or components can be combined or integrated into another system, or some features can be ignored or not performed.
The foregoing is only the preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.
Claims (10)
1. An access method for massive numbers of small files, characterized by comprising the following steps:
Step 1, compressing the business files corresponding to the acquired business data to obtain a compressed file package;
Step 2, extracting the feature value of the business data according to a preset extraction rule;
Step 3, when the feature value is not stored in the file index store, encoding the feature value according to a preset encoding rule to obtain a data code, and storing the correspondence between the feature value and the data code in the file index store;
Step 4, generating a corresponding directory structure according to the data code;
Step 5, decompressing the compressed file package to obtain the business files;
Step 6, storing the business files under the storage path of the directory structure;
Step 7, reading the business files.
2. The access method according to claim 1, characterized in that, before step 5, the method also includes:
Step 8, when the feature value is already stored in the file index store, obtaining, according to the feature value, the existing directory structure corresponding to the data code.
3. The access method according to claim 2, characterized in that the extraction rule is: selecting typical fields in the business data and concatenating them in order into a character string, which is used as the feature value of the business data.
4. The access method according to any one of claims 1 to 3, characterized in that step 6 includes:
Step 6.1, acquiring the file entity and directory structure of the business file, and storing them to a data node; when a business file already exists under the directory structure, overwriting the existing business file;
Step 6.2, generating the metadata of the business file according to its file entity and directory structure;
Step 6.3, acquiring the metadata and storing it in a metadata node.
5. The access method according to claim 4, characterized in that step 7 comprises the following steps:
Step 7.1, acquiring an HTTP request and directly reading the file entity of the business file; if what is acquired is the feature value, performing step 7.2;
Step 7.2, matching the data code according to the feature value;
Step 7.3, obtaining, according to the data code, the directory structure under which the business file is stored;
Step 7.4, appending the directory of the business file to the directory structure;
Step 7.5, adding the HTTP access address to the directory structure to obtain an HTTP request;
Step 7.6, performing load balancing on the HTTP request;
Step 7.7, performing a security check on the load-balanced HTTP request;
Step 7.8, reading the business file according to the HTTP request that has passed the security check.
6. An access system for mass small documents, characterized by comprising a write client, a writing module, a memory module and a read module connected in sequence, wherein:
The write client is used to compress the service scripts corresponding to the obtained business data, obtaining a compressed file package;
The writing module is used to extract the eigenvalue of the business data according to a preset extraction rule; when the eigenvalue is not yet stored in the file index storehouse, to encode the eigenvalue according to a preset encoding rule to obtain a data encoding, store the correspondence between the eigenvalue and the data encoding in the file index storehouse, and generate the corresponding bibliographic structure according to the data encoding; when the eigenvalue is already stored in the file index storehouse, to obtain, according to the eigenvalue, the existing bibliographic structure corresponding to the data encoding; and to decompress the compressed file package to obtain the service scripts;
The memory module is used to store the service scripts under the store path of the bibliographic structure;
The read module is used to read the service scripts.
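A minimal sketch of the write client and writing module described in claim 6 follows. It assumes ZIP as the compression format and a zero-padded sequence number as the preset encoding rule; neither is stated in the claim, and the names `compress_files`, `decompress_package` and `FileIndex` are hypothetical.

```python
import io
import zipfile


def compress_files(files):
    """Write client: pack {name: bytes} into an in-memory ZIP package."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, content in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


def decompress_package(package):
    """Writing module: unpack the package back into {name: bytes}."""
    with zipfile.ZipFile(io.BytesIO(package)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}


class FileIndex:
    """Hypothetical file index mapping eigenvalues to data encodings."""

    def __init__(self):
        self._codes = {}

    def directory_for(self, eigenvalue):
        code = self._codes.get(eigenvalue)
        if code is None:
            # New eigenvalue: encode it (assumed rule: six-digit sequence
            # number) and record the correspondence in the index.
            code = f"{len(self._codes) + 1:06d}"
            self._codes[eigenvalue] = code
        # Known or new, the directory is derived from the data encoding.
        return "/".join(code[i:i + 2] for i in range(0, len(code), 2))
```

Because the directory depends only on the stored encoding, files that share an eigenvalue always land in the same directory, which is what lets the read side locate them from the eigenvalue alone.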
7. The access system according to claim 6, characterized in that the extraction rule is: select typical fields in the business data, splice them in order into a character string, and use the character string as the eigenvalue of the business data.
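The extraction rule of claim 7 can be illustrated with a few lines. The record layout, the field names in the usage example and the `|` separator are assumptions for this sketch; the claim only says that the selected fields are spliced in order.

```python
def extract_eigenvalue(record, fields, separator="|"):
    # Splice the selected fields, in the given order, into one string.
    # The separator is an assumption; the claim does not name one.
    return separator.join(str(record[field]) for field in fields)
```

For example, `extract_eigenvalue({"customer": "C001", "date": "2016-11-25", "amount": 10}, ["customer", "date"])` yields `"C001|2016-11-25"`. Using a separator avoids two different field combinations collapsing into the same string.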
8. The access system according to claim 7, characterized in that the memory module comprises a metadata node, a data node, a unified storage management control station and a storage Client Agent, wherein:
The metadata node is used to store the metadata of the service scripts;
The data node is used to store the service scripts and, when service scripts already exist under the bibliographic structure, to overwrite the existing service scripts;
The unified storage management control station is connected to the metadata node, the data node and the storage Client Agent respectively, and is used to control and manage the metadata node, the data node and the storage Client Agent;
The storage Client Agent is connected to the read module and the writing module respectively, and is used to transmit the service scripts.
9. The access system according to any one of claims 6 to 8, characterized in that the read module is specifically used to: obtain an HTTP request and directly read the document entity of the service scripts according to the HTTP request; or obtain the eigenvalue of the business data to be read, match the data encoding according to the eigenvalue, obtain according to the data encoding the bibliographic structure under which the service scripts are stored, add the catalogue of the service scripts to the bibliographic structure, add the HTTP access address to the bibliographic structure to obtain an HTTP request, and read the service scripts according to the HTTP request.
10. The access system according to claim 9, characterized in that the access system further comprises a read distribution module, the read distribution module comprising:
A Load Balance Unit, connected to the read module, for performing load balancing on the HTTP request;
An access control unit, connected to the Load Balance Unit and the memory module respectively, for performing a safety inspection on the load-balanced HTTP request.
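The read distribution module of claim 10 could look roughly like the sketch below. The claim fixes only the order (load balancing first, then the safety inspection); the round-robin policy, the path-prefix check and every name here (`RoundRobinBalancer`, `passes_safety_inspection`, `dispatch`, the `/files` root) are assumptions chosen to keep the example short.

```python
import posixpath
from itertools import cycle


class RoundRobinBalancer:
    """Load Balance Unit stand-in: hands out nodes in rotation."""

    def __init__(self, nodes):
        self._nodes = cycle(nodes)

    def assign(self):
        return next(self._nodes)


def passes_safety_inspection(request_path, allowed_root="/files"):
    # Access control stand-in: normalise the path, then require that it
    # still lies under the allowed root (rejects "../" traversal).
    normalized = posixpath.normpath(request_path)
    return normalized.startswith(allowed_root.rstrip("/") + "/")


def dispatch(balancer, request_path):
    node = balancer.assign()                        # load balancing first
    if not passes_safety_inspection(request_path):  # then the safety inspection
        raise PermissionError(f"rejected: {request_path}")
    return node, request_path
```

A real deployment would more likely delegate both roles to an HTTP reverse proxy; the sketch only makes the ordering of the two units explicit.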
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611054238.8A CN106547911B (en) | 2016-11-25 | 2016-11-25 | Access method and system for massive small files |
Publications (2)
Publication Number | Publication Date |
---|---|
CN106547911A true CN106547911A (en) | 2017-03-29 |
CN106547911B CN106547911B (en) | 2020-07-10 |
Family
ID=58395134
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611054238.8A Active CN106547911B (en) | 2016-11-25 | 2016-11-25 | Access method and system for massive small files |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106547911B (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107423321A (en) * | 2017-03-31 | 2017-12-01 | 上海斐讯数据通信技术有限公司 | It is applicable the method and its device of high-volume small documents cloud storage |
CN107463606A (en) * | 2017-06-22 | 2017-12-12 | 浙江力石科技股份有限公司 | A kind of data compression engine and method for big data storage system |
CN107918654A (en) * | 2017-11-16 | 2018-04-17 | 联想(北京)有限公司 | File decompression method, apparatus and electronic equipment |
CN111666257A (en) * | 2020-06-03 | 2020-09-15 | 中国建设银行股份有限公司 | File fragment storage method, device, equipment and storage medium |
CN111753518A (en) * | 2020-08-12 | 2020-10-09 | 深圳潮数软件科技有限公司 | Autonomous file consistency checking method |
CN111752954A (en) * | 2020-06-29 | 2020-10-09 | 深圳前海微众银行股份有限公司 | Large-scale feature data storage method and device |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101193089A (en) * | 2006-11-20 | 2008-06-04 | 阿里巴巴公司 | Stateful session system and its realization method |
CN101226546A (en) * | 2008-02-01 | 2008-07-23 | 华为技术有限公司 | Method for organizing and searching document |
CN101398869A (en) * | 2008-10-07 | 2009-04-01 | 深圳市蓝韵实业有限公司 | Mass data storage means |
CN101547161A (en) * | 2008-03-28 | 2009-09-30 | 阿里巴巴集团控股有限公司 | Folder transmission system, folder transmission device and folder transmission method |
CN103905414A (en) * | 2013-03-22 | 2014-07-02 | 哈尔滨安天科技股份有限公司 | Storage method and system based on updating package information |
CN104820717A (en) * | 2015-05-22 | 2015-08-05 | 国网智能电网研究院 | Massive small file storage and management method and system |
CN104978330A (en) * | 2014-04-04 | 2015-10-14 | 西南大学 | Data storage method and device |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107423321A (en) * | 2017-03-31 | 2017-12-01 | 上海斐讯数据通信技术有限公司 | It is applicable the method and its device of high-volume small documents cloud storage |
CN107423321B (en) * | 2017-03-31 | 2020-12-01 | 北京亿智云科技有限公司 | Method and device suitable for cloud storage of large-batch small files |
CN107463606A (en) * | 2017-06-22 | 2017-12-12 | 浙江力石科技股份有限公司 | A kind of data compression engine and method for big data storage system |
CN107463606B (en) * | 2017-06-22 | 2020-11-13 | 浙江力石科技股份有限公司 | Data compression engine and method for big data storage system |
CN107918654A (en) * | 2017-11-16 | 2018-04-17 | 联想(北京)有限公司 | File decompression method, apparatus and electronic equipment |
CN107918654B (en) * | 2017-11-16 | 2020-07-24 | 联想(北京)有限公司 | File decompression method and device and electronic equipment |
CN111666257A (en) * | 2020-06-03 | 2020-09-15 | 中国建设银行股份有限公司 | File fragment storage method, device, equipment and storage medium |
CN111666257B (en) * | 2020-06-03 | 2024-03-19 | 中国建设银行股份有限公司 | Method, device, equipment and storage medium for file fragment storage |
CN111752954A (en) * | 2020-06-29 | 2020-10-09 | 深圳前海微众银行股份有限公司 | Large-scale feature data storage method and device |
CN111753518A (en) * | 2020-08-12 | 2020-10-09 | 深圳潮数软件科技有限公司 | Autonomous file consistency checking method |
Also Published As
Publication number | Publication date |
---|---|
CN106547911B (en) | 2020-07-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106547911A (en) | A kind of access method and system of mass small documents | |
CN111259006B (en) | Universal distributed heterogeneous data integrated physical aggregation, organization, release and service method and system | |
CN101009516B (en) | A method, system and device for data synchronization | |
US7487191B2 (en) | Method and system for model-based replication of data | |
CN106469158B (en) | Method of data synchronization and device | |
CN104573068A (en) | Information processing method based on megadata | |
CN102804168A (en) | Data Compression For Reducing Storage Requirements In A Database System | |
US9535966B1 (en) | Techniques for aggregating data from multiple sources | |
CN104699718A (en) | Method and device for rapidly introducing business data | |
CN107451237B (en) | Serialization and deserialization method, device and equipment | |
CN103150402A (en) | Index-code-based virtual file system, establishment method and access method | |
CN106970929A (en) | Data lead-in method and device | |
KR20120106544A (en) | Method for accessing files of a file system according to metadata and device implementing the method | |
US9600597B2 (en) | Processing structured documents stored in a database | |
CN109766085A (en) | A kind of method and device handling enumeration type code | |
CN104035993A (en) | Memory search method for e-books, e-book management system and reading system | |
CN108536617A (en) | Buffer memory management method, medium, system and electronic equipment | |
CN106528896A (en) | Database optimization method and apparatus | |
CN106570153A (en) | Data extraction method and system for mass URLs | |
CN104408084B (en) | A kind of big data screening technique and device | |
KR20080014737A (en) | Method and system for mapping between components of a packaging model and features of a physical representation of a package | |
CN110347673A (en) | Data file loading method, device, computer equipment and storage medium | |
CN102624894A (en) | Method and system for depacketize and message analysis | |
CN106570152A (en) | Mobile phone number volume extracting method and system | |
CN108243207A (en) | A kind of date storage method of network cloud disk |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP01 | Change in the name or title of a patent holder | Address after: 100190 17-19 / F, building a 1, 66 Zhongguancun East Road, Haidian District, Beijing; Patentee after: New Great Wall Technology Co.,Ltd.; Address before: 100190 17-19 / F, building a 1, 66 Zhongguancun East Road, Haidian District, Beijing; Patentee before: GREAT WALL COMPUTER SOFTWARE & SYSTEMS Inc. |