CN110263061A - Data query method and system - Google Patents
- Publication number
- CN110263061A
- Authority
- CN
- China
- Prior art keywords
- data
- file
- partitioned
- business data
- server
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
- G06F16/24552—Database cache management
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2471—Distributed queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
Abstract
Embodiments of the invention provide a data query method and system in which business data is stored on a local disk and the corresponding index information is cached in local memory. A data partition server first partitions the big data into multiple partition files and distributes each partition file to a corresponding distributed server for storage, and each distributed server generates index information for the partition files it stores. When a data query request is subsequently received, the storage location of the requested data on the local disk is located quickly from the index information in local memory. This avoids occupying excessive local memory, which lowers the cost of caching data, while still allowing the data to be read quickly from the local disk, which improves the response efficiency of data query requests. In addition, the data is transmitted to the network card interface using zero-copy technology, which reduces the number of data copies and further improves data query efficiency for the client.
Description
Technical field
The present invention relates to the technical field of data query, and in particular to a data query method and system.
Background
At present, data caching is commonly used in the prior art to improve data query efficiency.
Existing data caching modes include memory caching, disk caching, and hybrid memory-disk caching. When the data volume is small, any one of these modes improves query efficiency, and queries remain fast.
However, in the big data field the data to be cached is often large, even reaching the PB level, and continuing to rely on the above caching techniques to improve query efficiency has a number of drawbacks.
With memory caching, a large volume of cached data occupies a large amount of memory, which leads to high machine resource costs. With disk caching, query efficiency degrades severely once a large amount of data has been written to disk. With hybrid memory-disk caching, query efficiency is also seriously affected whenever the requested data is on disk rather than in memory.
It follows that, in existing data query processes, caching data in memory incurs a high storage cost, while caching data on disk makes reads slow. Existing approaches therefore cannot balance data caching cost against data read efficiency.
Summary of the invention
The purpose of the embodiments of the present application is to provide a data query method and system that avoid occupying excessive local memory, thereby lowering the cost of caching data, while still allowing the corresponding data to be read quickly from the local disk, thereby improving the response efficiency of data query requests. In addition, data is transmitted to the network card interface using zero-copy technology, which reduces the number of data copies and further improves data query efficiency for the client.
To solve the above technical problem, the embodiments of the present application are implemented as follows:
An embodiment of the present application provides a data query method, comprising: receiving a query request from a client for target business data, wherein the query request carries a data identifier of the target business data, the query request is sent by the client according to a target device identifier returned by a data partition server, and the target device identifier is the identification information of the distributed server, selected by the data partition server from among multiple distributed servers according to the data identifier, on which the target business data is located;
querying, according to the data identifier, storage location information of the target business data in index file information stored in local memory, wherein the index file information comprises a correspondence between data identifiers of business data and storage location information;
reading the target business data from a local disk based on the storage location information corresponding to the target business data, wherein the local disk stores a data partition file that contains multiple pieces of business data and was allocated in advance by the data partition server; and
transmitting the read target business data to a preset network card interface using zero-copy technology, so that the target business data is sent to the client through the network card interface.
An embodiment of the present application provides a data query system, comprising: a client and multiple distributed servers;
the client is configured to send a query request for target business data to a distributed server and to receive the target business data returned by the distributed server, wherein the query request is sent by the client according to a target device identifier returned by a data partition server, and the target device identifier is the identification information of the distributed server, selected by the data partition server from among the multiple distributed servers according to a data identifier of the target business data, on which the target business data is located;
the distributed server is configured to receive the query request; and to query, according to the data identifier, storage location information of the target business data in index file information stored in local memory, wherein the index file information comprises a correspondence between data identifiers of business data and storage location information; and
to read the target business data from a local disk based on the storage location information corresponding to the target business data, wherein the local disk stores a data partition file that contains multiple pieces of business data and was allocated in advance by the data partition server; and
to transmit the read target business data to a preset network card interface using zero-copy technology, so that the target business data is sent to the client through the network card interface.
An embodiment of the present application provides a data query device, comprising: a processor; and
a memory arranged to store computer-executable instructions which, when executed, cause the processor to implement the following flow:
receiving a query request from a client for target business data, wherein the query request carries a data identifier of the target business data, the query request is sent by the client according to a target device identifier returned by a data partition server, and the target device identifier is the identification information of the distributed server, selected by the data partition server from among multiple distributed servers according to the data identifier, on which the target business data is located;
querying, according to the data identifier, storage location information of the target business data in index file information stored in local memory, wherein the index file information comprises a correspondence between data identifiers of business data and storage location information;
reading the target business data from a local disk based on the storage location information corresponding to the target business data, wherein the local disk stores a data partition file that contains multiple pieces of business data and was allocated in advance by the data partition server; and
transmitting the read target business data to a preset network card interface using zero-copy technology, so that the target business data is sent to the client through the network card interface.
An embodiment of the present application provides a storage medium for storing computer-executable instructions which, when executed, implement the following flow:
receiving a query request from a client for target business data, wherein the query request carries a data identifier of the target business data, the query request is sent by the client according to a target device identifier returned by a data partition server, and the target device identifier is the identification information of the distributed server, selected by the data partition server from among multiple distributed servers according to the data identifier, on which the target business data is located;
querying, according to the data identifier, storage location information of the target business data in index file information stored in local memory, wherein the index file information comprises a correspondence between data identifiers of business data and storage location information;
reading the target business data from a local disk based on the storage location information corresponding to the target business data, wherein the local disk stores a data partition file that contains multiple pieces of business data and was allocated in advance by the data partition server; and
transmitting the read target business data to a preset network card interface using zero-copy technology, so that the target business data is sent to the client through the network card interface.
In the data query method and system of the embodiments of the present application, a query request from a client for target business data is received; the storage location information of the target business data is queried, according to the data identifier, in the index file information stored in local memory; the target business data is read from the local disk based on that storage location information; and the read target business data is transmitted to a preset network card interface using zero-copy technology, so that it is sent to the client through the network card interface. Business data is stored on the local disk and the corresponding index information is cached in local memory. A data partition server first partitions the big data into multiple partition files and distributes each partition file to a corresponding distributed server for storage, and each distributed server generates index information for the partition files it stores. When a data query request is subsequently received, the storage location of the requested data on the local disk is located quickly from the index information in local memory. This avoids occupying excessive local memory, which lowers the cost of caching data, while still allowing the data to be read quickly from the local disk, which improves the response efficiency of data query requests. Transmitting the data to the network card interface with zero-copy technology also reduces the number of data copies and further improves data query efficiency for the client.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some of the embodiments recorded in the present application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of an application scenario of data query processing provided by an embodiment of the present application;
Fig. 2 is a first schematic flowchart of the data query method provided by an embodiment of the present application;
Fig. 3 is a second schematic flowchart of the data query method provided by an embodiment of the present application;
Fig. 4 is a diagram of the principle for generating data partition files in the data query method provided by an embodiment of the present application;
Fig. 5 is a diagram of the principle for generating index file information in the data query method provided by an embodiment of the present application;
Fig. 6 is a schematic diagram of a specific implementation process of the data query method provided by an embodiment of the present application;
Fig. 7 is a first schematic diagram of the module composition of the data query system provided by an embodiment of the present application;
Fig. 8 is a second schematic diagram of the module composition of the data query system provided by an embodiment of the present application;
Fig. 9 is a schematic structural diagram of the data query device provided by an embodiment of the present application.
Detailed description
To help those skilled in the art better understand the technical solutions of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in those embodiments. The described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present application without creative effort shall fall within the scope of protection of the present application.
The embodiments of the present application provide a data query method and system in which business data is stored on a local disk and the corresponding index information is cached in local memory. A data partition server first partitions the big data into multiple partition files and distributes each partition file to a corresponding distributed server for storage, and each distributed server generates index information for the partition files it stores. When a data query request is subsequently received, the storage location of the requested data on the local disk is located quickly from the index information in local memory. This avoids occupying excessive local memory, which lowers the cost of caching data, while still allowing the data to be read quickly from the local disk, which improves the response efficiency of data query requests. Transmitting the data to the network card interface with zero-copy technology also reduces the number of data copies and further improves data query efficiency for the client.
Fig. 1 is a schematic diagram of an application scenario of the data query system provided by one or more embodiments of this specification. As shown in Fig. 1, the system includes a data partition server, multiple distributed servers in a distributed data storage system, and a client. The client may be a mobile terminal such as a smartphone or tablet computer. The data partition server partitions the big data into multiple partition files and distributes each partition file to a corresponding distributed server for storage. The big data may be on the order of millions of records; one or more embodiments of this specification are aimed mainly at processing massive data. There are multiple distributed servers, and each may be a server that receives a data partition file sent by the data partition server, stores it on its local disk, and generates the corresponding index information for the stored partition file. The detailed process of the data query method is as follows:
(1) The data partition server uses offline computing technology and the partitioning algorithm provided by its partition module to partition the big data to be processed into multiple data partition files, wherein each data partition file carries file identification information and each piece of file identification information corresponds to the device identifier of one distributed server; according to the device identifier of the distributed server, the data partition server sends each data partition file carrying file identification information to the corresponding distributed server;
(2) The distributed server receives the data partition file sent by the data partition server and stores it on its local disk; according to the storage location information on the local disk of each piece of business data contained in the data partition file, it generates the index file information of the data partition file and stores the index file information in local memory;
(3) The client sends a query request carrying the data identifier of the target business data to the data partition server; after receiving the query request, the data partition server returns to the client, according to the data identifier sent by the client, the device identifier of the distributed server on which the target business data is located; after receiving the device identifier, the client sends a data query request to the distributed server corresponding to that device identifier;
(4) After receiving the client's query request for the target business data, the distributed server looks up, in the index file information stored in local memory, the storage location information corresponding to the identification information of the target business data, reads the target business data from the local disk, and transmits it to a preset network card interface using zero-copy technology, so that the target business data is sent to the client through the network card interface.
In the embodiments of the present application, business data is stored on the local disk and the corresponding index information is cached in local memory. A data partition server first partitions the big data into multiple partition files and distributes each partition file to a corresponding distributed server for storage, and each distributed server generates index information for the partition files it stores. When a data query request is subsequently received, the storage location of the requested data on the local disk is located quickly from the index information in local memory. This avoids occupying excessive local memory, which lowers the cost of caching data, while still allowing the data to be read quickly from the local disk, which improves the response efficiency of data query requests. Transmitting the data to the network card interface with zero-copy technology also reduces the number of data copies and further improves data query efficiency for the client.
Fig. 2 is a first schematic flowchart of the data query method provided by an embodiment of the present application. The method in Fig. 2 can be executed by a distributed server in Fig. 1. As shown in Fig. 2, the method includes at least the following steps:
S201: the distributed server receives a query request from a client for target business data, wherein the query request carries the data identifier of the target business data, the query request is sent by the client according to a target device identifier returned by the data partition server, and the target device identifier is the identification information of the distributed server, selected by the data partition server from among multiple distributed servers according to the data identifier, on which the target business data is located;
There may be multiple distributed servers, and the business data stored on each distributed server can reach the TB level, so a cluster built from multiple distributed servers can readily bring the cached data to the PB level. The business data is stored on the disk of a distributed server by copying it there with the operating system's built-in file copy function, rather than by caching it in the memory of the distributed server, which reduces the cost of caching data to a certain extent.
Specifically, the client sends a query request carrying the data identifier of the target business data to the data partition server. After receiving the query request, the data partition server returns to the client, according to the data identifier sent by the client, the device identifier of the distributed server on which the target business data is located. After receiving the device identifier, the client sends a data query request to the distributed server corresponding to that device identifier. The identification information of the target business data corresponds to the distributed server that has the target device identifier and on which the target business data is located. Once the client has found the target distributed server holding the target business data, it issues the query request for the target business data to that server. Based on the received query request, the target distributed server finally returns the requested target business data from its disk to the client.
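The two-step routing described above (ask the data partition server for a device identifier, then query the server it names) can be sketched as follows. This is a minimal in-memory illustration, not the patented implementation: the class names, the CRC32-modulo routing rule, and the in-memory record store are all assumptions made for the example, since the text does not specify how the partition server maps a data identifier to a device identifier.

```python
import zlib
from typing import Dict, List, Optional


class DistributedServer:
    """Holds business data for one device identifier (kept in a dict for brevity)."""

    def __init__(self, device_id: str) -> None:
        self.device_id = device_id
        self.records: Dict[str, bytes] = {}

    def query(self, key: str) -> Optional[bytes]:
        return self.records.get(key)


class PartitionServer:
    """Maps a data identifier to the device identifier of the server holding it."""

    def __init__(self, device_ids: List[str]) -> None:
        self.device_ids = device_ids

    def locate(self, key: str) -> str:
        # Assumption: route by CRC32 of the key modulo the number of servers.
        slot = zlib.crc32(key.encode("utf-8")) % len(self.device_ids)
        return self.device_ids[slot]


class Client:
    def __init__(self, partition_server: PartitionServer,
                 servers: Dict[str, DistributedServer]) -> None:
        self.partition_server = partition_server
        self.servers = servers

    def query(self, key: str) -> Optional[bytes]:
        # Step 1: obtain the target device identifier from the partition server.
        device_id = self.partition_server.locate(key)
        # Step 2: send the query to the distributed server with that identifier.
        return self.servers[device_id].query(key)
```

Because the same routing rule is used when data is placed and when it is queried, the client never needs to contact more than one distributed server per request.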
S202: in the index file information stored in local memory, query the storage location information of the target business data according to the data identifier, wherein the index file information comprises a correspondence between data identifiers of business data and storage location information;
Specifically, before receiving the client's query request for the target business data, the distributed server receives the data partition file, sent by the data partition server, that contains the target business data, and stores the received data partition file on its local disk. An index module reads the data partition file containing the target business data from the local disk, generates index file information according to the storage location information on the local disk of the data body corresponding to the data identifier of the target business data, and caches the index file information in the local memory of the distributed server.
The data identifier may be the data identifier Key corresponding to the target business data, obtained by processing with offline technologies such as Spark or MapReduce, or the newer Flink. When the distributed server receives the client's query request for the target business data, it queries, according to the data identifier Key carried in the query request, the index file information stored in its local memory for the storage location information on the local disk of the data body of the target business data.
S203: read the target business data from the local disk based on the storage location information corresponding to the target business data, wherein the local disk stores a data partition file that contains multiple pieces of business data and was allocated in advance by the data partition server;
Specifically, before receiving the client's query request for the target business data, the distributed server receives the data partition file, sent by the data partition server, that contains the target business data, and stores the received data partition file on its local disk. An index module reads the data partition file containing the target business data from the local disk, generates index file information according to the storage location information on the local disk of the data body corresponding to the data identifier of the target business data, and caches the index file information in the local memory of the distributed server.
After receiving the client's query request for the target business data, the data query module in the distributed server queries, according to the data identifier, the index file information stored in local memory for the storage location information of the target business data, and then reads the target business data from the local disk based on that storage location information.
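The lookup-then-read path of S202 and S203 can be sketched as below. It assumes, for illustration only, that the in-memory index is a dictionary from data identifier to a (start, end) byte range within a single partition file; the function name and that layout are hypothetical and are not taken from the text.

```python
from typing import Dict, Optional, Tuple

# Hypothetical index layout: data identifier -> (start offset, end offset) on disk.
Index = Dict[str, Tuple[int, int]]


def read_record(path: str, index: Index, key: str) -> Optional[bytes]:
    """Look up the byte range in memory, then read only that range from disk."""
    location = index.get(key)
    if location is None:
        return None  # the identifier is not stored in this partition file
    start, end = location
    with open(path, "rb") as f:
        f.seek(start)                 # jump straight to the data body
        return f.read(end - start)    # read exactly the bytes of this record
```

Only the small index lives in memory; the data body itself is touched once, at a known offset, so no scan of the partition file is needed.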
S204: transmit the read target business data to a preset network card interface using zero-copy technology, so that the target business data is sent to the client through the network card interface;
Specifically, after the data query module has obtained, according to the client's query request, the storage location information of the target business data on the local disk, the target business data is transmitted by zero-copy technology from its storage location on the local disk to the preset network card interface, and is then returned to the client through the network card interface. Transmitting the data this way avoids the conventional path in which data returned to the client must first be copied into kernel space, then into user space, then from user space back into kernel space, and only then copied to the network interface. This reduces the amount of data copying the CPU has to perform and thereby effectively improves data transmission efficiency.
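A minimal sketch of the zero-copy send in S204 follows. It uses Python's `socket.sendfile`, which relies on the operating system's `sendfile` call where available and otherwise falls back to ordinary sends; this is one possible way to realize the idea, and the function name and the (start, end) byte-range convention are assumptions for the example rather than details from the text.

```python
import socket


def send_record(sock: socket.socket, path: str, start: int, end: int) -> int:
    """Send bytes [start, end) of a file over a connected stream socket.

    Where the platform supports it, the kernel moves the bytes from the
    file to the socket directly, without passing through a user-space buffer.
    """
    if end <= start:
        return 0  # nothing to send; socket.sendfile rejects a zero count
    with open(path, "rb") as f:
        return sock.sendfile(f, offset=start, count=end - start)
```

The return value is the number of bytes sent, which a caller could compare with `end - start` to confirm the whole record went out.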
In the embodiments of the present application, a query request from a client for target business data is received; the storage location information of the target business data is queried, according to the data identifier, in the index file information stored in local memory; the target business data is read from the local disk based on that storage location information; and the read target business data is transmitted to a preset network card interface using zero-copy technology, so that it is sent to the client through the network card interface. Business data is stored on the local disk and the corresponding index information is cached in local memory. A data partition server first partitions the big data into multiple partition files and distributes each partition file to a corresponding distributed server for storage, and each distributed server generates index information for the partition files it stores. When a data query request is subsequently received, the storage location of the requested data on the local disk is located quickly from the index information in local memory. This avoids occupying excessive local memory, which lowers the cost of caching data, while still allowing the data to be read quickly from the local disk, which improves the response efficiency of data query requests. Transmitting the data to the network card interface with zero-copy technology also reduces the number of data copies and further improves data query efficiency for the client.
The data partition server needs to partition the big data to be processed in advance. The big data may be on the order of millions of records; one or more embodiments of this specification are aimed mainly at processing massive data. Specifically, as shown in Fig. 3, before the distributed server receives the client's query request for target business data in S201, the method further includes:
S101: receive the data partition file sent by the data partition server, wherein the data partition file is obtained by the data partition server partitioning the big data and is sent according to the file identifier of the data partition file and the device identifier of the distributed server;
Specifically, before the distributed server receives the client's query request for target business data, it needs to receive the data partition file sent by the data partition server. The data partition server first processes the big data with offline technologies such as Spark or MapReduce, or the newer Flink, to obtain the key-value pair corresponding to each piece of business data contained in the big data, wherein the key-value pair comprises a data identifier Key and a data body. It then computes, for the data identifier Key of each piece of business data, a hash value in the range [0, 1048576), and generates a data file from the data identifier, data body, and hash value of each piece of business data. The data partition module of the data partition server partitions the generated data file into multiple data partition files, each of which has its own file identification information. According to the file identification information, the data partition server sends each resulting data partition file to the distributed server that has the target device identifier.
S102: store the received data partition file on the local disk, where the data partition file includes multiple pieces of business data;
Specifically, the distributed server with the target device identifier receives the data partition file sent by the data partitioning server and stores it on the local disk. The data partitioning server sends a copy instruction carrying the data partition file to the distributed server by way of copying; the distributed server receives the copy instruction and stores the data partition file carried in it on the local disk;
S103: generate the index file information of the data partition file according to the storage location information of each piece of business data on the local disk;
Specifically, after storing the received data partition file on the local disk, the distributed server uses the index building module to read the data partition file from the local disk and generates the index file information of the data partition file according to the start position and end position, on the local disk, of the data body of each piece of business data in the file. The index file information includes the data identifier of each piece of business data and the start position and end position at which it is stored on the local disk;
The process of generating the index file information of a data partition file and the process of data query run asynchronously and do not affect each other. Only after the index file information for all business data contained in the data partition file has been built is the old index file information replaced with the new index file information. This ensures, without affecting data query efficiency, that a query never observes an intermediate state: the queried data is never partly old and partly new, and never includes data that is still being imported;
S104: store the index file information in local memory. Specifically, if the distributed server identifies the received data partition file as a new file, it needs to scan the entire data partition file to obtain the start position and end position, on the local disk, of the data body of each piece of business data contained in the file, and then generates the index file information of the data file from the data identifier, start position, and end position of each piece of business data. Because data stored in local memory can be read quickly, the generated index file information is kept in local memory. In addition, to avoid wasting time rebuilding the index over the data bodies stored on the local disk after a restart, once the index building module has generated the index file information for the data partition file, the index file information also needs to be saved on the local disk. When the data partition file is identified as a file that has already been indexed, the index file stored on the local disk is loaded directly into local memory, which further speeds up construction of the index file information;
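The build-then-replace behaviour described for S103 can be sketched as follows. This is a minimal illustration, not the patented implementation: the class name IndexHolder and its methods are hypothetical, and the sketch relies on the assumption that rebinding a single attribute is effectively atomic for readers in CPython.

```python
import threading


class IndexHolder:
    """Holds the in-memory index; readers always see a complete index."""

    def __init__(self, initial):
        self._index = dict(initial)

    def lookup(self, key):
        # Read the reference once so a concurrent swap cannot mix old and new.
        return self._index.get(key)

    def rebuild(self, build_fn):
        # Build the new index entirely off to the side (may be slow) ...
        new_index = build_fn()
        # ... then publish it with one reference assignment (assumed atomic).
        self._index = new_index


holder = IndexHolder({"K0": (0, 2)})
# Rebuild in a background thread so queries are not blocked meanwhile.
worker = threading.Thread(
    target=holder.rebuild,
    args=(lambda: {"K0": (0, 3), "K1": (3, 5)},),
)
worker.start()
worker.join()
print(holder.lookup("K1"))
```

Until the assignment in rebuild runs, every lookup is answered from the old index, so a query sees either the old index or the new one and never a partially built one.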
Further, the above data partition file is obtained as follows:
Step 1: the data partitioning server performs preset processing on multiple pieces of business data to be partitioned to obtain the key-value pair corresponding to each piece of business data to be partitioned, where the key-value pair includes a data identifier and a data body;
Specifically, the preprocessing of the multiple pieces of business data to be partitioned includes: using an offline technology such as Spark, MapReduce, or the newer Flink to obtain the key-value pair corresponding to each piece of data to be processed, where the key-value pair includes a data identifier (Key) and a data body (Value);
Step 2: for each piece of business data to be partitioned, hash the data identifier of that business data to obtain its hash value;
Specifically, the data identifier of each piece of business data to be partitioned is hashed to compute a hash value in the range [0, 1048576). The purpose of computing this hash value is to determine the partition into which the business data falls; the value range itself is not subject to any particular restriction;
Step 3: partition the multiple pieces of business data to be partitioned according to the hash value of each piece, to obtain multiple data partition files;
Specifically, the data file of the business data to be partitioned is generated from its data identifier (Key), data body (Value), and hash value. The hash value of each piece of data is taken modulo the preset number of partitions to obtain a remainder, and the data partitioning module places data with the same remainder in the same partition, yielding the preset number of data partition files. Alternatively, a consistent hashing algorithm may be used to obtain the preset number of data partition files;
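Steps 1 to 3 above can be sketched as follows. This is a minimal illustration under stated assumptions: the text does not name a hash function, so MD5 is used here purely as a stand-in, and the function names hash_key and partition are hypothetical.

```python
import hashlib
from collections import defaultdict

HASH_RANGE = 1 << 20  # 1048576, the range [0, 1048576) given in the text


def hash_key(key: str) -> int:
    """Map a data identifier to a stable integer in [0, HASH_RANGE).

    MD5 is an assumption; any stable hash function would serve.
    """
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % HASH_RANGE


def partition(records: dict, num_partitions: int) -> dict:
    """Group key/body pairs by (hash value mod preset number of partitions)."""
    parts = defaultdict(dict)
    for key, body in records.items():
        remainder = hash_key(key) % num_partitions
        parts[remainder][key] = body
    return dict(parts)


records = {f"Key{i}": f"body-{i}".encode() for i in range(7)}
for remainder, part in sorted(partition(records, 3).items()):
    print(remainder, sorted(part))
```

Each remainder corresponds to one data partition file. A stable hash is used rather than Python's built-in hash(), whose string hashing varies between interpreter runs.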
To avoid affecting the distributed server's data query process and to avoid the data format conversion otherwise required to read a data partition file, after Step 3 (partitioning the multiple pieces of business data to be partitioned according to the hash value of each piece to obtain multiple data partition files), the method further includes:
The data partitioning server uses the file copy function provided by the operating system to send a copy instruction to the corresponding distributed server, where the copy instruction carries the data partition file;
Specifically, the data partitioning server uses the file copy function provided by the operating system to send a copy instruction, which may be rsync, cp, scp, or the like, to the corresponding distributed server; the copy instruction carries the data partition file. Because no cache service takes part in the copy, the speed at which the data partition file is copied can be adjusted freely. Compared with a cache-based approach, storing the data partition file on the local disk of the distributed server by copying ensures that data query requests are not affected, makes the copy speed of importing the data partition file into the local disk easy to control, and skips the data format conversion otherwise needed to read the data partition file, which further improves the efficiency of importing the data partition file into the local disk;
Step 3 above, partitioning the multiple pieces of business data to be partitioned according to the hash value of each piece to obtain multiple data partition files, includes:
Dividing the hash value of each piece of business data to be partitioned by the preset number of partitions to obtain the remainder corresponding to that business data, where the preset number of partitions equals the number of distributed servers;
Placing the one or more pieces of business data that share the same remainder into one data partition file, where the file identifier of a data partition file corresponds one-to-one with the remainder of the business data it contains, and each remainder has a preset correspondence with the device identifier of a distributed server.
Fig. 4 is a schematic diagram of how data partition files are generated in the data query method provided by the embodiments of the present application. As shown in Fig. 4, assume that there are seven pieces of business data to be partitioned and that the preset number of partitions is three. An offline technology such as Spark, MapReduce, or the newer Flink is used to obtain the key-value pair corresponding to each piece of data to be processed, where the key-value pair includes a data identifier (Key) and a data body (Value). The data identifier Key of each piece of data to be processed is hashed, and the data file of the business data to be partitioned is generated from its data identifier Key, data body Value, and hash value;
Computing the hash value, in the range [0, 1048576), of the data identifier Key of each piece of data to be processed gives hash value 0 for Key0, 1 for Key1, 2 for Key2, 3 for Key3, 4 for Key4, 5 for Key5, and 6 for Key6. The hash value of each of the seven pieces of data is divided by the preset number of partitions, 3, to obtain its remainder. The data partitioning module places data with the same remainder in the same partition and generates three data partition files with file identifiers a, b, and c. The file identifier of each resulting data partition file corresponds one-to-one with the remainder of the business data it contains, and each remainder (or data partition file identifier) has a preset correspondence with the device identifiers P0, P1, and P2 of the distributed servers.
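The arithmetic of the Fig. 4 example can be worked through as follows. The hash values are the ones stated above; the specific correspondences remainder 0 to file a to device P0, and so on, are assumptions made for illustration, since the text only says that a preset correspondence exists.

```python
# Hash values as stated for the Fig. 4 example: Key0 -> 0, ..., Key6 -> 6.
hash_values = {f"Key{i}": i for i in range(7)}
NUM_PARTITIONS = 3

# Assumed preset correspondences (not specified in the text).
FILE_ID = {0: "a", 1: "b", 2: "c"}
DEVICE_ID = {"a": "P0", "b": "P1", "c": "P2"}

files = {}
for key, h in hash_values.items():
    remainder = h % NUM_PARTITIONS
    files.setdefault(FILE_ID[remainder], []).append(key)

for file_id, keys in files.items():
    print(file_id, DEVICE_ID[file_id], keys)
# a P0 ['Key0', 'Key3', 'Key6']
# b P1 ['Key1', 'Key4']
# c P2 ['Key2', 'Key5']
```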
The business data includes a data identifier and a data body, and the data partition file includes the data body corresponding to the business data. S103 above, generating the index file information of the data partition file according to the storage location information of each piece of business data on the local disk, includes:
Step 1: for each data body in the data partition file, determine the correspondence between the storage location information of the data body on the local disk and the data identifier corresponding to that data body;
Step 2: generate the index file information of the data partition file according to the correspondence for each piece of business data.
Specifically, after receiving the data partition file sent by the data partitioning server, the distributed server stores it on the local disk. The index building module in the distributed server, or another module capable of reading the positions of data bodies stored on disk, reads the data body corresponding to each piece of business data in the data partition file stored on the local disk, and determines the start position and end position at which the business data is stored on the local disk, together with the correspondence to the data identifier of the data body. The index file information of the data partition file is generated from the data identifier of each piece of business data contained in the file and its corresponding storage location information on the local disk;
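The index construction just described can be sketched as follows. The on-disk layout of a data partition file is not specified in the text, so this sketch assumes a hypothetical record layout (an 8-byte header holding the key length and body length, then the key, then the body); the function names are likewise illustrative.

```python
import struct

HEADER = struct.Struct(">II")  # assumed layout: key length, body length


def write_partition_file(path, records):
    """Write records in the assumed layout: header, key bytes, body bytes."""
    with open(path, "wb") as f:
        for key, body in records.items():
            key_bytes = key.encode("utf-8")
            f.write(HEADER.pack(len(key_bytes), len(body)))
            f.write(key_bytes)
            f.write(body)


def build_index(path):
    """Scan the whole file once; map each key to (start, end) of its body."""
    index = {}
    with open(path, "rb") as f:
        while True:
            header = f.read(HEADER.size)
            if len(header) < HEADER.size:
                break  # end of file
            key_len, body_len = HEADER.unpack(header)
            key = f.read(key_len).decode("utf-8")
            start = f.tell()
            f.seek(body_len, 1)  # skip over the body without reading it
            index[key] = (start, start + body_len)
    return index
```

Only the data identifiers and byte offsets are held in memory; the data bodies stay on disk, which is what keeps the memory footprint small. The end position is treated as exclusive, which matches the adjacent ranges (0 to 2, 2 to 4, 4 to 6) used in the Fig. 5 example.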
Fig. 5 is a schematic diagram of how index file information is generated in the data query method provided by the embodiments of the present application. As shown in Fig. 5, data partition file a contains the identifier information of three pieces of business data, K0, K1, and K2, together with their corresponding storage location information on the local disk: the data body corresponding to data identifier K0 has start position 0 and end position 2 on the local disk; the data body corresponding to K1 has start position 2 and end position 4; and the data body corresponding to K2 has start position 4 and end position 6. The index file information of data partition file a is generated from the data identifier of each piece of business data it contains and the corresponding storage location information on the local disk. Similarly, data partition file b and data partition file c each contain the data identifiers of two pieces of business data and their corresponding storage location information on the local disk, from which the index file information of data partition files b and c is generated;
After generating the index file information of the data partition file according to the storage location information of each piece of business data on the local disk, the distributed server receives the query request for target business data sent by the client. Specifically, Fig. 6 is a schematic diagram of a specific implementation of the data query method provided by the embodiments of the present application. As shown in Fig. 6, assume the client sends a query for target business data K3 to the distributed server whose device identifier is P0. After receiving the client's data query request, the distributed server uses the data identifier K3 of the target business data carried in the request to look up the index file information stored in local memory, finding that the data body corresponding to K3 has start position 2 and end position 4 on the local disk. It reads the data body of the target business data stored at that position, transfers the target business data corresponding to K3 to the preset NIC interface using zero-copy technology, and sends the target business data to the client through the NIC interface.
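The query path above can be sketched as follows. The text does not say which zero-copy mechanism is used; this sketch assumes Python's socket.sendfile, which uses the operating system's sendfile call where available and otherwise falls back to ordinary sends. The function name serve_query and its parameters are hypothetical.

```python
import socket


def serve_query(conn: socket.socket, data_path: str, index: dict, key: str) -> int:
    """Look up the key in the in-memory index and send its body to the client.

    Returns the number of bytes sent. Raises KeyError for an unknown key.
    """
    start, end = index[key]  # in-memory lookup; no disk access yet
    with open(data_path, "rb") as f:
        # The byte range goes from the file to the socket without being
        # copied through a user-space buffer when os.sendfile is available.
        return conn.sendfile(f, offset=start, count=end - start)
```

With the Fig. 6 values, an index entry of K3 mapped to (2, 4) would cause exactly the two bytes at positions 2 and 3 of the data partition file to be sent.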
For a data partition file whose index file has already been built: once the index building module has generated the index file information for the data partition file and saved it on the distributed server, the index file of the data partition file is also stored on the local disk so that the distributed server does not need to rebuild the index when it restarts;
Further, if the index file information is also stored on the local disk;
Before S202 above, querying the storage location information of the target business data according to the data identifier in the index file information stored in local memory, the method further includes:
Step 1: determine whether index file information exists in local memory;
If not, perform Step 2: load the index file information stored on the local disk into local memory.
Specifically, when the distributed server receives the client's query request for target business data, or when the distributed server restarts, it determines whether index file information exists in local memory. If not, the index file information stored on the local disk is loaded into local memory; loading the stored index file information in this way further speeds up index construction;
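The check-then-load behaviour of Steps 1 and 2 can be sketched as follows. The serialization format of the index file is not given in the text, so JSON is assumed here; the function names and the use of None to mean "no index in memory" are also assumptions.

```python
import json
import os


def save_index(index: dict, index_path: str) -> None:
    """Persist the index next to the data so a restart need not rescan."""
    tmp_path = index_path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as f:
        json.dump(index, f)
    os.replace(tmp_path, index_path)  # rename so readers never see a partial file


def ensure_index_loaded(memory_index, index_path: str) -> dict:
    """Step 1: check memory. Step 2: if absent, load the index from disk."""
    if memory_index is not None:
        return memory_index
    with open(index_path, encoding="utf-8") as f:
        # JSON stores the (start, end) pairs as lists; restore them as tuples.
        return {key: tuple(pos) for key, pos in json.load(f).items()}
```

Loading a compact index file is typically much cheaper than rescanning every data body, which is the saving the text attributes to persisting the index.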
In the data query method of the embodiments of the present application, a query request from the client for target business data is received; the storage location information of the target business data is queried, according to the data identifier, in the index file information stored in local memory; the target business data is read from the local disk based on its storage location information; and the target business data that has been read is transferred to the preset NIC interface using zero-copy technology and sent to the client through the NIC interface. Business data is stored on the local disk and the corresponding index information is cached in local memory. The data partitioning server partitions the big data in advance into multiple partition files and distributes each partition file to the corresponding distributed server for storage, and each distributed server generates index information for the partition files it stores. Subsequently, after receiving a data query request, the distributed server uses the index information in local memory to quickly locate the storage position of the queried data on the local disk. This avoids occupying excessive local memory, which lowers the data caching cost, while still allowing the corresponding data to be read quickly from the local disk, which improves the response efficiency for data query requests. In addition, zero-copy technology is used to send the data to the NIC interface, which reduces the number of data copies and further improves the client's data query efficiency.
Corresponding to the data query method described with reference to Figs. 1 to 6, and based on the same technical concept, the embodiments of the present application further provide a data query system. Fig. 7 is a first schematic structural diagram of the data query system provided by the embodiments of the present application. The system is used to perform the data query method described with reference to Figs. 1 to 6. As shown in Fig. 7, the system includes a client 20 and multiple distributed servers 30;
The client 20 is configured to send a query request for target business data to the distributed server 30 and to receive the target business data returned by the distributed server 30, where the query request is sent by the client 20 according to the target device identifier returned by the data partitioning server, and the target device identifier is the identifier information of the distributed server 30 on which the target business data resides, selected by the data partitioning server from the multiple distributed servers 30 according to the data identifier;
The distributed server 30 is configured to receive the query request; to query, according to the data identifier, the storage location information of the target business data in the index file information stored in local memory, where the index file information includes the correspondence between the data identifier of business data and its storage location information; and
to read the target business data from the local disk based on the storage location information corresponding to the target business data, where the local disk stores a data partition file that contains multiple pieces of business data and is pre-allocated by the data partitioning server; and
to transfer the target business data that has been read to the preset NIC interface using zero-copy technology, so that the target business data is sent to the client 20 through the NIC interface.
In the data query system of the embodiments of the present application, business data is stored on the local disk and the corresponding index information is cached in local memory. The data partitioning server partitions the big data in advance into multiple partition files and distributes each partition file to the corresponding distributed server for storage, and each distributed server generates index information for the partition files it stores. Subsequently, after receiving a data query request, the distributed server uses the index information in local memory to quickly locate the storage position of the queried data on the local disk. This avoids occupying excessive local memory, which lowers the data caching cost, while still allowing the corresponding data to be read quickly from the local disk, which improves the response efficiency for data query requests. In addition, zero-copy technology is used to send the data to the NIC interface, which reduces the number of data copies and further improves the client's data query efficiency.
Optionally, the distributed server 30 is further configured to:
receive the data partition file sent by the data partitioning server, where the data partition file is obtained by the data partitioning server by partitioning the big data and is sent according to the file identifier of the data partition file and the device identifier of the distributed server 30;
store the received data partition file on the local disk, where the data partition file includes multiple pieces of business data;
generate the index file information of the data partition file according to the storage location information of each piece of business data on the local disk;
store the index file information in local memory.
Optionally, as shown in Fig. 8, the system further includes a data partitioning server 40, where the data partitioning server 40 is configured to:
perform preset processing on multiple pieces of business data to be partitioned to obtain the key-value pair corresponding to each piece of business data to be partitioned, where the key-value pair includes a data identifier and a data body;
for each piece of business data to be partitioned, hash the data identifier of that business data to obtain its hash value;
partition the multiple pieces of business data to be partitioned according to the hash value of each piece, to obtain multiple data partition files.
Optionally, the data partitioning server 40 is further configured to:
use the file copy function provided by the operating system to send a copy instruction to the corresponding distributed server, where the copy instruction carries the data partition file.
Optionally, the data partitioning server 40 is specifically configured to:
divide the hash value of each piece of business data to be partitioned by the preset number of partitions to obtain the remainder corresponding to that business data, where the preset number of partitions equals the number of distributed servers 30;
place the one or more pieces of business data that share the same remainder into one data partition file, where the file identifier of a data partition file corresponds one-to-one with the remainder of the business data it contains, and each remainder has a preset correspondence with the device identifier of a distributed server 30.
Optionally, the business data includes a data identifier and a data body, and the data partition file includes the data body corresponding to the business data;
the distributed server 30 is further specifically configured to:
for each data body in the data partition file, determine the correspondence between the storage location information of the data body on the local disk and the data identifier corresponding to that data body;
generate the index file information of the data partition file according to the correspondence for each piece of business data.
Optionally, if the index file information is also stored on the local disk;
the distributed server 30 is further configured to:
determine whether index file information exists in local memory;
if not, load the index file information stored on the local disk into local memory.
In the data query system of the embodiments of the present application, business data is stored on the local disk and the corresponding index information is cached in local memory. The data partitioning server partitions the big data in advance into multiple partition files and distributes each partition file to the corresponding distributed server for storage, and each distributed server generates index information for the partition files it stores. Subsequently, after receiving a data query request, the distributed server uses the index information in local memory to quickly locate the storage position of the queried data on the local disk. This avoids occupying excessive local memory, which lowers the data caching cost, while still allowing the corresponding data to be read quickly from the local disk, which improves the response efficiency for data query requests. In addition, zero-copy technology is used to send the data to the NIC interface, which reduces the number of data copies and further improves the data query efficiency of the client 20.
It should be noted that the data query system provided by the embodiments of the present application and the data query method provided by the embodiments of the present application are based on the same inventive concept. For the specific implementation of this embodiment, refer to the implementation of the foregoing data query method; repeated details are not described again.
Further, corresponding to the method shown in Figs. 1 to 6, and based on the same technical concept, the embodiments of the present application also provide a data query device for performing the above data query method. Fig. 9 is a schematic structural diagram of the data query device provided by the embodiments of the present application.
As shown in Fig. 9, data query devices may differ considerably depending on configuration or performance, and may include one or more processors 901 and a memory 902, in which one or more application programs or data may be stored. The memory 902 may be transient storage or persistent storage. An application program stored in the memory 902 may include one or more modules (not shown), each of which may include a series of computer-executable instructions for the data query device. Further, the processor 901 may be configured to communicate with the memory 902 and to execute, on the data query device, the series of computer-executable instructions in the memory 902. The data query device may also include one or more power supplies 903, one or more wired or wireless network interfaces 904, one or more input/output interfaces 905, one or more keyboards 906, and so on.
In a specific embodiment, the data query device includes a memory and one or more programs, where the one or more programs are stored in the memory and may include one or more modules, each module may include a series of computer-executable instructions for the data query device, and the one or more programs are configured to be executed by one or more processors and include computer-executable instructions for performing the following:
The distributed server receives a query request from the client for target business data, where the query request carries the data identifier of the target business data, the query request is sent by the client according to the target device identifier returned by the data partitioning server, and the target device identifier is the identifier information of the distributed server on which the target business data resides, selected by the data partitioning server from multiple distributed servers according to the data identifier;
In the index file information stored in local memory, the storage location information of the target business data is queried according to the data identifier, where the index file information includes the correspondence between the data identifier of business data and its storage location information;
Based on the storage location information corresponding to the target business data, the target business data is read from the local disk, where the local disk stores a data partition file that contains multiple pieces of business data and is pre-allocated by the data partitioning server;
The target business data that has been read is transferred to the preset NIC interface using zero-copy technology, so that the target business data is sent to the client through the NIC interface.
Optionally, the computer-executable instructions, when executed, further include computer-executable instructions for performing the following:
Before receiving the client's query request for target business data, the method further includes:
Receiving the data partition file sent by the data partitioning server, where the data partition file is obtained by the data partitioning server by partitioning the big data and is sent according to the file identifier of the data partition file and the device identifier of the distributed server;
Storing the received data partition file on the local disk, where the data partition file includes multiple pieces of business data;
Generating the index file information of the data partition file according to the storage location information of each piece of business data on the local disk;
Storing the index file information in local memory.
Optionally, the computer-executable instructions, when executed, further include computer-executable instructions for performing the following:
The data partition file is obtained as follows:
The data partitioning server performs preset processing on multiple pieces of business data to be partitioned to obtain the key-value pair corresponding to each piece of business data to be partitioned, where the key-value pair includes a data identifier and a data body;
For each piece of business data to be partitioned, the data identifier of that business data is hashed to obtain its hash value;
The multiple pieces of business data to be partitioned are partitioned according to the hash value of each piece, to obtain multiple data partition files.
Optionally, the computer-executable instructions, when executed, further include computer-executable instructions for performing the following:
After partitioning the multiple pieces of business data to be partitioned according to the hash value of each piece to obtain multiple data partition files, the method further includes:
The data partitioning server uses the file copy function provided by the operating system to send a copy instruction to the corresponding distributed server, where the copy instruction carries the data partition file.
Optionally, the computer-executable instructions, when executed, further include instructions for performing the following:
Partitioning the multiple pieces of business data to be partitioned according to the hash value of each piece, to obtain multiple data partition files, includes:
Dividing the hash value of each piece of business data to be partitioned by a preset number of partitions to obtain the remainder corresponding to that business data, wherein the preset number of partitions is equal to the number of distributed servers;
Placing the business data that share the same remainder (at least one piece) into one data partition file, wherein the file identifier of each data partition file corresponds one-to-one with the remainder of the business data it contains, and each remainder has a preset correspondence with the device identifier of a distributed server.
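The hash-and-remainder partitioning described above can be sketched as follows. The text does not specify a hash function, so SHA-256 from Python's `hashlib` is an assumption here, as are the names `partition_index`, `partition_records`, and `target_device`, and the use of a list position as the preset mapping from remainder to device identifier.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List, Tuple


def partition_index(data_id: str, num_partitions: int) -> int:
    """Hash the data identifier and take the remainder modulo the partition count."""
    digest = hashlib.sha256(data_id.encode("utf-8")).digest()  # assumed hash function
    return int.from_bytes(digest[:8], "big") % num_partitions


def partition_records(records: Dict[str, bytes],
                      device_ids: List[str]) -> Dict[int, Tuple[str, Dict[str, bytes]]]:
    """Group key-value pairs by remainder; one group per data partition file.

    The number of partitions equals the number of distributed servers, and
    remainder r is mapped to device_ids[r] (an assumed preset correspondence).
    """
    num_partitions = len(device_ids)
    groups: Dict[int, Dict[str, bytes]] = defaultdict(dict)
    for data_id, body in records.items():
        groups[partition_index(data_id, num_partitions)][data_id] = body
    return {r: (device_ids[r], group) for r, group in groups.items()}


def target_device(data_id: str, device_ids: List[str]) -> str:
    """Return the device identifier a client should query for this data identifier."""
    return device_ids[partition_index(data_id, len(device_ids))]
```

Because the same function is used both to partition and to route, a query for a given data identifier always lands on the server that received the partition file containing it, as long as the number of servers does not change.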
Optionally, the computer-executable instructions, when executed, further include instructions for performing the following:
The business data includes a data identifier and a data body, and the data partition file includes the data bodies corresponding to the business data;
Generating the index file information of the data partition file according to the storage location information of each piece of business data in the local disk includes:
For each data body in the data partition file, determining the correspondence between the storage location information of that data body in the local disk and the data identifier corresponding to that data body;
Generating the index file information of the data partition file according to the correspondence determined for each piece of business data.
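The index generation above can be illustrated with a short sketch. The text does not define the form of the storage location information; the example assumes it is a byte offset and a length within the partition file. The names `build_partition_file` and `read_body` are hypothetical.

```python
from typing import Dict, Tuple

Index = Dict[str, Tuple[int, int]]  # data identifier -> (offset, length), an assumed layout


def build_partition_file(path: str, records: Dict[str, bytes]) -> Index:
    """Write the data bodies back to back and record where each one starts."""
    index: Index = {}
    with open(path, "wb") as f:
        for data_id, body in records.items():
            offset = f.tell()          # position of this data body in the file
            f.write(body)
            index[data_id] = (offset, len(body))
    return index


def read_body(path: str, index: Index, data_id: str) -> bytes:
    """Read one data body from disk using only its index entry."""
    offset, length = index[data_id]
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)
```

Only the identifiers and two integers per record need to stay in memory, while the data bodies themselves remain on disk.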
Optionally, the computer-executable instructions, when executed, further include instructions for performing the following:
Where the index file information is also stored in the local disk,
Before querying, according to the data identifier, the storage location information of the target business data in the index file information stored in the local memory, the method further includes:
Determining whether index file information exists in the local memory;
If not, loading the index file information stored in the local disk into the local memory.
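The check-then-load behaviour above can be sketched as a small cache object. The on-disk format of the index is not specified in the text; JSON is assumed here for simplicity, and the class name `IndexCache` and its methods are hypothetical.

```python
import json
from typing import Dict, Optional, Tuple


class IndexCache:
    """Keeps the index in memory and reloads it from disk when it is missing."""

    def __init__(self, index_path: str) -> None:
        self.index_path = index_path
        self._index: Optional[Dict[str, list]] = None  # None means "not in memory"

    def save(self, index: Dict[str, Tuple[int, int]]) -> None:
        """Persist the index to local disk and keep a copy in memory."""
        as_lists = {data_id: list(loc) for data_id, loc in index.items()}
        with open(self.index_path, "w", encoding="utf-8") as f:
            json.dump(as_lists, f)
        self._index = as_lists

    def in_memory(self) -> bool:
        return self._index is not None

    def drop_memory_copy(self) -> None:
        """Simulate a restart or eviction that loses the in-memory index."""
        self._index = None

    def lookup(self, data_id: str) -> Optional[Tuple[int, int]]:
        """Return (offset, length) for a data identifier, loading the index if needed."""
        if self._index is None:  # index absent from memory: load it from local disk
            with open(self.index_path, "r", encoding="utf-8") as f:
                self._index = json.load(f)
        entry = self._index.get(data_id)
        return None if entry is None else (entry[0], entry[1])
```

Persisting the index means a restarted server can answer queries without rescanning its partition file.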
In the data query device of the embodiments of the present application, the distributed server receives a query request from a client for target business data, wherein the query request carries the data identifier of the target business data, the query request is sent by the client according to a target device identifier returned by the data partition server, and the target device identifier is the identification information of the distributed server where the target business data is located, selected by the data partition server from multiple distributed servers according to the data identifier. In the index file information stored in the local memory, the distributed server queries the storage location information of the target business data according to the data identifier, wherein the index file information includes the correspondence between the data identifiers of business data and storage location information. Based on the storage location information corresponding to the target business data, the distributed server reads the target business data from the local disk, wherein the local disk stores a data partition file that was allocated in advance by the data partition server and contains multiple pieces of business data. The read target business data is then transmitted to a preset network card interface using zero-copy technology, so that the target business data is sent to the client through the network card interface.
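The query path above (index lookup, then a positioned read sent with zero-copy) can be sketched as follows. The example assumes the storage location is a byte offset and length, and uses Python's `socket.sendfile`, which relies on `os.sendfile` where the platform provides it and otherwise falls back to ordinary `send` calls. The function name `serve_query` is hypothetical.

```python
import socket
from typing import Dict, Tuple


def serve_query(conn: socket.socket,
                partition_path: str,
                index: Dict[str, Tuple[int, int]],
                data_id: str) -> int:
    """Send the data body for data_id over conn; return the number of bytes sent.

    index maps a data identifier to an assumed (offset, length) location
    inside the partition file stored on local disk.
    """
    location = index.get(data_id)
    if location is None:
        return 0  # unknown identifier: nothing to send
    offset, length = location
    if length == 0:
        return 0  # socket.sendfile rejects a zero count, so skip empty bodies
    with open(partition_path, "rb") as f:
        # The kernel moves the bytes from the file to the socket where
        # os.sendfile is available, avoiding a copy through user space.
        return conn.sendfile(f, offset=offset, count=length)
```

A real server would also frame the response (for example with a length prefix) so the client knows where the data body ends; that detail is outside what the text describes.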
As it can be seen that by the data query equipment in the embodiment of the present application, by storing business datum into local disk,
And corresponding index information is cached in local memory, wherein first passing through data partitioned server in advance will be at big data subregion
Reason obtains multiple partitioned files, and each partitioned file is distributed to corresponding distributed server and carries out data storage, and by being distributed
Formula server is directed to the partitioned file respectively stored and generates corresponding index information, subsequent after receiving data inquiry request,
The index information being directly based upon in local memory quickly positions storage location of the data to be checked in local disk, so both
Without occupying excessive local memory space, data buffer storage cost is reduced, and can quickly read from local disk corresponding
Data, improve the response efficiency of data inquiry request, send data to network card interface in combination with zero duplication technology, reduce
Data copy number, further improves the efficiency data query of client.
Preferably, the embodiments of the present application also provide a data query device, including a processor 901, a memory 902, and a computer program stored in the memory 902 and executable on the processor 901. When executed by the processor 901, the computer program implements each process of the above data query method embodiments and can achieve the same technical effects; to avoid repetition, details are not repeated here.
Further, corresponding to the methods shown in Fig. 1 to Fig. 6 above, and based on the same technical idea, the embodiments of the present application also provide a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements each process of the above data query method embodiments and can achieve the same technical effects; to avoid repetition, details are not repeated here. The computer-readable storage medium is, for example, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc.
The embodiments in this specification are described in a progressive manner. For the same or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the system embodiments are described relatively briefly because they are substantially similar to the method embodiments; for related details, refer to the corresponding description of the method embodiments.
The above description is merely an embodiment of the present application and is not intended to limit the present application. For those skilled in the art, various modifications and variations of the present application are possible. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present application shall fall within the scope of the claims of the present application.
Claims (14)
1. A data query method, characterized in that the method comprises:
A distributed server receiving a query request from a client for target business data, wherein the query request carries the data identifier of the target business data, the query request is sent by the client according to a target device identifier returned by a data partition server, and the target device identifier is the identification information of the distributed server where the target business data is located, selected by the data partition server from multiple distributed servers according to the data identifier;
Querying, according to the data identifier, the storage location information of the target business data in index file information stored in local memory, wherein the index file information includes the correspondence between the data identifiers of business data and storage location information;
Reading the target business data from a local disk based on the storage location information corresponding to the target business data, wherein the local disk stores a data partition file that is allocated in advance by the data partition server and contains multiple pieces of business data;
Transmitting the read target business data to a preset network card interface using zero-copy technology, so as to send the target business data to the client through the network card interface.
2. The method according to claim 1, characterized in that, before receiving the query request from the client for the target business data, the method further comprises:
Receiving a data partition file sent by the data partition server, wherein the data partition file is obtained by the data partition server partitioning big data and is sent according to the file identifier of the data partition file and the device identifier of the distributed server;
Storing the received data partition file in the local disk, wherein the data partition file includes multiple pieces of business data;
Generating the index file information of the data partition file according to the storage location information of each piece of business data in the local disk;
Storing the index file information in the local memory.
3. The method according to claim 2, characterized in that the data partition file is obtained in the following manner:
The data partition server performs preset processing on multiple pieces of business data to be partitioned, obtaining a key-value pair for each piece of business data to be partitioned, wherein the key-value pair includes a data identifier and a data body;
For each piece of business data to be partitioned, hash processing is performed on the data identifier of that business data to obtain its hash value;
According to the hash value of each piece of business data to be partitioned, the multiple pieces of business data to be partitioned are partitioned to obtain multiple data partition files.
4. The method according to claim 3, characterized in that, after partitioning the multiple pieces of business data to be partitioned according to the hash value of each piece to obtain multiple data partition files, the method further comprises:
The data partition server using the file copy function provided by the operating system to send a copy instruction to the corresponding distributed server, wherein the copy instruction carries the data partition file.
5. The method according to claim 3, characterized in that partitioning the multiple pieces of business data to be partitioned according to the hash value of each piece, to obtain multiple data partition files, comprises:
Dividing the hash value of each piece of business data to be partitioned by a preset number of partitions to obtain the remainder corresponding to that business data, wherein the preset number of partitions is equal to the number of distributed servers;
Placing the business data that share the same remainder (at least one piece) into one data partition file, wherein the file identifier of each data partition file corresponds one-to-one with the remainder of the business data it contains, and each remainder has a preset correspondence with the device identifier of a distributed server.
6. The method according to claim 2, characterized in that the business data includes a data identifier and a data body, and the data partition file includes the data bodies corresponding to the business data;
Generating the index file information of the data partition file according to the storage location information of each piece of business data in the local disk comprises:
For each data body in the data partition file, determining the correspondence between the storage location information of that data body in the local disk and the data identifier corresponding to that data body;
Generating the index file information of the data partition file according to the correspondence determined for each piece of business data.
7. The method according to claim 2, characterized in that, where the index file information is also stored in the local disk,
Before querying, according to the data identifier, the storage location information of the target business data in the index file information stored in the local memory, the method further comprises:
Determining whether index file information exists in the local memory;
If not, loading the index file information stored in the local disk into the local memory.
8. A data query system, characterized in that the system comprises a client and multiple distributed servers;
The client is configured to send a query request for target business data to a distributed server and to receive the target business data returned by the distributed server, wherein the query request is sent by the client according to a target device identifier returned by a data partition server, and the target device identifier is the identification information of the distributed server where the target business data is located, selected by the data partition server from the multiple distributed servers according to the data identifier;
The distributed server is configured to receive the query request; and to query, according to the data identifier, the storage location information of the target business data in index file information stored in local memory, wherein the index file information includes the correspondence between the data identifiers of business data and storage location information; and
To read the target business data from a local disk based on the storage location information corresponding to the target business data, wherein the local disk stores a data partition file that is allocated in advance by the data partition server and contains multiple pieces of business data; and
To transmit the read target business data to a preset network card interface using zero-copy technology, so as to send the target business data to the client through the network card interface.
9. The system according to claim 8, characterized in that the distributed server is further configured to:
Receive a data partition file sent by the data partition server, wherein the data partition file is obtained by the data partition server partitioning big data and is sent according to the file identifier of the data partition file and the device identifier of the distributed server;
Store the received data partition file in the local disk, wherein the data partition file includes multiple pieces of business data;
Generate the index file information of the data partition file according to the storage location information of each piece of business data in the local disk;
Store the index file information in the local memory.
10. The system according to claim 9, characterized in that the system further comprises a data partition server, wherein the data partition server is configured to:
Perform preset processing on multiple pieces of business data to be partitioned, obtaining a key-value pair for each piece of business data to be partitioned, wherein the key-value pair includes a data identifier and a data body;
For each piece of business data to be partitioned, perform hash processing on the data identifier of that business data to obtain its hash value;
According to the hash value of each piece of business data to be partitioned, partition the multiple pieces of business data to be partitioned to obtain multiple data partition files.
11. The system according to claim 10, characterized in that the data partition server is further configured to:
Use the file copy function provided by the operating system to send a copy instruction to the corresponding distributed server, wherein the copy instruction carries the data partition file.
12. The system according to claim 10, characterized in that the data partition server is specifically configured to:
Divide the hash value of each piece of business data to be partitioned by a preset number of partitions to obtain the remainder corresponding to that business data, wherein the preset number of partitions is equal to the number of distributed servers;
Place the business data that share the same remainder (at least one piece) into one data partition file, wherein the file identifier of each data partition file corresponds one-to-one with the remainder of the business data it contains, and each remainder has a preset correspondence with the device identifier of a distributed server.
13. The system according to claim 9, characterized in that the business data includes a data identifier and a data body, and the data partition file includes the data bodies corresponding to the business data;
The distributed server is further specifically configured to:
For each data body in the data partition file, determine the correspondence between the storage location information of that data body in the local disk and the data identifier corresponding to that data body;
Generate the index file information of the data partition file according to the correspondence determined for each piece of business data.
14. The system according to claim 9, characterized in that, where the index file information is also stored in the local disk, the distributed server is further configured to:
Determine whether index file information exists in the local memory;
If not, load the index file information stored in the local disk into the local memory.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910521866.XA CN110263061A (en) | 2019-06-17 | 2019-06-17 | A kind of data query method and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910521866.XA CN110263061A (en) | 2019-06-17 | 2019-06-17 | A kind of data query method and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110263061A true CN110263061A (en) | 2019-09-20 |
Family
ID=67918675
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910521866.XA Pending CN110263061A (en) | 2019-06-17 | 2019-06-17 | A kind of data query method and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110263061A (en) |
Cited By (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110728317A (en) * | 2019-09-30 | 2020-01-24 | 腾讯科技(深圳)有限公司 | Training method and system of decision tree model, storage medium and prediction method |
CN111008200A (en) * | 2019-12-18 | 2020-04-14 | 北京数衍科技有限公司 | Data query method and device and server |
CN111107019A (en) * | 2019-12-29 | 2020-05-05 | 浪潮电子信息产业股份有限公司 | Data transmission method, device, equipment and computer readable storage medium |
CN112162859A (en) * | 2020-09-24 | 2021-01-01 | 成都长城开发科技有限公司 | Data processing method and device, computer readable medium and electronic equipment |
CN112181900A (en) * | 2020-09-04 | 2021-01-05 | 中国银联股份有限公司 | Data processing method and device in server cluster |
CN112199442A (en) * | 2020-09-29 | 2021-01-08 | 中国平安人寿保险股份有限公司 | Distributed batch file downloading method and device, computer equipment and storage medium |
CN112632129A (en) * | 2020-12-31 | 2021-04-09 | 联想未来通信科技(重庆)有限公司 | Code stream data management method, device and storage medium |
CN112711580A (en) * | 2020-12-30 | 2021-04-27 | 陈静 | Big data mining method for cloud computing service and cloud computing financial server |
CN113297266A (en) * | 2020-07-08 | 2021-08-24 | 阿里巴巴集团控股有限公司 | Data processing method, device, equipment and computer storage medium |
CN113568870A (en) * | 2020-04-28 | 2021-10-29 | 西安理邦科学仪器有限公司 | Storage method, server and monitoring system |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1547714A (en) * | 2001-08-03 | 2004-11-17 | 易斯龙系统公司 | Systems and methods providing metadata for tracking of information on a distributed file system of storage devices |
CN102968498A (en) * | 2012-12-05 | 2013-03-13 | 华为技术有限公司 | Method and device for processing data |
CN103761275A (en) * | 2014-01-09 | 2014-04-30 | 浪潮电子信息产业股份有限公司 | Management method for metadata in distributed file system |
CN104346458A (en) * | 2014-10-31 | 2015-02-11 | 易准科技发展(上海)有限公司 | Data storage method and device |
US20160070754A1 (en) * | 2014-09-10 | 2016-03-10 | Umm Al-Qura University | System and method for microblogs data management |
CN108255958A (en) * | 2017-12-21 | 2018-07-06 | 百度在线网络技术(北京)有限公司 | Data query method, apparatus and storage medium |
CN109189995A (en) * | 2018-07-16 | 2019-01-11 | 哈尔滨理工大学 | Data disappear superfluous method in cloud storage based on MPI |
CN109299157A (en) * | 2018-08-27 | 2019-02-01 | 杭州安恒信息技术股份有限公司 | A kind of data export method and device of distributed big single table |
-
2019
- 2019-06-17 CN CN201910521866.XA patent/CN110263061A/en active Pending
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110728317A (en) * | 2019-09-30 | 2020-01-24 | 腾讯科技(深圳)有限公司 | Training method and system of decision tree model, storage medium and prediction method |
CN111008200A (en) * | 2019-12-18 | 2020-04-14 | 北京数衍科技有限公司 | Data query method and device and server |
CN111008200B (en) * | 2019-12-18 | 2024-01-16 | 北京数衍科技有限公司 | Data query method, device and server |
CN111107019A (en) * | 2019-12-29 | 2020-05-05 | 浪潮电子信息产业股份有限公司 | Data transmission method, device, equipment and computer readable storage medium |
CN113568870A (en) * | 2020-04-28 | 2021-10-29 | 西安理邦科学仪器有限公司 | Storage method, server and monitoring system |
CN113297266A (en) * | 2020-07-08 | 2021-08-24 | 阿里巴巴集团控股有限公司 | Data processing method, device, equipment and computer storage medium |
CN112181900A (en) * | 2020-09-04 | 2021-01-05 | 中国银联股份有限公司 | Data processing method and device in server cluster |
CN112162859A (en) * | 2020-09-24 | 2021-01-01 | 成都长城开发科技有限公司 | Data processing method and device, computer readable medium and electronic equipment |
CN112199442B (en) * | 2020-09-29 | 2023-07-21 | 中国平安人寿保险股份有限公司 | Method, device, computer equipment and storage medium for distributed batch downloading files |
CN112199442A (en) * | 2020-09-29 | 2021-01-08 | 中国平安人寿保险股份有限公司 | Distributed batch file downloading method and device, computer equipment and storage medium |
CN112711580A (en) * | 2020-12-30 | 2021-04-27 | 陈静 | Big data mining method for cloud computing service and cloud computing financial server |
CN112632129A (en) * | 2020-12-31 | 2021-04-09 | 联想未来通信科技(重庆)有限公司 | Code stream data management method, device and storage medium |
CN112632129B (en) * | 2020-12-31 | 2023-11-21 | 联想未来通信科技(重庆)有限公司 | Code stream data management method, device and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110263061A (en) | A kind of data query method and system | |
CN105324770B (en) | Effectively read copy | |
US8229899B2 (en) | Remote access agent for caching in a SAN file system | |
EP4202694A1 (en) | Node memory-based data processing method and apparatus, device, and medium | |
CN110191428B (en) | Data distribution method based on intelligent cloud platform | |
CN104754001A (en) | Cloud storage system and data storage method | |
CN108810041A (en) | A kind of data write-in of distributed cache system and expansion method, device | |
CN103853714B (en) | A kind of data processing method and device | |
CN101350030A (en) | Method and apparatus for caching data | |
CN110334297A (en) | Loading method, terminal, server and the storage medium of terminal page | |
CN103678523A (en) | Distributed cache data access method and device | |
CN103312624A (en) | Message queue service system and method | |
CN103442090A (en) | Cloud computing system for data scatter storage | |
CN109684273A (en) | A kind of snapshot management method, apparatus, equipment and readable storage medium storing program for executing | |
CN113259478B (en) | Method and device for executing transaction in blockchain system and blockchain system | |
CN112162846A (en) | Transaction processing method, device and computer readable storage medium | |
CN105320676A (en) | Customer data query service method and device | |
WO2023231339A1 (en) | Transaction execution method and node in blockchain system, and blockchain system | |
CN109597903A (en) | Image file processing apparatus and method, document storage system and storage medium | |
CN114003562B (en) | Directory traversal method, device and equipment and readable storage medium | |
US10146833B1 (en) | Write-back techniques at datastore accelerators | |
CN109241021A (en) | A kind of file polling method, apparatus, equipment and computer readable storage medium | |
CN110457307A (en) | Metadata management system, user's cluster creation method, device, equipment and medium | |
CN107908713A (en) | A kind of distributed dynamic cuckoo filtration system and its filter method based on Redis clusters | |
CN114265814B (en) | Data lake file system based on object storage |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20190920 |