CN110209693A - High-concurrency data query method, apparatus, system, device, and readable storage medium

Publication number: CN110209693A
Application number: CN201910389924.8A
Authority: CN (China)
Other languages: Chinese (zh)
Legal status: Pending (an assumption, not a legal conclusion)
Inventors: 杨兆辉, 汪金忠, 徐根林, 孙迁
Original assignee: Suning Cloud Computing Co Ltd
Current assignee: Suning Cloud Computing Co Ltd (the listed assignee may be inaccurate)
Prior art keywords: data, write, concurrent, query, copies

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load

Abstract

The invention discloses a high-concurrency data query method, apparatus, system, device, and computer-readable storage medium, belonging to the field of big data technology. The method comprises: obtaining the query concurrency from the historical query data of a data shard; calculating the number of replicas to write from the query concurrency; writing that number of replicas to the corresponding nodes in ascending order of write load index; and, in response to a query request for the data shard, obtaining the query result from the corresponding nodes in ascending order of query load index. With this scheme, query performance for users does not degrade in high-concurrency big data query scenarios, and it remains stable with little fluctuation.

Description

High-concurrency data query method, apparatus, system, device, and readable storage medium
Technical field
The present invention relates to the field of big data technology, and in particular to a high-concurrency data query method, apparatus, system, device, and computer-readable storage medium.
Background
In real-time big data analytics, data must be queried from a massive data set and aggregated into a result within a short time. Such workloads have large data volumes, are computation-intensive, and require short response times (usually within 5 seconds). To improve query performance, the data is usually split into multiple data shards according to some logic and stored on multiple servers. When a user issues a query, the data is queried in parallel on those servers, and an aggregation server then summarizes the results and returns them to the user. A data shard usually has two replicas, a primary and a backup, stored on different servers, which improves the fault tolerance and availability of the system.
With a small number of user queries, such real-time analysis responds within a reasonable time. When query concurrency is high, however, contention for reads and writes (I/O) and computation (CPU) on the data shards causes the query response time to rise sharply. For example, with 2 concurrent user queries the result returns in 1 second, whereas with 20 concurrent user queries the result returns in 10 seconds. Under high-concurrency queries, users wait longer for results.
Summary of the invention
To solve the problems in the prior art, embodiments of the present invention provide a high-concurrency data query method, apparatus, system, device, and readable storage medium, so that in high-concurrency big data query scenarios the query performance for users does not degrade and remains stable with little fluctuation. The technical solution is as follows:
In a first aspect, a high-concurrency data query method is provided, the method comprising: obtaining the query concurrency from the historical query data of a data shard; calculating the write replica count from the query concurrency; writing replicas, according to the write replica count, to the corresponding nodes in ascending order of write load index; and, in response to a query request for the data shard, obtaining the query result from the corresponding nodes in ascending order of query load index.
With reference to the first aspect, in a first possible implementation, calculating the write replica count from the query concurrency comprises: calculating the write replica count from the query concurrency and the concurrency supported by each replica.
With reference to the first aspect, in a second possible implementation, writing replicas, according to the write replica count, to the corresponding nodes in ascending order of write load index comprises: calculating the write load index of every node from each node's current I/O utilization, CPU utilization, and a first preset weight rule.
With reference to the first aspect, in a third possible implementation, obtaining the query result from the corresponding nodes in ascending order of query load index in response to a query request for the data shard comprises: calculating the query load index of every node from each node's current I/O utilization, CPU utilization, network load utilization, and a second preset weight rule.
With reference to the first aspect and any of its first to third possible implementations, in fourth to seventh possible implementations, the method further comprises: predicting, from the historical query data of the data shard and using an ARIMA time series model, the number of replicas that will need to be added in the future, and dynamically adjusting the replicas according to that number.
In a second aspect, a high-concurrency data query apparatus is provided, the apparatus comprising: a query concurrency obtaining module, configured to obtain the query concurrency from the historical query data of a data shard; a write replica count calculation module, configured to calculate the write replica count from the query concurrency; a replica writing module, configured to write replicas, according to the write replica count, to the corresponding nodes in ascending order of write load index; and a query result obtaining module, configured to obtain, in response to a query request for the data shard, the query result from the corresponding nodes in ascending order of query load index.
With reference to the second aspect, in a first possible implementation, the write replica count calculation module is configured to: calculate the write replica count from the query concurrency and the concurrency supported by each replica.
With reference to the second aspect, in a second possible implementation, the replica writing module includes a first calculation submodule, and the first calculation submodule is configured to: calculate the write load index of every node from each node's current I/O utilization, CPU utilization, and a first preset weight rule.
With reference to the second aspect, in a third possible implementation, the query result obtaining module includes a second calculation submodule, and the second calculation submodule is configured to: calculate the query load index of every node from each node's current I/O utilization, CPU utilization, network load utilization, and a second preset weight rule.
With reference to the second aspect and any of its first to third possible implementations, in fourth to seventh possible implementations, the apparatus further includes a replica dynamic adjustment module, configured to: predict, from the historical query data of the data shard and using an ARIMA time series model, the number of replicas that will need to be added in the future, and dynamically adjust the replicas according to that number.
In a third aspect, a high-concurrency data query system is provided, the system comprising a data replica manager, a data loader, a query load balancer, and multiple nodes, wherein: the data replica manager is configured to obtain the query concurrency from the historical query data of a data shard and to calculate the write replica count from the query concurrency; the data loader is configured to write replicas, according to the write replica count, to the corresponding nodes in ascending order of write load index; and the query load balancer is configured to obtain, in response to a query request for the data shard, the query result from the corresponding nodes in ascending order of query load index.
With reference to the third aspect, in a first possible implementation, the data replica manager is further configured to: predict, from the historical query data of the data shard and using an ARIMA time series model, the number of replicas that will need to be added in the future, and dynamically adjust the replicas according to that number.
In a fourth aspect, a high-concurrency data query device is provided, comprising: a processor; and a memory for storing instructions executable by the processor; wherein the processor is configured to perform, by executing the executable instructions, the steps of the high-concurrency data query method of any of the above solutions.
In a fifth aspect, a computer-readable storage medium is provided, the computer-readable storage medium storing a computer program which, when executed by a processor, implements the steps of the high-concurrency data query method of any of the above solutions.
The technical solution provided by the embodiments of the present invention has the following beneficial effects:
1. The query concurrency is obtained from the historical query data of the data shard, and the write replica count is then calculated from it, so the number of replicas to write is derived from the shard's typical query demand. The replicas are then written to the corresponding nodes in ascending order of write load index. This distributed write strategy avoids single-point pressure during data writes and improves write performance;
2. When a query request for the data shard is received, the query result is obtained from the corresponding nodes in ascending order of query load index. Dynamically scheduling user query requests in this way keeps query response latency low and steady, which effectively improves system query performance. For example, for a query that returns within 1 second, user query performance does not degrade when the number of queries grows tenfold or more: results still return within 1 second, and system query performance is more stable with very little fluctuation;
3. Because data shard replicas and query nodes are configured sensibly according to actual query demand, the stability of query performance is effectively ensured and the utilization of storage resources is improved;
4. The number of replicas that will need to be stored in the future is predicted from the historical query request data using an ARIMA time series model, and the replica count is adjusted dynamically to keep user query performance smooth;
5. Because multiple replicas are stored, the system as a whole has high fault tolerance and reliability.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for the description of the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of the high-concurrency data query method provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of the high-concurrency data query apparatus provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of the high-concurrency data query system provided by an embodiment of the present invention;
Fig. 4 is a flowchart of replica writing provided by a preferred embodiment;
Fig. 5 is an illustration of the replica count calculation and replica writing process provided by a preferred embodiment;
Fig. 6 is an illustration of the data query request process provided by a preferred embodiment;
Fig. 7 is an illustration of the replica dynamic adjustment process provided by a preferred embodiment;
Fig. 8 is a schematic structural diagram of the high-concurrency data query device provided by an embodiment of the present invention.
Detailed description
To make the objectives, technical solutions, and advantages of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.
It should be noted that the terms "first" and "second" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance or implicitly indicating the number of the technical features referred to. A feature qualified by "first" or "second" may therefore explicitly or implicitly include one or more such features. In the description of the present invention, "plurality" means two or more, unless specifically defined otherwise.
The high-concurrency data query method, apparatus, system, device, and computer-readable storage medium provided by the embodiments of the present invention obtain the query concurrency from the historical query data of a data shard and then calculate the write replica count from it, so the number of replicas to write is derived from the shard's typical query demand; the replicas are then written to the corresponding nodes in ascending order of write load index. The historical query data here is preferably average-QPS statistics. This distributed write strategy avoids single-point pressure during data writes and improves write performance. In addition, when a query request for the data shard is received, the query result is obtained from the corresponding nodes in ascending order of query load index, which keeps query response time both short and steady and effectively improves system query performance. Because data shard replicas and query nodes are configured sensibly according to actual query demand, the stability of query performance is effectively ensured, the utilization of storage resources is improved, and the system as a whole has high fault tolerance and reliability. The high-concurrency data query scheme provided by the embodiments of the present invention can therefore be applied widely to big data scenarios that involve high-concurrency data queries.
The high-concurrency data query method, apparatus, system, device, and readable storage medium provided by the embodiments of the present invention are described in detail below with reference to specific embodiments and the drawings.
Fig. 1 is a flowchart of the high-concurrency data query method provided by an embodiment of the present invention. As shown in Fig. 1, the method comprises the following steps:
101. Obtain the query concurrency from the historical query data of the data shard.
Specifically, the concurrency with which users query the data shard is calculated from the shard's historical query data.
It is worth noting that, besides the manner described above, step 101 (obtaining the query concurrency from the historical query data of the data shard) can also be implemented in other ways; the embodiments of the present invention do not limit the specific manner.
102. Calculate the write replica count from the query concurrency.
Specifically, the write replica count is calculated from the query concurrency and the concurrency supported by each replica, using the following formula:
replica count = query concurrency / concurrency supported per replica
The number of replicas to store is calculated from the query concurrency, so the replica count reflects the concurrency that can be supported. Depending on business needs, different data sets can support different levels of concurrency, which avoids wasting storage resources. By adding machines (nodes) and storing an identical replica on each machine, the query concurrency the system can support scales linearly.
It is worth noting that, besides the manner described above, step 102 (calculating the write replica count from the query concurrency) can also be implemented in other ways; the embodiments of the present invention do not limit the specific manner.
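The replica count formula of step 102 can be sketched in a few lines of Python. This is only an illustration: the function name `write_replica_count` is hypothetical, and rounding up to a whole replica (with a minimum of one) is an assumption, since the text gives only the plain division.

```python
import math


def write_replica_count(query_concurrency: float, per_replica_concurrency: float) -> int:
    """replica count = query concurrency / concurrency supported per replica."""
    if per_replica_concurrency <= 0:
        raise ValueError("per_replica_concurrency must be positive")
    # Assumption (not stated in the text): round up so capacity is never
    # below demand, and always keep at least one replica.
    return max(1, math.ceil(query_concurrency / per_replica_concurrency))


print(write_replica_count(8, 2))  # 4
```

With a query concurrency of 9 and the same per-replica capacity, the sketch returns 5 rather than 4.5; a deployment could equally choose to round down and accept some queueing.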
103. According to the write replica count, write the replicas to the corresponding nodes in ascending order of write load index.
Specifically, the write load index of every node is calculated from each node's current I/O utilization, CPU utilization, and the first preset weight rule, and the replicas, as many as the write replica count, are then written to the corresponding nodes in ascending order of write load index. Here, the first preset weight rule is the weight distribution between I/O utilization and CPU utilization used when calculating the write load index, for example I/O utilization weight : CPU utilization weight = 2:8, with I/O utilization weight + CPU utilization weight = 1; it can be set according to actual needs. The write load index can be calculated with the following formula:
write load index = I/O utilization × I/O utilization weight + CPU utilization × CPU utilization weight
The data shard is loaded onto as many nodes as the write replica count, forming that many replicas of the shard; the content of every replica is identical on each node.
It is worth noting that, besides the manner described above, step 103 (writing the replicas to the corresponding nodes in ascending order of write load index according to the write replica count) can also be implemented in other ways; the embodiments of the present invention do not limit the specific manner.
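A minimal sketch of step 103, assuming node statistics are already available as fractions between 0 and 1. The names `NodeStats`, `write_load_index`, and `pick_write_nodes` are hypothetical, and the default 2:8 weighting is the example weighting mentioned in the text.

```python
from dataclasses import dataclass


@dataclass
class NodeStats:
    name: str
    io_util: float   # current I/O utilization, 0.0 to 1.0
    cpu_util: float  # current CPU utilization, 0.0 to 1.0


def write_load_index(node: NodeStats, io_weight: float = 0.2, cpu_weight: float = 0.8) -> float:
    """write load index = I/O utilization * I/O weight + CPU utilization * CPU weight."""
    if abs(io_weight + cpu_weight - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return node.io_util * io_weight + node.cpu_util * cpu_weight


def pick_write_nodes(nodes, replica_count, io_weight=0.2, cpu_weight=0.8):
    """Return the replica_count nodes with the lowest write load index."""
    ranked = sorted(nodes, key=lambda n: write_load_index(n, io_weight, cpu_weight))
    return ranked[:replica_count]


nodes = [
    NodeStats("n1", io_util=0.6, cpu_util=0.5),
    NodeStats("n2", io_util=0.1, cpu_util=0.2),
    NodeStats("n3", io_util=0.9, cpu_util=0.9),
]
print([n.name for n in pick_write_nodes(nodes, 2)])  # ['n2', 'n1']
```

Sorting once and slicing keeps the selection deterministic for a snapshot of the statistics; a real loader would likely refresh the statistics between writes.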
104. In response to a query request for the data shard, obtain the query result from the corresponding nodes in ascending order of query load index.
Specifically, the query load index of every node is calculated from each node's current I/O utilization, CPU utilization, network load utilization, and the second preset weight rule, and the query request is then sent to the corresponding node in ascending order of query load index. Here, the second preset weight rule is the weight distribution among I/O utilization, CPU utilization, and network load utilization used when calculating the query load index, for example I/O utilization weight : CPU utilization weight : network load utilization weight = 4:4:2, with I/O utilization weight + CPU utilization weight + network load utilization weight = 1; it can be set according to actual needs. The query load index can be calculated with the following formula:
query load index = I/O utilization × I/O utilization weight + CPU utilization × CPU utilization weight + network load utilization × network load utilization weight
As the query flow above shows, while one node is handling a query request the load on the other nodes does not increase, so they can accept other query requests. This effectively improves system query performance and keeps query response time steady.
It is worth noting that, besides the manner described above, step 104 (obtaining the query result from the corresponding nodes in ascending order of query load index in response to a query request for the data shard) can also be implemented in other ways; the embodiments of the present invention do not limit the specific manner.
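The query load index of step 104 is a three-term weighted sum. The sketch below expresses it over dictionaries so that the weight rule stays configurable; the metric keys `io`, `cpu`, and `net` and both function names are illustrative assumptions, and the 4:4:2 weighting is the example weighting from the text.

```python
def query_load_index(metrics: dict, weights: dict) -> float:
    """Weighted sum of utilization metrics; the weights must sum to 1."""
    if set(metrics) != set(weights):
        raise ValueError("metrics and weights must use the same keys")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(metrics[key] * weights[key] for key in weights)


# Example second preset weight rule, I/O : CPU : network = 4:4:2.
WEIGHTS = {"io": 0.4, "cpu": 0.4, "net": 0.2}


def pick_query_node(cluster: dict, weights: dict = WEIGHTS) -> str:
    """Return the name of the node with the lowest query load index."""
    return min(cluster, key=lambda name: query_load_index(cluster[name], weights))


cluster = {
    "node1": {"io": 0.6, "cpu": 0.5, "net": 0.4},
    "node2": {"io": 0.2, "cpu": 0.3, "net": 0.1},
}
print(pick_query_node(cluster))  # node2
```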
Optionally, the above high-concurrency data query method further includes a replica dynamic adjustment step:
From the historical query data of the data shard, an ARIMA time series model is used to predict the number of replicas that will need to be added in the future, and the replicas are adjusted dynamically according to that number, to keep user query performance smooth.
It is worth noting that, besides the manner described above, the replica dynamic adjustment step can also be implemented in other ways; the embodiments of the present invention do not limit the specific manner.
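To make the replica dynamic adjustment step concrete, the sketch below uses a deliberately small stand-in for the ARIMA model: an ARIMA(1,1,0) without a constant term, whose single autoregressive coefficient is estimated by least squares on the first differences. The text does not specify the model order or fitting method, so both are assumptions, as are the function names; a production system would more likely use a statistics library's ARIMA implementation. The forecast value is treated as the future query concurrency.

```python
import math


def forecast_next_arima_110(series):
    """One-step-ahead forecast from a minimal ARIMA(1,1,0) without constant."""
    if len(series) < 3:
        raise ValueError("need at least 3 observations")
    # d = 1: work on first differences.
    diffs = [b - a for a, b in zip(series, series[1:])]
    # AR(1) coefficient on the differences, estimated by least squares.
    num = sum(cur * prev for cur, prev in zip(diffs[1:], diffs[:-1]))
    den = sum(prev * prev for prev in diffs[:-1])
    phi = num / den if den else 0.0
    # Undo the differencing to get the next level.
    return series[-1] + phi * diffs[-1]


def extra_replicas_needed(qps_history, per_replica_concurrency, current_replicas):
    """Replicas to add so that capacity covers the forecast concurrency."""
    if per_replica_concurrency <= 0:
        raise ValueError("per_replica_concurrency must be positive")
    predicted = max(0.0, forecast_next_arima_110(qps_history))
    required = math.ceil(predicted / per_replica_concurrency)
    return max(0, required - current_replicas)


# Concurrency has grown by 2 per period; 4 replicas each support 2 queries.
print(extra_replicas_needed([2, 4, 6, 8], per_replica_concurrency=2, current_replicas=4))  # 1
```

In the usage line the differences are constant, so the forecast is 10 and one additional replica is needed.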
Fig. 2 is a schematic structural diagram of the high-concurrency data query apparatus provided by an embodiment of the present invention. As shown in Fig. 2, the high-concurrency data query apparatus 2 includes a query concurrency obtaining module 21, a write replica count calculation module 22, a replica writing module 23, and a query result obtaining module 24.
The query concurrency obtaining module 21 is configured to obtain the query concurrency from the historical query data of the data shard.
The write replica count calculation module 22 is configured to calculate the write replica count from the query concurrency. Specifically, the write replica count calculation module 22 is configured to: calculate the write replica count from the query concurrency and the concurrency supported by each replica.
The replica writing module 23 is configured to write replicas, according to the write replica count, to the corresponding nodes in ascending order of write load index. Specifically, the replica writing module 23 includes a first calculation submodule 231, which is configured to: calculate the write load index of every node from each node's current I/O utilization, CPU utilization, and the first preset weight rule.
The query result obtaining module 24 is configured to obtain, in response to a query request for the data shard, the query result from the corresponding nodes in ascending order of query load index. Specifically, the query result obtaining module 24 includes a second calculation submodule 241, which is configured to: calculate the query load index of every node from each node's current I/O utilization, CPU utilization, network load utilization, and the second preset weight rule.
Optionally, the above high-concurrency data query apparatus further includes a replica dynamic adjustment module 25, which is specifically configured to: predict, from the historical query data of the data shard and using an ARIMA time series model, the number of replicas that will need to be added in the future, and dynamically adjust the replicas according to that number.
Fig. 3 is a schematic structural diagram of the high-concurrency data query system provided by an embodiment of the present invention. As shown in Fig. 3, the system includes a data replica manager 31, a data loader 32, a query load balancer 33, and multiple nodes, wherein: the data replica manager 31 is configured to obtain the query concurrency from the historical query data of the data shard and to calculate the write replica count from the query concurrency; the data loader 32 is configured to write replicas, according to the write replica count, to the corresponding nodes in ascending order of write load index; and the query load balancer 33 is configured to obtain, in response to a query request for the data shard, the query result from the corresponding nodes in ascending order of query load index.
Optionally, the data replica manager 31 is further configured to: predict, from the historical query data of the data shard and using an ARIMA time series model, the number of replicas that will need to be added in the future, and dynamically adjust the replicas according to that number.
Optionally, the query load balancer 33 returns the query result to the user. If the query involves multiple data shards, the results from the individual shards are merged first, and the merged result is returned to the user after the merge completes.
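The merge performed by the query load balancer for multi-shard queries is not specified further. As one possible illustration, the sketch below assumes each shard returns additive partial aggregates (for example counts or sums keyed by group) and merges them by summing per key; the function name and the result shape are assumptions.

```python
from collections import Counter


def merge_shard_results(partials):
    """Merge per-shard partial aggregates by summing the values of equal keys."""
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds values instead of replacing them
    return dict(merged)


shard_a = {"beijing": 3, "nanjing": 5}
shard_b = {"nanjing": 2, "shanghai": 4}
print(merge_shard_results([shard_a, shard_b]))
```

Non-additive aggregates such as averages or distinct counts would need a different merge rule, for example carrying sum and count separately.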
The high-concurrency data query method and apparatus provided by the embodiments of the present invention are further explained below with reference to a preferred embodiment.
Fig. 4 is a flowchart of replica writing provided by the preferred embodiment. As shown in Fig. 4, the replica writing process mainly includes the following steps:
1. The data replica manager 31 calculates the user query concurrency from the average-QPS statistics of queries on data shard A, and calculates the number of replicas to write from that concurrency, using the following formula:
replica count = query concurrency / concurrency supported per replica
For example, if the query concurrency is 8 and each replica supports a concurrency of 2, the number of replicas to write is 4, and the data loader 32 prepares to write 4 replicas of data shard A. The calculation is:
8 / 2 = 4
2. The data loader 32 selects the node with the lowest load as the first node to write the replica to. Node load is assessed along two dimensions, I/O and CPU, with an I/O utilization weight of 30% and a CPU utilization weight of 70%.
For example, with an I/O utilization of 60% and a CPU utilization of 50%, the write load index of the node is:
60% × 0.3 + 50% × 0.7 = 53%
3. The data loader 32 writes the data to the first node. If the write fails because of some abnormal condition on the first node, the process returns to step 2; if the write succeeds, it proceeds to the next step.
4. After the write to the first node completes, the first node writes the replica to the other three nodes. If a write to one of those nodes fails, another node is tried instead.
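Steps 2 and 3 of the write flow form a select, write, retry loop. The sketch below shows one way to express it; `write_first_replica`, `load_of`, and `write_to` are hypothetical names, and excluding a node after a failed write (so that returning to step 2 does not pick the same node again) is an assumption the text does not spell out.

```python
def write_first_replica(nodes, load_of, write_to):
    """Pick the least-loaded node, write to it, and re-select on failure.

    nodes:    iterable of node names
    load_of:  callable, node name -> current write load index
    write_to: callable, node name -> True on success (may raise OSError)
    """
    failed = set()
    while True:
        candidates = [n for n in nodes if n not in failed]
        if not candidates:
            raise RuntimeError("every node rejected the first replica")
        target = min(candidates, key=load_of)  # step 2: lowest load first
        try:
            ok = write_to(target)              # step 3: attempt the write
        except OSError:
            ok = False
        if ok:
            return target                      # continue with step 4
        failed.add(target)                     # back to step 2


loads = {"n1": 0.53, "n2": 0.20, "n3": 0.70}
# Simulate an abnormal condition on n2, the least-loaded node.
print(write_first_replica(list(loads), loads.get, lambda n: n != "n2"))  # n1
```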
Fig. 5 is an illustration of the replica count calculation and replica writing process provided by the preferred embodiment.
As shown in Fig. 5, the data flow is illustrated with the example of writing 4 data replicas A1, A2, A3, and A4. The replica count calculation and replica writing process mainly includes the following steps:
1. The data replica manager 31 calculates the replica count;
2. The data loader 32 writes the data to the first node;
3. The first node distributes the data to the remaining three nodes. Each of those nodes, once its write succeeds, sends a success message to the first node, and the first node then reports the successful write back to the data loader 32.
If more replicas need to be written, the other three nodes continue distributing the data to further nodes. This distributed replication strategy avoids single-point pressure during data writes and improves write performance.
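The distribution pattern above, in which the first node forwards to three nodes and those nodes forward onward when more replicas are needed, can be sketched as a breadth-first fan-out plan. The function name `distribution_plan` is hypothetical, and applying the same fan-out of 3 at every level is an assumption; the text only describes the first level explicitly.

```python
from collections import deque


def distribution_plan(nodes, fan_out=3):
    """Return (sender, receiver) pairs; nodes[0] is the first node written."""
    if fan_out < 1:
        raise ValueError("fan_out must be at least 1")
    plan = []
    senders = deque(nodes[:1])   # nodes that already hold the data
    pending = deque(nodes[1:])   # nodes still waiting for a replica
    while pending and senders:
        sender = senders.popleft()
        for _ in range(fan_out):
            if not pending:
                break
            receiver = pending.popleft()
            plan.append((sender, receiver))
            senders.append(receiver)  # a receiver can forward in a later round
    return plan


print(distribution_plan(["A1", "A2", "A3", "A4"]))
# [('A1', 'A2'), ('A1', 'A3'), ('A1', 'A4')]
```

With six nodes the plan adds a second level, in which the first receiver forwards to the two remaining nodes, so the data loader itself still performs only one write.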
Fig. 6 is a schematic diagram of the data query request process provided by the preferred embodiment. As shown in Fig. 6, the data query request process mainly includes the following steps:
1. A user initiates a request to query data fragment A, and the query load balancer 33 receives the query request.
2. Based on the load of the nodes that store data fragment A, the query load balancer 33 selects the node with the lightest current load. It preferentially selects a node that currently has no user queries; if all nodes have user queries, it selects the node with the lower load; otherwise it reports that the system is at full capacity and asks the user to try again later. For example, if node 3 is found to be the most lightly loaded, it is selected as the query node. Node load is assessed along three dimensions, namely I/O, CPU and network load: I/O utilization has a weight of 30%, CPU utilization a weight of 50%, and network load utilization a weight of 20%.
For example, with an I/O utilization of 60%, a CPU utilization of 50% and a network utilization of 40%, the query load index of the node is:
60% × 0.3 + 50% × 0.5 + 40% × 0.2 = 51%
3. Node 3 retrieves the data of data fragment A according to the query request and returns the search result to the query load balancer 33. If node 3 encounters an exception while querying the data, the process jumps back to step 2.
4. After receiving the result of the query request, the query load balancer 33 responds to the user, and the query process ends.
As the above query process shows, while node 3 is handling a query request, the load on the other nodes does not increase and they can accept other query requests. This effectively improves the concurrency of the system and keeps the query response time stable.
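The load index and the node selection rule in step 2 can be sketched as below. The weights 0.3, 0.5 and 0.2 come from the text; the `NodeLoad` type, the `active_queries` counter and the `full_threshold` value used to decide that the system is at full capacity are assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class NodeLoad:
    name: str
    io: float   # I/O utilization, 0.0 to 1.0
    cpu: float  # CPU utilization, 0.0 to 1.0
    net: float  # network load utilization, 0.0 to 1.0
    active_queries: int = 0  # hypothetical counter of in-flight user queries


def query_load_index(node: NodeLoad, w_io: float = 0.3,
                     w_cpu: float = 0.5, w_net: float = 0.2) -> float:
    """Weighted sum of I/O, CPU and network utilization."""
    return node.io * w_io + node.cpu * w_cpu + node.net * w_net


def pick_query_node(nodes: List[NodeLoad],
                    full_threshold: float = 1.0) -> Optional[NodeLoad]:
    """Prefer nodes with no user queries; otherwise take the lowest index.

    Returns None when even the best candidate reaches `full_threshold`
    (an assumed cut-off), meaning the caller should ask the user to retry.
    """
    if not nodes:
        return None
    idle = [n for n in nodes if n.active_queries == 0]
    candidates = idle or nodes
    best = min(candidates, key=query_load_index)
    if query_load_index(best) >= full_threshold:
        return None
    return best
```

For the worked example in the text, `query_load_index(NodeLoad("node3", 0.6, 0.5, 0.4))` evaluates to approximately 0.51, matching the 51% figure.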
Fig. 7 is a schematic diagram of the dynamic copy adjustment process provided by the preferred embodiment. As shown in Fig. 7, the dynamic copy adjustment process mainly includes the following steps:
1. Based on statistics of the average query QPS, the data copy manager 31 predicts the trend of future queries using an ARIMA time series model; assume the calculation shows that one more copy will be needed.
2. A node with a lighter current load is selected, for example node 4, and the data copy manager 31 notifies that node to copy the data copy to a new node.
3. Node 4 copies the data copy to the new node 5, and copy 5 is saved on node 5.
4. The query load balancer 33 detects the newly added copy 5; when a new query request arrives, it can be dispatched to node 5 for querying according to the load scheduling policy.
As the above process shows, the number of copies can be adjusted dynamically according to the historical trend of queries: when query concurrency shows an upward trend, the number of copies is increased, and when it shows a downward trend, the number of copies is reduced. This effectively ensures the stability of query performance while also improving the utilization of storage resources.
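The adjustment decision can be sketched as a small calculation on top of the forecast. The sketch below does not implement ARIMA; it takes the forecast query concurrency as an input that is assumed to come from such a model. The ceiling-division formula, the function name `copy_adjustment` and the `min_copies` floor are illustrative assumptions, based only on the statement that the copy count is derived from the query concurrency and the concurrency each copy supports.

```python
import math


def copy_adjustment(current_copies: int,
                    forecast_concurrency: float,
                    per_copy_concurrency: float,
                    min_copies: int = 1) -> int:
    """Return how many copies to add (positive) or remove (negative).

    `forecast_concurrency` is assumed to be produced elsewhere, for example
    by an ARIMA model fitted on historical query QPS.
    """
    if per_copy_concurrency <= 0:
        raise ValueError("per_copy_concurrency must be positive")
    # Assumed rule: enough copies to cover the forecast, never below the floor.
    target = max(min_copies,
                 math.ceil(forecast_concurrency / per_copy_concurrency))
    return target - current_copies
```

With four existing copies, a forecast concurrency of 450 and 100 concurrent queries supported per copy, the function returns 1, which corresponds to the "one more copy" case in step 1.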
Fig. 8 is a schematic structural diagram of the high-concurrency data query equipment provided by an embodiment of the present invention. As shown in Fig. 8, the high-concurrency data query equipment 4 provided by the embodiment of the present invention comprises:
Processor 41;
Memory 42, for storing instructions executable by the processor 41;
Wherein the processor 41 is configured to execute, via the executable instructions, the steps of the high-concurrency data query method described in any of the above schemes.
In addition, an embodiment of the present invention also provides a computer-readable storage medium that stores a computer program; when the computer program is executed by a processor, the steps of the high-concurrency data query method described in any of the above schemes are implemented.
It should be noted that when the high-concurrency data query apparatus and the high-concurrency data query equipment provided by the above embodiments perform the high-concurrency data query service, the division into the functional modules described above is only an example. In practical applications, the above functions can be assigned to different functional modules as needed; that is, the internal structure of the apparatus or equipment can be divided into different functional modules to complete all or part of the functions described above. In addition, the embodiments of the high-concurrency data query method, apparatus, equipment and computer-readable storage medium provided above belong to the same concept and are not described again here.
All of the above optional technical solutions can be combined in any way to form optional embodiments of the present invention, which are not described one by one here.
In conclusion high concurrent data query method, apparatus, system, equipment and computer provided in an embodiment of the present invention Readable storage medium storing program for executing has the advantages that compared with prior art
1, inquiry number of concurrent is obtained according to the enquiry of historical data of data fragmentation, is calculated further according to the inquiry number of concurrent To write-in number of copies, copy amount is written needed for capable of knowing according to the general inquiry demand of data fragmentation, then according to write-in Write-in copy is written to corresponding node respectively from low to high by load factor, is used such distribution write-in policy first, is avoided The single point pressure of data write-in, improves write performance,
2, when receiving the inquiry request of data fragmentation, respective nodes are arrived from low to high according to query load index and are obtained Query result, the inquiry request of dynamic dispatching user, the response delay of effective guarantee inquiry, i.e., when shorter inquiry response It is long, it is ensured that the response time held stationary of inquiry effectively improves system queries performance, such as: what is returned the result in 1 second looks into It askes, in the case where 10 times or more inquiries, the query performance of user is not reduced, and is still in 1 second and is returned the result, is The query performance of system is more stable, fluctuates very little;
3, have again due to the reasonable disposition according to the progress data fragmentation copy configuration of actual queries demand and query node Effect has ensured the stability of query performance, while also improving the utilization rate of storage resource;
4, according to the historical data of inquiry request, the following copy for needing to save is predicted according to ARIMA time series models Quantity, dynamic adjust the quantity of copy, keep the flatness of user query performance;
5, whole that there is sufficiently high fault-tolerance, reliability due to being more copy storages.
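The write placement described in effect 1 can be sketched as ranking nodes by write load factor and taking the lowest ones. The equal weights of 0.5 for I/O and CPU are placeholders chosen for this illustration; the text refers only to a "first preset weight rule" without giving its values here. The function names are likewise hypothetical.

```python
from typing import Dict, List, Tuple


def write_load_factor(io: float, cpu: float,
                      w_io: float = 0.5, w_cpu: float = 0.5) -> float:
    """Weighted sum of I/O and CPU utilization (weights are placeholders)."""
    return io * w_io + cpu * w_cpu


def choose_write_nodes(loads: Dict[str, Tuple[float, float]],
                       copies: int) -> List[str]:
    """Pick `copies` nodes in ascending order of write load factor.

    `loads` maps a node name to its (I/O utilization, CPU utilization).
    Ties are broken by node name so the result is deterministic.
    """
    if copies > len(loads):
        raise ValueError("not enough nodes for the requested number of copies")
    ranked = sorted(loads,
                    key=lambda name: (write_load_factor(*loads[name]), name))
    return ranked[:copies]
```

For instance, with loads of (0.8, 0.6), (0.2, 0.4), (0.5, 0.5) and (0.1, 0.1) on nodes n1 to n4, three copies would be placed on n4, n2 and n3 in that order.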
Those of ordinary skill in the art will understand that all or part of the steps of the above embodiments can be implemented by hardware, or by a program instructing the relevant hardware. The program can be stored in a computer-readable storage medium, which may be a read-only memory, a magnetic disk, an optical disc, or the like.
The embodiments of the present application are described with reference to flowcharts and/or block diagrams of methods, equipment (systems) and computer program products according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, such that the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
Although preferred embodiments of the embodiments of the present application have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of the present application.
Obviously, those skilled in the art can make various changes and variations to the present invention without departing from its spirit and scope. Thus, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their technical equivalents, the present invention is also intended to include them.
The foregoing are merely preferred embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall be included within the scope of protection of the present invention.
Content not described in detail in this specification belongs to the prior art well known to those skilled in the art.

Claims (14)

1. A high-concurrency data query method, characterized in that the method comprises:
obtaining a query concurrency from historical query data of a data fragment;
calculating a number of copies to write according to the query concurrency;
writing, according to the number of copies to write, the copies to corresponding nodes in ascending order of write load factor;
obtaining, according to a query request for the data fragment, a query result from a corresponding node in ascending order of query load index.
2. The method according to claim 1, characterized in that calculating the number of copies to write according to the query concurrency comprises:
calculating the number of copies to write according to the query concurrency and the concurrency supported by each copy.
3. The method according to claim 1, characterized in that writing, according to the number of copies to write, the copies to corresponding nodes in ascending order of write load factor comprises:
calculating the write load factors of all nodes according to the current I/O utilization and CPU utilization of each node and a first preset weight rule.
4. The method according to claim 1, characterized in that obtaining, according to a query request for the data fragment, a query result from a corresponding node in ascending order of query load index comprises:
calculating the query load indexes of all nodes according to the current I/O utilization, CPU utilization and network load utilization of each node and a second preset weight rule.
5. The method according to any one of claims 1 to 4, characterized in that the method further comprises:
predicting, according to the historical query data of the data fragment and using an ARIMA time series model, the number of additional copies that will be needed in the future, and dynamically adjusting the copies according to that number of additional copies.
6. A high-concurrency data query apparatus, characterized in that the apparatus comprises:
a query concurrency obtaining module, for obtaining a query concurrency from historical query data of a data fragment;
a write copy number calculation module, for calculating a number of copies to write according to the query concurrency;
a copy writing module, for writing, according to the number of copies to write, the copies to corresponding nodes in ascending order of write load factor;
a query result obtaining module, for obtaining, according to a query request for the data fragment, a query result from a corresponding node in ascending order of query load index.
7. The apparatus according to claim 6, characterized in that the write copy number calculation module is used for:
calculating the number of copies to write according to the query concurrency and the concurrency supported by each copy.
8. The apparatus according to claim 6, characterized in that the copy writing module comprises a first calculation submodule, the first calculation submodule being used for:
calculating the write load factors of all nodes according to the current I/O utilization and CPU utilization of each node and a first preset weight rule.
9. The apparatus according to claim 6, characterized in that the query result obtaining module comprises a second calculation submodule, the second calculation submodule being used for:
calculating the query load indexes of all nodes according to the current I/O utilization, CPU utilization and network load utilization of each node and a second preset weight rule.
10. The apparatus according to any one of claims 6 to 9, characterized in that the apparatus further comprises a dynamic copy adjustment module, used for:
predicting, according to the historical query data of the data fragment and using an ARIMA time series model, the number of additional copies that will be needed in the future, and dynamically adjusting the copies according to that number of additional copies.
11. A high-concurrency data query system, characterized in that the system comprises a data copy manager, a data loader, a query load balancer and multiple nodes, wherein
the data copy manager is used for obtaining a query concurrency from historical query data of a data fragment, and calculating a number of copies to write according to the query concurrency;
the data loader is used for writing, according to the number of copies to write, the copies to corresponding nodes in ascending order of write load factor;
the query load balancer is used for obtaining, according to a query request for the data fragment, a query result from a corresponding node in ascending order of query load index.
12. The system according to claim 11, characterized in that the data copy manager is further used for: predicting, according to the historical query data of the data fragment and using an ARIMA time series model, the number of additional copies that will be needed in the future, and dynamically adjusting the copies according to that number of additional copies.
13. A high-concurrency data query equipment, characterized by comprising:
a processor;
a memory, for storing instructions executable by the processor;
wherein the processor is configured to execute, via the executable instructions, the steps of the high-concurrency data query method according to any one of claims 1 to 5.
14. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program, and the computer program, when executed by a processor, implements the steps of the high-concurrency data query method according to any one of claims 1 to 5.
CN201910389924.8A 2019-05-10 2019-05-10 High concurrent data query method, apparatus, system, equipment and readable storage medium storing program for executing Pending CN110209693A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910389924.8A CN110209693A (en) 2019-05-10 2019-05-10 High concurrent data query method, apparatus, system, equipment and readable storage medium storing program for executing

Publications (1)

Publication Number Publication Date
CN110209693A true CN110209693A (en) 2019-09-06

Family

ID=67785793

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910389924.8A Pending CN110209693A (en) 2019-05-10 2019-05-10 High concurrent data query method, apparatus, system, equipment and readable storage medium storing program for executing

Country Status (1)

Country Link
CN (1) CN110209693A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105488180A (en) * 2015-11-30 2016-04-13 中国建设银行股份有限公司 Data storing method and system
CN108363643A (en) * 2018-03-27 2018-08-03 东北大学 A kind of HDFS copy management methods based on file access temperature
CN109032801A (en) * 2018-07-26 2018-12-18 郑州云海信息技术有限公司 A kind of request scheduling method, system and electronic equipment and storage medium
CN109522289A (en) * 2018-10-30 2019-03-26 咪咕文化科技有限公司 The storage processing method, apparatus and computer storage medium of copy
CN109697018A (en) * 2017-10-20 2019-04-30 北京京东尚科信息技术有限公司 The method and apparatus for adjusting memory node copy amount

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
李强: "云计算及其应用", 《云计算及其应用 *
王敏: "制造业大数据分布式存储管理方法研究", 《中国优秀硕士学位论文全文数据库信息科技辑》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114787791A (en) * 2019-11-25 2022-07-22 谷歌有限责任公司 Rule violation detection
CN114338725A (en) * 2021-12-31 2022-04-12 深圳市瑞云科技有限公司 Distributed storage scheduling method for improving large-scale cluster rendering upper limit
CN114338725B (en) * 2021-12-31 2024-01-30 深圳市瑞云科技有限公司 Distributed storage scheduling method for improving upper limit of large-scale cluster rendering
CN114398371A (en) * 2022-01-13 2022-04-26 九有技术(深圳)有限公司 Multi-copy fragmentation method, device, equipment and storage medium for database cluster system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190906