CN107239570A - Data processing method and server cluster - Google Patents
- Publication number
- CN107239570A (CN201710504831.6A)
- Authority
- CN
- China
- Prior art keywords
- data
- tables
- memory system
- distributed memory
- computational
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2455—Query execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2458—Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
- G06F16/2471—Distributed queries
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Computational Linguistics (AREA)
- Fuzzy Systems (AREA)
- Mathematical Physics (AREA)
- Probability & Statistics with Applications (AREA)
- Software Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The present disclosure provides a data processing method applied to a cluster computing process that includes at least two computing frameworks. The method includes: when a first computing framework performs an operation on a first data table, determining whether data information corresponding to the first data table exists in a distributed memory system; and, if it does not exist, performing the operation on the first data table and synchronizing the data information corresponding to the first data table to the distributed memory system. The present disclosure also provides a server cluster.
Description
Technical field
The present disclosure relates to a data processing method and a server cluster.
Background
In large-scale computing clusters, it is common for different computing frameworks to load the same data table. For example, a cluster may run several computing frameworks at once, such as Impala, Hive, and SparkSQL, and different computing tasks may each need to load the data of the same table. Each computing engine loads the table on its own, so the same data is in fact loaded repeatedly. This causes a large number of disk read/write operations and poor overall performance.
Summary of the invention
One aspect of the present disclosure provides a data processing method applied to a cluster computing process that includes at least two computing frameworks. The method includes: when a first computing framework performs an operation on a first data table, determining whether data information corresponding to the first data table exists in a distributed memory system; and, if it does not exist, performing the operation on the first data table and synchronizing the data information corresponding to the first data table to the distributed memory system.
Optionally, the method includes: if the data information corresponding to the first data table exists in the distributed memory system, the first computing framework loads the data information corresponding to the first data table from the distributed memory system.
Optionally, synchronizing the data information corresponding to the first data table to the distributed memory system includes: determining whether the first data table is a data table that may be shared by different computing frameworks, where the data tables that may be shared by different computing frameworks are determined from statistics on data table query plans; and, if it is, synchronizing the data information corresponding to the first data table to the distributed memory system.
Optionally, performing the operation on the first data table includes running a query execution plan for the first data table.
Optionally, determining whether the data information corresponding to the first data table exists in the distributed memory system includes: the first computing framework obtaining the data information corresponding to data tables that other computing frameworks have synchronized to the distributed memory system, where the storage class of each of the at least two computing frameworks has been extended; and determining whether the data information corresponding to the first data table exists in the distributed memory system.
Another aspect of the present disclosure provides a server cluster that includes at least one processor and at least one memory. A computer-readable program is stored in the memory. When the program is executed by the at least one processor, it causes the at least one processor to: when a first computing framework performs an operation on a first data table, determine whether data information corresponding to the first data table exists in a distributed memory system; and, when the first data table does not exist in the distributed memory system, perform the operation on the first data table and synchronize the first data table to the distributed memory system.
Optionally, the at least one processor further performs: when the data information corresponding to the first data table exists in the distributed memory system, causing the first computing framework to load the data information corresponding to the data in the first data table from the distributed memory system.
Optionally, the at least one processor synchronizing the data information corresponding to the first data table to the distributed memory system includes: determining whether the first data table is a data table that may be shared by different computing frameworks, where the data tables that may be shared by different computing frameworks are determined from statistics on data table query plans; and, if the first data table is such a table, synchronizing the data information corresponding to the first data table to the distributed memory system.
Optionally, performing the operation on the first data table includes running a query execution plan for the first data table.
Optionally, the at least one processor determining whether the data information corresponding to the first data table exists in the distributed memory system includes: the first computing framework obtaining the data information corresponding to data tables that other computing frameworks have synchronized to the distributed memory system, where the storage class of each of the at least two computing frameworks has been extended; and determining whether the data information corresponding to the first data table exists in the distributed memory system.
Another aspect of the present disclosure provides a data processing system that includes a determining module and a synchronization module. The determining module is configured to determine, when a first computing framework performs an operation on a first data table, whether data information corresponding to the first data table exists in a distributed memory system. The synchronization module is configured to, if it does not exist, perform the operation on the first data table and synchronize the data information corresponding to the first data table to the distributed memory system.
Optionally, the system further includes a loading module configured to, when the data information corresponding to the first data table exists in the distributed memory system, cause the first computing framework to load the data information corresponding to the first data table from the distributed memory system.
Optionally, the synchronization module includes a first determining submodule and a synchronization submodule. The first determining submodule is configured to determine whether the first data table is a data table that may be shared by different computing frameworks, where the data tables that may be shared by different computing frameworks are determined from statistics on data table query plans. The synchronization submodule is configured to, when the first data table is such a table, synchronize the data information corresponding to the first data table to the distributed memory system.
Optionally, performing the operation on the first data table includes running a query execution plan for the first data table.
Optionally, the determining module includes an acquisition submodule and a second determining submodule. The acquisition submodule is configured to cause the first computing framework to obtain the data information corresponding to data tables that other computing frameworks have synchronized to the distributed memory system, where the storage class of each of the at least two computing frameworks has been extended. The second determining submodule is configured to determine whether the data information corresponding to the first data table exists in the distributed memory system.
Another aspect of the present disclosure provides a non-volatile storage medium storing computer-executable instructions which, when executed, implement the method described above.
Brief description of the drawings
For a more complete understanding of the present disclosure and its advantages, reference is now made to the following description taken in conjunction with the accompanying drawings, in which:
Fig. 1 schematically shows a server cluster according to an embodiment of the present disclosure;
Fig. 2 schematically shows a flowchart of the data processing method according to an embodiment of the present disclosure;
Fig. 3 schematically shows a flowchart of the data processing method according to an embodiment of the present disclosure;
Fig. 4 schematically shows a flowchart of synchronizing the data information corresponding to the first data table to the distributed memory system according to an embodiment of the present disclosure;
Fig. 5 schematically shows a flowchart of determining whether the data information corresponding to the first data table exists in the distributed memory system according to an embodiment of the present disclosure;
Fig. 6 schematically shows a data processing system according to an embodiment of the present disclosure;
Fig. 7 schematically shows a data processing system according to an embodiment of the present disclosure;
Fig. 8 schematically shows a data processing system according to an embodiment of the present disclosure;
Fig. 9 schematically shows the determining module according to an embodiment of the present disclosure; and
Fig. 10 schematically shows a block diagram of one compute node in the server cluster according to an embodiment of the present disclosure.
Detailed description
Embodiments of the present disclosure are described below with reference to the accompanying drawings. It should be understood, however, that these descriptions are merely exemplary and are not intended to limit the scope of the present disclosure. In addition, descriptions of well-known structures and techniques are omitted below to avoid unnecessarily obscuring the concepts of the present disclosure.
The terms used herein serve only to describe specific embodiments and are not intended to limit the present disclosure. The words "a", "an", and "the" used herein should also cover the meaning of "a plurality of" and "multiple kinds of", unless the context clearly indicates otherwise. In addition, the terms "include", "comprise", and the like used herein indicate the presence of the stated features, steps, operations, and/or components, but do not exclude the presence or addition of one or more other features, steps, operations, or components.
All terms used herein (including technical and scientific terms) have the meanings commonly understood by those skilled in the art, unless otherwise defined. It should be noted that the terms used herein should be interpreted as having meanings consistent with the context of this specification, and should not be interpreted in an idealized or overly rigid manner.
Some block diagrams and/or flowcharts are shown in the drawings. It should be understood that some blocks in the block diagrams and/or flowcharts, or combinations of them, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, so that the instructions, when executed by the processor, create means for implementing the functions/operations illustrated in the block diagrams and/or flowcharts.
Accordingly, the techniques of the present disclosure can be implemented in hardware and/or software (including firmware, microcode, and so on). In addition, the techniques of the present disclosure can take the form of a computer program product on a computer-readable medium storing instructions, where the computer program product can be used by, or in connection with, an instruction execution system. In the context of the present disclosure, a computer-readable medium can be any medium that can contain, store, communicate, propagate, or transport instructions. For example, a computer-readable medium can include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium. Specific examples of computer-readable media include: magnetic storage devices, such as magnetic tape or hard disk drives (HDD); optical storage devices, such as compact discs (CD-ROM); memory, such as random access memory (RAM) or flash memory; and/or wired/wireless communication links.
Because different data processing scenarios have different needs, multiple computing frameworks may coexist in the server cluster 100. For example, the Impala computing framework is preferred for interactive query scenarios that return small result sets, whereas the Hive computing framework is preferred for extract-transform-load (ETL) processes that handle larger data throughput. The data handled by the different computing frameworks is all stored in a distributed file system. Therefore, when operating on data, the different computing frameworks need to locate the data table through the same metadata and load it into memory before performing distributed parallel computation.
Embodiments of the present disclosure provide a data processing method and a server cluster to which the method can be applied. When a first computing framework performs an operation on a first data table, the method can synchronize the data information corresponding to the first data table to a distributed memory system, so that other computing frameworks operating on that data table can load it directly from the distributed memory system. This reduces disk read/write operations and improves the efficiency of queries on the same table across computing frameworks.
A distributed memory system is a system formed from the memory distributed across multiple machines; the data stored in it is periodically synchronized with the files on disk. Synchronizing data to the distributed memory system means loading the data from disk files into the memory of multiple machines and storing it there in shards.
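As a rough illustration of sharded storage, the following Python sketch splits the rows of a table across the memory of several machines. The round-robin policy and the names `shard_rows` and `num_nodes` are assumptions made for illustration only; the disclosure does not specify how shards are assigned.

```python
from typing import Dict, List, Tuple

Row = Tuple[int, str]  # one table row, simplified to (id, value)


def shard_rows(rows: List[Row], num_nodes: int) -> Dict[int, List[Row]]:
    """Distribute rows over num_nodes in-memory shards (hypothetical round-robin policy)."""
    shards: Dict[int, List[Row]] = {node: [] for node in range(num_nodes)}
    for index, row in enumerate(rows):
        # Row i goes to node i mod num_nodes, so shards stay roughly balanced.
        shards[index % num_nodes].append(row)
    return shards


rows = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e")]
shards = shard_rows(rows, num_nodes=3)
```

Each entry of `shards` stands for the part of the table held in one machine's memory; together the shards contain every row exactly once.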
Fig. 1 schematically shows a server cluster according to an embodiment of the present disclosure.
As shown in Fig. 1, the server cluster 100 may include at least one compute node 110 and a network 120. The network 120 is the medium that provides communication links between the compute nodes 110. The network 120 may include various connection types, such as wired links, wireless communication links, or fiber-optic cables. A compute node 110 may be a server that provides various services, for example a server that stores data tables or provides query and modification functions, but is not limited to these. The server cluster 100 can speed up the loading of data tables by means of the method of the embodiments of the present disclosure.
It should be understood that the architecture in Fig. 1 is only an example. The components included in a specific architecture can be adjusted as circumstances require, and there may be any number of networks and compute nodes depending on implementation needs.
The data processing method of the embodiments of the present disclosure is described below with reference to Fig. 2.
Fig. 2 schematically shows a flowchart of the data processing method according to an embodiment of the present disclosure.
As shown in Fig. 2, the method includes operation S210 and operation S220.
In operation S210, when the first computing framework performs an operation on the first data table, it is determined whether the data information corresponding to the first data table exists in the distributed memory system.
In operation S220, if it does not exist, the operation on the first data table is performed, and the data information corresponding to the first data table is synchronized to the distributed memory system.
When the first computing framework performs an operation on the first data table, the method can synchronize the data information corresponding to the first data table to the distributed memory system, so that other computing frameworks operating on that data table can load it directly from the distributed memory system, which reduces disk read/write operations.
According to an embodiment of the present disclosure, in operation S210, the first computing framework performing an operation on the first data table may be the first computing framework running a query execution plan for the first data table. The query execution plan is run before the actual computation in order to predict and optimize the query process. Therefore, while the query execution plan for the first data table is running, it is determined whether the data information corresponding to the first data table exists in the distributed memory system; if it does not, the data information corresponding to the first data table is synchronized to the distributed memory system, so that subsequent operations on the data of that table can load it from distributed memory.
According to an embodiment of the present disclosure, the operation that the first computing framework performs on the first data table may also be another operation on the first data table, such as adding, deleting, or modifying information in the data table.
In operation S220, according to an embodiment of the present disclosure, when the data information corresponding to the first data table does not exist in the distributed memory system, the operation of the first computing framework on the first data table is performed (for example, running the query execution plan for the first data table, or adding, deleting, or modifying information in the data table), and in addition the data information corresponding to the first data table is synchronized to the distributed memory system, so that other computing frameworks operating on that data table can load it directly from the distributed memory system.
Fig. 3 schematically shows a flowchart of the data processing method according to an embodiment of the present disclosure.
As shown in Fig. 3, the method includes operation S210, operation S220, and operation S230.
In operation S210, when the first computing framework performs an operation on the first data table, it is determined whether the data information corresponding to the first data table exists in the distributed memory system. See the operation described with reference to Fig. 2; the details are not repeated here.
In operation S220, if it does not exist, the operation on the first data table is performed, and the data information corresponding to the first data table is synchronized to the distributed memory system.
In operation S230, if the data information corresponding to the first data table exists in the distributed memory system, the first computing framework loads the data information corresponding to the first data table from the distributed memory system. For a data table already stored in the distributed memory system, the first computing framework no longer loads it from disk but loads it directly through the distributed memory system, which reduces disk read/write operations.
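The flow of operations S210 to S230 can be sketched in Python as follows. This is a minimal single-process sketch under stated assumptions: `DistributedMemory`, `Disk`, and `execute` are hypothetical names, a plain dictionary stands in for the distributed memory system, and "data information" is simplified to the table's rows.

```python
from typing import Any, Callable, Dict, List

Rows = List[dict]


class DistributedMemory:
    """Toy stand-in for the distributed memory system (table name -> rows)."""

    def __init__(self) -> None:
        self._tables: Dict[str, Rows] = {}

    def contains(self, table: str) -> bool:
        return table in self._tables

    def sync(self, table: str, rows: Rows) -> None:
        self._tables[table] = list(rows)

    def load(self, table: str) -> Rows:
        return list(self._tables[table])


class Disk:
    """Toy stand-in for disk files; counts how often a table is read."""

    def __init__(self, tables: Dict[str, Rows]) -> None:
        self._tables = tables
        self.reads = 0

    def read(self, table: str) -> Rows:
        self.reads += 1
        return list(self._tables[table])


def execute(table: str, operation: Callable[[Rows], Any],
            memory: DistributedMemory, disk: Disk) -> Any:
    if memory.contains(table):      # S210: is the table already in distributed memory?
        rows = memory.load(table)   # S230: load from distributed memory, not from disk
        return operation(rows)
    rows = disk.read(table)         # S220: load from disk and perform the operation ...
    result = operation(rows)
    memory.sync(table, rows)        # ... then synchronize the table for other frameworks
    return result


disk = Disk({"orders": [{"id": 1, "amount": 10}, {"id": 2, "amount": 5}]})
memory = DistributedMemory()
# A first framework sums the amounts; a second framework then counts the rows.
total = execute("orders", lambda rows: sum(r["amount"] for r in rows), memory, disk)
count = execute("orders", len, memory, disk)
```

In this sketch only the first call reads from disk; the second call, standing in for a different computing framework, is served from the shared in-memory copy.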
Fig. 4 schematically shows a flowchart of synchronizing the data information corresponding to the first data table to the distributed memory system in operation S220 according to an embodiment of the present disclosure.
As shown in Fig. 4, synchronizing the data information corresponding to the first data table to the distributed memory system includes operation S221 and operation S222.
In operation S221, it is determined whether the first data table is a data table that may be shared by different computing frameworks. According to an embodiment of the present disclosure, the data tables that may be shared by different computing frameworks are determined from statistics on data table query plans. According to an embodiment of the present disclosure, before multiple computing frameworks compute in parallel, the server cluster obtains the data table query plans in order to predict and optimize the query process as a whole. From these plans, the data tables that will be loaded and how they are shared can be obtained, which in turn determines the data tables that may be shared by different computing frameworks.
In operation S222, if the first data table is a data table that may be shared by different computing frameworks, the data information corresponding to the first data table is synchronized to the distributed memory system.
By identifying data tables that may be shared, the method avoids loading into distributed memory data tables that are operated on only once, which saves system resources.
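One possible way to derive the possibly shared tables from query plan statistics is sketched below in Python. The disclosure does not define the statistic, so the rule used here (a table counts as shared when at least two distinct frameworks plan to load it) and the names `shared_tables` and `min_frameworks` are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def shared_tables(plans: Iterable[Tuple[str, str]], min_frameworks: int = 2) -> Set[str]:
    """Return tables referenced by at least min_frameworks distinct frameworks.

    Each plan entry is (framework name, table name), a simplified stand-in
    for what a real query plan would expose.
    """
    frameworks_by_table: Dict[str, Set[str]] = defaultdict(set)
    for framework, table in plans:
        frameworks_by_table[table].add(framework)
    return {table for table, frameworks in frameworks_by_table.items()
            if len(frameworks) >= min_frameworks}


plans = [
    ("Impala", "orders"),
    ("Hive", "orders"),      # second framework on the same table: shared
    ("SparkSQL", "users"),   # used by one framework only
    ("Hive", "logs"),
    ("Hive", "logs"),        # same framework twice: still not shared
]
shared = shared_tables(plans)
```

Only tables in `shared` would be synchronized in operation S222; a table such as `users`, which a single framework touches, stays out of distributed memory.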
Fig. 5 schematically shows a flowchart of determining, in operation S210, whether the data information corresponding to the first data table exists in the distributed memory system according to an embodiment of the present disclosure.
As shown in Fig. 5, determining whether the data information corresponding to the first data table exists in the distributed memory system includes operation S211 and operation S212.
In operation S211, the first computing framework obtains the data information corresponding to data tables that other computing frameworks have synchronized to the distributed memory system, where the storage class of each of the at least two computing frameworks has been extended.
In existing computing frameworks, unless the framework source code is changed, each computing framework has to load data tables on its own. According to an embodiment of the present disclosure, the storage classes of each computing framework, such as the memoryStore class and the tachyonStore class, can be extended, and the execution logic of the operators can be modified at the same time so that they support redirected queries for specific metadata. For example, a data table already stored in the distributed memory system is no longer loaded from disk but is loaded directly through the distributed memory system. In this way, one computing framework can obtain the data information corresponding to data tables that other computing frameworks have synchronized to the distributed memory system.
In operation S212, it is determined whether the data information corresponding to the first data table exists in the distributed memory system. Once the first computing framework is able to obtain the data information corresponding to data tables that other computing frameworks have synchronized to the distributed memory system, it can determine whether the first data table it needs already exists in the distributed memory system.
By extending the storage classes of the computing frameworks, the method overcomes the difficulty a computing framework has in obtaining data tables already synchronized by other computing frameworks, so that different computing frameworks can share a data table once it has been loaded.
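The idea of extending a storage class so that loads are redirected to the distributed memory system can be sketched in Python as follows. This does not reproduce the actual memoryStore or tachyonStore classes of any framework; `DiskStore` and `SharedMemoryStore` are hypothetical classes, and a shared dictionary stands in for the distributed memory system.

```python
from typing import Dict, List, Set, Tuple

Rows = List[Tuple[int, int]]


class DiskStore:
    """Hypothetical original storage class: every load goes to disk."""

    def __init__(self, disk: Dict[str, Rows]) -> None:
        self.disk = disk
        self.disk_reads = 0

    def load_table(self, name: str) -> Rows:
        self.disk_reads += 1
        return list(self.disk[name])


class SharedMemoryStore(DiskStore):
    """Hypothetical extended storage class that consults shared memory first."""

    def __init__(self, disk: Dict[str, Rows], shared_memory: Dict[str, Rows]) -> None:
        super().__init__(disk)
        self.shared_memory = shared_memory

    def synced_tables(self) -> Set[str]:
        # S211: see which tables any framework has already synchronized.
        return set(self.shared_memory)

    def load_table(self, name: str) -> Rows:
        if name in self.synced_tables():          # S212: table already present?
            return list(self.shared_memory[name])  # redirected load, no disk access
        rows = super().load_table(name)            # fall back to the original disk load
        self.shared_memory[name] = list(rows)      # make the table visible to others
        return rows


disk = {"orders": [(1, 10), (2, 5)]}
shared_memory: Dict[str, Rows] = {}
hive_store = SharedMemoryStore(disk, shared_memory)    # store used by one framework
impala_store = SharedMemoryStore(disk, shared_memory)  # store used by another framework
hive_rows = hive_store.load_table("orders")
impala_rows = impala_store.load_table("orders")
```

Because both stores consult the same shared structure, the second framework finds the table that the first one synchronized and performs no disk read of its own.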
Fig. 6 schematically shows a data processing system 600 according to an embodiment of the present disclosure.
As shown in Fig. 6, the data processing system 600 includes a determining module 610 and a synchronization module 620.
The determining module 610, for example, performs operation S210 described above with reference to Fig. 2, and is configured to determine, when the first computing framework performs an operation on the first data table, whether the data information corresponding to the first data table exists in the distributed memory system.
The synchronization module 620, for example, performs operation S220 described above with reference to Fig. 2, and is configured to, if it does not exist, perform the operation on the first data table and synchronize the data information corresponding to the first data table to the distributed memory system.
Fig. 7 schematically shows a data processing system 700 according to an embodiment of the present disclosure.
As shown in Fig. 7, the data processing system 700 includes the determining module 610, the synchronization module 620, and a loading module 730.
The determining module 610, for example, performs operation S210 described above with reference to Fig. 3, and is configured to determine, when the first computing framework performs an operation on the first data table, whether the data information corresponding to the first data table exists in the distributed memory system.
The synchronization module 620, for example, performs operation S220 described above with reference to Fig. 3, and is configured to, if it does not exist, perform the operation on the first data table and synchronize the data information corresponding to the first data table to the distributed memory system.
The loading module 730, for example, performs operation S230 described above with reference to Fig. 3, and is configured to, when the data information corresponding to the first data table exists in the distributed memory system, cause the first computing framework to load the data information corresponding to the first data table from the distributed memory system.
Fig. 8 schematically shows the synchronization module 620 according to an embodiment of the present disclosure.
As shown in Fig. 8, the synchronization module 620 includes a first determining submodule 621 and a synchronization submodule 622.
The first determining submodule 621, for example, performs operation S221 described above with reference to Fig. 4, and is configured to determine whether the first data table is a data table that may be shared by different computing frameworks, where the data tables that may be shared by different computing frameworks are determined from statistics on data table query plans.
The synchronization submodule 622, for example, performs operation S222 described above with reference to Fig. 4, and is configured to, when the first data table is a data table that may be shared by different computing frameworks, synchronize the data information corresponding to the first data table to the distributed memory system.
Fig. 9 schematically shows the determining module 610 according to an embodiment of the present disclosure.
As shown in Fig. 9, the determining module 610 includes an acquisition submodule 611 and a second determining submodule 612.
The acquisition submodule 611, for example, performs operation S211 described above with reference to Fig. 5, and is configured to cause the first computing framework to obtain the data information corresponding to data tables that other computing frameworks have synchronized to the distributed memory system.
The second determining submodule 612, for example, performs operation S212 described above with reference to Fig. 5, and is configured to determine whether the data information corresponding to the first data table exists in the distributed memory system.
It can be understood that the determining module 610, the acquisition submodule 611, the second determining submodule 612, the synchronization module 620, the first determining submodule 621, the synchronization submodule 622, and the loading module 730 may be combined and implemented in a single module, or any one of them may be split into multiple modules. Alternatively, at least part of the functions of one or more of these modules may be combined with at least part of the functions of other modules and implemented in a single module. According to embodiments of the present invention, at least one of the determining module 610, the acquisition submodule 611, the second determining submodule 612, the synchronization module 620, the first determining submodule 621, the synchronization submodule 622, and the loading module 730 may be implemented at least partially as a hardware circuit, such as a field programmable gate array (FPGA), a programmable logic array (PLA), a system on chip, a system on a substrate, a system in a package, or an application-specific integrated circuit (ASIC); may be implemented in hardware or firmware by any other reasonable way of integrating or packaging a circuit; or may be implemented in a suitable combination of the three implementation forms of software, hardware, and firmware. Alternatively, at least one of the determining module 610, the acquisition submodule 611, the second determining submodule 612, the synchronization module 620, the first determining submodule 621, the synchronization submodule 622, and the loading module 730 may be implemented at least partially as a computer program module which, when the program is run by a computer, performs the functions of the corresponding module.
Figure 10 schematically shows a block diagram of a computing node in a server cluster according to an embodiment of the present disclosure.
As shown in Figure 10, the server cluster includes at least one computing node 1000. According to an embodiment of the present disclosure, the computing node 1000 includes one processor 1010 and one memory 1020. In other embodiments of the present disclosure, the computing node 1000 may include any number of processors 1010 or memories 1020. The computing node 1000 may, for example, implement the computing node 110 described above with reference to Fig. 1 and form part of the server cluster 100. The server cluster 100 can perform the methods described above with reference to Figs. 2 to 5, so that when the first computing framework performs an operation on the first data table, the data information corresponding to that data table is synchronized to the distributed memory system. When other computing frameworks later operate on the same data table, they can load it directly from the distributed memory system, which reduces disk read and write operations.
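The check-then-synchronize flow summarized above can be illustrated with a minimal Python sketch. It is not the patented implementation: the class `DistributedMemoryStore`, the function `operate_on_table` and the callback `read_from_disk` are hypothetical names introduced here, and an in-process dictionary stands in for the distributed memory system.

```python
from typing import Callable, Dict, List, Optional, Tuple

Rows = List[tuple]


class DistributedMemoryStore:
    """Hypothetical stand-in for the distributed memory system.

    A real deployment would use a cluster-wide in-memory store; a plain
    dict is used here only so that the sketch runs on its own.
    """

    def __init__(self) -> None:
        self._tables: Dict[str, Rows] = {}

    def get(self, table: str) -> Optional[Rows]:
        return self._tables.get(table)

    def put(self, table: str, rows: Rows) -> None:
        self._tables[table] = rows


def operate_on_table(
    store: DistributedMemoryStore,
    table: str,
    read_from_disk: Callable[[str], Rows],
) -> Tuple[Rows, str]:
    """Return the table's rows and their source ("memory" or "disk")."""
    cached = store.get(table)
    if cached is not None:
        # Data information was already synchronized: load it directly.
        return cached, "memory"
    # Not present: perform the operation (here, a disk read), then synchronize.
    rows = read_from_disk(table)
    store.put(table, rows)
    return rows, "disk"


# Usage: two frameworks touch the same table; only the first reads from disk.
disk_reads: List[str] = []


def fake_disk_read(table: str) -> Rows:
    disk_reads.append(table)
    return [("row-1",), ("row-2",)]


shared_store = DistributedMemoryStore()
_, first_source = operate_on_table(shared_store, "orders", fake_disk_read)
_, second_source = operate_on_table(shared_store, "orders", fake_disk_read)
print(first_source, second_source, len(disk_reads))  # prints: disk memory 1
```

In this sketch the second call never reaches the disk callback, which is the reduction in disk reads that the paragraph above describes.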
The processor 1010 may include, for example, a general-purpose microprocessor, an instruction set processor and/or an associated chipset, and/or a special-purpose microprocessor (for example, an application specific integrated circuit (ASIC)). The processor 1010 may also include on-board memory for caching purposes. The processor 1010 may be a single processing unit or multiple processing units for performing the different actions of the method flows according to embodiments of the present disclosure described with reference to Figs. 2 to 5.
The memory 1020 may be, for example, any medium that can contain, store, communicate, propagate or transport instructions. For example, a readable storage medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus, device or propagation medium. Specific examples of readable storage media include: magnetic storage devices, such as magnetic tape or a hard disk drive (HDD); optical storage devices, such as a compact disc (CD-ROM); semiconductor memory, such as random access memory (RAM) or flash memory; and/or wired/wireless communication links.
The memory 1020 may include a computer program 1021, which may include code/computer-executable instructions that, when executed by the processor 1010, cause the processor 1010 to perform, for example, the method flows described above in connection with Figs. 2 to 5 and any variations thereof.
The computer program 1021 may be configured with computer program code that includes, for example, computer program modules. For example, in an exemplary embodiment, the code in the computer program 1021 may include one or more program modules, for example module 1021A, module 1021B, .... It should be noted that the way the modules are divided and their number are not fixed; those skilled in the art may use suitable program modules or combinations of program modules according to the actual situation. When these combinations of program modules are executed by the processor 1010, the processor 1010 can perform, for example, the method flows described above in connection with Figs. 2 to 5 and any variations thereof.
According to an embodiment of the present invention, at least one of the judging module 610, the acquisition submodule 611, the second judging submodule 612, the synchronization module 620, the first judging submodule 621, the synchronization submodule 622 and the loading module 730 may be implemented as a computer program module described with reference to Figure 10 which, when executed by the processor 1010, implements the corresponding operations described above.
Those skilled in the art will understand that the features recited in the various embodiments and/or claims of the present disclosure may be combined and/or integrated in various ways, even if such combinations or integrations are not explicitly recited in the present disclosure. In particular, without departing from the spirit and teaching of the present disclosure, the features recited in the various embodiments and/or claims of the present disclosure may be combined and/or integrated in various ways. All such combinations and/or integrations fall within the scope of the present disclosure.
Although the present disclosure has been shown and described with reference to certain exemplary embodiments thereof, those skilled in the art should understand that various changes in form and detail may be made to the present disclosure without departing from the spirit and scope of the present disclosure as defined by the appended claims and their equivalents. Therefore, the scope of the present disclosure should not be limited to the embodiments described above, but should be determined not only by the appended claims but also by their equivalents.
Claims (10)
1. A data processing method, applied to a cluster computing process that includes at least two computing frameworks, the method comprising:
when a first computing framework performs an operation on a first data table, judging whether data information corresponding to the first data table exists in a distributed memory system; and
if not, performing the operation on the first data table, and synchronizing the data information corresponding to the first data table to the distributed memory system.
2. The method according to claim 1, further comprising:
if the data information corresponding to the first data table exists in the distributed memory system, loading, by the first computing framework, the data information corresponding to the first data table from the distributed memory system.
3. The method according to claim 1, wherein synchronizing the data information corresponding to the first data table to the distributed memory system comprises:
judging whether the first data table is a data table that may be shared by different computing frameworks, wherein the data tables that may be shared by different computing frameworks are determined based on statistics of data table query plans; and
if so, synchronizing the data information corresponding to the first data table to the distributed memory system.
4. The method according to claim 1, wherein performing the operation on the first data table comprises:
running a query execution plan for the first data table.
5. The method according to claim 1, wherein judging whether the data information corresponding to the first data table exists in the distributed memory system comprises:
obtaining, by the first computing framework, the data information corresponding to data tables that other computing frameworks have synchronized to the distributed memory system, wherein the storage class of each of the at least two computing frameworks has been extended; and
judging whether the data information corresponding to the first data table exists in the distributed memory system.
6. A server cluster, comprising:
at least one processor; and
at least one memory storing a computer-readable program which, when executed by the at least one processor, causes the at least one processor to:
in the case where a first computing framework performs an operation on a first data table, judge whether data information corresponding to the first data table exists in a distributed memory system; and
in the case where the first data table does not exist in the distributed memory system, perform the operation on the first data table, and synchronize the first data table to the distributed memory system.
7. The server cluster according to claim 6, wherein the at least one processor further:
in the case where the data information corresponding to the first data table exists in the distributed memory system, causes the first computing framework to load the data information corresponding to the data in the first data table from the distributed memory system.
8. The server cluster according to claim 6, wherein the at least one processor synchronizing the data information corresponding to the first data table to the distributed memory system comprises:
judging whether the first data table is a data table that may be shared by different computing frameworks, wherein the data tables that may be shared by different computing frameworks are determined based on statistics of data table query plans; and
if so, synchronizing the data information corresponding to the first data table to the distributed memory system.
9. The server cluster according to claim 6, wherein the at least one processor performing the operation on the first data table comprises:
running a query execution plan for the first data table.
10. The server cluster according to claim 6, wherein the at least one processor judging whether the data information corresponding to the first data table exists in the distributed memory system comprises:
obtaining, by the first computing framework, the data information corresponding to data tables that other computing frameworks have synchronized to the distributed memory system, wherein the storage class of each of the at least two computing frameworks has been extended; and
judging whether the data information corresponding to the first data table exists in the distributed memory system.
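Claims 3 and 8 state only that shareable data tables are determined from statistics of data table query plans; they do not fix a particular statistic. The following Python sketch shows one possible reading, under the assumption that a table counts as shareable when query plans from at least two different computing frameworks reference it. The names `shareable_tables`, `maybe_sync` and `min_frameworks`, the threshold of two, and the framework labels are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def shareable_tables(
    plan_log: Iterable[Tuple[str, str]],
    min_frameworks: int = 2,  # assumed threshold, not taken from the claims
) -> Set[str]:
    """Return tables referenced by query plans of enough distinct frameworks.

    plan_log holds (framework_name, table_name) pairs, one pair per table
    reference found in a recorded query plan.
    """
    frameworks_by_table: Dict[str, Set[str]] = defaultdict(set)
    for framework, table in plan_log:
        frameworks_by_table[table].add(framework)
    return {
        table
        for table, frameworks in frameworks_by_table.items()
        if len(frameworks) >= min_frameworks
    }


def maybe_sync(
    table: str,
    rows: List[tuple],
    shared: Set[str],
    memory_store: Dict[str, List[tuple]],
) -> bool:
    """Synchronize rows to the store only if the table is shareable."""
    if table in shared:
        memory_store[table] = rows
        return True
    return False


# Usage with hypothetical framework labels and table names.
log = [
    ("framework_a", "orders"),
    ("framework_b", "orders"),
    ("framework_a", "scratch"),
]
shared = shareable_tables(log)
store: Dict[str, List[tuple]] = {}
synced_orders = maybe_sync("orders", [(1,)], shared, store)
synced_scratch = maybe_sync("scratch", [(2,)], shared, store)
```

Filtering in this way keeps tables that only one framework ever touches out of the distributed memory system, so memory is spent only on data that another framework is likely to reuse.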
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710504831.6A CN107239570A (en) | 2017-06-27 | 2017-06-27 | Data processing method and server cluster |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107239570A true CN107239570A (en) | 2017-10-10 |
Family
ID=59987310
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710504831.6A Pending CN107239570A (en) | 2017-06-27 | 2017-06-27 | Data processing method and server cluster |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107239570A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20171010 |