CN116069488A - Parallel computing method and device for distributed data - Google Patents

Parallel computing method and device for distributed data

Info

Publication number
CN116069488A
Authority
CN
China
Prior art keywords
splitting
data
computing
policy
slices
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111284211.9A
Other languages
Chinese (zh)
Inventor
罗伟锋
方荣
郭朕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
4Paradigm Beijing Technology Co Ltd
Original Assignee
4Paradigm Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 4Paradigm Beijing Technology Co Ltd
Priority to CN202111284211.9A
Publication of CN116069488A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 - Arrangements for program control, e.g. control units
    • G06F 9/06 - Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 - Multiprogramming arrangements
    • G06F 9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F 9/5061 - Partitioning or combining of resources
    • G06F 9/5066 - Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
    • G06F 9/30 - Arrangements for executing machine instructions, e.g. instruction decode
    • G06F 9/38 - Concurrent instruction execution, e.g. pipeline or look ahead
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 - Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a parallel computing method and device for distributed data. The method comprises the following steps: performing data splitting on the original data according to a splitting policy to obtain a plurality of data slices; assigning the plurality of data slices to a plurality of computing units, wherein each computing unit performs a computation based on its assigned data slice to obtain a slice computation result; and aggregating the slice computation results. The splitting policy comprises at least one of a first splitting policy and a second splitting policy, wherein the first splitting policy performs splitting based on the number of data partitions to be allocated to each computing unit, and the second splitting policy performs splitting based on the number of data rows to be allocated to each computing unit or the total number of computing units.

Description

Parallel computing method and device for distributed data
Technical Field
The present invention relates to the field of big data and, more particularly, to a parallel computing method and apparatus for distributed data.
Background
In machine learning and big data computing scenarios, computing logic whose data can be split (separable computation) often needs to be accelerated. A common approach is to restrict the description of the computing logic to a particular language or framework, and then to use the language- or framework-level coordination to split data and tasks in a multi-machine environment (e.g., a parallel computing environment). However, there is no good solution for cases where the framework used to describe the computation cannot be constrained (or limited).
Disclosure of Invention
The invention aims to provide a parallel computing method and device for distributed data.
According to one or more aspects of the present invention, there is provided a parallel computing method for distributed data, the method comprising: performing data splitting on the data to be processed according to a splitting policy to obtain a plurality of data slices; assigning the plurality of data slices to a plurality of computing units, wherein each computing unit performs a computation based on its assigned data slice to obtain a slice computation result; and aggregating the slice computation results, wherein the splitting policy comprises at least one of a first splitting policy that performs splitting based on the number of data partitions to be allocated to each computing unit and a second splitting policy that performs splitting based on the number of data rows to be allocated to each computing unit or the total number of computing units.
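The claimed pipeline (split, assign, compute per slice, aggregate) can be sketched in Python. This is a minimal illustration, not the patent's implementation: the function names, the even row-count split, and the per-slice `sum` standing in for the unspecified computation are all assumptions.

```python
from concurrent.futures import ThreadPoolExecutor

def split(data, num_slices):
    """Cut rows into roughly equal slices (a stand-in for the splitting policy)."""
    base, extra = divmod(len(data), num_slices)
    slices, start = [], 0
    for i in range(num_slices):
        end = start + base + (1 if i < extra else 0)
        slices.append(data[start:end])
        start = end
    return slices

def compute(data_slice):
    """Per-unit computation; a simple sum stands in for arbitrary logic."""
    return sum(data_slice)

def parallel_compute(data, num_units):
    """Split -> assign to units -> compute per slice -> aggregate."""
    slices = split(data, num_units)
    with ThreadPoolExecutor(max_workers=num_units) as pool:
        partial_results = list(pool.map(compute, slices))
    return sum(partial_results)  # aggregation of slice computation results
```

Because the per-slice computation here is associative, aggregating the partial results reproduces the single-machine answer.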
In an exemplary embodiment according to the inventive concept, the step of performing data splitting on the data to be processed according to the splitting policy to obtain a plurality of data slices may include: performing the data splitting according to the first splitting policy to obtain a plurality of initial slices, and then adjusting the plurality of initial slices according to the second splitting policy to obtain the plurality of data slices.
In an exemplary embodiment according to the inventive concept, the adjusting the plurality of initial slices according to the second splitting policy may be performed by repartitioning the plurality of initial slices.
In an exemplary embodiment according to the inventive concept, the splitting policy may further include a third splitting policy that performs splitting according to resource scheduling information, wherein the resource scheduling information includes at least one of an expected total run length of the computation and the computing resources expected to be used to perform the computation.
In an exemplary embodiment according to the inventive concept, the step of performing data splitting according to the splitting policy to obtain a plurality of data slices may include: performing data splitting according to the first splitting policy and/or the second splitting policy to obtain a plurality of initial slices, and then adjusting the plurality of initial slices according to the third splitting policy to obtain the plurality of data slices.
In an exemplary embodiment according to the inventive concept, the step of executing the third splitting policy may include: acquiring the resource scheduling information; acquiring a running resource index and a secondary splitting index, wherein the running resource index characterizes the usage of computing resources and the secondary splitting index characterizes the time cost of performing the adjustment; and performing the splitting based on the running resource index and the secondary splitting index according to the resource scheduling information.
In an exemplary embodiment according to the inventive concept, the computing resources may include a first computing unit that is performing the computation and a second computing unit that is not performing the computation, the first computing unit including the plurality of computing units. The step of performing the splitting based on the running resource index and the secondary splitting index according to the resource scheduling information may include: identifying, based on the running resource index, a computing unit to be allocated among the second computing unit; estimating an expected computation duration from the data to be processed or the plurality of initial slices, the first computing unit, and the computing unit to be allocated; estimating a total run-length change based at least on the estimated computation duration and the secondary splitting index; and performing the splitting based on the total run-length change according to the resource scheduling information.
In an exemplary embodiment according to the inventive concept, the running resource index may include at least one of the collected CPU utilization, memory utilization, and bandwidth utilization of each computing unit.
In an exemplary embodiment according to the inventive concept, the step of estimating the total run-length change may include: inferring the total run-length change based on the secondary splitting index and at least one of data slice read consumption, preprocessing consumption, and the estimated computation duration.
According to one or more aspects of the present invention, there is provided a parallel computing apparatus for distributed data, the apparatus comprising: a data splitting unit configured to perform data splitting on the data to be processed according to a splitting policy to obtain a plurality of data slices; a plurality of computing units configured to perform computations based on their assigned data slices to obtain slice computation results; and an aggregation unit configured to aggregate the slice computation results, wherein the splitting policy includes at least one of a first splitting policy that performs splitting based on the number of data partitions to be allocated to each computing unit and a second splitting policy that performs splitting based on the number of data rows to be allocated to each computing unit or the total number of computing units.
In an exemplary embodiment according to the inventive concept, the data splitting unit may be further configured to perform data splitting according to a first splitting policy to obtain a plurality of initial slices, and then adjust the plurality of initial slices according to a second splitting policy to obtain a plurality of data slices.
In an exemplary embodiment according to the inventive concept, the data splitting unit may be further configured to perform the step of adjusting the plurality of initial slices according to the second splitting policy by re-partitioning the plurality of initial slices.
In an exemplary embodiment according to the inventive concept, the splitting policy may further include a third splitting policy that performs splitting according to resource scheduling information, wherein the resource scheduling information includes at least one of an expected total run length of the computation and the computing resources expected to be used to perform the computation.
In an exemplary embodiment according to the inventive concept, the data splitting unit may be configured to perform data splitting according to a first splitting policy and/or a second splitting policy to obtain a plurality of initial slices, and then adjust the plurality of initial slices according to a third splitting policy to obtain a plurality of data slices.
In an exemplary embodiment according to the inventive concept, the apparatus may further include an index obtaining unit configured to obtain a running resource index characterizing the usage of computing resources and a secondary splitting index characterizing the time cost of performing the adjustment. When performing data splitting according to the third splitting policy, the data splitting unit may be configured to: acquire the resource scheduling information; acquire the running resource index and the secondary splitting index from the index obtaining unit; and perform the splitting based on the running resource index and the secondary splitting index according to the resource scheduling information.
In an exemplary embodiment according to the inventive concept, the computing resources may include a first computing unit that is performing the computation and a second computing unit that is not performing the computation, the first computing unit including the plurality of computing units. The data splitting unit is further configured to: identify, based on the running resource index, a computing unit to be allocated among the second computing unit; estimate an expected computation duration from the data to be processed or the plurality of initial slices, the first computing unit, and the computing unit to be allocated; estimate a total run-length change based at least on the estimated computation duration and the secondary splitting index; and perform the splitting based on the total run-length change according to the resource scheduling information.
In an exemplary embodiment according to the inventive concept, the running resource index may include at least one of the collected CPU utilization, memory utilization, and bandwidth utilization of each computing unit.
In an exemplary embodiment according to the inventive concept, the data splitting unit may be further configured to infer the total run-length change based on the secondary splitting index and at least one of data slice read consumption, preprocessing consumption, and the estimated computation duration.
Another aspect of the invention provides a computer-readable storage medium storing instructions that, when executed by at least one computing device, cause the at least one computing device to perform a method of parallel computing of distributed data as described above.
Another aspect of the invention provides a system comprising at least one computing device and at least one storage device storing instructions that, when executed by the at least one computing device, cause the at least one computing device to perform a method of parallel computing of distributed data as described above.
According to one or more aspects of the present invention, parallel computation is realized by performing data splitting according to a splitting policy to obtain a plurality of data slices, allocating the data slices to a plurality of computing units that perform computations respectively to obtain slice computation results, and aggregating the slice computation results; thereby the user experience is improved and overall performance can be optimized.
Drawings
These and/or other aspects and advantages of the present disclosure will become more apparent and more readily appreciated from the following detailed description of the embodiments of the disclosure, taken in conjunction with the accompanying drawings, wherein:
FIG. 1 is an application scenario diagram illustrating a parallel computing method according to an embodiment of the present disclosure;
FIG. 2 is a flow chart illustrating a parallel computing method according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram illustrating a split strategy of a parallel computing method according to an embodiment of the present disclosure;
FIG. 4 is a block diagram illustrating a parallel computing device according to an embodiment of the present disclosure; and
FIG. 5 is a block diagram illustrating an electronic device according to an exemplary embodiment of the present disclosure.
Detailed Description
Embodiments of the present invention are described in detail below with reference to the accompanying drawings, in which like reference numerals refer to like elements throughout. The embodiments are described below in order to explain the present invention by referring to the figures.
Fig. 1 is an application scenario diagram illustrating a parallel computing method according to an embodiment of the present disclosure.
In an application scenario of parallel computing of distributed data, the data to be processed may first be logically or physically fragmented. For example, the data to be processed may include 0000.parquet, 0001.parquet, 0002.parquet, and 0003.parquet as shown in fig. 1, where Parquet is a columnar storage format that can be used with various data frameworks and adapts to multiple languages and components. For example, 0000.parquet, 0001.parquet, 0002.parquet, and 0003.parquet may be parquet files or partitions. In the step of performing data splitting, the data 0000.parquet, 0001.parquet, 0002.parquet, and 0003.parquet may be divided into data slice 1, data slice 2, and data slice 3. Data slice 1, data slice 2, and data slice 3 may be the same as or different from the partitions of the data 0000.parquet, 0001.parquet, 0002.parquet, and 0003.parquet. For example, data slice 1 may be identical to the data 0000.parquet. In another embodiment, by way of example only, the data 0000.parquet and part of the data 0002.parquet may be included in data slice 1, such that the split data slices neither duplicate nor miss data (or data rows), i.e., the split data includes all of 0000.parquet, 0001.parquet, 0002.parquet, and 0003.parquet.
Then, the data slice 1, the data slice 2, and the data slice 3 are allocated to a plurality of calculation units, wherein each calculation unit performs calculation based on the allocated data slice to obtain a slice calculation result. For example, three calculation units (not shown) go through stage 1, stage 2, and stage 3, respectively, to obtain calculation result 1, calculation result 2, and calculation result 3, respectively.
Next, calculation result 1, calculation result 2, and calculation result 3 are aggregated to obtain a final computation result. For example, as shown in fig. 1, calculation result 1, calculation result 2, and calculation result 3 are aggregated to obtain the data 0004.parquet.
In a batch prediction service scenario, different data items have low interdependence, which makes such scenarios well suited to parallel computing methods, also known as batch separable computing. Next, a parallel computing method that improves the performance of batch separable computing will be described with reference to fig. 2 and 3. More specifically, the overall performance of batch separable computing is improved by implementing data splitting policies at different levels.
Fig. 2 is a flowchart illustrating a parallel computing method according to an embodiment of the present disclosure, and fig. 3 is a schematic diagram illustrating a split policy of the parallel computing method according to an embodiment of the present disclosure.
As shown in fig. 2, in step S10, data splitting is performed on data to be processed according to a splitting policy to obtain a plurality of data slices.
As shown in fig. 3, the split policy includes at least one of a first split policy and a second split policy.
More specifically, the first splitting policy performs splitting based on the number of data partitions to be allocated to each computing unit, where a data partition may refer to a logical partition or a physical partition. In this case, according to the first splitting policy, data located in the same logical/physical partition may be allocated to the same computing unit, thereby reducing reads/writes of data across different partitions. In another embodiment, the data partitions may be the data 0000.parquet, 0001.parquet, 0002.parquet, and 0003.parquet shown in fig. 1, i.e., a data partition may refer to partitioning by files. In this case, the first splitting policy may allocate data located in the same file to the same computing unit, thereby reducing the number of different computing units performing read/write operations on a single file at the same time, and also avoiding the transfer of a particular file between different computing units. By way of example only, the data 0000.parquet and 0001.parquet may be divided into data slice 1, the data 0002.parquet into data slice 2, and the data 0003.parquet into data slice 3.
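A minimal sketch of the first splitting policy: whole partitions (files) are grouped into slices so that no file is shared between computing units. The function name and the fixed partitions-per-unit parameter are illustrative assumptions, not the patent's interface.

```python
def split_by_partitions(partitions, parts_per_unit):
    """First splitting policy (sketch): each data slice holds up to
    `parts_per_unit` whole partitions/files, so a single file is never
    read or written by more than one computing unit."""
    return [partitions[i:i + parts_per_unit]
            for i in range(0, len(partitions), parts_per_unit)]
```

For example, with `parts_per_unit=3` the four files of fig. 1 yield one slice of three files and one slice holding the remaining file.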
Further, the second splitting policy performs splitting based on the number of data rows to be allocated to each computing unit or the total number of computing units. More specifically, the data 0000.parquet, 0001.parquet, 0002.parquet, and 0003.parquet shown in fig. 1 may undergo preprocessing of an operation-and-maintenance nature. For example, a repartitioning operation may be performed on the data 0000.parquet, 0001.parquet, 0002.parquet, and 0003.parquet based on the number of data rows to be allocated to each computing unit or the total number of computing units. The repartitioning operation may refer to a secondary physical/logical partitioning. For example, after repartitioning the data 0000.parquet, 0001.parquet, 0002.parquet, and 0003.parquet, the repartitioned data may be represented as 0000'.parquet, 0001'.parquet, and 0002'.parquet. As an example, the data 0000'.parquet may include all of the data 0000.parquet and a portion of the data 0001.parquet; 0001'.parquet may include the remaining portion of 0001.parquet and a portion of 0002.parquet; and 0002'.parquet may include the remaining portion of 0002.parquet and all of 0003.parquet, such that the repartitioned data 0000'.parquet, 0001'.parquet, and 0002'.parquet have the same or comparable data volumes. In this case, data slice 1 corresponds to the data 0000'.parquet, data slice 2 to 0001'.parquet, and data slice 3 to 0002'.parquet.
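The row-balancing repartition described above can be sketched as follows; rows are flattened across file boundaries and recut into slices of nearly equal row count, so no row is duplicated or lost. The function name is an illustrative assumption.

```python
def repartition_by_rows(partitions, num_units):
    """Second splitting policy (sketch): flatten all rows from the input
    partitions, then cut them into `num_units` slices of nearly equal
    row count, ignoring the original file boundaries."""
    rows = [row for part in partitions for row in part]
    base, extra = divmod(len(rows), num_units)
    slices, start = [], 0
    for i in range(num_units):
        end = start + base + (1 if i < extra else 0)
        slices.append(rows[start:end])
        start = end
    return slices
```

As in the 0000'.parquet example, unevenly sized input partitions come out as slices whose row counts differ by at most one.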
In another embodiment, the repartitioning operation may also mean supplementing or removing a predetermined number of data rows among at least some of the data 0000.parquet, 0001.parquet, 0002.parquet, and 0003.parquet, such that the number of data slices in each computing unit is comparable or the number of data rows in each data slice is comparable. By way of example only, a portion of the data 0000.parquet together with 0001.parquet may be divided into data slice 1, another portion of 0000.parquet together with the data 0002.parquet into data slice 2, and the remaining portion of 0000.parquet together with the data 0003.parquet into data slice 3. In this case, the supplementing and removing operations do not change the total data volume; for example, they do not change the total number of data rows.
It should be noted that, although the splitting policy is illustrated in fig. 3 as including at least one of the first splitting policy and the second splitting policy, in example embodiments, only the first splitting policy may be performed, or only the second splitting policy may be performed.
In another embodiment, as shown in fig. 3, the data splitting may be performed according to the first splitting policy to obtain a plurality of initial slices, and the plurality of initial slices may then be adjusted according to the second splitting policy to obtain the plurality of data slices. For example, in an embodiment, the adjusting step may be performed by repartitioning the plurality of initial slices obtained according to the first splitting policy. As an example, although not shown, the data 0000.parquet, 0001.parquet, 0002.parquet, and 0003.parquet may first be divided, according to the first splitting policy, into initial slice 1 including the data 0000.parquet, initial slice 2 including the data 0001.parquet, initial slice 3 including the data 0002.parquet, and initial slice 4 including the data 0003.parquet, and then initial slices 1 to 4 may be adjusted according to the second splitting policy to obtain data slices 1, 2, and 3. Here, only an example in which the first splitting policy is executed first and the second splitting policy afterwards is shown. In different embodiments, the splitting policies may be executed in different orders, and combinations of splitting policies executed in different orders are to be understood as included within the scope of the present invention. For example, in another embodiment, at least one of initial slices 1 to 4 may be directly assigned to a computing unit to perform the computation, or may be assigned to a computing unit after the adjustment.
In an example embodiment, the splitting policy may further include a third splitting policy, which performs splitting according to resource scheduling information, wherein the resource scheduling information includes at least one of an expected total run length of the computation and the computing resources to perform the computation. For example, the expected total run length may represent a desired total running duration or a target running duration set by the user; by way of example only, it may be 1 hour, 1 day, or 1 week, but is not limited thereto. The computing resources performing the computation may represent the available computing units and may include a first computing unit that is performing the computation and a second computing unit that is not. In other words, the computing resources may refer to the plurality of computing units that are performing the computation (i.e., the first computing unit) and the computing units other than these that may still be allocated to the computation (i.e., the second computing unit).
In another embodiment, a running resource index (Run Metrics) and a secondary split index (Repartition Metrics) may additionally be obtained (e.g., in real time) while the third splitting policy is being executed. The running resource index characterizes the usage of computing resources, and the secondary split index characterizes the time cost of performing the adjustment. For example, the running resource index may include at least one of the collected CPU utilization, memory utilization, and bandwidth utilization of each computing unit.
In an example embodiment, the step of executing the third splitting policy comprises: acquiring the resource scheduling information; acquiring the running resource index and the secondary splitting index; and performing the splitting based on the running resource index and the secondary splitting index according to the resource scheduling information.
In still further example embodiments, performing the splitting based on the running resource index and the secondary splitting index according to the resource scheduling information may include: identifying, based on the running resource index, a computing unit to be allocated among the second computing unit; estimating an expected computation duration from the data to be processed or the plurality of initial slices, the first computing unit, and the computing unit to be allocated; estimating a total run-length change based at least on the estimated computation duration and the secondary splitting index; and performing the splitting based on the total run-length change according to the resource scheduling information.
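The two estimation steps above can be sketched as follows. The representation of the running resource index as a dict of per-unit CPU loads, the linear rows-per-second throughput model, and all names are simplifying assumptions for illustration only.

```python
def idle_units(running_resource_index, cpu_threshold=0.5):
    """Pick computing units from the second (idle) pool whose CPU
    utilization is below a threshold, i.e. units available for allocation."""
    return [u for u, cpu in running_resource_index.items() if cpu < cpu_threshold]

def total_run_length_change(total_rows, rows_per_sec, active_units,
                            extra_units, repartition_cost_s):
    """Estimate how the total run length changes when `extra_units` units
    are added, charging the secondary-split (repartition) cost against the
    gain. A negative result means expansion shortens the run."""
    before = total_rows / (rows_per_sec * active_units)
    after = total_rows / (rows_per_sec * (active_units + extra_units))
    return (after + repartition_cost_s) - before
```

For instance, doubling two units to four on a 1000-row job at 10 rows/s per unit saves 25 s of computation; after paying a 5 s repartition cost, the net change is a 20 s reduction, so expansion is worthwhile.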
It should be noted that, although the second computing unit is not allocated to the computation, its CPU, memory, bandwidth, and so on may still be occupied by other tasks (e.g., other computing tasks, updates, reads and writes, or maintenance). The running resource index therefore makes it possible to effectively detect whether the second computing unit contains resources that can be allocated to the computation (i.e., computing units to be allocated). For example, when the user is sensitive to the run length and computing units to be allocated exist, capacity expansion may be performed (i.e., the number of computing units performing the parallel computation is increased). In other words, the running resource index may be used to balance resource usage against the total run length: where the total run length needs to be shortened, resource usage is increased; where computing resources need to be saved (e.g., yielded to other tasks), capacity reduction is performed and the total run length increases.
In a further example embodiment, when there is a computing unit to be allocated among the computing resources, the estimated computation duration would be reduced if that computing unit were allocated to the computation (i.e., if more resources were allocated). For example, the remaining computation duration may be estimated from the data to be processed or the plurality of initial slices and the first computing unit, and the expected total run length set by the user may be compared with the sum of the remaining computation duration and the elapsed run time; if the expected total run length is smaller than that sum, capacity expansion may be performed, and otherwise it may not.
It should be considered that allocating the computing unit to be allocated to the computation entails operations such as repartitioning, which increase the secondary splitting index. Accordingly, the total run-length change is estimated based on the expected computation duration and the secondary splitting index, so that whether to expand can be adjusted adaptively. For example, if the total run-length change indicates that the total run length is reduced, capacity expansion may be performed, and otherwise not; as another example, capacity expansion may be performed only if the reduction exceeds a predetermined threshold, and not otherwise, so that operations such as repartitioning never increase the total run length of the computation.
In another embodiment, the step of estimating the total run-length change may include: inferring the total run-length change based on the secondary splitting index and at least one of data slice read consumption, preprocessing consumption, and the estimated computation duration.
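The change estimate combining these consumptions can be sketched as a difference of per-phase duration sums. The phase names, the dict representation, and the use of seconds are illustrative assumptions.

```python
def estimate_total_run_length_change(base_phases, scaled_phases):
    """Total run-length change as the difference of per-phase duration sums:
    data-slice read, preprocessing, computation, and the secondary split
    (repartition) cost. Negative means the scaled plan finishes sooner."""
    phases = ("read", "preprocess", "compute", "repartition")

    def total(phase_durations):
        return sum(phase_durations.get(p, 0.0) for p in phases)

    return total(scaled_phases) - total(base_phases)
```

A plan that halves the compute phase but adds a repartition cost is accepted only if the summed change is still a reduction, matching the threshold rule described above.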
Although not shown, the computing power of unit A may, for example, be greater than that of unit B, which in turn is greater than that of unit C. Alternatively, in another embodiment, units A, B, and C have the same hardware configuration, but part of the computing power of units B and C is occupied, so that the available computing power of unit A > that of unit B > that of unit C. In this case, based on a combined consideration of the running resource index and the secondary splitting index, data slice 1, data slice 2, and data slice 3 to be allocated to units A, B, and C, respectively, are adjusted according to the resource scheduling information (e.g., by performing the repartitioning operation described above) so that the number of data slices or data rows allocated to each computing unit matches its corresponding computing capacity.
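Capacity-matched allocation can be sketched as a proportional split: each unit receives a share of the rows weighted by its available computing power. The function name and the convention that the last unit absorbs rounding remainders are illustrative assumptions.

```python
def split_by_capacity(rows, capacities):
    """Allocate rows in proportion to each unit's available computing power,
    so faster units receive larger data slices. The last unit takes the
    remainder so that no row is duplicated or lost."""
    total = sum(capacities)
    slices, start = [], 0
    for i, cap in enumerate(capacities):
        if i == len(capacities) - 1:
            end = len(rows)  # remainder goes to the final unit
        else:
            end = start + round(len(rows) * cap / total)
        slices.append(rows[start:end])
        start = end
    return slices
```

With capacities 3:2:1 for units A, B, and C, a 60-row dataset splits into slices of 30, 20, and 10 rows.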
In another embodiment, as shown in fig. 3, data splitting may first be performed according to the first splitting policy and/or the second splitting policy to obtain a plurality of initial slices, and the plurality of initial slices may then be adjusted according to the third splitting policy to obtain the plurality of data slices. This merely illustrates the case in which the first and/or second splitting policy is executed before the third splitting policy; in different embodiments the splitting policies may be executed in different orders, and combinations of splitting policies executed in different orders are to be understood as falling within the scope of the present invention.
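The staged ordering can be modeled as a pipeline in which every policy maps a list of slices to a new list of slices; the order of policies is then just the order of the list. The names below are hypothetical, and only the second splitting policy (a fixed number of rows per unit) is implemented as an example.

```python
from typing import Callable, List, Sequence

Slice = List[int]                       # a slice is a list of row ids (toy model)
Policy = Callable[[List[Slice]], List[Slice]]

def split_pipeline(rows: Slice, policies: Sequence[Policy]) -> List[Slice]:
    slices: List[Slice] = [rows]        # start from one undivided slice
    for policy in policies:             # policies run in the given order
        slices = policy(slices)
    return slices

def by_row_count(n: int) -> Policy:     # second splitting policy: n rows per unit
    def policy(slices: List[Slice]) -> List[Slice]:
        flat = [r for s in slices for r in s]
        return [flat[i:i + n] for i in range(0, len(flat), n)]
    return policy

out = split_pipeline(list(range(10)), [by_row_count(4)])
assert out == [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Because each policy has the same signature, a third-policy adjustment step could simply be appended to the `policies` list, matching the ordering flexibility described above.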
Referring back to fig. 2, in step S20, a plurality of data slices are allocated to a plurality of calculation units, wherein each calculation unit performs calculation based on the allocated data slices to obtain a slice calculation result.
In another embodiment, while each computing unit executes its computation, the running resource indexes and/or secondary split indexes of the plurality of computing units may be collected in real time as running-state feedback. Based on this feedback, the consumption reflected by the running resource index, the secondary split index, and the like is considered together, and data slice 1, data slice 2, and data slice 3 already allocated to computing unit 1, computing unit 2, and computing unit 3 are adjusted according to the resource scheduling information, so that computing capacity is adaptively adjusted or allocated and the overall performance of the batch separable computation is improved. For example, although not shown, if there is a computing unit to be allocated (e.g., unit D) and the change in total run time described above indicates a reduction beyond the predetermined threshold, expansion may be performed. In an example embodiment, the repartitioning operation described above may be performed on data slice 1, data slice 2, and data slice 3 to obtain data slice 1', data slice 2', data slice 3', and data slice 4', which are then allocated to units A, B, C, and D, respectively, to perform the computation.
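The repartitioning step of that example (three allocated slices redealt into four, one for the newly added unit D) can be sketched as follows; the function name and the near-equal-share rule are hypothetical.

```python
# Hypothetical sketch: pool the rows of the already allocated slices and
# redeal them into one slice per unit, including the newly added unit,
# in near-equal shares.

def repartition(slices, n_units):
    flat = [row for s in slices for row in s]     # pool all rows
    base, extra = divmod(len(flat), n_units)      # near-equal share sizes
    out, start = [], 0
    for j in range(n_units):
        size = base + (1 if j < extra else 0)
        out.append(flat[start:start + size])
        start += size
    return out

# Three slices for units A, B, C become four slices once unit D joins.
new_slices = repartition([[1, 2, 3], [4, 5, 6], [7, 8, 9]], 4)
assert [len(s) for s in new_slices] == [3, 2, 2, 2]
assert sorted(r for s in new_slices for r in s) == list(range(1, 10))
```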
In step S30, the slice calculation results are aggregated. For example, in a batch prediction service scenario, as shown in figs. 1 and 3, batch computations are performed based on data slice 1, data slice 2, and data slice 3 to obtain calculation result 1, calculation result 2, and calculation result 3, respectively, and these results are aggregated into a final result, that is, the predicted value in the prediction service scenario.
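For a batch prediction service, the aggregation of step S30 can be as simple as concatenating the per-slice results in slice order; this minimal sketch assumes list-shaped results, which the disclosure does not mandate.

```python
# Minimal sketch of step S30: per-slice prediction results are concatenated
# in slice order to form the final predicted values.

def aggregate(slice_results):
    final = []
    for result in slice_results:    # keep slice order
        final.extend(result)
    return final

assert aggregate([[0.1, 0.9], [0.4], [0.7, 0.2]]) == [0.1, 0.9, 0.4, 0.7, 0.2]
```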
As described above with reference to steps S10 to S30, a plurality of data slices are obtained by performing data splitting according to different splitting policies (at least one of the first, second, and third splitting policies), the plurality of data slices are allocated to a plurality of computing units that each compute a slice calculation result, and the slice calculation results are aggregated. Parallel computation is thereby realized, improving the user experience and enabling the overall performance to be optimized.
Fig. 4 is a block diagram of a distributed data parallel computing device 10 according to the present disclosure.
According to one or more aspects of the present disclosure, there is provided a distributed data parallel computing apparatus 10, the apparatus 10 comprising: a data splitting unit 110, a computing unit 120, and an aggregation unit 130.
The data splitting unit 110 is configured to perform data splitting according to a splitting policy to obtain a plurality of data slices, wherein the splitting policy includes at least one of a first splitting policy, a second splitting policy, and a third splitting policy. The first splitting policy performs splitting based on the number of data partitions to be allocated to each computing unit, the second splitting policy performs splitting based on the number of data lines to be allocated to each computing unit or on the total number of the plurality of computing units, and the third splitting policy performs splitting according to resource scheduling information, wherein the resource scheduling information includes at least one of a desired total run time of the computation and the computing resources desired for performing the computation. The data splitting unit 110 is configured to perform the data splitting steps described with reference to figs. 2 and 3, and redundant description is therefore omitted herein.
The calculation unit 120 is configured to perform calculations based on the assigned data slices to obtain slice calculation results. The calculation unit 120 may be configured to perform step S20 with reference to fig. 2, and thus redundant description is omitted herein.
The aggregation unit 130 is configured to aggregate the slice calculation results. The aggregation unit 130 may be configured to perform the step S30 with reference to fig. 2, and thus redundant description is omitted herein.
Furthermore, the parallel computing device 10 may further include an index acquisition unit 140 configured to acquire the running resource indexes and secondary split indexes of the plurality of computing units. When performing data splitting according to the third splitting policy, the data splitting unit 110 is configured to: acquire the resource scheduling information; acquire the running resource index and the secondary split index from the index acquisition unit 140; and perform splitting based on the running resource index and the secondary split index according to the resource scheduling information. The running resource index represents the time cost consumed by the plurality of computing units when performing computation, and the secondary split index represents the time cost consumed when performing adjustment. The index acquisition unit 140 may be configured to perform the method described with reference to fig. 3, and redundant description is therefore omitted herein.
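An end-to-end toy mirror of the device of fig. 4 (split, compute in parallel, aggregate) might look like the following. The class name, callbacks, and use of a thread pool are illustrative assumptions only; in a real deployment the computing units would be separate processes or machines.

```python
from concurrent.futures import ThreadPoolExecutor

class ParallelComputingDevice:
    """Toy mirror of fig. 4: split -> compute in parallel -> aggregate."""

    def __init__(self, split_policy, compute_fn, n_units):
        self.split_policy = split_policy   # data -> list of slices (unit 110)
        self.compute_fn = compute_fn       # slice -> slice result  (unit 120)
        self.n_units = n_units

    def run(self, data):
        slices = self.split_policy(data)
        # One worker per computing unit; map() preserves slice order.
        with ThreadPoolExecutor(max_workers=max(1, self.n_units)) as pool:
            results = list(pool.map(self.compute_fn, slices))
        # Aggregation (unit 130): concatenate slice results in order.
        return [r for result in results for r in result]

device = ParallelComputingDevice(
    split_policy=lambda d: [d[i:i + 3] for i in range(0, len(d), 3)],
    compute_fn=lambda s: [x * x for x in s],
    n_units=3,
)
assert device.run([1, 2, 3, 4, 5, 6, 7]) == [1, 4, 9, 16, 25, 36, 49]
```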
The specific manner in which the individual modules/units perform the operations in relation to the apparatus of the above embodiments has been described in detail in relation to the embodiments of the method and will not be described in detail here.
Fig. 5 is a block diagram illustrating an electronic device 500 according to an exemplary embodiment of the present disclosure.
Referring to fig. 5, an electronic device 500 includes at least one memory 501 and at least one processor 502, the at least one memory 501 storing computer-executable instructions that, when executed by the at least one processor 502, cause the at least one processor 502 to perform a method of parallel computing of distributed data according to embodiments of the present disclosure.
By way of example, the electronic device 500 may be a PC, a tablet device, a personal digital assistant, a smartphone, or any other device capable of executing the above instructions. The electronic device 500 need not be a single device; it may be any apparatus or collection of circuits capable of executing the above instructions (or instruction set), individually or in combination. The electronic device 500 may also be part of an integrated control system or system manager, or may be configured as a portable electronic device that interfaces either locally or remotely (e.g., via wireless transmission).
In electronic device 500, processor 502 may include a Central Processing Unit (CPU), a Graphics Processor (GPU), a programmable logic device, a special purpose processor system, a microcontroller, or a microprocessor. By way of example, and not limitation, processors may also include analog processors, digital processors, microprocessors, multi-core processors, processor arrays, network processors, and the like.
The processor 502 may execute instructions or code stored in the memory 501, wherein the memory 501 may also store data. The instructions and data may also be transmitted and received over a network via a network interface device, which may employ any known transmission protocol.
The memory 501 may be integrated with the processor 502, for example, RAM or flash memory disposed within an integrated circuit microprocessor or the like. In addition, memory 501 may include a stand-alone device, such as an external disk drive, a storage array, or any other storage device usable by a database system. The memory 501 and the processor 502 may be operatively coupled or may communicate with each other, for example, through an I/O port, network connection, etc., such that the processor 502 is able to read files stored in the memory.
In addition, the electronic device 500 may also include a video display (such as a liquid crystal display) and a user interaction interface (such as a keyboard, mouse, touch input device, etc.). All components of the electronic device 500 may be connected to each other via a bus and/or a network.
According to an embodiment of the present disclosure, there may also be provided a computer-readable storage medium, wherein instructions stored in the computer-readable storage medium, when executed by at least one processor, cause the at least one processor to perform the parallel computing method of distributed data according to an embodiment of the present disclosure. Examples of the computer-readable storage medium here include: read-only memory (ROM), programmable read-only memory (PROM), electrically erasable programmable read-only memory (EEPROM), random-access memory (RAM), dynamic random-access memory (DRAM), static random-access memory (SRAM), flash memory, non-volatile memory, CD-ROM, CD-R, CD+R, CD-RW, CD+RW, DVD-ROM, DVD-R, DVD+R, DVD-RW, DVD+RW, DVD-RAM, BD-ROM, BD-R, BD-R LTH, BD-RE, Blu-ray or optical disk storage, hard disk drives (HDD), solid-state drives (SSD), card memory (such as multimedia cards, Secure Digital (SD) cards, or eXtreme Digital (XD) cards), magnetic tape, floppy disks, magneto-optical data storage devices, hard disks, solid-state disks, and any other device configured to store a computer program and any associated data, data files, and data structures in a non-transitory manner and to provide the computer program and any associated data, data files, and data structures to a processor or computer so that the processor or computer can execute the program. The computer program in the computer-readable storage medium described above can run in an environment deployed in computer equipment such as a client, a host, a proxy device, or a server; furthermore, in one example, the computer program and any associated data, data files, and data structures are distributed across networked computer systems, so that the computer program and any associated data, data files, and data structures are stored, accessed, and executed in a distributed fashion by one or more processors or computers.
According to an embodiment of the present disclosure, there may also be provided a computer program product comprising computer instructions which, when executed by at least one processor, implement a method of parallel computing of distributed data according to an embodiment of the present disclosure.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure that follow its general principles, including such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.
It is to be understood that the present disclosure is not limited to the precise arrangements and instrumentalities shown in the drawings, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims (10)

1. A method of parallel computing of distributed data, the method comprising:
performing data splitting on the data to be processed according to the splitting strategy to obtain a plurality of data slices;
assigning the plurality of data slices to a plurality of computing units, wherein each computing unit performs a computation based on the assigned data slices to obtain a slice computation result; and
the slice computation results are aggregated and the slice computation results are,
wherein the splitting policy comprises at least one of a first splitting policy and a second splitting policy,
wherein the first splitting policy is to perform splitting based on the number of data partitions to be allocated by each computing unit,
wherein the second splitting policy is to perform splitting based on the number of data lines to be allocated by each of the computing units or the total number of the plurality of computing units.
2. The parallel computing method of claim 1, wherein the step of performing data splitting on the data to be processed according to a splitting policy to obtain a plurality of data slices comprises: performing data splitting according to the first splitting strategy to obtain a plurality of initial slices, and then adjusting the plurality of initial slices according to the second splitting strategy to obtain the plurality of data slices.
3. The parallel computing method of claim 2, wherein the step of adjusting the plurality of initial slices according to the second splitting policy is performed by repartitioning the plurality of initial slices.
4. The parallel computing method of claim 1, wherein the splitting policy further comprises: a third splitting policy that performs splitting according to resource scheduling information, wherein the resource scheduling information includes at least one of a desired total run length of the computation and a computing resource desired for performing the computation.
5. The parallel computing method of claim 4, wherein the step of performing data splitting according to a splitting policy to obtain a plurality of data slices comprises: performing data splitting according to the first splitting strategy and/or the second splitting strategy to obtain a plurality of initial slices, and then adjusting the plurality of initial slices according to the third splitting strategy to obtain the plurality of data slices.
6. The parallel computing method of claim 4 or 5, wherein the step of executing the third split policy comprises:
acquiring the resource scheduling information;
acquiring a running resource index and a secondary splitting index, wherein the running resource index is used for representing the usage amount of the computing resources, and the secondary splitting index is used for representing the time cost consumed in performing adjustment; and
performing splitting based on the running resource index and the secondary splitting index according to the resource scheduling information.
7. The parallel computing method of claim 6, wherein the computing resources comprise a first computing unit that is performing the computation and a second computing unit that is not performing the computation, the first computing unit comprising the plurality of computing units,
the step of performing splitting based on the running resource index and the secondary splitting index according to the resource scheduling information includes:
acquiring a computing unit to be allocated from the second computing unit based on the running resource index;
estimating a predicted computation length according to the data to be processed or the plurality of initial slices, the first computing unit, and the computing unit to be allocated;
estimating a change in total run length based at least on the predicted computation length and the secondary splitting index; and
performing splitting according to the resource scheduling information based on the change in total run length.
8. A parallel computing device of distributed data, the device comprising:
a data splitting unit configured to perform data splitting on the data to be processed according to a splitting policy to obtain a plurality of data slices;
a plurality of calculation units configured to perform calculations based on the assigned data slices to obtain slice calculation results; and
an aggregation unit configured to aggregate the slice calculation results,
wherein the splitting policy comprises at least one of a first splitting policy and a second splitting policy,
wherein the first splitting policy is to perform splitting based on the number of data partitions to be allocated per computing unit,
wherein the second splitting policy is to perform splitting based on the number of data lines to be allocated by each computing unit or the total number of the plurality of computing units.
9. A system comprising at least one computing device and at least one storage device storing instructions that, when executed by the at least one computing device, cause the at least one computing device to perform a method of parallel computing of distributed data as claimed in any one of claims 1 to 7.
10. A computer readable storage medium storing instructions which, when executed by at least one computing device, cause the at least one computing device to perform the method of parallel computing of distributed data as claimed in any one of claims 1 to 7.
CN202111284211.9A 2021-11-01 2021-11-01 Parallel computing method and device for distributed data Pending CN116069488A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111284211.9A CN116069488A (en) 2021-11-01 2021-11-01 Parallel computing method and device for distributed data


Publications (1)

Publication Number Publication Date
CN116069488A true CN116069488A (en) 2023-05-05

Family

ID=86180771



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination