CN111858630B - Data processing method, device and equipment and readable storage medium

Info

Publication number: CN111858630B
Application number: CN202010664232.2A
Authority: CN
Inventors: 韩海跃; 梅国强; 王江为
Original assignee: Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd
Current assignee: Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd
Priority date: 2020-07-10
Filing date: 2020-07-10
Publication date: 2022-06-17
Anticipated expiration: 2040-07-10
Also published as: CN111858630A

Abstract

The invention discloses a data processing method, a device, equipment and a computer readable storage medium, wherein the method comprises the following steps: acquiring target data; calculating the target data to obtain first data; sending the first data to each other data processing device, and receiving each second data sent by each other data processing device; updating target data by using the first data and the second data, and counting the processing times; when the processing times reach a preset threshold value, determining to finish data processing; the method divides the data to be processed into a plurality of parts, processes the corresponding part by each data processing device, and synchronizes the data among the data processing devices. Therefore, the data to be processed can be processed in parallel by using a plurality of data processing devices, and the data processing speed and efficiency are improved.

Description

Data processing method, device and equipment and readable storage medium

Technical Field

The present invention relates to the field of data processing technologies, and in particular, to a data processing method, a data processing apparatus, a data processing device, and a computer-readable storage medium.

Background

In the big data era, the graph serves as a basic data representation and is widely used in algorithms such as deep learning and user recommendation. At present, graphs often contain tens of millions to hundreds of millions of nodes, and the edge information of a graph (the connections between nodes) is on the order of billions. In the related art, when iterative computation is performed on a graph, the graph is usually partitioned into blocks according to the number of source nodes and destination nodes. After partitioning, a single data processing device computes the blocks in parallel, or multiple channels are set up within each block and computed in parallel, so as to improve the data processing speed. However, because the computing capacity of a single data processing device is limited, the data processing speed remains slow and the processing efficiency remains low even when block-parallel or channel-parallel computation is used.

Therefore, how to solve the problems of slow data processing speed and low processing efficiency in the related art is a technical problem to be solved by those skilled in the art.

Disclosure of Invention

In view of the above, the present invention provides a data processing method, a data processing apparatus, a data processing device, and a computer readable storage medium, which solve the problems of slow data processing speed and low processing efficiency in the related art.

In order to solve the above technical problem, the present invention provides a data processing method, including:

acquiring target data;

calculating the target data to obtain first data;

sending the first data to each other data processing device, and receiving each second data sent by each other data processing device;

updating the target data by using the first data and the second data, and counting the processing times;

and when the processing times reach a preset threshold value, determining that the data processing is finished.

Optionally, the performing calculation processing on the target data to obtain first data includes:

extracting a plurality of source node address information from the target data, and respectively selecting low-order data in the source node address information as channel distribution information;

sequencing a plurality of processing channels, and establishing a corresponding relation between the channel distribution information and the processing channels according to a sequencing result;

according to the corresponding relation, channel distribution is carried out on the target subdata corresponding to the source node address information;

in each processing channel, calculating and processing the target subdata by using processing parameters to obtain first subdata;

and obtaining the first data by utilizing the first subdata.

Optionally, after receiving the second data respectively sent by the other data processing apparatuses, the method further includes:

updating the processing parameter using the first data and the second data.

Optionally, the extracting address information of a plurality of source nodes from the target data includes:

determining a target data block in the target data, and extracting a plurality of source node address information from the target data block;

correspondingly, the obtaining the first data by using the first sub-data includes:

updating the target data block;

and after all the first subdata corresponding to the target data is obtained, generating the first data by using all the first subdata.

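Optionally, after receiving each second data sent by each other data processing device and before updating the target data by using the first data and the second data, the method further includes:

updating flag information according to the receiving condition of the second data;

and when the flag information is in the all-acquisition state, executing the step of updating the target data by using the first data and the second data.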

Optionally, the acquiring target data includes:

acquiring original data sent by a server, and filtering the original data by using preset information to obtain the target data;

and sending the original data to the other data processing equipment.

Optionally, the acquiring target data includes:

acquiring original data sent by a target data processing device, and filtering the original data by using preset information to obtain the target data.

The present invention also provides a data processing apparatus, comprising:

the data acquisition module is used for acquiring target data;

the calculation processing module is used for calculating the target data to obtain first data;

the data synchronization module is used for sending the first data to each other data processing device and receiving each second data sent by each other data processing device;

the data updating module is used for updating the target data by using the first data and the second data and counting the processing times;

and the determining module is used for determining to finish the data processing when the processing times reach a preset threshold value.

The invention also provides a data processing device comprising a memory and a processor, wherein:

the memory is used for storing a computer program;

the processor is configured to execute the computer program to implement the data processing method.

The invention also provides a computer-readable storage medium for storing a computer program, wherein the computer program, when executed by a processor, implements the data processing method described above.

The data processing method provided by the invention obtains target data; calculating the target data to obtain first data; sending the first data to each other data processing device, and receiving each second data sent by each other data processing device; updating target data by using the first data and the second data, and counting the processing times; and when the processing times reach a preset threshold value, determining that the data processing is finished.

Therefore, the method can be applied to any data processing device. After the target data is acquired, it is calculated to obtain the first data. After the calculation is finished, data synchronization is performed with the other data processing devices; that is, the first data is sent to the other data processing devices and the second data sent by the other data processing devices is acquired. Updating the target data with the first data and the second data prepares for the next round of data processing, and when the number of processing times reaches a preset threshold, the data processing is determined to be finished. The method divides the data to be processed into a plurality of parts, each data processing device processes its corresponding part, and the data is synchronized among the data processing devices. The data to be processed can therefore be processed in parallel by a plurality of data processing devices, which improves the data processing speed and efficiency and solves the problems of slow data processing speed and low processing efficiency in the related art.

In addition, the invention also provides a data processing apparatus, a data processing device and a computer-readable storage medium, which have the same beneficial effects.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. Obviously, the drawings in the following description are only embodiments of the present invention, and those skilled in the art can obtain other drawings from the provided drawings without creative effort.

Fig. 1 is a flowchart of a data processing method according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of original data partitioning according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of a data synchronization process of a data processing device according to an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of a data processing device according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

In a possible implementation manner, please refer to fig. 1, where fig. 1 is a flowchart of a data processing method according to an embodiment of the present invention. The method comprises the following steps:

S101: Acquire target data.

Each step in this embodiment may be executed by a specified data processing apparatus, and may be referred to as the present data processing apparatus. The specific content of the designated data processing device is not limited, and may be, for example, a heterogeneous processing device with any architecture, such as an FPGA heterogeneous accelerator card. The designated data processing apparatus may be any one of a plurality of data processing apparatuses that perform data processing jobs, and the other data processing apparatuses other than the designated data processing apparatus may be referred to as other data processing apparatuses. The types of the data processing devices may be the same or different, and for example, all the data processing devices may be FPGA heterogeneous accelerator cards, or may include FPGA heterogeneous accelerator cards and other heterogeneous processing devices, as long as the data calculated by the data processing devices are compatible and data synchronization can be completed.

The target data is the data that the designated data processing device needs when performing a calculation. It may be all of the input data (also referred to as raw data) of the calculation or only part of it, and may include multiple sub-data (that is, multiple items of data). Because the data processing of the graph is performed in parallel by a plurality of data processing devices in this embodiment, each data processing device performs different data processing work, and the data each one requires may therefore also differ. The data required for data processing is generally provided by a server, so the server may divide the original data into blocks and then send the blocks to the data processing devices. In this way, each data processing device can directly acquire the original data blocks it needs, which reduces the time spent filtering the original data. In another embodiment, a corresponding data filtering rule may be set in each data processing device, and the original data is filtered according to the data filtering rule to obtain the target data. This approach does not require the server to block the data, which reduces the load on the server. For example, when the original data includes 16 original data blocks, of which only 4 are needed by the present data processing device, those 4 original data blocks are obtained from the original data and determined as the target data. After the target data is acquired, it may be stored in a High Bandwidth Memory (HBM).

Referring to Fig. 2, Fig. 2 is a schematic diagram of original data blocking according to an embodiment of the present invention, in which the original data is divided into 16 original data blocks, from block <0, 0> to block <3, 3>. During division, the data is first divided horizontally according to the source node address of the graph. For example, when the data in the original data has 400 different source node addresses, numbered 1 to 400, the original data can be divided by the ranges 1-100, 101-200, 201-300 and 301-400 to obtain four large original data blocks: block <0, 0> to block <0, 3>, block <1, 0> to block <1, 3>, block <2, 0> to block <2, 3>, and block <3, 0> to block <3, 3>. After the horizontal division is finished, vertical division can be performed according to the destination node address of the graph; the process is similar to horizontal division, except that the source node address is replaced by the destination node address. When the data processing device acquires the original data, it can select the corresponding original data blocks according to preset selection parameters, which completes the acquisition of the target data.
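The blocking described above can be illustrated with a minimal Python sketch. It assumes that the graph is given as a list of (source address, destination address) edges, that addresses run from 1 to 400, and that every range holds 100 addresses; the function names `block_index`, `partition_edges` and `select_blocks` and the sample edge list are hypothetical and are not taken from the embodiment.

```python
from collections import defaultdict


def block_index(address, first_address=1, addresses_per_block=100):
    """Map a node address to a block row or column index (equal-width ranges assumed)."""
    return (address - first_address) // addresses_per_block


def partition_edges(edges, first_address=1, addresses_per_block=100):
    """Group (source, destination) edges into blocks keyed by (row, column).

    The row comes from the source node address (horizontal division) and the
    column from the destination node address (vertical division).
    """
    blocks = defaultdict(list)
    for source, destination in edges:
        row = block_index(source, first_address, addresses_per_block)
        column = block_index(destination, first_address, addresses_per_block)
        blocks[(row, column)].append((source, destination))
    return dict(blocks)


def select_blocks(blocks, selection):
    """Pick the blocks named by the preset selection; absent blocks are empty."""
    return {key: blocks.get(key, []) for key in selection}


# Hypothetical edge list over node addresses 1..400.
edges = [(1, 1), (100, 101), (101, 400), (250, 50), (400, 400)]
blocks = partition_edges(edges)

# A device whose preset selection is the first row of blocks.
target_data = select_blocks(blocks, [(0, 0), (0, 1), (0, 2), (0, 3)])
```

In this sketch a device that selects row 0 receives every edge whose source address lies in 1-100, regardless of the destination address.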

In order to reduce the communication traffic between the server and the data processing devices, the device providing data, for example, the server, may establish a connection with only one of the data processing devices, for example, the connection may be made through a PCIE (peripheral component interconnect express) interface. The data processing apparatuses are connected to each other, for example, through a Media Access Control (MAC) interface. In an embodiment, the data processing apparatus is a target data processing apparatus connected to a server, and in this case, the step S101 may include:

Step 11: Acquire original data sent by the server, and filter the original data by using preset information to obtain target data.

Step 12: Send the original data to the other data processing devices.

It should be noted that the preset information may be a data filtering rule or data block selection information. For example, when the original data is not blocked, the preset information is a data filtering rule; when the original data is blocked, the preset information is data block selection information, for example <0, 0>, <0, 1>, <0, 2>, <0, 3>, indicating that the four original data blocks from <0, 0> to <0, 3> are selected. After the original data is obtained, it may be sent to the other data processing devices so that they can obtain their corresponding data.

In another embodiment, the data processing apparatus is not a target data processing apparatus connected to the server, and in this case, the step S101 may include:

Step 13: Acquire original data sent by the target data processing device, and filter the original data by using preset information to obtain target data.

In this embodiment, the present data processing apparatus needs to acquire raw data transmitted from a target data processing apparatus and obtain target data therefrom.
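Steps 11 to 13 can be sketched as follows. This is only an illustration under stated assumptions: the original data is already blocked and held as a dictionary keyed by block index, the preset information is a list of block indices, and forwarding is modelled as a direct method call rather than a transfer over a MAC interface. The `Device` class, `filter_raw_data` and the card names are hypothetical.

```python
def filter_raw_data(raw_blocks, preset_selection):
    """Keep only the original data blocks named by the preset information."""
    return {key: raw_blocks[key] for key in preset_selection if key in raw_blocks}


class Device:
    """Hypothetical stand-in for one data processing device."""

    def __init__(self, name, preset_selection):
        self.name = name
        self.preset_selection = preset_selection
        self.target_data = {}

    def acquire(self, raw_blocks, peers=()):
        # Step 11 (device connected to the server) or step 13 (any other
        # device): filter the original data locally.
        self.target_data = filter_raw_data(raw_blocks, self.preset_selection)
        # Step 12: only the device connected to the server is given peers,
        # and it forwards the unfiltered original data to each of them.
        for peer in peers:
            peer.acquire(raw_blocks)


# 16 placeholder blocks, <0, 0> to <3, 3>; each card selects one row.
raw = {(r, c): f"block<{r},{c}>" for r in range(4) for c in range(4)}
cards = [Device(f"card{r}", [(r, c) for c in range(4)]) for r in range(4)]

# The server hands the original data to card 0 only.
cards[0].acquire(raw, peers=cards[1:])
```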

S102: Perform calculation processing on the target data to obtain first data.

After the target data is obtained, it is processed with a graph calculation algorithm to obtain the first data. The graph calculation algorithm may be chosen according to actual needs and may be, for example, the PageRank algorithm (a web page ranking algorithm), a shortest path algorithm, a community discovery algorithm, or the like. The specific calculation process differs depending on the graph calculation algorithm used.

For example, when the graph calculation algorithm is the PageRank algorithm, the data processing apparatus may perform the calculation processing according to formula (1), where formula (1) is:

$$PR_{i+1} = \text{initial value} + \text{dangling} + \sum PR_i \tag{1}$$

where, when the current calculation is the (i+1)-th calculation, $PR_{i+1}$ is the destination node value obtained for a given destination node after the (i+1)-th calculation, that is, the first data; initial value is an initial parameter; and dangling is a processing parameter whose size depends on the result of the previous calculation. $\sum PR_i$ is the sum of the values of the source nodes in the target data that point to the destination node.
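As a small worked example of formula (1), the Python sketch below computes one destination node value from the three terms named in the text. The function name `next_destination_value` and the numeric inputs are illustrative assumptions; how the initial value and dangling are derived is not specified here and is left to the caller.

```python
def next_destination_value(initial_value, dangling, source_values):
    """Formula (1): initial value + dangling + sum of the incoming source node values."""
    return initial_value + dangling + sum(source_values)


# Hypothetical inputs: two source nodes point to this destination node.
pr_next = next_destination_value(
    initial_value=0.25,
    dangling=0.125,
    source_values=[0.5, 0.125],
)
```

A destination node with no incoming source nodes in the target data simply receives initial value + dangling.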

In one embodiment, to further increase the data processing speed, the calculation processing may be performed on each data processing device in a multi-channel manner. In this case, step S102 may include:

Step 21: Extract a plurality of source node address information from the target data, and select the low-order data in each source node address information as channel allocation information.

Step 22: Sort the processing channels, and establish a correspondence between the channel allocation information and the processing channels according to the sorting result.

Step 23: According to the correspondence, perform channel allocation on the target sub-data corresponding to each source node address information.

Step 24: In each processing channel, perform calculation processing on the target sub-data by using the processing parameters to obtain first sub-data.

Step 25: Obtain the first data by using the first sub-data.

In this implementation method, since the plurality of processing channels are divided and each processing channel processes data in parallel, before processing the data, a corresponding processing channel needs to be allocated to each target sub-data in the target data. In this embodiment, the target sub-data is a source node value, so that part of data of the source node address information corresponding to the target sub-data may be used as channel allocation information. According to the difference of the number of processing channels, the length of the selected partial data is different. For example, when the number of processing channels is 8, the lower three-bit data of the source node address information may be adopted as the channel allocation information, since the lower three-bit data has eight cases of 000, 001, 010, 011, 100, 101, 110, and 111, and thus the processing channel corresponding to each target sub-data may be determined using the same. When the number of processing channels is 16, the low four-bit data of the address information of the source node may be used as channel allocation information, and the low four-bit data may have sixteen cases, namely 0000, 0001, 0010, 0011, 0100, 0101, 0110, 0111, 1000, 1001, 1010, 1011, 1100, 1101, 1110, and 1111, and thus may be used to determine the processing channel corresponding to each target sub-data. After the channel allocation information is obtained, the processing channels need to be sorted, so as to establish a corresponding relationship with the channel allocation information according to a sorting result. For example, when 16 processing channels are provided, the first channel to the sixteenth channel are respectively obtained after sorting, the first channel corresponds to the channel allocation information of 0000, the second channel corresponds to the channel allocation information of 0001, the third channel corresponds to the channel allocation information of 0010, and so on. 
After the correspondence is established, it is used to allocate a processing channel to the target sub-data corresponding to each source node address information, and in each processing channel the target sub-data is calculated using the processing parameters to obtain first sub-data. The processing parameters are the parameters needed to execute the graph calculation algorithm; for example, when the graph calculation algorithm is the PageRank algorithm, dangling is a processing parameter. After all the first sub-data is obtained, the first data is constructed from all the first sub-data.
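The channel allocation of steps 21 to 23 can be sketched in Python as below. The sketch assumes that source node addresses are non-negative integers and that the number of processing channels is a power of two (8 and 16 in the examples above); the function names and sample values are hypothetical. List index 0 stands for the first channel after sorting, index 1 for the second, and so on.

```python
def channel_bit_count(num_channels):
    """Number of low-order address bits needed to tell the channels apart."""
    if num_channels < 1 or num_channels & (num_channels - 1):
        raise ValueError("this sketch assumes a power-of-two channel count")
    return num_channels.bit_length() - 1


def channel_allocation_info(source_address, num_channels):
    """Step 21: take the low-order bits of the source node address."""
    mask = (1 << channel_bit_count(num_channels)) - 1
    return source_address & mask


def distribute(sub_data, num_channels):
    """Steps 22-23: channel k (in sorted order) receives allocation value k."""
    channels = [[] for _ in range(num_channels)]
    for source_address, value in sub_data:
        index = channel_allocation_info(source_address, num_channels)
        channels[index].append((source_address, value))
    return channels


# Hypothetical target sub-data: (source node address, source node value).
sub_data = [(0b10110, 0.5), (0b00001, 0.25), (0b01110, 0.125)]
channels = distribute(sub_data, num_channels=8)
```

With 8 channels, the addresses 0b10110 and 0b01110 share the low three bits 110 and therefore land in the same channel.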

Based on the above embodiments, in one implementation the target data is composed of multiple original data blocks, and only a single original data block can be processed at a time. The step of extracting the source node address information therefore includes:

Step 31: a target data block is determined in the target data and a plurality of source node address information is extracted from the target data block.

Correspondingly, the step of obtaining the first data by using the first subdata includes:

Step 32: Update the target data block.

Step 33: After all the first sub-data corresponding to the target data is obtained, generate the first data by using all the first sub-data.

In this embodiment, multi-channel data processing is performed on one target data block at a time. After the target data block has been processed to obtain its first sub-data, the target data block is updated so that the other original data blocks in the target data are processed in turn, until all the first sub-data corresponding to the target data has been obtained; the first data is then generated from the first sub-data. The first data includes all data to be synchronized, and its specific content is not limited.
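The block-by-block flow of steps 31 to 33 might look like the following Python sketch. Since the content of the first data is not limited by the text, the sketch makes its own assumptions: each block is a list of (source, destination) edges, the first sub-data of a block is a dictionary of partial sums per destination node, and the first data is produced by adding those partial sums together. All names and values are hypothetical.

```python
def process_block(block, source_values):
    """Compute the first sub-data of one block: partial sums per destination node."""
    partial = {}
    for source, destination in block:
        partial[destination] = partial.get(destination, 0.0) + source_values[source]
    return partial


def compute_first_data(target_blocks, source_values):
    first_sub_data = []
    # Steps 31-32: take one target data block at a time, then move to the next.
    for key in sorted(target_blocks):
        first_sub_data.append(process_block(target_blocks[key], source_values))
    # Step 33: once every block has been processed, build the first data.
    first_data = {}
    for partial in first_sub_data:
        for destination, value in partial.items():
            first_data[destination] = first_data.get(destination, 0.0) + value
    return first_data


# Hypothetical target data made of three blocks.
target_blocks = {
    (0, 0): [(1, 2), (2, 2)],
    (0, 1): [(1, 150)],
    (1, 0): [(101, 2)],
}
source_values = {1: 0.5, 2: 0.25, 101: 0.125}
first_data = compute_first_data(target_blocks, source_values)
```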

S103: Send the first data to each other data processing device, and receive each second data sent by each other data processing device.

After the first data is obtained, it may be sent to each of the other data processing devices. Meanwhile, the second data sent by each of the other data processing devices is received. It should be noted that the second data may be received at any time; that is, whenever second data sent by another data processing device is detected, it may be acquired and stored. To prevent data from being corrupted or overwritten, this embodiment preferably stores the acquired second data at the ping-pong location corresponding to the HBM location used in the current calculation. Referring to Fig. 3, Fig. 3 is a schematic diagram of a data synchronization process of a data processing device according to an embodiment of the present invention. Accelerator cards 0 to 3 are FPGA heterogeneous accelerator cards with different numbers; by exchanging data with one another, the cards keep the data on all data processing devices synchronized in preparation for the next round of data processing.
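The ping-pong storage mentioned above can be illustrated with a small Python sketch. It models the two HBM locations as two dictionaries, which is purely an assumption for illustration; the class name `PingPongBuffer` and its methods are hypothetical. Incoming second data is always written to the bank that the current calculation is not using, and the banks are swapped between rounds.

```python
class PingPongBuffer:
    """Two banks: one read by the current calculation, one filled by incoming data."""

    def __init__(self):
        self._banks = [{}, {}]
        self._active = 0

    @property
    def active(self):
        """Bank used by the current calculation."""
        return self._banks[self._active]

    @property
    def standby(self):
        """Bank that receives second data while the calculation runs."""
        return self._banks[1 - self._active]

    def store_incoming(self, sender, data):
        # Never touches the active bank, so in-flight reads are not overwritten.
        self.standby[sender] = data

    def swap(self):
        # The filled standby bank becomes active; the old active bank is
        # cleared so it can collect the next round of second data.
        self._active = 1 - self._active
        self.standby.clear()


buf = PingPongBuffer()
buf.active["local"] = "data of the current calculation"
buf.store_incoming(sender=1, data="second data from card 1")
before_swap = dict(buf.active)

buf.swap()
after_swap = dict(buf.active)
```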

S104: Update the target data by using the first data and the second data, and count the processing times.

After the first data and all the second data are obtained, the target data is updated by using the first data and all the second data; that is, the data required for the next calculation is taken from the first data and the second data and determined as the new target data. Counting the processing times makes it possible to judge whether the data processing is finished.

Further, in an embodiment, flag information may be used to track the reception of the second data, and the step of updating the target data is performed only after all the second data has been obtained. Specifically, after receiving each second data sent by each other data processing device and before step S104, the method may further include:

Step 41: Update the flag information according to the receiving condition of the second data.

Step 42: When the flag information is in the all-acquisition state, perform the step of updating the target data by using the first data and the second data.

The flag information may be a binary character string in which each character indicates whether the corresponding second data has been received; for example, a first character of 1 indicates that the second data sent by the other data processing device with sequence number 1 has been received. The flag information may also be the number of second data items acquired; when that number equals the number of other data processing devices, it is determined that all the second data has been acquired.
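Both forms of flag information can be sketched together in Python. The class below keeps one character per other data processing device and also exposes the received count; it assumes that sequence numbers start at 1, following the example in the text, and the class and method names are hypothetical.

```python
class ReceiveFlags:
    """Flag information for second data received from the other devices."""

    def __init__(self, num_other_devices):
        self._flags = ["0"] * num_other_devices

    def mark_received(self, sequence_number):
        # Step 41: the character at position (sequence_number - 1) becomes 1.
        self._flags[sequence_number - 1] = "1"

    def as_string(self):
        """Binary character string form of the flag information."""
        return "".join(self._flags)

    def received_count(self):
        """Counter form of the flag information."""
        return self._flags.count("1")

    def all_acquired(self):
        # Step 42 may proceed only when this returns True.
        return self.received_count() == len(self._flags)


flags = ReceiveFlags(num_other_devices=3)
flags.mark_received(1)
partial_string = flags.as_string()
partial_done = flags.all_acquired()

flags.mark_received(3)
flags.mark_received(2)
```

Second data may arrive in any order, so the check is based on the set of received sequence numbers rather than on arrival order.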

In one implementation, the processing parameters may need to be updated after the second data is obtained. Therefore, after receiving the respective second data respectively transmitted by the respective other data processing devices, the method may further include:

Step 51: Update the processing parameter by using the first data and the second data.

For example, when the graph calculation algorithm is the PageRank algorithm, dangling is the processing parameter. After the target data is processed with the graph calculation algorithm, a processing sub-parameter is obtained and placed in the first data to be sent. The processing sub-parameters obtained by the other data processing devices are read from the received second data, the sum of all the processing sub-parameters is calculated, and the value of dangling is updated with that sum, which completes the update of the processing parameter.
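The dangling update of step 51 reduces to a sum over the processing sub-parameters carried in the first data and in each second data. The Python sketch below assumes that each of these is a dictionary holding its sub-parameter under the key `"dangling_part"`; that key, the function name and the sample values are hypothetical.

```python
def update_dangling(first_data, second_data_list, key="dangling_part"):
    """Sum the local sub-parameter and every sub-parameter received from other devices."""
    parts = [first_data[key]] + [second[key] for second in second_data_list]
    return sum(parts)


# Hypothetical sub-parameters from this device and from two other devices.
first_data = {"dangling_part": 0.125}
second_data_list = [{"dangling_part": 0.25}, {"dangling_part": 0.5}]
dangling = update_dangling(first_data, second_data_list)
```

Because every device sums the same set of sub-parameters, each one arrives at the same value of dangling for the next calculation.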

S105: When the processing times reach a preset threshold, determine that the data processing is finished.

When the number of times of processing reaches a preset threshold, it may be determined that the data processing is completed. The specific size of the preset threshold can be set according to actual conditions.

With the data processing method provided by the embodiment of the invention, the target data is acquired and then calculated to obtain the first data. After the calculation is finished, data synchronization is performed with the other data processing devices; that is, the first data is sent to the other data processing devices and the second data sent by the other data processing devices is acquired. Updating the target data with the first data and the second data prepares for the next round of data processing, and when the number of processing times reaches a preset threshold, the data processing is determined to be finished. The method divides the data to be processed into a plurality of parts, each data processing device processes its corresponding part, and the data is synchronized among the data processing devices. The data to be processed can therefore be processed in parallel by a plurality of data processing devices, which improves the data processing speed and efficiency and solves the problems of slow data processing speed and low processing efficiency in the related art.
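The overall flow of S102 to S105 can be simulated in a single process with the Python sketch below. It is a toy model under stated assumptions: the devices are represented by list indices, the exchange of first and second data is a plain list lookup instead of a transfer between cards, and the `compute` and `merge` callbacks as well as the per-device counters are hypothetical placeholders for a real graph calculation algorithm.

```python
def run_iterations(num_devices, initial_target, compute, merge, threshold):
    """Repeat compute -> synchronize -> update until the count reaches the threshold."""
    targets = [dict(initial_target) for _ in range(num_devices)]
    processing_times = 0
    while processing_times < threshold:
        # S102: every device computes its first data from its target data.
        first = [compute(d, targets[d]) for d in range(num_devices)]
        new_targets = []
        for d in range(num_devices):
            # S103: the second data of device d is the first data of the others.
            second = [first[o] for o in range(num_devices) if o != d]
            # S104: update the target data with the first and second data.
            new_targets.append(merge(first[d], second))
        targets = new_targets
        processing_times += 1
    # S105: the preset threshold has been reached.
    return targets, processing_times


def compute(device, target):
    # Placeholder calculation: each device increments its own entry.
    return {device: target.get(device, 0) + 1}


def merge(first_data, second_data_list):
    merged = dict(first_data)
    for second in second_data_list:
        merged.update(second)
    return merged


targets, processing_times = run_iterations(
    num_devices=2, initial_target={}, compute=compute, merge=merge, threshold=3
)
```

After each round every device holds the same merged target data, which is the property the synchronization step is meant to provide.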

In the following, the data processing apparatus provided by the embodiment of the present invention is introduced, and the data processing apparatus described below and the data processing method described above may be referred to correspondingly.

Referring to fig. 4, fig. 4 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention, including:

a data acquisition module 110, configured to acquire target data;

the calculation processing module 120 is configured to perform calculation processing on the target data to obtain first data;

the data synchronization module 130 is configured to send the first data to each of the other data processing apparatuses, and receive each of the second data sent by each of the other data processing apparatuses;

a data updating module 140, configured to update the target data with the first data and the second data, and count the number of times of processing;

and the determining module 150 is configured to determine that the data processing is completed when the processing times reach a preset threshold.

Optionally, the calculation processing module 120 includes:

the channel distribution information acquisition unit is used for extracting a plurality of source node address information from the target data and respectively selecting low-order data in the source node address information as channel distribution information;

the corresponding relation establishing unit is used for sequencing the processing channels and establishing the corresponding relation between the channel distribution information and the processing channels according to the sequencing result;

the channel distribution unit is used for carrying out channel distribution on the target subdata corresponding to the address information of each source node according to the corresponding relation;

the processing unit is used for calculating and processing the target subdata by using the processing parameters in each processing channel to obtain first subdata;

and the first data generation unit is used for obtaining first data by utilizing the first subdata.

Optionally, the apparatus further comprises:

and the processing parameter updating module is used for updating the processing parameters by utilizing the first data and the second data.

Optionally, the channel allocation information obtaining unit includes:

the target data block determining subunit is used for determining a target data block in the target data and extracting a plurality of source node address information from the target data block;

accordingly, the first data generation unit comprises:

a target data block updating subunit, configured to update the target data block;

and the generation subunit is used for generating the first data by using all the first sub-data after all the first sub-data corresponding to the target data are obtained.

Optionally, the apparatus further comprises:

the flag information updating module is used for updating flag information according to the receiving condition of the second data;

accordingly, the data updating module 140 is a module that updates the target data with the first data and the second data when the flag information is in the all-acquisition state.

Optionally, the data obtaining module 110 includes:

the first acquisition unit is used for acquiring original data sent by the server and filtering the original data by using preset information to obtain target data;

and the data sending unit is used for sending the original data to other data processing equipment.

Optionally, the data obtaining module 110 includes:

and the second acquisition unit is used for acquiring the original data sent by the target data processing equipment and filtering the original data by using the preset information to obtain the target data.
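The two acquisition variants above can be sketched as follows. This is only an illustration: the function names are hypothetical, the preset information is assumed to be the set of source node identifiers assigned to a device, and the `forward` callable stands in for whatever link carries the original data to the other data processing devices.

```python
def filter_raw(raw_edges, preset_sources):
    """Keep only the records whose source node belongs to this device's part."""
    return [edge for edge in raw_edges if edge[0] in preset_sources]


def acquire_from_server(raw_edges, preset_sources, forward):
    """First variant: pass the unfiltered original data on, then filter locally."""
    forward(raw_edges)
    return filter_raw(raw_edges, preset_sources)


def acquire_from_device(raw_edges, preset_sources):
    """Second variant: filter original data received from another device."""
    return filter_raw(raw_edges, preset_sources)
```

Because every device filters the same original data with its own preset information, the server only needs to send the data once.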

Referring to FIG. 5, FIG. 5 is a schematic structural diagram of a data processing device according to an embodiment of the present invention. The data processing device 100 may include a processor 101 and a memory 102, and may further include one or more of a multimedia component 103, an input/output (I/O) interface 104, and a communication component 105.

The processor 101 is configured to control the overall operation of the data processing device 100 to complete all or part of the steps in the data processing method. The memory 102 is used to store various types of data to support operation of the data processing device 100; such data may include, for example, instructions for any application or method operating on the data processing device 100, as well as application-related data. The memory 102 may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as one or more of Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.

The multimedia component 103 may include a screen and an audio component. The screen may be, for example, a touch screen, and the audio component is used for outputting and/or inputting audio signals. For example, the audio component may include a microphone for receiving external audio signals. The received audio signal may further be stored in the memory 102 or transmitted through the communication component 105. The audio component further comprises at least one speaker for outputting audio signals. The I/O interface 104 provides an interface between the processor 101 and other interface modules, such as a keyboard, a mouse, or buttons. These buttons may be virtual buttons or physical buttons. The communication component 105 is used for wired or wireless communication between the data processing device 100 and other devices. The wireless communication may be, for example, Wi-Fi, Bluetooth, Near Field Communication (NFC), 2G, 3G, or 4G, or a combination of one or more of them; accordingly, the communication component 105 may include a Wi-Fi component, a Bluetooth component, and an NFC component.

The data processing device 100 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, and is configured to perform the data processing method provided by the above embodiments.

In the following, the computer-readable storage medium provided by an embodiment of the present invention is described; the computer-readable storage medium described below and the data processing method described above may be referred to in correspondence with each other.

The present invention also provides a computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the steps of the data processing method described above.

The computer-readable storage medium may include various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.

The embodiments are described in a progressive manner: each embodiment focuses on its differences from the other embodiments, and the same or similar parts among the embodiments may be referred to each other. Because the apparatus disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is brief, and the relevant points can be found in the description of the method.

Those of skill in the art will further appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative components and steps have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in Random Access Memory (RAM), internal memory, Read-Only Memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art.

Finally, it should also be noted that, herein, relational terms such as "first" and "second" are intended only to distinguish one entity or action from another entity or action, without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "include", "comprise", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that includes a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.

The data processing method, the data processing apparatus, the data processing device, and the computer-readable storage medium provided by the present invention have been described in detail above. Specific examples are used herein to illustrate the principles and embodiments of the present invention, and the description of the above embodiments is intended only to help in understanding the method and the core idea of the present invention. Meanwhile, a person skilled in the art may, according to the idea of the present invention, make variations in the specific embodiments and the scope of application. In summary, the content of this specification should not be construed as a limitation on the present invention.

Claims

1. A data processing method is applied to a data processing device and comprises the following steps:

acquiring target data; the target data is any one of a plurality of parts of data to be processed, and the plurality of parts respectively correspond to one data processing device;

calculating the target data to obtain first data;

sending the first data to each other data processing device, and receiving each second data sent by each other data processing device, so as to synchronize data among the data processing devices, and processing data to be processed in parallel by using a plurality of data processing devices;

when the processing times reach a preset threshold value, determining that data processing is finished;

the calculating the target data to obtain first data includes:

obtaining the first data by utilizing the first subdata;

after receiving each second data respectively sent by each of the other data processing apparatuses, the method further includes:

updating the processing parameter using the first data and the second data;

2. The data processing method of claim 1, wherein the extracting the plurality of source node address information from the target data comprises:

updating the target data block;

3. The data processing method of claim 1, wherein the obtaining target data comprises:

and sending the original data to the other data processing equipment.

4. The data processing method of claim 1, wherein the obtaining target data comprises:

5. A data processing apparatus, applied to a data processing device, includes:

the data acquisition module is used for acquiring target data; the target data is any one of a plurality of parts of data to be processed, and the plurality of parts respectively correspond to one data processing device;

the data synchronization module is used for sending the first data to each other data processing device and receiving each second data sent by each other data processing device respectively so as to synchronize the data among the data processing devices and process the data to be processed in parallel by using a plurality of data processing devices;

the determining module is used for determining to finish data processing when the processing times reach a preset threshold;

a calculation processing module comprising:

the first data generating unit is used for obtaining first data by utilizing the first subdata;

the processing parameter updating module is used for updating the processing parameters by utilizing the first data and the second data;

correspondingly, the data updating module is a module for updating the target data by using the first data and the second data when the flag information is in the all-acquisition state.

6. A data processing apparatus comprising a memory and a processor, wherein:

the memory is used for storing a computer program;

the processor for executing the computer program to implement the data processing method of any one of claims 1 to 4.

7. A computer-readable storage medium for storing a computer program, wherein the computer program, when executed by a processor, implements the data processing method of any one of claims 1 to 4.