CN112860417B - Data processing method, device, equipment, system and storage medium - Google Patents

Data processing method, device, equipment, system and storage medium

Publication number
CN112860417B
Authority
CN
China
Prior art keywords
accumulator
accumulated value
target variable
actuator
accumulated
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911180814.7A
Other languages
Chinese (zh)
Other versions
CN112860417A (en)
Inventor
陈国锋
余万水
杨锋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mashang Consumer Finance Co Ltd
Original Assignee
Mashang Consumer Finance Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Mashang Consumer Finance Co Ltd
Priority to CN201911180814.7A
Publication of CN112860417A
Application granted
Publication of CN112860417B
Active legal status
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083 Techniques for rebalancing the load in a distributed system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/485 Task life-cycle, e.g. stopping, restarting, resuming execution
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Combined Controls Of Internal Combustion Engines (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a data processing method, apparatus, device, system and storage medium, which relate to the technical field of data processing and aim to solve the problem that the processing results of Spark executors are inaccurate. The method comprises: acquiring an accumulated value of a target variable from a distributed accumulator, wherein the accumulated value is obtained by the distributed accumulator from accumulation information sent by executors in Spark; judging whether the accumulated value meets a preset condition to obtain a judgment result; and determining a processing mode according to the judgment result. Embodiments of the invention can improve the accuracy of the executors' processing results.

Description

Data processing method, device, equipment, system and storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a data processing method, apparatus, device, system, and storage medium.
Background
Spark is a distributed cluster computing platform. The driver is the application program running in Spark that coordinates and manages the whole flow; an executor is a process launched for an application on a worker node.
When a Spark big data platform processes data, a variable sometimes needs to be accumulated in real time, with different processing applied depending on the accumulated result. In this process, the driver assigns an initial value of the variable to each executor, and each executor then accumulates the value of the variable on its own.
Because of this distributed structure, each executor can only act on the value it has accumulated itself, not on the total across all executors, so the processing result of the executor is inaccurate.
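The problem described above can be illustrated with a minimal sketch in plain Python. It does not use Spark itself; the limit value, the amounts, and the function name are hypothetical and chosen only to show that per-executor counters can each stay under a threshold while the combined total exceeds it.

```python
LIMIT = 100  # hypothetical threshold that the combined total should respect


def run_with_local_counter(amounts, limit):
    """Simulate one executor that only sees its own accumulated value."""
    local_total = 0  # starts from the initial value assigned by the driver
    for amount in amounts:
        if local_total >= limit:
            break  # the executor stops only when its OWN total reaches the limit
        local_total += amount
    return local_total


# Two executors, each processing its own share of the data.
executor_totals = [
    run_with_local_counter([30, 30], LIMIT),
    run_with_local_counter([30, 30], LIMIT),
]
global_total = sum(executor_totals)
```

Here each executor ends at 60 and never stops early, yet the combined total is 120, which is above the limit.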
Disclosure of Invention
The embodiments of the invention provide a data processing method, apparatus, device, system and storage medium to solve the problem of inaccurate executor processing results.
In a first aspect, an embodiment of the present invention provides a data processing method applied to at least one executor in Spark, the method including:
acquiring an accumulated value of a target variable from a distributed accumulator, wherein the accumulated value is obtained by the distributed accumulator from accumulation information sent by executors in Spark;
judging whether the accumulated value meets a preset condition to obtain a judgment result;
and determining a processing mode according to the judgment result.
In a second aspect, an embodiment of the present invention further provides a data processing method applied to a distributed accumulator, including:
sending an accumulated value of a target variable to an executor in Spark, wherein the accumulated value is obtained from accumulation information sent by executors in Spark.
In a third aspect, an embodiment of the present invention further provides a data processing apparatus applied to an executor in Spark, including:
an acquisition module, configured to acquire an accumulated value of a target variable from a distributed accumulator, wherein the accumulated value is obtained by the distributed accumulator from accumulation information sent by executors in Spark;
a judging module, configured to judge whether the accumulated value meets a preset condition to obtain a judgment result;
and a processing module, configured to determine a processing mode according to the judgment result.
In a fourth aspect, an embodiment of the present invention further provides a data processing apparatus applied to a distributed accumulator, including:
a sending module, configured to send an accumulated value of a target variable to an executor in Spark, wherein the accumulated value is obtained from accumulation information sent by executors in Spark.
In a fifth aspect, an embodiment of the present invention further provides an electronic device, including: a transceiver for receiving and transmitting data under the control of a processor, a memory, a processor and a program stored on the memory and executable on the processor, the processor implementing the steps in the data processing method as described above when executing the program.
In a sixth aspect, embodiments of the present invention also provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of a data processing method as described above.
In a seventh aspect, an embodiment of the present invention further provides a data processing system, including: an executor in Spark and a distributed accumulator;
the distributed accumulator is configured to send an accumulated value of a target variable to the executor, wherein the accumulated value is obtained from accumulation information sent by executors in Spark;
the executor is configured to acquire the accumulated value of the target variable from the distributed accumulator, judge whether the accumulated value meets a preset condition to obtain a judgment result, and determine a processing mode according to the judgment result.
In the embodiment of the present invention, an executor in Spark obtains the accumulated value of the target variable from a distributed accumulator, which derives that value from the accumulation information sent by the executors in Spark. That is, the accumulated value an executor uses during processing is not the value it accumulated itself but the value obtained from the accumulation information of all executors. The scheme of the embodiment of the invention can therefore improve the accuracy of the executor's processing result.
Drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art may obtain other drawings from them without inventive effort.
FIG. 1 is a first flowchart of a data processing method provided by an embodiment of the present invention;
FIG. 2 is a second flowchart of a data processing method provided by an embodiment of the present invention;
FIG. 3 is a first block diagram of a data processing system provided by an embodiment of the present invention;
FIG. 4 is a second block diagram of a data processing system provided by an embodiment of the present invention;
FIG. 5 is a schematic diagram of data interactions in a data processing system provided by an embodiment of the present invention;
FIG. 6 is a first block diagram of a data processing apparatus provided by an embodiment of the present invention;
FIG. 7 is a second block diagram of a data processing apparatus provided by an embodiment of the present invention;
FIG. 8 is a first block diagram of an electronic device provided by an embodiment of the present invention;
FIG. 9 is a second block diagram of an electronic device provided by an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, but not all, of the embodiments of the invention. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort fall within the scope of protection of the invention.
Referring to FIG. 1, FIG. 1 is a flowchart of a data processing method according to an embodiment of the present invention. The method is applied to at least one executor in Spark and, as shown in FIG. 1, includes the following steps:
Step 101: acquire an accumulated value of a target variable, where the accumulated value is obtained by a distributed accumulator from accumulation information sent by executors in Spark.
In the embodiment of the invention, the distributed accumulator is a software component that stores data in memory as key-value pairs and performs changes to a value as atomic operations. That is, the distributed accumulator can locate the value corresponding to a key and operate on it, for example by incrementing or decrementing it. The accumulation information may be, for example, a processing result of the executor or a change in the value of a variable obtained by the executor.
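As a rough illustration of a key-value store with atomic value changes, the following single-process sketch uses a lock to make each update atomic. The class name `KeyValueAccumulator`, its methods, and the key `"target"` are hypothetical; the patent does not specify an implementation, and a real deployment would be a separate networked component rather than an in-process object.

```python
import threading


class KeyValueAccumulator:
    """In-memory key-value store whose value updates are atomic (illustrative only)."""

    def __init__(self):
        self._values = {}
        self._lock = threading.Lock()

    def add(self, key, delta):
        # Locate the value by key and change it while holding the lock, so that
        # concurrent updates cannot interleave. A negative delta decrements.
        with self._lock:
            self._values[key] = self._values.get(key, 0) + delta
            return self._values[key]

    def get(self, key):
        with self._lock:
            return self._values.get(key, 0)


def worker(accumulator):
    # Each worker thread stands in for one executor reporting 1000 increments.
    for _ in range(1000):
        accumulator.add("target", 1)


acc = KeyValueAccumulator()
threads = [threading.Thread(target=worker, args=(acc,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
acc.add("target", -100)  # a decrement is just a negative change
total = acc.get("target")
```

Because every read-modify-write happens under the lock, the four workers' increments are never lost, and the final value is 4000 minus the single decrement of 100.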
In the embodiment of the invention, to balance the load on the distributed accumulator and keep it from being overloaded, an executor may send an accumulation request to the distributed accumulator through an accumulator connection device and receive, through the same device, the accumulated value sent by the distributed accumulator.
The executor may be any executor in Spark.
Step 102: judge whether the accumulated value meets a preset condition to obtain a judgment result.
Step 103: determine a processing mode according to the judgment result.
In the embodiment of the present invention, the preset condition may be a set value. The preset conditions of different executors may be the same or different. Each executor compares the accumulated value it obtains against its own preset condition to obtain a judgment result and then acts on that result. That is, each executor judges and determines its processing mode relatively independently.
For example, if the preset conditions of the executors differ, an executor whose preset condition is met ends its processing, while executors whose preset conditions are not met may continue. If the preset conditions of the executors are the same, the executor that currently finds the condition met ends its processing, and each other executor continues to acquire the accumulated value and ends its processing once it finds the condition met.
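Steps 101 to 103 can be sketched as follows. This is a minimal illustration under assumptions: the preset condition is modeled as "accumulated value greater than or equal to a threshold", the two processing modes are named `"stop"` and `"continue"`, and the shared value is a plain dictionary standing in for the distributed accumulator. None of these names come from the patent.

```python
def decide(accumulated_value, threshold):
    """Steps 102 and 103: judge the value, then pick a processing mode."""
    meets_condition = accumulated_value >= threshold  # step 102 (assumed form of the condition)
    return "stop" if meets_condition else "continue"  # step 103


def run_step(fetch_accumulated_value, threshold):
    value = fetch_accumulated_value()                 # step 101
    return decide(value, threshold)


# One shared accumulated value, two executors with different preset conditions.
shared = {"target": 7}
mode_a = run_step(lambda: shared["target"], threshold=10)
mode_b = run_step(lambda: shared["target"], threshold=5)
```

With the same accumulated value of 7, the executor whose threshold is 10 continues and the executor whose threshold is 5 stops, which matches the case where preset conditions differ between executors.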
Different executors may perform different functions, and the processing mode may differ with the function performed. Take loan disbursement in the financial industry as an example, and assume the daily cumulative disbursement limit is A. During disbursement, each executor may obtain the accumulated value from the distributed accumulator and compare it with A. If the accumulated value is less than A, the executor may send the accumulation information it has obtained to the distributed accumulator, which accumulates it into the accumulated value; if the accumulated value is greater than or equal to A, the executor may stop processing, that is, it determines that disbursement must stop.
Because the accumulated value in the distributed accumulator is shared by all executors during processing, each executor can acquire it in real time without waiting for all data to be processed. Each executor can therefore make its judgment in real time and release its resources promptly, which improves executor utilization.
Furthermore, on the basis of the above embodiment, the executor may also obtain accumulation information of the target variable and send it to the distributed accumulator. In practice, the executor sends the accumulation information through the accumulator connection device. The accumulation information may be, for example, a processing result of the executor. In this way, the distributed accumulator continuously updates the accumulated value of the target variable, which improves processing accuracy.
In the embodiment of the present invention, an executor in Spark obtains the accumulated value of the target variable from a distributed accumulator, which derives that value from the accumulation information sent by the executors in Spark. That is, the accumulated value an executor uses during processing is not the value it accumulated itself but the value obtained from the accumulation information of all executors. The scheme of the embodiment of the invention can therefore improve the accuracy of the executor's processing result.
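The disbursement example can be worked through with a small sketch. The value of A, the individual amounts, and the names `SharedAccumulator` and `disburse` are hypothetical. The two executors run one after the other here to keep the result deterministic; real executors would run concurrently against a shared component.

```python
DAILY_LIMIT_A = 100  # hypothetical value for A


class SharedAccumulator:
    """Stand-in for the distributed accumulator, holding one shared total."""

    def __init__(self):
        self.value = 0

    def get(self):
        return self.value

    def add(self, amount):
        self.value += amount
        return self.value


def disburse(accumulator, pending_amounts, limit):
    """One executor: compare the shared total with the limit before each payout."""
    paid = []
    for amount in pending_amounts:
        if accumulator.get() >= limit:
            break                    # accumulated value >= A: stop disbursing
        accumulator.add(amount)      # accumulated value < A: report the amount
        paid.append(amount)
    return paid


acc = SharedAccumulator()
paid_by_executor_1 = disburse(acc, [40, 30], DAILY_LIMIT_A)
paid_by_executor_2 = disburse(acc, [20, 20, 20], DAILY_LIMIT_A)
```

The second executor sees the 70 already disbursed by the first, pays out two amounts, and stops before the third. Note that under the rule as stated (compare first, then accumulate) the shared total can end above A, here at 110; a stricter check would be a design choice beyond what the text describes.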
Referring to FIG. 2, FIG. 2 is a flowchart of a data processing method according to an embodiment of the present invention. The method is applied to a distributed accumulator and, as shown in FIG. 2, includes the following step:
Step 201: send an accumulated value of a target variable to an executor in Spark, where the accumulated value is obtained from accumulation information sent by executors in Spark.
In practice, the distributed accumulator may send the accumulated value in response to a request from an executor. Specifically, the distributed accumulator may receive an accumulation request sent by the executor through the accumulator connection device and determine the key of the target variable from the request. It then obtains the accumulated value of the target variable by that key. Finally, it sends the accumulated value to the executor through the accumulator connection device.
The distributed accumulator stores data as key-value pairs. Thus, when an executor sends an accumulation request for the accumulated value of the target variable, the distributed accumulator can determine the key of the target variable from the request and then look up the corresponding value in the data it stores. The value returned to the executor is the latest accumulated value of the target variable held by the distributed accumulator at that time. The executor can therefore process according to this value and obtain an accurate result.
On the basis of the above embodiments, the distributed accumulator may also update the stored accumulated value according to accumulation information reported by the executor. Specifically, it may receive accumulation information of the target variable sent by the executor through the accumulator connection device and update the accumulated value of the target variable from that information and the stored accumulated value. For example, the value of the target variable may be incremented or decremented according to the accumulation information reported by the executor.
In addition, to facilitate centralized data management by Spark, the distributed accumulator may also send the final accumulated value of the target variable to the driver in Spark.
In the embodiment of the present invention, an executor in Spark obtains the accumulated value of the target variable from a distributed accumulator, which derives that value from the accumulation information sent by the executors in Spark. That is, the accumulated value an executor uses during processing is not the value it accumulated itself but the value obtained from the accumulation information of all executors. The scheme of the embodiment of the invention can therefore improve the accuracy of the executor's processing result.
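The accumulator-side handling of requests and updates might look like the sketch below. The message format (`"type"`, `"variable"`, `"delta"`) and the function name `handle_message` are assumptions made for illustration; the patent does not define a wire format.

```python
def handle_message(store, message):
    """Serve one message from an executor (both message shapes are hypothetical)."""
    key = message["variable"]  # determine the key of the target variable
    if message["type"] == "add":
        # Update: combine the reported accumulation information with the stored value.
        store[key] = store.get(key, 0) + message["delta"]
    # Both message types are answered with the current accumulated value.
    return store.get(key, 0)


store = {"daily_total": 250}
before = handle_message(store, {"type": "get", "variable": "daily_total"})
after_add = handle_message(
    store, {"type": "add", "variable": "daily_total", "delta": 50}
)
after_sub = handle_message(
    store, {"type": "add", "variable": "daily_total", "delta": -20}
)
```

A plain read returns 250, an increment of 50 returns 300, and a decrement of 20 returns 280. Atomicity is omitted here for brevity; a concurrent version would need to guard the update.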
Referring to FIG. 3, FIG. 3 is a schematic diagram of a data processing system according to an embodiment of the present invention. As shown in FIG. 3, the system may include an executor 301 located in Spark and a distributed accumulator 302. There is at least one executor.
The distributed accumulator 302 is configured to send an accumulated value of a target variable to the executor, where the accumulated value is obtained from accumulation information sent by executors in Spark; the executor 301 is configured to obtain the accumulated value of the target variable from the distributed accumulator, judge whether the accumulated value meets a preset condition to obtain a judgment result, and determine a processing mode according to the judgment result.
Furthermore, as shown in FIG. 4, the system may further include a driver 303 located in Spark, configured to send the initial value of the target variable to the distributed accumulator and to obtain the final accumulated value of the target variable from the distributed accumulator. The driver is also used to start the executor.
In the system shown in FIG. 4, the driver, on start-up, asks the distributed accumulator to initialize the variable. After the driver has started, it starts the executors through resource allocation. While processing data, an executor calls the distributed accumulator to accumulate onto the initialized variable, and the distributed accumulator synchronously returns the current value of the accumulator variable when the call completes. Each executor can judge the returned value in real time during processing and act accordingly. After all executors have processed all the data, the driver synchronously acquires the value of the accumulator variable from the distributed accumulator and destroys the accumulator variable.
In the system of the embodiment of the invention, accumulation is performed not inside a single executor but by a distributed accumulator, so that each executor can obtain the final accumulated value, which improves the efficiency and accuracy of data processing.
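The lifecycle in FIG. 4 (initialize, start executors, accumulate with synchronous return, read the final value, destroy) can be simulated in one process. In this sketch, threads stand in for executors and a lock-protected dictionary stands in for the distributed accumulator; the class, method names, key `"count"`, and partition contents are all hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class DistributedAccumulator:
    """Single-process stand-in for the distributed accumulator."""

    def __init__(self):
        self._values = {}
        self._lock = threading.Lock()

    def init(self, key, value):
        with self._lock:
            self._values[key] = value

    def add(self, key, delta):
        # Accumulate and synchronously return the value after the update.
        with self._lock:
            self._values[key] += delta
            return self._values[key]

    def get(self, key):
        with self._lock:
            return self._values[key]

    def destroy(self, key):
        with self._lock:
            return self._values.pop(key)


def executor_task(accumulator, key, records):
    # Each call returns the current shared value, which the executor could judge.
    return [accumulator.add(key, record) for record in records]


def driver():
    acc = DistributedAccumulator()
    acc.init("count", 0)                      # driver initializes the variable
    partitions = [[1, 1, 1], [2, 2], [5]]     # hypothetical data per executor
    with ThreadPoolExecutor(max_workers=3) as pool:  # driver starts the executors
        futures = [pool.submit(executor_task, acc, "count", p) for p in partitions]
        returned = [f.result() for f in futures]
    final_value = acc.get("count")            # driver reads the final value
    acc.destroy("count")                      # driver destroys the variable
    return final_value, returned


final_value, returned = driver()
```

The final value is the sum over all partitions regardless of thread scheduling, and the largest value any executor saw returned equals that final value.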
As shown in FIG. 5, the executors interact with the distributed accumulator through an accumulator connection pool. The accumulator connection pool is a resource that the driver configures for the executors and that all executors can share. Each executor may form a data processing module that provides the computing capacity for each operator and initiates a request to call the accumulation variable when an accumulation calculation is required. The accumulator connection pool acts as a bridge between the executors and the distributed accumulator: it schedules and manages connection resources to the distributed accumulator and forwards call requests. Specifically, the accumulator connection pool receives call requests from the executors and buffers them; after receiving the accumulated values returned by the distributed accumulator, it returns each accumulated value to the corresponding executor. The distributed accumulator is a processing container that stores variables in memory as key-value pairs and increments and decrements values atomically.
In the system of the embodiment of the invention, the variables of the executors can be shared while the Spark platform is processing, which reduces interference between executors and improves their processing efficiency. Moreover, because the executors can effectively share variables and data, each executor can adjust its subsequent data processing in real time, which improves the efficiency of data use.
The embodiment of the invention also provides a data processing apparatus applied to an executor in Spark. Referring to FIG. 6, FIG. 6 is a block diagram of a data processing apparatus according to an embodiment of the present invention. Because the apparatus solves the problem on a principle similar to that of the data processing method in the embodiment of the present invention, its implementation may refer to the implementation of the method, and repeated details are omitted.
As shown in FIG. 6, the data processing apparatus 600 includes: an obtaining module 601, configured to obtain an accumulated value of a target variable from a distributed accumulator, where the accumulated value is obtained by the distributed accumulator from accumulation information sent by executors in Spark; a judging module 602, configured to judge whether the accumulated value meets a preset condition to obtain a judgment result; and a processing module 603, configured to determine a processing mode according to the judgment result.
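The buffering and routing role of the accumulator connection pool can be sketched as below. This is a simplified, single-threaded model: the class `AccumulatorConnectionPool`, its `submit` and `flush` methods, and the executor identifiers are hypothetical, and a dictionary stands in for the distributed accumulator. Connection scheduling itself is not modeled.

```python
from collections import deque


class AccumulatorConnectionPool:
    """Buffers executor requests, forwards them, and routes replies back."""

    def __init__(self, accumulator_store):
        self._store = accumulator_store  # stand-in for the distributed accumulator
        self._pending = deque()          # buffered call requests

    def submit(self, executor_id, key, delta):
        # Receive a call request from an executor and buffer it.
        self._pending.append((executor_id, key, delta))

    def flush(self):
        # Forward buffered requests in arrival order and collect, per executor,
        # the accumulated values returned for that executor's requests.
        replies = {}
        while self._pending:
            executor_id, key, delta = self._pending.popleft()
            self._store[key] = self._store.get(key, 0) + delta
            replies.setdefault(executor_id, []).append(self._store[key])
        return replies


store = {}
pool = AccumulatorConnectionPool(store)
pool.submit("executor-1", "total", 5)
pool.submit("executor-2", "total", 3)
pool.submit("executor-1", "total", 2)
replies = pool.flush()
```

Each executor receives only the values produced by its own requests, while the underlying total reflects the contributions of all executors.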
Optionally, the acquiring module 601 may include: the transmitting submodule is used for transmitting an accumulation request to the distributed accumulator through the accumulator connecting device; and the receiving submodule is used for receiving the accumulated value sent by the distributed accumulator through the accumulator connecting device.
Optionally, the apparatus may further include: the second acquisition module is used for acquiring accumulated information of the target variable; and the sending module is used for sending the accumulated information of the target variable to the distributed accumulator.
Optionally, the sending module is specifically configured to send the accumulated information to the distributed accumulator through an accumulator connection device.
The device provided by the embodiment of the present invention may execute the above method embodiment, and its implementation principle and technical effects are similar, and this embodiment will not be described herein.
The embodiment of the invention also provides a data processing apparatus applied to a distributed accumulator. Referring to FIG. 7, FIG. 7 is a block diagram of a data processing apparatus according to an embodiment of the present invention. Because the apparatus solves the problem on a principle similar to that of the data processing method in the embodiment of the present invention, its implementation may refer to the implementation of the method, and repeated details are omitted.
As shown in FIG. 7, the data processing apparatus 700 includes: a sending module 701, configured to send an accumulated value of the target variable to an executor in Spark, where the accumulated value is obtained from accumulation information sent by executors in Spark.
Optionally, the sending module 701 may include:
a receiving submodule, configured to receive an accumulation request sent by the executor through the accumulator connection device; a determining submodule, configured to determine the key of the target variable according to the accumulation request; an acquisition submodule, configured to acquire the accumulated value of the target variable according to the key of the target variable; and a sending submodule, configured to send the accumulated value of the target variable to the executor through the accumulator connection device.
Optionally, the apparatus may further include: the receiving module is used for receiving accumulation information of the target variable sent by the executor through the accumulator connecting device; and the updating module is used for updating the accumulated value of the target variable according to the accumulated information and the stored accumulated value of the target variable.
Optionally, the sending module is further configured to send the final accumulated value of the target variable to the driver in Spark.
The device provided by the embodiment of the present invention may execute the above method embodiment, and its implementation principle and technical effects are similar, and this embodiment will not be described herein.
The embodiment of the invention also provides an electronic device applied to an executor in Spark. As shown in FIG. 8, the electronic device according to an embodiment of the present invention includes a processor 800, configured to read the program in memory 820 and perform the following process:
acquiring an accumulated value of a target variable from a distributed accumulator, wherein the accumulated value is obtained by the distributed accumulator from accumulation information sent by executors in Spark;
judging whether the accumulated value meets a preset condition to obtain a judgment result;
and determining a processing mode according to the judgment result.
A transceiver 810 for receiving and transmitting data under the control of the processor 800.
In FIG. 8, the bus architecture may comprise any number of interconnected buses and bridges, which link together various circuits, in particular one or more processors represented by processor 800 and memory represented by memory 820. The bus architecture may also link together various other circuits such as peripheral devices, voltage regulators and power management circuits, which are well known in the art and are therefore not described further here. The bus interface provides an interface. Transceiver 810 may be a plurality of elements, that is, it may include a transmitter and a receiver, providing a means for communicating with various other apparatus over a transmission medium. The processor 800 is responsible for managing the bus architecture and general processing, and the memory 820 may store data used by the processor 800 in performing operations.
The processor 800 is further configured to read the program and perform the following steps:
transmitting an accumulation request to the distributed accumulator through an accumulator connecting device;
the accumulated value sent by the distributed accumulator through the accumulator connection device is received.
The processor 800 is further configured to read the program and perform the following steps:
Acquiring accumulation information of the target variable;
And sending accumulation information of the target variable to the distributed accumulator.
The embodiment of the invention also provides an electronic device applied to a distributed accumulator. As shown in FIG. 9, the electronic device according to an embodiment of the present invention includes a processor 900, configured to read the program in memory 920 and perform the following process:
sending an accumulated value of the target variable to an executor in Spark, wherein the accumulated value is obtained from accumulation information sent by executors in Spark.
A transceiver 910 for receiving and transmitting data under the control of the processor 900.
In FIG. 9, the bus architecture may comprise any number of interconnected buses and bridges, which link together various circuits, in particular one or more processors represented by processor 900 and memory represented by memory 920. The bus architecture may also link together various other circuits such as peripheral devices, voltage regulators and power management circuits, which are well known in the art and are therefore not described further here. The bus interface provides an interface. The transceiver 910 may be a plurality of elements, that is, it may include a transmitter and a receiver, providing a means for communicating with various other apparatus over a transmission medium. The processor 900 is responsible for managing the bus architecture and general processing, and the memory 920 may store data used by the processor 900 in performing operations.
The processor 900 is further configured to read the program, and perform the following steps:
receiving an accumulation request sent by the executor through an accumulator connection device;
determining a key value of the target variable according to the accumulation request;
acquiring the accumulated value of the target variable according to the key value of the target variable;
sending the accumulated value of the target variable to the executor through the accumulator connection device.
The processor 900 is further configured to read the program, and perform the following steps:
receiving accumulation information of the target variable sent by the executor through the accumulator connection device;
updating the accumulated value of the target variable according to the accumulation information and the stored accumulated value of the target variable.
The processor 900 is further configured to read the program, and perform the following steps:
sending the final accumulated value of the target variable to a driver in Spark.
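The accumulator-side steps above (determine the key value from the request, look up the accumulated value, update it from accumulation information, and report a final value) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented implementation: the class `AccumulatorNode`, its method names, the key layout `"<application id>:<variable name>"`, and the integer-sum update rule are all hypothetical, and threads stand in for Spark executors.

```python
import threading
from typing import Dict


class AccumulatorNode:
    """Hypothetical distributed-accumulator node holding values by key."""

    def __init__(self) -> None:
        self._values: Dict[str, int] = {}
        self._lock = threading.Lock()  # executors may call in concurrently

    @staticmethod
    def key_of(message: dict) -> str:
        # Assumed key layout: "<application id>:<variable name>".
        return f"{message['app_id']}:{message['variable']}"

    def on_accumulation_request(self, request: dict) -> int:
        key = self.key_of(request)           # determine the key value
        with self._lock:
            return self._values.get(key, 0)  # look up and return the value

    def on_accumulation_info(self, info: dict) -> None:
        key = self.key_of(info)
        with self._lock:
            # New value = stored value combined with the reported delta.
            self._values[key] = self._values.get(key, 0) + info["delta"]

    def final_value(self, app_id: str, variable: str) -> int:
        # The value that would be handed to the Spark driver at the end.
        return self.on_accumulation_request(
            {"app_id": app_id, "variable": variable}
        )


node = AccumulatorNode()


def executor_task() -> None:
    # Each simulated executor reports 100 increments of 1.
    for _ in range(100):
        node.on_accumulation_info(
            {"app_id": "app-1", "variable": "count", "delta": 1}
        )


threads = [threading.Thread(target=executor_task) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

final_count = node.final_value("app-1", "count")
print(final_count)  # 400
```

The lock makes the read-modify-write in `on_accumulation_info` atomic, so concurrent reports from several executors are not lost.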
An embodiment of the present invention further provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements each process of the above data processing method embodiments and achieves the same technical effects; to avoid repetition, the details are not described again here. The computer-readable storage medium is, for example, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises the element.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, or by hardware, although in many cases the former is preferred. Based on this understanding, the technical solution of the present invention may be embodied, in essence or in part, in the form of a software product stored in a storage medium (e.g., ROM/RAM, magnetic disk, optical disk) and comprising instructions for causing a terminal (which may be a mobile phone, computer, server, air conditioner, network device, or the like) to perform the methods described in the embodiments of the present invention.
The embodiments of the present invention have been described above with reference to the accompanying drawings, but the present invention is not limited to the specific embodiments described, which are merely illustrative and not restrictive. Guided by the present invention, those of ordinary skill in the art may devise many other forms without departing from the spirit of the invention and the scope of the claims, all of which fall within the protection of the present invention.

Claims (11)

1. A data processing method, characterized in that it is applied to an executor of Spark, there being at least one executor, the method comprising:
acquiring an accumulated value of a target variable from a distributed accumulator, wherein the accumulated value is obtained by the distributed accumulator according to accumulation information sent by each executor of Spark;
judging whether the accumulated value meets a preset condition to obtain a judgment result;
determining a processing mode according to the judgment result;
wherein the acquiring the accumulated value of the target variable from the distributed accumulator comprises:
sending an accumulation request to the distributed accumulator through an accumulator connection device;
receiving the accumulated value sent by the distributed accumulator through the accumulator connection device.
2. The method according to claim 1, wherein the method further comprises:
acquiring accumulation information of the target variable;
sending the accumulation information of the target variable to the distributed accumulator.
3. A data processing method, applied to a distributed accumulator, the method comprising:
sending an accumulated value of a target variable to an executor of Spark, wherein the accumulated value is obtained according to accumulation information sent by each executor of Spark;
wherein the sending the accumulated value of the target variable to the executor of Spark comprises:
receiving an accumulation request sent by the executor through an accumulator connection device;
determining a key value of the target variable according to the accumulation request;
acquiring the accumulated value of the target variable according to the key value of the target variable;
sending the accumulated value of the target variable to the executor through the accumulator connection device.
4. The method according to claim 3, characterized in that the method further comprises:
receiving accumulation information of the target variable sent by the executor through the accumulator connection device;
updating the accumulated value of the target variable according to the accumulation information and the stored accumulated value of the target variable.
5. The method according to any one of claims 3-4, further comprising:
sending the final accumulated value of the target variable to a driver in Spark.
6. A data processing apparatus, characterized in that it is applied to an executor in Spark, comprising:
an acquisition module, configured to acquire an accumulated value of a target variable from a distributed accumulator, wherein the accumulated value is obtained by the distributed accumulator according to accumulation information sent by each executor in Spark;
a judging module, configured to judge whether the accumulated value meets a preset condition to obtain a judgment result;
a processing module, configured to determine a processing mode according to the judgment result;
wherein the acquisition module comprises: a sending submodule, configured to send an accumulation request to the distributed accumulator through an accumulator connection device; and a receiving submodule, configured to receive the accumulated value sent by the distributed accumulator through the accumulator connection device.
7. A data processing apparatus, applied to a distributed accumulator, comprising:
a sending module, configured to send an accumulated value of a target variable to an executor of Spark, wherein the accumulated value is obtained according to accumulation information sent by each executor in Spark;
wherein the sending module comprises:
a receiving submodule, configured to receive an accumulation request sent by the executor through an accumulator connection device; a determining submodule, configured to determine a key value of the target variable according to the accumulation request; an acquisition submodule, configured to acquire the accumulated value of the target variable according to the key value of the target variable; and a sending submodule, configured to send the accumulated value of the target variable to the executor through the accumulator connection device.
8. An electronic device, comprising: a transceiver, a memory, a processor, and a program stored on the memory and executable on the processor, the transceiver being configured to receive and transmit data under control of the processor; characterized in that
the processor is configured to read the program in the memory to implement the steps of the method according to any one of claims 1 to 2, or to implement the steps of the method according to any one of claims 3 to 5.
9. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 2, or implements the steps of the method according to any one of claims 3 to 5.
10. A data processing system, comprising: at least one executor in Spark, and a distributed accumulator;
the distributed accumulator is configured to send an accumulated value of a target variable to the executor, wherein the accumulated value is obtained according to accumulation information sent by each executor in Spark;
the executor is configured to acquire the accumulated value of the target variable from the distributed accumulator, judge whether the accumulated value meets a preset condition to obtain a judgment result, and determine a processing mode according to the judgment result;
wherein the executor acquires the accumulated value of the target variable by:
sending an accumulation request to the distributed accumulator through an accumulator connection device; and receiving the accumulated value sent by the distributed accumulator through the accumulator connection device;
and the distributed accumulator sends the accumulated value of the target variable to the executor by:
receiving the accumulation request sent by the executor through the accumulator connection device;
determining a key value of the target variable according to the accumulation request;
acquiring the accumulated value of the target variable according to the key value of the target variable;
sending the accumulated value of the target variable to the executor through the accumulator connection device.
11. The system of claim 10, wherein the system further comprises:
a Spark driver, configured to send an initialized value of the target variable to the distributed accumulator and to obtain the final accumulated value of the target variable from the distributed accumulator.
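The interaction recited in claims 10 and 11 can be illustrated with a small end-to-end sketch. It is a simplified, single-process simulation under stated assumptions rather than the claimed system: `SharedAccumulator`, `run_executor`, the target variable name `bad_records`, the preset condition (accumulated value reaching a limit), and the two processing modes (continue or stop) are all hypothetical choices, and the executors are run one after another instead of in parallel.

```python
from typing import Dict, List


class SharedAccumulator:
    """Hypothetical stand-in for the distributed accumulator."""

    def __init__(self) -> None:
        self._values: Dict[str, int] = {}

    def initialize(self, key: str, value: int) -> None:
        # Called by the (simulated) driver before executors start.
        self._values[key] = value

    def add(self, key: str, delta: int) -> None:
        # Accumulation information sent by an executor.
        self._values[key] = self._values.get(key, 0) + delta

    def get(self, key: str) -> int:
        # Accumulated value returned to an executor or to the driver.
        return self._values.get(key, 0)


def run_executor(acc: SharedAccumulator, key: str,
                 records: List[int], limit: int) -> int:
    """Process records until the shared accumulated value reaches `limit`.

    Returns the number of records this executor processed.
    """
    processed = 0
    for record in records:
        # Judge the preset condition against the global accumulated value.
        if acc.get(key) >= limit:
            break  # processing mode: stop handling further records
        if record < 0:
            acc.add(key, 1)  # negative numbers play the role of bad records
        processed += 1  # processing mode: continue
    return processed


acc = SharedAccumulator()
acc.initialize("bad_records", 0)  # driver sends the initialized value

partitions = [[1, -1, 2, -2], [3, -3, 4, 5], [6, 7, -4, 8]]
processed_counts = [
    run_executor(acc, "bad_records", part, limit=3) for part in partitions
]
final_value = acc.get("bad_records")  # driver obtains the final value

print(processed_counts)  # [4, 2, 0]
print(final_value)       # 3
```

The third executor processes nothing because the value accumulated by the first two already satisfies the condition, which shows how a globally visible accumulated value can steer each executor's processing mode.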
CN201911180814.7A 2019-11-27 2019-11-27 Data processing method, device, equipment, system and storage medium Active CN112860417B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911180814.7A CN112860417B (en) 2019-11-27 2019-11-27 Data processing method, device, equipment, system and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911180814.7A CN112860417B (en) 2019-11-27 2019-11-27 Data processing method, device, equipment, system and storage medium

Publications (2)

Publication Number Publication Date
CN112860417A (en) 2021-05-28
CN112860417B (en) 2024-07-05

Family

ID=75985539

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911180814.7A Active CN112860417B (en) 2019-11-27 2019-11-27 Data processing method, device, equipment, system and storage medium

Country Status (1)

Country Link
CN (1) CN112860417B (en)

Also Published As

Publication number Publication date
CN112860417A (en) 2021-05-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant