WO2020233252A1

WO2020233252A1 - Method and apparatus for diagnosing spark application

Info

Publication number: WO2020233252A1
Application number: PCT/CN2020/083381
Authority: WO
Inventors: 王和平; 尹强; 刘有; 黄山; 杨峙岳; 邸帅; 卢道和
Original assignee: 深圳前海微众银行股份有限公司
Priority date: 2019-05-23
Filing date: 2020-04-03
Publication date: 2020-11-26
Also published as: CN110175124A

Abstract

The present invention relates to the technical field of finances. Disclosed are a method and an apparatus for diagnosing a Spark application. The method comprises: obtaining context information of a Spark application; determining, according to the context information, a diagnostic indicator for the Spark application and an indicator rule corresponding to the diagnostic indicator; acquiring, according to the diagnostic indicator of the Spark application, running information corresponding to the diagnostic indicator during the running process of the Spark application; and diagnosing, according to the indicator rule corresponding to the diagnostic indicator, the running information corresponding to the diagnostic indicator to determine the diagnosis result of the Spark application. In the present technical solution, the running indicator of the Spark application is acquired in real time during the running process of the Spark application, real-time diagnosis is conducted for problems during the running of the Spark application, and an effective diagnostic measure is provided.

Description

Method and device for diagnosing Spark application

Cross references to related applications

This application claims the priority of a Chinese patent application filed with the Chinese Patent Office on May 23, 2019, the application number is 201910432603.1, and the application name is "a method and device for diagnosing Spark applications", the entire content of which is incorporated herein by reference Applying.

Technical field

The embodiments of the present application relate to the field of Fintech, and in particular, to a method and device for diagnosing Spark applications.

Background technique

With the development of computer technology, more and more technologies are applied in the financial field. The traditional financial industry is gradually transforming to financial technology. Spark technology is no exception. However, due to the security and real-time requirements of the financial and payment industries, Spark technology puts forward higher requirements.

Spark technology is a fast and general-purpose computing engine designed for large-scale data processing. Spark uses memory computing technology, which can analyze and compute in memory when the data has not been written to the hard disk. The existing Spark application diagnosis is to collect and analyze the logs during the running process after the Spark application runs, determine the problems existing in the Spark application running based on the preset rule method, and make corresponding adjustments.

In the prior art, the log is diagnosed and analyzed after the Spark application is running, and problems in the running process cannot be found in time and effective diagnostic measures are provided.

Summary of the invention

The embodiments of the application provide a method and device for diagnosing Spark applications, which are used to collect the running indicators of the Spark application in real time during the running process of the Spark application, perform real-time diagnosis of the problems in the running of the Spark application, and provide effective diagnosis Measures.

In the first aspect, an embodiment of the present application provides a method for diagnosing a Spark application. The method can be executed by a runtime diagnostic tool to which the method for diagnosing a Spark application is provided in the financial technology field, including: obtaining information about the Spark application Context information; determine the diagnostic indicators of the Spark application and the indicator rules corresponding to the diagnostic indicators according to the context information; collect the operating information corresponding to the diagnostic indicators of the Spark application during the running process according to the diagnostic indicators of the Spark application; diagnose according to the indicator rules corresponding to the diagnostic indicators Diagnose the running information corresponding to the indicator and determine the diagnosis result of the Spark application.

In the above technical solution, by obtaining the context information of the Spark application, generating diagnostic indicators and indicator rules, it is possible to obtain real-time operating information corresponding to the diagnostic indicators during the Spark application process, diagnose the operating information, and determine the diagnostic result of the Spark application. In this way, it is possible to perform real-time diagnosis of operating faults that occur during the operation of Spark applications, and obtain the diagnosis results. Further, to obtain operating information during the operation of Spark applications, it is possible to more comprehensively collect the parameters and indicators of the operation of Spark applications. In the running log after the Spark application has finished running, the running parameters are more comprehensive and reflect the current running status of the Spark application.

Optionally, there are multiple diagnostic indicators; for any one of the diagnostic indicators, the operating information corresponding to the diagnostic indicator is diagnosed according to the indicator rules corresponding to the diagnostic indicator, and the diagnostic result corresponding to the diagnostic indicator is determined; the diagnostic result corresponding to the multiple diagnostic indicators The diagnosis result that meets the preset rules is determined in the file and determined to be the diagnosis result of the Spark application.

In the above technical solution, for any diagnosis index, a corresponding diagnosis result is set, and at the same time, a preset rule is used to determine a diagnosis result that meets the preset rule from a plurality of diagnosis results, and further determine it as Diagnosis result of Spark application. Diagnose the running indicators of the Spark application in various aspects, evaluate the Spark application in multiple dimensions, and find the faults in the running in time, and according to preset rules, the diagnosis results corresponding to the representative diagnostic indicators are used as the diagnosis of the current Spark application As a result, it is convenient for users to intuitively understand current Spark application running problems, and improve user experience.

Optionally, after the diagnosis result of the Spark application is determined, it further includes: according to the diagnosis code in the diagnosis result that meets the preset rule, obtain the diagnosis measure corresponding to the diagnosis code in the diagnosis result that meets the preset rule from the preset database And report to the user; the corresponding relationship between the diagnosis code and the diagnosis measures is preset in the preset database.

In the above technical solution, a preset database is provided, and the corresponding relationship between the diagnosis code and the diagnosis measure is preset in the preset database, so that after the diagnosis result of the Spark application is determined, the user can provide the targeted diagnosis measure, namely The solution is convenient for users to solve problems independently based on diagnostic measures, so as to solve the problems in the running of the Spark application in time. This technical solution does not require users to query related materials to solve the running problems of the Spark application, but directly sets up related solutions and provides them to users, improving the efficiency of users in solving problems, and improving user experience.

Optionally, after the operating information corresponding to the diagnostic indicators is unified, the operating indicators corresponding to the diagnostic indicators are generated; the operating indicators corresponding to the diagnostic indicators are diagnosed according to the indicator rules corresponding to the diagnostic indicators.

In the above technical solution, the operation information corresponding to the diagnostic index is unified and encapsulated into an operation index that can be diagnosed, so that the operation index is diagnosed.

Optionally, before determining the diagnostic index of the Spark application and the index rule corresponding to the diagnostic index according to the context information, it also includes: obtaining user configuration information; determining the diagnostic index of the Spark application and the corresponding diagnostic index according to the user configuration information and context information Indicator rules.

In the above technical solution, the user is supported to select the diagnostic index and the index rule corresponding to the diagnostic index, that is, the user can choose the index collector and the diagnostic ruler to perform real-time diagnosis of the Spark job to meet the needs of different users.

In the second aspect, the embodiments of the present application provide a device for diagnosing Spark applications. The device may be the runtime diagnostic in the first aspect mentioned above, or a device including the aforementioned runtime diagnostic device, or a device with runtime diagnostics. The corresponding function of the chip, etc. The device includes a module, unit, or means corresponding to the foregoing method, and the module, unit, or means can be implemented by hardware, software, or hardware executing corresponding software. The hardware or software includes one or more modules or units corresponding to the above-mentioned functions. Among them, the device includes: an acquisition unit for acquiring context information of the Spark application; a processing unit for determining the diagnostic index of the Spark application and the index rule corresponding to the diagnostic index according to the context information; and collecting the Spark application according to the diagnostic index of the Spark application Diagnose the running information corresponding to the diagnostic index during the running process; diagnose the running information corresponding to the diagnostic index according to the index rule corresponding to the diagnostic index, and determine the diagnosis result of the Spark application.

Optionally, there are multiple diagnosis indicators; the processing unit is specifically used to: for any one diagnosis indicator, diagnose the operating information corresponding to the diagnosis indicator according to the indicator rule corresponding to the diagnosis indicator, and determine the diagnosis result corresponding to the diagnosis indicator; The diagnosis result corresponding to the diagnosis index determines the diagnosis result that meets the preset rules, and is determined to be the diagnosis result of the Spark application.

Optionally, the processing unit is further configured to: after determining the diagnosis result of the Spark application, according to the diagnosis code in the diagnosis result that meets the preset rule, obtain the diagnosis result that meets the preset rule from the preset database through the obtaining unit The diagnostic measures corresponding to the diagnostic code are reported to the user; wherein the corresponding relationship between the diagnostic code and the diagnostic measure is preset in the preset database.

Optionally, the processing unit is specifically configured to: uniformly process the operating information corresponding to the diagnostic indicators to generate operating indicators corresponding to the diagnostic indicators; and diagnose the operating indicators corresponding to the diagnostic indicators according to the indicator rules corresponding to the diagnostic indicators.

Optionally, the processing unit is further configured to: before determining the diagnostic index of the Spark application and the index rule corresponding to the diagnostic index according to the context information, obtain user configuration information through the obtaining unit; determine the diagnosis of the Spark application according to the user configuration information and context information Indicator rules corresponding to indicators and diagnostic indicators.

In the third aspect, the present application also provides a computing device, including: a processor and a memory; the processor is configured to be coupled with the memory, and by calling and executing the memory stored in the memory storing computer programs or instructions, when the processor When the computer program or instruction is executed, the communication device can execute the method of the first aspect. The computing device may be the operating diagnostic device in the first aspect described above, or a device including the operating diagnostic device, or a chip with corresponding functions of the operating diagnostic device, or the like.

In a fourth aspect, the present application also provides a computer-readable non-volatile storage medium including computer-readable instructions. When the computer reads and executes the computer-readable instructions, the computer executes the above-mentioned method for diagnosing the Spark application.

In the fifth aspect, this application provides a computer program product containing instructions, which when run on a computer, enables the computer to execute the method of the first aspect.

Among them, the technical effects brought by any one of the possible implementation manners of the foregoing second aspect to the fifth aspect may refer to the technical effects brought about by the different implementation manners of the foregoing first aspect, and details are not described herein again.

Description of the drawings

In order to more clearly explain the technical solutions in the embodiments of the present application, the following will briefly introduce the drawings needed in the description of the embodiments. Obviously, the drawings in the following description are only some embodiments of the present application. For those of ordinary skill in the art, without creative work, other drawings can be obtained from these drawings.

FIG. 1 is a schematic diagram of a system architecture provided by an embodiment of the application;

FIG. 2 is a schematic flowchart of a method for diagnosing a Spark application provided by an embodiment of the application;

3 is a schematic structural diagram of a device for diagnosing Spark applications provided by an embodiment of the application;

FIG. 4 is a schematic structural diagram of a computing device provided by an embodiment of the application.

Detailed ways

In order to make the objectives, technical solutions, and advantages of the present invention clearer, the application will be further described in detail below in conjunction with the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present application, rather than all the embodiments. Based on the embodiments in this application, all other embodiments obtained by a person of ordinary skill in the art without creative work shall fall within the protection scope of this application.

Spark provides convenient APIs (Application Programming Interface) such as listeners and metrics systems, which are used to collect the running information of Spark applications during the running process. On this basis, FIG. 1 exemplarily shows a runtime diagnostic device (Runtime Diagnoser) 100 applicable to the method for diagnosing Spark applications in the financial technology field provided by an embodiment of the present application. The runtime diagnostic device 100 may include an indicator collector ( Metric Collector 101, Metric Ruler 102, Rule Result Merger 103, Diagnostic Notifer 104, Database 105; Run Diagnostics 100 is connected to Monitor 200 .

The running diagnostic device 100 is used for the scheduling of the entire Spark application in the diagnosis process. Specifically, the running diagnostic device 100 obtains the context information of the Spark application, instantiates the diagnostic context information (Diagnostic Context) according to the context information; registers according to the diagnostic context information The metric collector 101 and the metric ruler 102 trigger the task of diagnosing the Spark application by timing or active triggering. That is, the metric collector 101 is triggered to collect the metric information of the Spark application during the running process according to the metric rules, and will collect The received indicator information is sent to the indicator ruler 102, and the indicator ruler 102 generates rule results for the indicators according to the corresponding indicator rules. Then, the indicator ruler 102 sends the rule results to the rule result merger 103, and the rule result merger 103 receives The multiple rule results obtained are generated, the diagnosis result of the Spark application is generated, and the diagnosis result is sent to the diagnosis notifier 104. The diagnosis notifier 104 obtains the corresponding diagnosis measure from the database 105 according to the diagnosis result, and sends the diagnosis result and diagnosis measure to The monitor 200 enables the monitor 200 to display the diagnostic results and diagnostic measures of the Spark application in operation to the user.

Based on the above description, FIG. 2 exemplarily shows the flow of a method for diagnosing a Spark application provided by an embodiment of the present application. The flow can be executed by a device for diagnosing a Spark application, and the device can be located in the above-mentioned running diagnostic device, It is the above-mentioned running diagnostic tool. As shown in Figure 2, the process specifically includes:

Step 201: Acquire context information of the Spark application.

Here, when the Spark application is running, it generates the context information of the Spark application, which can also be said to be the diagnostic context information of the Spark application. The context information of the Spark application can include Spark Context, which plays a leading role in the execution of the Spark application. It is responsible for interacting with the program and the Spark cluster, including applying for cluster resources, creating RDD (Resilient Distributed Datasets, flexible distributed Data set), Accumulators (accumulator) and broadcast variables.

Step 202: Determine the diagnostic index of the Spark application and the index rule corresponding to the diagnostic index according to the context information.

Specifically, the basic information of the Spark application is acquired according to the context information of the Spark application for index collection in the operation diagnosis. It can also be said that the diagnostic context information generates a metric collector and a diagnostic ruler through the Spark Context, and transmits the interfaces (Listener and/or Metrics) in the Spark Context to the metric collector for metric collection.

In the embodiments of the present application, the user can also be supported to select the diagnostic index and the index rule corresponding to the diagnostic index, that is, the user can choose the index collector and the diagnostic ruler to perform real-time diagnosis of the Spark job. Specifically, before determining the diagnostic index of the Spark application and the index rule corresponding to the diagnostic index according to the context information, the user configuration information can be obtained first, and then the diagnostic index of the Spark application and the index corresponding to the diagnostic index can be determined according to the user configuration information and context information rule.

Step 203: Collect operating information corresponding to the diagnostic indicators of the Spark application during the running process according to the diagnostic indicators of the Spark application.

The diagnostic indicators may include Task-related indicators, Executor-related indicators, and Job-related indicators. Exemplarily, the running information corresponding to the diagnostic indicators during the running of the Spark application may include the running information corresponding to the task-related indicators, such as the task execution time, the number of task attempts, the task start time, the number of task input records, the number of task output records, Task status, etc.; operation information corresponding to Executor-related indicators, such as the number of Executor parameter settings, the number of existing Executors, the number of Executor exits, the amount of data read, the amount of data output, etc.; the operation information corresponding to the Job-related indicators, such as Job running time, the total number of stages and the total number of successes of the job, the total number of tasks and the total number of successes of the job, etc.

In the embodiment of the application, the running information corresponding to the diagnostic index during the running of the Spark application can be collected according to the diagnostic index of the Spark application at a certain collection frequency. For example, during the running of the Spark application, it is set to collect the diagnostic index every 1 minute. Operating information. The collection frequency can be set based on experience or according to user needs. The collection frequency of different diagnostic indicators can be the same or different.

Step 204: Diagnose the operation information corresponding to the diagnostic index according to the index rule corresponding to the diagnostic index, and determine the diagnosis result of the Spark application.

In the embodiments of the present application, the operating information corresponding to the diagnostic indicators may be unified to generate the operating indicators corresponding to the diagnostic indicators, and then the operating indicators corresponding to the diagnostic indicators can be diagnosed according to the indicator rules corresponding to the diagnostic indicators. Unification processing can include unit unification, format conversion and other processing. After unification processing, the processed operating information is encapsulated into operating indicators corresponding to diagnostic indicators, so that the diagnostic ruler can diagnose the operating indicators corresponding to the diagnostic indicators. Obtain the diagnosis result of the Spark application.

The acquired diagnostic index can be multiple, and each diagnostic index corresponds to the operating index. After the diagnostic index corresponding to the operating index is diagnosed, the diagnostic result corresponding to each diagnostic index can be generated and determined from the diagnostic results corresponding to multiple diagnostic indicators A diagnosis result that meets the preset rules is output and determined as the diagnosis result of the Spark application.

In order to better explain the above-mentioned implementation of generating Spark application diagnosis results, the following provides examples in specific implementation scenarios. This embodiment may have three scenarios, a data tilt scenario, a queue resource insufficient scenario, and a memory excess scenario, and diagnose the diagnostic indicators in the three scenarios, and generate a diagnostic result recording a diagnostic score.

Data skew scenario: For example, when the execution time of a task is abnormal due to data skew at a certain stage, the diagnostic ruler will use the execution time of all tasks obtained from the Task indicator to take the median and maximum number of multiple execution times. If the maximum number is greater than ten times the median (parameters can be configured), get the number of input records of the task with the maximum execution time and the number of input records of the task with the median execution time, if the number of input records of the task with the maximum execution time Ten times the number of input records of the task with the median execution time (parameters are configurable), it is determined that there is data skew at this time, and the diagnosis score is determined by the multiple of the execution time and the multiple of the number of input records. For example, if the current execution time of all tasks is 1min, 2min, 4min, 5min, 45min, it is determined that the maximum execution time is 45min, and the median execution time is 4min. If 45min is greater than ten times of 4min, the maximum execution time is further determined The number of input records of the task and the number of input records of the task with the median execution time are assumed to be 300 and 40 respectively. Then it is determined that 300 is greater than ten times of 40, that is, there is a data skew at this time, and the execution time is The score is determined to be 1 point, and the score is determined to be 1 point by inputting the number of records, and the diagnostic score of the data skew condition is 1+1=2 points.

Scenarios of insufficient queue resources: For example, when the queue has no resources and the number of executors is much smaller than the value set by the user, the diagnostic ruler will obtain the number of existing Executors and the number of set Executors from the executor indicators, and determine the current Whether the number of some Executors is less than 2/3 set by the user (parameters can be configured), if so, it is determined that there is insufficient queue resources at this time, and the diagnosis score is determined by the number of lacking Executors. For example, if the number of existing Executors is 10, the number of Executors set by the user is 30, and the number of existing Executors is less than 2/3 of the user setting, there is insufficient queue resources at this time, and the lack of Divide the number of Executors by 5 for scoring. If the number of Executors currently lacking is 20, the diagnostic score for insufficient queue resources is 4 points.

Out-of-memory scenarios: For example, when a user executes a complex query with a large amount of data, and multiple Executors exit due to memory excess, the diagnostic ruler will obtain the number of existing Executors and the number of failed Executors from the actuator indicators. , And judge whether the number of failed Executors exceeds 1/4 of the existing number (parameters are configurable), if so, it is determined that there is a memory excess at this time, and the diagnosis score is determined by the number of failed Executors. For example, the number of existing Executors is 10, and the number of failed Executors is 3. At this time, there is a memory excess. Determine the score based on the number of failed Executors divided by 3, and the diagnostic score for the memory excess is 1 point .

According to the diagnosis score in the diagnosis result corresponding to each of the above-mentioned diagnosis indexes, it can be determined that the diagnosis result corresponding to the diagnosis index with the highest diagnosis score is determined as the diagnosis result of the Spark application. In the above example, the diagnosis score of data skew is 2 points, the diagnosis score of queue resource shortage is 4 points, and the diagnosis score of memory excess is 1 point, then the diagnosis result of queue resource shortage is determined as the diagnosis result of Spark application .

In addition, the diagnosis result corresponding to the diagnosis index can include not only the diagnosis score of each diagnosis index, but also the diagnosis code and diagnosis information of each diagnosis index. For example, the diagnosis code for data skewing is d10001, and the diagnosis information is the current task. The maximum execution time is 45min, the median execution time is 4min, and the corresponding input records are 300 and 40 respectively.

Of course, in the embodiments of the present application, other preset rules may also be used to determine the diagnosis result that meets the preset rule from the diagnosis results corresponding to multiple diagnosis indicators, and determine it as the diagnosis result of the Spark application. Other preset rules such as The diagnosis result corresponding to the minimum value in the diagnosis score, or the diagnosis result corresponding to the diagnosis code with the largest weight in the diagnosis code, etc. Of course, it is also possible to determine the Spark application based on the diagnosis results corresponding to multiple diagnosis indicators, comprehensive weights or other parameters The diagnosis result is not limited here.

In the embodiment of the present application, a preset database is also provided, and the corresponding relationship between the diagnostic code and the diagnostic measure (solution) is preset in the preset database. After the diagnosis result of the Spark application is determined, the diagnosis code corresponding to the diagnosis code in the diagnosis result can be obtained from the preset database according to the diagnosis code in the diagnosis result, and the diagnosis measures are reported to the user. For example, a data tilt scenario will obtain a data tilt processing solution, and a queue resource shortage scenario will obtain a queue resource shortage processing solution. This technical solution can provide users with targeted diagnostic measures, that is, solutions, after the diagnosis results of the Spark application are determined, so that the users can solve problems autonomously according to the diagnostic measures, so as to solve the problems in the running of the Spark application in time. There is no need for users to query relevant information to solve the running problems of the Spark application, but directly set up relevant solutions and provide them to users, improve the efficiency of users in solving problems, and improve user experience.

The embodiments of this application can be applied to the field of financial technology (Fintech). The field of financial technology refers to a new innovative technology brought to the financial field after information technology is integrated into the financial field. Financial operations are assisted by the use of advanced information technology. , Transaction execution and financial system improvement can improve the processing efficiency and business scale of the financial system, and can reduce costs and financial risks. Exemplarily, Spark can be used in the bank to do whitelist analysis and blacklist analysis of users, and ETL (Extract-transform-load, data extraction, cleaning, conversion, and loading) operations can be executed based on Spark in the bank. When using a Spark application, you can perform diagnostics during Spark running, and perform real-time diagnostics to monitor the normality of the Spark application in real time.

Based on the same concept, FIG. 3 exemplarily shows the structure of a device for diagnosing a Spark application provided by an embodiment of the present application, and the device can execute the flow of the method for diagnosing a Spark application. The device can exist in the form of software or hardware. The device may include: a processing unit 302 and an acquiring unit 301. As an implementation manner, the acquiring unit 301 may include a receiving unit, and the apparatus may also include a sending unit. The processing unit 302 is used to control and manage the actions of the device. The acquiring unit 301 and the sending unit are used to support communication between the device and other network entities.

The processing unit 302 may be a processor or a control device, for example, a general-purpose central processing unit (CPU), a general-purpose processor, a digital signal processing (digital signal processing, DSP), and an application specific integrated circuit (application specific integrated circuit). circuits, ASIC), field programmable gate array (FPGA) or other programmable logic devices, transistor logic devices, hardware components or any combination thereof. It can implement or execute various exemplary logical blocks, modules and circuits described in conjunction with the disclosure of this application. The processor may also be a combination that implements computing functions, for example, including a combination of one or more microprocessors, a combination of a DSP and a microprocessor, and so on. The acquisition unit 301 is an interface circuit of the device for receiving signals from other devices. For example, when the device is implemented as a chip, the acquisition unit 301 is an interface circuit for the chip to receive signals from other chips or devices, and the sending unit is an interface circuit for the chip to send signals to other chips or devices.

The device may be the operating diagnostic device 100 in the above-mentioned embodiment, and may also be a chip for operating the diagnostic device 100. For example, when the device is the operating diagnostic device 100, the processing unit 302 may be a processor, for example, and the acquiring unit 301 may be a transceiver, for example. Optionally, the transceiver may include a radio frequency circuit, and the storage unit may be, for example, a memory. For example, when the device is a chip for running the diagnostic device 100, the processing unit 302 may be a processor, for example, and the acquiring unit 301 may be an input/output interface, a pin, or a circuit, for example. The processing unit 302 can execute computer-executable instructions stored in the storage unit. Optionally, the storage unit is a storage unit in the chip, such as a register, a cache, etc., and the storage unit may also be located in the chip in the first forwarding server. External storage units, such as read-only memory (ROM) or other types of static storage devices that can store static information and instructions, random access memory (RAM), etc.

In one embodiment, the device is the operating diagnostic device 100 in the above-mentioned embodiment. Among them, the obtaining unit 301 is used to obtain the context information of the Spark application; the processing unit 302 is used to determine the diagnosis indicators of the Spark application and the indicator rules corresponding to the diagnosis indicators according to the context information; according to the diagnosis indicators of the Spark application, the obtaining unit 301 collects During the running of the Spark application, the running information corresponding to the diagnostic index is diagnosed; the running information corresponding to the diagnostic index is diagnosed according to the index rule corresponding to the diagnostic index, and the diagnosis result of the Spark application is determined.

Optionally, there are multiple diagnosis indicators; the processing unit 302 is specifically configured to: for any one diagnosis indicator, diagnose the operating information corresponding to the diagnosis indicator according to the indicator rule corresponding to the diagnosis indicator, and determine the diagnosis result corresponding to the diagnosis indicator; The diagnosis results corresponding to the three diagnosis indicators determine the diagnosis results that meet the preset rules, and are determined to be the diagnosis results of the Spark application.

Optionally, the processing unit 302 is further configured to: after determining the diagnosis result of the Spark application, obtain the diagnosis code in the diagnosis result conforming to the preset rule from the preset database according to the diagnosis code in the diagnosis result conforming to the preset rule Corresponding diagnosis measures are reported to the user; wherein, the corresponding relationship between the diagnosis code and the diagnosis measure is preset in the preset database.

Optionally, the processing unit 302 is specifically configured to: after uniformly processing the operating information corresponding to the diagnostic indicators, generate the operating indicators corresponding to the diagnostic indicators; and diagnose the operating indicators corresponding to the diagnostic indicators according to the indicator rules corresponding to the diagnostic indicators.

Optionally, the processing unit 302 is further configured to: before determining the diagnostic index of the Spark application and the index rule corresponding to the diagnostic index according to the context information, obtain user configuration information through the obtaining unit 301; determine the Spark application according to the user configuration information and context information The diagnostic index and the index rule corresponding to the diagnostic index.

Based on the same concept, as shown in FIG. 4, an embodiment of the present application further provides a computing device 400, and the computing device 400 may be the operating diagnostic device in the foregoing embodiment. The computing device 400 includes a processor 402 and a communication interface 403. Optionally, the computing device 400 may further include a memory 401. Optionally, the computing device 400 may further include a communication line 404. Among them, the communication interface 403, the processor 402, and the memory 401 may be connected to each other through a communication line 404; the communication line 404 may be a peripheral component interconnect standard (peripheral component interconnect, PCI for short) bus or an extended industry standard architecture (extended industry standard architecture) , Referred to as EISA) bus and so on. The communication line 404 can be divided into an address bus, a data bus, a control bus, and so on. For ease of presentation, only one thick line is used in FIG. 4 to represent, but it does not mean that there is only one bus or one type of bus.

The processor 402 may be a CPU, a microprocessor, an ASIC, or one or more integrated circuits used to control the execution of the program of the present application.

In a possible embodiment, the processor 402 may be used to: determine the diagnostic index of the Spark application and the index rule corresponding to the diagnostic index according to the context information; according to the diagnostic index of the Spark application, collect the running process of the Spark application through the communication interface 403 The operating information corresponding to the diagnostic indicator in the diagnostic indicator; the operating information corresponding to the diagnostic indicator is diagnosed according to the indicator rule corresponding to the diagnostic indicator, and the diagnosis result of the Spark application is determined.

The communication interface 403 uses any device such as a transceiver to communicate with other devices or communication networks, such as Ethernet, radio access network (RAN), wireless local area networks (WLAN), Wired access network, etc.

The memory 401 may be a ROM or other types of static storage devices that can store static information and instructions, RAM or other types of dynamic storage devices that can store information and instructions, or an electrically erasable programmable read-only memory (electrically erasable programmable read-only memory). read-only memory, EEPROM), compact disc (read-only memory, CD-ROM) or other optical disc storage, optical disc storage (including compact discs, laser discs, optical discs, digital universal discs, Blu-ray discs, etc.), magnetic disks A storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program codes in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto. The memory may exist independently, and is connected to the processor through a communication line 404. The memory can also be integrated with the processor.

The memory 401 is used to store computer-executed instructions for executing the solution of the present application, and the processor 402 controls the execution. The processor 402 is configured to execute computer-executable instructions stored in the memory 401, so as to implement the method provided in the foregoing embodiment of the present application.

Optionally, the computer-executable instructions in the embodiments of the present application may also be referred to as application program code, which is not specifically limited in the embodiments of the present application.

Based on the same inventive concept, the embodiments of the present application also provide a computer-readable non-volatile storage medium, including computer-readable instructions. When the computer reads and executes the computer-readable instructions, the computer executes the above diagnostic Spark application. method.

This application is described with reference to flowcharts and/or block diagrams of methods, equipment (systems), and computer program products according to the embodiments of this application. It should be understood that each process and/or block in the flowchart and/or block diagram, and the combination of processes and/or blocks in the flowchart and/or block diagram can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing equipment to generate a machine, so that the instructions executed by the processor of the computer or other programmable data processing equipment are generated It is a device that realizes the functions specified in one process or multiple processes in the flowchart and/or one block or multiple blocks in the block diagram.

These computer program instructions can also be stored in a computer-readable memory that can guide a computer or other programmable data processing equipment to work in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including the instruction device. The device implements the functions specified in one process or multiple processes in the flowchart and/or one block or multiple blocks in the block diagram.

These computer program instructions can also be loaded on a computer or other programmable data processing equipment, so that a series of operation steps are executed on the computer or other programmable equipment to produce computer-implemented processing, so as to execute on the computer or other programmable equipment. The instructions provide steps for implementing functions specified in a flow or multiple flows in the flowchart and/or a block or multiple blocks in the block diagram.

Although the preferred embodiments of the present application have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn the basic creative concept. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications falling within the scope of the present invention.

Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from the spirit and scope of the present invention. In this way, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalent technologies, the present invention is also intended to include these modifications and variations.

Claims

A method for diagnosing Spark applications, characterized in that it includes:

Get the context information of the Spark application;

Determine the diagnostic index of the Spark application and the index rule corresponding to the diagnostic index according to the context information;

According to the diagnostic index of the Spark application, collecting operation information corresponding to the diagnostic index during the operation of the Spark application;

Diagnose the operation information corresponding to the diagnostic index according to the index rule corresponding to the diagnostic index, and determine the diagnosis result of the Spark application.
The method of claim 1, wherein there are multiple diagnostic indicators;

The diagnosing the operation information corresponding to the diagnostic index according to the index rule corresponding to the diagnostic index, and determining the diagnosis result of the Spark application, includes:

For any diagnostic index, diagnose the operating information corresponding to the diagnostic index according to the index rule corresponding to the diagnostic index, and determine the diagnostic result corresponding to the diagnostic index;

The diagnosis result that meets the preset rule is determined from the diagnosis results corresponding to the multiple diagnosis indicators, and it is determined as the diagnosis result of the Spark application.
The method according to claim 1 or 2, wherein after the determining the diagnosis result of the Spark application, the method further comprises:

According to the diagnostic code in the diagnostic result that meets the preset rules, the diagnostic measures corresponding to the diagnostic code in the diagnostic result that meets the preset rules are obtained from a preset database and reported to the user; the preset database is in advance Correspondence between diagnostic code and diagnostic measures is set.
The method according to any one of claims 1 to 3, wherein the diagnosing the operation information corresponding to the diagnostic index according to the index rule corresponding to the diagnostic index comprises:

After unifying the operation information corresponding to the diagnostic index, the operation index corresponding to the diagnostic index is generated;

Diagnose the operating index corresponding to the diagnostic index according to the index rule corresponding to the diagnostic index.
The method according to any one of claims 1 to 4, wherein before the determining the diagnostic index of the Spark application and the index rule corresponding to the diagnostic index according to the context information, the method further comprises:

Obtain user configuration information;

The determining the diagnostic index of the Spark application and the index rule corresponding to the diagnostic index according to the context information includes:

Determine the diagnostic index of the Spark application and the index rule corresponding to the diagnostic index according to the user configuration information and the context information.
A device for diagnosing Spark applications, characterized in that it includes:

The obtaining unit is used to obtain the context information of the Spark application;

The processing unit is configured to determine the diagnostic index of the Spark application and the index rule corresponding to the diagnostic index according to the context information; according to the diagnostic index of the Spark application, the acquisition unit collects the running process of the Spark application The operating information corresponding to the diagnostic indicator in the operating information; the operating information corresponding to the diagnostic indicator is diagnosed according to the indicator rule corresponding to the diagnostic indicator, and the diagnosis result of the Spark application is determined.
The device according to claim 6, wherein the diagnosis index is multiple; the processing unit is specifically configured to: for any one diagnosis index, correspond to the diagnosis index according to an index rule corresponding to the diagnosis index The diagnosis result corresponding to the diagnosis index is determined; the diagnosis result conforming to the preset rule is determined from the diagnosis results corresponding to the plurality of diagnosis indexes, and it is determined as the diagnosis result of the Spark application.
The device according to any one of claims 6-7, wherein the processing unit is further configured to: after the determination of the diagnosis result of the Spark application, according to the diagnosis result that meets the preset rule The diagnostic code corresponding to the diagnostic code in the diagnostic result that conforms to the preset rules is obtained from the preset database through the obtaining unit and reported to the user; wherein the preset database is preset with the diagnostic code Correspondence with diagnostic measures.
8. The device according to any one of claims 6-8, wherein the processing unit is specifically configured to: after unifying the operation information corresponding to the diagnostic index, generate the operation index corresponding to the diagnostic index ; Diagnose the operating index corresponding to the diagnostic index according to the index rule corresponding to the diagnostic index.
The device according to any one of claims 6 to 9, wherein the processing unit is further configured to: determine the diagnostic index of the Spark application and the index corresponding to the diagnostic index according to the context information Before the rule, user configuration information is obtained through the obtaining unit; according to the user configuration information and the context information, the diagnosis index of the Spark application and the index rule corresponding to the diagnosis index are determined.
A computing device, characterized by comprising:

Memory, used to store program instructions;

The processor is configured to call the program instructions stored in the memory, and execute the method according to any one of claims 1 to 5 according to the obtained program.
A computer-readable non-volatile storage medium, characterized by comprising computer-readable instructions, when the computer reads and executes the computer-readable instructions, the computer is caused to execute any one of claims 1 to 5 Methods.