WO2022001209A1 - Job execution method, apparatus, system and computer-readable storage medium - Google Patents

Job execution method, apparatus, system and computer-readable storage medium

Info

Publication number
WO2022001209A1
WO2022001209A1 (PCT/CN2021/081960, CN2021081960W)
Authority
WO
WIPO (PCT)
Prior art keywords
spark
job
engine
target
version
Prior art date
Application number
PCT/CN2021/081960
Other languages
English (en)
French (fr)
Inventor
刘有
尹强
王和平
黄山
杨峙岳
冯朝阁
杨永坤
邸帅
卢道和
Original Assignee
深圳前海微众银行股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳前海微众银行股份有限公司
Publication of WO2022001209A1 publication Critical patent/WO2022001209A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/445Program loading or initiating
    • G06F9/44505Configuring for program initiating, e.g. using registry, configuration files
    • G06F9/4451User profiles; Roaming
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/445Program loading or initiating
    • G06F9/44521Dynamic linking or loading; Link editing at or after load time, e.g. Java class loading
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/445Program loading or initiating
    • G06F9/44536Selecting among different versions
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/448Execution paradigms, e.g. implementations of programming paradigms
    • G06F9/4488Object-oriented
    • G06F9/449Object-oriented method invocation or resolution
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the present application relates to the technical field of financial technology (Fintech), and in particular, to a job execution method, apparatus, system, and computer-readable storage medium.
  • Each Spark version update brings many new features, but is sometimes poorly compatible with jobs written for older versions. An environment often serves a large number of users: some need the new version of Spark, while others still need an older version. However, the current Linkis (a data middleware that connects multiple computing and storage engines) service cluster can only run Spark jobs of a single version and cannot integrate multiple Spark versions running at the same time. Therefore, to meet these users' multi-version requirements, multiple sets of Linkis clusters usually have to be deployed, which requires a large number of machines to host Spark Drivers of different versions; meanwhile, users' jobs have to be switched and deployed across different environments.
  • Every additional set of Linkis increases operation and maintenance cost and difficulty, and it is also difficult for users to maintain the use of each environment.
  • the main purpose of the present application is to provide a job execution method, device, system and computer-readable storage medium, which aim to simultaneously support the running of multiple versions of Spark jobs in a Linkis service cluster and reduce operation and maintenance costs.
  • the present application provides a job execution method, the job execution method includes:
  • the present application also provides a job execution device, the job execution device includes:
  • a first obtaining module configured to obtain the version number, dynamic configuration parameters and Spark job code of the target Spark engine according to the execution request when receiving the execution request of the Spark job;
  • a first determining module configured to determine the deployment directory information and version loading rules of the target Spark engine according to the version number
  • an engine initialization module configured to obtain static configuration parameters according to the deployment directory information, and to initialize the target Spark engine with the dynamic configuration parameters and the static configuration parameters according to the version loading rules, so as to start the target Spark engine;
  • a job execution module configured to submit the Spark job code to the target Spark engine to execute the job.
  • the present application also provides a job execution system
  • the job execution system includes: a memory, a processor, and a job execution program stored on the memory and executable on the processor. When the job execution program is executed by the processor, the steps of the job execution method described above are implemented.
  • the present application also provides a computer-readable storage medium, where a job execution program is stored on the computer-readable storage medium, and when the job execution program is executed by a processor, the steps of the above-mentioned job execution method are implemented.
  • in this application, the dependent jar packages that are prone to conflict across different Spark versions are installed and deployed in advance. When an execution request for a Spark job is received, the version number in the request is obtained to determine the corresponding deployment directory information, and the corresponding startup parameters (including dynamic configuration parameters and static configuration parameters) are obtained; the Spark engine of the target version is then initialized according to the startup parameters and started.
  • This enables dynamic loading of the jar packages that different Spark versions depend on and avoids the multi-version jar conflict problem, so that multiple versions of Spark can execute in parallel under the same Linkis cluster. Therefore, this application only needs to deploy one set of Linkis service clusters, in which Spark engines of different versions can be created to support running multi-version Spark jobs. Compared with deploying multiple sets of Linkis clusters to host different versions of Spark engines, as in the prior art, this application greatly reduces machine resources and operation and maintenance costs.
  • FIG. 1 is a schematic diagram of the device structure of the hardware operating environment involved in the solution of the embodiment of the present application;
  • FIG. 2 is a schematic flowchart of the first embodiment of the job execution method of the application
  • FIG. 3 is a schematic diagram of the system architecture of the job execution system of the application.
  • FIG. 4 is a schematic diagram of functional modules of the first embodiment of the job execution apparatus of the present application.
  • FIG. 1 is a schematic diagram of a device structure of a hardware operating environment involved in the solution of an embodiment of the present application.
  • the job execution device in this embodiment of the application may be a server, or terminal equipment such as a PC (Personal Computer), a tablet computer, or a portable computer.
  • the job execution device may include: a processor 1001 , such as a CPU, a communication bus 1002 , a user interface 1003 , a network interface 1004 , and a memory 1005 .
  • the communication bus 1002 is used to realize the connection and communication between these components.
  • the user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface.
  • the network interface 1004 may include a standard wired interface and a wireless interface (eg, a Wi-Fi interface).
  • the memory 1005 may be high-speed RAM memory, or may be non-volatile memory, such as disk memory.
  • the memory 1005 may also be a storage device independent of the aforementioned processor 1001 .
  • the structure of the job execution device shown in FIG. 1 does not constitute a limitation on the job execution device; it may include more or fewer components than shown, combine certain components, or use a different component arrangement.
  • the memory 1005 as a computer storage medium may include an operating system, a network communication module, and a job execution program.
  • the network interface 1004 is mainly used to connect to the background server and perform data communication with it; the user interface 1003 is mainly used to connect to the client and perform data communication with it; and the processor 1001 can be used to call the job execution program stored in the memory 1005 and execute each step of the following job execution method.
  • the present application provides a job execution method.
  • FIG. 2 is a schematic flowchart of the first embodiment of the job execution method of the present application.
  • the job execution method includes:
  • Step S10 when receiving the execution request of the Spark job, obtain the version number, dynamic configuration parameters and Spark job code of the target Spark engine according to the execution request;
  • the job execution method in this embodiment is applied to a job execution system. As shown in FIG. 3, the job execution system includes a Linkis cluster service and a Yarn cluster; the Linkis cluster service includes an external service entry and an engine management service, and the external service entry includes a Job management module and an engine manager.
  • specifically, compared with existing job execution systems, the engine management service has been reworked: a dynamic loading scheme for the jars that different Spark versions depend on, together with dynamic adjustment of class loaders, is added to solve the problem of executing Spark jobs of different versions in parallel.
  • in addition, a version identifier is added to the execution request parameters of Spark jobs; correspondingly, an engine manager is added to manage the Spark engine services of different versions and different users, and to create or select engines based on the version number.
  • the Linkis cluster service is a data middleware that connects multiple computing and storage engines, provides a unified Restful (a design style and development approach for network applications) interface, and submits and executes scripts such as SQL (Structured Query Language), PySpark (the API Spark provides for Python developers), HiveQL (Hive Query Language, the query language of a data warehouse tool), and Scala (a multi-paradigm programming language).
  • the Job management module in the external service entry is used to receive Spark job execution requests and to automatically adjust erroneous Spark job code.
  • the engine manager is used to manage Spark engine services of different versions and different users, and to create or select engines based on version numbers.
  • the engine management service is used to manage process creation, status tracking, and process destruction for the processes in which the Spark Context (Spark context) lives. The engine management service includes one or more Spark engines, and multiple Spark engines can execute in parallel; each Spark engine communicates with the external service entry via RPC (Remote Procedure Call).
  • the Yarn cluster is a framework that provides job scheduling and cluster resource management in a big data platform.
  • in this embodiment, when receiving an execution request for a Spark job, the external service entry obtains the version number, dynamic configuration parameters, and Spark job code of the target Spark engine according to the execution request.
  • the dynamic configuration parameters include one or more of: the number of executor nodes (Executor), the number of CPUs (Central Processing Unit) and the memory size of each executor, and the number of CPUs and the memory size of the driver node (Driver).
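  • To make the parameter split concrete, here is a minimal Scala sketch (all names hypothetical, not from the patent) of how such dynamic parameters might be mapped onto standard Spark configuration keys before the engine is started:

```scala
import org.apache.spark.SparkConf

// Hypothetical container for the dynamic parameters carried in the execution request.
case class DynamicParams(
  numExecutors: Int,
  executorCores: Int,
  executorMemory: String, // e.g. "4g"
  driverCores: Int,
  driverMemory: String
)

// Fill the dynamic parameters into a SparkConf using standard Spark keys.
def toSparkConf(p: DynamicParams, base: SparkConf = new SparkConf()): SparkConf =
  base
    .set("spark.executor.instances", p.numExecutors.toString)
    .set("spark.executor.cores", p.executorCores.toString)
    .set("spark.executor.memory", p.executorMemory)
    .set("spark.driver.cores", p.driverCores.toString)
    .set("spark.driver.memory", p.driverMemory)
```

  • The static parameters read from the version-specific Spark_conf_dir directory could be merged into the same SparkConf in the same way before initialization.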
  • Step S20 determining the deployment directory information and version loading rules of the target Spark engine according to the version number
  • the deployment directory information and version loading rules of the target Spark engine are determined according to the version number.
  • the deployment directory information includes root directory deployment information and configuration directory deployment information, and version loading rules include loading rules for dependent libraries of different Spark versions.
  • the root directory of the Spark service needs to be established in advance on the machine where the engine management service is located, and Spark_home and Spark_conf_dir directories are then established under the root directory according to the different Spark version numbers; Spark is installed and deployed following this path rule, with a subdirectory per version under the Spark_home and Spark_conf_dir directories, and with the dependent jar packages that are prone to conflict across versions deployed separately.
  • the Spark_home directory is the root directory, i.e., the Spark installation directory.
  • the Spark_conf_dir directory contains the static configuration parameters of the different Spark versions, such as application property parameters, runtime environment parameters, runtime behavior parameters, and network parameters.
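  • As an illustration of the path rule just described, a hedged Scala sketch (the /appcom/spark root and directory names are assumptions for illustration, not taken from the patent):

```scala
import java.nio.file.{Files, Path, Paths}

// Hypothetical layout: <root>/Spark_home/<version> and <root>/Spark_conf_dir/<version>,
// one subdirectory per deployed Spark version, conflicting jars kept apart per version.
final case class DeployDirs(sparkHome: Path, sparkConfDir: Path)

def resolveDeployDirs(root: Path, version: String): DeployDirs = {
  val home = root.resolve("Spark_home").resolve(version)
  val conf = root.resolve("Spark_conf_dir").resolve(version)
  require(Files.isDirectory(home), s"Spark $version is not deployed under $root")
  DeployDirs(home, conf)
}

// e.g. resolveDeployDirs(Paths.get("/appcom/spark"), "2.4.3")
```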
  • Step S30 obtaining static configuration parameters according to the deployment directory information, and using the dynamic configuration parameters and the static configuration parameters to initialize the target Spark engine according to the version loading rule to start the target Spark engine;
  • then, static configuration parameters are obtained from the Spark_conf_dir directory indicated by the deployment directory information (application property parameters, runtime environment parameters, runtime behavior parameters, network parameters, and so on), and the target Spark engine is initialized with the startup parameters (including the dynamic configuration parameters and static configuration parameters) according to the version loading rules, so as to start the target Spark engine. That is, the dynamic and static configuration parameters are filled into the corresponding configuration parameters of the target Spark engine's configuration file for initialization.
  • the archives (parameter values) used by PySpark (the API Spark provides for Python developers) also need to be uploaded, per version, to the corresponding hdfs (Hadoop Distributed File System) directory, so that the executor nodes of the Spark engine can download these third-party files from the correct location, resolving the jar dependencies the Executor needs at startup.
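  • A hedged sketch of the per-version archive upload described above, using the standard Hadoop FileSystem API (the /spark/archives layout is an assumption, not from the patent):

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Upload the PySpark archives of one Spark version to a version-specific
// HDFS directory so that Executors fetch the files matching their engine version.
def uploadArchives(localArchive: String, version: String): Unit = {
  val fs   = FileSystem.get(new Configuration())
  val dest = new Path(s"/spark/archives/$version/") // hypothetical layout
  if (!fs.exists(dest)) fs.mkdirs(dest)
  fs.copyFromLocalFile(new Path(localArchive), dest)
}
```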
  • further, the job execution method includes:
  • Step a1 in the initialization process, determine the target calling method according to the version number and the preset abstraction layer interface;
  • Step a2 load the file package that the target Spark engine depends on in the directory corresponding to the deployment directory information according to the target calling method.
  • the Spark engine service of Linkis has some runtime dependencies on Spark; currently these dependencies are all placed, as jar packages, in a lib directory under the root directory.
  • when different Spark versions coexist, these jars conflict. Therefore, the dependent jars need to be extracted in advance rather than all loaded at application startup: according to the engine version, a layer of abstract interface definitions is added in the Spark engine service to implement a multi-version dependency abstraction layer, the underlying packages that resolve the jar conflicts are made into version modules (i.e., the dependent file packages below), and only the specified version module is loaded when an engine is created, avoiding the multi-version jar conflict problem.
  • specifically, during initialization, the target calling method is determined according to the version number and the preset abstraction layer interface; then, according to the target calling method, the file packages that the target Spark engine depends on are loaded from the directory corresponding to the deployment directory information, so that conflict-free dependent libraries are loaded based on the version differences.
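  • The following Scala sketch illustrates one possible shape of such a multi-version dependency abstraction layer; the trait, package naming scheme, and directory layout are assumptions for illustration, not the patent's actual interface:

```scala
import java.io.File
import java.net.{URL, URLClassLoader}

// Hypothetical abstraction-layer interface: each version module ships an
// implementation compiled against exactly one Spark version's jars.
trait SparkEngineAdapter {
  def createContext(confDir: String): AnyRef // returns the version-specific SparkContext
}

// Load only the jars of the requested version module, never all versions at once.
def loadAdapter(libDir: String, version: String): SparkEngineAdapter = {
  val dir = new File(libDir, version)
  require(dir.isDirectory, s"no version module deployed for Spark $version")
  val jars: Array[URL] = dir
    .listFiles((_, name) => name.endsWith(".jar"))
    .map(_.toURI.toURL)
  val loader = new URLClassLoader(jars, getClass.getClassLoader)
  loader
    .loadClass(s"com.example.adapter.v${version.replace('.', '_')}.AdapterImpl") // hypothetical name
    .getDeclaredConstructor().newInstance()
    .asInstanceOf[SparkEngineAdapter]
}
```

  • Keeping each version's conflicting jars behind its own class loader is what allows engines of different versions to coexist in one service process.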
  • Step S40 submit the Spark job code to the target Spark engine to execute the job.
  • finally, the Spark job code is submitted to the target Spark engine to execute the job.
  • specifically, the Spark job code is submitted to the driver node (Driver) of the target Spark engine, and the Driver then transforms the Spark job code to obtain Spark tasks.
  • the Spark tasks are assigned to the executor nodes (Executor) deployed on the Yarn cluster to execute the job.
  • the embodiment of the present application provides a job execution method.
  • when an execution request for a Spark job is received, the version number, dynamic configuration parameters, and Spark job code of the target Spark engine are obtained according to the execution request; the deployment directory information and version loading rules of the target Spark engine are then determined according to the version number.
  • static configuration parameters are obtained according to the deployment directory information, the target Spark engine is initialized with the dynamic and static configuration parameters according to the version loading rules and started, and the Spark job code is then submitted to the target Spark engine to execute the job.
  • in this embodiment, the dependent jar packages prone to conflict across Spark versions are installed and deployed in advance; upon receiving an execution request, the version number in the request determines the corresponding deployment directory information and the corresponding startup parameters (including dynamic and static configuration parameters), and the Spark engine of the target version is initialized and started accordingly. This enables dynamic loading of version-specific dependency jars, avoids multi-version jar conflicts, and lets multiple Spark versions execute in parallel under the same Linkis cluster.
  • this application therefore only needs to deploy one set of Linkis service clusters, within which Spark engines of different versions can be created to support running multi-version Spark jobs.
  • further, since starting a Spark engine takes a long time and locks part of the cluster's computing resources once started, an engine that has finished a user's job is not terminated immediately; the user's next job can execute on it at once, improving user experience. Engine reuse saves considerable time and computing resources, but under the existing Linkis session management jobs are submitted to a random idle Spark engine, so when engines of different versions coexist, some jobs land on the wrong version and fail. To address this, before step S20, the job execution method further includes:
  • Step A obtaining the user ID corresponding to the execution request, and detecting whether there is an idle Spark engine corresponding to the user ID and the version number;
  • in this embodiment, when acquiring the execution request of the Spark job, the external service entry can also acquire the user ID corresponding to the execution request while acquiring the version number, dynamic configuration parameters, and Spark job code of the target Spark engine; the user ID can be a user name or a user number. Then, it is detected whether there is an idle Spark engine corresponding to the user ID and version number: during detection, the version numbers and owning users of the currently idle Spark engines can be obtained from the engine manager and matched against the obtained user ID and version number.
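  • A minimal sketch of how an engine manager might key idle engines by (user, version), assuming a simple in-memory registry (all names hypothetical, not from the patent):

```scala
import scala.collection.mutable

final case class EngineRef(user: String, version: String, endpoint: String)

// Hypothetical registry inside the engine manager: idle engines keyed by (user, version).
object EngineManager {
  private val idle = mutable.Map.empty[(String, String), mutable.Queue[EngineRef]]

  // Take an idle engine matching both the user ID and the version number, if any.
  def takeIdle(user: String, version: String): Option[EngineRef] = synchronized {
    idle.get((user, version)).collect { case q if q.nonEmpty => q.dequeue() }
  }

  // Return an engine to the idle pool once its job finishes.
  def release(e: EngineRef): Unit = synchronized {
    idle.getOrElseUpdate((e.user, e.version), mutable.Queue.empty[EngineRef]) += e
  }
}
```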
  • if no such idle Spark engine exists, a new Spark engine needs to be created; in that case, step S20 is executed: determine the deployment directory information and version loading rules of the target Spark engine according to the version number, and the subsequent steps follow the first embodiment described above.
  • if an idle Spark engine corresponding to the user ID and version number exists, step B is executed: the Spark job code is submitted directly to that idle Spark engine to execute the job.
  • in this embodiment, a version label (i.e., the version number) is added to the execution request of the Spark job, and the idle Spark engines corresponding to different users and version numbers are managed through the engine manager. After the external service entry of Linkis receives an execution request, it detects whether an idle Spark engine matching the user ID and version number exists and automatically submits the job to an idle engine of the corresponding version according to the job's version label, achieving engine reuse while preventing Spark jobs from being randomly submitted to engines of a different version and failing.
  • further, when the Spark version needs to be updated, Linkis has to kill all running Spark engines, stop the existing engine management services, update the configuration files of all engine management servers, and update all the dependent Jar packages those services use; this has a large business impact, since users' Spark jobs cannot execute during the update, and after the update some jobs may fail on the new engine version. To address this, before step S20, the job execution method further includes:
  • Step C obtaining the user identification corresponding to the execution request, and determining whether the user is in the preset grayscale list according to the user identification;
  • in this embodiment, when acquiring the execution request of the Spark job, the external service entry can also acquire the user ID corresponding to the execution request while acquiring the version number, dynamic configuration parameters, and Spark job code; the user ID can be a user name or a user number. Then, whether the user is in the preset grayscale list is determined according to the user identification.
  • the preset grayscale (gray release) list is set in advance and designates some users for grayscale, that is, the jobs of the designated users are submitted to the new-version Spark engine under the newly deployed Linkis service cluster for execution.
  • the number of grayscale users designated in the preset grayscale list can be adjusted according to the success rate of those users' jobs on the new-version Spark engine, so that users' Spark jobs are gradually migrated to the new-version Spark engine under the newly deployed Linkis service cluster.
  • if the user is not in the preset grayscale list, the user is not a designated grayscale user, and the job only needs to be submitted to a Spark engine under the original Linkis service cluster for execution; in that case, step S20 is executed: determine the deployment directory information and version loading rules of the target Spark engine according to the version number, and the subsequent steps are performed as in the first embodiment described above, which will not be repeated here.
  • if the user is in the preset grayscale list, step D is executed: create a grayscale Spark engine, and submit the Spark job code to the grayscale Spark engine to execute the job.
  • the grayscale Spark engine is a new version of the Spark engine under the newly deployed Linkis service cluster, and its creation method is the same as the creation method of the target Spark engine in the above-mentioned first embodiment, and will not be repeated this time.
  • the grayscale Spark engine may be pre-created. In this case, the Spark job code is directly submitted to the pre-created grayscale Spark engine to execute the job.
  • since the version of the Spark engine being submitted to has changed, the corresponding version number and configuration parameters must be preset; the grayscale Spark engine is built based on these preset values.
  • in addition, because current Spark version parameter definitions are rather complex, the front end and back end can adopt a unified version-code mechanism configured in the database, such as v1 representing Spark 1.6.0, so that versions are numbered uniformly instead of letting users specify arbitrary versions to load, which would make multi-version behavior uncontrollable. Only a version number within the unified numbering can be submitted correctly; otherwise the job submission fails.
  • by establishing a user-based version grayscale mechanism, grayscale release is performed on designated users, and the job requests of grayscale users are submitted to the new-version grayscale Spark engine for execution; when updating the Spark version under Linkis, the existing engine management services therefore do not need to be stopped, and users' Spark jobs can still execute during the update.
  • further, the Spark job code differs somewhat between different Spark engine versions.
  • when sending a Spark job, a user may not use code written for the new Spark version being submitted to, but still send the execution request with code written for the old Spark version; in that case the job would fail.
  • to address this, before step S40, the job execution method further includes:
  • Step E modifying the Spark job code according to the version number
  • the Spark job code can be modified according to the version number, so that the modified Spark job code is compatible with that version number.
  • a code parser for the different versions is defined at the external service entry; the parser predefines the changes between versions and their corresponding modification strategies, so that syntax replacement operations can be predefined based on known changes, such as package import changes between versions or changes to API (Application Programming Interface) function call interfaces.
  • when modifying, the code parser matches the Spark job code against the predefined inter-version code differences and their modification strategies according to the version number, and then makes the corresponding modifications based on the match results.
  • step S40 includes: submitting the modified Spark job code to the target Spark engine to execute the job.
  • by automatically modifying the Spark job code at the submission stage according to the code differences between versions and their modification strategies, multi-version execution can be made compatible without users having to modify their existing code.
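  • A hedged sketch of such a version-keyed code parser; the concrete rewrite rules below are illustrative examples of known inter-version changes, not the patent's actual rule set:

```scala
// Each rule rewrites one known source-level difference (package moves, renamed APIs).
final case class RewriteRule(from: String, to: String)

val rulesByVersion: Map[String, Seq[RewriteRule]] = Map(
  "2.4.3" -> Seq(
    // an import that moved between major versions (illustrative only)
    RewriteRule("""import org\.apache\.spark\.mllib\.linalg\.""",
                "import org.apache.spark.ml.linalg."),
    // SQLContext usages superseded by SparkSession in Spark 2.x (illustrative)
    RewriteRule("""sqlContext\.""", "spark.")
  )
)

// Apply every rule registered for the target engine version to the job code.
def adjustJobCode(code: String, version: String): String =
  rulesByVersion.getOrElse(version, Nil).foldLeft(code) { (acc, r) =>
    acc.replaceAll(r.from, r.to)
  }
```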
  • step S40 includes:
  • Step a41 submitting the Spark job code to the driver node of the target Spark engine
  • the job execution process is as follows: the Spark job code is first submitted to the driver node (Driver) of the target Spark engine.
  • the Driver converts the user's Spark job code into multiple physical execution units, also called tasks; it also tracks the running status of the Executors and, given the current set of Executor nodes, assigns all tasks to suitable Executors based on where the data resides.
  • Step a42 converting the Spark job code through the driver node to obtain a Spark task
  • the Spark job code is converted through the driver node to obtain the Spark task.
  • the specific transformation method can refer to the prior art.
  • Step a43 Allocate the Spark task to the executor nodes deployed on the Yarn cluster to execute the job.
  • the Spark task is assigned to the executor node (Executor) deployed on the Yarn cluster to execute the job.
  • Yarn cluster provides a framework for job scheduling and cluster resource management in the big data platform, which can realize the management and scheduling of executor nodes.
  • Executor nodes are used to run Spark tasks and return execution results to driver nodes.
  • further, since Spark is distributed, the Spark tasks dynamically generated by the Driver are serialized and distributed to the Executors for distributed execution, while the Driver also provides a remote class-loading service; problems can arise when an Executor deserializes the serialized code for dynamic loading.
  • one such case is that the serialized results the Driver receives back from the Executors cannot be correctly deserialized.
  • when Linkis starts the Spark engine, a class loader already exists, namely the default class loader of the Spark Driver. The user's Spark job code is submitted to the Driver for interpretation and execution, so the Driver also starts a Scala interpreter; that is, when the Spark Driver is initialized, a SparkILoop is also created.
  • since SparkILoop reuses the Scala language's code interpreter, a class loader is set inside it as well.
  • when a piece of user-defined code (i.e., the Spark job code) is submitted to the Spark engine, it is first interpreted and executed by the Scala interpreter in the Driver, so the classes newly defined by the user live entirely in the Scala interpreter's class loader.
  • when the serialized Spark task is submitted to an Executor for execution and an operation reference to an instance of a user-defined class needs to be returned, the serialized execution result is sent back to the Spark Driver; but at that point the Spark Driver's class loader is the default one, which lacks the class information defined by the user's dynamic code in the Scala interpreter, so the run fails. That is, only part of the code can be supported; cases that need to return instances of newly user-defined classes cannot execute correctly.
  • in view of the above situation, before step a42, the job execution method further includes:
  • Step F in the initialization process, when the Scala interpreter is created in the driver node of the target Spark engine, the class loader of the main thread is injected into the Scala interpreter, so that the class loader of the main thread becomes the parent of the Scala interpreter's class loader, and the Scala interpreter creates a corresponding class loader according to the parent's class loader;
  • during initialization of the target Spark engine, a Scala interpreter is created in the driver node of the target Spark engine.
  • at that point, the class loader of the main thread is injected into the Scala interpreter so that it becomes the parent of the Scala interpreter's class loader, and the Scala interpreter then creates its own class loader from that parent;
  • step a42 includes:
  • the Spark job code is transformed by the class loader of the Scala interpreter created in the driver node Driver to obtain a Spark task, and then the Spark task is allocated to the executor node Executor deployed on the Yarn cluster to execute the job.
  • further, after step a43, the job execution method further includes:
  • Step G when receiving the serialized execution result returned by the executor node based on the Spark task, modify the class loader of the current thread of the target Spark engine to the class loader of the Scala interpreter, so that the serialized execution result is deserialized through the class loader of the Scala interpreter.
  • the class loader of the Spark engine is dynamically modified to ensure consistency between the Spark engine's class loader and the Scala interpreter's class loader in the Driver, so that the execution results deserialized from the Executors can be parsed correctly by the Driver.
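  • The two class-loader adjustments can be sketched in Scala as follows; SparkILoop construction and wiring details vary across Spark/Scala versions, so this is a sketch under that caveat rather than a definitive implementation:

```scala
import scala.tools.nsc.Settings
import org.apache.spark.repl.SparkILoop

// 1) When creating the interpreter, make the main thread's class loader its
//    parent via the compiler settings, so REPL-defined classes chain to it.
def createInterpreter(): SparkILoop = {
  val settings = new Settings
  settings.usejavacp.value = true
  settings.embeddedDefaults(Thread.currentThread().getContextClassLoader)
  new SparkILoop() // how settings are passed into the loop is version-dependent
}

// 2) Before deserializing results returned by Executors, switch the current
//    thread to the interpreter's class loader so user-defined classes resolve,
//    restoring the previous loader afterwards.
def withReplLoader[T](replLoader: ClassLoader)(body: => T): T = {
  val prev = Thread.currentThread().getContextClassLoader
  Thread.currentThread().setContextClassLoader(replLoader)
  try body finally Thread.currentThread().setContextClassLoader(prev)
}
```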
  • further, due to differences between Spark versions, some code is hard to make compatible and will not compile when switching Spark versions; dynamic compilation combined with reflection can be used, wrapping the changed classes behind one's own layer. With reflection, however, code such as udf functions can no longer be used directly: a udf requires that the concrete types of its inputs and return value can be inferred, and via reflection the return value cannot be determined (it may be org.apache.spark.ml.linalg.Vector, or it may be org.apache.spark.mllib.linalg.Vector), so it cannot be compiled.
  • in such cases the Spark source version needs to be modified so that, according to the current version parameter, the corresponding type can be dynamically loaded as the return value.
  • the present application also provides a job execution device.
  • FIG. 4 is a schematic diagram of functional modules of the first embodiment of the job execution apparatus of the present application.
  • the job execution device includes:
  • the first obtaining module 10 is configured to obtain the version number, dynamic configuration parameters and Spark job code of the target Spark engine according to the execution request when receiving the execution request of the Spark job;
  • the first determination module 20 is used to determine the deployment directory information and version loading rules of the target Spark engine according to the version number;
  • the engine initialization module 30 is configured to obtain static configuration parameters according to the deployment directory information, and use the dynamic configuration parameters and the static configuration parameters to initialize the target Spark engine according to the version loading rules to start the target Spark engine;
  • the job execution module 40 is configured to submit the Spark job code to the target Spark engine to execute the job.
  • job execution device also includes:
  • a detection module configured to obtain the user identifier corresponding to the execution request, and detect whether there is an idle Spark engine corresponding to the user identifier and the version number;
  • the first determining module 20 is further configured to, if there is no idle Spark engine corresponding to the user ID and the version number, execute the step of determining the deployment directory information and version loading rules of the target Spark engine according to the version number;
  • the first submitting module is configured to submit the Spark job code to the idle Spark engine to execute the job if there is an idle Spark engine corresponding to the user ID and the version number.
  • job execution device also includes:
  • a judgment module configured to obtain the user ID corresponding to the execution request, and judge whether the user is in the preset grayscale list according to the user ID;
  • the first determining module 20 is further configured to, if the user is not in the preset grayscale list, perform the steps of: determining the deployment directory information and version loading rules of the target Spark engine according to the version number;
  • the second submitting module is configured to create a grayscale Spark engine if the user is in the preset grayscale list, and submit the Spark job code to the grayscale Spark engine to execute the job.
  • job execution device also includes:
  • the second determining module is used to determine the target calling method according to the version number and the preset abstraction layer interface during the initialization process;
  • a file package loading module configured to load the file package that the target Spark engine depends on in the directory corresponding to the deployment directory information according to the target calling method.
  • job execution device also includes:
  • a first modification module configured to modify the Spark job code according to the version number
  • the job execution module 40 is further configured to:
  • job execution module 40 includes:
  • a code submission unit for submitting the Spark job code to the driver node of the target Spark engine
  • a task generation unit configured to convert the Spark job code through the driver node to obtain a Spark task
  • the task allocation unit is used for allocating the Spark task to the executor nodes deployed on the Yarn cluster to execute the job.
  • job execution device also includes:
  • the second modification module is used to inject the class loader of the main thread into the Scala interpreter when the Scala interpreter is created in the driver node of the target Spark engine during initialization, so that the class loader of the main thread becomes the parent of the Scala interpreter's class loader, and the Scala interpreter creates a corresponding class loader according to the parent's class loader;
  • the task generation unit is also used for: transforming the Spark job code through the class loader of the Scala interpreter created in the driver node to obtain a Spark task;
  • a third modification module configured to, when receiving the serialized execution result returned by the executor node based on the Spark task, modify the class loader of the current thread of the target Spark engine to the class loader of the Scala interpreter, so as to deserialize the serialized execution result through the class loader of the Scala interpreter.
  • each module in the above job execution apparatus corresponds to each step in the above job execution method embodiment, and the functions and implementation processes thereof will not be repeated here.
  • the present application further provides a computer-readable storage medium, where a job execution program is stored on the computer-readable storage medium; when the job execution program is executed by a processor, the steps of the job execution method according to any of the above embodiments are implemented.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)

Abstract

The present application relates to the technical field of financial technology (Fintech), and discloses a job execution method, apparatus, system, and computer-readable storage medium. The job execution method includes: upon receiving an execution request for a Spark job, obtaining the version number, dynamic configuration parameters, and Spark job code of the target Spark engine according to the execution request; determining the deployment directory information and version loading rules of the target Spark engine according to the version number; obtaining static configuration parameters according to the deployment directory information, and initializing the target Spark engine with the dynamic configuration parameters and the static configuration parameters according to the version loading rules, so as to start the target Spark engine; and submitting the Spark job code to the target Spark engine to execute the job.

Description

Job execution method, apparatus, system, and computer-readable storage medium
This application claims priority to Chinese patent application No. 202010624055.5, filed on June 30, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the technical field of financial technology (Fintech), and in particular to a job execution method, apparatus, system, and computer-readable storage medium.
Background
With the development of computer technology, more and more technologies are applied in the financial field, and the traditional financial industry is gradually shifting toward financial technology (Fintech). Spark technology is no exception; the security and real-time requirements of the financial industry also place higher demands on Spark technology.
In existing big data cluster applications, the big data processing environments within financial institutions (such as banks) are quite centralized, the data is highly concentrated, and the data volume is very large. Because data is processed centrally, many big data application components are also deployed centrally, and these components are updated frequently, with several version updates per year; one example is Apache Spark (a fast, general-purpose computing engine designed for large-scale data processing), which is in active use.
Each Spark version update brings many new features but is sometimes poorly compatible with jobs written for older versions, and an environment often has a large number of users: some need the new Spark version while others need an older one. However, the current Linkis (a data middleware that connects multiple computing and storage engines) service cluster can only run Spark jobs of a single version and cannot integrate multiple Spark versions running simultaneously. Therefore, to meet these users' multi-version requirements, multiple sets of Linkis clusters usually have to be deployed, requiring a large number of machines to host Spark Drivers of different versions; meanwhile, users' jobs have to be switched and deployed across environments. Every additional set of Linkis increases operation and maintenance cost and difficulty, and maintaining the use of each environment is also difficult for users.
Technical Problem
The main purpose of the present application is to provide a job execution method, apparatus, system, and computer-readable storage medium, aiming to support running Spark jobs of multiple versions simultaneously within one Linkis service cluster and to reduce operation and maintenance costs.
Technical Solution
To achieve the above purpose, the present application provides a job execution method, the job execution method including:
upon receiving an execution request for a Spark job, obtaining the version number, dynamic configuration parameters, and Spark job code of the target Spark engine according to the execution request;
determining the deployment directory information and version loading rules of the target Spark engine according to the version number;
obtaining static configuration parameters according to the deployment directory information, and initializing the target Spark engine with the dynamic configuration parameters and the static configuration parameters according to the version loading rules, so as to start the target Spark engine;
submitting the Spark job code to the target Spark engine to execute the job.
In addition, to achieve the above purpose, the present application further provides a job execution apparatus, the job execution apparatus including:
a first obtaining module, configured to obtain, upon receiving an execution request for a Spark job, the version number, dynamic configuration parameters, and Spark job code of the target Spark engine according to the execution request;
a first determining module, configured to determine the deployment directory information and version loading rules of the target Spark engine according to the version number;
an engine initialization module, configured to obtain static configuration parameters according to the deployment directory information, and to initialize the target Spark engine with the dynamic configuration parameters and the static configuration parameters according to the version loading rules, so as to start the target Spark engine;
a job execution module, configured to submit the Spark job code to the target Spark engine to execute the job.
In addition, to achieve the above purpose, the present application further provides a job execution system, the job execution system including: a memory, a processor, and a job execution program stored on the memory and executable on the processor; when the job execution program is executed by the processor, the steps of the job execution method described above are implemented.
In addition, to achieve the above purpose, the present application further provides a computer-readable storage medium on which a job execution program is stored; when the job execution program is executed by a processor, the steps of the job execution method described above are implemented.
Beneficial Effects
In this application, the dependent jar packages prone to conflict across different Spark versions are installed and deployed in advance. When an execution request for a Spark job is received, the version number in the request is obtained to determine the corresponding deployment directory information, the corresponding startup parameters (including dynamic and static configuration parameters) are obtained, and the Spark engine of the target version is initialized and started according to the startup parameters. This enables dynamic loading of the jar packages that different Spark versions depend on, avoids the multi-version jar conflict problem, and allows multiple Spark versions to execute in parallel under the same Linkis cluster. Therefore, this application only needs to deploy one set of Linkis service clusters, within which Spark engines of different versions can be created to support running multi-version Spark jobs; compared with deploying multiple sets of Linkis clusters to host different Spark engine versions as in the prior art, this greatly reduces machine resources and operation and maintenance costs.
Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of the device of the hardware operating environment involved in the solutions of the embodiments of the present application;
FIG. 2 is a schematic flowchart of the first embodiment of the job execution method of the present application;
FIG. 3 is a schematic diagram of the system architecture of the job execution system of the present application;
FIG. 4 is a schematic diagram of the functional modules of the first embodiment of the job execution apparatus of the present application.
Embodiments of the Invention
It should be understood that the specific embodiments described here are only intended to explain the present application and are not intended to limit it.
Referring to FIG. 1, FIG. 1 is a schematic structural diagram of the device of the hardware operating environment involved in the solutions of the embodiments of the present application.
The job execution device in the embodiments of the present application may be a server, or terminal equipment such as a PC (Personal Computer), a tablet computer, or a portable computer.
As shown in FIG. 1, the job execution device may include: a processor 1001 such as a CPU, a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to implement connection and communication between these components. The user interface 1003 may include a display screen (Display) and an input unit such as a keyboard (Keyboard); optionally, it may also include standard wired and wireless interfaces. The network interface 1004 may optionally include standard wired and wireless interfaces (such as a Wi-Fi interface). The memory 1005 may be high-speed RAM, or stable non-volatile memory such as disk storage. Optionally, the memory 1005 may also be a storage device independent of the aforementioned processor 1001.
Those skilled in the art can understand that the device structure shown in FIG. 1 does not constitute a limitation on the job execution device; it may include more or fewer components than shown, combine certain components, or use a different component arrangement.
As shown in FIG. 1, the memory 1005, as a computer storage medium, may include an operating system, a network communication module, and a job execution program.
In the terminal shown in FIG. 1, the network interface 1004 is mainly used to connect to a background server and communicate data with it; the user interface 1003 is mainly used to connect to a client and communicate data with it; and the processor 1001 may be used to call the job execution program stored in the memory 1005 and execute each step of the following job execution method.
Based on the above hardware structure, the embodiments of the job execution method of the present application are proposed.
The present application provides a job execution method.
Referring to FIG. 2, FIG. 2 is a schematic flowchart of the first embodiment of the job execution method of the present application.
In this embodiment, the job execution method includes:
Step S10: upon receiving an execution request for a Spark job, obtain the version number, dynamic configuration parameters, and Spark job code of the target Spark engine according to the execution request;
The job execution method of this embodiment is applied to a job execution system. As shown in FIG. 3, the job execution system includes a Linkis cluster service and a Yarn cluster; the Linkis cluster service includes an external service entry and an engine management service, and the external service entry includes a Job management module and an engine manager. Specifically, compared with existing job execution systems, the engine management service has been reworked: a dynamic loading scheme for the jars that different Spark versions depend on, together with dynamic adjustment of class loaders, is added to solve the problem of executing Spark jobs of different versions in parallel. In addition, a version identifier is added to the execution request parameters of Spark jobs; correspondingly, an engine manager is added to manage the Spark engine services of different versions and different users and to create or select engines according to the version number.
The Linkis cluster service is a data middleware that connects multiple computing and storage engines, provides a unified Restful (a design style and development approach for network applications) interface, and submits and executes scripts such as SQL (Structured Query Language), PySpark (the API Spark provides for Python developers), HiveQL (Hive Query Language, the query language of a data warehouse tool), and Scala (a multi-paradigm programming language). The Job management module in the external service entry is used to receive Spark job execution requests and to automatically adjust erroneous Spark job code. The engine manager is used to manage the Spark engine services of different versions and different users, and to create or select engines according to the version number. The engine management service manages process creation, status tracking, and process destruction for the processes in which the Spark Context lives; it includes one or more Spark engines, multiple Spark engines can execute in parallel, and each Spark engine communicates with the external service entry via RPC (Remote Procedure Call). The Yarn cluster is a framework providing job scheduling and cluster resource management in the big data platform.
In this embodiment, upon receiving an execution request for a Spark job, the external service entry obtains the version number, dynamic configuration parameters, and Spark job code of the target Spark engine according to the execution request. The dynamic configuration parameters include one or more of: the number of executor nodes (Executor), their number of CPUs (Central Processing Unit) and memory size, and the number of CPUs and memory size of the driver node (Driver).
Step S20: determine the deployment directory information and version loading rules of the target Spark engine according to the version number;
Then, the deployment directory information and version loading rules of the target Spark engine are determined according to the version number. The deployment directory information includes root directory deployment information and configuration directory deployment information, and the version loading rules include loading rules for the dependent libraries of the different Spark versions.
It should be noted that, in implementation, the root directory of the Spark service needs to be established in advance on the machine where the engine management service is located; Spark_home and Spark_conf_dir directories are then established under the root directory according to the different Spark version numbers, and Spark is installed and deployed following this path rule. Specifically, per-version subdirectories are created under the Spark_home and Spark_conf_dir directories, and each Spark version is installed into its corresponding subdirectory. Note that during installation, the dependent jar packages prone to conflict across Spark versions must be deployed separately. The Spark_home directory is the root directory, i.e., the Spark installation directory; the Spark_conf_dir directory contains the static configuration parameters of the different Spark versions, such as application property parameters, runtime environment parameters, runtime behavior parameters, and network parameters.
It should also be noted that, before installing and deploying the different Spark versions, to support multiple versions of the Spark engine itself, the API interfaces of the different Spark packages can first be compared to obtain the differing APIs; the different Spark packages are then compiled according to these differing APIs, and the compiled Spark packages are deployed into the corresponding directories. That is, the Spark packages are compiled first, the parts whose APIs (Application Programming Interface) changed are separated out, and the compiled Spark packages are then installed and deployed. For example, for the Vector-related APIs, two projects can be split off, each adapted to the corresponding version, and jar packages published; in Maven (a project object model build tool), the Profile mechanism then introduces the different adapter packages according to the Spark version.
Step S30: obtain static configuration parameters according to the deployment directory information, and initialize the target Spark engine with the dynamic configuration parameters and the static configuration parameters according to the version loading rules, so as to start the target Spark engine;
Then, static configuration parameters are obtained according to the deployment directory information; specifically, the static configuration parameters under the Spark_conf_dir directory can be obtained, such as application property parameters, runtime environment parameters, runtime behavior parameters, and network parameters. The target Spark engine is then initialized with the startup parameters (including the dynamic and static configuration parameters) according to the version loading rules, so as to start it; that is, the dynamic and static configuration parameters are filled into the corresponding configuration parameters of the target Spark engine's configuration file for initialization.
It should also be noted that, during initialization, the archives used by PySpark (the API Spark provides for Python developers) also need to be uploaded, per version, to the corresponding hdfs (Hadoop Distributed File System) directory, so that the executor nodes of the Spark engine can download these third-party files from the correct location, resolving the jar dependencies the Executor needs at startup.
Further, the job execution method further includes:
Step a1: during initialization, determine the target calling method according to the version number and the preset abstraction layer interface;
Step a2: according to the target calling method, load the file packages that the target Spark engine depends on from the directory corresponding to the deployment directory information.
Further, the Spark engine service of Linkis has some runtime dependencies on Spark; currently these dependencies are all placed, as jar packages, in a lib directory under the root directory. When different Spark versions coexist, these jars conflict. Therefore, the dependent jars need to be extracted in advance and not all loaded at application startup; instead, according to the engine version, a layer of abstract interface definitions is added in the Spark engine service to implement a multi-version dependency abstraction layer. The underlying packages that resolve the jar conflicts are made into version modules (i.e., the dependent file packages mentioned below), and only the specified version module is loaded when an engine is created, avoiding the multi-version jar conflict problem.
Specifically, during initialization, the target calling method is determined according to the version number and the preset abstraction layer interface; then, according to the target calling method, the file packages that the target Spark engine depends on are loaded from the directory corresponding to the deployment directory information, so that conflict-free dependent libraries are loaded based on the version differences.
Step S40: submit the Spark job code to the target Spark engine to execute the job.
Finally, the Spark job code is submitted to the target Spark engine to execute the job. Specifically, the Spark job code is submitted to the driver node (Driver) of the target Spark engine; the Driver then transforms the Spark job code to obtain Spark tasks (the specific transformation process can refer to the prior art); finally, the Spark tasks are assigned to the executor nodes (Executor) deployed on the Yarn cluster to execute the job.
The embodiments of the present application provide a job execution method: upon receiving an execution request for a Spark job, the version number, dynamic configuration parameters, and Spark job code of the target Spark engine are obtained according to the execution request; the deployment directory information and version loading rules of the target Spark engine are then determined according to the version number; static configuration parameters are obtained according to the deployment directory information, the target Spark engine is initialized with the dynamic and static configuration parameters according to the version loading rules and started; and the Spark job code is then submitted to the target Spark engine to execute the job. In the embodiments of the present application, the dependent jar packages prone to conflict across Spark versions are installed and deployed in advance; upon receiving an execution request for a Spark job, the version number in the request is obtained to determine the corresponding deployment directory information, the corresponding startup parameters (including dynamic and static configuration parameters) are obtained, and the Spark engine of the target version is initialized and started accordingly. This enables dynamic loading of the jar packages that different Spark versions depend on, avoids multi-version jar conflicts, and allows multiple Spark versions to execute in parallel under the same Linkis cluster. Therefore, this application only needs to deploy one set of Linkis service clusters, within which Spark engines of different versions can be created to support running multi-version Spark jobs; compared with deploying multiple sets of Linkis clusters to host different Spark engine versions as in the prior art, this greatly reduces machine resources and operation and maintenance costs.
Further, since starting a Spark engine takes a long time, and once started it locks part of the cluster's computing resources, an engine that has finished a user's job is not terminated immediately while still running; instead, the user's next job can execute on it at once, improving user experience. Engine reuse saves considerable time and computing resources. However, under the existing Linkis users' session management, when an existing Spark engine is reused, jobs are submitted randomly to an idle Spark engine. When Spark engines of different versions coexist in one environment, some of a user's jobs are submitted to the old-version engine and some to the new-version engine, causing job execution failures.
In view of this, based on the above first embodiment, a second embodiment of the job execution method of the present application is proposed.
In this embodiment, before step S20, the job execution method further includes:
Step A: obtain the user ID corresponding to the execution request, and detect whether there is an idle Spark engine corresponding to the user ID and the version number;
In this embodiment, upon receiving the execution request of the Spark job, the external service entry can also obtain the user ID corresponding to the execution request while obtaining the version number, dynamic configuration parameters, and Spark job code of the target Spark engine; the user ID can be a user name or a user number. Then it is detected whether there is an idle Spark engine corresponding to the user ID and version number; during detection, the version numbers and owning users of the currently idle Spark engines can be obtained from the engine manager and matched against the obtained user ID and version number.
If not, step S20 is executed: determine the deployment directory information and version loading rules of the target Spark engine according to the version number;
If there is no idle Spark engine corresponding to the user ID and version number, a new Spark engine needs to be created; in that case, the deployment directory information and version loading rules of the target Spark engine are determined according to the version number, and the subsequent steps are executed as in the first embodiment above, which will not be repeated here.
If there is, step B is executed: submit the Spark job code to the idle Spark engine to execute the job.
If an idle Spark engine corresponding to the user ID and version number exists, the Spark job code is submitted directly to that idle Spark engine to execute the job.
In this embodiment, a version label (i.e., the version number) is added to the execution request of the Spark job, and the idle Spark engines corresponding to different users and version numbers are managed through the engine manager; after the external service entry of Linkis receives an execution request, it detects whether an idle Spark engine matching the user ID and version number exists, and automatically submits the job to an idle engine of the corresponding version according to the job's version label. This achieves engine reuse while preventing Spark jobs from being randomly submitted to engines of a different version and failing.
Further, when the Spark version needs to be updated, Linkis must first kill all running Spark engines, then stop the existing Spark engine management services, update the configuration files of all Spark engine management servers, and update all the dependent Jar (a software package file format) packages used by the Spark engine management services. This has a large business impact: users' Spark jobs cannot execute during the update. Moreover, after the update there is a risk that some Spark jobs cannot execute correctly on the new-version Spark engine.
In view of this, based on the above first embodiment, a third embodiment of the job execution method of the present application is proposed.
In this embodiment, before step S20, the job execution method further includes:
Step C: obtain the user ID corresponding to the execution request, and determine whether the user is in the preset grayscale list according to the user ID;
In this embodiment, upon receiving the execution request of the Spark job, the external service entry can also obtain the user ID corresponding to the execution request while obtaining the version number, dynamic configuration parameters, and Spark job code of the target Spark engine; the user ID can be a user name or a user number. Then, whether the user is in the preset grayscale list is determined according to the user ID. The preset grayscale list is set in advance and designates some users for grayscale release, that is, the jobs of the designated users are submitted to the new-version Spark engine under the newly deployed Linkis service cluster for execution. In implementation, the number of grayscale users designated in the preset grayscale list can be adjusted according to the success rate of those users' jobs on the new-version Spark engine, so that users' Spark jobs are gradually migrated to the new-version Spark engine under the newly deployed Linkis service cluster.
If the user is not in the preset grayscale list, step S20 is executed: determine the deployment directory information and version loading rules of the target Spark engine according to the version number;
If the user is in the preset grayscale list, step D is executed: create a grayscale Spark engine, and submit the Spark job code to the grayscale Spark engine to execute the job.
If the user is not in the preset grayscale list, the user is not a designated grayscale user; in that case the job only needs to be submitted to a Spark engine under the original Linkis service cluster for execution. Specifically, the deployment directory information and version loading rules of the target Spark engine are first determined according to the version number, and the subsequent steps are executed as in the first embodiment above, which will not be repeated here.
If the user is in the preset grayscale list, a grayscale Spark engine is created and the Spark job code is submitted to it to execute the job. The grayscale Spark engine is the new-version Spark engine under the newly deployed Linkis service cluster; it is created in the same way as the target Spark engine in the first embodiment, which will not be repeated here. Note that, in implementation, the grayscale Spark engine may be created in advance; in that case the Spark job code is submitted directly to the pre-created grayscale Spark engine to execute the job.
It should also be noted that, since the version of the Spark engine being submitted to has changed, the corresponding version number and configuration parameters must be preset, and the grayscale Spark engine is built based on these preset values. In addition, because current Spark version parameter definitions are rather complex, in implementation the front end and back end can adopt a unified version-code mechanism configured into the database, such as v1 representing Spark 1.6.0, so that versions are numbered uniformly and uncontrolled multi-version behavior is avoided, rather than letting users specify arbitrary versions to load; only a version number within the unified numbering can be submitted correctly, otherwise the job submission fails.
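A minimal Scala sketch of the unified version-code mechanism (only the v1 = Spark 1.6.0 mapping comes from the description; the other entry and all names are illustrative assumptions):

```scala
// Version codes are numbered uniformly by front end and back end; anything
// outside the registry is rejected at submission time, as described above.
val versionRegistry: Map[String, String] = Map(
  "v1" -> "1.6.0", // from the description
  "v2" -> "2.4.3"  // illustrative
)

def resolveVersion(code: String): Either[String, String] =
  versionRegistry.get(code).toRight(s"unknown version code '$code', job submission fails")
```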
In this embodiment, by establishing a user-based version grayscale mechanism, grayscale release is performed on designated users, and the job requests of grayscale users are submitted to the new-version grayscale Spark engine for execution. In this way, when updating the Spark version under Linkis, the existing Spark engine management services do not need to be stopped, and users' Spark jobs can still execute during the update, avoiding adverse business impact.
Further, the Spark job code differs somewhat between different versions of the Spark engine. When sending a Spark job, a user may not use code written for the new Spark version to be submitted to, but still send the execution request with code written for the old Spark version; in that case the job would fail.
In view of this, based on the above first to third embodiments, a fourth embodiment of the job execution method of the present application is proposed.
In this embodiment, before step S40, the job execution method further includes:
Step E: modify the Spark job code according to the version number;
In this embodiment, after the version number of the target Spark engine and the Spark job code are obtained from the execution request of the Spark job, the Spark job code can be modified according to the version number so that the modified Spark job code is compatible with that version number.
Specifically, a code parser for the different versions is defined at the external service entry; it predefines the changes between versions and the corresponding modification strategies, so that syntax replacement operations can be predefined based on known changes, such as package import changes between versions or changes to API (Application Programming Interface) function call interfaces. When modifying, the code parser matches the Spark job code against the predefined inter-version code differences and their modification strategies according to the version number, and then makes the corresponding modifications based on the match results.
In this case, step S40 includes: submitting the modified Spark job code to the target Spark engine to execute the job.
Then, the modified Spark job code is submitted to the target Spark engine to execute the job.
In this embodiment, by automatically modifying the Spark job code at the submission stage according to the code differences between versions and their modification strategies, multi-version execution can be made compatible without users having to modify their existing code.
Further, based on the above first to third embodiments, a fifth embodiment of the job execution method of the present application is proposed.
In this embodiment, step S40 includes:
Step a41: submit the Spark job code to the driver node of the target Spark engine;
In this embodiment, the job execution process is as follows:
First, the Spark job code is submitted to the driver node (Driver) of the target Spark engine. The Driver converts the user's Spark job code into multiple physical execution units, also called tasks; it also tracks the running status of the Executors and, given the current set of Executor nodes, assigns all tasks to suitable Executors based on where the data resides.
Step a42: transform the Spark job code through the driver node to obtain Spark tasks;
Then, the Spark job code is transformed through the driver node to obtain Spark tasks. The specific transformation method can refer to the prior art.
Step a43: assign the Spark tasks to the executor nodes deployed on the Yarn cluster to execute the job.
Finally, the Spark tasks are assigned to the executor nodes (Executor) deployed on the Yarn cluster to execute the job. The Yarn cluster is a framework providing job scheduling and cluster resource management in the big data platform, enabling management and scheduling of the executor nodes. The executor nodes run the Spark tasks and return the execution results to the driver node.
Further, since Spark is distributed, the Spark tasks dynamically generated in the Driver are serialized and then distributed to the Executors for distributed execution, while the Driver also provides a remote class-loading service; when an Executor deserializes the serialized code back for dynamic loading, problems can occur. One case is that the serialized results the Driver receives back from Executor execution cannot be correctly deserialized. When Linkis starts the Spark engine, a class loader already exists, namely the Spark Driver's default class loader; the user's Spark job code is submitted to the Driver for interpretation and execution, so the Driver also starts a Scala interpreter, i.e., when the Spark Driver is initialized a SparkILoop is also created, and since SparkILoop reuses the Scala language's code interpreter, a class loader is set inside it as well. When a piece of user-defined code (i.e., Spark job code) is submitted to the Spark engine, it is first interpreted and executed by the Scala interpreter in the Spark Driver, so the classes newly defined by the user live entirely in the Scala interpreter's class loader. When the serialized Spark task is submitted to an Executor for execution and an operation reference to an instance object of a user-defined class needs to be returned, the serialized execution result is submitted to the Spark Driver; but at this time the Spark Driver's class loader is the default class loader, which lacks the class information defined by the user's dynamic code in the Scala interpreter, so execution fails. That is, only part of the code can run; cases that need to return instance objects of newly user-defined classes cannot execute correctly.
In view of the above situation, based on the above fifth embodiment, a sixth embodiment of the job execution method of the present application is proposed.
In this embodiment, before step a42, the job execution method further includes:
Step F: during initialization, when the Scala interpreter is created in the driver node of the target Spark engine, inject the class loader of the main thread into the Scala interpreter, so that the class loader of the main thread becomes the parent of the Scala interpreter's class loader, and the Scala interpreter creates a corresponding class loader according to the parent class loader;
During initialization of the target Spark engine, a Scala interpreter is created in the driver node of the target Spark engine. At that point, the class loader of the main thread is injected into the Scala interpreter so that it becomes the parent of the Scala interpreter's class loader, and the Scala interpreter then creates its own class loader from that parent;
In this case, step a42 includes:
transforming the Spark job code through the class loader of the Scala interpreter created in the driver node to obtain Spark tasks;
Then, the Spark job code is transformed through the class loader of the Scala interpreter created in the driver node Driver to obtain Spark tasks, and the Spark tasks are then assigned to the executor nodes (Executor) deployed on the Yarn cluster to execute the job.
Further, after step a43, the job execution method further includes:
Step G: upon receiving the serialized execution result returned by an executor node based on the Spark task, modify the class loader of the current thread of the target Spark engine to the class loader of the Scala interpreter, so that the serialized execution result is deserialized through the class loader of the Scala interpreter.
Upon receiving the serialized execution result returned by an executor node (Executor) based on the Spark task, the class loader of the current thread of the target Spark engine is modified to the class loader of the Scala interpreter, so that the serialized execution result is deserialized through the Scala interpreter's class loader.
In this embodiment, by dynamically modifying the class loader of the Spark engine, consistency between the Spark engine's class loading and the Scala interpreter's class loader in the Driver is guaranteed, so that the execution results deserialized from the Executors can be parsed correctly by the Driver.
Further, it should be noted that, due to differences between Spark versions, some code is hard to make compatible and will not compile after switching Spark versions; in that case, a combination of dynamic compilation and reflection can be used. Normally, two copies of the Spark job code, one per Spark version, can be prepared, and which copy to compile is decided at runtime. However, this approach has a problem: if the value returned by dynamic compilation needs to be serialized and then sent to the Executors, some anonymous classes generated inside it do not exist on the Executors, so deserialization fails.
For this, a wrapper layer of one's own can be implemented for the classes that changed. In Spark, the Spark version can be obtained through org.apache.spark.SPARK_VERSION (used to obtain the Spark version). Specifically, the version parameter is obtained in code, the class of the corresponding version is dynamically loaded, and methods are invoked via reflection, avoiding the compile-time errors above. Wrapping the classes oneself shields the differences in calling interfaces between versions; however, with reflection, code such as the original udf (user-defined function) can no longer be used. This is because a udf function requires that the concrete types of its input and return value can be inferred, and with reflection the return value cannot be determined (it may be org.apache.spark.ml.linalg.Vector, or it may be org.apache.spark.mllib.linalg.Vector), so compilation fails. In such cases the Spark source version needs to be modified so that, according to the current version parameter, the corresponding type can be dynamically loaded as the return value.
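A hedged Scala sketch of the reflection-based version dispatch described above (the lexicographic version check and the dense-vector factory example are simplifications for illustration, not the patent's implementation):

```scala
import org.apache.spark.SPARK_VERSION

// Pick the linalg package matching the running Spark version.
// A lexicographic compare is enough to split 1.x from 2.x here.
val vectorsClassName: String =
  if (SPARK_VERSION >= "2.0") "org.apache.spark.ml.linalg.Vectors"
  else "org.apache.spark.mllib.linalg.Vectors"

// Build a dense vector via reflection so one binary works against both
// the ml and mllib linalg packages.
def denseVector(values: Array[Double]): AnyRef = {
  val cls = Class.forName(vectorsClassName)
  cls.getMethod("dense", classOf[Array[Double]])
     .invoke(null, values) // static factory method, so the receiver is null
}
```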
The present application further provides a job execution apparatus.
Referring to FIG. 4, FIG. 4 is a schematic diagram of the functional modules of the first embodiment of the job execution apparatus of the present application.
As shown in FIG. 4, the job execution apparatus includes:
a first obtaining module 10, configured to obtain, upon receiving an execution request for a Spark job, the version number, dynamic configuration parameters, and Spark job code of the target Spark engine according to the execution request;
a first determining module 20, configured to determine the deployment directory information and version loading rules of the target Spark engine according to the version number;
an engine initialization module 30, configured to obtain static configuration parameters according to the deployment directory information, and to initialize the target Spark engine with the dynamic configuration parameters and the static configuration parameters according to the version loading rules, so as to start the target Spark engine;
a job execution module 40, configured to submit the Spark job code to the target Spark engine to execute the job.
Further, the job execution apparatus further includes:
a detection module, configured to obtain the user ID corresponding to the execution request and detect whether there is an idle Spark engine corresponding to the user ID and the version number;
the first determining module 20, further configured to, if there is no idle Spark engine corresponding to the user ID and the version number, execute the step of determining the deployment directory information and version loading rules of the target Spark engine according to the version number;
a first submitting module, configured to, if there is an idle Spark engine corresponding to the user ID and the version number, submit the Spark job code to the idle Spark engine to execute the job.
Further, the job execution apparatus further includes:
a judgment module, configured to obtain the user ID corresponding to the execution request and determine, according to the user ID, whether the user is in the preset grayscale list;
the first determining module 20, further configured to, if the user is not in the preset grayscale list, execute the step of determining the deployment directory information and version loading rules of the target Spark engine according to the version number;
a second submitting module, configured to, if the user is in the preset grayscale list, create a grayscale Spark engine and submit the Spark job code to the grayscale Spark engine to execute the job.
Further, the job execution apparatus further includes:
a second determining module, configured to determine, during initialization, the target calling method according to the version number and the preset abstraction layer interface;
a file package loading module, configured to load, according to the target calling method, the file packages that the target Spark engine depends on from the directory corresponding to the deployment directory information.
Further, the job execution apparatus further includes:
a first modification module, configured to modify the Spark job code according to the version number;
the job execution module 40, further configured to:
submit the modified Spark job code to the target Spark engine to execute the job.
Further, the job execution module 40 includes:
a code submission unit, configured to submit the Spark job code to the driver node of the target Spark engine;
a task generation unit, configured to transform the Spark job code through the driver node to obtain Spark tasks;
a task allocation unit, configured to assign the Spark tasks to the executor nodes deployed on the Yarn cluster to execute the job.
Further, the job execution apparatus further includes:
a second modification module, configured to, during initialization, when the Scala interpreter is created in the driver node of the target Spark engine, inject the class loader of the main thread into the Scala interpreter, so that the class loader of the main thread becomes the parent of the Scala interpreter's class loader, and the Scala interpreter creates a corresponding class loader according to the parent class loader;
the task generation unit, further configured to transform the Spark job code through the class loader of the Scala interpreter created in the driver node to obtain Spark tasks;
a third modification module, configured to, upon receiving the serialized execution result returned by an executor node based on the Spark task, modify the class loader of the current thread of the target Spark engine to the class loader of the Scala interpreter, so as to deserialize the serialized execution result through the Scala interpreter's class loader.
The function implementations of the modules of the above job execution apparatus correspond to the steps of the above job execution method embodiments; their functions and implementation processes are not repeated here one by one.
The present application further provides a computer-readable storage medium on which a job execution program is stored; when the job execution program is executed by a processor, the steps of the job execution method according to any of the above embodiments are implemented.
The specific embodiments of the computer-readable storage medium of the present application are basically the same as the embodiments of the job execution method above and are not repeated here.
The above serial numbers of the embodiments of the present application are for description only and do not represent the superiority of the embodiments.
Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, or of course by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, can be embodied in the form of a software product stored in a storage medium as described above (such as ROM/RAM, magnetic disk, or optical disk), including several instructions to cause a terminal device (which may be a mobile phone, computer, server, air conditioner, network device, or the like) to execute the methods described in the embodiments of the present application.
The above are only preferred embodiments of the present application and do not thereby limit its patent scope; any equivalent structural or process transformation made using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, is likewise included within the patent protection scope of the present application.

Claims (10)

  1. A job execution method, wherein the job execution method comprises:
    upon receiving an execution request for a Spark job, obtaining the version number, dynamic configuration parameters, and Spark job code of the target Spark engine according to the execution request;
    determining the deployment directory information and version loading rules of the target Spark engine according to the version number;
    obtaining static configuration parameters according to the deployment directory information, and initializing the target Spark engine with the dynamic configuration parameters and the static configuration parameters according to the version loading rules, so as to start the target Spark engine;
    submitting the Spark job code to the target Spark engine to execute the job.
  2. The job execution method according to claim 1, wherein before the step of determining the deployment directory information and version loading rules of the target Spark engine according to the version number, the method further comprises:
    obtaining the user ID corresponding to the execution request, and detecting whether there is an idle Spark engine corresponding to the user ID and the version number;
    if not, executing the step of determining the deployment directory information and version loading rules of the target Spark engine according to the version number;
    if so, submitting the Spark job code to the idle Spark engine to execute the job.
  3. The job execution method according to claim 1, wherein before the step of determining the deployment directory information and version loading rules of the target Spark engine according to the version number, the method further comprises:
    obtaining the user ID corresponding to the execution request, and determining, according to the user ID, whether the user is in the preset grayscale list;
    if the user is not in the preset grayscale list, executing the step of determining the deployment directory information and version loading rules of the target Spark engine according to the version number;
    if the user is in the preset grayscale list, creating a grayscale Spark engine and submitting the Spark job code to the grayscale Spark engine to execute the job.
  4. The job execution method according to claim 1, wherein the job execution method further comprises:
    during initialization, determining the target calling method according to the version number and the preset abstraction layer interface;
    loading, according to the target calling method, the file packages that the target Spark engine depends on from the directory corresponding to the deployment directory information.
  5. The job execution method according to any one of claims 1 to 4, wherein before the step of submitting the Spark job code to the target Spark engine to execute the job, the method further comprises:
    modifying the Spark job code according to the version number;
    the step of submitting the Spark job code to the target Spark engine to execute the job comprises:
    submitting the modified Spark job code to the target Spark engine to execute the job.
  6. The job execution method according to any one of claims 1 to 4, wherein the step of submitting the Spark job code to the target Spark engine to execute the job comprises:
    submitting the Spark job code to the driver node of the target Spark engine;
    transforming the Spark job code through the driver node to obtain Spark tasks;
    assigning the Spark tasks to the executor nodes deployed on the Yarn cluster to execute the job.
  7. The job execution method according to claim 6, wherein before the step of transforming the Spark job code through the driver node to obtain Spark tasks, the method further comprises:
    during initialization, when the Scala interpreter is created in the driver node of the target Spark engine, injecting the class loader of the main thread into the Scala interpreter, so that the class loader of the main thread becomes the parent of the Scala interpreter's class loader, and the Scala interpreter creates a corresponding class loader according to the parent class loader;
    the step of transforming the Spark job code through the driver node to obtain Spark tasks comprises:
    transforming the Spark job code through the class loader of the Scala interpreter created in the driver node to obtain Spark tasks;
    after the step of assigning the Spark tasks to the executor nodes deployed on the Yarn cluster to execute the job, the method further comprises:
    upon receiving the serialized execution result returned by an executor node based on the Spark task, modifying the class loader of the current thread of the target Spark engine to the class loader of the Scala interpreter, so as to deserialize the serialized execution result through the class loader of the Scala interpreter.
  8. A job execution apparatus, wherein the job execution apparatus comprises:
    a first obtaining module, configured to obtain, upon receiving an execution request for a Spark job, the version number, dynamic configuration parameters, and Spark job code of the target Spark engine according to the execution request;
    a first determining module, configured to determine the deployment directory information and version loading rules of the target Spark engine according to the version number;
    an engine initialization module, configured to obtain static configuration parameters according to the deployment directory information, and to initialize the target Spark engine with the dynamic configuration parameters and the static configuration parameters according to the version loading rules, so as to start the target Spark engine;
    a job execution module, configured to submit the Spark job code to the target Spark engine to execute the job.
  9. A job execution system, wherein the job execution system comprises: a memory, a processor, and a job execution program stored on the memory and executable on the processor; when the job execution program is executed by the processor, the steps of the job execution method according to any one of claims 1 to 7 are implemented.
  10. A computer-readable storage medium, wherein a job execution program is stored on the computer-readable storage medium; when the job execution program is executed by a processor, the steps of the job execution method according to any one of claims 1 to 7 are implemented.
PCT/CN2021/081960 2020-06-30 2021-03-22 Job execution method, apparatus, system and computer-readable storage medium WO2022001209A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010624055.5A CN111767092B (zh) 2020-06-30 Job execution method, apparatus, system and computer-readable storage medium
CN202010624055.5 2020-06-30

Publications (1)

Publication Number Publication Date
WO2022001209A1 true WO2022001209A1 (zh) 2022-01-06

Family

ID=72724494

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/081960 WO2022001209A1 (zh) 2020-06-30 2021-03-22 Job execution method, apparatus, system and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN111767092B (zh)
WO (1) WO2022001209A1 (zh)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114615135A (zh) * 2022-02-18 2022-06-10 佐朋数科(深圳)信息技术有限责任公司 Front-end grayscale release method, system and storage medium
CN114968274A (zh) * 2022-07-29 2022-08-30 之江实验室 Method and system for automated rapid deployment of front-end processors based on grayscale release
CN115061790A (zh) * 2022-06-10 2022-09-16 苏州浪潮智能科技有限公司 Spark Kmeans core allocation method and system for ARM dual-socket servers
CN115129325A (zh) * 2022-06-29 2022-09-30 北京五八信息技术有限公司 Data processing method and apparatus, electronic device, and storage medium
CN115242877A (zh) * 2022-09-21 2022-10-25 之江实验室 Spark collaborative computing and job method and apparatus for multiple K8s clusters
CN115237818A (zh) * 2022-09-26 2022-10-25 浩鲸云计算科技股份有限公司 Method and system for multi-environment reuse based on full-link identification
CN116048817A (zh) * 2023-03-29 2023-05-02 腾讯科技(深圳)有限公司 Data processing control method and apparatus, computer device, and storage medium
CN116909681A (zh) * 2023-06-13 2023-10-20 北京远舢智能科技有限公司 Method and apparatus for generating data processing components, electronic device, and storage medium
CN117453278A (zh) * 2023-11-01 2024-01-26 国任财产保险股份有限公司 Rule management system based on business rules
US11954525B1 (en) 2022-09-21 2024-04-09 Zhejiang Lab Method and apparatus of executing collaborative job for spark faced to multiple K8s clusters

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111767092B (zh) 2020-06-30 2023-05-12 深圳前海微众银行股份有限公司 Job execution method, apparatus, system and computer-readable storage medium
CN112311603A (zh) * 2020-10-30 2021-02-02 上海中通吉网络技术有限公司 Method, apparatus and system for dynamically changing Spark user configuration
CN112698839B (zh) * 2020-12-30 2024-04-12 深圳前海微众银行股份有限公司 Data center node deployment method, apparatus, system, and computer storage medium
CN114968267A (zh) * 2021-02-26 2022-08-30 京东方科技集团股份有限公司 Service deployment method and apparatus, electronic device, and storage medium
CN113553533A (zh) * 2021-06-10 2021-10-26 国网安徽省电力有限公司 Index calculation method based on a digitized internal five-level market assessment system
CN113642021B (zh) * 2021-08-20 2024-05-28 深信服科技股份有限公司 Business code submission method, processing method, apparatus, and electronic device
CN113722019B (zh) * 2021-11-04 2022-02-08 海尔数字科技(青岛)有限公司 Method, apparatus and device for displaying platform programs

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105867928A (zh) * 2016-03-30 2016-08-17 北京奇虎科技有限公司 Method and apparatus for accessing a specified computing model in a specified distributed system
CN108255689A (zh) * 2018-01-11 2018-07-06 哈尔滨工业大学 Automated tuning method for Apache Spark applications based on historical task analysis
US20190065336A1 (en) * 2017-08-24 2019-02-28 Tata Consultancy Services Limited System and method for predicting application performance for large data size on big data cluster
CN109614167A (zh) * 2018-12-07 2019-04-12 杭州数澜科技有限公司 Method and system for managing plug-ins
CN110262881A (zh) * 2019-06-12 2019-09-20 深圳前海微众银行股份有限公司 Method and apparatus for submitting Spark jobs
CN111767092A (zh) * 2020-06-30 2020-10-13 深圳前海微众银行股份有限公司 Job execution method, apparatus, system and computer-readable storage medium

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2017207388B2 (en) * 2016-01-12 2021-05-13 Kavi Associates, Llc Multi-technology visual integrated data management and analytics development and deployment environment
US10275278B2 (en) * 2016-09-14 2019-04-30 Salesforce.Com, Inc. Stream processing task deployment using precompiled libraries
CN108845884B (zh) * 2018-06-15 2024-04-19 中国平安人寿保险股份有限公司 Physical resource allocation method, apparatus, computer device, and storage medium

Also Published As

Publication number Publication date
CN111767092A (zh) 2020-10-13
CN111767092B (zh) 2023-05-12

Similar Documents

Publication Publication Date Title
WO2022001209A1 (zh) Job execution method, apparatus, system and computer-readable storage medium
US11789715B2 (en) Systems and methods for transformation of reporting schema
US6871223B2 (en) System and method for agent reporting in to server
US11429365B2 (en) Systems and methods for automated retrofitting of customized code objects
KR101574366B1 (ko) Synchronization of virtual machine and application lifetimes
US7039923B2 (en) Class dependency graph-based class loading and reloading
US20050262501A1 (en) Software distribution method and system supporting configuration management
US20030181196A1 (en) Extensible framework for code generation from XML tags
US20030182625A1 (en) Language and object model for describing MIDlets
US20020188941A1 (en) Efficient installation of software packages
US20150212812A1 (en) Declarative and pluggable business logic for systems management
US20150220308A1 (en) Model-based development
US20030182626A1 (en) On-demand creation of MIDlets
US20120272204A1 (en) Uninterruptible upgrade for a build service engine
CN101384995A (zh) Management automation in an application server
US8839223B2 (en) Validation of current states of provisioned software products in a cloud environment
WO2024002243A1 (zh) Application management method, application subscription method, and related device
US20210182054A1 (en) Preventing unexpected behavior in software systems due to third party artifacts
US20230418623A1 (en) Application remodeling method, system, cluster, medium, and program product
US20160170739A1 (en) Alter application behaviour during runtime
WO2024002302A1 (zh) Application management method, application subscription method, and related device
JP2022531736A (ja) Service management in a DBMS
US20240143340A1 (en) Hybrid multi-tenant framework for reconfiguring software components
US11385876B1 (en) Infrastructure control interface for database systems
US11243751B1 (en) Proxy compilation for execution in a foreign architecture controlled by execution within a native architecture

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21832208

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21832208

Country of ref document: EP

Kind code of ref document: A1