WO2017107456A1 - Method and apparatus for determining resources consumed by a task - Google Patents

Method and apparatus for determining resources consumed by a task

Info

Publication number
WO2017107456A1
WO2017107456A1 PCT/CN2016/089272 CN2016089272W WO2017107456A1 WO 2017107456 A1 WO2017107456 A1 WO 2017107456A1 CN 2016089272 W CN2016089272 W CN 2016089272W WO 2017107456 A1 WO2017107456 A1 WO 2017107456A1
Authority
WO
WIPO (PCT)
Prior art keywords
task
cluster
resource
determining
resources
Prior art date
Application number
PCT/CN2016/089272
Other languages
English (en)
French (fr)
Inventor
许鹭清
Original Assignee
乐视控股(北京)有限公司
乐视网信息技术(北京)股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 乐视控股(北京)有限公司, 乐视网信息技术(北京)股份有限公司
Priority to US15/241,389 (US20170185454A1)
Publication of WO2017107456A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/3003 Monitoring arrangements specially adapted to the computing system or computing system component being monitored
    • G06F11/3006 Monitoring arrangements specially adapted to the computing system or computing system component being monitored where the computing system is distributed, e.g. networked systems, clusters, multiprocessor systems
    • G06F11/3051 Monitoring arrangements for monitoring the configuration of the computing system or of the computing system component, e.g. monitoring the presence of processing resources, peripherals, I/O links, software programs

Definitions

  • the present invention relates to the field of computer technologies, and in particular, to a method and apparatus for determining resource consumption of a task.
  • Hadoop implements a distributed file system (Hadoop Distributed File System), referred to as HDFS. Users can develop distributed programs without taking into account the underlying details of the distribution, making full use of the power of the cluster for high-speed computing and storage.
  • A cluster generally contains multiple nodes, each of which provides CPU resources, storage resources, and so on.
  • In practice, a Hadoop cluster in an enterprise may be used by many of the enterprise's R&D personnel. Because every task submitted to the cluster consumes certain resources when it executes, such as CPU resources and storage resources, programs supplied by some R&D personnel that consume large amounts of cluster resources may cause contention for resources and may also affect the running of other cluster tasks.
  • the embodiments of the present invention provide a method and an apparatus for determining resources consumed by a task.
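  • A method for determining the resources consumed by a task, including:
  • acquiring a task record of a cluster task, where the task record includes the task processes started when the task is executed;
  • calculating the resource occupation time for which each task process occupies a preset unit resource;
  • counting the total resource occupation time of the preset unit resources occupied by the multiple task processes started by the cluster task;
  • determining, according to the total resource occupation time and the preset unit resource, the cluster resources consumed by the cluster task during execution.
  • Optionally, the method further includes:
  • counting the multi-dimensional resources on each node in the cluster;
  • dividing the multi-dimensional resources on each node into multiple single-dimensional preset unit resources.
  • Optionally, the method further includes:
  • acquiring the correspondence between preset cluster resources and task priorities;
  • determining the task priority corresponding to the cluster resources consumed by the cluster task as the priority of the cluster task.
  • Optionally, the task record further includes attempt processes, and
  • calculating the resource occupation time for which each task process occupies a preset unit resource within the corresponding process time includes:
  • acquiring, for each task process, the attempt processes started by that task process;
  • when a successfully run attempt process exists, counting the resource occupation time for which the successfully run attempt process occupies the preset unit resource.
  • Optionally, acquiring the task record of the cluster task includes:
  • acquiring the task record of the cluster task through a preset interface in a load-balanced manner.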
  • an apparatus for determining a task consuming resource includes:
  • a first acquiring module configured to acquire a task record of the cluster task, where the task record includes: a task process started when the task is executed;
  • a calculation module configured to calculate a resource occupation time of each task process occupying a preset unit resource
  • the first statistic module is configured to count the total resource occupation time of the preset unit resources occupied by the multiple task processes initiated by the cluster task;
  • the first determining module is configured to determine, according to the total resource occupation time and the preset unit resource, a cluster resource consumed by the cluster task during execution.
  • the device further includes:
  • a second statistic module configured to count multi-dimensional resources on each node in the cluster
  • a dividing module configured to divide the multi-dimensional resources on each node into multiple single-dimensional preset unit resources.
  • the device further includes:
  • a second acquiring module configured to acquire a correspondence between a preset cluster resource and a task priority
  • a second determining module configured to determine a task priority corresponding to the cluster resource consumed by the cluster task as a priority of the cluster task.
  • the task record further includes: an attempt process
  • the calculation module includes:
  • a first obtaining submodule configured to acquire, for each task process, an attempt process initiated by each task process
  • a statistics submodule configured to count, when a successfully run attempt process exists, the resource occupation time for which the successfully run attempt process occupies the preset unit resource.
  • the first obtaining module includes:
  • the second obtaining sub-module is configured to acquire a task record of the cluster task in a load balancing manner by using a preset interface.
  • a server which includes some or all of the modules in the device for determining resource consumption of the task provided by the second aspect of the embodiments of the present invention.
  • a non-transitory computer readable storage medium that can store computer instructions, where the computer instructions can implement some or all of the steps in the various implementations of the method for determining the resources consumed by a task provided by the first aspect of the embodiments of the present invention.
  • The present invention acquires a task record of a cluster task, where the task record includes the task processes started when the task is executed; calculates the resource occupation time for which each task process occupies a preset unit resource; counts the total resource occupation time of the preset unit resources occupied by the multiple task processes started by the cluster task; and determines, according to the total resource occupation time and the preset unit resource, the cluster resources consumed by the cluster task during execution.
  • The method provided by the embodiment of the present invention can determine the cluster resources occupied by each cluster task during execution. This makes it easier to track the resources consumed by the cluster tasks computed in the cluster each day, to analyze them by department, user, or service, to find the cluster tasks with the lowest resource consumption, and to collect statistics on the resource consumption of each department or line of business. That in turn helps guide departments in optimizing their computing tasks and helps control the cost of building the cluster.
  • FIG. 1 is a flowchart of a method for determining the resources consumed by a task, according to an exemplary embodiment.
  • FIG. 2 is another flowchart of a method for determining the resources consumed by a task, according to an exemplary embodiment.
  • FIG. 3 is another flowchart of a method for determining the resources consumed by a task, according to an exemplary embodiment.
  • FIG. 4 is a structural diagram of an apparatus for determining the resources consumed by a task, according to an exemplary embodiment.
  • a method for determining resource consumption of a task is provided, which is applied to a server, and includes the following steps.
  • In step S101, a task record of the cluster task is acquired.
  • The task record includes the task processes started when the task is executed, and the server may acquire the task record of the cluster task through a preset interface in a load-balanced manner.
  • The cluster task can be a task submitted to the Hadoop cluster.
  • For every completed MapReduce task, the JobTracker records detailed information about the task, including the task's basic configuration and how the MapReduce task actually executed. This information can be obtained from the JobTracker web site and its subpages. The data collection program is a Newlisp script that requests the content of specified pages of the JobTracker site via HTTP GET and parses the content to obtain the details of the specified MapReduce task.
  • The information collected generally falls into three categories:
  • 1) Basic information about the task: task ID, user name, task name, Hive statement executed, submitting machine, IP of the submitting machine, submission time, launch time, launch duration, end time, total duration, run result, and failure information.
  • 2) Run statistics for the task: the number of Tasks of each kind, the number of Tasks that ran successfully, failed, or were killed, the start time, end time, and total duration of each phase (Setup, Map, Reduce, Cleanup), and the value of each Counter.
  • 3) Details of each Attempt of each Task: Attempt ID, owning Task ID, Attempt start time, Shuffle phase end time and duration, Sort phase end time and duration, Attempt end time, total duration, executing machine, execution result, error message, and number of Counters.
  • For each MapReduce task, the program collects the above three categories of information, aggregates them into a single task record, and sends the record back to the server over HTTP.
  • The server receives the data sent by the program through a REST API.
  • To avoid a single point of failure, an LVS + Nginx + dual-machine load balancing scheme is adopted, and the database is a three-machine MongoDB cluster, which ensures high-performance data storage with no single point of failure.
  • In step S102, the resource occupation time for which each task process occupies a preset unit resource is calculated.
  • A preset unit resource may be a Slot. For each task process, the attempt processes started by that task process may be acquired; when a successfully run attempt process exists, the resource occupation time for which the successfully run attempt process occupies the preset unit resource is counted.
  • When a cluster task (that is, a MapReduce task) runs, it always needs to run a certain number of Map Tasks and Reduce Tasks, and each task process (that is, Task) always occupies a Slot, and therefore a certain share of a machine's resources, for a period of time.
  • Each cluster task is composed of several task processes, and each task process may start multiple attempt processes (that is, Attempts). Each attempt process is one attempt to complete the task process.
  • When an attempt process executes, it may fail or run abnormally slowly because of a fault on the node running it. The computing framework then starts another attempt process to execute the same task process.
  • Hadoop clusters use this mechanism to ensure that each task process runs successfully and that a task does not run too long because one task process is slow. Of the attempt processes of a task process, at most one ends in the successful state.
  • Because multiple attempt processes for a task process are mostly caused by faults on cluster compute nodes, the cost of running them should not be charged repeatedly to the task. That is, only the sum of the execution times of all attempt processes in a task whose state is SUCCESS is calculated, and this sum is taken as the total running time of the task's task processes.
  • In step S103, the total resource occupation time of the preset unit resources occupied by the multiple task processes started by the cluster task is counted.
  • The resource occupation times for which the individual task processes occupy preset unit resources may be summed to obtain the total resource occupation time.
  • In step S104, the cluster resources consumed by the cluster task during execution are determined according to the total resource occupation time and the preset unit resource.
  • Because the number of machines in a Hadoop cluster is limited and the number of Slots that can be configured on each machine is fixed, the total Map Task and Reduce Task running time the cluster can provide each day is also fixed. The method can therefore determine the cluster resources occupied by each cluster task during execution, which makes it easier to track the resources consumed by the cluster tasks computed in the cluster each day, to analyze them by department, user, or service, to find the cluster tasks with the lowest resource consumption, and to collect statistics on the resource consumption of each department or line of business. This in turn helps guide departments in optimizing their computing tasks and helps control the cost of building the cluster.
  • the method further includes the following steps.
  • In step S201, the multi-dimensional resources on each node in the cluster are counted.
  • In step S202, the multi-dimensional resources on each node are divided into multiple single-dimensional preset unit resources.
  • The multi-dimensional resources (CPU, memory, network I/O, disk I/O, and so on) on each node in the Hadoop cluster can be divided equally into multiple one-dimensional Slots. Because Map Tasks and Reduce Tasks use different amounts of resources, Slots can be further divided into Map Slots and Reduce Slots, with the rule that a Map Task can only use a Map Slot and a Reduce Task can only use a Reduce Slot.
  • The embodiment of the invention can divide the resources on each node to obtain multiple single-dimensional preset unit resources, so that the total resource occupation time of a cluster task can be determined from the time for which each task process occupies a preset unit resource.
  • the method further includes the following steps.
  • In step S301, the correspondence between preset cluster resources and task priorities is acquired.
  • The correspondence between preset cluster resources and task priorities may be a correspondence between threshold ranges of cluster resources and task priorities. For example, when the cluster resources fall in the threshold range of 100 to 200, the corresponding priority is level 2, and so on.
  • In step S302, the task priority corresponding to the cluster resources consumed by the cluster task is determined as the priority of the cluster task.
  • The method provided by the embodiment of the present invention can determine the priority of a cluster task according to its resource consumption, which makes it easier to schedule and control cluster tasks according to their priority.
  • an apparatus for determining a resource consumption of a task including: a first obtaining module 401, a calculating module 402, a first statistic module 403, and a first determining module 404.
  • the first obtaining module 401 is configured to acquire a task record of the cluster task, where the task record includes: a task process started when the task is executed.
  • the second obtaining sub-module is configured to acquire the task record of the cluster task in a load balancing manner through the preset interface.
  • the calculation module 402 is configured to calculate a resource occupation time of each task process occupying a preset unit resource.
  • the calculating module includes:
  • a first obtaining submodule configured to acquire, for each task process, an attempt process initiated by each task process
  • a statistics submodule configured to count, when a successfully run attempt process exists, the resource occupation time for which the successfully run attempt process occupies the preset unit resource.
  • the first statistic module 403 is configured to collect a total resource occupation time of a preset unit resource occupied by multiple task processes initiated by the cluster task.
  • the first determining module 404 is configured to determine, according to the total resource occupation time and the preset unit resource, a cluster resource consumed by the cluster task when executed.
  • the apparatus further includes: a second statistic module and a partitioning module.
  • the second statistic module is configured to count multi-dimensional resources on each node in the cluster.
  • a dividing module is configured to divide the multi-dimensional resource on each node into a plurality of single-dimensional preset unit resources.
  • the apparatus further includes: a second acquisition module and a second determination module.
  • the second obtaining module is configured to obtain a correspondence between the preset cluster resource and the task priority.
  • a second determining module configured to determine a task priority corresponding to the cluster resource consumed by the cluster task as a priority of the cluster task.
  • the embodiment of the present invention further provides a server, which includes some or all of the modules in the device for determining resource consumption of the task provided by the embodiment shown in FIG. 4 .
  • The embodiment of the present invention further provides a non-transitory computer readable storage medium that can store computer instructions, where the computer instructions can implement some or all of the steps in the various implementations of the method for determining the resources consumed by a task provided by the embodiments shown in FIG. 1 to FIG. 3.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

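A method and apparatus for determining the resources consumed by a task. The method includes: acquiring a task record of a cluster task (S101), the task record including the task processes started when the task is executed; calculating the resource occupation time for which each task process occupies a preset unit resource (S102); counting the total resource occupation time of the preset unit resources occupied by the multiple task processes started by the cluster task (S103); and determining, according to the total resource occupation time and the preset unit resource, the cluster resources consumed by the cluster task during execution (S104). The method can determine the cluster resources occupied by each cluster task during execution, which makes it easier to track the resources consumed by the cluster tasks computed in the cluster each day.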

Description

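Method and apparatus for determining resources consumed by a task
This application claims priority to Chinese patent application No. 201510997430.X, filed with the Chinese Patent Office on December 25, 2015 and entitled "Method and apparatus for determining resources consumed by a task", the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the field of computer technologies, and in particular to a method and apparatus for determining the resources consumed by a task.
Background
Hadoop implements a distributed file system, the Hadoop Distributed File System (HDFS). Users can develop distributed programs without understanding the low-level details of distribution, making full use of the power of the cluster for high-speed computation and storage. A cluster generally contains multiple nodes, each of which provides CPU resources, storage resources, and so on.
In practice, a Hadoop cluster in an enterprise may be used by many of the enterprise's R&D personnel. Because every task submitted to the cluster consumes certain resources when it executes, such as CPU resources and storage resources, programs supplied by some R&D personnel that consume large amounts of cluster resources may cause contention for resources and may also affect the running of other cluster tasks.
Summary
To overcome the problems in the related art, embodiments of the present invention provide a method and apparatus for determining the resources consumed by a task.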
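According to a first aspect of the embodiments of the present invention, a method for determining the resources consumed by a task is provided, including:
acquiring a task record of a cluster task, the task record including the task processes started when the task is executed;
calculating the resource occupation time for which each task process occupies a preset unit resource;
counting the total resource occupation time of the preset unit resources occupied by the multiple task processes started by the cluster task;
determining, according to the total resource occupation time and the preset unit resource, the cluster resources consumed by the cluster task during execution.
Optionally, the method further includes:
counting the multi-dimensional resources on each node in the cluster;
dividing the multi-dimensional resources on each node into multiple single-dimensional preset unit resources.
Optionally, the method further includes:
acquiring the correspondence between preset cluster resources and task priorities;
determining the task priority corresponding to the cluster resources consumed by the cluster task as the priority of the cluster task.
Optionally, the task record further includes attempt processes;
calculating the resource occupation time for which each task process occupies a preset unit resource within the corresponding process time includes:
acquiring, for each task process, the attempt processes started by that task process;
when a successfully run attempt process exists, counting the resource occupation time for which the successfully run attempt process occupies the preset unit resource.
Optionally, acquiring the task record of the cluster task includes:
acquiring the task record of the cluster task through a preset interface in a load-balanced manner.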
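According to a second aspect of the embodiments of the present invention, an apparatus for determining the resources consumed by a task is provided, including:
a first acquiring module, configured to acquire a task record of a cluster task, the task record including the task processes started when the task is executed;
a calculation module, configured to calculate the resource occupation time for which each task process occupies a preset unit resource;
a first statistics module, configured to count the total resource occupation time of the preset unit resources occupied by the multiple task processes started by the cluster task;
a first determining module, configured to determine, according to the total resource occupation time and the preset unit resource, the cluster resources consumed by the cluster task during execution.
Optionally, the apparatus further includes:
a second statistics module, configured to count the multi-dimensional resources on each node in the cluster;
a dividing module, configured to divide the multi-dimensional resources on each node into multiple single-dimensional preset unit resources.
Optionally, the apparatus further includes:
a second acquiring module, configured to acquire the correspondence between preset cluster resources and task priorities;
a second determining module, configured to determine the task priority corresponding to the cluster resources consumed by the cluster task as the priority of the cluster task.
Optionally, the task record further includes attempt processes;
the calculation module includes:
a first acquiring submodule, configured to acquire, for each task process, the attempt processes started by that task process;
a statistics submodule, configured to count, when a successfully run attempt process exists, the resource occupation time for which the successfully run attempt process occupies the preset unit resource.
Optionally, the first acquiring module includes:
a second acquiring submodule, configured to acquire the task record of the cluster task through a preset interface in a load-balanced manner.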
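According to a third aspect of the embodiments of the present invention, a server is further provided, which includes some or all of the modules of the apparatus for determining the resources consumed by a task provided by the second aspect of the embodiments of the present invention.
According to a fourth aspect of the embodiments of the present invention, a non-volatile computer readable storage medium is further provided. The non-volatile computer readable storage medium can store computer instructions, and the computer instructions can implement some or all of the steps in the various implementations of the method for determining the resources consumed by a task provided by the first aspect of the embodiments of the present invention.
The technical solutions provided by the embodiments of the present invention may have the following beneficial effects:
The present invention acquires a task record of a cluster task, the task record including the task processes started when the task is executed; calculates the resource occupation time for which each task process occupies a preset unit resource; counts the total resource occupation time of the preset unit resources occupied by the multiple task processes started by the cluster task; and determines, according to the total resource occupation time and the preset unit resource, the cluster resources consumed by the cluster task during execution.
The method provided by the embodiments of the present invention can determine the cluster resources occupied by each cluster task during execution. This makes it easier to track the resources consumed by the cluster tasks computed in the cluster each day, to analyze them by department, user, or service, to find the cluster tasks with the lowest resource consumption, and to collect statistics on the resource consumption of each department or line of business. That in turn helps guide departments in optimizing their computing tasks and helps control the cost of building the cluster.
It should be understood that the general description above and the detailed description below are only exemplary and explanatory and do not limit the present invention.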
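Brief Description of the Drawings
The accompanying drawings are incorporated into and form part of this specification. They show embodiments consistent with the present invention and, together with the specification, serve to explain the principles of the invention.
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, a person of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a flowchart of a method for determining the resources consumed by a task, according to an exemplary embodiment;
FIG. 2 is another flowchart of a method for determining the resources consumed by a task, according to an exemplary embodiment;
FIG. 3 is another flowchart of a method for determining the resources consumed by a task, according to an exemplary embodiment;
FIG. 4 is a structural diagram of an apparatus for determining the resources consumed by a task, according to an exemplary embodiment.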
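Detailed Description
Exemplary embodiments are described in detail here, and examples of them are shown in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present invention. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the invention as detailed in the appended claims.
As shown in FIG. 1, in another embodiment of the present invention, a method for determining the resources consumed by a task is provided. The method is applied to a server and includes the following steps.
In step S101, a task record of the cluster task is acquired.
In this embodiment of the present invention, the task record includes the task processes started when the task is executed, and the server may acquire the task record of the cluster task through a preset interface in a load-balanced manner.
In this step, the cluster task may be a task submitted to the Hadoop cluster. For every completed MapReduce task, the JobTracker records detailed information about the task, including the task's basic configuration and how the MapReduce task actually executed. All of this information can be obtained from the JobTracker's web site and its subpages. The data collection program is a Newlisp script that requests the content of specified pages of the JobTracker site via HTTP GET and parses the content to obtain the details of the specified MapReduce task. In general, the collected information falls into three categories: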
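1) Basic information about the task;
Including: task ID, user name, task name, Hive statement executed, submitting machine, IP of the submitting machine, submission time, launch time, launch duration, end time, total duration, run result, and failure information.
2) Run statistics for the task;
Including: the number of Tasks of each kind, the number of Tasks that ran successfully, the number of failed Tasks, the number of killed Tasks, the start time, end time, and total duration of each phase (Setup, Map, Reduce, Cleanup), and the value of each Counter.
3) Details of the execution of each Attempt of each Task;
Including: Attempt ID, owning Task ID, Attempt start time, Shuffle phase end time, Shuffle phase duration, Sort phase end time, Sort phase duration, Attempt end time, total duration, executing machine, execution result, error message, and number of Counters.
For each MapReduce task, the program collects the above three categories of information, aggregates them into a single task record, and sends the record back to the server over HTTP. The server receives the data sent by the program through a REST API. To avoid a single point of failure, an LVS + Nginx + dual-machine load balancing scheme is adopted, and the database is a three-machine MongoDB cluster, which ensures high-performance data storage with no single point of failure.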
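To make the shape of an aggregated task record concrete, the three categories can be modelled as nested records. This is only a minimal sketch: the class names `AttemptRecord` and `ClusterTaskRecord`, the field names, and the use of seconds as the time unit are assumptions made for illustration, and only a subset of the listed fields is shown.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AttemptRecord:
    """Category 3: details of one Attempt of one Task (subset of fields)."""
    attempt_id: str
    task_id: str
    start_time: float  # seconds (assumed unit)
    end_time: float
    host: str
    result: str        # e.g. "SUCCESS", "FAILED", "KILLED"

    @property
    def duration(self) -> float:
        # Time for which this Attempt ran.
        return self.end_time - self.start_time


@dataclass
class ClusterTaskRecord:
    """One aggregated record per MapReduce task (hypothetical layout)."""
    # Category 1: basic information
    task_id: str
    user: str
    name: str
    submit_time: float
    end_time: float
    result: str
    # Category 2: run statistics
    succeeded_tasks: int = 0
    failed_tasks: int = 0
    killed_tasks: int = 0
    # Category 3: per-Attempt details
    attempts: List[AttemptRecord] = field(default_factory=list)


record = ClusterTaskRecord(
    task_id="job_0001", user="alice", name="daily_report",
    submit_time=0.0, end_time=120.0, result="SUCCESS", succeeded_tasks=1,
)
record.attempts.append(
    AttemptRecord("attempt_0001_m_000000_0", "task_0001_m_000000",
                  start_time=5.0, end_time=65.0, host="node-1", result="SUCCESS")
)
```

Keeping the Attempt details inside the record means the later steps can work from a single document per MapReduce task, which fits a document store such as the MongoDB cluster mentioned above.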
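In step S102, the resource occupation time for which each task process occupies a preset unit resource is calculated.
In this embodiment of the present invention, a preset unit resource may be a Slot. For each task process, the attempt processes started by that task process may be acquired; when a successfully run attempt process exists, the resource occupation time for which the successfully run attempt process occupies the preset unit resource is counted.
In this step, when a cluster task (that is, a MapReduce task) runs, it always needs to run a certain number of Map Tasks and Reduce Tasks. Each task process (that is, Task) always occupies a Slot for a period of time, which means it occupies a certain share of a machine's resources for that period.
Each cluster task (that is, a MapReduce task) is composed of several task processes (that is, Tasks), and each task process may start multiple attempt processes (that is, Attempts). Each attempt process is one attempt to complete the task process. When an attempt process executes, it may fail or run abnormally slowly because of a fault on the node running it, in which case the computing framework starts another attempt process to execute the same task process. Hadoop clusters use this mechanism to ensure that each task process runs successfully and that a task does not run too long because one task process is slow. Of the attempt processes of a task process, at most one ends in the successful state.
Because multiple attempt processes for a task process are in most cases caused by faults on cluster compute nodes, the cost of running them should not be charged repeatedly to the task. That is, only the sum of the execution times of all attempt processes in a task whose state is SUCCESS is calculated, and this sum is taken as the total running time of the task's task processes.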
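The rule of counting only successful attempts can be sketched as follows. This is a hedged illustration, not the patent's implementation: the function name `slot_time_per_task`, the dictionary keys (`task_id`, `status`, `start`, `end`), and the sample data are all assumptions.

```python
from typing import Dict, Iterable, Mapping


def slot_time_per_task(attempts: Iterable[Mapping]) -> Dict[str, float]:
    """Return, per task process, the Slot time of its SUCCESS attempts only.

    Failed or killed attempts are skipped, so retries caused by node faults
    are not charged to the task.
    """
    totals: Dict[str, float] = {}
    for attempt in attempts:
        if attempt["status"] != "SUCCESS":
            continue
        duration = attempt["end"] - attempt["start"]
        totals[attempt["task_id"]] = totals.get(attempt["task_id"], 0.0) + duration
    return totals


# Hypothetical data: the first map attempt fails, the retry succeeds.
example_attempts = [
    {"task_id": "m_000001", "status": "FAILED", "start": 0, "end": 40},
    {"task_id": "m_000001", "status": "SUCCESS", "start": 45, "end": 105},
    {"task_id": "r_000001", "status": "SUCCESS", "start": 110, "end": 200},
]
per_task = slot_time_per_task(example_attempts)
# per_task == {"m_000001": 60.0, "r_000001": 90.0}
```

The 40 seconds spent in the failed attempt do not appear in the result, which matches the reasoning that node faults should not raise the cost attributed to a task.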
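In step S103, the total resource occupation time of the preset unit resources occupied by the multiple task processes started by the cluster task is counted.
In this step, the resource occupation times for which the individual task processes occupy preset unit resources may be summed to obtain the total resource occupation time.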
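Written as a formula, the summation in this step, combined with the rule that only successful attempts are counted, might read as follows. The symbols are introduced here for illustration and do not appear in the source: $n$ is the number of task processes started by the cluster task, $A_i$ is the set of attempt processes of task process $i$, and $s_a$ and $e_a$ are the start and end times of attempt $a$.

```latex
t_i = \sum_{\substack{a \in A_i \\ \mathrm{state}(a) = \mathrm{SUCCESS}}} (e_a - s_a),
\qquad
T_{\mathrm{total}} = \sum_{i=1}^{n} t_i
```

For example, if a cluster task starts two task processes whose successful attempts ran for 60 s and 90 s, then $T_{\mathrm{total}} = 60 + 90 = 150$ s of Slot time.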
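In step S104, the cluster resources consumed by the cluster task during execution are determined according to the total resource occupation time and the preset unit resource.
Because the number of machines in a Hadoop cluster is limited and the number of Slots that can be configured on each machine is fixed, the total Map Task and Reduce Task running time the cluster can provide each day is also fixed. The method provided by the embodiments of the present invention can therefore determine the cluster resources occupied by each cluster task during execution, which makes it easier to track the resources consumed by the cluster tasks computed in the cluster each day, to analyze them by department, user, or service, to find the cluster tasks with the lowest resource consumption, and to collect statistics on the resource consumption of each department or line of business. This in turn helps guide departments in optimizing their computing tasks and helps control the cost of building the cluster.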
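Since the daily Slot time of the cluster is fixed, one way to express a task's consumption is as a share of that daily capacity. The sketch below assumes every node has the same number of Slots and measures time in seconds; the function names and the idea of reporting a share are illustrative assumptions, not something the source specifies.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_slot_capacity(num_nodes: int, slots_per_node: int) -> int:
    """Slot-seconds the cluster can provide in one day (uniform nodes assumed)."""
    return num_nodes * slots_per_node * SECONDS_PER_DAY


def consumption_share(total_slot_seconds: float,
                      num_nodes: int, slots_per_node: int) -> float:
    """Fraction of one day's Slot capacity used by a cluster task."""
    return total_slot_seconds / daily_slot_capacity(num_nodes, slots_per_node)


# Hypothetical cluster: 10 nodes with 8 Slots each.
capacity = daily_slot_capacity(num_nodes=10, slots_per_node=8)
share = consumption_share(69_120, num_nodes=10, slots_per_node=8)
```

With these hypothetical numbers the cluster offers 6,912,000 slot-seconds per day, so a task that used 69,120 slot-seconds consumed about 1% of the day's capacity. Summing such shares by user or department gives the kind of per-department statistics the paragraph above describes.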
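As shown in FIG. 2, in another embodiment of the present invention, the method further includes the following steps.
In step S201, the multi-dimensional resources on each node in the cluster are counted.
In step S202, the multi-dimensional resources on each node are divided into multiple single-dimensional preset unit resources.
In this step, the multi-dimensional resources (CPU, memory, network I/O, disk I/O, and so on) on each node in the Hadoop cluster can be divided equally into multiple one-dimensional Slots. Because Map Tasks and Reduce Tasks use different amounts of resources, Slots can be further divided into two kinds, Map Slots and Reduce Slots, with the rule that a Map Task can only use a Map Slot and a Reduce Task can only use a Reduce Slot.
The embodiments of the present invention can divide the resources on each node to obtain multiple single-dimensional preset unit resources, so that the total resource occupation time of a cluster task can be determined from the time for which each task process occupies a preset unit resource.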
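The equal division of a node into Slots can be sketched as below. The `Slot` class, the `divide_node` and `can_run` functions, and the choice to model only CPU and memory are assumptions for illustration; the source also lists network I/O and disk I/O and does not say how the Map/Reduce split is chosen.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Slot:
    kind: str          # "map" or "reduce"
    cpu_cores: float   # share of the node's CPU bundled into this Slot
    memory_mb: float   # share of the node's memory bundled into this Slot


def divide_node(cpu_cores: float, memory_mb: float,
                map_slots: int, reduce_slots: int) -> List[Slot]:
    """Split a node's resources equally across its Map and Reduce Slots."""
    total = map_slots + reduce_slots
    if total <= 0:
        raise ValueError("a node needs at least one Slot")
    cpu_share = cpu_cores / total
    mem_share = memory_mb / total
    kinds = ["map"] * map_slots + ["reduce"] * reduce_slots
    return [Slot(kind, cpu_share, mem_share) for kind in kinds]


def can_run(task_kind: str, slot: Slot) -> bool:
    """A Map Task may only use a Map Slot, a Reduce Task only a Reduce Slot."""
    return task_kind == slot.kind


# Hypothetical node: 16 cores and 32 GB, split into 6 Map and 2 Reduce Slots.
slots = divide_node(cpu_cores=16, memory_mb=32768, map_slots=6, reduce_slots=2)
```

Bundling several resource dimensions into one Slot is what turns a multi-dimensional accounting problem into a one-dimensional one: consumption can then be measured simply as the time a Slot is held.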
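As shown in FIG. 3, in another embodiment of the present invention, the method further includes the following steps.
In step S301, the correspondence between preset cluster resources and task priorities is acquired.
In this step, the correspondence between preset cluster resources and task priorities may be a correspondence between threshold ranges of cluster resources and task priorities. For example, when the cluster resources fall in the threshold range of 100 to 200, the corresponding priority is level 2, and so on.
In step S302, the task priority corresponding to the cluster resources consumed by the cluster task is determined as the priority of the cluster task.
The method provided by the embodiments of the present invention can determine the priority of a cluster task according to its resource consumption, which makes it easier to schedule and control cluster tasks according to their priority.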
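A threshold-range lookup of this kind can be sketched as a small table. Only the range 100 to 200 mapping to level 2 comes from the text; the other rows, the half-open treatment of range boundaries, and the names `PRIORITY_TABLE` and `priority_for` are assumptions, and the source does not state the unit of the resource values.

```python
from typing import Optional, Sequence, Tuple

# (lower bound inclusive, upper bound exclusive, priority level)
PRIORITY_TABLE: Sequence[Tuple[float, float, int]] = (
    (0, 100, 1),             # assumed row
    (100, 200, 2),           # the example given in the text
    (200, float("inf"), 3),  # assumed row
)


def priority_for(consumed: float,
                 table: Sequence[Tuple[float, float, int]] = PRIORITY_TABLE
                 ) -> Optional[int]:
    """Return the priority whose threshold range contains `consumed`."""
    for low, high, priority in table:
        if low <= consumed < high:
            return priority
    return None  # no range matches, e.g. a negative value


level = priority_for(150)
```

Here `priority_for(150)` yields level 2, matching the example in the text. Whether a higher level means the task is scheduled earlier or later is left open by the source.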
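As shown in FIG. 4, in another embodiment of the present invention, an apparatus for determining the resources consumed by a task is provided, including: a first acquiring module 401, a calculation module 402, a first statistics module 403, and a first determining module 404.
The first acquiring module 401 is configured to acquire a task record of a cluster task, the task record including the task processes started when the task is executed.
In this embodiment of the present invention, the first acquiring module 401 optionally includes a second acquiring submodule, configured to acquire the task record of the cluster task through a preset interface in a load-balanced manner.
The calculation module 402 is configured to calculate the resource occupation time for which each task process occupies a preset unit resource.
In this embodiment of the present invention, the calculation module includes:
a first acquiring submodule, configured to acquire, for each task process, the attempt processes started by that task process;
a statistics submodule, configured to count, when a successfully run attempt process exists, the resource occupation time for which the successfully run attempt process occupies the preset unit resource.
The first statistics module 403 is configured to count the total resource occupation time of the preset unit resources occupied by the multiple task processes started by the cluster task.
The first determining module 404 is configured to determine, according to the total resource occupation time and the preset unit resource, the cluster resources consumed by the cluster task during execution.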
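In another embodiment of the present invention, the apparatus further includes a second statistics module and a dividing module.
The second statistics module is configured to count the multi-dimensional resources on each node in the cluster.
The dividing module is configured to divide the multi-dimensional resources on each node into multiple single-dimensional preset unit resources.
In another embodiment of the present invention, the apparatus further includes a second acquiring module and a second determining module.
The second acquiring module is configured to acquire the correspondence between preset cluster resources and task priorities.
The second determining module is configured to determine the task priority corresponding to the cluster resources consumed by the cluster task as the priority of the cluster task.
An embodiment of the present invention further provides a server, which includes some or all of the modules of the apparatus for determining the resources consumed by a task provided by the embodiment shown in FIG. 4.
An embodiment of the present invention further provides a non-volatile computer readable storage medium. The non-volatile computer readable storage medium can store computer instructions, and the computer instructions can implement some or all of the steps in the various implementations of the method for determining the resources consumed by a task provided by the embodiments shown in FIG. 1 to FIG. 3.
Other embodiments of the present invention will readily occur to those skilled in the art after considering the specification and practicing the invention disclosed here. This application is intended to cover any variations, uses, or adaptations of the present invention that follow its general principles and include common knowledge or customary technical means in the art that are not disclosed in the present invention. The specification and embodiments are to be regarded as exemplary only, and the true scope and spirit of the present invention are indicated by the appended claims.
It should be understood that the present invention is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes can be made without departing from its scope. The scope of the present invention is limited only by the appended claims.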

Claims (10)

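  1. A method for determining the resources consumed by a task, characterized by including:
    acquiring a task record of a cluster task, the task record including the task processes started when the task is executed;
    calculating the resource occupation time for which each task process occupies a preset unit resource;
    counting the total resource occupation time of the preset unit resources occupied by the multiple task processes started by the cluster task;
    determining, according to the total resource occupation time and the preset unit resource, the cluster resources consumed by the cluster task during execution.
  2. The method for determining the resources consumed by a task according to claim 1, characterized in that the method further includes:
    counting the multi-dimensional resources on each node in the cluster;
    dividing the multi-dimensional resources on each node into multiple single-dimensional preset unit resources.
  3. The method for determining the resources consumed by a task according to claim 1, characterized in that the method further includes:
    acquiring the correspondence between preset cluster resources and task priorities;
    determining the task priority corresponding to the cluster resources consumed by the cluster task as the priority of the cluster task.
  4. The method for determining the resources consumed by a task according to any one of claims 1 to 3, characterized in that the task record further includes attempt processes;
    calculating the resource occupation time for which each task process occupies a preset unit resource within the corresponding process time includes:
    acquiring, for each task process, the attempt processes started by that task process;
    when a successfully run attempt process exists, counting the resource occupation time for which the successfully run attempt process occupies the preset unit resource.
  5. The method for determining the resources consumed by a task according to claim 4, characterized in that acquiring the task record of the cluster task includes:
    acquiring the task record of the cluster task through a preset interface in a load-balanced manner.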
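  6. An apparatus for determining the resources consumed by a task, characterized by including:
    a first acquiring module, configured to acquire a task record of a cluster task, the task record including the task processes started when the task is executed;
    a calculation module, configured to calculate the resource occupation time for which each task process occupies a preset unit resource;
    a first statistics module, configured to count the total resource occupation time of the preset unit resources occupied by the multiple task processes started by the cluster task;
    a first determining module, configured to determine, according to the total resource occupation time and the preset unit resource, the cluster resources consumed by the cluster task during execution.
  7. The apparatus for determining the resources consumed by a task according to claim 6, characterized in that the apparatus further includes:
    a second statistics module, configured to count the multi-dimensional resources on each node in the cluster;
    a dividing module, configured to divide the multi-dimensional resources on each node into multiple single-dimensional preset unit resources.
  8. The apparatus for determining the resources consumed by a task according to claim 6, characterized in that the apparatus further includes:
    a second acquiring module, configured to acquire the correspondence between preset cluster resources and task priorities;
    a second determining module, configured to determine the task priority corresponding to the cluster resources consumed by the cluster task as the priority of the cluster task.
  9. The apparatus for determining the resources consumed by a task according to any one of claims 6 to 8, characterized in that the task record further includes attempt processes;
    the calculation module includes:
    a first acquiring submodule, configured to acquire, for each task process, the attempt processes started by that task process;
    a statistics submodule, configured to count, when a successfully run attempt process exists, the resource occupation time for which the successfully run attempt process occupies the preset unit resource.
  10. The apparatus for determining the resources consumed by a task according to claim 9, characterized in that the first acquiring module includes:
    a second acquiring submodule, configured to acquire the task record of the cluster task through a preset interface in a load-balanced manner.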
PCT/CN2016/089272 2015-12-25 2016-07-07 Method and apparatus for determining resources consumed by a task WO2017107456A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US15/241,389 US20170185454A1 (en) 2015-12-25 2016-08-19 Method and Electronic Device for Determining Resource Consumption of Task

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510997430.X 2015-12-25
CN201510997430.XA CN105868070A (zh) 2015-12-25 2015-12-25 Method and apparatus for determining resources consumed by a task

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/241,389 Continuation US20170185454A1 (en) 2015-12-25 2016-08-19 Method and Electronic Device for Determining Resource Consumption of Task

Publications (1)

Publication Number Publication Date
WO2017107456A1 (zh) 2017-06-29

Family

ID=56624390

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/089272 WO2017107456A1 (zh) 2015-12-25 2016-07-07 Method and apparatus for determining resources consumed by a task

Country Status (2)

Country Link
CN (1) CN105868070A (zh)
WO (1) WO2017107456A1 (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
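CN108021450A (zh) * 2017-12-04 2018-05-11 北京小度信息科技有限公司 YARN-based job analysis method and apparatus
CN110599148B (zh) * 2019-09-16 2022-05-31 广州虎牙科技有限公司 Cluster data processing method and apparatus, computer cluster, and readable storage medium
CN111833022B (zh) * 2020-07-17 2021-11-09 海南大学 Task processing method and components across data, information, and knowledge modalities and dimensions
CN112749055A (zh) * 2020-12-29 2021-05-04 拉卡拉支付股份有限公司 Resource consumption metering method and apparatus, electronic device, and storage medium
CN117234711B (zh) * 2023-09-05 2024-05-07 合芯科技(苏州)有限公司 Method, system, device, and medium for dynamic allocation of Flink system resources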

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101178688A (zh) * 2007-11-29 2008-05-14 中兴通讯股份有限公司 Method and system for detecting the CPU usage of system tasks
CN103246570A (zh) * 2013-05-20 2013-08-14 百度在线网络技术(北京)有限公司 Hadoop scheduling method, system, and management node
US8560779B2 (en) * 2011-05-20 2013-10-15 International Business Machines Corporation I/O performance of data analytic workloads
CN103699433A (zh) * 2013-12-18 2014-04-02 中国科学院计算技术研究所 Method and system for dynamically adjusting the number of tasks on the Hadoop platform
CN103761146A (zh) * 2014-01-06 2014-04-30 浪潮电子信息产业股份有限公司 Method for dynamically setting the number of slots in MapReduce
US20150227394A1 (en) * 2014-02-07 2015-08-13 International Business Machines Corporation Detection of time points to voluntarily yield resources for context switching

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130290972A1 (en) * 2012-04-27 2013-10-31 Ludmila Cherkasova Workload manager for mapreduce environments
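CN103970604B (zh) * 2013-01-31 2017-05-03 国际商业机器公司 Method and apparatus for graph processing based on the MapReduce architecture
CN103455375B (zh) * 2013-01-31 2017-02-08 南京理工大学连云港研究院 Hybrid scheduling method based on load monitoring on the Hadoop cloud platform
US9183016B2 (en) * 2013-02-27 2015-11-10 Vmware, Inc. Adaptive task scheduling of Hadoop in a virtualized environment
CN104298550B (zh) * 2014-10-09 2017-11-14 南通大学 Hadoop-oriented dynamic scheduling method
CN104915407B (zh) * 2015-06-03 2018-06-12 华中科技大学 Resource scheduling method for a Hadoop multi-job environment
CN105138405B (zh) * 2015-08-06 2019-05-14 湖南大学 Method and apparatus for speculative execution of MapReduce tasks based on a list of resources to be released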

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111580951A (zh) * 2019-02-15 2020-08-25 杭州海康威视数字技术股份有限公司 Task allocation method and resource management platform
CN111580951B (zh) * 2019-02-15 2023-10-10 杭州海康威视数字技术股份有限公司 Task allocation method and resource management platform

Also Published As

Publication number Publication date
CN105868070A (zh) 2016-08-17

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16877268

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16877268

Country of ref document: EP

Kind code of ref document: A1