CN112256418B - Big data task scheduling method - Google Patents

Big data task scheduling method

Publication number
CN112256418B
Authority
CN
China
Prior art keywords
big data
task
data analysis
tasks
complexity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011157921.0A
Other languages
Chinese (zh)
Other versions
CN112256418A (en)
Inventor
胡亚军
邵若梅
孙树清
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen International Graduate School of Tsinghua University
Original Assignee
Shenzhen International Graduate School of Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-10-26
Filing date
2020-10-26
Publication date
2023-10-24
Application filed by Shenzhen International Graduate School of Tsinghua University
Priority to CN202011157921.0A
Publication of CN112256418A
Application granted
Publication of CN112256418B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5066 Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a big data task scheduling method comprising the following steps: S1, dividing a plurality of big data analysis tasks into a plurality of priorities, placing the big data analysis tasks with the same priority into the same group, and determining the complexity of each big data analysis task in each task group; S2, constructing, in a Hadoop computing cluster, a task scheduling subprogram based on a neural network using a cyclic scheduling learning algorithm, the task scheduling subprogram allocating the computing resources of the Hadoop computing cluster to each big data analysis task according to priority and complexity. The invention enables the computing cluster to reach an optimal running state during big data analysis, solves the problem of computing tasks excessively preempting one another's resources, and ensures that computing resources are fully utilized by reclaiming the computing resources of the Hadoop cluster in a timely manner.

Description

Big data task scheduling method
Technical Field
The invention relates to the field of big data intelligent processing methods, in particular to a big data task scheduling method.
Background
As the world enters the 5G era, data is increasingly becoming a gold mine for enterprises. Extracting value from this gold mine requires big data analysis technology: the computing power of a server cluster is used to produce various data reports, so that the related business can be understood intuitively and clearly through those reports. As data volume grows from the GB level to the TB level and even the PB level, a very large big data cluster is required to meet the analysis demand, and the number of analysis requirements likewise grows from a few to tens and then hundreds.
Currently, in the field of big data analysis, non-sensitive user behavior data must be collected where legally permitted, and this TB-level or even PB-level data is analyzed and learned from using big data technology, which calls for big data analysis technology from the Hadoop ecosystem. Because the business requires big data analysis along every dimension each day, most analysis tasks have a time dimension, such as month, week, day, or hour. The larger the time dimension, the more data must be analyzed at once, and the more computing resources are needed to obtain the analysis result within a given time.
In the prior art, a Hadoop computing cluster is started and prest technology is used to trigger each computing task after a specific time point each day. This approach has several disadvantages. On the one hand, computing tasks preempt one another's resources, and some analysis tasks ultimately fail to produce results because of insufficient computing resources. On the other hand, a cluster of fixed size is started, the analysis tasks generally begin running in the early morning, and the results are needed in the morning, so the cluster runs at almost full load for one period but is idle more than half the time, which wastes resources. Moreover, hundreds of tasks each need resources in order to run. If the tasks are not differentiated, some relatively important tasks cannot produce results within the expected time, while less important or less urgent tasks obtain more resources and finish quickly, which causes great trouble and inconvenience for big data analysis.
Disclosure of Invention
The invention aims to provide a big data task scheduling method that solves the prior-art problem of poor utilization of computing resources during big data task analysis and completes the most big data analysis work with the fewest machines.
To achieve this purpose, the technical scheme adopted by the invention is as follows:
A big data task scheduling method comprises the following steps:
S1, dividing a plurality of big data analysis tasks into a plurality of priorities according to their importance, placing the big data analysis tasks with the same priority into the same group to obtain a plurality of task groups, and determining the complexity of each big data analysis task in each task group;
S2, constructing, in a Hadoop computing cluster, a task scheduling subprogram based on a neural network using a cyclic scheduling learning algorithm, the task scheduling subprogram allocating the computing resources of the Hadoop computing cluster to each big data analysis task for task analysis, wherein the allocation process of the task scheduling subprogram is as follows:
allocating computing resources to the task groups according to priority, wherein the amount of computing resources allocated decreases in order of priority from high to low;
in each task group, according to the complexity of each big data analysis task, the big data analysis tasks whose complexity is greater than a preset threshold are each analyzed with exclusive use of the computing resources allocated to the group, and after those tasks have been analyzed, the remaining big data analysis tasks are analyzed using the same allocated computing resources.
Optionally, in some embodiments:
In the big data task scheduling method, in step S1, the priorities of the big data analysis tasks are divided according to their business importance: the higher the importance, the higher the priority.
In the big data task scheduling method, in step S1, the priorities of the big data analysis tasks are divided according to the importance of their analysis conclusions: the higher the importance, the higher the priority.
In the big data task scheduling method, in step S1, the complexity of each big data analysis task is determined according to the amount of computing resources the task theoretically needs to occupy in order to complete its analysis within the same time period: the more computing resources occupied, the higher the complexity.
In the big data task scheduling method, in step S1, the complexity of each big data analysis task is determined according to the time complexity and space complexity of the code required to complete the analysis and the total amount of data that must be read.
In the big data task scheduling method, in each task group in step S2, the big data analysis tasks whose complexity is greater than the preset threshold are analyzed one after another in serial order, each with exclusive use of the computing resources allocated to the group; after those tasks have been analyzed, the remaining big data analysis tasks are analyzed using the allocated computing resources in serial or in parallel.
In the big data task scheduling method, in step S2, after all big data analysis tasks in a task group have been analyzed, the computing resources allocated to that group are released and used for analyzing other big data analysis tasks.
In the big data task scheduling method, in step S2, after each big data analysis task has been analyzed using computing resources in the Hadoop computing cluster, the task's execution status is fed back to the task scheduling subprogram, which learns from it to produce an updated task scheduling subprogram for subsequent computing resource allocation.
In the mobile phone field, an accurate understanding of users must be obtained in an effective way, so non-sensitive user behavior data is collected where legally permitted, and this TB-level or even PB-level data is analyzed and learned from using big data technology, which calls for big data analysis technology from the Hadoop ecosystem. Because big data analysis along every dimension is performed almost every day, most analysis tasks have a time dimension, such as month, week, day, or hour. The larger the time dimension, the more data must be analyzed at once, and the more computing resources are needed if the analysis result is required within a given time.
In the past, a Hadoop computing cluster was started and prest technology was used to trigger each computing task after a specific time point each day. This approach has several disadvantages. On the one hand, computing tasks preempt one another's resources, and some analysis tasks ultimately fail to produce results because of insufficient computing resources. On the other hand, a cluster of fixed size is started, the analysis tasks generally begin running in the early morning, and the results are needed by 8 a.m., so the cluster runs at almost full load for one period but is idle more than half the time, which wastes resources. Moreover, hundreds of tasks each need resources in order to run. Because the tasks were previously not differentiated, some relatively important tasks could not produce results within the expected time, while less important or less urgent tasks obtained more resources and finished quickly, which caused great trouble and inconvenience for reviewing and analyzing daily operations.
In the invention, a priority is determined for every big data analysis task, and the task scheduling subprogram schedules the big data analysis tasks systematically and effectively, so that high-priority tasks obtain computing resources first and are not preempted by low-priority tasks. Through intelligent self-learning, the task scheduling subprogram also analyzes the running state of the cluster and of each big data analysis task, and moves tasks into and out of the serial or parallel queues according to their priority and resource requirements. This ensures that analysis tasks run effectively and that the resources of the big data cluster are used as efficiently as possible, so that the computing cluster reaches an optimal running state during big data analysis.
In the method, the big data analysis tasks are decomposed and run independently in batches in the Hadoop computing cluster. A priority is defined for each big data analysis task, and more of the cluster's computing resources are allocated to groups with higher priority. Both serial and parallel execution are supported: big data analysis tasks that are important, highly complex, and resource-hungry use the computing resources exclusively (that is, in serial mode), and once they finish, the other big data analysis tasks use the computing resources in serial or in parallel. After all big data analysis tasks of a group have finished, the allocated computing resources are reclaimed for use by other computing services. This solves the problem of computing tasks excessively preempting one another's resources, and the timely reclaiming of the Hadoop cluster's computing resources ensures that they are fully utilized.
Drawings
FIG. 1 is a flow chart of a method according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of the core algorithm of the task scheduling service according to an embodiment of the present invention.
FIG. 3 is a comparison chart of the success rates of the new and old task scheduling services according to an embodiment of the present invention.
FIG. 4 is a comparison chart of the costs of the new and old task scheduling services according to an embodiment of the present invention.
Detailed Description
The invention will be further described with reference to the drawings and examples.
As shown in FIG. 1, a big data task scheduling method includes the following steps, each of which is explained below:
S1, dividing a plurality of big data analysis tasks into a plurality of priorities according to their importance, so that each big data analysis task has its own priority, and placing the big data analysis tasks with the same priority into the same group to obtain a plurality of task groups.
In step S1, the priorities of the big data analysis tasks may be divided according to their business importance: the higher the importance, the higher the priority.
In step S1, the priorities of the big data analysis tasks may also be divided according to the importance of their analysis conclusions for guiding the business: the higher the importance, the higher the priority.
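Step S1 amounts to bucketing tasks by priority. The following is a minimal sketch of that grouping, not the patented implementation. The `Task` type, the task names, and the convention that a larger integer means a higher priority are assumptions made for illustration; the source does not specify how priorities are encoded.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    priority: int  # assumption: a larger number means a more important task


def group_by_priority(tasks):
    """Return {priority: [tasks]}, ordered from highest to lowest priority."""
    groups = defaultdict(list)
    for task in tasks:
        groups[task.priority].append(task)
    # Sort the keys so that iteration visits the most important group first.
    return dict(sorted(groups.items(), key=lambda item: item[0], reverse=True))


# Hypothetical tasks used only to exercise the function.
tasks = [
    Task("daily_report", 3),
    Task("weekly_trend", 1),
    Task("billing_audit", 3),
]
groups = group_by_priority(tasks)
```

Each resulting group can then be handed to the allocation stage as a unit.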
S2, determining the complexity of each big data analysis task in each task group.
In step S2, the complexity of each big data analysis task may be determined according to the amount of computing resources the task theoretically needs to occupy in order to complete its analysis within the same time period, for example the memory, CPU, storage, and number of machines required: the more computing resources occupied, the higher the complexity.
In step S2, the complexity of each big data analysis task may also be determined according to the time complexity and space complexity of the code required to complete the analysis and the total amount of data that must be read.
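The source does not give a formula for complexity; it only states that a task needing more resources in the same time period is more complex. One way to turn the listed resource dimensions into a single comparable number is a weighted sum, sketched below. The `ResourceEstimate` type, the `WEIGHTS` values, and the example figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResourceEstimate:
    memory_gb: float
    cpu_cores: float
    storage_gb: float
    machines: int


# Hypothetical weights: the source names these dimensions but not their
# relative importance, so any real deployment would have to calibrate them.
WEIGHTS = {"memory_gb": 1.0, "cpu_cores": 4.0, "storage_gb": 0.01, "machines": 16.0}


def complexity_score(estimate):
    """Weighted sum of the resources a task is expected to occupy."""
    return (
        WEIGHTS["memory_gb"] * estimate.memory_gb
        + WEIGHTS["cpu_cores"] * estimate.cpu_cores
        + WEIGHTS["storage_gb"] * estimate.storage_gb
        + WEIGHTS["machines"] * estimate.machines
    )


small = ResourceEstimate(memory_gb=8, cpu_cores=2, storage_gb=100, machines=1)
large = ResourceEstimate(memory_gb=64, cpu_cores=16, storage_gb=2000, machines=4)
```

With any positive weights the score is monotonic in each dimension, which is the only property the text requires.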
S3, constructing, in the Hadoop computing cluster, a task scheduling subprogram based on a neural network using a cyclic scheduling learning algorithm, the task scheduling subprogram allocating the computing resources of the Hadoop computing cluster to each big data analysis task for task analysis.
As shown in FIG. 2, the task scheduling subprogram is constructed on a neural network using the cyclic scheduling learning (CSL) algorithm. The task scheduling subprogram intelligently adjusts the grouping of the big data analysis tasks and allocates resources to them according to the predefined task priorities, then learns from how the tasks actually executed and adjusts itself accordingly. When the tasks next run, the updated task scheduling subprogram again adjusts the grouping and allocates resources. Through this cycle of learning and adjustment, resource use is eventually maximized while the big data analysis tasks still produce the required results as expected. On the one hand this improves the efficiency of big data analysis (as shown in FIG. 3, the analysis time is reduced and efficiency improves); on the other hand it saves computing cost (as shown in FIG. 4).
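The allocate, run, learn, and adjust cycle can be illustrated with a toy feedback loop. The sketch below deliberately replaces the neural network with a simple multiplicative update rule, because the source does not disclose the network's structure. The function names, the `grow`, `shrink`, and `low_water` parameters, and the simulated success condition are assumptions.

```python
def adjust_allocation(current, succeeded, peak_utilization,
                      grow=1.5, shrink=0.9, low_water=0.6):
    """Return the allocation to try on the next cycle.

    Stand-in for the learned scheduler: grow after a resource failure,
    shrink when the task finished with plenty of headroom, else keep.
    """
    if not succeeded:
        return current * grow
    if peak_utilization < low_water:
        return current * shrink
    return current


def run_cycles(initial, demand, cycles):
    """Simulate repeated runs; a run succeeds when allocation >= demand."""
    allocation = initial
    history = []
    for _ in range(cycles):
        succeeded = allocation >= demand
        utilization = min(1.0, demand / allocation)
        history.append((allocation, succeeded))
        allocation = adjust_allocation(allocation, succeeded, utilization)
    return history


# Start under-provisioned: the loop grows the allocation until runs succeed.
history = run_cycles(initial=4.0, demand=10.0, cycles=5)
```

An over-provisioned start shrinks in the same way, which mirrors the stated goal of maximizing resource use while still producing results.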
In the invention, a data warehouse and a task warehouse are also built in the Hadoop computing cluster. The data warehouse, also called a data lake, collects the various data of the current business systems, including structured data (such as MySQL databases) and unstructured data (such as pictures, videos, and log files). The task warehouse records the detailed information of all current data analysis tasks. It can be stored in a common relational database (such as MySQL or SQL Server), and an administrator can add, delete, modify, and query its entries at any time.
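A task warehouse of this kind is an ordinary relational table with create, read, update, and delete operations. The sketch below uses Python's built-in `sqlite3` module as a stand-in for the MySQL or SQL Server database named in the text. The `tasks` table and its columns are hypothetical, since the source does not give a schema.

```python
import sqlite3

# In-memory database as a stand-in for the relational task warehouse.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tasks ("
    "name TEXT PRIMARY KEY, "
    "priority INTEGER NOT NULL, "
    "complexity REAL NOT NULL)"
)

# Add two hypothetical tasks.
conn.execute("INSERT INTO tasks VALUES (?, ?, ?)", ("daily_report", 3, 7.5))
conn.execute("INSERT INTO tasks VALUES (?, ?, ?)", ("weekly_trend", 1, 2.0))

# Modify one task's priority and delete the other.
conn.execute("UPDATE tasks SET priority = ? WHERE name = ?", (2, "weekly_trend"))
conn.execute("DELETE FROM tasks WHERE name = ?", ("daily_report",))

# Query what remains, highest priority first.
rows = conn.execute(
    "SELECT name, priority, complexity FROM tasks ORDER BY priority DESC"
).fetchall()
```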
In the invention, the Hadoop computing cluster supports various offline computing frameworks, such as Hive, Presto, and Impala.
The task scheduling subprogram is the core of the scheduling. It uses its AI capability to group the big data analysis tasks intelligently and allocate resources to them according to the tasks' various characteristics and indicators, ensuring that tasks run correctly and resources are used reasonably.
The allocation process of the task scheduling subprogram is as follows:
allocating computing resources to the task groups according to priority, wherein the amount of computing resources allocated decreases in order of priority from high to low;
in each task group, according to the complexity of each big data analysis task, the big data analysis tasks whose complexity is greater than a preset threshold are each analyzed with exclusive use of the computing resources allocated to the group, and after those tasks have been analyzed, the remaining big data analysis tasks are analyzed using the same allocated computing resources.
In the invention, in each task group, the big data analysis tasks whose complexity is greater than the preset threshold are analyzed one after another in serial order, each with exclusive use of the computing resources allocated to the group; after those tasks have been analyzed, the remaining big data analysis tasks are analyzed using the allocated computing resources in serial or in parallel.
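The two-level allocation described above can be sketched as two small functions: one that splits the cluster among groups so that shares decrease with priority, and one that orders the work inside a group. The rank-based weighting in `split_resources` is an assumption, since the source only requires that shares decrease from high to low priority. `plan_group` shows the parallel variant for the remaining tasks, and all task names and numbers are hypothetical.

```python
def split_resources(total_units, priorities):
    """Share resources among groups; higher priority gets a larger share.

    Assumption: weights are simple ranks (n for the highest priority
    down to 1 for the lowest).
    """
    ordered = sorted(priorities, reverse=True)
    weights = {p: len(ordered) - rank for rank, p in enumerate(ordered)}
    weight_sum = sum(weights.values())
    return {p: total_units * w / weight_sum for p, w in weights.items()}


def plan_group(tasks, threshold):
    """Plan execution phases for one group.

    tasks is a list of (name, complexity) pairs. Each returned phase is a
    list of task names that run together on the group's resources.
    """
    heavy = [name for name, complexity in tasks if complexity > threshold]
    light = [name for name, complexity in tasks if complexity <= threshold]
    # Each heavy task gets a phase to itself: exclusive use, in serial order.
    phases = [[name] for name in heavy]
    # The remaining tasks share one phase (the parallel variant).
    if light:
        phases.append(light)
    return phases


shares = split_resources(60, [1, 2, 3])
phases = plan_group([("a", 9), ("b", 2), ("c", 7), ("d", 1)], threshold=5)
```

Running the remaining tasks serially instead would simply give each light task its own phase.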
In the invention, after all big data analysis tasks in a task group have been analyzed, the computing resources allocated to that group are released and used for analyzing other big data analysis tasks. The release of computing resources is achieved mainly by separating the computing framework from the data: the Hive tables of the Hadoop computing cluster are external tables, and the data is stored outside the Hadoop computing cluster, so the data is unaffected when the cluster's computing resources are released.
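The release-after-completion behavior maps naturally onto a lease that is always returned, even if a task fails. The sketch below models this with a Python context manager. The `ResourcePool` class and its abstract resource units are hypothetical and do not correspond to any Hadoop API.

```python
from contextlib import contextmanager


class ResourcePool:
    """Toy model of a cluster's free computing resources."""

    def __init__(self, total_units):
        self.free_units = total_units

    @contextmanager
    def lease(self, units):
        """Reserve units for one task group and return them afterwards."""
        if units > self.free_units:
            raise RuntimeError("not enough free computing resources")
        self.free_units -= units
        try:
            yield units
        finally:
            # Runs whether the group's tasks succeeded or raised.
            self.free_units += units


pool = ResourcePool(total_units=10)
with pool.lease(6):
    during = pool.free_units  # resources held while the group runs
after = pool.free_units       # resources returned once the group is done
```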
In the invention, after each big data analysis task has been analyzed using computing resources in the Hadoop computing cluster, the task's execution status is fed back to the task scheduling subprogram, which learns from it to produce an updated task scheduling subprogram for subsequent computing resource allocation.
An excellent task scheduling system must consider not only the reliability and effectiveness of scheduling but also the room that resource scheduling leaves for resource optimization and cost saving. When a big data analysis task is created, its initial attributes are set as shown in Table 1.
Table 1. Initial attribute settings of a big data task
The task scheduling subprogram predicts the initial resources to allocate, and the results of running the big data analysis tasks are fed into the neural network of the task scheduling subprogram. Because whether a big data analysis task runs successfully also depends on its code, the logs of failed tasks must be analyzed so that failures caused by errors in the task code are filtered out. The neural network of the task scheduling subprogram continuously learns and adjusts itself, which improves the effectiveness of resource allocation and task scheduling. The neural network is not fixed: it is a dynamic self-learning process in which the layers and parameters are adjusted through continuous data input, so the adaptability of the task scheduling subprogram keeps improving and the correct-run rate of big data analysis tasks reaches 99.9% (excluding tasks that cannot run correctly because of errors in the task code). By reclaiming computing resources in a timely manner, the computing cost is reduced by 50% compared with the prior art while supporting the same big data analysis tasks.
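The filtering step keeps code-related failures out of the scheduler's training data. The sketch below illustrates the idea with a naive substring check. The marker list, the run record layout, and the function names are assumptions for illustration only. The source does not describe an automatic classifier, and real logs would need far more careful analysis.

```python
# Hypothetical markers treated as evidence of a task-code error.
CODE_ERROR_MARKERS = ("ParseException", "SemanticException", "SyntaxError")


def is_code_error(log_text):
    """Naive heuristic: does the log mention a known code-error marker?"""
    return any(marker in log_text for marker in CODE_ERROR_MARKERS)


def training_samples(runs):
    """Keep successful runs and failures not attributed to task code."""
    return [
        run for run in runs
        if run["succeeded"] or not is_code_error(run["log"])
    ]


# Hypothetical run records.
runs = [
    {"task": "a", "succeeded": True, "log": ""},
    {"task": "b", "succeeded": False, "log": "FAILED: ParseException near line 1"},
    {"task": "c", "succeeded": False, "log": "Container killed: memory limit exceeded"},
]
kept = [run["task"] for run in training_samples(runs)]
```

Only runs whose outcome reflects resource allocation, here tasks "a" and "c", are passed on for learning.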
By providing the cyclic scheduling learning (CSL) algorithm, task execution efficiency is greatly improved, and the cluster size is halved when supporting analysis tasks of the same scale, which greatly reduces computing cost. In addition, the task weight is added as an important dimension, ensuring that core analysis tasks obtain resources first and that important analysis results are produced quickly. The system still has some shortcomings. For example, when an analysis task fails because of a code error and the system cannot accurately identify that cause from the error log, the model adjustment of the CSL algorithm is disturbed and the accuracy of the task scheduling system suffers. At present the cause of a failed run is determined manually, and failures not caused by cluster computing resources are filtered out. Furthermore, the time a big data analysis task takes is affected not only by the cluster's resource allocation but also by changes in data volume, the quality of the analysis code, and other conditions, so more influencing factors need to be considered. These are the problems to be optimized and solved next.
The embodiments described above are merely preferred embodiments of the present invention and are not intended to limit its spirit and scope. Various modifications and improvements made by those skilled in the art to the technical solutions of the present invention shall fall within its protection scope, and the technical content claimed is fully set out in the claims.

Claims (7)

1. A big data task scheduling method, characterized in that the method comprises the following steps:
S1, dividing a plurality of big data analysis tasks into a plurality of priorities according to their importance, placing the big data analysis tasks with the same priority into the same group to obtain a plurality of task groups, and determining the complexity of each big data analysis task in each task group;
S2, constructing, in a Hadoop computing cluster, a task scheduling subprogram based on a neural network using a cyclic scheduling learning algorithm, the task scheduling subprogram allocating the computing resources of the Hadoop computing cluster to each big data analysis task for task analysis, wherein the allocation process of the task scheduling subprogram is as follows:
allocating computing resources to the task groups according to priority, wherein the amount of computing resources allocated decreases in order of priority from high to low;
in each task group, according to the complexity of each big data analysis task, the big data analysis tasks whose complexity is greater than a preset threshold are each analyzed with exclusive use of the computing resources allocated to the group, and after those tasks have been analyzed, the remaining big data analysis tasks are analyzed using the same allocated computing resources, wherein,
in each task group, the big data analysis tasks whose complexity is greater than the preset threshold are analyzed one after another in serial order, each with exclusive use of the allocated computing resources, and after those tasks have been analyzed, the remaining big data analysis tasks are analyzed using the allocated computing resources in serial or in parallel.
2. The big data task scheduling method according to claim 1, wherein: in step S1, the priorities of the big data analysis tasks are divided according to their business importance, and the higher the importance, the higher the priority.
3. The big data task scheduling method according to claim 1, wherein: in step S1, the priorities of the big data analysis tasks are divided according to the importance of their analysis conclusions, and the higher the importance, the higher the priority.
4. The big data task scheduling method according to claim 1, wherein: in step S1, the complexity of each big data analysis task is determined according to the amount of computing resources that need to be occupied in theory when the analysis of each big data analysis task is completed within the same time period, and the greater the amount of computing resources occupied, the higher the complexity.
5. The big data task scheduling method according to claim 1, wherein: in step S1, the complexity of each big data analysis task is determined according to the time complexity and space complexity of the code required to complete the analysis and the total amount of data that must be read.
6. The big data task scheduling method according to claim 1, wherein: in step S2, after all big data analysis tasks in a task group have been analyzed, the computing resources allocated to that group are released and used for analyzing other big data analysis tasks.
7. The big data task scheduling method according to claim 1, wherein: in step S2, after analysis of each big data analysis task is completed by using computing resources in the Hadoop computing cluster, the task execution condition is fed back to the task scheduling sub-program, and the task scheduling sub-program performs self-learning according to the task execution condition, so that a new task scheduling sub-program is obtained for subsequent computing resource allocation.
CN202011157921.0A 2020-10-26 2020-10-26 Big data task scheduling method Active CN112256418B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011157921.0A CN112256418B (en) 2020-10-26 2020-10-26 Big data task scheduling method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011157921.0A CN112256418B (en) 2020-10-26 2020-10-26 Big data task scheduling method

Publications (2)

Publication Number Publication Date
CN112256418A CN112256418A (en) 2021-01-22
CN112256418B CN112256418B (en) 2023-10-24

Family

ID=74262449

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011157921.0A Active CN112256418B (en) 2020-10-26 2020-10-26 Big data task scheduling method

Country Status (1)

Country Link
CN (1) CN112256418B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113434153B (en) * 2021-06-04 2023-03-21 郑州阿帕斯数云信息科技有限公司 Attribution method and attribution device for application installation

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106210156A (en) * 2015-04-29 2016-12-07 阿里巴巴集团控股有限公司 The processing method of parsing task, device and server
CN107273200A (en) * 2017-06-22 2017-10-20 中国科学院计算技术研究所 A kind of method for scheduling task stored for isomery
CN109992404A (en) * 2017-12-31 2019-07-09 中国移动通信集团湖北有限公司 PC cluster resource regulating method, device, equipment and medium
CN110597626A (en) * 2019-08-23 2019-12-20 第四范式(北京)技术有限公司 Method, device and system for allocating resources and tasks in distributed system
WO2020206705A1 (en) * 2019-04-10 2020-10-15 山东科技大学 Cluster node load state prediction-based job scheduling method

Also Published As

Publication number Publication date
CN112256418A (en) 2021-01-22

Similar Documents

Publication Publication Date Title
WO2021213293A1 (en) Ubiquitous operating system oriented toward group intelligence perception
CN106802826B (en) Service processing method and device based on thread pool
US11847103B2 (en) Data migration using customizable database consolidation rules
US20200104377A1 (en) Rules Based Scheduling and Migration of Databases Using Complexity and Weight
CN111125444A (en) Big data task scheduling management method, device, equipment and storage medium
CN105978960A (en) Cloud scheduling system and method based on mass video structured processing
CN106354817B (en) Log processing method and device
CN110928655A (en) Task processing method and device
CN108681598B (en) Automatic task rerun method, system, computer equipment and storage medium
CN112685153A (en) Micro-service scheduling method and device and electronic equipment
CN111797604A (en) Report generation method, device, equipment and computer readable storage medium
CN109885456A (en) A kind of polymorphic type event of failure prediction technique and device based on system log cluster
CN106383746A (en) Configuration parameter determination method and apparatus of big data processing system
CN109669975B (en) Industrial big data processing system and method
CN112256418B (en) Big data task scheduling method
CN103198099A (en) Cloud-based data mining application method facing telecommunication service
CN115543577A (en) Kubernetes resource scheduling optimization method based on covariates, storage medium and equipment
CN107506381A (en) A kind of big data distributed scheduling analysis method, system and device and storage medium
Bommala et al. Machine learning job failure analysis and prediction model for the cloud environment
CN116483546B (en) Distributed training task scheduling method, device, equipment and storage medium
CN111625414A (en) Method for realizing automatic scheduling monitoring system of data conversion integration software
CN110928659A (en) Numerical value pool system remote multi-platform access method with self-adaptive function
US8229946B1 (en) Business rules application parallel processing system
CN115454718A (en) Automatic database backup file validity detection method
CN112965793B (en) Identification analysis data-oriented data warehouse task scheduling method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant