CN112101773B - Multi-agent system task scheduling method and system for process industry - Google Patents
- Publication number
- CN112101773B (application number CN202010948695.1A)
- Authority
- CN
- China
- Prior art keywords
- agent
- algorithm
- agents
- task scheduling
- qlearning
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0631—Resource planning, allocation, distributing or scheduling for enterprises or organisations
- G06Q10/06311—Scheduling, planning or task assignment for a person or group
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q10/00—Administration; Management
- G06Q10/06—Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
- G06Q10/063—Operations research, analysis or management
- G06Q10/0631—Resource planning, allocation, distributing or scheduling for enterprises or organisations
- G06Q10/06316—Sequencing of tasks or work
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/04—Manufacturing
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02P—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
- Y02P90/00—Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
- Y02P90/30—Computing systems specially adapted for manufacturing
Abstract
The disclosure builds a task scheduling model that integrates multiple production units, based on multi-agent system (MAS) technology and tailored to the characteristics of process-industry manufacturing, and applies the TS_QLEARNING algorithm to this model to form a task control system for the process industry.
Description
Technical Field
The disclosure relates to the technical field of control of process industry, in particular to a multi-agent system task scheduling method and system for the process industry.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
Modern industry is increasingly dependent on data, and the volume of data in industrial production is beginning to reach the petabyte (PB) level, a qualitative change compared with past production data. In recent years, research on multi-Agent artificial intelligence has shown that multi-Agent system theory provides feasible technical support for realizing intelligent manufacturing systems, and it has become one of the research hotspots in the manufacturing field.
On the one hand, multi-Agent control models for process-industry manufacturing currently fall into three categories: centralized, hierarchical, and distributed. The inventors have found that the centralized approach has low fault tolerance and is prone to security problems, because a failure of the central control computer brings down the whole system; in the hierarchical approach, the upper and lower layers are in a subordinate relationship and strongly depend on each other; compared with centralized and hierarchical systems, the subsystems of a distributed system are relatively independent and each can achieve its own local optimum, but overall optimization of the entire system is difficult to achieve and places higher demands on the network and on computing power.
On the other hand, task scheduling is an important part of a multi-Agent system, and a reasonable production task scheduling scheme plays an important role in improving an enterprise's production efficiency. Job-shop scheduling, as a production task scheduling problem, is strongly NP-hard. The inventors have found that many researchers apply heuristic algorithms to this problem, but these methods have defects; for example, the Q-learning algorithm easily falls into local optima and has low computational efficiency when solving large-scale task scheduling.
Disclosure of Invention
To solve these problems, the present disclosure provides a multi-agent system task scheduling method and system for the process industry. The scheme applies an improved Q-learning algorithm to task scheduling of the multi-agent system in the process industry, so that better job sequences can be obtained, the resources of the multi-agent system are scheduled more reasonably, and the idle time of the multi-agent system is reduced.
According to a first aspect of the disclosed embodiments, there is provided a multi-agent system task scheduling method for the process industry, including:
constructing an intelligent cooperative control model oriented to the whole process, the model consisting of a system Agent connected through a bus to the Agents of each production stage;
acquiring an initial job sequence of a task, the field Agents required to complete each job, and the processing time each field Agent requires to execute each job;
solving for the job sequence with the shortest total idle time of the field Agents by using the TS_QLEARNING algorithm;
and performing, by the intelligent cooperative control model, task scheduling according to the job sequence.
Furthermore, the intelligent cooperative control model has a layered structure: the upper-layer system Agent is used for unified resource scheduling and task allocation, and each lower-layer workshop Agent comprises a workshop control Agent and a plurality of field Agents. The system Agent issues tasks through the bus, task decomposition is achieved through interaction among workshops, the workshop control Agents allocate the tasks to the field Agents, and the field Agents cooperate with each other to complete the tasks.
Further, the task scheduling method searches for an optimal job sequence by minimizing the sum of all the on-site Agent idle times.
Further, the scheduling method needs to follow the following constraints:
each field Agent can execute only one operation at a time; each operation of a task can be executed by only one field Agent at a time; once an operation has started on a machine, it cannot be interrupted; no other operation of a task can be performed until the previous operation is completed; task operations can only be performed by machines of the same type; and the processing time on each field Agent and the number of available field Agents are known.
Further, the task includes a number of jobs that require processing with a number of field agents.
Further, the TS_QLEARNING algorithm combines a tabu search algorithm with a Q-learning algorithm: a preset number of initial solutions of the job sequence are obtained through the tabu search algorithm and stored in a tabu list, and the QLEARNING algorithm then carries out the optimization based on the initial solutions in the tabu list to obtain the optimal job sequence.
Furthermore, in the optimization process of the TS_QLEARNING algorithm, the idle time is used as the feedback signal, and the complete job sequence and the corresponding total idle time are obtained through iterative computation.
According to a second aspect of the disclosed embodiments, there is provided a multi-agent system task scheduling system for the process industry, comprising:
a model building module, used for building an intelligent cooperative control model oriented to the whole process, the model consisting of a system Agent connected through a bus to the Agents of each production stage;
a data acquisition module, used for acquiring the Agents required by different tasks and the processing time data required by each Agent;
and an optimal job sequence acquisition module, used for solving for an optimal job sequence by using the TS_QLEARNING algorithm, the intelligent cooperative control model performing task scheduling according to the job sequence.
According to a third aspect of the embodiments of the present disclosure, there is provided an electronic device including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the multi-agent system task scheduling method for the process industry when executing the program.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer readable storage medium having stored thereon a computer program which when executed by a processor implements the multi-agent system task scheduling method for the process industry.
Compared with the prior art, the beneficial effects of the present disclosure are:
(1) According to the results of simulation experiments, the TS_QLEARNING algorithm has clear advantages over the QLEARNING algorithm in task scheduling and can obtain better job sequences, so that the resources of the multi-agent system are scheduled more reasonably.
(2) Because the tabu search algorithm converges quickly and can be run before QLEARNING training without consuming much time, the tabu list can be reused as a memory table of initial solutions. This solves the problem that the QLEARNING algorithm produces poor results in the early training stage, when the environment is still unfamiliar.
(3) In actual production there are always some urgent tasks, which the QLEARNING algorithm handles poorly. TS_QLEARNING can adjust the length of the tabu list by setting a special aspiration (amnesty) criterion, thereby enabling quick handling of urgent tasks.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure, illustrate and explain the exemplary embodiments of the disclosure and together with the description serve to explain the disclosure, and do not constitute an undue limitation on the disclosure.
FIG. 1 is a process-industry multi-Agent hierarchical control model as described in one embodiment of the present disclosure;
FIG. 2 is an example of a task scheduling Gantt chart according to one embodiment of the present disclosure;
FIG. 3 is a flowchart of the TS_QLEARNING algorithm described in the first embodiment of the present disclosure.
Detailed Description
The disclosure is further described below with reference to the drawings and examples.
It should be noted that the following detailed description is illustrative and is intended to provide further explanation of the present disclosure. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments in accordance with the present disclosure. As used herein, the singular is also intended to include the plural unless the context clearly indicates otherwise, and furthermore, it is to be understood that the terms "comprises" and/or "comprising" when used in this specification are taken to specify the presence of stated features, steps, operations, devices, components, and/or combinations thereof.
The process industry is an important support for the development of national economy. Modern process industry integrated manufacturing systems are one of the important competing technologies that increase the competitiveness of process enterprises. The selection of a proper model to realize intelligent optimization control of the production process is always the key point of research.
The rapid development of artificial intelligence has also accelerated the level of intelligence in the process industry in the Industry 4.0 era. In Industry 4.0, the cyber-physical production system of the intelligent plant is the core of this transformation, from intelligent materials entering the intelligent plant to intelligent products leaving it. It is a dynamically configured production method. Workstations can access real-time information through the network, automatically switch production modes and replace production materials according to that information, and thereby adjust to the best-matching production operation mode.
Given these characteristics of Industry 4.0, current multi-agent control models have difficulty achieving global optimization, and finding a suitable algorithm to achieve collaboration between agents is also a major problem. The disclosure provides an intelligent cooperative control model for the whole process, which consists of a system Agent and the Agents of each production stage. The model has a layered structure: the upper-layer system Agent is mainly used for unified resource scheduling and task allocation; each workshop, whose production steps use multiple Agents, is equivalent to a small control system in which the Agents cooperate with each other to complete tasks (as shown in FIG. 1); and the Agents can communicate with each other. The tasks scheduled by the scheduling control Agent realize global cooperative control of the intelligent manufacturing process. For such a global optimization model, how to implement task scheduling for the multi-agent system is a problem that must be addressed.
Embodiment one:
An object of the present embodiment is to provide a multi-agent system task scheduling method for the process industry.
Task scheduling in the manufacturing process refers to planning, scheduling, and arranging the various production tasks in space, time, and resources while meeting the process requirements and the constraints of the existing production equipment. Because different products, or multiple processes of the same product, need to share resources and equipment in the process industry, production must be planned reasonably by an algorithm. The aim of production task scheduling is to plan and configure resources reasonably, determine the processing time and sequence of products on the different equipment, and improve production efficiency. Task scheduling for a process-industry manufacturing process can be described as n jobs being processed on m machines. Each job contains several production operations that must be performed on different machines. All jobs pass through the machines in the same processing order; no priority constraint exists between the operations of different jobs; operations cannot be interrupted, and each machine processes one operation at a time; every job follows the same machining path through all machines; and the job sequence is arbitrary. The aim is to find a job sequence that minimizes the sum of the machine idle times, subject to the following constraints and assumptions:
(1) Each machine can perform only one operation at a time.
(2) Each operation of a job can be performed by only one machine at a time.
(3) Once an operation has started on a machine, it cannot be interrupted.
(4) No other operation of a job can be performed until the previous operation is completed.
(5) There is no alternative route, i.e., each job operation can be performed by only one type of machine, and the operation processing times and the number of available machines are known in advance.
Based on the above constraints, the present embodiment proposes a multi-agent system task scheduling method for the process industry, including:
constructing an intelligent cooperative control model oriented to the whole process, the model consisting of a system Agent connected through a bus to the Agents of each production stage;
acquiring an initial job sequence of a task, the field Agents required to complete each job, and the processing time each field Agent requires to execute each job;
solving for the job sequence with the shortest total idle time of the field Agents by using the TS_QLEARNING algorithm;
and performing, by the intelligent cooperative control model, task scheduling according to the job sequence.
Furthermore, the intelligent cooperative control model has a layered structure: the upper-layer system Agent is used for unified resource scheduling and task allocation, and each lower-layer workshop Agent comprises a workshop control Agent and a plurality of field Agents. The system Agent issues tasks through the bus, task decomposition is achieved through interaction among workshops, the workshop control Agents allocate the tasks to the field Agents, and the field Agents cooperate with each other to complete the tasks.
Further, the task includes a number of jobs that require processing with a number of field agents.
Further, the field Agent in this embodiment represents a machine that executes a job. Assuming that there are 4 jobs, each of which must be processed on 3 unrelated machines, the resulting job sequence is assumed to be $\{J_A, J_B, J_C, J_D\}$.
The time that job $J_i$ needs to spend on machine $m_j$ is denoted $J_i m_j$; FIG. 2 shows the Gantt chart for this job sequence.
In FIG. 2, $x_i$ ($i \in \{1,2,3,4\}$) represents the idle time of the different machines during the job sequence, and $X = \sum_{i=1}^{4} x_i$ represents the total idle time of the three machines. The task scheduling optimization objective is to find a job sequence that minimizes $X$. In this disclosure, idle time outside the Gantt chart is defined as "external machine idle time", such as $x_1, x_2$; all other idle time is defined as "internal machine idle time", e.g., $x_3, x_4$.
In addition, to reflect the time the sequence actually needs to complete all jobs, the maximum completion time (makespan, or $C_{max}$) must be calculated as a check on the rationality of the result: if the final result reduces the machine idle time but in practice needs more time to complete the job sequence, the result is clearly unreasonable. The general task scheduling problem is expressed as $n/m/C_{max}$ and involves n jobs, each of which must be processed on m machines. In Python we define two-dimensional matrices p and C with n rows and m columns. From the dataset and the job sequence $\{J_1, J_2, \dots, J_n\}$ we obtain the processing time $p(J_i, m_j)$ of job $J_i$ on machine $m_j$, and then calculate the completion time $C(J_i, m_j)$ as follows:
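$C(J_1, m_1) = p(J_1, m_1)$
$C(J_i, m_1) = C(J_{i-1}, m_1) + p(J_i, m_1)$
$C(J_1, m_j) = C(J_1, m_{j-1}) + p(J_1, m_j)$
$C(J_i, m_j) = \max\{C(J_{i-1}, m_j),\ C(J_i, m_{j-1})\} + p(J_i, m_j)$
$C_{max} = C(J_n, m_m)$
where $i = 2, \dots, n$ and $j = 2, \dots, m$.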
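The recurrence above can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the function names `completion_times` and `makespan` and the small processing-time matrix are assumptions chosen for the example, with `p[job][j]` holding the processing time of a job on machine j.

```python
from typing import List, Sequence


def completion_times(p: Sequence[Sequence[float]], seq: Sequence[int]) -> List[List[float]]:
    """Return C, where C[i][j] is the completion time of the i-th job of `seq` on machine j.

    Hypothetical helper: `p[job][j]` is the processing time of `job` on machine j.
    """
    n, m = len(seq), len(p[0])
    C = [[0.0] * m for _ in range(n)]
    for i, job in enumerate(seq):
        for j in range(m):
            prev_job = C[i - 1][j] if i > 0 else 0.0      # C(J_{i-1}, m_j)
            prev_machine = C[i][j - 1] if j > 0 else 0.0  # C(J_i, m_{j-1})
            C[i][j] = max(prev_job, prev_machine) + p[job][j]
    return C


def makespan(p: Sequence[Sequence[float]], seq: Sequence[int]) -> float:
    """C_max: completion time of the last job of `seq` on the last machine."""
    return completion_times(p, seq)[-1][-1]


# Illustrative data only: 3 jobs on 3 machines.
p = [[3, 2, 4], [1, 4, 2], [2, 1, 3]]
print(makespan(p, [0, 1, 2]))  # 14.0
```

Treating the missing predecessor as 0 for the first job and the first machine reproduces the three boundary equations, so a single loop covers all four cases.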
Thus, when the job permutation is $\{J_1, J_2, \dots, J_n\}$, $C_{max}$ is the time at which the last operation of job $J_n$ is completed. Since the task scheduling problem is NP-hard, solving it consumes a large amount of resources; we therefore simply substitute the final job sequence trained by the QLEARNING or TS_QLEARNING algorithm into the above formulas to obtain $C_{max}$.
Assume that n jobs need to be executed on m unrelated machines and that an optimal job sequence is obtained through training of the algorithm. Also assume that $J_k$ is the first job in the obtained optimal job sequence; then $J_k m_l$ represents the time that job $J_k$ needs to spend on machine $m_l$. In fact, $J_k$ is the initial solution obtained by the algorithm. From this initial solution $J_k$, the value of the "external idle time" can be obtained, whereas calculating the "internal idle time" requires a complete solution.
We define the total idle time as $T$, the external idle time as $T_e$, and the internal idle time as $T_i$. The total idle time satisfies $T = T_e + T_i$, and $T_e$ is given by:
$T_e = (m-1) J_k m_1 + (m-2) J_k m_2 + \dots + J_k m_{m-1}$
From this formula we can conclude that the external idle time depends only on the initial solution, and that the larger the number of machines, the greater its impact on the total idle time.
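As a rough illustration of the split into external and internal idle time, the sketch below computes $T$, $T_e$, and $T_i$ for a given job order. It assumes (this is an interpretation, not stated explicitly in the text) that a machine's idle time is counted from time 0 until it finishes its last job, so idle time after a machine's last job is ignored. The function name `idle_times` and the data are hypothetical.

```python
from typing import Sequence, Tuple


def idle_times(p: Sequence[Sequence[float]], seq: Sequence[int]) -> Tuple[float, float, float]:
    """Return (T, T_e, T_i) for job order `seq`; p[job][j] is the time of `job` on machine j."""
    m = len(p[0])
    done = [0.0] * m  # completion time of the latest scheduled job on each machine
    busy = [0.0] * m  # accumulated processing time on each machine
    for job in seq:
        for j in range(m):
            ready = done[j - 1] if j > 0 else 0.0  # job leaves the previous machine
            done[j] = max(done[j], ready) + p[job][j]
            busy[j] += p[job][j]
    total = sum(d - b for d, b in zip(done, busy))
    first = p[seq[0]]
    # T_e = (m-1)*J_k m_1 + (m-2)*J_k m_2 + ... + J_k m_{m-1}, with 0-based index l
    external = float(sum((m - 1 - l) * first[l] for l in range(m - 1)))
    return total, external, total - external


# Illustrative data only: 2 jobs on 3 machines.
p = [[2, 1, 3], [4, 1, 2]]
print(idle_times(p, [0, 1]))  # (9.0, 5.0, 4.0)
```

In this example the second and third machines wait 2 and 3 time units for the first job (external idle time 5) and later wait 3 and 1 time units for the second job (internal idle time 4).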
A disadvantage of Q-learning-based algorithms is that the agent does not know what action to take in an unseen state; in other words, the Q-learning agent cannot evaluate unknown states, which is likely to occur in the early stages of training. To solve this problem, the present disclosure proposes a novel TS_QLEARNING algorithm that combines a tabu search (TS) algorithm and a Q-learning algorithm. In the algorithm, some better initial solutions are recorded in a tabu list. It should be emphasized that the TS_QLEARNING algorithm does not use the TS algorithm to obtain the best initial solution; instead, it treats the tabu list as a memory table to exclude some very poor initial solutions, which tend to result in very large external idle times.
Tabu search is a metaheuristic developed by Glover (1986). In each iteration, tabu search moves from the current solution to an improved solution in its neighborhood, and a tabu list can be used to forbid new solutions that share certain features with previously visited solutions, so the TS algorithm converges very quickly.
Since $J_k m_1$ has the largest coefficient in the external idle time function, it has the greatest impact on the external idle time. First, the candidate solutions are defined as $\{J_1 m_1, J_2 m_1, \dots, J_k m_1, \dots, J_n m_1\}$, and the tasks with smaller external idle times among the candidates are placed into the tabu list until the tabu list is full. The length of the tabu list is set to 1/3 of the number of tasks and can be adjusted according to actual requirements to control the range of the initial solutions.
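The role of the tabu list as a memory of promising first jobs can be illustrated with the simplified sketch below. It does not run an iterative tabu search; as a stated simplification, it ranks every job by the external idle time it would cause if scheduled first and keeps the best third. The names `external_idle_time` and `build_tabu_list` and the data are assumptions for illustration.

```python
from typing import List, Optional, Sequence


def external_idle_time(p: Sequence[Sequence[float]], job: int) -> float:
    """T_e caused by scheduling `job` first: sum of (m - l) * p(job, m_l) for l = 1..m-1."""
    m = len(p[job])
    return float(sum((m - 1 - l) * p[job][l] for l in range(m - 1)))


def build_tabu_list(p: Sequence[Sequence[float]], length: Optional[int] = None) -> List[int]:
    """Keep the candidate first jobs with the smallest external idle time.

    Simplification: a plain sort stands in for the tabu search iterations.
    The default length is one third of the number of jobs, as in the text.
    """
    n = len(p)
    if length is None:
        length = max(1, n // 3)
    ranked = sorted(range(n), key=lambda job: external_idle_time(p, job))
    return ranked[:length]


# Illustrative data only: 6 jobs on 3 machines.
p = [[5, 1, 1], [1, 1, 1], [2, 2, 2], [4, 1, 1], [1, 3, 1], [3, 3, 3]]
print(build_tabu_list(p))  # [1, 4]
```

Passing a larger `length` widens the range of admissible initial solutions, which mirrors the adjustable tabu list length described above.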
Also, in job-shop schedule training, machine time and tooling costs are used as input parameters, and job sequences are used as variable parameters. The goal is to find a suitable job order that minimizes idle time.
To accommodate reinforcement learning methods, states may reasonably be defined as job sequences, or more precisely as job priority relationships. A state change (or action) is defined as a change in a job priority relationship. Unlike Q-learning, the initial solution of TS_QLEARNING is randomly obtained from the tabu list. Notably, when TS_QLEARNING selects an initial solution, this also amounts to performing an action; likewise, after the action is performed, a reward is obtained together with the next state, and the Q-table is updated. When solving the scheduling problem, different feedback signals can be used; this disclosure adopts idle time as the reward signal, the idea being that the shorter the idle time, the better the action.
Furthermore, the TS_QLEARNING algorithm obtains the required tabu list before training, and the preferences are updated continuously as training proceeds, so that the action selection policy converges toward the quasi-optimal job sequence found; after training is completed, the final job sequence and total idle time are obtained, and $C_{max}$ is then obtained from the $C_{max}$ formulas above.
Further, to demonstrate the superiority of the solution of the present disclosure, this embodiment compares the task scheduling results of the disclosed method and the existing Q-learning algorithm on the basic scheduling benchmark instances available in OR-Library.
OR-Library is a collection of test datasets for a variety of Operations Research (OR) problems. In these instances, n jobs need to be executed on m unrelated machines; each job consists of m non-preemptive operations, and each operation uses a different machine for a given time and may wait before being processed. The library provides three types of instances, and the datasets describe the machines required for each job and the processing time of every job on each machine.
To evaluate the quality of the different algorithms, different cases were randomly selected, and the disclosed method and the Q-learning algorithm were each run 10 times to obtain an average value. The Taillard dataset contains many instances, with sizes (jobs x machines) of 20x5, 20x10, 20x15, and 20x20. We implemented the Q-learning algorithm and the TS_QLEARNING algorithm in Python and ran them on a device with an i7 CPU and 16 GB of RAM.
To make the experiment fair, the Q-learning algorithm in this embodiment uses the same number of episodes (max_episodes = 10,000), learning rate (α = 0.1), and discount factor (γ = 0.8) as the TS_QLEARNING algorithm described in this disclosure, to ensure that both algorithms run under the same conditions. For the TS_QLEARNING algorithm, the length of the tabu list is set to one third of the number of jobs in this embodiment, and the initial solution is obtained from the tabu list. Each of the two methods yields a final sequence through training, and the value of $C_{max}$ is calculated from that final sequence.
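For reference, a single tabular Q-learning update with the hyperparameters quoted above (α = 0.1, γ = 0.8) might look like the sketch below. The dictionary-based Q-table, the function name `q_update`, and the example states and reward are assumptions for illustration; the update rule itself is the standard Q-learning rule.

```python
from collections import defaultdict
from typing import DefaultDict, Hashable, Iterable, Tuple

ALPHA = 0.1  # learning rate from the experimental setup
GAMMA = 0.8  # discount factor from the experimental setup

QTable = DefaultDict[Tuple[Hashable, Hashable], float]


def q_update(Q: QTable, s: Hashable, a: Hashable, r: float,
             s_next: Hashable, next_actions: Iterable[Hashable]) -> float:
    """Apply Q(s,a) <- (1-alpha)*Q(s,a) + alpha*(r + gamma*max_a' Q(s',a'))."""
    # With no remaining actions (terminal state) the bootstrap term is 0.
    best_next = max((Q[(s_next, a2)] for a2 in next_actions), default=0.0)
    Q[(s, a)] = (1 - ALPHA) * Q[(s, a)] + ALPHA * (r + GAMMA * best_next)
    return Q[(s, a)]


# Hypothetical example: states are partial job sequences, actions are the next job.
Q: QTable = defaultdict(float)
Q[((0, 1), 2)] = 10.0
new_value = q_update(Q, s=(0,), a=1, r=-5.0, s_next=(0, 1), next_actions=[2, 3])
print(round(new_value, 6))  # 0.3
```

Here the reward is negative because idle time is a cost; the sign convention is an assumption of this sketch.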
In the experiment, 10,000 iterations of the Q-learning algorithm and the TS_QLEARNING algorithm were performed for each Taillard instance, and the average of the experimental results over 10 runs is recorded in Table 1. Table 1 shows the experimental results for instances selected to cover problems of different complexity (16 Taillard instances in total).
Table 1: Experimental results of the Q-Learning and TS_QLEARNING algorithms
Overall, the results show that the idle time obtained by the TS_QLEARNING algorithm is better than that obtained by the Q-learning algorithm on every dataset. The TS_QLEARNING algorithm also obtains better values of $C_{max}$ than the Q-learning algorithm. Therefore, the TS_QLEARNING algorithm has an advantage over the Q-learning algorithm in solving the task scheduling problem.
Further, FIG. 3 shows a flowchart of the TS_QLEARNING algorithm. The specific steps of the TS_QLEARNING algorithm are as follows:
Step 1: Initialize the tabu list, the Q table, and the optimal idle time best.
Step 2: Set the length of the tabu list, the maximum number of tabu search iterations, and the aspiration (amnesty) criterion. Through tabu search, store the better candidate solutions in the tabu list until the maximum number of tabu search iterations is reached.
Step 3: For each training episode, start the iteration if there are still tasks to be scheduled. Initialize the state s and the task sequence job_seq.
Step 4: Determine whether an initial solution has been obtained. If no initial solution has been obtained, randomly select one from the tabu list and observe the reward r and the state s' after execution. If an initial solution has been obtained, select an action a according to the ε-greedy policy on Q and observe the reward r and the state s' after execution. In either case, update the Q table, the state s, and the task sequence job_seq according to $Q(s,a) \leftarrow (1-\alpha)\,Q(s,a) + \alpha\,[\,r + \gamma \max_{a'} Q(s',a')\,]$, $s \leftarrow s'$, and $job\_seq \leftarrow job\_seq + s'$.
Step 5: Repeat step 4 until all tasks are scheduled.
Step 6: If the idle time of s is less than best, update best according to $best \leftarrow s$.
Step 7: Repeat steps 3-5 until the maximum number of Q-learning iterations is reached.
Finally, output the complete job sequence and the corresponding total idle time.
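The steps above can be sketched in Python as follows. This is one possible reading under stated assumptions, not the patented implementation: the state is taken to be the last scheduled job, the reward is the negative of the idle time added by the chosen job, the tabu search uses a swap neighbourhood, only the first episode is seeded from the tabu list, and a permutation flow shop is assumed. All function and parameter names are hypothetical.

```python
import random


def idle_after(proc, finish, job):
    """Append job to the partial schedule (mutates finish); return added idle time."""
    idle, prev = 0, 0
    for m, p in enumerate(proc[job]):
        start = max(finish[m], prev)
        idle += start - finish[m]
        finish[m] = prev = start + p
    return idle


def total_idle(proc, seq):
    finish = [0] * len(proc[0])
    return sum(idle_after(proc, finish, j) for j in seq)


def tabu_search(proc, iters, tabu_len, rng):
    """Swap-neighbourhood tabu search; returns the improving sequences it kept."""
    n = len(proc)
    current = list(range(n))
    rng.shuffle(current)
    best_cost = total_idle(proc, current)
    kept, tabu_moves = [current[:]], []
    for _ in range(iters):
        candidates = []
        for i in range(n - 1):
            for j in range(i + 1, n):
                neigh = current[:]
                neigh[i], neigh[j] = neigh[j], neigh[i]
                cost = total_idle(proc, neigh)
                # Aspiration: a tabu move is allowed if it beats the best so far.
                if (i, j) not in tabu_moves or cost < best_cost:
                    candidates.append((cost, (i, j), neigh))
        if not candidates:
            break
        cost, move, current = min(candidates, key=lambda c: c[0])
        tabu_moves = (tabu_moves + [move])[-tabu_len:]
        if cost < best_cost:
            best_cost = cost
            kept = (kept + [current[:]])[-tabu_len:]
    return kept


def ts_qlearning(proc, episodes=200, alpha=0.1, gamma=0.8, eps=0.1,
                 tabu_iters=20, seed=0):
    rng = random.Random(seed)
    n = len(proc)
    tabu_list = tabu_search(proc, tabu_iters, max(1, n // 3), rng)  # Steps 1-2
    q = {}  # Q[(state, action)], missing entries count as 0.0
    best_seq, best_idle = None, float("inf")
    for ep in range(episodes):  # Step 3
        # Assumption: only the first episode replays a sequence from the tabu list.
        guide = rng.choice(tabu_list) if ep == 0 else None
        state, seq, finish = -1, [], [0] * len(proc[0])
        remaining = set(range(n))
        while remaining:  # Steps 4-5
            if guide is not None:
                action = guide[len(seq)]
            elif rng.random() < eps:
                action = rng.choice(sorted(remaining))
            else:
                action = max(sorted(remaining), key=lambda a: q.get((state, a), 0.0))
            reward = -idle_after(proc, finish, action)
            remaining.discard(action)
            max_next = max((q.get((action, a), 0.0) for a in remaining), default=0.0)
            q[(state, action)] = ((1 - alpha) * q.get((state, action), 0.0)
                                  + alpha * (reward + gamma * max_next))
            state = action
            seq.append(action)
        idle = total_idle(proc, seq)
        if idle < best_idle:  # Step 6
            best_seq, best_idle = seq, idle
    return best_seq, best_idle


# Hypothetical 3-job, 2-machine instance.
proc = [[3, 2], [1, 4], [2, 2]]
best_seq, best_idle = ts_qlearning(proc)
print(best_seq, best_idle)
```

Seeding the first episode from the tabu list gives the Q table a reasonable starting trajectory before ε-greedy exploration takes over, which is the role the disclosure assigns to the tabu search stage.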
Embodiment two:
An object of the present embodiment is to provide a multi-agent system task scheduling system for the process industry.
A multi-agent system task scheduling system for the process industry, comprising:
a model building module, configured to build an intelligent cooperative control model oriented to the whole process, wherein the model consists of a system Agent connected through a bus to the Agents of each production stage;
a data acquisition module, configured to acquire the Agents required by different tasks and the processing time data required by each Agent; and
an optimal job sequence acquisition module, configured to solve for an optimal job sequence by using the TS_QLEARNING algorithm, wherein the intelligent cooperative control model performs task scheduling according to the job sequence.
Embodiment three:
An object of the present embodiment is to provide an electronic apparatus.
An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the following steps:
constructing an intelligent cooperative control model oriented to the whole process, wherein the model consists of a system Agent connected through a bus to the Agents of each production stage;
acquiring an initial job sequence of a task, the field Agents required to complete each job, and the processing time required by each field Agent to execute each job;
solving for the job sequence with the shortest total idle time of the field Agents by using the TS_QLEARNING algorithm; and
performing, by the intelligent cooperative control model, task scheduling according to the job sequence.
Embodiment four:
An object of the present embodiment is to provide a computer-readable storage medium.
A computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the following steps:
constructing an intelligent cooperative control model oriented to the whole process, wherein the model consists of a system Agent connected through a bus to the Agents of each production stage;
acquiring an initial job sequence of a task, the field Agents required to complete each job, and the processing time required by each field Agent to execute each job;
solving for the job sequence with the shortest total idle time of the field Agents by using the TS_QLEARNING algorithm; and
performing, by the intelligent cooperative control model, task scheduling according to the job sequence.
The multi-agent system task scheduling method and system for the process industry provided by the above embodiments can be fully implemented and have broad application prospects.
The foregoing is merely a description of preferred embodiments of the present disclosure and is not intended to limit the disclosure; those skilled in the art may make various modifications and changes to the present disclosure. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present disclosure should be included in the protection scope of the present disclosure.
While the specific embodiments of the present disclosure have been described above with reference to the drawings, it should be understood that the present disclosure is not limited to the embodiments, and that various modifications and changes can be made by one skilled in the art without inventive effort on the basis of the technical solutions of the present disclosure while remaining within the scope of the present disclosure.
Claims (7)
1. A multi-agent system task scheduling method for the process industry, comprising:
constructing an intelligent cooperative control model oriented to the whole process, wherein the model consists of a system Agent connected through a bus to the Agents of each production stage;
acquiring an initial job sequence of a task, the field Agents required to complete each job, and the processing time required by each field Agent to execute each job;
solving for the job sequence with the shortest total idle time of the field Agents by using a TS_QLEARNING algorithm;
performing, by the intelligent cooperative control model, task scheduling according to the job sequence;
wherein the task scheduling method searches for an optimal job sequence by minimizing the sum of the idle times of all field Agents;
the TS_QLEARNING algorithm is a combination of a tabu search algorithm and a Q-learning algorithm, in which initial solutions for a preset number of job sequences are obtained through the tabu search algorithm and stored in a tabu list, and, based on the initial solutions in the tabu list, an optimization solution is carried out through the Q-learning algorithm to obtain the optimal job sequence;
in the optimization process of the TS_QLEARNING algorithm, the idle time is used as a feedback signal, and a complete job sequence and the corresponding total idle time are obtained through iterative calculation; wherein the initial solution of the TS_QLEARNING algorithm is randomly obtained from the tabu list.
2. The multi-agent system task scheduling method for the process industry according to claim 1, wherein the intelligent cooperative control model has a layered structure: an upper-layer system Agent is used for unified resource scheduling and task allocation; each lower-layer workshop Agent comprises a workshop control Agent and a plurality of field Agents; the system Agent issues tasks through the bus; task decomposition is achieved through cooperation among the workshops; the tasks are allocated to the field Agents by the workshop control Agents; and the field Agents cooperate with each other to complete the tasks.
3. The multi-agent system task scheduling method for the process industry according to claim 1, wherein the scheduling method is subject to the following constraints:
each field Agent can only execute one operation at a time; each operation of a task can only be executed by one field Agent at a time; once an operation is started on a machine, it cannot be interrupted; other task operations cannot be performed until the previous operation is completed; task operations can only be performed by machines of the same type; and the processing time of each field Agent and the number of available field Agents are known.
4. The multi-agent system task scheduling method for the process industry according to claim 1, wherein the tasks comprise jobs that require processing by a plurality of field Agents.
5. A multi-agent system task scheduling system for the process industry, comprising:
a model building module, configured to build an intelligent cooperative control model oriented to the whole process, wherein the model consists of a system Agent connected through a bus to the Agents of each production stage;
a data acquisition module, configured to acquire the Agents required by different tasks and the processing time data required by each Agent;
an optimal job sequence acquisition module, configured to solve for an optimal job sequence by using a TS_QLEARNING algorithm, wherein the intelligent cooperative control model performs task scheduling according to the job sequence;
wherein the task scheduling method searches for an optimal job sequence by minimizing the sum of the idle times of all field Agents;
the TS_QLEARNING algorithm is a combination of a tabu search algorithm and a Q-learning algorithm, in which initial solutions for a preset number of job sequences are obtained through the tabu search algorithm and stored in a tabu list, and, based on the initial solutions in the tabu list, an optimization solution is carried out through the Q-learning algorithm to obtain the optimal job sequence;
in the optimization process of the TS_QLEARNING algorithm, the idle time is used as a feedback signal, and a complete job sequence and the corresponding total idle time are obtained through iterative calculation; wherein the initial solution of the TS_QLEARNING algorithm is randomly obtained from the tabu list.
6. An electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the multi-agent system task scheduling method for the process industry according to any one of claims 1-4.
7. A computer readable storage medium having stored thereon a computer program, which when executed by a processor implements a multi-agent system task scheduling method for a process industry according to any of claims 1-4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010948695.1A CN112101773B (en) | 2020-09-10 | 2020-09-10 | Multi-agent system task scheduling method and system for process industry |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112101773A | 2020-12-18 |
CN112101773B | 2024-06-07 |
Family
ID=73750765
Family Applications (1)
Application Number | Status | Publication Number | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN202010948695.1A | Active | CN112101773B (en) | 2020-09-10 | 2020-09-10 | Multi-agent system task scheduling method and system for process industry |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112101773B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112700099A (en) * | 2020-12-24 | 2021-04-23 | 亿景智联(北京)科技有限公司 | Resource scheduling planning method based on reinforcement learning and operation research |
CN112633772B (en) * | 2021-01-05 | 2021-12-10 | 东华大学 | Multi-agent deep reinforcement learning and scheduling method for textile fabric dyeing workshop |
CN113448687B (en) * | 2021-06-24 | 2022-07-26 | 山东大学 | Hyper-heuristic task scheduling method and system based on reinforcement learning in cloud environment |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105976030A (en) * | 2016-03-15 | 2016-09-28 | 武汉宝钢华中贸易有限公司 | Multi-agent-based platform scheduling intelligent sorting model structure |
CN109407644A (en) * | 2019-01-07 | 2019-03-01 | 齐鲁工业大学 | One kind being used for manufacturing enterprise's Multi-Agent model control method and system |
CN110427006A (en) * | 2019-08-22 | 2019-11-08 | 齐鲁工业大学 | A kind of multi-agent cooperative control system and method for process industry |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107392402B (en) * | 2017-09-11 | 2018-08-31 | 合肥工业大学 | Production and transport coordinated dispatching method based on modified Tabu search algorithm and system |
Also Published As
Publication number | Publication date |
---|---|
CN112101773A (en) | 2020-12-18 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |
| CP03 | Change of name, title or address | Address after: 250353 University Road, Changqing District, Ji'nan, Shandong Province, No. 3501; Patentee after: Qilu University of Technology (Shandong Academy of Sciences); Country or region after: China. Address before: 250353 University Road, Changqing District, Ji'nan, Shandong Province, No. 3501; Patentee before: Qilu University of Technology; Country or region before: China |