CN111126905A - Casting enterprise raw material inventory management control method based on Markov decision theory - Google Patents

Casting enterprise raw material inventory management control method based on Markov decision theory

Info

Publication number
CN111126905A
Authority
CN
China
Prior art keywords
model
decision
production
time
markov
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911296380.7A
Other languages
Chinese (zh)
Other versions
CN111126905B (en)
Inventor
唐红涛
王广森
陈世义
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuhan University of Technology WUT
Original Assignee
Wuhan University of Technology WUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuhan University of Technology WUT
Priority to CN201911296380.7A
Publication of CN111126905A
Application granted
Publication of CN111126905B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/08 Logistics, e.g. warehousing, loading or distribution; Inventory or stock management
    • G06Q10/087 Inventory or stock management, e.g. order filling, procurement or balancing against orders
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P90/00 Enabling technologies with a potential contribution to greenhouse gas [GHG] emissions mitigation
    • Y02P90/30 Computing systems specially adapted for manufacturing

Landscapes

  • Business, Economics & Management (AREA)
  • Economics (AREA)
  • Engineering & Computer Science (AREA)
  • Marketing (AREA)
  • Quality & Reliability (AREA)
  • Finance (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Human Resources & Organizations (AREA)
  • Accounting & Taxation (AREA)
  • Operations Research (AREA)
  • Development Economics (AREA)
  • Strategic Management (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)
  • General Factory Administration (AREA)

Abstract

The method uses Markov Decision Process theory to establish a raw material inventory control model for a casting enterprise in a dynamic production environment, namely the SCO-IP model, models it abstractly, and finally describes quantitatively the basic operating process of the object under study. Specifically: (1) the operating flows of orders, inventory and production in a casting enterprise in a dynamic production environment are described, and reasonable assumptions are made about the environment in which the model is to be established; (2) on the basis of the current situation and these assumptions, the key parameters of the model are described with Markov Decision Process theory and a complete Markov tuple is established; (3) the decision rules arising in the Markov decision process are analyzed and a complete objective cost function is constructed.

Description

Casting enterprise raw material inventory management control method based on Markov decision theory
Technical Field
The invention relates to the field of raw material inventory management methods, in particular to a raw material inventory management control method for a casting enterprise based on a Markov decision theory.
Background
In many foundries, customer orders, foundry production and inventory management are the core of supply chain management. Although casting enterprises have made great progress in each of these three links, the characteristics of the industry leave several problems unsolved when orders, production and inventory are managed jointly:
① the weak coupling between production and inventory management affects production;
② inventory responds with a delay to production plans driven by random orders;
③ the randomness of orders and the single-piece, small-batch nature of production have a large impact on production cost.
In existing casting enterprises, raw material inventory management and production management are split. The two management processes are independent of each other, the connection between them is not tight enough, and information is transmitted with a delay, so the raw material inventory cannot respond in time to the dynamic production process and a risk of production interruption arises. At the same time, managing raw material inventory in isolation easily leads to unnecessary replenishment, which causes a large material backlog, raises inventory cost and reduces economic benefit.
Disclosure of Invention
The invention provides a raw material inventory management control method for a casting enterprise based on a Markov decision theory, which aims to solve the problem of unnecessary waste in the existing inventory management process and obtain a better inventory control method.
In order to achieve the purpose, the technical scheme adopted by the invention is as follows:
The method uses Markov Decision Process theory to establish a raw material inventory control model for a casting enterprise in a dynamic production environment, namely the SCO-IP model, models it abstractly, and finally describes quantitatively the basic operating process of the object under study. Specifically, the method comprises the following steps:
(1) describing the operating flows of orders, inventory and production in a casting enterprise in a dynamic production environment, and making reasonable assumptions about the environment in which the model is to be established;
(2) on the basis of the current situation and these assumptions, describing the key parameters of the model with Markov Decision Process theory and establishing a complete Markov tuple;
(3) analyzing the decision rules arising in the Markov decision process and constructing a complete objective cost function;
(4) analyzing the characteristics of the model to find an algorithm that solves for the model's optimal inventory control strategy.
Further, making reasonable assumptions about the environment of the model includes:
(1) the following characteristics of production materials are not considered in this model:
① the depreciation of materials is not considered, and part of the material can be reused after use;
② waste material generated during production is regarded as a necessary production loss;
③ the residual value of scrapped material is not considered for the time being;
(2) in order to quantify the storage cost of materials, the storage point is regarded as a warehouse with a lease expense;
(3) the warehouse capacity imposes an upper limit on inventory;
(4) the influence of the purchased quantity on the replenishment speed is not considered;
(5) a certain degree of delayed delivery is allowed, and the resulting penalty cost is borne;
(6) production preparation time after an order is accepted is ignored;
(7) production is regarded as single-line production, i.e., several orders are not produced at the same time;
(8) scheduled production tasks are completed on time, and no task is delayed by unexpected factors;
(9) inventory can guarantee normal production;
(10) the order quantities of raw materials and the corresponding stock levels are treated as discrete variables;
(11) at each review of the system, only the random arrival or non-arrival of an order is considered, together with the probability distribution of one order arriving while the other orders do not; when a random order arrives, the production planner can determine the relevant information for that order.
Further, the decision rule includes:
(1) order admission rules:
order admission rules address how random customer orders are handled and impose the following constraint on the model:
$OP_k / P_{\max} + x - \tau \le 0$;
(2) production scheduling rules:
since the stock level state in the model alone cannot meet the needs of production scheduling, a production raw material stock level reference matrix is introduced
Figure BDA0002320665540000031
which takes into account the orders already scheduled and the raw material ordering situation,
Figure BDA0002320665540000032
Figure BDA0002320665540000033
Figure BDA0002320665540000034
Figure BDA0002320665540000035
$t_{\min} - t \le \tau$
After the possible incoming order has been processed and the accepted order scheduled, the system transitions to its state at the next review time.
Further, the objective cost function is:
Figure BDA0002320665540000036
Figure BDA0002320665540000037
Further, the Markov Decision Process theory refers to finite-stage deterministic Markov Decision Process theory.
Furthermore, the algorithm is an improved backward induction algorithm based on a dual-processor mechanism; it takes backward induction as its core and is adapted to the characteristics of the model so that the model is solved more efficiently.
Further, the Markov tuple comprises: decision times and periods, states, actions, transition probabilities and rewards;
(1) decision time and period:
in the Markov decision process, the set T of decision times can have various characteristics, and models can be classified accordingly:
a) when T is a countably infinite set, i.e., T = {1, 2, 3, ..., n, ...}, the model is regarded as a discrete decision time model with an infinite planning stage;
b) when T is a countable finite set, i.e., T = {1, 2, 3, ..., n}, the model is regarded as a discrete decision time model with a finite planning stage;
c) when T is a bounded continuous set, i.e., T = [0, n], the model is regarded as a continuous decision time model with a finite stage;
d) when T is an unbounded continuous set, i.e., T = [0, ∞), the model is regarded as a continuous decision time model with an infinite stage;
the present model is a discrete decision time model with a finite stage;
(2) state and action set:
at the beginning of each decision time, the system occupies a corresponding state; S denotes the set of possible system states. When the decision maker observes that the system is in state s ∈ S at some decision time, the decision maker selects a reasonable action a ∈ $A_s$ from the set $A_s$ of actions feasible in that state.
(3) Transition probability and reward:
at decision time t, taking an action in state s has two effects on the system:
the decision maker receives an instant reward (cost) r(s, a);
the current state transitions to the state at the next decision time according to the probability distribution $p_t(\cdot \mid s, a)$;
the instant reward (cost) r(s, a) is a real-valued function defined for s ∈ S, a ∈ $A_s$; it represents the reward (cost) generated in period t after the decision is made at decision time t. A positive value represents a profit or reward, and otherwise it represents a cost. In the Markov decision process, how the instant reward (cost) arises is not of concern; only the value or the expected value of r(s, a) needs to be known once an action is selected. Up to the next review time, the instant reward r(s, a) may include: a one-time reward (cost); a cumulative reward (cost); a random reward (cost); and a reward (cost) related to the state at the next time;
the expected reward value for action a may be expressed as:
Figure BDA0002320665540000051
in the above equation, p(s' | s, a) represents the probability that the system transitions from state s to state s' at the next decision time; the transition probabilities usually satisfy
Figure BDA0002320665540000052
With this, a complete Markov decision process can be written as:
{T, S, A(s), p(·|s,a), r(s,a)}.
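The tuple above can be held in a small data structure. The following Python sketch is only an illustration under stated assumptions: the class name `FiniteMDP`, the function `expected_reward` and the two-state example are hypothetical and do not come from the patent. It assumes the instant reward may depend on the next state s' and that the transition probabilities out of each (s, a) sum to one, so the expected instant reward is the probability-weighted average over next states.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class FiniteMDP:
    """Container mirroring the tuple {T, S, A(s), p(.|s,a), r(s,a)} (hypothetical names)."""
    horizon: int                                        # number of decision times in T
    states: List[int]                                   # S
    actions: Callable[[int], List[str]]                 # A(s)
    transition: Callable[[int, str], Dict[int, float]]  # p(.|s,a) as {s': probability}
    reward: Callable[[int, str, int], float]            # r(s,a,s'); negative values are costs


def expected_reward(mdp: FiniteMDP, s: int, a: str) -> float:
    """Expected instant reward of action a in state s, averaged over next states."""
    probs = mdp.transition(s, a)
    # Assumption: the probabilities of leaving (s, a) sum to one.
    if abs(sum(probs.values()) - 1.0) > 1e-9:
        raise ValueError("transition probabilities must sum to 1")
    return sum(p * mdp.reward(s, a, s_next) for s_next, p in probs.items())


# Hypothetical two-state example: the state is a stock level of 0 or 1.
toy = FiniteMDP(
    horizon=3,
    states=[0, 1],
    actions=lambda s: ["wait", "order"],
    transition=lambda s, a: {0: 0.25, 1: 0.75} if a == "order" else {0: 1.0},
    reward=lambda s, a, s_next: -(1.0 if a == "order" else 0.0) - 2.0 * s_next,
)
print(expected_reward(toy, 0, "order"))
```

Representing A(s), p and r as callables keeps the sketch independent of how the SCO-IP state is encoded; a table-based representation would serve equally well for small state spaces.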
The invention has the following beneficial effects:
The invention analyzes the problem that raw material inventory management and production plan management are split in casting enterprises. On this basis, it uses Markov decision process theory to establish a raw material inventory control model for a casting enterprise in a dynamic production environment and solves for the model's optimal inventory control strategy. This strengthens the coupling between the raw material inventory management system and the production system in the casting enterprise, reduces the risk of production interruption and the operating cost, and provides an effective control strategy for inventory management in a dynamic production environment.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings needed to be used in the embodiments or the prior art descriptions will be briefly described below, and it is obvious that the drawings described below are only some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
FIG. 1 is an overall flow chart of the present invention;
FIG. 2 is a production state transition diagram;
FIG. 3 is a stock level state transition diagram.
Detailed Description
In order to make the objects, features and advantages of the present invention more obvious and understandable, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the embodiments described below are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
As shown in FIG. 1, the method for managing and controlling the raw material inventory of a casting enterprise based on Markov decision theory uses Markov Decision Process theory to establish a raw material inventory control model for a casting enterprise in a dynamic production environment, namely the SCO-IP model, models it abstractly, and finally describes quantitatively the basic operating process of the object under study. Specifically, the method comprises the following steps:
(1) describing the operating flows of orders, inventory and production in a casting enterprise in a dynamic production environment, and making reasonable assumptions about the environment in which the model is to be established;
(2) on the basis of the current situation and these assumptions, describing the key parameters of the model with Markov Decision Process theory and establishing a complete Markov tuple;
(3) analyzing the decision rules arising in the Markov decision process and constructing a complete objective cost function;
(4) analyzing the characteristics of the model to find an algorithm that solves for the model's optimal inventory control strategy.
When a decision maker implements a strategy π, the decision maker incurs a series of rewards (costs) with certain probabilities at the decision times; summing them gives the utility function of the model under strategy π. According to the optimization criterion, the finite-stage total utility function under a strategy π ∈ Π is defined as:
Figure BDA0002320665540000061
This expression gives the expected total cost incurred by the system from the start until time N when the state at decision time 0 is $s_0$ and strategy π is followed; $r(s_N)$ represents the cost incurred at the last decision time of the decision process.
In this model, however, what concerns the decision maker is how to select a suitable strategy at the outset so that the cost incurred by the system is lowest, i.e., the following expression serves as the optimal value function:
Figure BDA0002320665540000062
For any ε ≥ 0, a strategy $\pi^*$ that satisfies the following condition is called an N-stage ε-optimal strategy; when ε = 0, it is called an optimal strategy.
Figure BDA0002320665540000063
Although the decision maker always wants to know the optimal action for every possible initial state, in actual production only the optimal strategy for a particular initial state needs to be considered. This is finite-stage deterministic Markov Decision Process theory, and the SCO-IP model is described on the basis of this theory.
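A finite-stage model of this kind is classically solved by backward induction. The sketch below shows the plain textbook procedure for a cost-minimizing problem; it is not the improved dual-processor variant that the patent names, and the function name `backward_induction` and the toy inventory problem (purchase price 2, holding cost 1, shortage penalty 5, demand of one unit per period) are assumptions made only for illustration.

```python
def backward_induction(horizon, states, actions, transition, cost, terminal_cost):
    """Return (value, policy) minimizing expected total cost over `horizon` decision times.

    value[s] is the minimal expected total cost from decision time 0 in state s;
    policy[t][s] is the cost-minimizing action at decision time t in state s.
    """
    value = {s: terminal_cost(s) for s in states}  # cost at the last time N
    policy = []
    for _ in range(horizon):  # walk backwards from time N-1 to time 0
        new_value, decision = {}, {}
        for s in states:
            best_action, best_cost = None, float("inf")
            for a in actions(s):
                # Instant cost plus expected cost-to-go over next states.
                q = cost(s, a) + sum(p * value[s2] for s2, p in transition(s, a).items())
                if q < best_cost:
                    best_action, best_cost = a, q
            new_value[s], decision[s] = best_cost, best_action
        value = new_value
        policy.insert(0, decision)
    return value, policy


# Hypothetical toy problem: stock level 0..2, one unit consumed per period.
states = [0, 1, 2]

def actions(s):
    return list(range(0, 3 - s))  # order quantity limited by warehouse capacity 2

def transition(s, q):
    return {max(s + q - 1, 0): 1.0}  # deterministic demand of one unit

def cost(s, q):
    purchase = 2 * q
    holding = 1 * max(s + q - 1, 0)
    shortage = 5 if s + q < 1 else 0
    return purchase + holding + shortage

value, policy = backward_induction(2, states, actions, transition, cost, lambda s: 0.0)
print(value, policy[0])
```

The loop visits every state and action once per decision time, so the work grows with the size of the state space; this is the cost that any model-specific speed-up would aim to reduce.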
In the SCO-IP model, the decision maker reviews the system state at fixed intervals and, after each review, takes the relevant actions, such as ordering raw materials or making a production plan for an accepted order. The time at which the system state is reviewed is called the review time of the SCO-IP model and is denoted by t, where t = 0, 1, 2, 3, ..., T-1, and T denotes the last time of the finite stage. At each review time t, the decision maker observes a system state that consists of several components and is uniformly represented by the set $S_t$:
Figure BDA0002320665540000071
In expression (3-1),
Figure BDA0002320665540000072
is the material inventory level state, which comprises the inventory levels $I_{i,t}$ of the production raw materials of interest at review time t, as shown in expression (3-2). This representation reflects the model's management of multi-material inventory and is closer to practice, but it makes the state space of the SCO-IP model more complex and increases the difficulty of solving the model.
Figure BDA0002320665540000073
In expression (3-1), the second state component reviewed in the SCO-IP model is the plant production state
Figure BDA0002320665540000074
This is a look-ahead state component: at the current review time, the decision maker can observe the plant's production capacity for some future period. It represents the remaining-capacity state matrix of the plant over the period from review time t to t + τ. (We assume the system has a limited scheduling horizon of τ unit scheduling intervals, i.e., the decision maker can observe the schedule for at most the next τ scheduling intervals.) It is shown as the row vector (3-3), in which each element represents the remaining capacity of a future review period as observable at the current time. When a new customer order is scheduled for production,
Figure BDA0002320665540000075
changes accordingly.
$[\,p_t \;\; p_{t+1} \;\; p_{t+2} \;\; \cdots \;\; p_{t+\tau-1} \;\; p_{t+\tau}\,]$ (3-3)
According to Markov Decision Process theory, after the system state is reviewed, the observed state is analyzed and judged, and a corresponding decision is made. In the SCO-IP model, we define a decision set A(S), as shown in expression (3-4), in which each
Figure BDA0002320665540000076
should be regarded as consisting of two parts: a row vector containing τ elements
Figure BDA0002320665540000081
and a row vector containing n elements
Figure BDA0002320665540000082
Figure BDA0002320665540000083
$A(s_t) = \{\, s_t \in S_t \mid A(s_t) \,\}$ (3-5)
The row vector
Figure BDA0002320665540000084
is the order-scheduling response in the decision matrix. It is based on the workshop production state matrix at review time t
Figure BDA0002320665540000085
and the material inventory level reference matrix
Figure BDA0002320665540000086
and gives the scheduling plan for a possibly arriving order. Here, a 1 indicates that an order is scheduled in that review interval, and a 0 indicates that no order is scheduled in it. If the system has not yet made any scheduling plan, for example at review time 0, the row vector of τ elements observed by the decision maker can be expressed as
Figure BDA0002320665540000087
Note that as the review time advances, the elements of this row vector shift left: each time a review period elapses, every element moves left by one position, the element for the elapsed period is dropped, and the last position is filled with zero.
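The left shift described above is easy to state in code. The following Python sketch is a minimal illustration; the function name `advance_schedule` and the list representation of the 0/1 scheduling row vector are assumptions, not part of the patent.

```python
def advance_schedule(schedule):
    """Advance the 0/1 scheduling row vector by one review period.

    The element of the elapsed period (index 0) is dropped, the remaining
    elements move left by one position, and the last position is zero-filled.
    """
    if not schedule:
        return []
    return schedule[1:] + [0]


# Hypothetical horizon of four review periods with three of them occupied.
current = [1, 1, 0, 1]
print(advance_schedule(current))
```

The function returns a new list rather than mutating its argument, so the schedule observed at the previous review time stays available for comparison.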
The row vector
Figure BDA0002320665540000088
is the raw material replenishment response in the decision matrix. It is the ordering plan made on the basis of the inventory level of each production raw material at review time t
Figure BDA0002320665540000089
and all possible incoming orders. The essence of the invention is to make the optimal inventory control decision according to the system state, so that manufacturing proceeds normally while inventory cost is minimized. It is therefore important to capture the performance differences between control strategies and to derive an optimal raw material inventory control strategy.
In Markov Decision Process theory, the instant costs r(s, a) incurred under a given strategy over the whole finite stage form the expected total cost
Figure BDA00023206655400000810
which is used to show this difference in performance. $S_0$ denotes the set of possible initial states of the system, and we need to find the inventory control strategy that makes
Figure BDA00023206655400000811
minimal, denoted $\pi^*$.
Markov Decision Process theory holds that how r(s, a) arises during the review period is unimportant; only its value or expected value is needed. Accordingly, r(s, a) comprises four kinds of revenue (cost): (1) one-time revenue (cost) up to the next review time; (2) cumulative revenue (cost) up to the next review time; (3) random revenue (cost) of the state transition to the next review time; (4) revenue (cost) that depends on the state at the next review time.
In the SCO-IP model, the main goal is to choose the optimal inventory control strategy. Therefore, it is considered that at each decision time, action is taken for a certain state, and the generated instant cost is mainly reflected in the cost associated with raw material inventory management, and the cost generated by other parts is not considered here. After the model and the operating environment are analyzed, a corresponding cost function expression is given:
one-time cost to the next review time:
Figure BDA0002320665540000091
In expression (3-6), the first term represents the purchase cost of material i, where $Q_{i,t}$ denotes the order quantity of the i-th raw material at review time t and $B_i$ denotes its unit purchase price. The second term represents the fixed cost incurred in ordering, where $g_i$ denotes the fixed cost of ordering the i-th material and sgn denotes the sign function,
Figure BDA0002320665540000092
which ensures that the fixed cost is incurred only when an order is actually placed. The third term represents the deferred delivery cost incurred after a customer order is accepted: because the material inventory level is insufficient, production is scheduled later and delivery cannot be completed within the specified time, so this term is modeled as a delayed-delivery penalty cost that depends on the length of the delay.
Figure BDA0002320665540000093
represents a cost function of the delay duration ε. The fourth term represents the revenue lost by rejecting an order. The forms of the third and fourth terms are described in detail later.
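The four terms described for expression (3-6) can be illustrated for a single raw material as follows. This Python sketch is a reading of the prose only: the equation itself is not reproduced in the text, so the way the terms are combined, the function names `order_indicator` and `one_time_cost`, and the sample figures are all assumptions.

```python
def order_indicator(order_qty):
    """Sign function restricted to non-negative order quantities: 1 if an order is placed, else 0."""
    return 1 if order_qty > 0 else 0


def one_time_cost(order_qty, unit_price, fixed_cost, delay_penalty=0.0, rejection_loss=0.0):
    """One-time cost for one material in one review period (assumed additive form).

    order_qty      -- Q_{i,t}, quantity ordered at the review time
    unit_price     -- B_i, purchase price per unit
    fixed_cost     -- g_i, charged only when an order is actually placed
    delay_penalty  -- deferred delivery penalty, already evaluated for the delay
    rejection_loss -- revenue lost by rejecting an order, if any
    """
    purchase = order_qty * unit_price
    ordering = fixed_cost * order_indicator(order_qty)
    return purchase + ordering + delay_penalty + rejection_loss


# Hypothetical figures: 10 units at 3.0 each with a fixed ordering cost of 50.0.
print(one_time_cost(10, 3.0, 50.0))
print(one_time_cost(0, 3.0, 50.0, delay_penalty=20.0))
```

With a zero order quantity the fixed cost vanishes, which is the behavior the sign function is introduced to produce.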
In the SCO-IP model, so that the random risk of order production being interrupted need not be considered, we enforce a precondition for production:
an order is accepted, and a production plan arranged for it, only when the production raw materials in inventory can meet the material quantities the order requires. Because production cannot take place in a stock-out state, stock-out costs are not considered. Instead, the production loss incurred when insufficient material inventory prevents an order from being scheduled within the planning horizon, so that the order is finally rejected, is converted into a stock-out penalty cost
Figure BDA0002320665540000094
Cumulative cost for the next stage:
Figure BDA0002320665540000095
In expression (3-7), $h_i$ represents the holding cost of material i per unit time. As noted above, since
Figure BDA0002320665540000096
already takes the scheduled production plan, i.e., the quantity to be consumed, into account,
Figure BDA0002320665540000097
represents the end-of-period stock level, and expression (3-7) represents the holding cost incurred in this stage.
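The cumulative (holding) cost can be sketched in the same spirit. The snippet assumes that expression (3-7) multiplies each material's end-of-period stock level by its unit holding cost $h_i$ and sums over materials; the function name `holding_cost` and the numbers are illustrative assumptions.

```python
def holding_cost(end_of_period_stock, unit_holding_cost):
    """Holding cost of one review period, summed over all raw materials.

    end_of_period_stock -- stock level of each material at the end of the period
    unit_holding_cost   -- h_i, holding cost of each material per unit per period
    """
    if len(end_of_period_stock) != len(unit_holding_cost):
        raise ValueError("one holding cost is required per material")
    return sum(h * level for h, level in zip(unit_holding_cost, end_of_period_stock))


# Hypothetical three-material example.
print(holding_cost([10, 0, 4], [0.5, 2.0, 1.5]))
```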
Random cost of the transition to the next review time: for inventory costs, this cost is set to 0,
Figure BDA0002320665540000101
The instant cost function r(s, a) of each stage is therefore given by expressions (3-9) and (3-10):
Figure BDA0002320665540000102
Figure BDA0002320665540000103
In expression (3-9), $r_i(s, a)$ denotes the inventory management cost of the i-th production raw material, and s' denotes the state at the next review time, to which the system transitions with a certain probability after the decision maker takes action a in state s at the current review time. Expression (3-10) represents the inventory management cost of all materials in this stage.
With the SCO-IP model set out, we now explain the deferred delivery cost and the order rejection penalty cost in the objective cost function. The deferred delivery cost in expression (3-6) arises in two ways in the SCO-IP model:
① when an accepted new customer order lacks the raw materials needed for production, it cannot be scheduled at the earliest possible time, so production is postponed and delivery is delayed;
② because of the existing production schedule, an accepted new customer order is forced to be scheduled at a later time, so production is postponed and delivery is delayed.
Although the two deferred delivery costs arise from different system conditions, both ultimately appear as a penalty cost for deferred delivery, as shown in expression (3-11),
Figure BDA0002320665540000104
It is common practice, as stipulated in the product contracts between some enterprises and their customers, to tie the deferred delivery cost to the length of the delay. Thus, in expression (3-11), the deferred delivery penalty cost
Figure BDA0002320665540000105
is regarded as a function of the delay time ε. Since the specific form of this function is not the focus of the SCO-IP model, the deferred delivery cost
Figure BDA0002320665540000106
is expressed as a linear function of the delay time ε, as shown in expression (3-12), where δ represents the penalty cost borne per unit of delay time. Although somewhat crude, this is nonetheless a simple and effective way to express the deferred delivery cost.
f(ε)=ε*δ (3-12)
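Expression (3-12) translates directly into code. In the Python sketch below, the function name `delay_penalty` and the check against negative delays are assumptions added for illustration.

```python
def delay_penalty(delay_periods, unit_penalty):
    """Linear deferred delivery penalty f(eps) = eps * delta.

    delay_periods -- eps, length of the delivery delay in review periods
    unit_penalty  -- delta, penalty cost per unit of delay time
    """
    if delay_periods < 0:
        raise ValueError("delay must be non-negative")
    return delay_periods * unit_penalty


# Hypothetical contract: a penalty of 40.0 per period of delay, order delivered 3 periods late.
print(delay_penalty(3, 40.0))
```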
In the objective cost function (3-9), in addition to the deferred delivery penalty cost, the business loss caused by rejecting orders is another cost component that the system needs to control.
When an incoming order is reviewed for admission, both the plant production capacity it would consume and the quantities of the production raw materials it requires are evaluated. The order is accepted when both evaluations pass; otherwise the system rejects it. When the system rejects an order, the revenue from that order is lost. Because different orders bring different revenue, the rejection cost
Figure BDA0002320665540000111
shown in expression (3-13) differs from order to order. For example, $c_{1,t}$ denotes the fixed penalty cost associated with customer order $O_{1,t}$, and rejecting order $O_{2,t}$ incurs a fixed penalty cost of $c_{2,t}$.
Figure BDA0002320665540000112
Once customer order information has been defined, how the decision maker should act on a specific system state when the order information is known is an important link in the SCO-IP model: it governs the course of the entire production process and the control of the inventory system. We therefore introduce two important decision rules:
(1) order admission rules
Order admission rules aim to solve "how to handle random customer orders", for which the model addresses the state s at some inspection time ttThe following constraints apply:
OPk/Pmax+x-τ≤0 (3-14)
in inequalities (3-15), x represents a row vector in the equation (3-4)
Figure BDA0002320665540000113
The number of the medium element is 1. This inequality embodies yet another hard constraint of the model that production scheduling inhibition of orders is scheduled outside of the planned time period τ (relative to the current review time t).
(2) Rule of production scheduling
After solving how to handle incoming orders, how to arrange orders becomes another problem to be solved. When a customer order comes, the earliest production scheduling date t needs to be determined by combining the material inventory level of the scheduling date and the production capacity of a workshopminBut only on the inventory level status given in the SCO-IP model
Figure BDA0002320665540000114
Can not meet the production demand, so a raw material stock water is introducedFlat reference matrix
Figure BDA0002320665540000115
As shown in expressions (3-16). The state matrix is used for scheduling reference only,
Figure BDA0002320665540000116
the order and the order situation are considered, and the stock level of each production raw material is the sum of the remaining stock quantity and the stock quantity from the moment to the moment at each inspection moment due to the purchase lead time of the production raw material.
Figure BDA0002320665540000121
Figure BDA0002320665540000122
Figure BDA0002320665540000123
Figure BDA0002320665540000124
t_min - t ≤ τ (3-19)
When no t_min satisfies the above inequalities, the system cannot schedule the order. The system then rejects the order and incurs the service loss caused by the rejection
Figure BDA0002320665540000125
Figure BDA0002320665540000126
is the amount the customer pays to the manufacturer, which includes the cost and profit of producing the order. In expression (3-17), θ_t denotes the number of periods required to produce order O_t. Depending on the production capacity remaining on day t_min in the production schedule, θ_t may be an integer or a decimal, and its value is determined by t_min. The expression also reflects that the delivery delay must be kept to a minimum so as to minimize the cost of inventory control. In inequality (3-18),
Figure BDA0002320665540000127
is the remaining-capacity row matrix formed by the elements of expression (3-3) from inspection time t_min through inspection time t_min + θ_t - 1. The inequality states that, when production starts at inspection time t_min, the remaining production capacity of the consecutive inspection periods that follow must cover the capacity required by the new order. Because the production of a single order in a foundry is continuous, an order must be scheduled in adjacent inspection periods.
In inequality (3-19),
Figure BDA0002320665540000128
is the row matrix formed by the column of expression (3-16) at inspection time t_min. The inequality states that, when production is scheduled at inspection time t_min, the inventory level of each production raw material at that time must cover the quantities required by the newly arrived order. Inequality (3-20) states that the inspection time t_min must lie within the planning horizon τ.
After handling the possible incoming order and scheduling the accepted order, the system state at the next review time can be written as shown in expression (3-21). Since the randomness of customer orders is the only random factor in the whole SCO-IP model, the probability distribution of the system state at the next review time is the same as the probability distribution p(· | s, a) of customer order arrivals. The system state at the next review time after the transition can be expressed as follows,
Figure BDA0002320665540000131
the equation of the state transition is shown in the formula (3-22), and the state transition mode is shown in fig. 2 and 3.
Figure BDA0002320665540000132
In particular, for the quantity that is not a member of the state space,
Figure BDA0002320665540000133
the states at adjacent inspection times also have a transition relationship, as shown in expression (3-23),
Figure BDA0002320665540000134
In expression (3-22) above,
Figure BDA0002320665540000135
represents the production capacity consumption matrix obtained after scheduling the order. It is a 1 × τ row matrix giving the workshop capacity consumed in each period of the future plan. The production raw materials [OI_{1,k} OI_{2,k} … OI_{N,k}] involved in order O_k are treated as a whole and denoted by
Figure BDA0002320665540000136
The symbol
Figure BDA0002320665540000137
denotes the row matrix composed of the order quantity of each production raw material. In expression (3-23),
Figure BDA0002320665540000138
represents the material consumption in each inspection period within the planning horizon, counted from the current inspection time t;
Figure BDA0002320665540000139
represents the arrivals of each production raw material at the different inspection times within the same planning horizon, forming an N × τ matrix of arrival quantities. Note that when the capacity row vector transitions to the next inspection time, its first element drops out, the remaining elements shift one position to the left, and the last position is filled with the full capacity.
Figure BDA00023206655400001310
undergoes the same operation.
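The roll-forward of the capacity row vector described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the vector is taken to hold the remaining capacity of each of the τ future periods, and the names `roll_capacity`, `remaining`, `consumption` and `full_capacity` are hypothetical and do not appear in the patent.

```python
def roll_capacity(remaining, consumption, full_capacity):
    """Advance a 1 x tau remaining-capacity row vector by one inspection period.

    Assumption: remaining[j] is the capacity still free in period t + j and
    consumption[j] is the capacity used in that period by the order just scheduled.
    """
    if len(remaining) != len(consumption):
        raise ValueError("both vectors must cover the same planning horizon")
    # Subtract the capacity consumed by the newly scheduled order.
    updated = [r - c for r, c in zip(remaining, consumption)]
    if min(updated) < 0:
        raise ValueError("the schedule consumes more capacity than is available")
    # The first element drops out, the rest shift left, and the new last
    # period enters the horizon with full capacity.
    return updated[1:] + [full_capacity]


remaining = [0, 1, 2, 2]      # hypothetical remaining capacity per period
consumption = [0, 1, 2, 0]    # hypothetical consumption of the new order
next_remaining = roll_capacity(remaining, consumption, full_capacity=2)
print(next_remaining)
```

The horizon length stays constant after every transition, which is what allows the same state layout to be reused at each inspection time.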
In order to solve the SCO-IP model, firstly, the system state space of the SCO-IP model is improved, and then the original reverse induction algorithm is optimized and improved based on the improved state space and by combining the characteristics and the mechanism of the SCO-IP model, so that the improved reverse induction algorithm based on the dual-processor mechanism is obtained.
(1) System state space optimization
According to the setting of the SCO-IP model operation environment, the maximum stock level of the raw materials for production is limited. Therefore, in the process of rationality analysis of the state space of the control system, we apply the following constraints to the initial model state space,
I_{1,t} + I_{2,t} + I_{3,t} ≤ I_max (3-23)
Q_{1,t} + Q_{2,t} + Q_{3,t} ≤ I_max (3-24)
In inequalities (3-26) and (3-27), at each system inspection time t the sum of the inventory levels of all production raw materials cannot exceed the set maximum inventory capacity, which is a basic condition for the control system to operate properly. Likewise, at each system decision time t the sum of the replenishment quantities of all production raw materials cannot exceed the maximum inventory capacity.
The following constraints are further imposed on the state space of the system,
I_max - (I_1 + I_2) ≥ I_{3,min} (3-25)
I_max - (I_1 + I_3) ≥ I_{2,min} (3-26)
I_max - (I_3 + I_2) ≥ I_{1,min} (3-27)
Inequalities (3-28) to (3-30) all state that, at a given inventory level, the range within which the inventory level of each raw material can still vary must be no smaller than I_{i,min}, where I_{i,min} is the smallest quantity of the i-th production raw material consumed by any single order in the order pool and I_i is the inventory level of the i-th production raw material. When the inventory component of a system state fails to satisfy these inequalities, the SCO-IP model can no longer accept any order: if the range available to some raw material is smaller than I_{i,min}, that inventory state cannot support the production of any order even if the material is replenished in real time up to the maximum inventory capacity. A production system that cannot accept orders is clearly unreasonable, and such system states make the production process in the SCO-IP model unsustainable.
In conclusion, through the restriction on the inventory system and the production system, the state vectors which do not conform to the actual situation and the basic operation environment in the original system state space can be screened out, so that the system state space is optimized, and the subsequent solution of the model optimal strategy is facilitated.
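As an illustration of this screening step, the following Python sketch enumerates discrete inventory states and keeps only those that satisfy the total-capacity constraint and the per-material headroom constraints stated above. It assumes integer inventory levels and generalizes the three-material inequalities to any number of materials; the function name `feasible_inventory_states` and the numeric values are hypothetical.

```python
from itertools import product


def feasible_inventory_states(i_max, i_min):
    """Enumerate inventory vectors that pass the state-space screening.

    i_max: maximum total inventory capacity (I_max).
    i_min: per-material minimum single-order consumption (I_{i,min}).
    """
    n = len(i_min)
    states = []
    for levels in product(range(i_max + 1), repeat=n):
        total = sum(levels)
        # Total stock of all raw materials must not exceed I_max.
        if total > i_max:
            continue
        # For every material i, the room left after the other materials,
        # I_max - (sum of the others), must be at least I_{i,min}.
        if all(i_max - (total - levels[i]) >= i_min[i] for i in range(n)):
            states.append(levels)
    return states


states = feasible_inventory_states(i_max=4, i_min=(1, 1, 1))
print(len(states), "states kept, e.g.", states[:3])
```

In this toy setting a state such as (0, 0, 4) is removed, because the third material already fills the warehouse and neither of the other two materials could ever reach the quantity that the smallest order needs.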
(2) Order admission processing mechanism
As customer demand has become more diversified, traditional inventory-oriented business models have struggled to keep up with it. In traditional manufacturing in particular, make-to-stock (MTS) production tends to leave products unsold. A make-to-order (MTO) production mode lets an enterprise respond flexibly to diversified market demand: a make-to-order enterprise organizes production mainly around incoming customer orders, which ties the production schedule closely to those orders. In practice, an enterprise usually accepts every order it can while production conditions are sufficient, so as to maximize economic benefit. Based on the SCO-IP model, the order admission rules used in actual production are added and combined with the standard reverse induction algorithm to implement a corresponding order admission mechanism.
According to the operating environment assumed for the SCO-IP model, at most one customer order can arrive at each inspection time t, and when it arrives the decision maker learns the corresponding production information. The random customer orders of the SCO-IP model are drawn from a customer order pool built from an analysis of the foundry's historical order data. Given these basic characteristics of the model, the steps of the order admission mechanism are as follows:
1) Determine an order pool Ord_all from historical statistics; the pool contains N possible orders, and, according to the definition of order information,
Figure BDA0002320665540000151
with n ≤ N and O_k ∈ Ord_all. The arrival of each order in the pool follows a given probability distribution
Figure BDA0002320665540000152
2) Checking the residual production capacity state P of the system;
3) Select order O_n. If the remaining production capacity P of the current system is greater than or equal to P_n and the following inequalities are satisfied,
Figure BDA0002320665540000153
Figure BDA0002320665540000154
then customer order O_n is accepted. Otherwise, customer order O_n is rejected and the rejection penalty is incurred,
Figure BDA0002320665540000155
4) If n = N, the algorithm ends; otherwise set n = n + 1 and repeat step 3).
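The admission loop above can be sketched in Python as follows. Because the inequalities of step 3) are only available as images, the sketch substitutes the horizon constraint (3-14), OP_k / P_max + x - τ ≤ 0, together with the capacity check P ≥ P_n; this substitution, the `Order` fields and the function names are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    name: str
    capacity_needed: float  # production capacity the order consumes (OP_n)
    penalty: float          # fixed penalty if the order is rejected (c_n)


def admit(order, remaining_capacity, p_max, x, tau):
    """Return (accepted, penalty) for one candidate order."""
    # Capacity check: the system must have enough remaining capacity.
    fits_capacity = remaining_capacity >= order.capacity_needed
    # Horizon check, assumed to be constraint (3-14).
    fits_horizon = order.capacity_needed / p_max + x - tau <= 0
    if fits_capacity and fits_horizon:
        return True, 0.0
    return False, order.penalty


def admission_results(pool, remaining_capacity, p_max, x, tau):
    """Steps 3) and 4): evaluate every order in the pool in turn."""
    return {o.name: admit(o, remaining_capacity, p_max, x, tau) for o in pool}


pool = [Order("O1", 2.0, 50.0), Order("O2", 6.0, 80.0)]
results = admission_results(pool, remaining_capacity=4.0, p_max=1.0, x=1, tau=5)
print(results)
```

Each order is evaluated against the same system state because, at a given inspection time, the algorithm needs the outcome for every order that might arrive, not for a sequence of arrivals.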
(3) Production scheduling mechanism
In line with actual production, the SCO-IP model adopts the MTO production mode, so the production schedule is driven by customer order information. Like the order admission mechanism and its algorithm described above, the production scheduling mechanism is based on how scheduling is handled in actual production. Since at most one order can arrive at a time in the SCO-IP model, production scheduling is simplified to a first-come, first-scheduled mode, that is, an order priority mechanism, which is also a fairly common scheduling mode in practice. Based on this mode, the steps of the corresponding production scheduling algorithm are as follows:
1) obtaining the result of the order admission mechanism algorithm at the same examination time t;
2) examining the remaining production capacity state P of the system and the inventory level state I_{t,i} of each raw material;
3) selecting order O_k; if the order is accepted, evaluating the following formulas and finding the t_min that satisfies the corresponding constraints; otherwise, keeping the original production schedule;
Figure BDA0002320665540000161
Figure BDA0002320665540000162
4) if a t_min satisfying the conditions exists, scheduling the order, that is, arranging production continuously from the calculated t_min onward until order O_k is completed, and updating the production schedule; otherwise, keeping the original production schedule;
5) if n = N, the algorithm ends; otherwise, set n = n + 1 and return to step 3).
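The search for t_min in steps 3) and 4) can be sketched in Python under simplifying assumptions. The exact formulas are only available as images, so the sketch follows the verbal description: production must run in consecutive periods, the raw material reference levels at the start period must cover the order, and the start must fall within the planning horizon. The function name `earliest_start` and all data are hypothetical.

```python
def earliest_start(remaining, inventory_ref, capacity_needed, materials):
    """Find the earliest feasible start offset for an accepted order.

    remaining[j]: remaining capacity in period t + j (length tau).
    inventory_ref[i][j]: reference stock of material i at period t + j.
    Returns (offset, periods_used) or None if the order cannot be scheduled.
    """
    tau = len(remaining)
    for start in range(tau):
        # Material check at the candidate start period.
        if any(inventory_ref[i][start] < materials[i] for i in range(len(materials))):
            continue
        covered = 0.0
        for k in range(start, tau):
            # A fully booked period would interrupt continuous production.
            if remaining[k] <= 0:
                break
            covered += remaining[k]
            if covered >= capacity_needed:
                return start, k - start + 1
    return None


remaining = [0.0, 0.5, 1.0, 1.0, 1.0]
inventory_ref = [[3, 3, 8, 8, 8], [5, 5, 5, 5, 5]]
plan = earliest_start(remaining, inventory_ref, capacity_needed=2.0, materials=[4, 2])
print(plan)
```

Returning `None` corresponds to the case in which no t_min exists, so the original production schedule is kept.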
(4) Dual-processor reverse induction algorithm based on optimization space
After the state space of the SCO-IP model is analyzed reasonably and the order admission processing mechanism and the production scheduling mechanism are realized, the order admission processing mechanism and the production scheduling mechanism need to be combined with a standard reverse induction algorithm so as to find an inventory control optimal strategy of the SCO-IP model. Therefore, an improved reverse induction algorithm is provided, which is based on an optimized system state space and incorporates an order admission mechanism and a production scheduling mechanism. The steps of the algorithm are as follows,
1) establishing a production system, an inventory system, a customer order system and a decision system, and initializing relevant data;
2) establishing an initial system state space S_1 and an initial action space A_1;
3) determining a finite stage length T;
4) optimizing the initial system state space and the initial action space to form S'_1 and A'_1;
5) at inspection time t = T, for each possible system state in the optimized system state space S'_1:
6) applying the order admission processing mechanism;
7) applying the production scheduling mechanism;
8) recording the corresponding t_min;
9) applying the optimized action space A'_1;
10) calculating the optimal expected inventory cost of each system state vector when a random customer order arrives, and recording the optimal action vector and the optimal expected inventory cost for each possible system state;
11) at inspection time t = t - 1, for each possible system state in the optimized system state space S'_1:
12) applying the order admission processing mechanism and the production scheduling mechanism, recording the corresponding t_min, and then applying the action space;
13) using a reverse induction algorithm, calculating
Figure BDA0002320665540000171
and recording the corresponding optimal action A*
Figure BDA0002320665540000172
Figure BDA0002320665540000173
14) If t = 1, the algorithm ends; otherwise, return to step 11).
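The backward recursion at the core of steps 5) to 14) can be illustrated with a generic finite-horizon sketch in Python. It is not the patented dual-processor algorithm: the order admission and production scheduling mechanisms are omitted, and the single-material toy inventory problem, its cost figures and all function names are assumptions chosen only to show how the optimal expected cost and optimal action are computed from t = T back to t = 1.

```python
def backward_induction(states, actions, transition, cost, horizon):
    """Minimize expected total cost over a finite horizon.

    actions(s) -> iterable of feasible actions
    transition(s, a) -> dict {next_state: probability}
    cost(s, a) -> immediate expected cost
    Returns (value at t = 1, list of per-period decision rules).
    """
    value = {s: 0.0 for s in states}  # terminal value assumed to be zero
    policy = []
    for _ in range(horizon):          # t = T, T - 1, ..., 1
        new_value, decision = {}, {}
        for s in states:
            best_a, best_q = None, float("inf")
            for a in actions(s):
                q = cost(s, a) + sum(p * value[s2] for s2, p in transition(s, a).items())
                if q < best_q:
                    best_a, best_q = a, q
            new_value[s], decision[s] = best_q, best_a
        value = new_value
        policy.insert(0, decision)    # earliest period first
    return value, policy


# Toy problem: stock 0..2, demand 0 or 1 with equal probability.
STATES = [0, 1, 2]

def toy_actions(s):
    return range(0, 2 - s + 1)        # order quantity limited by capacity

def toy_transition(s, a):
    level, out = s + a, {}
    for demand in (0, 1):
        nxt = max(level - demand, 0)
        out[nxt] = out.get(nxt, 0.0) + 0.5
    return out

def toy_cost(s, a):
    level = s + a
    shortage = 2.5 if level == 0 else 0.0   # expected penalty for unmet demand
    return 2.0 * a + 1.0 * level + shortage

value1, policy1 = backward_induction(STATES, toy_actions, toy_transition, toy_cost, 1)
value2, policy2 = backward_induction(STATES, toy_actions, toy_transition, toy_cost, 2)
print(policy1[0], policy2[0])
```

With a one-period horizon it is cheapest not to order from an empty warehouse, whereas with two periods the recursion orders one unit, which shows why the optimal action has to be recorded per state and per inspection time.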
Key parameters of the model
Figure BDA0002320665540000174
Figure BDA0002320665540000181

Claims (7)

1. The casting enterprise raw material inventory management control method based on the Markov decision theory is characterized by comprising the following steps of: the method comprises the steps of establishing a casting enterprise raw material inventory control model, namely an SCO-IP model, under a dynamic production environment by using a Markov Decision Process theory, carrying out abstract modeling on the model, and finally quantitatively describing the basic operation Process of a researched object; in particular, the method comprises the following steps of,
(1) describing the operation flows of orders, inventory and production in a casting enterprise under a dynamic production environment, and making reasonable assumptions on the environment in which a model needs to be established;
(2) on the basis of the current situation and reasonable assumptions, describing key parameters of the model with Markov Decision Process theory, and establishing a complete Markov tuple;
(3) analyzing the decision rules arising in the Markov Decision Process, and constructing a complete objective cost function;
(4) analyzing the characteristics of the model to find an algorithm for solving the optimal inventory control strategy of the model.
2. The Markov decision theory-based casting enterprise raw material inventory management control method according to claim 1, wherein: making reasonable assumptions about the environment of the model includes:
(1) production materials having the following characteristics were not considered in this model:
① the depreciation of materials is not considered, nor is the reuse of part of the materials after use;
② material lost during production is treated as a necessary production loss;
③ the residual value of scrapped material is not considered for the time being;
(2) in order to quantify the storage cost of the materials, the storage point is regarded as a warehouse with lease expenses;
(3) the warehouse capacity has an upper inventory limit;
(4) the influence of the purchased material quantity on the material supplement speed is not considered;
(5) a certain degree of delayed delivery is allowed, and the resulting penalty cost is borne;
(6) ignoring production preparation time after confirming acceptance of the order;
(7) the production is regarded as a single-line production mode, namely a plurality of orders are not produced at the same time;
(8) the scheduled production tasks can be completed on time, and task delay caused by unexpected factors can not occur;
(9) the stock can ensure the normal production;
(10) taking the order quantity of the raw materials and the corresponding stock level as discrete variables;
(11) at each system review, only the random arrival or non-arrival of an order is considered, and the probability distribution covers the case in which one order arrives and the other orders do not; when a random order arrives, the production planner can determine the relevant information for the order.
3. The Markov decision theory-based casting enterprise raw material inventory management control method according to claim 1, wherein: the decision rule comprises:
(1) order admission rules:
order admission rules aim to address how random customer orders are handled, imposing the following constraints on the model:
OP_k / P_max + x - τ ≤ 0;
(2) production scheduling rules:
since the production scheduling requirement cannot be met only by depending on the stock level state in the model, a production raw material stock level reference matrix is introduced
Figure FDA0002320665530000021
Which takes into account the orders already placed and the ordering situation,
Figure FDA0002320665530000022
Figure FDA0002320665530000023
Figure FDA0002320665530000024
Figure FDA0002320665530000025
t_min - t ≤ τ
after processing the possible incoming orders and scheduling the accepted orders, transition to the system state at the next review time.
4. The Markov decision theory-based casting enterprise raw material inventory management control method according to claim 1, wherein: the objective cost function is:
Figure FDA0002320665530000026
Figure FDA0002320665530000027
5. The Markov decision theory-based casting enterprise raw material inventory management control method according to claim 1, wherein: the Markov Decision Process theory refers to finite-stage deterministic Markov Decision Process theory.
6. The Markov decision theory-based casting enterprise raw material inventory management control method according to claim 1, wherein: the algorithm is an improved reverse induction algorithm based on a dual-processor mechanism by taking a reverse induction method as a main body and combining the characteristics of the model, and the model is solved more efficiently.
7. The Markov decision theory-based casting enterprise raw material inventory management control method according to claim 1, wherein: the Markov tuple comprises: decision time and period, state, action, transition probability and reward;
(1) decision time and period:
in the Markov decision process, the set of decision time points T can have different characteristics, and models can be classified accordingly:
a) when the set of decision time points T is a countably infinite set, that is, T = {1, 2, 3, …, n, …}, the model is regarded as a discrete decision time model with an infinite planning stage;
b) when the set of decision time points T is a countable finite set, that is, T = {1, 2, 3, …, n}, the model is regarded as a discrete decision time model with a finite planning stage;
c) when the set of decision times T is a bounded continuous set, that is, T = [0, n], the model is regarded as a continuous decision time model with a finite stage;
d) when the set of decision times T is an unbounded continuous set, that is, T = [0, ∞), the model is regarded as a continuous decision time model with an infinite stage;
the model is a discrete decision time model under a finite stage;
(2) state and action set:
at the beginning of each decision time, the system presents the corresponding state; S represents the set of possible system states; when the decision maker observes at some decision time that the system is in state s ∈ S, a reasonable action a ∈ A_s is selected from the feasible action set A_s for that state;
(3) Transition probability and reward:
at decision time t, after taking action against state s, two effects are produced on the system:
the decision maker receives an immediate reward (cost) r (s, a),
the current state transitions to the state at the next decision time according to the probability distribution p_t(· | s, a);
the immediate reward (cost) r(s, a) is a real-valued function defined on s ∈ S, a ∈ A_s, which represents the reward (cost) generated in period t after the decision is made at decision time t; a positive value represents a profit or reward, otherwise it represents a cost; in the Markov decision process, how the immediate reward (cost) arises is not of concern, and it is only necessary that the value or expected value of r(s, a) is known once an action is selected; when continuing to the next inspection time, the immediate reward r(s, a) includes: one-time rewards (costs); cumulative rewards (costs); random rewards (costs); rewards (costs) related to the state at the next time;
the expected reward value for action a may be expressed as:
Figure FDA0002320665530000041
in the above equation, p(s' | s, a) is the probability that the system transitions from state s to state s' at the next decision time; for the transition probabilities it usually holds that
Figure FDA0002320665530000042
to this end, a complete markov decision process can be written:
{T,S,A(s),p(·|s,a),r(s,a)}。
CN201911296380.7A 2019-12-16 2019-12-16 Casting enterprise raw material inventory management control method based on Markov decision theory Active CN111126905B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911296380.7A CN111126905B (en) 2019-12-16 2019-12-16 Casting enterprise raw material inventory management control method based on Markov decision theory

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911296380.7A CN111126905B (en) 2019-12-16 2019-12-16 Casting enterprise raw material inventory management control method based on Markov decision theory

Publications (2)

Publication Number Publication Date
CN111126905A true CN111126905A (en) 2020-05-08
CN111126905B CN111126905B (en) 2023-08-01

Family

ID=70499300

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911296380.7A Active CN111126905B (en) 2019-12-16 2019-12-16 Casting enterprise raw material inventory management control method based on Markov decision theory

Country Status (1)

Country Link
CN (1) CN111126905B (en)

Also Published As

Publication number Publication date
CN111126905B (en) 2023-08-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant