CN114996781A - Two-dimensional irregular part layout method and system based on actor-critic

Info

Publication number: CN114996781A
Application number: CN202210592548.4A
Authority: CN
Inventors: 史明亮; 孙伟平; 饶运清; 方杰
Original assignee: Huazhong University of Science and Technology
Current assignee: Huazhong University of Science and Technology
Priority date: 2022-05-27
Filing date: 2022-05-27
Publication date: 2022-09-02
Anticipated expiration: 2042-05-27
Also published as: CN114996781B

Abstract

The invention discloses an actor-critic-based two-dimensional irregular part layout method and system, belonging to the field of intelligent sheet cutting and blanking. The method comprises: summarizing the constraints of the two-dimensional part layout problem and, taking the utilization rate of the stock sheet as the objective, establishing a model that optimizes the part sequence and the part positioning during layout; and solving the model with an actor-critic algorithm that uses a pointer network as the base network in combination with a heuristic algorithm. The method can effectively improve stock sheet utilization within a limited time, reduce production cost, and offer an approach for later work that solves part layout problems with deep learning.

Description

Two-dimensional irregular part layout method and system based on actor-critic

Technical Field

The invention belongs to the field of intelligent sheet cutting and blanking, and particularly relates to an actor-critic-based two-dimensional irregular part layout method and system.

Background

The part layout problem arises widely in manufacturing, clothing, printing and similar industries, for example in sheet processing, leather manufacturing, glass cutting and garment cutting. Completing the layout in a short time while improving raw material utilization can therefore effectively reduce labor and material costs and bring significant economic benefits.

At present, mainstream algorithms for the part layout problem fall into traditional heuristic algorithms and intelligent optimization algorithms. Heuristic algorithms mainly solve the positioning, on the raw material, of parts given in a known sequence; common examples include the bottom-left (BL) algorithm, the binary tree layout algorithm and the lowest horizontal line algorithm. Intelligent optimization algorithms mainly address the sequence in which parts are laid out; common examples include genetic algorithms, simulated annealing and ant colony algorithms. Heuristic algorithms are fast and convenient and apply to many scenarios, but their raw material utilization is low. By contrast, intelligent optimization algorithms generally produce results close to the optimal solution, but they must be hand-designed for each specific problem and modified whenever the problem changes even slightly, so their generality is poor, and they take a long time to produce a result.

An earlier work proposes a two-dimensional irregular polygon layout method based on deep reinforcement learning, whose main idea is as follows: first, shape features are extracted from the two-dimensional irregular parts to obtain feature vectors; the feature vectors are fed into an actor network with an encoder-decoder structure; reward values are calculated by a critic network that also has an encoder-decoder structure; and the networks are updated by backpropagation to obtain the final result.

However, this method has the following disadvantages: 1) the reinforcement learning network for two-dimensional irregular part layout is trained from scratch, without using the prior experience available from rectangular part layout algorithms, so training takes long to converge and sheet utilization is reduced; 2) collision detection uses the critical polygon (no-fit polygon) algorithm, which is time-consuming; 3) constrained by the irregular part geometry, positioning uses only the simplest bottom-left fill algorithm, so the positioning strategy is limited and its results are poor; 4) extracting shape features from the two-dimensional irregular parts causes compression loss, which lowers the utilization of the layout result.

Disclosure of Invention

In view of the defects of the prior art and the need for improvement, the invention provides an actor-critic-based two-dimensional irregular part layout method and system, which aim to use the advantages of deep learning to effectively improve, within a limited time, the stock sheet utilization achieved by a two-dimensional irregular part layout algorithm, and to reduce production cost.

To achieve the above object, according to a first aspect of the present invention, there is provided an actor-critic-based two-dimensional irregular part layout method, comprising:

a training stage:

S1, constructing a first rectangular part data set for training a critic network by rectangular cutting; generating two-dimensional irregular parts with a polygon generation algorithm and enveloping them in rectangles to construct a second rectangular part data set for training an actor network, wherein each rectangular part data set comprises a plurality of rectangular part sets, the critic network and the actor network each consist of a pointer network and a positioning algorithm, the pointer network determines the layout sequence of the parts, and the positioning algorithm determines the layout position of each part;

S2, determining the reward of the current layout scheme corresponding to each rectangular part set in the first rectangular part data set, wherein the reward is the difference between the area of the current layout scheme and the minimum layout area of the current rectangular part set;

S3, obtaining the critic network reward gradient by a stochastic policy gradient from the rewards of all current layout schemes corresponding to the first rectangular part data set, and updating the parameters of the pointer network in the critic network;

S4, repeating S2 to S3 until the average area of the current layout schemes converges, thereby obtaining a trained critic network;

S5, determining the reward of the current layout scheme of each rectangular part set in the second rectangular part data set, wherein the reward is the difference between the area of the current layout scheme obtained by the actor network and the area of the reference layout scheme obtained by the trained critic network;

S6, obtaining the actor network reward gradient by a stochastic policy gradient from the rewards of all current layout schemes corresponding to the second rectangular part data set, and updating the parameters of the pointer network in the actor network; taking the mean square error of the rewards of the current layout schemes as the critic network loss, and updating the parameters of the critic network;

S7, repeating S5 to S6 until the average area of the current layout schemes converges, thereby obtaining a trained actor network;

an application stage:

inputting all the two-dimensional irregular parts to be laid out into the trained actor network to obtain a layout scheme.

Preferably, the rectangular cutting mode is as follows:

firstly, setting a complete rectangle of size (W, H) and the number n of parts to be generated, and initializing the rectangular part set $I = \{(W, H)\}$;

then, randomly taking a rectangular part out of $I$, randomly selecting one edge from its edge set, and randomly selecting a cutting point on that edge; dividing the rectangular part into two rectangles at the cutting point, adding both to $I$, and deleting the original rectangular part from $I$; performing the above operation n times to obtain the final rectangular part set $I = \{(W_1, H_1), (W_2, H_2), \ldots, (W_n, H_n)\}$.

Beneficial effect: the automatically generated rectangular data set is large, which helps training and avoids overfitting. Because all rectangular parts are cut from one complete rectangle, the minimum layout area of each rectangular part set is the area $W \times H$ of the complete rectangle, which solves the problem of missing labels in common data sets.
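The rectangular cutting procedure can be illustrated with a minimal Python sketch. It is not the patented implementation: the function name `cut_rectangles`, the restriction to integer dimensions, and the choice to stop once exactly n rectangles exist (that is, after n − 1 cuts) are assumptions made for illustration.

```python
import random


def cut_rectangles(W, H, n, rng=None):
    """Split a W x H rectangle into n rectangles by random straight cuts.

    Assumption: integer side lengths, so every cut lands on an integer point.
    """
    rng = rng or random.Random()
    parts = [(W, H)]  # the set I, initialised with the complete rectangle
    while len(parts) < n:
        # Only rectangles with a side longer than 1 can still be cut.
        cuttable = [i for i, (w, h) in enumerate(parts) if w > 1 or h > 1]
        if not cuttable:
            break  # nothing left to cut; fewer than n parts are returned
        w, h = parts.pop(rng.choice(cuttable))  # take one rectangle out of I
        # Pick the edge (direction) to cut along, then a cutting point on it.
        axes = [axis for axis, size in (("w", w), ("h", h)) if size > 1]
        if rng.choice(axes) == "w":
            c = rng.randint(1, w - 1)
            parts += [(c, h), (w - c, h)]
        else:
            c = rng.randint(1, h - 1)
            parts += [(w, c), (w, h - c)]
    return parts


parts = cut_rectangles(20, 10, 15, random.Random(0))
total_area = sum(w * h for w, h in parts)  # equals W * H by construction
```

Because every cut preserves the total area, the known optimum W × H comes for free with each generated set, which is what lets the reward in step S2 be computed without a labelled data set.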

Preferably, the pointer network takes the following structure:

for a two-dimensional part layout problem, a data pair $(P, C^P)$ is given, where $P$ is the set of parts to be laid out on a single sheet, $C^P = \{C_1, \ldots, C_n\}$ is the optimal part layout sequence, and $C_i$ is the $i$-th selected part; a sequence-to-sequence model, using a recurrent neural network as the base network, estimates the conditional probability $p(C^P \mid P; \theta)$ by the chain rule:

$$p(C^P \mid P; \theta) = \prod_{i=1}^{n} p(C_i \mid C_1, \ldots, C_{i-1}, P; \theta)$$

the pointer network models $p(C_i \mid C_1, \ldots, C_{i-1}, P)$ as follows:

$$u^i_j = v^{T} \tanh(W_1 e_j + W_2 d_i), \quad j \in (1, \ldots, n)$$

$$p(C_i \mid C_1, \ldots, C_{i-1}, P) = \mathrm{softmax}(u^i)$$

the parameters of the model are updated by maximizing the conditional probability over the training set:

$$\theta^{*} = \arg\max_{\theta} \sum_{P, C^P} \log p(C^P \mid P; \theta)$$

where $\theta$ denotes all learnable parameters of the model, $u^i_j$ is the $j$-th value of the $i$-th one-dimensional vector of a two-dimensional array, $e_j$ is the $j$-th encoder output, $d_i$ is the $i$-th decoder output, $u^i$ is the $i$-th one-dimensional vector of the two-dimensional array, the softmax function normalizes $u^i$ (of length $n$) into a probability distribution over the input dictionary, and $v$, $W_1$ and $W_2$ are learnable parameters of the output model.

Beneficial effect: with the pointer network, the combinatorial optimization problem underlying part layout can be trained with deep learning, which makes it possible to improve the generalization ability and accuracy of the algorithm.
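As a hedged illustration of the pointer mechanism, the sketch below computes one decoding step in plain Python. It assumes the standard additive-attention form u_j = vᵀ tanh(W1 e_j + W2 d_i) followed by a softmax; the helper names (`matvec`, `softmax`, `pointer_distribution`) and the toy weights are hypothetical and only serve the example.

```python
import math


def matvec(M, x):
    """Multiply matrix M (list of rows) by vector x."""
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def softmax(u):
    """Numerically stable softmax over a list of scores."""
    m = max(u)
    exps = [math.exp(x - m) for x in u]
    s = sum(exps)
    return [e / s for e in exps]


def pointer_distribution(E, d_i, W1, W2, v):
    """Return p(C_i = j | ...) for every input part j.

    E   : list of encoder outputs e_j (one vector per part)
    d_i : decoder output at step i
    """
    w2d = matvec(W2, d_i)
    u = []
    for e_j in E:
        hidden = [math.tanh(a + b) for a, b in zip(matvec(W1, e_j), w2d)]
        u.append(sum(vk * hk for vk, hk in zip(v, hidden)))  # u^i_j
    return softmax(u)


# Toy example: three parts, identity weights (assumed values).
identity = [[1.0, 0.0], [0.0, 1.0]]
probs = pointer_distribution(
    E=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    d_i=[0.5, -0.5],
    W1=identity,
    W2=identity,
    v=[1.0, 1.0],
)
```

The output has one probability per input part, so the network "points" at an element of its own input rather than at a fixed vocabulary, which is why it suits ordering problems with a variable number of parts.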

Preferably, the positioning algorithm is a heuristic algorithm, which is specifically as follows:

(1) determining the position of the first part to be laid out with the bottom-left algorithm, according to the part layout sequence obtained by the pointer network;

(2) obtaining two rectangular available spaces in the remaining sheet according to the maximum-area criterion, putting them into an initially empty set ES, and initializing a part placement angle set O comprising a horizontal placement angle and a vertical placement angle;

(3) for each rectangular available space $s \in ES$ and each angle $o \in O$, calculating the total area occupied by all parts after the next part is placed in available space $s$ at angle $o$, wherein the total area is the area of the smallest rectangle that contains all placed parts; selecting the angle $o_i$ and available space $s_i$ that give the smallest total area, and placing the part with the bottom-left algorithm.

Beneficial effect: the heuristic algorithm is well matched to the pointer network, is simple and effective, and further reduces the running time of the whole scheme while maintaining stock sheet utilization.
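Step (3) of the heuristic can be sketched as a small search over (available space, angle) pairs. This is a simplified illustration under stated assumptions: rectangles are axis-aligned tuples `(x, y, w, h)`, the available spaces are passed in rather than derived by the maximum-area criterion, and "bottom-left" placement is reduced to putting the part at the lower-left corner of the chosen space. The function names are hypothetical.

```python
def bounding_area(rects):
    """Area of the smallest axis-aligned rectangle containing all rects."""
    x0 = min(x for x, y, w, h in rects)
    y0 = min(y for x, y, w, h in rects)
    x1 = max(x + w for x, y, w, h in rects)
    y1 = max(y + h for x, y, w, h in rects)
    return (x1 - x0) * (y1 - y0)


def choose_placement(placed, spaces, part):
    """Try the part in every available space at both angles.

    Returns (total_area, placed_rect, space) for the best candidate,
    or None if the part fits nowhere.
    """
    pw, ph = part
    best = None
    for space in spaces:
        sx, sy, sw, sh = space
        for w, h in ((pw, ph), (ph, pw)):  # horizontal and vertical angle
            if w <= sw and h <= sh:
                candidate = (sx, sy, w, h)  # lower-left corner of the space
                area = bounding_area(placed + [candidate])
                if best is None or area < best[0]:
                    best = (area, candidate, space)
    return best


# One part already placed; two available spaces to the right and above it.
placed = [(0, 0, 4, 2)]
spaces = [(4, 0, 6, 10), (0, 2, 10, 8)]
best = choose_placement(placed, spaces, part=(2, 3))
```

In this toy case the part is rotated and placed to the right of the first part, because that keeps the enclosing rectangle at 7 × 2 rather than growing it upward.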

Preferably, in step S3, the stochastic policy gradient is calculated as follows:

$$\nabla_\theta J(\theta) \approx \frac{1}{M} \sum_{i=1}^{M} \left( A(w_i \mid t_i) - b(t_i) \right) \nabla_\theta \log p_\theta(w_i \mid t_i)$$

where $J(\theta)$ is the training objective, $M$ is the number of rectangular part sets in the first rectangular part data set, $t_i$ is the $i$-th sample input to the model, $w_i$ is the two-dimensional part layout order output by the model, $A(w_i \mid t_i)$ is the total sheet area used when the input is $t_i$ and the output order is $w_i$, $b(t_i)$ is the minimum layout area of the $i$-th rectangular part set, $\nabla_\theta$ denotes the partial derivative with respect to $\theta$, and $p_\theta(w_i \mid t_i)$ is the probability of outputting $w_i$ given the input $t_i$.

Beneficial effect: the stochastic policy gradient algorithm suits the part layout problem, in which parts are placed sequentially and the choices made for different parts are coupled as in a combinatorial optimization problem, and it helps training converge quickly.
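To make the estimator concrete, the following toy sketch applies a policy gradient with baseline to a deliberately tiny policy: a softmax over K candidate orderings with logits θ, for which the gradient of log p(w) is one_hot(w) − softmax(θ). The real policy in the method is the pointer network; the toy policy, the function names and the numbers are assumptions chosen so the result can be checked by hand.

```python
import math


def softmax(theta):
    m = max(theta)
    exps = [math.exp(x - m) for x in theta]
    s = sum(exps)
    return [e / s for e in exps]


def policy_gradient(theta, samples, areas, baselines):
    """Estimate (1/M) * sum_i (A_i - b_i) * grad log p_theta(w_i).

    theta     : logits of a toy categorical policy over K orderings
    samples   : index w_i of the ordering sampled for each instance
    areas     : layout area A(w_i | t_i) obtained with that ordering
    baselines : baseline b(t_i), e.g. the known minimum layout area
    """
    p = softmax(theta)
    M = len(samples)
    grad = [0.0] * len(theta)
    for w, A, b in zip(samples, areas, baselines):
        advantage = A - b
        for k in range(len(theta)):
            indicator = 1.0 if k == w else 0.0
            grad[k] += advantage * (indicator - p[k]) / M
    return grad


grad = policy_gradient(
    theta=[0.0, 0.0],
    samples=[0, 1],
    areas=[120.0, 100.0],
    baselines=[100.0, 100.0],
)
# Since area is to be minimised, a descent step theta <- theta - lr * grad
# lowers the logit of ordering 0, which exceeded its baseline by 20.
```

Subtracting the baseline does not change the expected gradient but reduces its variance; a sample that matches the baseline (here the second one) contributes nothing.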

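Preferably, in step S6, the stochastic policy gradient is calculated as follows:

$$\nabla_\theta J(\theta) \approx \frac{1}{N} \sum_{i=1}^{N} \left( A(w_i \mid t_i) - b(t_i) \right) \nabla_\theta \log p_\theta(w_i \mid t_i)$$

where $J(\theta)$ is the training objective, $N$ is the number of rectangular part sets in the second rectangular part data set, $t_i$ is the $i$-th sample input to the model, $w_i$ is the two-dimensional part layout order output by the model, $A(w_i \mid t_i)$ is the total sheet area used when the input is $t_i$ and the output order is $w_i$, $b(t_i)$ is the layout area obtained by the trained critic network for the $i$-th rectangular part set, $\nabla_\theta$ denotes the partial derivative with respect to $\theta$, and $p_\theta(w_i \mid t_i)$ is the probability of outputting $w_i$ given the input $t_i$.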

To achieve the above object, according to a second aspect of the present invention, there is provided an actor-critic-based two-dimensional irregular parts layout system comprising: a computer-readable storage medium and a processor;

the computer-readable storage medium is used for storing executable instructions;

the processor is configured to read executable instructions stored in the computer-readable storage medium and execute the actor-critic-based two-dimensional irregular part layout method of the first aspect.

Generally, by the above technical solution conceived by the present invention, the following beneficial effects can be obtained:

Existing layout methods train the reinforcement learning network for two-dimensional irregular part layout from scratch and cannot use the prior experience available from rectangular part layout algorithms, which leads to long training convergence time and low sheet utilization. To address this, the invention adopts transfer learning and takes the trained rectangular part layout algorithm as the critic network, so that the prior experience of rectangular part layout is reused, the convergence speed of training is greatly improved, and layout utilization is effectively improved. In addition, the invention uses rectangular enveloping: because parts enveloped in rectangles no longer require the critical polygon (no-fit polygon) algorithm for collision detection, training and inference are greatly accelerated, and for the fairly regular irregular parts that are typical of actual production, little area is wasted by the envelope. Parts can be positioned with different positioning algorithms, which gives a wide range of choices and strong generality.

Drawings

FIG. 1 is a flow chart of the actor-critic-based two-dimensional irregular part layout method provided by the present invention;

FIG. 2 is a schematic diagram of an automatic generation process of a rectangular part data set according to a preferred embodiment of the present invention;

FIG. 3 is a schematic diagram of a pointer network architecture provided by the preferred embodiment of the present invention;

FIG. 4 is a schematic diagram of the rectangular available space provided by the preferred embodiment of the present invention;

FIG. 5 is a schematic diagram of the final layout results provided by the preferred embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and do not limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.

The two-dimensional irregular part layout problem addressed by the invention is as follows: given a sufficiently large sheet, arrange n parts on it in a certain sequence and orientation so that the sheet area used is minimized.

Under these conditions, as shown in FIG. 1, the invention provides an actor-critic-based two-dimensional irregular part layout method, which comprises the following steps:

a training stage:

Step S1: Construct the training sets.

S11, generate a rectangular part set by rectangular cutting, and repeat the rectangular cutting multiple times to obtain a rectangular part data set (containing M rectangular part sets), which serves as the training set of the critic network.

The rectangular cutting process is shown in FIG. 2:

First, a complete rectangle of size (W, H) and the number n of parts to be generated are set, and the rectangular part set $I = \{(W, H)\}$ is initialized.

Then, a rectangular part is randomly taken out of $I$, one edge is randomly selected from its edge set, and a cutting point is randomly selected on that edge; the rectangular part is divided into two rectangles at the cutting point, both are added to $I$, and the original rectangular part is deleted from $I$. The above operation is performed n times to obtain the final rectangular part set $I = \{(W_1, H_1), (W_2, H_2), \ldots, (W_n, H_n)\}$.

Since all rectangular parts are cut from one complete rectangle, the minimum layout area of each rectangular part set is the area $W \times H$ of the complete rectangle.

S12, generate a two-dimensional irregular part set with a polygon generation algorithm, and repeat this multiple times to generate a two-dimensional irregular part data set.

S13, for each part in each two-dimensional irregular part set, envelop the part in a rectangle by an enveloping method to obtain an enveloped part data set, which serves as the training set of the actor network.
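The enveloping of step S13 can be sketched as follows. The text does not specify the enveloping method, so this example assumes the simplest choice, an axis-aligned bounding rectangle; the function names are hypothetical, and the shoelace formula is included only to show how much area the envelope wastes for a given part.

```python
def envelope(polygon):
    """Axis-aligned bounding rectangle (width, height) of a polygon.

    polygon: list of (x, y) vertices in order.
    """
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (max(xs) - min(xs), max(ys) - min(ys))


def polygon_area(polygon):
    """Polygon area by the shoelace formula."""
    twice_area = 0.0
    for (x0, y0), (x1, y1) in zip(polygon, polygon[1:] + polygon[:1]):
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


triangle = [(0, 0), (4, 0), (1, 3)]
w, h = envelope(triangle)                      # enveloped rectangle size
fill_ratio = polygon_area(triangle) / (w * h)  # share of the envelope used
```

A triangle is close to the worst case for this envelope (half of the rectangle is empty); parts that are nearly rectangular lose far less, which is the situation the description has in mind for production parts.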

Step S2: For each rectangular part set, determine the reward of its current layout scheme. The specific steps are as follows:

S21, determine the layout sequence of the parts with a pointer network, whose structure is shown in FIG. 3; determine the layout positions of the parts with a heuristic algorithm to obtain the current layout scheme; and calculate the area of the current layout scheme.

As shown in fig. 3, the pointer network takes the following structure:

For a two-dimensional part layout problem, a data pair $(P, C^P)$ is given, where $P$ is the set of parts to be laid out on a single sheet and $C^P = \{C_1, \ldots, C_n\}$ is the optimal part layout sequence. A sequence-to-sequence model, using a recurrent neural network as the base network, estimates the conditional probability $p(C^P \mid P; \theta)$ by the chain rule:

$$p(C^P \mid P; \theta) = \prod_{i=1}^{n} p(C_i \mid C_1, \ldots, C_{i-1}, P; \theta)$$

The pointer network models $p(C_i \mid C_1, \ldots, C_{i-1}, P)$ as follows:

$$u^i_j = v^{T} \tanh(W_1 e_j + W_2 d_i), \quad j \in (1, \ldots, n)$$

$$p(C_i \mid C_1, \ldots, C_{i-1}, P) = \mathrm{softmax}(u^i)$$

The parameters of the model are updated by maximizing the conditional probability over the training set:

$$\theta^{*} = \arg\max_{\theta} \sum_{P, C^P} \log p(C^P \mid P; \theta)$$

Here $e_j$ is the $j$-th encoder output, $d_i$ is the $i$-th decoder output, the softmax function normalizes $u^i$ (of length $n$) into a probability distribution over the input dictionary, and $v$, $W_1$ and $W_2$ are learnable parameters of the output model.
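The chain-rule factorization can be made concrete with a short sketch that scores a complete layout order from per-step pointer scores. It is illustrative only: the function name is hypothetical, and masking out parts that were already selected (so that each part appears exactly once) is an assumption that the text does not state explicitly.

```python
import math


def sequence_log_prob(scores, order):
    """log p(C^P | P) = sum_i log p(C_i | C_1..C_{i-1}, P).

    scores : scores[i][j] is the pointer score u^i_j at decoding step i
    order  : order[i] is the index of the part selected at step i
    """
    chosen = set()
    total = 0.0
    for u_i, c in zip(scores, order):
        if c in chosen:
            raise ValueError("each part may be selected only once")
        # Softmax restricted to the parts that are still available.
        available = [j for j in range(len(u_i)) if j not in chosen]
        m = max(u_i[j] for j in available)
        log_z = m + math.log(sum(math.exp(u_i[j] - m) for j in available))
        total += u_i[c] - log_z
        chosen.add(c)
    return total


# With uniform scores every one of the 3! orders is equally likely.
uniform = [[0.0, 0.0, 0.0] for _ in range(3)]
log_p = sequence_log_prob(uniform, order=[2, 0, 1])
```

This log-probability is the quantity whose gradient appears in the policy gradient of step S3, so in a framework with automatic differentiation it is the only term that needs to be differentiated.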

The heuristic algorithm flow is as follows:

(1) According to the part layout sequence obtained by the pointer network, determine the position of the first part to be laid out with the bottom-left (BL) algorithm.

(2) Obtain two rectangular available spaces in the remaining sheet according to the maximum-area criterion (the rectangular available spaces are illustrated in FIG. 4), put them into an initially empty set ES, and initialize the part placement angle set O, which comprises a horizontal placement angle and a vertical placement angle.

(3) For each rectangular available space $s \in ES$ and each angle $o \in O$, calculate the total area occupied by all parts after the next part is placed in available space $s$ at angle $o$, where the total area is the area of the smallest rectangle that contains all placed parts. Select the angle $o_i$ and available space $s_i$ that give the smallest total area, and place the part according to the BL principle.

S22, take the difference between the area of the current layout scheme and the minimum layout area of the current rectangular part set as the reward of the current layout scheme, thereby increasing the probability of selecting output schemes that attain the minimum layout area.

Step S3: From the rewards of the M current layout schemes, obtain the reward gradient with a stochastic policy gradient, and update the parameters of the pointer network by backpropagation. The stochastic policy gradient is calculated as follows:

$$\nabla_\theta J(\theta) \approx \frac{1}{M} \sum_{i=1}^{M} \left( A(w_i \mid t_i) - b(t_i) \right) \nabla_\theta \log p_\theta(w_i \mid t_i)$$

where $J(\theta)$ is the training objective, $M$ is the number of training samples in a batch, $t_i$ is the $i$-th sample input to the model, $w_i$ is the two-dimensional part layout order output by the model, $A(w_i \mid t_i)$ is the total sheet area used when the input is $t_i$ and the output order is $w_i$, $b(t_i)$ is the minimum layout area of the $i$-th rectangular part set, $\nabla_\theta$ denotes the partial derivative with respect to $\theta$, and $p_\theta(w_i \mid t_i)$ is the probability of outputting $w_i$ given the input $t_i$.

Step S4: Repeat steps S2 to S3; stop when the average area of the M current layout schemes has converged, and take the current pointer network together with the heuristic algorithm as the critic network.

In this embodiment, the average area of the M current layout schemes is judged to have converged when, for 5 consecutive iterations, it is larger than the minimum average area obtained so far.
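This stopping rule is a patience-based early stop, which can be sketched as below. The function name, the `run_epoch` callback and the `max_epochs` cap are hypothetical; the sketch also treats an average area equal to the best one as "no improvement", a slight widening of the strict "larger than" in the text.

```python
def train_until_converged(run_epoch, patience=5, max_epochs=1000):
    """Run training iterations until the average area stops improving.

    run_epoch : callable that performs one iteration (steps S2 to S3)
                and returns the average layout area of that iteration
    Returns (number of iterations run, best average area seen).
    """
    best = float("inf")
    stale = 0  # consecutive iterations without a new minimum
    for epoch in range(1, max_epochs + 1):
        avg_area = run_epoch()
        if avg_area < best:
            best = avg_area
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                return epoch, best
    return max_epochs, best


# Simulated average areas: the minimum 8 is reached in iteration 3 and the
# next five iterations fail to beat it, so training stops after iteration 8.
history = iter([10, 9, 8, 8.5, 8.2, 8.1, 8.3, 8.4, 7])
epochs_run, best_area = train_until_converged(lambda: next(history))
```

Note that the simulated value 7 in the ninth position is never reached, which shows the usual trade-off of such a rule: it bounds training time at the risk of stopping just before a late improvement.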

Step S5: For each part set in the enveloped part data set (containing N part sets), determine the reward of its current layout scheme. The specific steps are as follows:

S51, determine the layout sequence and positions of the parts with the critic network, take the result as the reference layout, and calculate the area of the reference layout.

S52, determine the layout sequence of the parts with the pointer network of the actor network and their layout positions with the heuristic algorithm of the actor network to obtain the current layout scheme, and calculate the area of the current layout scheme.

S53, take the difference between the area of the current layout scheme obtained by the actor network and the area of the reference layout as the reward of the current layout scheme.

Step S6: From the rewards of the N current layout schemes, obtain the actor network reward gradient with a stochastic policy gradient, and update the parameters of the actor network. Take the mean square error of the rewards of the current layout schemes as the critic network loss, and update the parameters of the critic network.

Step S7: Repeat steps S5 to S6; stop when the average area of the N current layout schemes has converged, and take the actor network as the final model for solving the irregular part layout problem.
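The reward of step S53 and the critic loss of step S6 can be written down in a few lines. The sketch reads "mean square error of the rewards" as the mean of the squared differences between the actor's area and the critic's reference area; that reading, like the function name, is an assumption rather than something the text spells out.

```python
def rewards_and_critic_loss(actor_areas, critic_areas):
    """Per-sample rewards and the critic's mean-squared-error loss.

    actor_areas  : area of the current layout from the actor network
    critic_areas : area of the reference layout from the critic network
    """
    rewards = [a - c for a, c in zip(actor_areas, critic_areas)]
    loss = sum(r * r for r in rewards) / len(rewards)
    return rewards, loss


rewards, critic_loss = rewards_and_critic_loss(
    actor_areas=[110.0, 95.0],
    critic_areas=[100.0, 100.0],
)
```

A negative reward (second sample) means the actor already beats the reference layout on that part set; because the objective is to minimize area, such samples push the policy toward the sampled order.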

If the number of two-dimensional irregular parts to be laid out in the application stage is far larger than the number of parts in a single irregular part set used during training, the number of parts per irregular part set must be set to be greater than or equal to the number of parts to be laid out, a new data set must be generated, and the training must be carried out again.

An application stage:

Input all the two-dimensional irregular parts to be laid out into the trained actor network to obtain a layout scheme.

Examples

500,000 rectangular part sets were generated, each containing 15 rectangular parts. 500,000 irregular part sets were generated, each containing 15 polygonal parts of different shapes with between 3 and 7 vertices, and all parts were enveloped to obtain 500,000 enveloped part sets. The optimization objective is the sheet area consumed after layout; the model converged after 56 rounds of training, at which point training was complete. In the test, 15 polygonal samples were input into the trained model; the resulting two-dimensional irregular part layout scheme is shown in FIG. 5, where the area inside the dashed frame is the sheet area consumed by the layout scheme.

It will be understood by those skilled in the art that the foregoing is only a preferred embodiment of the present invention, and is not intended to limit the invention, and that any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims

1. An actor-critic-based two-dimensional irregular part layout method, characterized by comprising the following steps:

a training stage:

an application stage:

2. The method according to claim 1, characterized in that the rectangular cutting is performed in particular as follows:

then, randomly taking a rectangular part out of $I$, randomly selecting one edge from its edge set, and randomly selecting a cutting point on that edge; dividing the rectangular part into two rectangles at the cutting point, adding both to the rectangular part set $I$, and deleting the original rectangular part from $I$; performing the above operation n times to obtain the final rectangular part set $I = \{(W_1, H_1), (W_2, H_2), \ldots, (W_n, H_n)\}$.

3. The method of claim 1, wherein the pointer network takes the following structure:

for a two-dimensional part layout problem, a data pair $(P, C^P)$ is given, where $P$ is the set of parts to be laid out on a single sheet, $C^P = \{C_1, \ldots, C_n\}$ is the optimal part layout sequence, and $C_i$ is the $i$-th selected part; a sequence-to-sequence model, using a recurrent neural network as the base network, estimates the conditional probability $p(C^P \mid P; \theta)$ by the chain rule:

$$p(C^P \mid P; \theta) = \prod_{i=1}^{n} p(C_i \mid C_1, \ldots, C_{i-1}, P; \theta)$$

the pointer network models $p(C_i \mid C_1, \ldots, C_{i-1}, P)$ as follows:

$$u^i_j = v^{T} \tanh(W_1 e_j + W_2 d_i), \quad j \in (1, \ldots, n)$$

$$p(C_i \mid C_1, \ldots, C_{i-1}, P) = \mathrm{softmax}(u^i)$$

where $\theta$ denotes all learnable parameters of the model, $u^i_j$ is the $j$-th value of the $i$-th one-dimensional vector of a two-dimensional array, $e_j$ is the $j$-th encoder output, $d_i$ is the $i$-th decoder output, $u^i$ is the $i$-th one-dimensional vector of the two-dimensional array, the softmax function normalizes $u^i$ (of length $n$) into a probability distribution over the input dictionary, and $v$, $W_1$ and $W_2$ are learnable parameters of the output model.

4. The method of claim 1, wherein the positioning algorithm is a heuristic algorithm, as follows:

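5. The method of claim 1, wherein in step S3, the stochastic policy gradient is calculated as follows:

$$\nabla_\theta J(\theta) \approx \frac{1}{M} \sum_{i=1}^{M} \left( A(w_i \mid t_i) - b(t_i) \right) \nabla_\theta \log p_\theta(w_i \mid t_i)$$

where $J(\theta)$ is the training objective, $M$ is the number of rectangular part sets in the first rectangular part data set, $t_i$ is the $i$-th sample input to the model, $w_i$ is the two-dimensional part layout order output by the model, $A(w_i \mid t_i)$ is the total sheet area used when the input is $t_i$ and the output order is $w_i$, $b(t_i)$ is the minimum layout area of the $i$-th rectangular part set, $\nabla_\theta$ denotes the partial derivative with respect to $\theta$, and $p_\theta(w_i \mid t_i)$ is the probability of outputting $w_i$ given the input $t_i$.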

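6. The method of claim 1, wherein in step S6, the stochastic policy gradient is calculated as follows:

$$\nabla_\theta J(\theta) \approx \frac{1}{N} \sum_{i=1}^{N} \left( A(w_i \mid t_i) - b(t_i) \right) \nabla_\theta \log p_\theta(w_i \mid t_i)$$

where $J(\theta)$ is the training objective, $N$ is the number of rectangular part sets in the second rectangular part data set, $t_i$ is the $i$-th sample input to the model, $w_i$ is the two-dimensional part layout order output by the model, $A(w_i \mid t_i)$ is the total sheet area used when the input is $t_i$ and the output order is $w_i$, $b(t_i)$ is the layout area obtained by the trained critic network for the $i$-th rectangular part set, $\nabla_\theta$ denotes the partial derivative with respect to $\theta$, and $p_\theta(w_i \mid t_i)$ is the probability of outputting $w_i$ given the input $t_i$.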

7. An actor-critic-based two-dimensional irregular part layout system, comprising: a computer-readable storage medium and a processor;

the processor is configured to read executable instructions stored in the computer-readable storage medium and execute the actor-critic-based two-dimensional irregular parts layout method of any one of claims 1 to 6.