CN116629128B - Method for controlling arc additive forming based on deep reinforcement learning - Google Patents

Method for controlling arc additive forming based on deep reinforcement learning

Info

Publication number
CN116629128B
Authority
CN
China
Prior art keywords
network
temperature field
picture
temperature
parameters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310620763.5A
Other languages
Chinese (zh)
Other versions
CN116629128A (en)
Inventor
邓路兵
董博伦
蔡笑宇
林三宝
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology
Priority to CN202310620763.5A
Publication of CN116629128A
Application granted
Publication of CN116629128B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F30/00 Computer-aided design [CAD]
    • G06F30/20 Design optimisation, verification or simulation
    • G06F30/27 Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/092 Reinforcement learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2111/00 Details relating to CAD techniques
    • G06F2111/10 Numerical modelling
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2113/00 Details relating to the application field
    • G06F2113/10 Additive manufacturing, e.g. 3D printing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2119/00 Details relating to the type or aim of the analysis or the optimisation
    • G06F2119/08 Thermal analysis or thermal optimisation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02P CLIMATE CHANGE MITIGATION TECHNOLOGIES IN THE PRODUCTION OR PROCESSING OF GOODS
    • Y02P10/00 Technologies related to metal processing
    • Y02P10/25 Process efficiency

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Geometry (AREA)
  • Medical Informatics (AREA)
  • Computer Hardware Design (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Feedback Control In General (AREA)

Abstract

The invention relates to a method for controlling arc additive forming based on deep reinforcement learning and belongs to the technical field of arc additive manufacturing. It addresses the difficulty of determining process parameters and of regulating forming for complex components. The method comprises the following steps: S1: performing numerical simulation of the arc additive process; S2: acquiring and processing the numerically simulated temperature field information of the arc additive process; S3: constructing a reinforcement learning environment and an agent for arc additive manufacturing; S4: building a value network and a decision network; S5: training the networks in the environment built in S3 using the temperature field information acquired in S2; S6: using the neural network trained in S5 to automatically adjust the deposition parameters during the arc additive process so that the melt width and penetration depth of the deposited layer remain stable. The method generalizes well and is suitable for components with complex shapes; once the numerical simulation model has been calibrated, the parameters chosen by the agent can be applied to the actual arc additive process, reducing the time and material cost of exploring process parameters.

Description

Method for controlling arc additive forming based on deep reinforcement learning
Technical Field
The invention relates to an additive forming method, and belongs to the technical field of arc additive manufacturing.
Background
Since the advent of AlphaGo, deep reinforcement learning has developed rapidly. Reinforcement learning is a branch of machine learning suited to scenarios in which a task must be completed through trial and error. It makes autonomous decisions based on the environment and continuously adjusts its strategy according to environmental feedback, so it adapts quickly to environmental changes and handles dynamic scenarios well. In deep reinforcement learning, the value function and the policy function are modeled with neural networks, which copes well with high-dimensional state spaces; the search space is optimized through the reward function, and the agent makes better decisions by maximizing the expected reward. As the technology develops, arc additive forming and reinforcement learning are becoming more closely integrated, which offers a way to solve existing problems in arc additive manufacturing.
Arc additive manufacturing is an emerging metal additive manufacturing technology that uses an electric arc as the heat source to melt wire and deposit it layer by layer into a three-dimensional component. At present, its process parameters are determined mainly by repeated experimental trial and error. Because arc additive manufacturing is a multi-physics coupled process with constantly changing heat dissipation conditions and severe heat accumulation, it is difficult to determine the optimal parameters for complex components through physical experiments.
Therefore, a method for controlling arc additive forming based on deep reinforcement learning is needed to solve the above-mentioned problems.
Disclosure of Invention
The present invention addresses the difficulty of determining process parameters and of regulating forming for complex components, and provides a method for controlling arc additive forming based on deep reinforcement learning. A brief summary is given below to provide a basic understanding of certain aspects of the invention. This summary is not an exhaustive overview of the invention. It is not intended to identify key or critical elements of the invention or to delineate its scope.
The technical solution of the invention is as follows:
A method for controlling arc additive forming based on deep reinforcement learning, comprising the following steps:
S1: performing numerical simulation of the arc additive process;
S2: acquiring and processing the numerically simulated temperature field information of the arc additive process;
S3: constructing a reinforcement learning environment and an agent for arc additive manufacturing;
S4: building a value network and a decision network;
S5: training the networks in the environment built in S3 using the temperature field information acquired in S2;
S6: using the neural network trained in S5 to automatically adjust the deposition parameters during the arc additive process, keeping the melt width and penetration depth of the deposited layer stable.
Preferably, in S1, the complex component is discretized and decomposed into multi-layer single-pass components, multi-layer multi-pass components and crossing structures, each of which is simulated numerically. The mesh element count of each model is controlled, and the shapes of the multi-layer single-pass components, multi-layer multi-pass components and crossing structures are kept diverse. The simulation of the arc additive process uses a step-by-step calculation method: each weld bead is divided into several load cases, the agent independently determines the optimal process parameters for each load case, and the model is calibrated.
Preferably, the number of mesh elements is kept within 50,000.
Preferably, in S2, the temperature field plot of the simulation result is opened, the temperature interval corresponding to each color in the temperature field contour plot is recorded, and the median of each temperature interval is calculated. Screenshots of the temperature field contour plot are taken for each deposited layer and for the substrate, and all the pictures are stacked in order; the positions below the substrate picture and above the picture of the current deposited layer are filled with single-color pictures that represent the ambient temperature state. Each color temperature field picture is converted to a grayscale picture without reducing the information it contains. Because the temperature represented by each pixel value in the converted grayscale picture is not explicit, the grayscale picture undergoes feature processing to accelerate neural network training: one pixel value is defined as the ambient temperature, another pixel value is defined as the melting point of the material, the temperature values calculated earlier are normalized to pixel values within the range bounded by these two values, and the pixels corresponding to each temperature are replaced with the calculated pixel values to form a new grayscale picture. All processed grayscale pictures are then stacked in order into a picture sequence containing temperature field information in both the time and space dimensions, and this sequence is used as the input of the neural network.
Preferably, in S2, the positions below the substrate picture and above the picture of the current deposited layer are filled with 5 white pictures; pixel value 0 in the grayscale picture is defined as the ambient temperature of 25 °C, pixel value 255 is defined as the melting point of the material, and the temperature values calculated earlier are normalized to the corresponding pixel values in the range 0-255.
Preferably, in S3, the temperature state picture obtained in S2 is defined as the real-time state of the environment. The action space of the agent consists of adjustments to the deposition current, deposition voltage and deposition speed, and the range of these deposition parameters is determined. The agent selects the deposition current, deposition voltage and deposition speed according to the environment state it receives in real time and submits them to the numerical simulation solver for the next step of the additive process, thereby regulating the melt width and penetration depth. Deposition parameters are selected with a decaying ε-greedy strategy: early in learning the agent chooses deposition parameters at random, so that it sees a variety of temperature field states and learns to predict, from the current temperature field state, the optimal parameters for keeping the melt width and penetration depth stable; late in training the agent adopts the deposition parameters output by the decision network. After the current numerical simulation calculation is completed, the agent enters the next environment state and receives the corresponding reward.
Preferably, in S3, the reward function that gives the reward the agent receives on entering the next environment state is defined as follows:
wherein d is the optimal penetration depth, D is the real-time penetration depth of the actual additive process, w is the optimal melt width, and W is the real-time melt width of the actual additive process; the reward value is limited to the range [-10, 10].
Preferably, in S4 and S5, 3D convolution layers extract the time-dimension and space-dimension features of the temperature state picture obtained in S2; the value network and the decision network share weights and differ only in the neurons used at the output layer; the built model is optimized with the proximal policy optimization (PPO) algorithm and a multithreaded synchronous update scheme.
Preferably, in the multithreaded synchronous update scheme, 12 numerical simulation environments are built simultaneously, each with a different additive component model. The agent interacts with all 12 environments through the same value network and decision network, and for each interaction it records the temperature field state before the interaction, the deposition parameters executed, the temperature field state after the interaction, and the reward obtained. After the agent has interacted 60 times with every environment, the 720 recorded interactions are used as one batch of data to update the value network and the decision network; 72 records are fed in per update, and each batch trains the neural network for three epochs. After training on a batch, the agent interacts with the environments using the updated network, and the next round of learning begins once that interaction is completed.
Preferably, in S6, the neural network parameters are fixed after training is completed, and the agent uses a greedy strategy when interacting with the environment, directly adopting the deposition parameters to which the current decision network assigns the highest output probability.
The invention has the following beneficial effects:
The temperature field state of the additive process is obtained in real time through numerical simulation. Within the constructed reinforcement learning environment, the agent continuously interacts with the temperature field information provided by the environment, so that the weights of the value network and the decision network are continuously optimized. After training, the agent adjusts the deposition parameters in real time using the optimal parameters provided by the decision network, keeping the penetration depth and melt width of the deposition process stable.
The invention generalizes well and is suitable for a variety of materials and for components with complex shapes. The parameters chosen by the agent can be applied to the actual arc additive process, reducing the time and material cost of exploring process parameters.
Drawings
FIG. 1 shows the interactive learning process between the agent and the environment;
FIG. 2 is a flow chart of the processing of the temperature field state picture;
FIG. 3 is a structural diagram of the value network and the decision network.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the present invention is described below by means of specific embodiments shown in the accompanying drawings. It should be understood that the description is only illustrative and is not intended to limit the scope of the invention. In addition, in the following description, descriptions of well-known structures and techniques are omitted so as not to unnecessarily obscure the present invention.
Embodiment 1: This embodiment is described with reference to FIGS. 1-3. The method for controlling arc additive forming based on deep reinforcement learning according to this embodiment comprises the following steps:
S1: performing numerical simulation of the arc additive process;
S2: acquiring and processing the numerically simulated temperature field information of the arc additive process;
S3: constructing a reinforcement learning environment and an agent for arc additive manufacturing;
S4: building a value network and a decision network;
S5: training the networks in the environment built in S3 using the temperature field information acquired in S2;
S6: using the neural network trained in S5 to automatically adjust the deposition parameters during the arc additive process, keeping the melt width and penetration depth of the deposited layer stable.
Multiple numerical simulation solvers are tightly integrated with the constructed reinforcement learning environment. The simulated temperature field results are processed and then fed to the value network and the decision network as environment states, so that the parameters of both networks are continuously optimized. The neural network thus evaluates the temperature field state of the current additive process more accurately, and the deposition parameters output by the decision network approach the theoretical optimum, which achieves real-time adjustment of arc additive manufacturing process parameters and forming control.
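The interaction between agent and solver described in this embodiment can be sketched as follows. This is a minimal illustration only: `StubSolver`, its `reset` and `step` methods, the parameter ranges and the toy thermal response are hypothetical stand-ins for a real numerical simulation solver, which the text does not name.

```python
import random


class StubSolver:
    """Hypothetical stand-in for a numerical simulation solver (not a real API)."""

    def __init__(self, n_cases=5):
        self.n_cases = n_cases  # load cases one weld bead is divided into
        self.case = 0

    def reset(self):
        self.case = 0
        return [25.0]  # toy "temperature field": ambient temperature only

    def step(self, current, voltage, speed):
        # Toy response: a larger heat input gives a hotter state.
        heat_input = current * voltage / speed
        self.case += 1
        state = [25.0 + 0.5 * heat_input]
        reward = -abs(heat_input - 1000.0) / 1000.0  # placeholder reward
        done = self.case >= self.n_cases
        return state, reward, done


def run_episode(solver, choose_action):
    """Let the agent choose deposition parameters for each load case in turn."""
    state = solver.reset()
    history = []
    done = False
    while not done:
        action = choose_action(state)
        next_state, reward, done = solver.step(*action)
        # Record what the text says the agent records for each interaction.
        history.append((state, action, next_state, reward))
        state = next_state
    return history


def random_policy(state):
    # Placeholder policy; the ranges are illustrative assumptions.
    return (random.uniform(100, 200), random.uniform(15, 25), random.uniform(3, 8))
```

Calling `run_episode(StubSolver(), random_policy)` returns one record per load case; a trained decision network would take the place of `random_policy`.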
Embodiment 2: In S1, to improve the computational efficiency of the numerical simulation, the complex component is discretized and decomposed into multi-layer single-pass components and/or multi-layer multi-pass components and/or crossing structures, each of which is simulated numerically. The mesh element count of each model is kept within 50,000 to speed up the calculation. To ensure that the method generalizes, the shapes of the multi-layer single-pass components, multi-layer multi-pass components and crossing structures are kept diverse. The simulation of the arc additive process uses a step-by-step calculation method: each weld bead is divided into several load cases, and the agent independently determines the optimal process parameters for each load case. The numerical simulation model is calibrated by adjusting the heat source parameters and heat dissipation conditions so that the simulation results agree with actual experimental results.
Embodiment 3: This embodiment is described with reference to FIGS. 1-3. In S2, the temperature field plot of the simulation result is opened, the temperature interval corresponding to each color in the temperature field contour plot is recorded, and the median of each temperature interval is calculated. Screenshots of the temperature field contour plot are taken for each deposited layer and for the substrate. To better reflect the temperature field state of the additive process, all the pictures are stacked in order, and the positions below the substrate picture and above the picture of the current deposited layer are filled with single-color pictures that represent the ambient temperature state. To reduce the amount of data fed to the neural network, each color temperature field picture is converted to a grayscale picture without reducing the information it contains. Because the temperature represented by each pixel value in the converted grayscale picture is not explicit, the grayscale picture undergoes feature processing to accelerate neural network training: one pixel value is defined as the ambient temperature, another pixel value is defined as the melting point of the material, the temperature values calculated earlier are normalized to pixel values within the range bounded by these two values, and the pixels corresponding to each temperature are replaced with the calculated pixel values to form a new grayscale picture. All processed grayscale pictures are then stacked in order into a picture sequence containing temperature field information in both the time and space dimensions, and this sequence is used as the input of the neural network.
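The first step of this embodiment, recording the temperature interval of each legend color and taking its median, might look like the sketch below. It assumes that the "median" of an interval means its midpoint; the legend colors and interval bounds are invented for illustration and are not taken from the patent.

```python
def interval_medians(bounds):
    """Return the midpoint of each consecutive temperature interval."""
    return [(lo + hi) / 2 for lo, hi in zip(bounds, bounds[1:])]


def color_to_temperature(legend_colors, bounds):
    """Map each legend color to the midpoint temperature of its interval."""
    medians = interval_medians(bounds)
    if len(legend_colors) != len(medians):
        raise ValueError("need exactly one color per temperature interval")
    return dict(zip(legend_colors, medians))


# Hypothetical legend: four color bands between 25 and 1425 degrees Celsius.
bounds = [25, 375, 725, 1075, 1425]
colors = [(0, 0, 255), (0, 255, 0), (255, 255, 0), (255, 0, 0)]
lookup = color_to_temperature(colors, bounds)
```

Here `lookup` gives one representative temperature per contour color, which is the value that is later normalized to a gray level.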
Embodiment 4: This embodiment is described with reference to FIGS. 1-3. In S2, the positions below the substrate picture and above the picture of the current deposited layer are filled with 5 white pictures; pixel value 0 in the grayscale picture is defined as the ambient temperature of 25 °C, pixel value 255 is defined as the melting point of the material, and the temperature values calculated earlier are normalized to the corresponding pixel values in the range 0-255.
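The normalization and padding in this embodiment can be illustrated with the sketch below, which assumes a linear mapping between the two anchor values. The melting point default of 1450 °C is a hypothetical example value, and padding with 5 frames on each side is one reading of the text. The text describes the padding frames as white but also assigns pixel value 0 to ambient temperature, so `pad_value` is left as a parameter (default 255, i.e. white) rather than settled here.

```python
def temperature_to_gray(t, t_ambient=25.0, t_melt=1450.0):
    """Linearly map a temperature to a gray level: ambient -> 0, melting point -> 255."""
    frac = (t - t_ambient) / (t_melt - t_ambient)
    frac = min(max(frac, 0.0), 1.0)  # clamp temperatures outside the range
    return round(255 * frac)


def build_sequence(frames, n_pad=5, pad_value=255):
    """Stack gray frames in order and pad both ends with uniform frames."""
    height, width = len(frames[0]), len(frames[0][0])

    def pad_frame():
        # Build a fresh frame each time so that no rows are shared.
        return [[pad_value] * width for _ in range(height)]

    head = [pad_frame() for _ in range(n_pad)]
    tail = [pad_frame() for _ in range(n_pad)]
    return head + list(frames) + tail
```

For example, a sequence built from 3 layer frames has 13 frames in total, with the substrate-side and top-side padding at either end.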
Embodiment 5: This embodiment is described with reference to FIGS. 1-3. In S3, the temperature state picture obtained in S2 is defined as the real-time state of the environment. The action space of the agent consists of adjustments to the deposition current, deposition voltage and deposition speed. The range of the deposition parameters is determined through physical experiments, so that the agent can select a suitable combination of deposition parameters within the action space for any temperature field state, which ensures that the agent is fully able to keep the penetration depth and melt width stable. The agent selects the deposition current, deposition voltage and deposition speed according to the environment state it receives in real time and submits them to the numerical simulation solver for the next step of the additive process, thereby regulating the melt width and penetration depth. Deposition parameters are selected with a decaying ε-greedy strategy. Early in learning the agent tends to choose deposition parameters at random, so that it sees a variety of temperature field states; this better optimizes the weights of the value network and the decision network, so that the value network better predicts the value of the current temperature field state and the decision network better predicts, from the current temperature field state, the optimal parameters for keeping the melt width and penetration depth stable. Late in training the agent tends to adopt the deposition parameters output by the decision network. After the current numerical simulation calculation is completed, the agent enters the next environment state and receives the corresponding reward.
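A decaying ε-greedy selection rule of the kind described in this embodiment can be sketched as follows. The linear decay schedule, its start and end values, and the number of decay steps are assumptions chosen for illustration; the text does not specify them.

```python
import random


def epsilon_at(step, eps_start=1.0, eps_end=0.05, decay_steps=10_000):
    """Linearly decay epsilon from eps_start to eps_end over decay_steps."""
    frac = min(step / decay_steps, 1.0)
    return eps_start + frac * (eps_end - eps_start)


def select_action(step, policy_action, action_ranges, rng=random):
    """With probability epsilon pick random deposition parameters, else the network's."""
    if rng.random() < epsilon_at(step):
        # Explore: sample each parameter uniformly within its allowed range.
        return tuple(rng.uniform(lo, hi) for lo, hi in action_ranges)
    # Exploit: use the parameters proposed by the decision network.
    return policy_action
```

With these defaults the agent explores on every step at the start of training and on roughly 5% of steps once the decay has finished.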
Embodiment 6: This embodiment is described with reference to FIGS. 1-3. In the method for controlling arc additive forming based on deep reinforcement learning according to this embodiment, in S3, the reward function that gives the reward the agent receives on entering the next environment state is defined as follows:
wherein d is the optimal penetration depth, D is the real-time penetration depth of the actual additive process, w is the optimal melt width, and W is the real-time melt width of the actual additive process. To prevent problems such as unstable neural network training or wrong decisions that an excessively large reward might cause, the reward calculated by this formula is clipped to the range [-10, 10].
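The clipping step can be sketched as below. The reward formula itself is not reproduced in this text, so `toy_raw_reward` is a hypothetical stand-in used only to produce values worth clipping; it is not the patent's formula. The symbols `d`, `D`, `w` and `W` follow the definitions above.

```python
def clip_reward(raw_reward, low=-10.0, high=10.0):
    """Limit a reward to the range [low, high]."""
    return max(low, min(high, raw_reward))


def toy_raw_reward(d, D, w, W):
    """Hypothetical raw reward (NOT the patent's formula).

    Penalizes the relative deviation of the real-time penetration depth D and
    melt width W from their optimal values d and w.
    """
    return -100.0 * (abs(D - d) / d + abs(W - w) / w)
```

For instance, a 50% deviation in penetration depth gives a raw value of -50 under this stand-in, which `clip_reward` limits to -10.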
Embodiment 7: This embodiment is described with reference to FIGS. 1-3. In the method for controlling arc additive forming based on deep reinforcement learning according to this embodiment, in S4 and S5, 3D convolution layers extract the time-dimension and space-dimension features of the temperature state picture obtained in S2. To reduce the number of neural network parameters and improve the stability of the algorithm, the value network and the decision network share weights and differ only in the neurons used at the output layer. The built model is optimized with the proximal policy optimization (PPO) algorithm and a multithreaded synchronous update scheme. PPO effectively prevents the network weights from changing abruptly when poor data are collected and ensures that the learned policy is updated steadily, so that training is efficient and stable.
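The steady policy update attributed to proximal policy optimization comes from its clipped surrogate objective, sketched below in plain Python. The clipping parameter of 0.2 is a commonly used value and an assumption here; the text does not state which value is used, and the patent's actual network and loss implementation are not shown.

```python
def ppo_clip_objective(ratios, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of proximal policy optimization.

    ratios: new-policy probability divided by old-policy probability per sample.
    advantages: advantage estimate per sample.
    """
    total = 0.0
    for r, a in zip(ratios, advantages):
        clipped_r = max(1.0 - clip_eps, min(1.0 + clip_eps, r))
        # Taking the minimum removes any incentive to move the policy far
        # from the old one on the strength of a single sample.
        total += min(r * a, clipped_r * a)
    return total / len(ratios)
```

A probability ratio of 1.5 with a positive advantage contributes as if the ratio were only 1.2, which is how one batch of poor data is kept from shifting the weights abruptly.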
Embodiment 8: This embodiment is described with reference to FIGS. 1-3. In the multithreaded synchronous update scheme, 12 numerical simulation environments are built simultaneously, each with a different additive component model. The agent interacts with all 12 environments through the same value network and decision network, and for each interaction it records the temperature field state before the interaction, the deposition parameters executed, the temperature field state after the interaction, and the reward obtained. After the agent has interacted 60 times with every environment, the 720 recorded interactions are used as one batch of data to update the value network and the decision network; 72 records are fed in per update, and each batch trains the neural network for three epochs. After training on a batch, the agent interacts with the environments using the updated network, and the next round of learning begins once that interaction is completed.
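The batch arithmetic of this embodiment (12 environments × 60 interactions = 720 records, split into minibatches of 72 and reused for three epochs) can be checked with the sketch below. Shuffling the records before each epoch is an assumption; the text does not say how records are ordered.

```python
import random

N_ENVS = 12          # parallel simulation environments (from the text)
STEPS_PER_ENV = 60   # interactions per environment before an update
MINIBATCH = 72       # records fed to the network per update
EPOCHS = 3           # passes over each batch


def minibatch_indices(batch_size, minibatch, epochs, rng):
    """Yield index lists that cover the batch once per epoch in shuffled order."""
    for _ in range(epochs):
        order = list(range(batch_size))
        rng.shuffle(order)
        for start in range(0, batch_size, minibatch):
            yield order[start:start + minibatch]


batch_size = N_ENVS * STEPS_PER_ENV
updates = list(minibatch_indices(batch_size, MINIBATCH, EPOCHS, random.Random(0)))
```

Under these numbers each batch yields 10 minibatches per epoch and therefore 30 gradient updates before the agent returns to the environments with the updated network.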
Embodiment 9: This embodiment is described with reference to FIGS. 1-3. In S6, the neural network parameters are fixed after training is completed. When interacting with the environment, the agent uses a greedy strategy and directly adopts the deposition parameters to which the current decision network assigns the highest output probability, so that it can better keep the melt width and penetration depth of the deposited layer stable.
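Greedy selection at deployment reduces to an argmax over the decision network's output probabilities, as in the sketch below. It assumes a discrete set of candidate parameter combinations, which the phrase "highest output probability" suggests but the text does not spell out; the candidate values in the usage note are invented.

```python
def greedy_action(probabilities, candidate_parameters):
    """Return the parameter combination with the largest policy probability."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return candidate_parameters[best]
```

For example, `greedy_action([0.1, 0.7, 0.2], [(120, 18, 5), (150, 20, 5), (180, 22, 5)])` selects the second combination, `(150, 20, 5)`.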
It should be noted that the technical solutions of the above embodiments may be combined in any way that is not contradictory. Those skilled in the art can enumerate all such possibilities using basic knowledge of permutations and combinations, so the combined technical solutions are not described one by one here; they should nevertheless be understood as disclosed by the present invention.
The above describes only preferred embodiments of the present invention and is not intended to limit it; those skilled in the art can make various modifications and variations. Any modification, equivalent replacement or improvement made within the spirit and principle of the present invention shall fall within its scope of protection.

Claims (5)

1. The method for controlling arc additive forming based on deep reinforcement learning is characterized by comprising the following steps of: the method comprises the following steps:
s1: performing numerical simulation on the arc material adding process;
in S1, discretizing a complex component, decomposing the complex component into a multi-layer single-channel component, a multi-layer multi-channel component and a cross structure, respectively performing numerical simulation on the complex component, controlling the grid quantity of each model, ensuring the shape diversity of the multi-layer single-channel component, the multi-layer multi-channel component and the cross structure, adopting a gradual calculation method in the simulation calculation process of an arc material-increasing process, dispersing each welding seam into a plurality of calculation working conditions, independently determining the optimal technological parameters of each working condition by an intelligent agent, and correcting the model;
s2: acquiring numerical simulation temperature field information of an arc material adding process and processing the numerical simulation temperature field information;
s2, opening a temperature field model diagram of a calculation result, recording temperature intervals corresponding to various colors in the temperature field cloud diagram, and calculating a median value of each temperature interval; acquiring temperature field cloud picture screenshot of each interlayer and the substrate, sequentially stacking all acquired temperature field pictures, and filling the pictures below the substrate picture and above the current laminated layer picture with monochromatic pictures for representing an environmental temperature state; converting a color temperature field picture into a gray picture under the condition that information contained in the temperature field picture is not reduced, performing feature processing on the gray picture to accelerate the training process of a neural network because temperature information represented by each pixel value in the converted gray picture is not clear, defining one pixel value in the gray picture as ambient temperature, defining the other pixel value in the gray picture as material melting point temperature, normalizing the values of each temperature field obtained by the previous calculation to other pixel values corresponding to two pixel value ranges, and replacing the pixels corresponding to each temperature in the gray picture with the calculated pixel values to form a new gray picture; sequentially stacking all the processed gray-scale pictures into a picture sequence containing time dimension and space dimension temperature field information, and taking the picture sequence as the input of a neural network;
S3: constructing the reinforcement learning environment and the agent for arc additive manufacturing;
in S3, the temperature state diagram obtained in S2 is defined as the real-time state of the environment, the action space of the agent consists of adjustments to the deposition current, deposition voltage and deposition speed, and the range of the deposition parameters is determined; the agent selects the deposition current, deposition voltage and deposition speed according to the environment state it receives in real time and submits them to the numerical simulation solver for the next step of the additive process, thereby regulating the fusion width and fusion depth; when selecting deposition parameters, a decaying epsilon-greedy strategy is used, so that at the start of learning the agent adopts deposition parameters at random to obtain diverse temperature field states to learn from, and thereby learns to better predict, from the current temperature field state, the optimal parameters to adopt so as to keep the fusion width and fusion depth stable, while in the later stage of training the agent adopts the deposition parameters output by the decision network; after the current numerical simulation calculation is completed, the agent enters the next environment state and obtains the corresponding reward;
S4: building a value network and a decision network;
S5: training the networks in the environment built in S3, using the temperature field information acquired in S2;
S6: using the neural network trained in S5 to automatically adjust the deposition parameters during the arc additive process, keeping the fusion width and fusion depth of the deposited layer stable.
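The decaying epsilon-greedy selection described in S3 can be sketched as follows. This is only an illustration under stated assumptions: the claim does not give a decay schedule, so the linear schedule, the values `eps_start`, `eps_end` and `decay_steps`, the function names, and the representation of the action space as a discrete list of parameter combinations are all hypothetical.

```python
import random


def epsilon_at(step, eps_start=1.0, eps_end=0.05, decay_steps=1000):
    """Linearly decay epsilon from eps_start to eps_end (assumed schedule)."""
    frac = min(max(step / decay_steps, 0.0), 1.0)
    return eps_start + frac * (eps_end - eps_start)


def select_action(policy_probs, step, rng=random):
    """Return an index into a discrete set of deposition-parameter combinations.

    policy_probs: output probabilities of the decision network, one per
    (current, voltage, speed) combination (hypothetical discretisation).
    """
    if rng.random() < epsilon_at(step):
        # Early in training: explore with randomly chosen parameters.
        return rng.randrange(len(policy_probs))
    # Later in training: follow the decision network's preferred output.
    return max(range(len(policy_probs)), key=policy_probs.__getitem__)
```

With these assumed values the agent explores on every step at the start of training and on roughly 5% of steps once `decay_steps` has been reached.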
2. The method for controlling arc additive forming based on deep reinforcement learning of claim 1, wherein: in S2, 5 white pictures are padded below the substrate picture and above the picture of the current deposited layer; a pixel value of 0 in the grayscale picture is defined as the ambient temperature of 25 °C, a pixel value of 255 in the grayscale picture is defined as the melting point temperature of the material, and the previously calculated temperature values are normalized to the corresponding pixel values in the range 0-255.
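The temperature-to-pixel mapping and picture stacking of claims 1 and 2 might look like the sketch below. Several points are assumptions rather than statements from the claims: the melting point `MELTING_C` is a placeholder (no value is given), the mapping is taken to be linear, pictures are represented as nested lists of color-band indices, the 5 padding pictures are applied on each side, and "white" is taken to be the 8-bit value 255. All function names are hypothetical.

```python
AMBIENT_C = 25.0     # claim 2: pixel value 0
MELTING_C = 1450.0   # placeholder melting point; not given in the source


def temperature_to_pixel(temp_c, ambient=AMBIENT_C, melting=MELTING_C):
    """Linearly map a temperature to 0-255, clamping outside the range."""
    frac = (temp_c - ambient) / (melting - ambient)
    frac = min(max(frac, 0.0), 1.0)
    return round(255 * frac)


def remap_picture(picture, interval_medians):
    """Replace each color-band index with the pixel value of its median temperature."""
    lut = [temperature_to_pixel(t) for t in interval_medians]
    return [[lut[band] for band in row] for row in picture]


def build_sequence(pictures, pad_count=5, pad_value=255):
    """Stack substrate and layer pictures with padding pictures on both sides."""
    height, width = len(pictures[0]), len(pictures[0][0])
    pad = [[pad_value] * width for _ in range(height)]
    return [pad] * pad_count + list(pictures) + [pad] * pad_count
```

The resulting list of equally sized pictures is the kind of depth-ordered sequence that a 3D convolution layer can consume; `pad_value` is left configurable because the claims do not spell out how the white padding relates to the 0 = ambient convention.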
3. The method for controlling arc additive forming based on deep reinforcement learning of claim 1, wherein: in S4 and S5, 3D convolution layers are used to extract time-dimension and space-dimension features from the temperature state diagram obtained in S2; the value network and the decision network share weights and differ only in the neurons used at the output layer; and the built model is optimized using the proximal policy optimization (PPO) algorithm with multithreaded synchronous updating.
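For orientation, the core of proximal policy optimization in its widely used clipped form is the surrogate objective sketched below. The claim names the algorithm only; it does not say which PPO variant or which clipping range is used, so the clipped form and `clip_eps=0.2` are assumptions, and the function name is hypothetical.

```python
import math


def ppo_clip_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective over a minibatch (to be maximised).

    new_logp / old_logp: log-probabilities of the taken actions under the
    current and the data-collecting policy; advantages: advantage estimates.
    """
    total = 0.0
    for lp_new, lp_old, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(lp_new - lp_old)
        clipped = min(max(ratio, 1.0 - clip_eps), 1.0 + clip_eps)
        # Take the more pessimistic of the unclipped and clipped terms.
        total += min(ratio * adv, clipped * adv)
    return total / len(advantages)
```

Because the value and decision networks share weights, a value-function loss would normally be added to this term so that one optimizer step updates both output heads; that combination is likewise an assumption here.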
4. The method for controlling arc additive forming based on deep reinforcement learning of claim 3, wherein: in the multithreaded synchronous updating mode, 12 numerical simulation solving environments are built simultaneously, each with a different additive component model; the agent interacts with the 12 environments through the same value network and decision network, and during the interaction records the temperature field state before each interaction, the deposition parameters executed during the interaction, the temperature field state after the interaction, and the reward obtained; after the agent has interacted 60 times with every environment, it updates the value network and the decision network using the 720 recorded interactions as one batch of data, with 72 records input per update and the neural network trained for three epochs on each batch; after training on a batch, the agent interacts with the environments using the updated network, and the next round of learning begins once that interaction is complete.
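The batch arithmetic in claim 4 works out to 12 × 60 = 720 records per batch, 720 / 72 = 10 updates per epoch, and 30 updates per batch over three epochs. The sketch below reproduces that schedule; shuffling the records at the start of each epoch is an assumption (the claim does not say how the 72-record groups are drawn), and the names are hypothetical.

```python
import random

NUM_ENVS = 12          # parallel simulation environments (claim 4)
STEPS_PER_ENV = 60     # interactions per environment before an update
MINIBATCH_SIZE = 72    # records fed to each network update
EPOCHS = 3             # passes over each batch


def minibatch_indices(batch_size, minibatch_size, epochs, rng):
    """Yield index lists so that each epoch visits every record exactly once."""
    for _ in range(epochs):
        order = list(range(batch_size))
        rng.shuffle(order)  # assumed: reshuffle the batch every epoch
        for start in range(0, batch_size, minibatch_size):
            yield order[start:start + minibatch_size]


batch_size = NUM_ENVS * STEPS_PER_ENV
updates = list(minibatch_indices(batch_size, MINIBATCH_SIZE, EPOCHS, random.Random(0)))
```

Each element of `updates` would index into the recorded (state, parameters, next state, reward) tuples for one gradient step.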
5. The method for controlling arc additive forming based on deep reinforcement learning of claim 1 or 4, wherein: in S6, the parameters of the neural network are fixed after training is complete, and the agent uses a greedy strategy when interacting with the environment, directly adopting the deposition parameters to which the current decision network assigns the highest output probability.
CN202310620763.5A 2023-05-30 2023-05-30 Method for controlling arc additive forming based on deep reinforcement learning Active CN116629128B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310620763.5A CN116629128B (en) 2023-05-30 2023-05-30 Method for controlling arc additive forming based on deep reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310620763.5A CN116629128B (en) 2023-05-30 2023-05-30 Method for controlling arc additive forming based on deep reinforcement learning

Publications (2)

Publication Number Publication Date
CN116629128A CN116629128A (en) 2023-08-22
CN116629128B true CN116629128B (en) 2024-03-29

Family

ID=87596967

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310620763.5A Active CN116629128B (en) 2023-05-30 2023-05-30 Method for controlling arc additive forming based on deep reinforcement learning

Country Status (1)

Country Link
CN (1) CN116629128B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112149335A (en) * 2020-10-28 2020-12-29 浙江大学 Multilayer arc additive manufacturing process thermal history prediction method based on machine learning
CN112632724A (en) * 2020-12-23 2021-04-09 广东省科学院中乌焊接研究所 Test design and structured data acquisition method for metal additive manufacturing process system
CN113392935A (en) * 2021-07-09 2021-09-14 浙江工业大学 Multi-agent deep reinforcement learning strategy optimization method based on attention mechanism
CN113569352A (en) * 2021-07-13 2021-10-29 华中科技大学 Additive manufacturing size prediction and process optimization method and system based on machine learning
CN115333143A (en) * 2022-07-08 2022-11-11 国网黑龙江省电力有限公司大庆供电公司 Deep learning multi-agent micro-grid cooperative control method based on double neural networks
CN115563736A (en) * 2022-10-28 2023-01-03 江南大学 Turbine blade arc additive real-time temperature field prediction method
WO2023059627A1 (en) * 2021-10-05 2023-04-13 Foshey Michael J Learning closed-loop control policies for manufacturing

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020216458A1 (en) * 2019-04-26 2020-10-29 Siemens Industry Software Nv Machine learning approach for fatigue life prediction of additive manufactured components

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
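Numerical simulation of multi-layer rotating arc narrow gap MAG welding for medium steel plate;Li W, Yu R, Huang D, et al.;Journal of Manufacturing Processes;20190901;Vol. 45;460-471 *
Study of base-layer forming width and thermal process analysis in Tandem-GMAW arc additive manufacturing;石俊彪;赵昀;陈树君;迟杏;;天津大学学报(自然科学与工程技术版);20200629(No. 09);34-40 *
Numerical simulation analysis of the temperature field in TIG arc additive manufacturing based on ANSYS;刘东帅;吕彦明;周文军;杨华;王康;;激光与光电子学进展;20191225(No. 24);181-187 *
Temperature field simulation of MIG-based arc additive manufacturing with different interlayer dwell times;张天雷;徐刚;沈艳涛;明灿;何林基;马春伟;;轻工机械;20200530(No. 03);42-47 *
Research on group confrontation strategy based on deep reinforcement learning;刘强;姜峰;;智能计算机与应用;20200501(No. 05);301-307 *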

Also Published As

Publication number Publication date
CN116629128A (en) 2023-08-22

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant