CN113947194A - Lightweight reinforcement learning model construction method for plateau scene intelligent oxygen supply - Google Patents

Publication number
CN113947194A
Authority
CN
China
Prior art keywords
oxygen supply
plateau
reinforcement learning
learning model
action
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111211867.8A
Other languages
Chinese (zh)
Inventor
张羽
杨慧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northwestern Polytechnical University
Priority to CN202111211867.8A
Publication of CN113947194A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G06N3/08: Learning methods
    • C: CHEMISTRY; METALLURGY
    • C01: INORGANIC CHEMISTRY
    • C01B: NON-METALLIC ELEMENTS; COMPOUNDS THEREOF; METALLOIDS OR COMPOUNDS THEREOF NOT COVERED BY SUBCLASS C01C
    • C01B13/00: Oxygen; Ozone; Oxides or hydroxides in general
    • C01B13/02: Preparation of oxygen

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Organic Chemistry (AREA)
  • Inorganic Chemistry (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

The invention discloses a method for constructing a lightweight reinforcement learning model for intelligent oxygen supply in plateau scenes. The method comprises the following steps: inputting the environment state, preprocessing the data, using a neural network to decide on and output an action, receiving the reward fed back by the environment, and updating the parameters of the neural network. The method can comprehensively consider the various factors present in extreme environments such as plateaus, which improves the accuracy of the model, while keeping the amount of computation as low as possible provided that the model still makes correct decisions. The oxygen supply task is thereby completed efficiently, and the model can serve as the basis for intelligently controlling the oxygen output of an oxygen supply system.

Description

Lightweight reinforcement learning model construction method for plateau scene intelligent oxygen supply
Technical Field
The invention relates to the technical field of machine learning, and in particular to a method for constructing a lightweight reinforcement learning model for intelligent oxygen supply in plateau scenes.
Background
Altitude sickness is the set of symptoms that arises after a person enters a plateau at an altitude above 3000 meters and is exposed to a low-pressure, low-oxygen environment; it is a common condition specific to plateau regions. Altitude sickness does great harm to the human body, and mitigating it is of great significance both psychologically and physiologically. A portable, intelligent oxygen supply system is therefore urgently needed for people working at high altitude. The reinforcement learning technique from machine learning can be adopted for this purpose: the agent gradually adapts to the environment during training and thereby obtains the best overall return.
Invention patent 201210307733.0 relates to a portable oxygen generator suitable for plateau areas. It adds a series of intelligent judgment devices to a traditional oxygen generator in order to adjust the output oxygen flow and provide pulsed oxygen supply. However, the device is bulky, is not suitable for being carried by a single person, and can only be used in non-mobile scenes.
Disclosure of Invention
The invention provides a method for constructing a lightweight reinforcement learning model for intelligent oxygen supply in plateau scenes. The model uses reinforcement learning from the field of artificial intelligence and can comprehensively consider the various factors present in the extreme environment of the plateau, which improves its accuracy. It keeps the amount of computation as low as possible provided that correct decisions are still made, completes the oxygen supply task efficiently, and can serve as a model for intelligently controlling the oxygen output of an oxygen supply system.
The invention provides a method for constructing a lightweight reinforcement learning model for plateau scene intelligent oxygen supply, which comprises the following steps:
S1: receiving the plateau environmental state information of the current oxygen supply;
S2: preprocessing the plateau environmental state information to obtain an environmental state matrix;
S3: using a neural network to obtain, from the environmental state matrix, a set of estimated returns (Q values), one for each action;
S4: selecting the optimal action from the set of estimated returns;
S5: judging whether the optimal action is a preset optimal action; if so, ending the current oxygen supply and sending the optimal action to an external task controller; otherwise, returning to step S1.
Optionally, in step S1, the plateau environmental state information of the current oxygen supply comprises environmental data and task data.
Optionally, the environmental data comprises altitude, temperature and humidity; and/or the task data comprises blood oxygen saturation, heart rate parameters and respiratory parameters.
Optionally, step S2 comprises: sampling the environmental state information; and performing a noise reduction operation on the sampled environmental state information to obtain the environmental state matrix.
Optionally, in step S3, the neural network comprises an input layer, a fully-connected layer and an output layer. The input layer receives the environmental state matrix, the output layer outputs the set of estimated returns for the actions, and the fully-connected layer connects the input layer to the output layer.
The invention has the following beneficial effects:
The reinforcement learning model from the field of artificial intelligence adapts well to the plateau environment. It learns better actions from the particular environmental states of the plateau, such as altitude, temperature and oxygen content, so as to maximize the total return, which safeguards the accuracy of the decision results. It also adopts a lightweight neural network, that is, one with fewer parameters and fewer layers, which reduces the amount of computation and allows the model to run well on embedded devices.
The core of the model is the Q-learning algorithm. Q-learning relies on a Q-value function to evaluate the value of taking a given action in a given state of the environment, and the original Q-learning algorithm stores this function as a table covering all states. Because the environmental states in the plateau scene are complex and have a very large number of possible values, table storage would occupy a large amount of space, be costly, and be slow to query. To solve this problem, a neural network is used to approximate the traditional Q-value function.
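To make the storage argument concrete, the following rough sketch compares the number of entries in a Q table with the number of parameters in a small fully-connected network. The bin counts and the size of the action set are hypothetical values chosen only for illustration; the 3-10-8 layer layout is the one the description gives for the network.

```python
# Hypothetical discretisation of the state; none of these numbers come from the patent.
ALTITUDE_BINS = 500
TEMPERATURE_BINS = 80
SPO2_BINS = 50
NUM_ACTIONS = 5  # assumed size of the action set A


def q_table_entries(state_bins, num_actions):
    """Entries a tabular Q function needs: one per (state, action) pair."""
    states = 1
    for bins in state_bins:
        states *= bins
    return states * num_actions


def mlp_parameters(layer_sizes):
    """Weights plus biases of a fully-connected network with the given layer sizes."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


table_entries = q_table_entries([ALTITUDE_BINS, TEMPERATURE_BINS, SPO2_BINS], NUM_ACTIONS)
network_params = mlp_parameters([3, 10, 8, NUM_ACTIONS])
print(table_entries, network_params)
```

Under these assumed numbers the table needs millions of entries while the network needs fewer than two hundred parameters, which is the kind of gap the paragraph above appeals to.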
The input layer of the neural network is a fully-connected layer of 3 neurons that receives a one-dimensional matrix formed from the altitude, temperature and blood oxygen saturation data of the current state. The output layer is a fully-connected layer with one neuron per action in the action set A, and its output is the matrix of estimated returns for the actions. The network can have 2 hidden layers, which are fully-connected layers of 10 and 8 neurons respectively.
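A minimal sketch of a forward pass through a network of this shape is given below. The layer widths 3, 10 and 8 come from the text; the number of actions (4), the ReLU activation, the random initialisation and the normalised input values are assumptions, since the description does not specify them.

```python
import random

# 3 inputs and hidden layers of 10 and 8 neurons follow the text; 4 actions is assumed.
LAYER_SIZES = [3, 10, 8, 4]


def init_network(layer_sizes, seed=0):
    """Create (weights, biases) for each fully-connected layer (assumed uniform init)."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        layers.append((weights, biases))
    return layers


def forward(layers, state):
    """Return one estimated return per action for the given state vector."""
    activation = list(state)
    for index, (weights, biases) in enumerate(layers):
        z = [sum(w * a for w, a in zip(row, activation)) + b
             for row, b in zip(weights, biases)]
        is_output = index == len(layers) - 1
        # ReLU on hidden layers is an assumption; the output layer stays linear.
        activation = z if is_output else [max(0.0, v) for v in z]
    return activation


network = init_network(LAYER_SIZES)
state = [0.45, 0.30, 0.88]  # assumed normalised altitude, temperature, blood oxygen saturation
q_values = forward(network, state)
print(q_values)
```

Keeping the output layer linear lets the estimated returns take any sign, which is the usual choice when a network approximates a Q-value function.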
The workflow of the model is as follows:
The model receives as input the data returned by the environmental state sensors and the task state sensors: the environmental state information, such as the ambient temperature T_t and the altitude Al_t at time t, and the oxygen supply task state information in the plateau environment, namely the user's blood oxygen saturation X_t.
This information is the input to the data preprocessing module, which mainly performs data sampling and noise reduction: the sensor data can be sampled and denoised at a fixed interval of a set number of seconds. The denoised data is the output of the data preprocessing module, namely the environmental state matrix S_t.
The decision maker receives the matrix S_t and passes it to the neural network. The 3 neurons of the input layer receive the altitude, temperature and blood oxygen saturation data of the current state respectively; the data then pass through the 2 hidden layers and finally reach the output layer, where each neuron outputs the estimated return of one action at time t, together forming the set Q(S_t, A). At the same time, the neural network uses the blood oxygen saturation data in S_t as the reward that the environment gives for the previous action, and uses this reward to optimize its parameters.
From the set Q(S_t, A) output by the neural network, the decision maker finds the largest estimated return Q(S_t, a'_t); the corresponding optimal action a'_t is the output of the model and is sent to an external task controller for execution. After the action is taken, if the task is not completed, the workflow returns to the first step and continues; if the task is completed, the work of the model ends.
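The workflow above can be sketched as a simple decision loop. Everything concrete here is hypothetical: the three oxygen flow levels, the simulated environment, the hand-written stand-in for the neural network, and the completion test (blood oxygen saturation reaching 95) are illustrative assumptions, not details from the patent.

```python
ACTIONS = [0.0, 1.0, 2.0]  # hypothetical oxygen flow levels
TARGET_SPO2 = 95.0         # hypothetical completion threshold


def greedy_action(q_values):
    """Index of the action with the largest estimated return."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


class FakeEnvironment:
    """Toy stand-in for the sensors and the external task controller."""

    def __init__(self):
        self.spo2 = 85.0

    def read_state(self):
        # Altitude, temperature, blood oxygen saturation (already preprocessed).
        return [4500.0, -5.0, self.spo2]

    def execute(self, action_index):
        # Assumed effect: each flow level raises saturation by its own value.
        self.spo2 += ACTIONS[action_index]


def stub_q_function(state):
    """Hand-written placeholder for the neural network's Q(S_t, A)."""
    deficit = max(0.0, TARGET_SPO2 - state[2])
    return [-(deficit - flow) ** 2 for flow in ACTIONS]


def run(env, q_function, max_steps=100):
    history = []
    for _ in range(max_steps):
        state = env.read_state()              # receive and preprocess the state
        action = greedy_action(q_function(state))  # evaluate and pick the best action
        env.execute(action)                   # hand the action to the controller
        history.append(action)
        if env.read_state()[2] >= TARGET_SPO2:  # assumed "task completed" test
            break
    return history


env = FakeEnvironment()
history = run(env, stub_q_function)
print(history, env.spo2)
```

The `max_steps` bound is a defensive choice for the sketch so the loop always terminates even if the completion test is never met.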
Drawings
FIG. 1 is a flow chart of the lightweight reinforcement learning model construction method for plateau scene intelligent oxygen supply provided by the invention;
FIG. 2 is a schematic structural diagram of the lightweight reinforcement learning model construction method for plateau scene intelligent oxygen supply provided by the invention.
Detailed Description
The principles and features of this invention are described below in conjunction with the following drawings, which are set forth by way of illustration only and are not intended to limit the scope of the invention.
Example 1
With reference to FIG. 1 and FIG. 2, the invention provides a method for constructing a lightweight reinforcement learning model for plateau scene intelligent oxygen supply, which comprises the following steps:
S1: receiving the plateau environmental state information of the current oxygen supply;
S2: preprocessing the plateau environmental state information to obtain an environmental state matrix;
S3: using a neural network to obtain, from the environmental state matrix, a set of estimated returns (Q values), one for each action;
S4: selecting the optimal action from the set of estimated returns;
S5: judging whether the optimal action is a preset optimal action; if so, ending the current oxygen supply and sending the optimal action to an external task controller; otherwise, returning to step S1.
Optionally, in step S1, the plateau environmental state information of the current oxygen supply comprises environmental data and task data.
Optionally, the environmental data comprises altitude, temperature and humidity; and/or the task data comprises blood oxygen saturation, heart rate parameters and respiratory parameters.
Optionally, step S2 comprises: sampling the environmental state information; and performing a noise reduction operation on the sampled environmental state information to obtain the environmental state matrix.
Optionally, in step S3, the neural network comprises an input layer, a fully-connected layer and an output layer. The input layer receives the environmental state matrix, the output layer outputs the set of estimated returns for the actions, and the fully-connected layer connects the input layer to the output layer.
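As an illustration of the optional form of step S2, the sketch below samples each sensor channel at a fixed interval and then smooths it. The text does not name a noise reduction method, so the trailing moving average used here is an assumption, as are the function names, the sampling step, the window length and the sample readings.

```python
def sample_every(readings, step):
    """Fixed-interval sampling: keep every `step`-th reading."""
    return readings[::step]


def moving_average(values, window):
    """Trailing moving average; an assumed choice of noise reduction."""
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


def build_state(altitude, temperature, spo2, step=2, window=3):
    """Return the latest denoised value of each channel as the state vector."""
    state = []
    for raw in (altitude, temperature, spo2):
        state.append(moving_average(sample_every(raw, step), window)[-1])
    return state


# Made-up raw readings, one value per sensor tick.
altitude = [4500.0, 4500.0, 4500.0, 4500.0, 4500.0, 4500.0]
temperature = [-5.0, -4.0, -6.0, -5.0, -4.0, -6.0]
spo2 = [88.0, 90.0, 86.0, 91.0, 90.0, 89.0]

state = build_state(altitude, temperature, spo2)
print(state)
```

The resulting three-element vector has the same shape as the one-dimensional matrix of altitude, temperature and blood oxygen saturation that the description feeds to the network.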
Example 2
The invention provides a method for constructing a lightweight reinforcement learning model for intelligent oxygen supply in plateau scenes. The method uses reinforcement learning from the field of artificial intelligence and can comprehensively consider the various factors present in the extreme environment of the plateau, which improves the accuracy of the model. It keeps the amount of computation as low as possible provided that correct decisions are still made, and completes the oxygen supply task efficiently.
The core of the method is the Q-learning algorithm. Q-learning relies on a Q-value function to evaluate the value of taking a given action in a given state of the environment, and the original Q-learning algorithm stores this function as a table covering all states. Because the environmental states in the plateau scene are complex and have a very large number of possible values, table storage would occupy a large amount of space, be costly, and be slow to query. To solve this problem, a neural network is used to approximate the traditional Q-value function.
The beneficial effects of the invention are as follows: the invention adapts well to the plateau environment and learns better actions from the particular environmental states of the plateau, such as altitude, temperature and oxygen content, so as to maximize the total return, which safeguards the accuracy of the decision results. It adopts a lightweight neural network, that is, one with fewer parameters and fewer layers, which reduces the amount of computation of the model and allows it to run well on embedded devices.
The workflow of the invention is as follows:
The data returned by the environmental state sensor and the task state sensor are received as input: the environmental state information, such as the ambient temperature T_t and the altitude Al_t at time t, and the oxygen supply task state information in the plateau environment, namely the user's blood oxygen saturation X_t.
This information is the input to the data preprocessing module, which mainly performs data sampling and noise reduction: the sensor data can be sampled and denoised at a fixed interval of a set number of seconds. The denoised data is the output of the data preprocessing module, namely the environmental state matrix S_t.
The decision maker receives the matrix S_t and passes it to the neural network. On the one hand, S_t serves as the input of the network, which finally outputs the set Q(S_t, A) of estimated returns for taking each action at time t; on the other hand, S_t serves as the reward used to optimize the parameters of the neural network.
From the set Q(S_t, A) output by the neural network, the decision maker finds the largest estimated return Q(S_t, a'_t); the corresponding optimal action a'_t is the output of the model and is sent to an external task controller for execution. After the action is taken, if the task is not completed, the workflow returns to its first step and continues; if the task is completed, the work of the model ends.
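The description says that S_t is also used as a reward to optimize the network parameters, but it does not spell out the reward function or the update rule. One common way to do this is the standard one-step Q-learning target, r + gamma * max Q(S_{t+1}, a). The sketch below applies that rule under several assumptions: a linear approximator stands in for the neural network to keep the example short, the reward is the negative distance of blood oxygen saturation from a hypothetical target, and the learning rate, discount factor and state values are made up.

```python
GAMMA = 0.9         # assumed discount factor
ALPHA = 0.01        # assumed learning rate
TARGET_SPO2 = 0.95  # hypothetical target saturation, normalised to [0, 1]


def reward_from_spo2(spo2):
    """Assumed reward: closer to the target saturation is better."""
    return -abs(TARGET_SPO2 - spo2)


def q_values(weights, state):
    """Linear stand-in for the network: one weight row per action."""
    return [sum(w * x for w, x in zip(row, state)) for row in weights]


def td_update(weights, state, action, reward, next_state):
    """One-step Q-learning update of the chosen action's weights; returns the TD error."""
    target = reward + GAMMA * max(q_values(weights, next_state))
    td_error = target - q_values(weights, state)[action]
    weights[action] = [w + ALPHA * td_error * x
                       for w, x in zip(weights[action], state)]
    return td_error


# Two hypothetical actions, three state features (altitude, temperature, saturation).
weights = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
state = [0.45, 0.30, 0.85]
next_state = [0.45, 0.30, 0.90]

reward = reward_from_spo2(next_state[2])
td_error = td_update(weights, state, action=1, reward=reward, next_state=next_state)
print(td_error, weights)
```

With a real neural network the same target would be used as the regression label for the output neuron of the chosen action, and the weights would be adjusted by backpropagation instead of the closed-form linear step shown here.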
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims (5)

1. A method for constructing a lightweight reinforcement learning model for plateau scene intelligent oxygen supply, characterized by comprising the following steps:
S1: receiving the plateau environmental state information of the current oxygen supply;
S2: preprocessing the plateau environmental state information to obtain an environmental state matrix;
S3: using a neural network to obtain, from the environmental state matrix, a set of estimated returns for the actions;
S4: selecting the optimal action from the set of estimated returns;
S5: judging whether the optimal action is a preset optimal action; if so, ending the current oxygen supply and outputting the lightweight reinforcement learning model; otherwise, returning to step S1.
2. The method for constructing a lightweight reinforcement learning model for plateau scene intelligent oxygen supply according to claim 1, wherein in step S1 the plateau environmental state information of the current oxygen supply comprises environmental data and task data.
3. The method for constructing a lightweight reinforcement learning model for plateau scene intelligent oxygen supply according to claim 2, wherein the environmental data comprises altitude, temperature and humidity; and/or
the task data comprises blood oxygen saturation, heart rate parameters and respiratory parameters.
4. The method for constructing a lightweight reinforcement learning model for plateau scene intelligent oxygen supply according to claim 1, wherein step S2 comprises:
sampling the environmental state information;
and performing a noise reduction operation on the sampled environmental state information to obtain the environmental state matrix.
5. The method for constructing a lightweight reinforcement learning model for plateau scene intelligent oxygen supply according to claim 1, wherein in step S3 the neural network comprises an input layer, a fully-connected layer and an output layer, the input layer receives the environmental state matrix, the output layer outputs the set of estimated returns for the actions, and the fully-connected layer connects the input layer to the output layer.

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN202111211867.8A | 2021-10-18 | 2021-10-18 | Lightweight reinforcement learning model construction method for plateau scene intelligent oxygen supply

Publications (1)

Publication Number | Publication Date
CN113947194A | 2022-01-18

Family

ID=79331374

Family Applications (1)

Application Number | Status | Publication | Priority Date | Filing Date | Title
CN202111211867.8A | Pending | CN113947194A (en) | 2021-10-18 | 2021-10-18 | Lightweight reinforcement learning model construction method for plateau scene intelligent oxygen supply

Country Status (1)

Country | Link
CN (1) | CN113947194A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination