CN114884596B - Energy perception-based backscattering code rate self-adaption method, device and system

Info

Publication number: CN114884596B
Application number: CN202210434515.7A
Authority: CN
Inventors: 吴心仪; 张瑞杰; 周培龙; 周翔宇; 何承天; 董慧鑫; 王巍
Original assignee: Huazhong University of Science and Technology
Current assignee: Huazhong University of Science and Technology
Priority date: 2022-04-24
Filing date: 2022-04-24
Publication date: 2023-04-07
Anticipated expiration: 2042-04-24
Also published as: CN114884596A

Abstract

The invention relates to a backscattering code rate self-adaption method, a device and a system based on energy perception, wherein the method comprises the following steps: initializing a code rate strategy, executing the strategy, acquiring current channel state information, inputting the channel state information into a reinforcement learning model, receiving the input of the channel state information by the reinforcement learning model, outputting a code rate decision result, monitoring the residual energy of a node, comparing the energy detection result with a set threshold, and selecting the code rate decision result given by the reinforcement learning model as a final code rate execution strategy when the energy detection result is higher than or equal to the set threshold; and when the energy detection result is lower than the set threshold, ignoring a code rate decision result given by the reinforcement learning model, and selecting the optional lowest code rate as a final code rate execution strategy. The invention combines the traditional code rate self-adaptive strategy with the energy constraint rule, can effectively combine the characteristics of the backscattering technology and adjust the code rate according to the environmental state, and achieves better transmission effect.

Description

Energy perception-based backscattering code rate self-adaption method, device and system

Technical Field

The invention belongs to the technical field of wireless communication, and particularly relates to a backscattering code rate self-adaption method, device and system based on energy perception.

Background

Backscattering is a technique that realizes communication by reflecting electromagnetic waves already present in the environment. It can carry out data transmission tasks with extremely low power consumption and has recently received much attention. With the development of the technology, the data rate that a backscatter node must transmit is increasing rapidly, from the original light and temperature data to microphone audio data and even video stream data. However, the communication quality of backscattering is very sensitive to the quality of the dynamic wireless channel. This is especially evident when the backscatter node performs continuous, large-volume data transmission tasks (such as video streaming): when the channel bandwidth does not match the data rate, problems such as unstable transmission quality and data loss occur.

In recent years, some rate adaptation methods for the physical layer of backscatter networks have appeared, but they share a common disadvantage: they cannot solve the accumulation of data at the source end caused by poor channel quality. Because these methods do not change the amount of data to be transmitted, data backs up at the node, which is unacceptable for backscatter nodes with limited local storage space.

For the video acquisition task in backscattering, the applicant conceived of a code rate self-adaptive method that adaptively adjusts the data rate according to the channel state so as to improve transmission quality. However, conventional code rate adaptation technology is mainly applied to Internet multimedia transmission to meet the low-delay, high-quality viewing requirements of applications such as live broadcasting and video conferencing. The applicant therefore further considered making code rate decisions with reinforcement learning to adapt to varying network states. However, these techniques are not suitable for direct application on backscatter nodes, for the following reasons:

1. The energy consumption is high and the energy efficiency is poor, so the techniques are difficult to implement on a passive backscatter node;

2. The scheme has poor stability. The traditional technology determines the future code rate based on the network bandwidth and the buffer; however, when the remaining energy of the backscatter node cannot support the data transmission task at that code rate, serious quality degradation results.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provides an energy perception-based backscattering code rate self-adaption method, device and system.

The technical scheme of the invention is realized as follows: the invention discloses a backscattering code rate self-adaption method based on energy perception, which comprises the following steps:

s1) initializing a code rate strategy;

s2) executing the strategy;

s3) acquiring current channel state information, and inputting the channel state information into a reinforcement learning model;

s4) the reinforcement learning model receives the input of the channel state information and outputs a code rate decision result;

s5) monitoring the residual energy of the node, comparing the energy detection result with a set threshold value, and judging whether the code rate decision result output by the current reinforcement learning model is suitable for execution; when the energy detection result is higher than or equal to the set threshold value, the code rate decision result output by the current reinforcement learning model is considered suitable for execution and is selected as the final code rate, and the method returns to step S2) to execute the strategy; when the energy detection result is lower than the set threshold value, the code rate decision result output by the current reinforcement learning model is considered unsuitable for execution and is ignored, and the set code rate is selected as the final code rate, after which the method returns to step S2) to execute the strategy.

Further, the channel state information includes signal-to-noise ratio and bit error rate information.

Further, in step S5), when the energy detection result is lower than the set threshold, it is determined that the code rate decision result output by the current reinforcement learning model is not suitable for execution, the code rate decision result given by the reinforcement learning model is ignored, and the selectable lowest code rate is selected as the final code rate and returned to the step S2) to execute the strategy.

Further, the reinforcement learning model guides the next code rate selection according to the feedback result of the energy constraint rule, specifically: the reinforcement learning model receives the feedback result of the energy constraint rule and the state vector, and outputs a code rate decision result.

Further, in step S3), the channel state information is input into the trained reinforcement learning model; during training, the reinforcement learning model is deployed on a computer outside the system for online training and exchanges data with the node of the system through a serial port; after the reinforcement learning model has been trained in the training stage, the fully-connected code rate decision network of the reinforcement learning model is subjected to lightweight processing and then deployed on the node.

The invention discloses a backscattering code rate self-adaptive device based on energy perception, which comprises a memory for storing a program;

and a processor for implementing the steps of the backscatter code rate adaptation method as described above when executing the program.

The invention discloses a backscattering code rate self-adaptive device based on energy perception, which comprises a channel state information acquisition module, a reinforcement learning module, an energy constraint rule module and a strategy execution module,

the channel state information acquisition module is used for acquiring current channel state information and inputting the channel state information into the reinforcement learning module;

the reinforcement learning module is used for receiving the input of channel state information and outputting a code rate decision result;

the energy constraint rule module is used for monitoring the residual energy of the nodes, comparing an energy detection result with a set threshold value, judging whether a code rate decision result output by the current reinforcement learning module is suitable for execution or not, when the energy detection result is higher than or equal to the set threshold value, considering that the code rate decision result output by the current reinforcement learning module is suitable for execution, and selecting the code rate decision result given by the reinforcement learning module as a final code rate to be provided to the strategy execution module; when the energy detection result is lower than a set threshold value, the code rate decision result output by the current reinforcement learning module is considered to be not suitable for execution, the code rate decision result given by the reinforcement learning module is ignored, and an optional lowest code rate is selected as a final code rate and is provided to the strategy execution module;

and the strategy execution module is used for receiving the final code rate provided by the energy constraint rule module and executing the strategy at that code rate.

Further, the energy constraint rule module feeds back a result of whether the current code rate strategy is really executed to the reinforcement learning module, and the reinforcement learning module guides the next code rate selection according to the feedback result of the energy constraint rule module.

The invention discloses a backscattering code rate self-adaptive system based on energy perception, which comprises a reader, backscattering nodes, a sensor and a reinforcement learning model, wherein the sensor is used for acquiring original video information and sending the original video information to the reader through the backscattering nodes, and the reader is used for receiving decoded video data and calculating channel state information and feeding the channel state information back to the backscattering nodes; the backscattering node is used for inputting channel state information into a reinforcement learning model, and the reinforcement learning model is used for receiving the input of the channel state information and outputting a code rate decision result; the backscattering node is used for monitoring the residual energy of the node in real time, comparing an energy detection result with a set threshold value, judging whether a code rate decision result output by the current reinforcement learning model is suitable for execution or not, when the energy detection result is higher than or equal to the set threshold value, considering that the code rate decision result output by the current reinforcement learning model is suitable for execution, and performing data compression and sending tasks according to a code rate decision result given by the reinforcement learning model as a final code rate; and when the energy detection result is lower than a set threshold value, the code rate decision result output by the current reinforcement learning model is considered to be not suitable for execution, the code rate decision result given by the reinforcement learning model is ignored, and the optional lowest code rate is selected as the final code rate to execute data compression and sending tasks.

Further, the channel state information includes bit error rate and signal-to-noise ratio information.

Further, the fully-connected code rate decision network of the trained reinforcement learning model is subjected to lightweight processing and then deployed on the backscattering node.

The invention has at least the following beneficial effects: the invention discloses a backscattering code rate self-adaption method, device and system based on energy perception, and aims to optimize data transmission quality of backscattering nodes.

The present invention attempts to use a deep reinforcement learning model in the context of backscatter nodes. In terms of energy consumption, the reinforcement learning strategy is constrained through active energy perception; in terms of resources, only part of the network in the reinforcement learning model is extracted for deployment, which reduces the resources required.

Because the invention actively senses the residual electric quantity, the executed strategy is constrained within the energy budget. This ensures that the strategy executes smoothly and prevents the large drop in transmission quality (delay, stalling and the like) that would result from transmission being suddenly interrupted when the energy runs out.

The invention realizes the combination of the traditional code rate self-adaptive strategy and the energy constraint rule, improves the stability of executing the code rate self-adaptive strategy in the back scattering node and avoids the great reduction of the quality in the transmission process.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.

Fig. 1 is a software framework diagram of a backscattering code rate adaptive method based on energy sensing according to an embodiment of the present invention;

Fig. 2 is a flowchart of an energy-aware-based backscattering code rate adaptive method according to an embodiment of the present invention;

Fig. 3 is a schematic diagram of an energy-aware-based backscattering code rate adaptive system according to an embodiment of the present invention;

Fig. 4 is a schematic diagram of an energy-aware-based backscattering code rate adaptive system in a training phase according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.

FIG. 1 is a software framework of the present invention illustrating the integration of reinforcement learning algorithms and energy-based conservative algorithms in a coupled closed-loop system, wherein:

firstly, collecting information such as signal-to-noise ratio, bit error rate and the like from the environment to represent the environment state at the current moment, namely channel state information;

then, the environmental state information in a period of time is arranged into a state vector to be used as the input of a reinforcement learning algorithm;

calculating the return of the last decision by combining the feedback result of the energy constraint rule and the state vector, and mapping the state vector to an alternative code rate set by utilizing a full-connection network;

meanwhile, the energy constraint rule receives the energy detection result and judges whether the current code rate selection is suitable for execution. And when the energy detection result is lower than the threshold value, the decision given by the reinforcement learning algorithm is ignored, and the optional lowest code rate is selected and fed back to the reinforcement learning algorithm.
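
One iteration of the closed loop described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation of the invention: the window length `WINDOW`, the stand-in `policy` function, the candidate rates and the voltage threshold are all hypothetical.

```python
from collections import deque

WINDOW = 4  # hypothetical number of recent samples kept in the state vector


def make_state(history):
    """Flatten the recent (snr_db, ber) samples into one state vector."""
    return [value for sample in history for value in sample]


def closed_loop_step(history, sample, policy, energy, threshold, rates):
    """One iteration: observe, decide, apply the energy rule, report feedback."""
    history.append(sample)           # newest channel state sample
    state = make_state(history)
    proposed = policy(state)         # decision of the learning algorithm
    executed = energy >= threshold   # energy constraint rule
    final = proposed if executed else min(rates)
    return final, executed           # 'executed' is fed back to the learner


history = deque(maxlen=WINDOW)
rates = [50, 100, 200]  # hypothetical candidate code rates
# Stand-in policy: pick the highest rate when the latest SNR exceeds 10 dB.
policy = lambda s: rates[-1] if s[-2] > 10 else rates[0]

print(closed_loop_step(history, (12.0, 1e-4), policy, 2.8, 2.5, rates))  # (200, True)
print(closed_loop_step(history, (12.0, 1e-4), policy, 2.0, 2.5, rates))  # (50, False)
```

The second call shows the override: the stand-in policy still proposes the highest rate, but the energy reading is below the threshold, so the lowest rate is returned together with a negative feedback flag.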

The invention is applied to a backscattering node that works in a duty-cycle mode: the node sleeps for a period of time to collect energy and, once the collected energy reaches a preset threshold, automatically starts working for a period of time.

Therefore, the remaining power of the backscatter node, i.e., the voltage of the energy storage capacitor on the backscatter node, needs to be monitored (when the voltage of the capacitor drops to a certain value, the node enters a sleep mode). The energy storage capacitor is connected with the energy collection module and used for storing energy and releasing the energy to supply power to the node.
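
Since the remaining power is read as a capacitor voltage, the usable energy can be estimated from the standard capacitor relation E = 1/2 * C * V^2. The sketch below is illustrative only; the capacitance, the present voltage and the sleep cutoff voltage are hypothetical values, not figures from the invention.

```python
def usable_energy_joules(capacitance_f, voltage_v, cutoff_v):
    """Energy available before the capacitor voltage falls to the sleep cutoff.

    Uses E = 1/2 * C * V^2 for the energy stored in a capacitor, so the
    usable part is 1/2 * C * (V^2 - V_cutoff^2).
    """
    if voltage_v <= cutoff_v:
        return 0.0
    return 0.5 * capacitance_f * (voltage_v ** 2 - cutoff_v ** 2)


# Hypothetical values: 100 uF capacitor, 3.0 V now, node sleeps at 2.0 V.
energy = usable_energy_joules(100e-6, 3.0, 2.0)
print(f"{energy * 1e6:.0f} uJ available before sleep")
```

Because the energy depends on the square of the voltage, comparing voltages directly against a threshold, as the energy constraint rule does, avoids this computation on the node.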

The energy detection result is a constraint on the code rate decision output by the reinforcement learning model. The reinforcement learning model selects the most suitable strategy only from the perspective of the channel state; however, because of the special duty-cycle working mode of the backscatter node, that code rate strategy may not be executable in practice. Energy detection is therefore performed to ensure that the strategy can be fully executed before the backscatter node next enters sleep mode.

The invention not only uses the energy constraint rule as a protection mechanism that prevents the backscattering node from executing an overly aggressive code rate strategy and exceeding the energy budget of the system, but also provides feedback information that guides the reinforcement learning algorithm toward correct behavior. Instability caused by power consumption limitations can thereby be greatly reduced.

Example one

Referring to fig. 2, an embodiment of the present invention provides an energy-aware-based backscattering code rate adaptive method, including the following steps:

s1) initializing a code rate strategy;

s2) executing the strategy;

s3) acquiring current channel state information, and inputting the channel state information into a reinforcement learning model (namely a reinforcement learning algorithm);

s4) the reinforcement learning model receives the input of the channel state information and outputs a code rate decision result;

s5) monitoring the residual energy of the node, comparing the energy detection result with a set threshold, and judging whether the code rate decision result output by the current reinforcement learning model is suitable for execution. When the energy detection result is higher than or equal to the set threshold, the residual energy of the current backscattering node is sufficient to complete the execution of the code rate strategy before the next dormancy; the code rate decision result output by the current reinforcement learning model is therefore considered suitable for execution and is selected as the final code rate, and the method returns to step S2) to execute the strategy. When the energy detection result is lower than the set threshold, the current backscattering node will, with high probability, be unable to complete the execution of the strategy before the next dormancy; the code rate decision result output by the current reinforcement learning model is therefore considered unsuitable for execution and is ignored, and the set code rate is selected as the final code rate, after which the method returns to step S2) to execute the strategy. The set code rate is the selectable lowest code rate, which is the safest choice, because an intermediate code rate might satisfy the energy constraint rule yet be unsuitable for transmission in the current channel state.

When the reinforcement learning model outputs a code rate strategy according to the environment state, if the residual electric quantity detected in the current duty cycle is smaller than the threshold corresponding to that strategy, the node cannot support the execution of the strategy. The strategy is then not executed; instead, the lowest code rate strategy is executed so that the energy constraint is satisfied first, and negative feedback is given to the model.

The threshold set in step S5) is measured experimentally in advance. Since the candidate code rate strategies form a finite, discrete set rather than a continuous range, the power consumption of each strategy can be measured in advance through experiments. Empirically, a decision with a larger code rate generally consumes more energy. For example, in this embodiment, the code rate may be fixed and the voltage drop caused by the strategy measured over a plurality of duty cycles; the worst case, that is, the largest voltage drop, is then found in the data, and the average of the data similar to this case is used as the threshold of that code rate strategy in the energy constraint rule.

The set threshold can be adjusted as needed. If experiments show that the energy constraint rule ignores the strategy of the reinforcement learning model and selects the lowest code rate every time, the threshold is set too high, and a suitable value needs to be selected through retesting.
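
Because each strategy has its own threshold, the rule can be sketched as a lookup in a per-rate table. The table `THRESHOLDS`, its voltage values and the function name are hypothetical, chosen only to illustrate the paragraph above.

```python
def apply_energy_rule(rl_rate, voltage, thresholds):
    """Apply the energy constraint rule with one threshold per code rate.

    thresholds maps each candidate code rate to the minimum capacitor
    voltage (volts) assumed necessary to finish a duty cycle at that rate.
    Returns (final_rate, feedback), where feedback is +1 if the model's
    decision was executed and -1 if it was replaced by the lowest rate.
    """
    lowest = min(thresholds)
    if voltage >= thresholds[rl_rate]:
        return rl_rate, +1
    return lowest, -1


# Hypothetical thresholds: higher code rates need a higher starting voltage.
THRESHOLDS = {50: 2.1, 100: 2.3, 200: 2.6}

print(apply_energy_rule(200, 2.4, THRESHOLDS))  # (50, -1)
print(apply_energy_rule(100, 2.4, THRESHOLDS))  # (100, 1)
```

Note that the fallback is always the lowest rate, not the highest rate that still fits the budget, which matches the rule stated in step S5).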
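
The calibration procedure can be sketched as follows. The text does not define which data count as "similar" to the worst case, so the `similarity` ratio below is an assumption, as are the sample measurements.

```python
def calibrate_threshold(voltage_drops, similarity=0.9):
    """Average the measured drops that are close to the worst case.

    voltage_drops: voltage drop (volts) per duty cycle at one fixed code rate
    similarity:    assumed cutoff; drops >= similarity * worst count as similar
    """
    if not voltage_drops:
        raise ValueError("at least one measurement is required")
    worst = max(voltage_drops)  # worst case = largest voltage drop
    similar = [d for d in voltage_drops if d >= similarity * worst]
    return sum(similar) / len(similar)


# Hypothetical measurements for one code rate over five duty cycles.
drops = [0.30, 0.42, 0.50, 0.48, 0.35]
threshold = calibrate_threshold(drops)
print(round(threshold, 2))  # 0.49
```

The procedure would be repeated once per candidate code rate, giving one threshold per strategy.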
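
One simple way to detect a threshold that is set too high is to log, for each decision, whether the model's choice was executed, and then check how often the rule overrode it. The helper names below are hypothetical and the check is only a sketch of the diagnostic described above.

```python
def override_ratio(executed_flags):
    """Fraction of decisions in which the energy rule replaced the model."""
    if not executed_flags:
        return 0.0
    overridden = sum(1 for executed in executed_flags if not executed)
    return overridden / len(executed_flags)


def threshold_too_high(executed_flags):
    """True when every logged decision was overridden by the energy rule."""
    return bool(executed_flags) and override_ratio(executed_flags) == 1.0


log = [False, False, False, False]  # hypothetical log: model never executed
print(threshold_too_high(log))      # True
```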

Further, information such as signal-to-noise ratio, bit error rate, PSNR of a picture, packet loss rate, and the like is acquired to represent channel state information at the current time.

Further, the feedback result of the energy constraint rule, that is, whether the current code rate strategy was actually executed (positive feedback if executed, negative feedback if not), is fed back to the reinforcement learning model to guide it toward correct behavior. For example, when a code rate decision given by the reinforcement learning model is not executed and negative feedback is obtained, this means that the current energy collection effect is not ideal and the remaining energy is insufficient, so the reinforcement learning model is guided to select a more conservative, lower code rate strategy next time.

Further, the return of the previous decision is calculated by combining the feedback result of the energy constraint rule with the state vector, and the state vector is mapped to the candidate code rate set by a fully-connected network. Specifically, the feedback result of the energy constraint rule and the state vector are input into an existing neural network model, which is a part of the reinforcement learning model, and the decision return is output. The return of the previous decision is the value produced by inputting into the neural network the feedback result obtained after the previous decision was executed and the state vector at the current time (which represents the change that the previous decision made to the environment). The feedback result of the energy constraint rule indicates whether the code rate decision was actually executed or was replaced by the selectable lowest code rate.

The fully-connected network is the most important part of the reinforcement learning model and is called the reinforcement learning agent. It takes the environment state, namely the channel state vector, as input and outputs a vector whose length equals the number of candidate code rates. Each element of the vector represents the probability of executing the corresponding code rate strategy, and the code rate with the maximum probability is the code rate decision given by the reinforcement learning model.

Further, in step S3), the channel state information is input into the trained reinforcement learning model. During training, the reinforcement learning model is deployed on a computer outside the system for online training and exchanges data with the node of the system through a serial port. After training is completed, the fully-connected code rate decision network of the reinforcement learning model is subjected to lightweight processing and then deployed on the node. Network lightweighting means converting the parameters of the network from floating-point numbers to fixed-point numbers and then performing operations such as pruning; these are prior art and are not described further here.

The fully-connected network, namely the reinforcement learning agent, is a part of the reinforcement learning model used. After the reinforcement learning model is trained in the training phase, only this network is deployed on the backscattering node; the entire reinforcement learning model does not need to be deployed on the node.
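
The effect of the feedback on learning can be illustrated with a simple reward adjustment. The invention computes the return with a neural network; the fixed `penalty` and the `quality_score` input used here are assumptions made only to show how negative feedback can steer the model toward lower code rates.

```python
def shaped_reward(quality_score, executed, penalty=1.0):
    """Combine a transmission-quality score with the energy-rule feedback.

    quality_score: hypothetical measure of how well the last transfer went
    executed:      True if the model's code rate was actually executed
    penalty:       assumed cost applied when the decision was overridden
    """
    feedback = 1 if executed else -1  # positive / negative feedback
    if feedback < 0:
        # An overridden decision signals insufficient energy, so it is
        # penalized, which makes aggressive code rates less attractive.
        return quality_score - penalty
    return quality_score


print(shaped_reward(0.8, True))   # 0.8
print(shaped_reward(0.5, False))  # -0.5
```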
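
The mapping from state vector to code rate decision can be sketched with a single fully-connected layer followed by a softmax. This is a simplification for illustration: the real network depth, the weights and the candidate rates are not given in the text, so the values below are untrained placeholders.

```python
import math


def dense(inputs, weights, biases):
    """Fully connected layer: one output per row of weights."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decide_rate(state, weights, biases, rates):
    """Return the most probable code rate and the full probability vector."""
    probs = softmax(dense(state, weights, biases))
    best = max(range(len(rates)), key=probs.__getitem__)
    return rates[best], probs


rates = [50, 100, 200]                           # hypothetical candidates
state = [12.0, 0.001]                            # hypothetical [snr_db, ber]
weights = [[-0.1, 0.0], [0.0, 0.0], [0.1, 0.0]]  # untrained placeholders
biases = [0.0, 0.0, 0.0]

rate, probs = decide_rate(state, weights, biases, rates)
print(rate)  # 200
```

The output vector has one entry per candidate code rate, and the decision is the rate whose entry is largest.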
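
The two lightweighting operations named above, conversion to fixed point followed by pruning, can be sketched as below. The number of fraction bits and the pruning cutoff are hypothetical, and magnitude-based pruning is used here as one common variant, not necessarily the one applied in the invention.

```python
def to_fixed_point(weights, frac_bits=8):
    """Convert floats to signed fixed-point integers with frac_bits fraction bits."""
    scale = 1 << frac_bits
    return [int(round(w * scale)) for w in weights]


def prune(values, min_magnitude):
    """Magnitude pruning: zero out values whose absolute value is small."""
    return [v if abs(v) >= min_magnitude else 0 for v in values]


def from_fixed_point(values, frac_bits=8):
    """Recover approximate float values, e.g. to check the quantization error."""
    scale = 1 << frac_bits
    return [v / scale for v in values]


weights = [0.5, -0.004, 0.25, 0.03]  # hypothetical trained parameters
quantized = to_fixed_point(weights)  # [128, -1, 64, 8]
light = prune(quantized, 2)          # drop values smaller than 2/256
print(light)                         # [128, 0, 64, 8]
```

Zeroed parameters can be skipped during inference, and integer arithmetic avoids floating-point operations on the node.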

Example two

The embodiment of the invention discloses a backscattering code rate self-adaption device based on energy perception, which comprises a memory for storing a program;

and a processor, configured to implement the steps of the backscatter code rate adaptation method according to embodiment one when executing the program.

Example three

The embodiment of the invention discloses a backscattering code rate self-adaptive device based on energy perception, which comprises a channel state information acquisition module, a reinforcement learning module, an energy constraint rule module and a strategy execution module,

the energy constraint rule module is used for monitoring the residual energy of the node, namely the voltage at two ends of an energy storage capacitor on the backscattering node, comparing an energy detection result with a set threshold value, judging whether a code rate decision result output by the current reinforcement learning module is suitable for execution or not, when the energy detection result is higher than or equal to the set threshold value, considering that the code rate decision result output by the current reinforcement learning module is suitable for execution, and selecting a code rate decision result given by the reinforcement learning module as a final code rate to be provided to the strategy execution module; when the energy detection result is lower than a set threshold value, the code rate decision result output by the current reinforcement learning module is considered to be not suitable for execution, the code rate decision result given by the reinforcement learning module is ignored, and an optional lowest code rate is selected as a final code rate and is provided to the strategy execution module;

the strategy execution module is used for receiving the final code rate execution strategy provided by the energy constraint rule module, namely executing data compression and sending tasks.

Further, the energy constraint rule module is used for feeding back a result of whether the current code rate strategy is really executed to the reinforcement learning module, and guiding the reinforcement learning module to make a correct behavior.

Furthermore, the reinforcement learning model guides the next code rate selection according to the feedback result of the energy constraint rule module.

Further, the reinforcement learning module is configured to calculate a return of a previous decision by combining a feedback result of the energy constraint rule module and the state vector, and map the state vector to an alternative code rate set by using a full connection network.

Example four

Referring to fig. 3, the embodiment of the invention discloses a backscattering code rate adaptive system based on energy sensing, which comprises a reader, backscattering nodes, a sensor and a reinforcement learning model, wherein the sensor is used for collecting original video information and sending the original video information to the reader through the backscattering nodes, and the reader is used for receiving decoded video data and calculating channel state information and feeding the channel state information back to the backscattering nodes;

the backscattering node is used for inputting the channel state information into a reinforcement learning model, and the reinforcement learning model is used for receiving the input of the channel state information and outputting a code rate decision result. The backscattering node undertakes the tasks of data compression sending and code rate decision.
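
How the reader might compute the two channel state quantities can be sketched as follows. The text does not specify the computation, so this assumes the reader can compare received bits against a known reference sequence and can estimate signal and noise power; both function names are hypothetical.

```python
import math


def bit_error_rate(sent_bits, received_bits):
    """Fraction of bit positions in which the received bits differ."""
    if len(sent_bits) != len(received_bits) or not sent_bits:
        raise ValueError("bit sequences must be non-empty and equal in length")
    errors = sum(1 for s, r in zip(sent_bits, received_bits) if s != r)
    return errors / len(sent_bits)


def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels: 10 * log10(Ps / Pn)."""
    return 10.0 * math.log10(signal_power / noise_power)


print(bit_error_rate([1, 0, 1, 1], [1, 0, 0, 1]))  # 0.25
print(snr_db(100.0, 1.0))
```

The reader would feed such values back to the backscattering node, where they form the channel state information given to the reinforcement learning model.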

Furthermore, the backscattering node is used for monitoring the residual energy of the node in real time, comparing an energy detection result with a set threshold value, judging whether a code rate decision result output by the current reinforcement learning model is suitable for execution or not, when the energy detection result is higher than or equal to the set threshold value, considering that the code rate decision result output by the current reinforcement learning model is suitable for execution, and performing data compression and sending tasks according to a code rate decision result given by the reinforcement learning model as a final code rate; and when the energy detection result is lower than a set threshold value, the code rate decision result output by the current reinforcement learning model is considered to be not suitable for execution, the code rate decision result given by the reinforcement learning model is ignored, and the optional lowest code rate is selected as the final code rate to execute data compression and sending tasks.

Further, feeding the final code rate back to the reinforcement learning model to guide the reinforcement learning model to make correct behaviors;

and calculating the return of the last decision by combining the feedback result of the energy constraint rule and the state vector, and mapping the state vector to the alternative code rate set by utilizing a full-connection network.

Further, a reinforcement learning model is deployed on the backscatter nodes.

Preferably, after the reinforcement learning model is trained in the training stage, the fully-connected bitrate decision network of the reinforcement learning model is subjected to lightweight processing and then deployed on the backscattering node.

Due to the limitation of the power consumption of the backscattering node, in the training stage, the reinforcement learning model is selected to be deployed on a computer outside the system for on-line training, and data are exchanged with the node through a serial port, as shown in fig. 4. In the testing stage, the invention performs lightweight processing on a fully-connected code rate decision network and deploys the fully-connected code rate decision network on nodes to realize corresponding functions, as shown in fig. 3.
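
The data exchange over the serial port during training could use small fixed-size binary frames. The frame layout below is entirely hypothetical, since the text does not describe a message format; it only illustrates how the node's report and the computer's code rate command might be packed into bytes before being written to a serial link.

```python
import struct

# Hypothetical frames: node -> computer carries SNR, BER and the executed flag;
# computer -> node carries an index into the candidate code rate list.
REPORT = struct.Struct("<ffB")  # little-endian: float32, float32, uint8
COMMAND = struct.Struct("<B")   # uint8


def encode_report(snr_db, ber, executed):
    """Pack the node's channel state and energy-rule feedback."""
    return REPORT.pack(snr_db, ber, 1 if executed else 0)


def decode_report(payload):
    """Unpack a report frame on the training computer."""
    snr_db, ber, flag = REPORT.unpack(payload)
    return snr_db, ber, bool(flag)


def encode_command(rate_index):
    """Pack the code rate decision sent back to the node."""
    return COMMAND.pack(rate_index)


def decode_command(payload):
    """Unpack a command frame on the node."""
    return COMMAND.unpack(payload)[0]


frame = encode_report(12.5, 0.001, True)
print(len(frame))                         # 9
print(decode_command(encode_command(2)))  # 2
```

Values stored as float32 lose some precision in transit, so a decoded bit error rate is only approximately equal to the value that was sent.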

The invention combines the traditional code rate self-adaptive strategy and the energy constraint rule, designs the energy perception scheme and the training scheme, can effectively combine the characteristics of the backscattering technology and adjust the code rate according to the environmental state, and achieves better transmission effect. The invention discloses a reinforcement learning training scheme, namely on-line training, suitable for low-power consumption backscattering nodes, and light-weight network deployment can be completed on the backscattering nodes with limited power consumption and resources.

The invention discloses a code rate self-adaptive method aiming at a video acquisition task in back scattering, which is used for self-adaptively adjusting the data rate according to the channel state so as to improve the transmission quality.

The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like that fall within the spirit and principle of the present invention are intended to be included therein.

Claims

1. A backscattering code rate self-adaptive method based on energy perception is characterized in that: the method comprises the following steps:

S1) initializing a code rate strategy;

S2) executing the strategy;

S5) monitoring the residual energy of the backscattering node, comparing an energy detection result with a set threshold value, judging whether a code rate decision result output by the current reinforcement learning model is suitable for execution or not, when the energy detection result is higher than or equal to the set threshold value, considering that the code rate decision result output by the current reinforcement learning model is suitable for execution, and selecting the code rate decision result given by the reinforcement learning model as a final code rate to return to the execution strategy in the step S2); and when the energy detection result is lower than the set threshold, the code rate decision result output by the current reinforcement learning model is considered to be not suitable for execution, the code rate decision result given by the reinforcement learning model is ignored, and the set code rate is selected as the final code rate and is returned to the step S2) to execute the strategy.

2. The energy-aware-based backscattering code rate adaptation method of claim 1, wherein: when the energy detection result is lower than the set threshold in the step S5), the code rate decision result output by the current reinforcement learning model is considered to be not suitable for execution, the code rate decision result given by the reinforcement learning model is ignored, and the optional lowest code rate is selected as the final code rate and is returned to the step S2) to execute the strategy;

the channel state information includes signal-to-noise ratio and bit error rate information.

3. The energy-aware-based backscattering code rate adaptation method of claim 1, wherein: the reinforcement learning model guides next code rate selection according to a feedback result of the energy constraint rule, namely a result of whether the current code rate strategy is really executed, and specifically comprises the following steps:

and the reinforcement learning model receives the feedback result of the channel state information and the energy constraint rule, namely the result of whether the current code rate strategy is really executed or not, and outputs a code rate decision result.

4. The energy-aware-based backscattering code rate adaptation method of claim 1, wherein: in the step S3), the channel state information is input into the trained reinforcement learning model; when the reinforcement learning model is trained, the reinforcement learning model is deployed on a computer outside the system for online training and exchanges data with a backscattering node of the system through a serial port; and after the reinforcement learning model has been trained in the training stage, the fully connected code rate decision network of the reinforcement learning model undergoes lightweight processing and is then deployed on the backscattering node.

5. A backscattering code rate self-adaptive device based on energy perception is characterized in that: comprises a memory for storing a program;

and a processor for implementing the steps of the backscatter code rate adaptation method of any one of claims 1 to 4 when executing the program.

6. A backscattering code rate self-adaptive device based on energy perception is characterized in that: comprises a channel state information acquisition module, a reinforcement learning module, an energy constraint rule module and a strategy execution module,

the energy constraint rule module is used for monitoring the residual energy of the backscattering node, comparing an energy detection result with a set threshold value, judging whether a code rate decision result output by the current reinforcement learning module is suitable for execution or not, when the energy detection result is higher than or equal to the set threshold value, considering that the code rate decision result output by the current reinforcement learning module is suitable for execution, and selecting the code rate decision result given by the reinforcement learning module as a final code rate to be provided for the strategy execution module; when the energy detection result is lower than a set threshold value, the code rate decision result output by the current reinforcement learning module is considered to be not suitable for execution, the code rate decision result given by the reinforcement learning module is ignored, and an optional lowest code rate is selected as a final code rate and is provided to the strategy execution module;

7. The energy-perception-based backscattering code rate self-adaptive device according to claim 6, wherein: the energy constraint rule module feeds back a result of whether the current code rate strategy is really executed to the reinforcement learning module, and the reinforcement learning module guides the next code rate selection according to the feedback result of the energy constraint rule module.

8. A backscattering code rate self-adaptive system based on energy perception is characterized in that: the system comprises a reader, a backscattering node, a sensor and a reinforcement learning model, wherein the sensor is used for collecting original video information and sending the original video information to the reader through the backscattering node, and the reader is used for receiving decoded video data and calculating channel state information and feeding the channel state information back to the backscattering node; the backscattering node is used for inputting channel state information into a reinforcement learning model, and the reinforcement learning model is used for receiving the input of the channel state information and outputting a code rate decision result; the backscattering node is used for monitoring the residual energy of the node in real time, comparing an energy detection result with a set threshold value, judging whether a code rate decision result output by the current reinforcement learning model is suitable for execution or not, when the energy detection result is higher than or equal to the set threshold value, considering that the code rate decision result output by the current reinforcement learning model is suitable for execution, and performing data compression and sending tasks according to a code rate decision result given by the reinforcement learning model as a final code rate; and when the energy detection result is lower than a set threshold value, the code rate decision result output by the current reinforcement learning model is considered to be not suitable for execution, the code rate decision result given by the reinforcement learning model is ignored, and an optional lowest code rate is selected as a final code rate to execute data compression and transmission tasks.

9. The energy-perception-based backscattering code rate self-adaptive system according to claim 8, wherein: the channel state information comprises bit error rate and signal-to-noise ratio information;

the reinforcement learning model guides next code rate selection according to a feedback result of the energy constraint rule, namely a result of whether the current code rate strategy is really executed, and the method specifically comprises the following steps:

10. The energy-perception-based backscattering code rate self-adaptive system according to claim 8, wherein: the fully connected code rate decision network of the trained reinforcement learning model undergoes lightweight processing and is then deployed on the backscattering node.