CN113093124A - DQN algorithm-based real-time allocation method for radar interference resources - Google Patents


Info

Publication number
CN113093124A
Authority
CN
China
Prior art keywords
unmanned aerial vehicle
radar
jam
interference
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110370353.0A
Other languages
Chinese (zh)
Other versions
CN113093124B (en)
Inventor
蒋伊琳
黄星源
尚熙
陈涛
赵忠凯
郭立民
刘鲁涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Engineering University
Original Assignee
Harbin Engineering University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Engineering University filed Critical Harbin Engineering University
Priority to CN202110370353.0A priority Critical patent/CN113093124B/en
Publication of CN113093124A publication Critical patent/CN113093124A/en
Application granted granted Critical
Publication of CN113093124B publication Critical patent/CN113093124B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01S RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S7/00 Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
    • G01S7/02 Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S13/00
    • G01S7/38 Jamming means, e.g. producing false echoes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Abstract

The invention belongs to the technical field of radar interference, and particularly relates to a method for real-time allocation of radar interference resources based on the DQN algorithm. The invention introduces the DQN algorithm into the allocation of the interference-pattern resources carried by an unmanned aerial vehicle, overcomes the shortcomings of the prior art in dynamic, real-time allocation, realizes real-time allocation of the unmanned aerial vehicle's interference-pattern resources from the start of a task to its completion, and can handle the case where the radar switches among multiple operating modes.

Description

DQN algorithm-based real-time allocation method for radar interference resources
Technical Field
The invention belongs to the technical field of radar interference, and particularly relates to a method for real-time allocation of radar interference resources based on a DQN algorithm.
Background
At present, more and more radars adapt automatically to their surroundings, which places ever higher demands on interference-resource allocation strategies: the unmanned aerial vehicle must be able to adjust its strategy adaptively and in real time according to the radar parameters it obtains, so that the currently threatening radars can be interfered with effectively and quickly throughout the flight. Studying the real-time allocation of the unmanned aerial vehicle's interference-pattern resources as a function of flight distance is therefore of great significance.
Allocating interference-pattern resources entails a large amount of data accumulation and computation, which places high demands on the drone's ability to quickly allocate the interference-pattern resources it carries. Existing algorithms applicable to this problem include traditional dynamic programming and population-based intelligent search. Both allocate the drone's interference-pattern resources statically rather than dynamically: the allocation cannot change in real time with the drone's flight distance, in particular when the radar operates in multiple modes (the multifunctional radar is classified here into three operating modes: search, tracking, and guidance). To make up for the shortcomings of allocation under these algorithms, the invention introduces the DQN algorithm into the study of drone interference-pattern resource allocation, compensating for both algorithms' deficiencies in dynamic, real-time allocation and handling the case where the radar switches among multiple operating modes.
Disclosure of Invention
The invention aims to provide a method for real-time allocation of radar interference resources based on a DQN algorithm.
The purpose of the invention is realized by the following technical scheme: the method comprises the following steps:
Step 1: obtain the interference resource pool J = {j_1, j_2, …, j_x}, the radar resource pool P = {P_1, P_2, …, P_m}, and the drone swarm to which resources are to be allocated, jam = {jam_1, jam_2, …, jam_m}; obtain the required success rate SR_max for the task executed by the drone swarm.
Wherein x denotes the number of interference patterns; the number of drones in the swarm equals the number of radars in the environment, namely m.
Step 2: set the distance L from the drone's starting point to the task point, the number of iteration steps num, and the maximum capacity D_max of the experience replay pool; with t = 1, initialize the state of the m radars S_1 = {s_11, s_21, …, s_m1}; initialize the experience replay pool D = ∅.
Wherein s_it denotes the state of drone jam_u interfering with radar P_i at step t; l_u(t) denotes the accumulated flight distance of drone jam_u at step t; fucl_i(t) denotes the state of radar P_i at step t; u = 1, 2, …, m; i = 1, 2, …, m.
Step 3: select the interference action A_t = {a_1t, a_2t, …, a_mt} executed by the drone swarm via a greedy strategy;
wherein a_ut = {P_i, j_k} denotes that drone jam_u performs interference action j_k on radar P_i at step t, j_k ∈ J.
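The ε-greedy selection of step 3 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the Q-network interface (`q_net`, assumed to map a state to an (m drones × x patterns) array of Q-values) and the per-drone readout are assumptions; following the embodiment's convention (ε = 0.9), ε is treated as the probability of exploiting the best known action.

```python
import numpy as np

def select_actions(q_net, state, n_drones, n_patterns, epsilon=0.9, rng=None):
    """epsilon-greedy selection of one jamming pattern per drone.

    q_net(state) is assumed to return an (n_drones, n_patterns) array of
    Q-values; with probability epsilon we exploit, otherwise explore.
    """
    rng = rng or np.random.default_rng()
    q_values = q_net(state)                # shape: (n_drones, n_patterns)
    actions = np.empty(n_drones, dtype=int)
    for u in range(n_drones):
        if rng.random() < epsilon:         # exploit: best-valued pattern
            actions[u] = int(np.argmax(q_values[u]))
        else:                              # explore: uniformly random pattern
            actions[u] = int(rng.integers(n_patterns))
    return actions
```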
And 4, step 4: performing a disturbing action AtThen, according to the reward value RtObtaining the state S of m radarst+1
Figure BDA0003009036530000021
Figure BDA0003009036530000022
Wherein if radar PiKeeping the working mode unchanged, then sitThe change is not changed; if radar PiFrom search mode to tracking mode or from tracking mode to guidance mode, sitIncreasing; if radar PiFrom the guidance mode to the tracking mode, or from the guidance mode to the search mode, or from the tracking mode to the search mode, sitDecrease;
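The per-radar state update of step 4 can be illustrated with a small sketch; note that the numeric encoding of the mode levels and the unit step size are assumptions for illustration, since the patent only specifies the direction of change.

```python
# Hypothetical threat-level encoding of the three operating modes named in
# the patent; the actual numeric encoding of s_it is not specified.
MODE_LEVEL = {"search": 0, "tracking": 1, "guidance": 2}

def update_state(s_it, old_mode, new_mode):
    """Adjust the per-radar state component s_it after a mode transition:
    unchanged if the mode is unchanged, increased on escalation
    (search->tracking, tracking->guidance), decreased on de-escalation."""
    delta = MODE_LEVEL[new_mode] - MODE_LEVEL[old_mode]
    if delta > 0:
        return s_it + 1
    if delta < 0:
        return s_it - 1
    return s_it
```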
Step 5: store (S_t, A_t, R_t, S_{t+1}) in the experience replay pool D; if D has not reached its maximum capacity D_max, set t = t + 1 and return to step 3; otherwise, execute step 6.
Step 6: initialize G_1 = 0, G_2 = 0; randomly sample a batch of samples from the experience pool D, input the combined state s_it and action a_it into the neural network for training, and use the DQN algorithm to correct the output action of the neural network for the state s_it at each step, so that the output of the neural network approaches the action a_it.
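A minimal sketch of the DQN correction in step 6, assuming the standard DQN setup of a separate target network refreshed every C steps (C appears in the parameter list of the detailed description; the exact loss is not spelled out in the patent). The sampled batch is turned into Bellman regression targets that the training network is then fit against:

```python
import numpy as np

def dqn_targets(batch, q_net, target_net, gamma=0.9):
    """Compute DQN regression targets for a sampled minibatch.

    batch: iterable of (state, action, reward, next_state) transitions, as
    stored in the replay pool in step 5. q_net / target_net are assumed to
    map a state to a vector of Q-values over the jamming actions.
    """
    targets = []
    for state, action, reward, next_state in batch:
        y = q_net(state).copy()            # current predictions for all actions
        # Bellman target uses the *target* network, refreshed every C steps
        y[action] = reward + gamma * np.max(target_net(next_state))
        targets.append(y)
    return np.array(targets)
```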
Step 7: use the trained neural network to predict the actions taken by the drone swarm over steps 1 to num, and record whether the swarm successfully reaches the task point after num steps.
Step 8: repeat step 7 and calculate the success rate sr of the drone swarm's task; if sr is greater than SR_max, end training and execute step 9; otherwise, return to step 2.
sr = G_2/G_1
wherein G_1 is the total number of times step 7 is executed, i.e., the total number of flights of the drone swarm, and G_2 is the number of times the swarm completed the flight task.
Step 9: the neural network that meets the required task success rate is used for real-time allocation of the drone swarm's radar interference resources: input the state S_t of the m radars at a given moment into this network to obtain the interference action A_t taken by the drone swarm, i.e., the real-time allocation result of the swarm's radar interference resources.
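Once training meets the required success rate, step 9 reduces to a single greedy forward pass; the (m × x) output shape of `trained_net` and the one-action-per-drone readout are assumptions for illustration:

```python
import numpy as np

def allocate_realtime(trained_net, radar_states):
    """Real-time allocation as a forward pass: the trained network (assumed
    to return an (m drones, x patterns) Q-value array for the current radar
    states) yields one jamming pattern per drone via a greedy readout."""
    q_values = trained_net(radar_states)   # (m drones, x patterns)
    # Greedy readout: each drone takes its highest-valued pattern
    return [int(np.argmax(q_values[u])) for u in range(q_values.shape[0])]
```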
The invention has the beneficial effects that:
the invention introduces the DQN algorithm into the interference pattern resource allocation of the unmanned aerial vehicle, overcomes the defects of the prior art on dynamic and real-time allocation, realizes the real-time allocation of the interference pattern resource of the unmanned aerial vehicle from the start of a task to the completion of the task, and can be used for processing the condition that the radar has a plurality of working mode conversions.
Drawings
Fig. 1 is a DQN learning diagram.
Fig. 2 is a flow chart of DQN algorithm training in conjunction with radar interference strategy assignment.
Fig. 3 is a conversion diagram of the operation mode of the multifunctional radar.
Fig. 4 shows the relationship between the radar and the radial position of the drone.
Fig. 5 is a TensorBoard visualization of the network graph.
Fig. 6 is an interference resource allocation diagram at t = 20 steps.
Fig. 7 is an interference resource allocation diagram at t = 40 steps.
Fig. 8 is an interference resource allocation diagram at t = 60 steps.
Fig. 9 is an interference resource allocation diagram at t = 80 steps.
FIG. 10 is a graph of error as a function of iteration number.
FIG. 11 is a graph of flight success rate as a function of iteration number.
Detailed Description
The invention is further described below with reference to the accompanying drawings.
The invention aims to provide a DQN-based method for dynamically allocating the interference-pattern resources carried by an unmanned aerial vehicle, and in particular a method that handles a radar with multiple operating modes so as to realize real-time allocation of interference-pattern resources from the start of the drone's task to its completion.
The invention uses the DQN algorithm as the solving tool; the network structure is shown in Fig. 1. It is applied to the allocation of interference resources against several multifunctional radars: on the basis of a complex electronic-countermeasures environment and a one-to-one jamming scheme, a dynamic allocation strategy for interference resources that varies with the flight distance of the drone swarm is studied. The flow of the whole scheme is shown in Fig. 2 and comprises the following steps:
Step 1: bring the electronic-countermeasures information into the interference resource pool J = {j_1, j_2, …, j_x} and the radar resource pool P = {P_1, P_2, …, P_m}; the drone swarm to which interference resources are to be allocated is jam = {jam_1, jam_2, …, jam_m}. x denotes the number of interference patterns.
Ground radar resource pool P = {P_1, P_2, …, P_m}, where m denotes the number of radars in the environment. P_i = {fucl, sys, pp, gr, qs} denotes the i-th multifunctional radar, where fucl is the set of parameters for the radar's different operating modes; qs denotes the radar's anti-interference measures; sys denotes the radar constitution, i.e., the radar type; pp denotes the peak power of the radar (kW); and gr denotes the radar antenna gain (dB).
Relevant parameters in fucl: fucl_j = {pw_j, bw_j, prf_j, rf_j}, j = 0–2, denoting the three different operating modes of the multifunctional radar, where pw_j, bw_j, prf_j, rf_j are respectively the radar signal pulse width, the receiver bandwidth, the pulse repetition frequency, and the carrier frequency in each mode.
The drone swarm to which interference resources are to be allocated is jam = {jam_1, jam_2, …, jam_m}, where m denotes the number of drones. The i-th drone is jam_i = {p_jam, gj, bw_jam, J}, where p_jam is the drone's power (W), gj the drone's antenna gain (dB), and bw_jam the drone's bandwidth (MHz).
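The resource pools described above can be sketched as plain data structures. The field names follow the patent's symbols; the concrete types and class layout are illustrative assumptions:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class RadarMode:
    pw: float    # radar signal pulse width
    bw: float    # receiver bandwidth
    prf: float   # pulse repetition frequency
    rf: float    # carrier frequency

@dataclass
class Radar:
    fucl: List[RadarMode]  # parameter sets for the operating modes
    sys: int               # radar constitution / type
    pp: float              # peak power (kW)
    gr: float              # antenna gain (dB)
    qs: List[str]          # anti-interference measures

@dataclass
class Jammer:
    p_jam: float    # jamming power (W)
    gj: float       # antenna gain (dB)
    bw_jam: float   # bandwidth (MHz)
    patterns: List[int] = field(default_factory=list)  # indices into pool J
```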
Step 2: setting of relevant parameters in DQN networks, Dnum(empirical playback set size), γ (reward discount factor), r (learning rate), ε (ε -greedy), C (number of network weight reset steps).
And step 3: training of the DQN network is started.The distance from a starting point to a task point of the unmanned aerial vehicle is L, the unmanned aerial vehicle is divided into num steps, t represents that the unmanned aerial vehicle flies t steps, the initialized t is 1, and the state S of m radars is detected1={s11,s21,...,sm1}; initializing an experience playback pool
Figure BDA0003009036530000041
Having a capacity of Dmax(ii) a Initializing a randomly generated weight θ1
Wherein the content of the first and second substances,
Figure BDA0003009036530000042
express the u unmanned plane jamuThe state of the interference ith radar in the step t; jamuRepresents the accumulated flight distance of the unmanned plane during the step t, and
Figure BDA0003009036530000043
Figure BDA0003009036530000044
Step 4: select the interference action A_t = {a_1t, a_2t, …, a_mt} executed by the drone swarm via a greedy strategy;
wherein a_it = {P_i, j_k} denotes that drone jam_i performs interference action j_k on radar P_i at step t, j_k ∈ J.
And 5: performing a disturbing action AtThen, the state S of m radar parts is obtainedt+1To obtain a reward value RtAs shown in the following formula (2)
Figure BDA0003009036530000045
Figure BDA0003009036530000046
Wherein r ist(i) And (4) indicating that the unmanned aerial vehicle interferes with the reward value obtained by the ith radar when flying to the step t. RtRepresenting the total reward value obtained by the interfering m radars at step t.
When the i-th radar goes from search mode to tracking mode and then to guidance mode, s_it increases in turn; conversely, it decreases; if the radar keeps its operating mode unchanged, s_it does not change. The radar operating-mode transitions are shown in Fig. 3; a radar's mode transition is detected by the drone through changes in the fucl parameters of step 1.
Step 6: will (S)t,At,Rt,St+1) And storing the samples into an experience pool D, and if the number of the samples stored in the experience pool D is not enough, entering a step 4, and making t equal to t +1 until D is full. And on the contrary, randomly sampling a batch of samples from the experience pool D every C step in the training process to adjust the internal parameters of the training network.
And 7: and if the experience pool is full, training is started from steps of 1-num in sequence. Training the network is to state sitAnd action aitCombining and carrying out learning in a neural network, utilizing the advantages of the neural network, and utilizing the characteristics in the DQN algorithm to carry out learning on the current state s at each stepitAction taken aitCorrecting the reward gradually to the optimal action aitTo get close. If the whole unmanned plane with the steps of 1-num flies successfully, G2Adding 1; whether failed or successful, G1And adding 1.
And 8: if the total flying times of the unmanned aerial vehicle is G at the moment1The number of times that the unmanned plane successfully completes the flight mission is G2The success rate of obtaining the unmanned aerial vehicle to execute the task is shown as the formula (4), and when SR is larger than the task requirement SRmaxAnd ending the training and entering the step 9, otherwise, continuing to execute the step 3.
sr=(G2/G1) (4)
And step 9: at this point, the training of the interference pattern resource allocation by using the DQN algorithm is finished, and the internal neural network parameters are trained. Now we input the corresponding state StCan pass through DQN networkAnd obtaining a corresponding optimal interference pattern resource allocation result according to the training result.
Example 1:
The invention provides a DQN-based method for dynamically allocating the interference-pattern resources carried by an unmanned aerial vehicle, in particular one that handles a radar with multiple operating modes to realize real-time allocation from the start of the drone's task to its completion. To verify the effectiveness of the method, the DQN algorithm is applied, following Fig. 2, to allocate in real time the drone interference resources that vary with the flight path.
The method comprises the following steps:
Step 1: obtain the interference resource pool J, the radar resource pool P = {P_1, P_2, P_3, P_4}, the countermeasure-environment resource pool E = {E_1, E_2}, and the drone swarm to which resources are to be allocated, jam = {jam_1, jam_2, jam_3, jam_4};
wherein J = {j_1, j_2, j_3, j_4, j_5, j_6, j_7}: j_1 denotes noise amplitude-modulation suppression jamming, j_2 noise frequency-modulation suppression jamming, j_3 smart-noise convolution jamming, j_4 dense-false-target suppression jamming, j_5 range-gate pull-off deception jamming, j_6 velocity-gate pull-off deception jamming, and j_7 combined range and velocity pull-off deception jamming.
The ground radar resource pool P = {P_1, P_2, P_3, P_4} comprises the established radars, each with the two basic anti-interference capabilities of pulse compression and pulse accumulation. We denote a ranging radar by 0, a pulse-Doppler radar by 1, and an MTI (moving target indication) radar by 2.
P_1 = {fucl, 0, 320, 32, qs}, where qs adds the pulse leading-edge tracking anti-interference measure; when the radar is in the search state, fucl = {32, 24, 0.3, 8.7}; in the tracking state, fucl = {15, 40, 1.2, 8.7}.
P_2 = {fucl, 1, 250, 33, qs}, where qs adds the clutter-cancellation and pulse leading-edge tracking anti-interference measures; when the radar is in the search state, fucl = {20, 24, 0.5, 10.3}; in the tracking state, fucl = {5, 60, 1.5, 11.1}.
P_3 = {fucl, 2, 180, 34, qs}, where qs adds the clutter-cancellation and velocity-discrimination anti-interference measures; when the radar is in the search state, fucl = {15, 32, 0.8, 9.5}; in the tracking state, fucl = {8, 50, 1.8, 9.5}.
P_4 = {fucl, 1, 220, 33, qs}, where qs adds the clutter-cancellation and velocity-discrimination anti-interference measures; when the radar is in the search state, fucl = {15, 32, 0.8, 11.8}; in the tracking state, fucl = {4, 60, 2.4, 11.8}.
The drone swarm to which interference resources are to be allocated is jam = {jam_1, jam_2, jam_3, jam_4}, where m = 4 denotes the number of drones. The i-th drone is jam_i = {p_jam, gj, bw_jam, J}, where p_jam is the drone's power (W), gj the drone's antenna gain (dB), and bw_jam the drone's bandwidth (MHz).
jam_1 = jam_2 = jam_3 = jam_4 = {10, 9, 200, J}, with J ranging over j_1 to j_7.
rd_m and jd_m denote the position coordinates (km) of the m-th radar and drone, respectively. The specific coordinates are set as follows:
rd_1 = [-30, 200], rd_2 = [30, 120], rd_3 = [-20, 40], rd_4 = [20, 0]; jd_1 = jd_2 = jd_3 = jd_4 = [0, 10].
The position information of the drones and radars can therefore be described in these two-dimensional coordinates, the drone-radar distances can be calculated from them, and the radial-distance variation between the drones and each radar is shown in Fig. 4.
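The drone-radar radial distances of Fig. 4 follow directly from the coordinates above; a minimal sketch (the dictionary layout is an illustrative choice):

```python
import math

# Coordinates (km) from the embodiment; jd is each drone's starting point.
RD = {1: (-30, 200), 2: (30, 120), 3: (-20, 40), 4: (20, 0)}
JD = (0, 10)

def radial_distance(drone_xy, radar_xy):
    """Euclidean distance between a drone and a radar in the 2-D plane;
    recomputing this at each flight step gives the range variation of Fig. 4."""
    dx = drone_xy[0] - radar_xy[0]
    dy = drone_xy[1] - radar_xy[1]
    return math.hypot(dx, dy)
```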
Step 2: relevant parameters in the DQN network are set, D (empirical playback set size) is 2000, γ (reward discount factor) is 0.9, r (learning rate) is 0.001, e (e-greedy) is 0.9, and C (reset network weight step number) is 200.
And step 3: training of the DQN network is started. The distance from the starting point to the task point of the whole unmanned aerial vehicle is 300KM, the whole unmanned aerial vehicle is divided into 100 steps, t represents that the unmanned aerial vehicle flies t steps, the initialized t is 1, and the states S of m radars are obtained1={s11,s21,s31,s41}; initializing an experience playback pool
Figure BDA0003009036530000071
Its capacity is 2000; initializing a randomly generated weight θ1
Wherein the content of the first and second substances,
Figure BDA0003009036530000072
express the u unmanned plane jamuThe state of the interference ith radar in the step t; jamuRepresents the accumulated flight distance of the unmanned plane during the step t, and
Figure BDA0003009036530000073
Figure BDA0003009036530000074
Step 4: select the interference action A_t = {a_1t, a_2t, a_3t, a_4t} executed by the drone swarm via a greedy strategy; wherein a_it = {P_i, j_k} denotes that drone jam_i performs interference action j_k on radar P_i at step t, j_k ∈ J.
And 5: performing a disturbing action AtThen, the state S of m radar parts is obtainedt+1To obtain a reward value RtAs shown in the following formula (6)
Figure BDA0003009036530000075
Figure BDA0003009036530000076
Wherein r ist(i) And (4) indicating that the unmanned aerial vehicle interferes with the reward value obtained by the ith radar when flying to the step t. RtRepresenting the total reward value obtained by the interfering 4 radars at step t.
When the i-th radar goes from search mode to tracking mode and then to guidance mode, s_it increases in turn; conversely, it decreases; if the radar keeps its operating mode unchanged, s_it does not change. The radar operating-mode transitions are shown in Fig. 3; a radar's mode transition is detected by the drone through changes in the fucl parameters of step 1.
Step 6: will (S)t,At,Rt,St+1) And storing the samples into an experience pool D, and if the number of the samples stored in the experience pool D is not enough, entering a step 4, and making t equal to t +1 until D is full. On the contrary, a batch of samples are randomly sampled from the experience pool D every 200 steps in the training process to adjust the internal parameters of the training network.
And 7: and (4) when the experience pool is full, starting training from 1-100 steps in sequence. Training the network is to state sitAnd action aitCombining and carrying out learning in a neural network, utilizing the advantages of the neural network, and utilizing the characteristics in the DQN algorithm to carry out learning on the current state s at each stepitAction taken aitCorrecting the reward gradually to the optimal action aitTo get close. If the whole unmanned aerial vehicle with 1-100 steps flies successfully, G2Adding 1; whether failed or successful, G1Plus 1, and G1、G2The initial value is zero.
And 8: if the total flying times of the unmanned aerial vehicle is G at the moment1The number of times that the unmanned plane successfully completes the flight mission is G2The success rate of obtaining the unmanned aerial vehicle to execute the task is shown as the formula (4), and when SR is larger than the task requirement SRmaxAnd ending the training and entering the step 9, otherwise, continuing to execute the step 3.
And step 9: at this point, the training of the interference pattern resource allocation by using the DQN algorithm is finished, and the internal neural network parameters are trained. Now we input the corresponding state StCan be trained by DQN networkAnd obtaining a corresponding optimal interference pattern resource allocation result.
The results of dynamic allocation of interference resources through the DQN network are shown in Figs. 6 to 9, where t is the drone's number of flight steps. After training and learning with the DQN algorithm, the optimal interference-pattern resource allocation result in the current environment is obtained; Fig. 10 shows the DQN error function as the number of iterations varies. The TensorBoard visualization of the code framework is shown in Fig. 5.
2. Analysis of simulation results
The interference resource allocation results in the simulation environment are shown in Figs. 6 to 9. Over the swarm's entire flight, the allocation of interference resources changes dynamically with flight distance; the results show that, at different moments during the flight, different jammers adopt different interference patterns against the different multifunctional radars, thereby completing the flight task. The experiment ran 1600 simulations. Although the DQN error function fluctuates by about 0.2 between iterations 1200 and 1600, it essentially converges to between 0.1 and 0.3, so the interference resource allocation essentially converges. From the final flight success rate curve in Fig. 11, the success rate of the jamming achieved with the DQN algorithm finally stabilizes above 70%, and the overall allocation result is good over the whole jamming process. This meets the requirement of dynamic interference-resource allocation and further verifies the feasibility and effectiveness of the method.
The above description is only a preferred embodiment of the present invention and is not intended to limit it; those skilled in the art may make various modifications and changes. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its protection scope.

Claims (1)

1. A DQN algorithm-based real-time allocation method for radar interference resources is characterized by comprising the following steps:
step 1: obtaining the interference resource pool J = {j_1, j_2, …, j_x}, the radar resource pool P = {P_1, P_2, …, P_m}, and the drone swarm to which resources are to be allocated, jam = {jam_1, jam_2, …, jam_m}; obtaining the required success rate SR_max for the task executed by the drone swarm;
wherein x denotes the number of interference patterns; the number of drones in the swarm equals the number of radars in the environment, namely m;
step 2: setting the distance L from the drone's starting point to the task point, the number of iteration steps num, and the maximum capacity D_max of the experience replay pool; with t = 1, initializing the state of the m radars S_1 = {s_11, s_21, …, s_m1}; initializing the experience replay pool D = ∅;
wherein s_it denotes the state of drone jam_u interfering with radar P_i at step t; l_u(t) denotes the accumulated flight distance of drone jam_u at step t; fucl_i(t) denotes the state of radar P_i at step t; u = 1, 2, …, m; i = 1, 2, …, m;
step 3: selecting the interference action A_t = {a_1t, a_2t, …, a_mt} executed by the drone swarm via a greedy strategy;
wherein a_ut = {P_i, j_k} denotes that drone jam_u performs interference action j_k on radar P_i at step t, j_k ∈ J;
And 4, step 4: performing a disturbing action AtThen, according to the reward value RtGet m partState S of radart+1
Figure FDA0003009036520000016
Figure FDA0003009036520000015
Wherein if radar PiKeeping the working mode unchanged, then sitThe change is not changed; if radar PiFrom search mode to tracking mode or from tracking mode to guidance mode, sitIncreasing; if radar PiFrom the guidance mode to the tracking mode, or from the guidance mode to the search mode, or from the tracking mode to the search mode, sitDecrease;
step 5: storing (S_t, A_t, R_t, S_{t+1}) in the experience replay pool D; if D has not reached its maximum capacity D_max, setting t = t + 1 and returning to step 3; otherwise, executing step 6;
step 6: initializing G_1 = 0, G_2 = 0; randomly sampling a batch of samples from the experience pool D, inputting the combined state s_it and action a_it into the neural network for training, and using the DQN algorithm to correct the output action of the neural network for the state s_it at each step, so that the output of the neural network approaches the action a_it;
step 7: predicting, with the trained neural network, the actions taken by the drone swarm over steps 1 to num, and recording whether the swarm successfully reaches the task point after num steps;
step 8: repeating step 7 and calculating the success rate sr of the drone swarm's task; if sr is greater than SR_max, ending training and executing step 9; otherwise, returning to step 2;
sr = G_2/G_1
wherein G_1 is the total number of times step 7 is executed, i.e., the total number of flights of the drone swarm, and G_2 is the number of times the swarm completed the flight task;
step 9: using the neural network that meets the required task success rate for real-time allocation of the drone swarm's radar interference resources: inputting the state S_t of the m radars at a given moment into this network to obtain the interference action A_t taken by the drone swarm, i.e., the real-time allocation result of the swarm's radar interference resources.
CN202110370353.0A 2021-04-07 2021-04-07 DQN algorithm-based real-time allocation method for radar interference resources Active CN113093124B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110370353.0A CN113093124B (en) 2021-04-07 2021-04-07 DQN algorithm-based real-time allocation method for radar interference resources

Publications (2)

Publication Number Publication Date
CN113093124A true CN113093124A (en) 2021-07-09
CN113093124B CN113093124B (en) 2022-09-02

Family

ID=76674257

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110370353.0A Active CN113093124B (en) 2021-04-07 2021-04-07 DQN algorithm-based real-time allocation method for radar interference resources

Country Status (1)

Country Link
CN (1) CN113093124B (en)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150260828A1 (en) * 2012-10-27 2015-09-17 Valeo Schalter Und Sensoren Gmbh Method for suppressing interference in a received signal of a radar sensor of a motor vehicle and corresponding driver assistance device
US9622133B1 (en) * 2015-10-23 2017-04-11 The Florida International University Board Of Trustees Interference and mobility management in UAV-assisted wireless networks
CN108710110A (en) * 2018-04-11 2018-10-26 哈尔滨工程大学 A kind of cognitive interference method based on Markov process decision
CN108777872A (en) * 2018-05-22 2018-11-09 中国人民解放军陆军工程大学 A kind of anti-interference model of depth Q neural networks and intelligent Anti-interference algorithm
CN109444832A (en) * 2018-10-25 2019-03-08 哈尔滨工程大学 Colony intelligence interfering well cluster method based on more jamming effectiveness values
CN109862610A (en) * 2019-01-08 2019-06-07 华中科技大学 A kind of D2D subscriber resource distribution method based on deeply study DDPG algorithm
CN109884599A (en) * 2019-03-15 2019-06-14 西安电子科技大学 A kind of radar chaff method, apparatus, computer equipment and storage medium
CN110031807A (en) * 2019-04-19 2019-07-19 电子科技大学 A kind of multistage smart noise jamming realization method based on model-free intensified learning
CN110515045A (en) * 2019-08-30 2019-11-29 河海大学 A kind of radar anti-interference method and system based on Q- study
CN111199127A (en) * 2020-01-13 2020-05-26 西安电子科技大学 Radar interference decision method based on deep reinforcement learning
CN111970072A (en) * 2020-07-01 2020-11-20 中国人民解放军陆军工程大学 Deep reinforcement learning-based broadband anti-interference system and anti-interference method
CN112435275A (en) * 2020-12-07 2021-03-02 中国电子科技集团公司第二十研究所 Unmanned aerial vehicle maneuvering target tracking method integrating Kalman filtering and DDQN algorithm
CN112543038A (en) * 2020-11-02 2021-03-23 杭州电子科技大学 Intelligent anti-interference decision method of frequency hopping system based on HAQL-PSO

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
KOZY, M等: "Applying Deep-Q Networks to Target Tracking to Improve Cognitive Radar", 《2019 IEEE RADAR CONFERENCE (RADARCONF)》 *
VAN HASSELT H等: "Deep reinforcement learning with double Q-Learning", 《NATIONAL CONFERENCE ON ARTIFICIAL INTELLIGENCE》 *
张柏开 et al.: "Q-Learning-based cognitive jamming decision method for multifunctional radar", Telecommunication Engineering *
张柏开 et al.: "DQN cognitive jamming decision method for multifunctional radar", Systems Engineering and Electronics *
杨鸿杰 et al.: "Research on intelligent jamming algorithms based on reinforcement learning", Electronic Measurement Technology *
王帅康: "Research on autonomous landing method of UAV based on deep reinforcement learning", China Masters' Theses Full-text Database, Engineering Science and Technology II *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114444398A (en) * 2022-02-08 2022-05-06 扬州宇安电子科技有限公司 Grey wolf algorithm-based networking radar cooperative interference resource allocation method
CN114444398B (en) * 2022-02-08 2022-11-01 扬州宇安电子科技有限公司 Grey wolf algorithm-based networking radar cooperative interference resource allocation method
CN114509732A (en) * 2022-02-21 2022-05-17 四川大学 Deep reinforcement learning anti-interference method of frequency agile radar
CN114509732B (en) * 2022-02-21 2023-05-09 四川大学 Deep reinforcement learning anti-interference method of frequency agile radar

Also Published As

Publication number Publication date
CN113093124B (en) 2022-09-02

Similar Documents

Publication Publication Date Title
CN113093124B (en) DQN algorithm-based real-time allocation method for radar interference resources
CN111090078B (en) Networking radar residence time optimal control method based on radio frequency stealth
CN113341383B (en) Anti-interference intelligent decision method for radar based on DQN algorithm
CN107329136B (en) MIMO radar multi-target self-adaptive tracking method based on variable analysis time
CN106682820A (en) Optimized radar task scheduling method of digital array based on pulse interlacing
CN111812599B (en) Networking radar optimal waveform design method based on low interception performance under game condition
CN111190176B (en) Self-adaptive resource management method of co-location MIMO radar networking system
CN116299408B (en) Multi-radar autonomous cooperative detection system and detection method
CN115567353B (en) Interference multi-beam scheduling and interference power combined optimization method for radar networking system
CN115343680A (en) Radar anti-interference decision method based on deep reinforcement learning and combined frequency hopping and pulse width distribution
CN113311857A (en) Environment sensing and obstacle avoidance system and method based on unmanned aerial vehicle
Zhang et al. Research on decision-making system of cognitive jamming against multifunctional radar
CN109633587B (en) Adaptive adjustment method for networking radar signal bandwidth
Zhang et al. Performance analysis of deep reinforcement learning-based intelligent cooperative jamming method confronting multi-functional networked radar
CN112051552A (en) Multi-station-based main lobe anti-interference method and device
CN113376607B (en) Airborne distributed radar small sample space-time self-adaptive processing method
CN115236607A (en) Radar anti-interference strategy optimization method based on double-layer Q learning
Wicks Cognitive radar: A way forward
CN109212494B (en) Radio frequency stealth interference waveform design method for networking radar system
Goodman Foundations of cognitive radar for next-generation radar systems
CN113114399B (en) Three-dimensional spectrum situation complementing method and device based on generation countermeasure network
Ding et al. Collaborative route optimization and resource management strategy for multi-target tracking in airborne radar system
CN114675262A (en) Hypersonic aircraft searching method based on guide information
Bi et al. Optimization method of passive omnidirectional buoy array in on-call anti-submarine search based on improved NSGA-II
Zhang et al. An Intelligent Strategy Decision Method for Collaborative Jamming Based On Hierarchical Multi-Agent Reinforcement Learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant