CN116599575B - Simulation environment construction method and device for large-scale remote sensing task system - Google Patents

Info

Publication number
CN116599575B
Authority
CN
China
Prior art keywords
task
satellite
information
data transmission
tensor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310868770.7A
Other languages
Chinese (zh)
Other versions
CN116599575A (en)
Inventor
于嘉宁
王世金
王月
王江斌
金勇�
徐颖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Digital Space Beijing Technology Co ltd
Original Assignee
Digital Space Beijing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Digital Space Beijing Technology Co ltd filed Critical Digital Space Beijing Technology Co ltd
Priority to CN202310868770.7A priority Critical patent/CN116599575B/en
Publication of CN116599575A publication Critical patent/CN116599575A/en
Application granted granted Critical
Publication of CN116599575B publication Critical patent/CN116599575B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B7/00 Radio transmission systems, i.e. using radiation field
    • H04B7/14 Relay systems
    • H04B7/15 Active relay systems
    • H04B7/185 Space-based or airborne stations; Stations for satellite systems
    • H04B7/1851 Systems using a satellite or space-based relay
    • H04B7/18519 Operations control, administration or maintenance
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Artificial Intelligence (AREA)
  • Medical Informatics (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Astronomy & Astrophysics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Radio Relay Systems (AREA)

Abstract

The application discloses a method and device for constructing a simulation environment of a large-scale remote sensing task system. The method comprises: calculating, according to satellite task resource information, a data transmission window list of each satellite to the ground stations and a task execution opportunity list of each satellite for each task target in a multi-satellite multi-station system; calculating a task planning period of each satellite and an environmental state feature space of the multi-satellite multi-station system; generating, within the task planning period of each satellite, an environmental state of the multi-satellite multi-station system according to the parameters in the environmental state feature space; and, within the task planning period, having a decision system select tasks to be executed according to the task execution opportunity list, select corresponding data download windows according to the data transmission window list, and update the environmental state. Embodiments of the application make it possible to build an effective overall environmental state from a small number of state parameters, so as to describe a complex multi-satellite multi-station task planning system.

Description

Simulation environment construction method and device for large-scale remote sensing task system
Technical Field
The application relates to the technical field of satellite control, in particular to the construction of satellite environmental states, and more particularly to a method and device for constructing a simulation environment of a large-scale remote sensing task system, an electronic device, and a computer-readable storage medium.
Background
A multi-satellite multi-station satellite system is very complex, and building a simulation environment for it is generally difficult. When a simulation environment of a satellite remote sensing task system is built for reinforcement learning algorithms, the high complexity of the system makes the corresponding environmental state particularly difficult to construct.
Constructing a complex environmental state from a complex task system often leads to a large amount of computation and slow execution for the related classical algorithms. Moreover, an overly complex environmental state makes it harder to train artificial intelligence models with methods such as reinforcement learning, and reduces the robustness and range of application of those models.
Disclosure of Invention
In view of this, the embodiments of the present application provide a method and device for constructing a simulation environment of a large-scale remote sensing task system, an electronic device, and a computer-readable storage medium, which solve at least one of the technical problems described above.
An embodiment of the application provides a method for constructing a simulation environment of a large-scale remote sensing task system, comprising:
calculating, according to satellite task resource information, a data transmission window list of each satellite to the ground stations in a multi-satellite multi-station system, and a task execution opportunity list of each satellite for each task target;
calculating, according to the data transmission window list and the task execution opportunity list, a task planning period of each satellite and an environmental state feature space of the multi-satellite multi-station system, wherein the environmental state feature space comprises the following parameters: the total number of satellites in the system, the maximum number of executable task opportunities of a single satellite in the system, the maximum number of data transmission windows of a single satellite, and the maximum number of orbits around the Earth of a single satellite within its planning period;
generating, within the task planning period of each satellite, an environmental state of the multi-satellite multi-station system according to the parameters in the environmental state feature space, wherein the environmental state is represented by the environmental state constants and environmental state variables required in the satellite task planning process;
within the task planning period, a decision system selecting tasks to be executed according to the task execution opportunity list, selecting corresponding data download windows according to the data transmission window list, and updating the environmental state; wherein
the environmental state constants include at least:
task identification information, representing the unique identifier of the task corresponding to a task opportunity;
task storage information, representing the size of the data generated when a satellite executes a task;
task orbit number information, representing the orbit number the satellite is in when each task execution opportunity occurs;
a window task mapping relation, representing the temporal correspondence between data transmission windows and tasks;
task conflict information, representing conflicts between tasks caused by time constraints;
task value information, representing the weight or value of the corresponding task; and
task order information, representing the position of a task opportunity in the task execution opportunity list;
the environmental state variables include at least:
task availability information, indicating whether a task of the multi-satellite multi-station system is available during task planning;
window storage information, representing the remaining amount of data that can be downloaded in a data transmission window;
satellite storage information, representing the remaining on-board storage space before the satellite enters its next data transmission window;
window availability information, indicating whether a data transmission window can download the data of a task;
satellite energy margin information, indicating the number of tasks a single satellite can still execute; and
task remaining execution count information, indicating the number of times the task corresponding to a task opportunity can still be executed in the remaining planning time.
An embodiment of the application provides a device for constructing a simulation environment of a large-scale remote sensing task system, comprising:
a first calculation module, configured to calculate, according to satellite task resource information, a data transmission window list of each satellite to the ground stations in a multi-satellite multi-station system, and a task execution opportunity list of each satellite for each task target;
a second calculation module, configured to calculate, according to the data transmission window list and the task execution opportunity list, a task planning period of each satellite and an environmental state feature space of the multi-satellite multi-station system, wherein the environmental state feature space comprises the following parameters: the total number of satellites in the system, the maximum number of executable task opportunities of a single satellite in the system, the maximum number of data transmission windows of a single satellite, and the maximum number of orbits around the Earth of a single satellite within its planning period;
a generating module, configured to generate, within the task planning period of each satellite, an environmental state of the multi-satellite multi-station system according to the parameters in the environmental state feature space, wherein the environmental state is represented by the environmental state constants and environmental state variables required in the satellite task planning process;
an updating module, configured so that, within the task planning period, a decision system selects tasks to be executed according to the task execution opportunity list, selects corresponding data download windows according to the data transmission window list, and updates the environmental state; wherein
the environmental state constants include at least:
task identification information, representing the unique identifier of the task corresponding to a task opportunity;
task storage information, representing the size of the data generated when a satellite executes a task;
task orbit number information, representing the orbit number the satellite is in when each task execution opportunity occurs;
a window task mapping relation, representing the temporal correspondence between data transmission windows and tasks;
task conflict information, representing conflicts between tasks caused by time constraints;
task value information, representing the weight or value of the corresponding task; and
task order information, representing the position of a task opportunity in the task execution opportunity list;
the environmental state variables include at least:
task availability information, indicating whether a task of the multi-satellite multi-station system is available during task planning;
window storage information, representing the remaining amount of data that can be downloaded in a data transmission window;
satellite storage information, representing the remaining on-board storage space before the satellite enters its next data transmission window;
window availability information, indicating whether a data transmission window can download the data of a task;
satellite energy margin information, indicating the number of tasks a single satellite can still execute; and
task remaining execution count information, indicating the number of times the task corresponding to a task opportunity can still be executed in the remaining planning time.
An embodiment of the present application provides an electronic device including a processor and a memory storing computer program instructions; the processor, when executing the computer program instructions, implements the steps of the method as described above.
Embodiments of the present application provide a computer readable storage medium having stored thereon computer program instructions which when executed by a processor perform the steps of the method as described above.
With the embodiments of the application, an effective simulation environment is constructed by extracting the key information relevant to task planning, so as to describe a complex multi-satellite multi-station task planning system. This greatly reduces the complexity of the simulation environment and therefore the computational complexity of task planning algorithms that run in it. The method also provides an effective environmental state space for reinforcement-learning-based artificial intelligence models, which reduces the difficulty of model training and increases the robustness and range of application of the model.
Drawings
In order to more clearly describe the technical solution of the embodiments of the present application, the following description briefly describes the drawings in the embodiments of the present application.
FIG. 1 is a flow chart of a method for constructing a simulation environment of a large-scale remote sensing task system according to an embodiment of the present application.
FIG. 2 is a schematic diagram of an environmental state of a multi-satellite multi-station system according to an embodiment of the present application.
FIG. 3 is a block diagram of a simulation environment construction apparatus for a large-scale remote sensing task system according to an embodiment of the present application.
FIG. 4 is a schematic diagram of an electronic device for implementing a method for building a simulation environment for a large-scale remote sensing task system in accordance with an embodiment of the present application.
Detailed Description
The principles and spirit of the present application will be described below with reference to several exemplary embodiments. It will be appreciated that these embodiments are provided to make the principles and spirit of the application clear and thorough, and to enable those skilled in the art to better understand and practice them. The exemplary embodiments provided herein are only some, not all, of the embodiments of the application. All other embodiments that a person of ordinary skill in the art can derive from the embodiments herein without creative effort fall within the scope of the present application.
Embodiments of the present application relate to a terminal device and/or a server. Those skilled in the art will appreciate that embodiments of the application may be implemented as a system, apparatus, device, method, or computer readable storage medium. Accordingly, the present disclosure may be embodied in at least one of the following forms: complete hardware, complete software, or a combination of hardware and software. According to an embodiment of the application, the application discloses a simulation environment construction method, a simulation environment construction device, electronic equipment and a computer readable storage medium of a large-scale remote sensing task system.
In this document, terms such as first, second, third, etc. are used solely to distinguish one entity (or action) from another entity (or action) without necessarily requiring or implying any order or relationship between such entities (or actions).
FIG. 1 shows a flow chart of a method for constructing a simulation environment of a large-scale remote sensing task system according to an embodiment of the application. The method comprises the following steps:
S101: calculating, according to satellite task resource information, a data transmission window list of each satellite to the ground stations in a multi-satellite multi-station system, and a task execution opportunity list of each satellite for each task target.
S102: calculating, according to the data transmission window list and the task execution opportunity list, a task planning period of each satellite and an environmental state feature space of the multi-satellite multi-station system, wherein the environmental state feature space comprises the following parameters: the total number of satellites in the system, the maximum number of executable task opportunities of a single satellite in the system, the maximum number of data transmission windows of a single satellite, and the maximum number of orbits around the Earth of a single satellite within its planning period.
S103: generating, within the task planning period of each satellite, an environmental state of the multi-satellite multi-station system according to the parameters in the environmental state feature space, wherein the environmental state is represented by the environmental state constants and environmental state variables required in the satellite task planning process.
S104: within the task planning period, a decision system selects tasks to be executed according to the task execution opportunity list, selects corresponding data download windows according to the data transmission window list, and updates the environmental state.
The environmental state constants include at least:
task identification information, representing the unique identifier of the task corresponding to a task opportunity;
task storage information, representing the size of the data generated when a satellite executes a task;
task orbit number information, representing the orbit number the satellite is in when each task execution opportunity occurs;
a window task mapping relation, representing the temporal correspondence between data transmission windows and tasks;
task conflict information, representing conflicts between tasks caused by time constraints;
task value information, representing the weight or value of the corresponding task; and
task order information, representing the position of a task opportunity in the task execution opportunity list.
The environmental state variables include at least:
task availability information, indicating whether a task of the multi-satellite multi-station system is available during task planning;
window storage information, representing the remaining amount of data that can be downloaded in a data transmission window;
satellite storage information, representing the remaining on-board storage space before the satellite enters its next data transmission window;
window availability information, indicating whether a data transmission window can download the data of a task;
satellite energy margin information, indicating the number of tasks a single satellite can still execute; and
task remaining execution count information, indicating the number of times the task corresponding to a task opportunity can still be executed in the remaining planning time.
According to the embodiment of the application, a data transmission window list and a task execution opportunity list are first calculated from the raw satellite task data. This step determines the time ranges in which each satellite can be operated, taking into account the available resources and the specific requirements of task planning.
The boundaries of the simulation environment space are then calculated from these results, including the task plannable period of each satellite and the size of the environmental state feature space. A task opportunity that falls within a satellite's task plannable period is an executable task opportunity. In a multi-satellite multi-station system, the total number of satellites, the maximum number of executable task opportunities of a single satellite, the maximum number of data transmission windows of a single satellite, and the maximum number of orbits around the Earth of a single satellite within its planning period together characterize the size of the environmental state feature space.
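As an illustration of how the four size parameters could be derived from the two lists, the following is a minimal Python sketch. The function name `feature_space_size`, the tuple layouts, and the choice to take the largest orbit number as the orbit count are assumptions made for this example; the application does not specify them.

```python
from collections import Counter

def feature_space_size(opportunities, windows):
    """Return (m, n, l, s) for the environmental state feature space.

    opportunities: list of (satellite_id, orbit_number) tuples, one per task
                   execution opportunity (hypothetical layout, orbits 1-based).
    windows:       list of satellite_id values, one per data transmission
                   window (hypothetical layout).
    """
    opp_count = Counter(sat for sat, _orbit in opportunities)
    win_count = Counter(windows)
    m = len(set(opp_count) | set(win_count))        # total number of satellites
    n = max(opp_count.values(), default=0)          # max opportunities per satellite
    l = max(win_count.values(), default=0)          # max windows per satellite
    # Assumption: orbits are numbered from 1 inside the planning period,
    # so the largest orbit number equals the number of orbits.
    s = max((orbit for _sat, orbit in opportunities), default=0)
    return m, n, l, s
```

With these four numbers every state tensor can be allocated up front, and satellites with fewer opportunities or windows are simply padded.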
Within the boundaries of the simulation environment space, the key information required in the task planning process is abstracted and extracted to generate the environmental state of the multi-satellite multi-station system. In the embodiment of the application, the environmental state is characterized by the environmental state constants and environmental state variables required in the satellite task planning process.
Environmental state constants are the quantities of the multi-satellite multi-station system that do not change as task planning proceeds. They include quantities related to satellite orbits, ground stations and tasks, such as the time windows and attitudes in which a satellite can communicate with a ground station, the time windows and attitudes in which a satellite can execute a task, and the inherent attributes of the tasks. The parts of the system state that change continuously during task planning, such as task availability, on-board storage state, on-board energy state, remaining task execution opportunities and remaining data transmission window capacity, are extracted as environmental state variables.
The embodiment of the application abstracts and extracts a small number of state parameters to describe a complex multi-satellite multi-station task planning system (for example, one with any number of satellites, stations and tasks). As shown in FIG. 2, the environmental state constants comprise: task identification information, task storage information, window task mapping relation, task conflict information, task value information and task order information. The environmental state variables comprise: task availability information, window storage information, satellite storage information, window availability information, satellite energy margin information and task remaining execution count information. The advantage of this processing is that the environmental state of the system can be fully expressed with a small number of state parameters, which reduces the complexity of the simulation environment and therefore the computational complexity of the algorithm.
It should be noted that the simulation environment changes dynamically as task planning proceeds. Within the task planning period, the decision system of the agent selects the task to be executed according to the task execution opportunity list and selects the corresponding data download window according to the data transmission window list. The environmental state is then updated according to the changes caused by the current round of selection. The method thus constructs an effective environmental state for planning algorithms in the reinforcement learning paradigm and facilitates subsequent reinforcement learning training.
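The update performed after one round of selection can be sketched as follows. This is only a minimal illustration under stated assumptions: the function name `apply_action`, the dictionary layout, the 0-based indices, and the concrete update rules (each task needs a single execution, and the task's data stays on board until the chosen window) are hypothetical and are not taken from the application. The tensor names follow those used in this description.

```python
def apply_action(const, state, i, j, w):
    """Update `state` in place after satellite i executes task opportunity j
    and downloads the data in data transmission window w (0-based indices).

    const holds the constant tensors T_id, T_s, T_e, T_w, T_conflict;
    state holds the variable tensors T_available, W_s, S_s, S_e, T_remain.
    All tensors are nested lists here.
    """
    size = const["T_s"][i][j]      # data volume generated by the task
    orbit = const["T_e"][i][j]     # 1-based orbit number of the opportunity
    task_id = const["T_id"][i][j]

    # Opportunities that conflict in time with j (including j itself)
    # can no longer be selected.
    for k, conflicts in enumerate(const["T_conflict"][i][j]):
        if conflicts:
            state["T_available"][i][k] = 0

    # The chosen window loses downlink capacity.
    state["W_s"][i][w] -= size

    # Assumption: the data occupies on-board storage on entry to every
    # window that follows the task, up to and including window w.
    for q in range(w + 1):
        if const["T_w"][i][q] >= j + 1:
            state["S_s"][i][q] -= size

    # One task execution is consumed from the energy budget of that orbit.
    state["S_e"][i][orbit - 1] -= 1

    # Assumption: one execution completes the task, so every opportunity of
    # the same task on any satellite gets remaining count 0 and is disabled.
    for a, row in enumerate(const["T_id"]):
        for b, tid in enumerate(row):
            if tid == task_id:
                state["T_remain"][a][b] = 0
                state["T_available"][a][b] = 0
```

A reinforcement learning environment would call such a function once per agent action and then return the updated variable tensors as the next observation.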
According to an embodiment of the application, the environmental state constants and the environmental state variables are optionally represented in the form of tensors.
The reasons for this are as follows:
first, tensor computation is more efficient in a program;
second, the tensor form reduces the engineering cost of storing and computing the environmental state;
third, expressing the extracted key information as tensors further reduces the complexity of the simulation environment of the multi-satellite multi-station system, because the state of one attribute of all objects of the same category in the system (such as satellites, data transmission windows or orbits) can be expressed by a single tensor;
fourth, the tensor form facilitates interaction between a single-agent AI model and the environmental state.
The following describes how each item of key information in the task planning process is expressed in the form of a tensor in this embodiment:
1) Each environmental state constant is expressed in the form of a tensor.
The task identification information is represented by a second-order tensor of m rows and n columns, denoted the task identification tensor T_id(m, n). Each row of the tensor corresponds to one satellite, so the m rows correspond to m satellites, and n is the maximum number of task execution opportunities of a single satellite. The value of element T_id[i, j] is the identifier of the task corresponding to the j-th task opportunity of the i-th satellite; for example, T_id[2, 1] is the element in row 2, column 1 of the tensor and holds the identifier of the task corresponding to the 1st task opportunity of the 2nd satellite.
The task storage information is represented by a second-order tensor of m rows and n columns, denoted the task storage tensor T_s(m, n). The value of element T_s[i, j] is the size of the data generated when the i-th satellite executes its j-th task opportunity; for example, T_s[3, 4] = 500 indicates that the 3rd satellite generates 500M of on-board data when it executes its 4th task opportunity.
The task orbit number information is represented by a second-order tensor of m rows and n columns, denoted the task orbit number tensor T_e(m, n). The value of element T_e[i, j] is the orbit number the i-th satellite is in when it executes its j-th task opportunity; for example, T_e[3, 4] = 2 indicates that the 3rd satellite is in its 2nd orbit when it executes its 4th task opportunity.
The window task mapping relation is represented by a second-order tensor of m rows and l columns, denoted the window task mapping tensor T_w(m, l), where l is the maximum number of data transmission windows of a single satellite in the planning period. The value of element T_w[i, j] is the last task opportunity that the i-th satellite can complete before entering its j-th data transmission window; for example, T_w[1, 1] = 3 indicates that the last task opportunity the 1st satellite can complete before entering its 1st data transmission window is its 3rd task execution opportunity, and T_w[1, 2] = 8 indicates that the last task opportunity the 1st satellite can complete before entering its 2nd data transmission window is its 8th.
The task conflict information is represented by a third-order tensor of dimensions m×n×n, denoted the task conflict tensor T_conflict(m, n, n), where m is the number of satellites and the time conflict relations among the executable tasks of each satellite are described by a second-order n×n tensor. If the value of element T_conflict[i, j, k] is 1, there is a time conflict between the j-th task and the k-th task of the i-th satellite; for example, T_conflict[3, 4, 5] = 1 indicates a time conflict between the 4th and 5th tasks of the 3rd satellite. Every task necessarily conflicts with itself, for example T_conflict[3, 4, 4] = 1.
The task value information is represented by a second-order tensor of m rows and n columns, denoted the task value tensor T_v(m, n). The value of element T_v[i, j] is the weight or value of the task corresponding to the j-th task opportunity of the i-th satellite.
The task order information is represented by a second-order tensor of m rows and n columns, denoted the task mapping tensor T_map(m, n). The value of element T_map[i, j] is the position, in the task execution opportunity list, of the j-th task opportunity of the i-th satellite. The task mapping tensor is used in the task planning process and in producing the final planning result.
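To make two of the constant tensors concrete, the sketch below builds a task conflict tensor and a window task mapping tensor from time intervals, using nested Python lists in place of a tensor library. The function names, the interval layout, and the rule that two opportunities conflict exactly when their time windows overlap are assumptions for illustration; the application only states that conflicts arise from time constraints. Python indices are 0-based, whereas the values stored in T_w are 1-based opportunity numbers as in the description (0 is used here as padding for "none").

```python
def conflict_tensor(opps, n):
    """Build T_conflict (m x n x n).

    opps[i] is the list of (start, end) time windows of the task
    opportunities of satellite i; n is the padded number of columns.
    Assumption: two opportunities conflict iff their intervals overlap.
    """
    m = len(opps)
    t_conflict = [[[0] * n for _ in range(n)] for _ in range(m)]
    for i, sat_opps in enumerate(opps):
        for j, (s1, e1) in enumerate(sat_opps):
            for k, (s2, e2) in enumerate(sat_opps):
                if s1 < e2 and s2 < e1:   # also true for j == k
                    t_conflict[i][j][k] = 1
    return t_conflict


def window_task_mapping(opps, window_starts, l):
    """Build T_w (m x l): for each window, the 1-based number of the last
    task opportunity that ends before the window starts (0 if none)."""
    m = len(opps)
    t_w = [[0] * l for _ in range(m)]
    for i in range(m):
        for j, w_start in enumerate(window_starts[i]):
            done = [k + 1 for k, (_s, end) in enumerate(opps[i]) if end <= w_start]
            t_w[i][j] = max(done, default=0)
    return t_w
```

Because both tensors depend only on orbital geometry and task definitions, they can be computed once before planning starts and reused unchanged in every step.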
2) Each environmental state variable is expressed in the form of a tensor.
The task availability information is represented by a second-order tensor of m rows and n columns, denoted the task availability tensor T_available(m, n). Each row of the tensor corresponds to one satellite, so the m rows correspond to m satellites, and n is the maximum number of task execution opportunities of a single satellite. T_available[i, j] = 1 indicates that the j-th task of the i-th satellite can be executed, and a value of 0 indicates that it cannot.
The window storage information is represented by a second-order tensor of m rows and l columns, denoted the data transmission window communication margin tensor W_s(m, l), where l is the maximum number of data transmission windows of a single satellite in the planning period. The value of element W_s[i, j] is the remaining amount of data that can be downloaded in the j-th data transmission window of the i-th satellite.
The satellite storage information is represented by a second-order tensor of m rows and l columns, denoted the on-board storage margin tensor S_s(m, l). The value of element S_s[i, j] is the remaining storage space of the i-th satellite after it leaves its (j-1)-th data transmission window and before it enters its j-th data transmission window.
The window availability information is represented by a third-order tensor of dimensions m×l×n, denoted the data transmission window availability tensor W_available(m, l, n). W_available[i, j, k] = 1 indicates that the j-th data transmission window of the i-th satellite can completely download the data generated by the k-th task.
The satellite energy margin information is represented by a second-order tensor of m rows and s columns, denoted the satellite energy margin tensor S_e(m, s), where s is the number of orbits. The value of element S_e[i, j] is the number of tasks the i-th satellite can execute in its j-th orbit; for example, S_e[3, 4] = 2 indicates that the 3rd satellite can execute 2 tasks in its 4th orbit.
The task remaining execution count information is represented by a second-order tensor of m rows and n columns, denoted the remaining total execution count tensor T_remain(m, n). The value of element T_remain[i, j] is the number of times the task corresponding to the j-th task opportunity of the i-th satellite can still be executed in the remaining planning time; if the task has already been executed, the value is set to 0.
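One way the window availability tensor might be derived from the other tensors is sketched below. The function name `window_available` and the two conditions used (the task must finish before the window, and its data must fit in the window's remaining capacity) are assumptions for this example, since the application does not give the rule explicitly. Entries of T_s equal to 0 are treated as padding.

```python
def window_available(t_s, w_s, t_w):
    """Build W_available (m x l x n) from nested-list tensors.

    t_s: task storage tensor T_s (m x n), data size per task opportunity
    w_s: window storage tensor W_s (m x l), remaining downlink capacity
    t_w: window task mapping tensor T_w (m x l), 1-based number of the last
         opportunity that can complete before each window (0 = none)
    """
    m, n, l = len(t_s), len(t_s[0]), len(w_s[0])
    w_available = [[[0] * n for _ in range(l)] for _ in range(m)]
    for i in range(m):
        for j in range(l):
            for k in range(n):
                finishes_before_window = (k + 1) <= t_w[i][j]
                fits_in_window = 0 < t_s[i][k] <= w_s[i][j]
                if finishes_before_window and fits_in_window:
                    w_available[i][j][k] = 1
    return w_available
```

Since W_s changes after every download, a tensor of this kind would have to be recomputed (or updated incrementally) after each planning step.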
In implementation, as the agent continually interacts with the environmental state, it can directly read from the changes in each tensor the current execution state of each task, the operating state of each satellite and the usage state of each data transmission window. The parameters are also easy to adjust, so that different actions can be generated for tasks with different priorities.
Optionally, according to an embodiment of the present application, the satellite task resource information includes: satellite resource information, ground station resource information, task demand information, and task planning period information. The satellite task resource information is the raw data of a satellite task. The satellite resource information includes, for example, the two-line element set (TLE), which describes the position, velocity, and other orbital information of an artificial satellite in Earth orbit, as well as the satellite's attitude capability, data transmission capability, payload capability, storage capability, and energy. The ground station resource information includes, for example, the station location, data transmission bandwidth, data transmission area, and available time. The task demand information includes, for example, the task location, payload requirement, and completion deadline.
The data transmission window list can be calculated according to the satellite resource information, the ground station resource information, and the task planning period information. The data transmission window list includes the start time, end time, data transmission bandwidth, and window storage information of each satellite-to-ground-station data transmission window. The task execution opportunity list is calculated according to the satellite resource information, the task demand information, and the task planning period information. The task execution opportunity list includes, for each time window in which a satellite can execute a task, the start time, end time, corresponding satellite orbit number, satellite attitude, task weight, task value, and generated data volume.
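The two lists described above can be pictured as collections of simple records. The sketch below is illustrative only: the class names, field names, units, and the rule that a window's capacity equals its duration multiplied by its bandwidth are assumptions for this example, not definitions taken from the application.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataTransmissionWindow:
    satellite: int    # index of the satellite
    station: int      # index of the ground station
    start: float      # window start time, in seconds (assumed unit)
    end: float        # window end time, in seconds (assumed unit)
    bandwidth: float  # data units per second (assumed unit)

    @property
    def capacity(self) -> float:
        """Downloadable data amount, assumed to be duration times bandwidth."""
        return (self.end - self.start) * self.bandwidth


@dataclass(frozen=True)
class TaskOpportunity:
    satellite: int
    task_id: int
    start: float
    end: float
    orbit: int          # orbit number in which the opportunity occurs
    value: float        # task weight or value
    data_volume: float  # data generated if the task is executed


def group_by_satellite(records):
    """Group records per satellite, each group sorted by start time."""
    grouped = {}
    for record in sorted(records, key=lambda r: r.start):
        grouped.setdefault(record.satellite, []).append(record)
    return grouped


windows = [
    DataTransmissionWindow(satellite=0, station=1, start=500, end=600, bandwidth=2.0),
    DataTransmissionWindow(satellite=0, station=2, start=100, end=200, bandwidth=1.5),
    DataTransmissionWindow(satellite=1, station=1, start=50, end=80, bandwidth=1.0),
]
windows_by_satellite = group_by_satellite(windows)
```

Grouping by satellite and sorting by start time gives each satellite an ordered row, which is the form the second-order tensors indexed by satellite and window (or task opportunity) expect.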
In an embodiment of the application, calculating the task plannable period is the first step in computing the spatial boundary of the environment. The start time of the task plannable period is the end time of the first data transmission window of each satellite, and its end time is the entry time of the last data transmission window of each satellite; all task opportunities within the task plannable period are valid task opportunities. In other words, the simulation environment state is only meaningful within the task plannable period, and including task opportunities that cannot be executed would add an unnecessary burden to the algorithm.
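The rule for the task plannable period can be sketched for a single satellite as follows. This is a hedged example: the tuple layout `(start, end)`, the function names, the choice to return `None` when fewer than two windows exist, and the requirement that an opportunity lie entirely inside the period are assumptions of this sketch.

```python
def plannable_period(windows):
    """Return (period_start, period_end) for one satellite.

    windows: list of (start, end) data transmission windows.
    The period starts when the first window ends and ends when the
    last window begins. Returns None if fewer than two windows exist,
    since the period would then be empty (an assumption of this sketch).
    """
    ordered = sorted(windows)
    if len(ordered) < 2:
        return None
    first_window_end = ordered[0][1]
    last_window_start = ordered[-1][0]
    return first_window_end, last_window_start


def valid_opportunities(opportunities, period):
    """Keep only the (start, end) opportunities lying fully inside the period."""
    begin, end = period
    return [o for o in opportunities if o[0] >= begin and o[1] <= end]


sat_windows = [(100, 200), (900, 1000), (500, 600)]
period = plannable_period(sat_windows)
kept = valid_opportunities([(150, 250), (300, 400), (850, 950)], period)
```

Here the opportunity starting at 150 begins before the first window has ended and the one ending at 950 runs into the last window, so only the middle opportunity is retained.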
Optionally, according to an embodiment of the present application, updating the environment state includes: updating the data transmission window availability tensor, the satellite energy allowance tensor, and the task availability tensor in the environment state variables according to the task identification tensor, the task storage tensor, and the task conflict tensor in the environment state constants.
After the decision system of the agent selects an execution action, the simulation environment state changes accordingly. Based on this change, the data transmission window availability tensor, the satellite energy allowance tensor, and the task availability tensor in the environment state variables are updated step by step according to the task identification tensor, the task storage tensor, and the task conflict tensor in the environment state constants.
This describes the interaction mechanism between the environment state and the agent. The task availability tensor T_available(m, n) corresponds to the executable actions of the agent's decision system. When the agent selects a specific action, the environment state changes accordingly; this interaction is the process by which an agent action causes an environment state transition.
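One possible form of such a state transition is sketched below. The application names the tensors involved but not the exact update rules, so everything in this example is an assumption: a task needs to be executed only once, the orbit of an opportunity is looked up in the task turn number tensor T_e with 0-based orbit indices, and the data is assigned to the first window that can still take it. The function name `apply_action` and the dictionary layout are likewise hypothetical.

```python
def apply_action(state, const, i, j):
    """Execute task opportunity j on satellite i and update the state in place."""
    if state["T_available"][i][j] != 1:
        raise ValueError("selected task opportunity is not available")

    # Assumption: a task is executed once, so every opportunity with the
    # same task identification, on any satellite, becomes unavailable.
    task = const["T_id"][i][j]
    for a, row in enumerate(const["T_id"]):
        for b, task_id in enumerate(row):
            if task_id == task:
                state["T_available"][a][b] = 0

    # Opportunities on the same satellite that conflict in time are closed.
    for k, flag in enumerate(const["T_conflict"][i][j]):
        if flag == 1:
            state["T_available"][i][k] = 0

    # One unit of the task budget of the corresponding orbit is consumed;
    # when the budget reaches zero, the orbit's other opportunities close.
    orbit = const["T_e"][i][j]
    state["S_e"][i][orbit] -= 1
    if state["S_e"][i][orbit] == 0:
        for k, other_orbit in enumerate(const["T_e"][i]):
            if other_orbit == orbit:
                state["T_available"][i][k] = 0

    # Assumption: first-fit window choice. The chosen window loses capacity
    # and its availability row is recomputed from the task storage tensor.
    for w, row in enumerate(state["W_available"][i]):
        if row[j] == 1:
            state["W_s"][i][w] -= const["T_s"][i][j]
            for k in range(len(row)):
                row[k] = 1 if const["T_s"][i][k] <= state["W_s"][i][w] else 0
            break
    return state


const = {
    "T_id": [[10, 11, 12], [11, 13, 10]],
    "T_s": [[4, 3, 5], [3, 2, 4]],
    "T_e": [[0, 0, 1], [0, 1, 1]],
    "T_conflict": [
        [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
        [[0, 0, 0], [0, 0, 0], [0, 0, 0]],
    ],
}
state = {
    "T_available": [[1, 1, 1], [1, 1, 1]],
    "S_e": [[2, 1], [1, 1]],
    "W_s": [[8], [6]],
    "W_available": [[[1, 1, 1]], [[1, 1, 1]]],
}
apply_action(state, const, i=0, j=0)
```

In this toy case, executing the first opportunity of the first satellite closes the same task on the second satellite, closes the conflicting second opportunity, and leaves the window with too little capacity for the third task's data.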
Correspondingly, the present application also provides a greedy strategy-based dynamic planning device for multi-satellite remote sensing tasks. As shown in Fig. 3, the device 100 includes:
a first calculation module 110, configured to calculate, according to the satellite task resource information, a data transmission window list of each satellite to the ground stations in the multi-satellite multi-station system and a task execution opportunity list of each satellite for each task target;
a second calculation module 120, configured to calculate, according to the data transmission window list and the task execution opportunity list, the task plannable period of each satellite and the environment state feature space of the multi-satellite multi-station system, where the environment state feature space includes the following parameters: the total number of satellites in the system, the maximum number of task execution opportunities of a single satellite, the maximum number of data transmission windows of a single satellite, and the maximum number of orbits of a single satellite during the planning period;
a generating module 130, configured to generate, within the task plannable period of each satellite, the environment state of the multi-satellite multi-station system according to the parameters in the environment state feature space, where the environment state is represented by the environment state constants and environment state variables required in the satellite task planning process; and
an updating module 140, configured to, within the task plannable period, select a task to be executed according to the task execution opportunity list, select a corresponding data download window according to the data transmission window list, and update the environment state.
Wherein the environmental state constants include at least:
task identification information used for representing the unique identification of the task corresponding to the task opportunity;
task storage information for representing the size of data generated when the satellite performs tasks;
the task turn number information is used for indicating the number of track turns of the satellite when each task execution opportunity occurs;
the window task mapping relation is used for representing the corresponding relation between the data transmission window and the task in time;
task conflict information for representing conflicts between tasks due to time constraints;
task value information for representing the weight or value of the corresponding task; and
task order information for indicating the order of task opportunities in the task execution opportunity list;
the environmental state variables include at least:
task availability information used for indicating whether the tasks of the multi-star multi-station system are available in the task planning process;
window storage information for representing the size of the remaining downloadable data amount of the data transmission window;
the satellite storage information is used for representing the residual storage space on the satellite before the satellite enters the next data transmission window;
window availability information for indicating whether the data transmission window can download the task;
satellite energy allowance information for indicating the number of tasks that can be performed by a single satellite; and
task remaining execution times information for indicating the number of times the task corresponding to a task opportunity can still be executed in the remaining planning time.
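The four parameters of the environment state feature space handled by the second calculation module fix the shapes of all the constants and variables listed above. A minimal sketch of deriving them is given below; the function names, the dictionary inputs keyed by satellite, and the zero padding of shorter rows are assumptions made for illustration.

```python
def feature_space(windows, opportunities, orbit_counts):
    """Return (m, n, l, s) for a multi-satellite multi-station system.

    windows:        dict satellite -> list of data transmission windows
    opportunities:  dict satellite -> list of task execution opportunities
    orbit_counts:   dict satellite -> number of orbits in the planning period
    """
    satellites = set(windows) | set(opportunities) | set(orbit_counts)
    m = len(satellites)                                              # total satellites
    n = max((len(v) for v in opportunities.values()), default=0)    # max opportunities
    l = max((len(v) for v in windows.values()), default=0)          # max windows
    s = max(orbit_counts.values(), default=0)                        # max orbits
    return m, n, l, s


def pad(rows, width, fill=0):
    """Pad every row to `width` so that all satellites share one tensor shape."""
    return [list(row) + [fill] * (width - len(row)) for row in rows]


m, n, l, s = feature_space(
    windows={"A": ["w1", "w2"], "B": ["w1", "w2", "w3"]},
    opportunities={"A": ["o1", "o2", "o3", "o4", "o5"], "B": ["o1", "o2"]},
    orbit_counts={"A": 14, "B": 15},
)
padded = pad([[1, 2], [3]], width=3)
```

Taking the maximum over satellites and padding the shorter rows is what lets a system with any number of satellites, stations, and tasks be described by tensors of one fixed shape.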
Based on at least one of the above embodiments, there are the following advantages:
1. Through careful design of the environment state constants and environment state variables, a small amount of key information is used to represent the environment state. This can describe a complex multi-satellite multi-station task planning system with any number of satellites, stations, and tasks, so the algorithm is highly general.
2. Each state parameter is expressed as a tensor, which further reduces the complexity of the task planning system and effectively improves the computational efficiency of the algorithm.
3. An effective simulation environment state is constructed for planning algorithms of the reinforcement learning paradigm.
4. The difficulty of training an artificial intelligence model with reinforcement learning and similar methods is reduced, and a model trained in this environment state space is more robust, which improves its generality across different system states.
The electronic device in the embodiment of the application may be a user terminal device, a server, a cloud server, or another computing device. Fig. 4 shows a schematic diagram of the hardware structure of an electronic device according to an embodiment of the present application. The electronic device may include a processor 601 and a memory 602 storing computer program instructions, where the processor 601 implements the flow or functions of any of the methods of the embodiments described above when executing the computer program instructions.
In particular, the processor 601 may include a central processing unit (CPU) or an application-specific integrated circuit (ASIC), or may be configured as one or more integrated circuits that implement embodiments of the present application. The memory 602 may include mass storage for data or instructions. For example, the memory 602 may be at least one of: a hard disk drive (HDD), read-only memory (ROM), random-access memory (RAM), a floppy disk drive, flash memory, an optical disk, a magneto-optical disk, magnetic tape, a Universal Serial Bus (USB) drive, or another physical/tangible memory storage device. As another example, the memory 602 may include removable or non-removable (or fixed) media. As another example, the memory 602 may be internal or external to the electronic device. The memory 602 may be a non-volatile solid-state memory. In other words, the memory 602 generally includes a tangible (non-transitory) computer-readable storage medium (e.g., a memory device) encoded with computer-executable instructions that, when executed (e.g., by one or more processors), perform the operations described by the methods of embodiments of the application. The processor 601 implements the flow or functions of any of the methods of the above embodiments by reading and executing the computer program instructions stored in the memory 602.
In one example, the electronic device shown in Fig. 4 may also include a communication interface 603 and a bus 610. The processor 601, the memory 602, and the communication interface 603 are connected to one another through the bus 610 and communicate with one another over it. The communication interface 603 is mainly used to implement communication between modules, apparatuses, units, and/or devices in the embodiments of the present application. The bus 610 includes hardware, software, or both, and couples the components of the electronic device to each other. For example, the bus 610 may include at least one of: an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a front-side bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association local bus (VLB), or another suitable bus. The bus 610 may include one or more buses. Although embodiments of the application describe or illustrate a particular bus, embodiments of the application contemplate any suitable bus or interconnection.
In connection with the methods of the above embodiments, embodiments of the present application also provide a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, implement the flow or function of any of the methods of the above embodiments.
The foregoing exemplary descriptions of the flowcharts and/or block diagrams of methods, apparatuses and systems according to embodiments of the present application describe various aspects related thereto. It will be understood that each block of the flowchart illustrations and/or block diagrams, or combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions, special purpose hardware which perform the specified functions or acts, and combinations of special purpose hardware and computer instructions. For example, these computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the present application, enable the implementation of the functions/acts specified in the flowchart and/or block diagram block or blocks. Such a processor may be a general purpose processor, a special purpose processor, an application specific processor, or a field programmable logic circuit.
Functional blocks shown in the block diagrams of the embodiments of the present application can be implemented in hardware, software, firmware, or a combination thereof. When implemented in hardware, a functional block may be, for example, an electronic circuit, an application-specific integrated circuit (ASIC), suitable firmware, a plug-in, or a function card. When implemented in software, the functional blocks are the programs or code segments used to perform the required tasks. The programs or code segments can be stored in a memory or transmitted over transmission media or communication links by data signals carried in carrier waves. The code segments may be downloaded via computer networks such as the Internet or an intranet.
It should be noted that the present application is not limited to the specific configurations and processes described above or shown in the drawings. The foregoing is merely specific embodiments of the present application, and it will be clearly understood by those skilled in the art that, for convenience and brevity of description, specific working processes of the described system, apparatus, module or unit may refer to corresponding processes in the method embodiments, and need not be repeated. It should be understood that the scope of the present application is not limited thereto, and any person skilled in the art may conceive various equivalent modifications or substitutions within the technical scope of the present application, which are intended to be included in the scope of the present application.

Claims (12)

1. A simulation environment construction method of a large-scale remote sensing task system is characterized by comprising the following steps:
calculating, according to satellite task resource information, a data transmission window list of each satellite to the ground stations in a multi-satellite multi-station system, and a task execution opportunity list of each satellite for each task target;
calculating, according to the data transmission window list and the task execution opportunity list, a task plannable period of each satellite and an environmental state feature space of the multi-satellite multi-station system, wherein the environmental state feature space comprises the following parameters: the total number of satellites in the system, the maximum number of task execution opportunities of a single satellite, the maximum number of data transmission windows of a single satellite, and the maximum number of orbits of a single satellite during the planning period;
generating, within the task plannable period of each satellite, an environmental state of the multi-satellite multi-station system according to the parameters in the environmental state feature space, wherein the environmental state is represented by environmental state constants and environmental state variables required in the satellite task planning process;
within the task plannable period, selecting, by a decision system, a task to be executed according to the task execution opportunity list, selecting a corresponding data download window according to the data transmission window list, and updating the environmental state; wherein,
The environmental state constants include:
task identification information used for representing the unique identification of the task corresponding to the task opportunity;
task storage information for representing the size of data generated when the satellite performs tasks;
the task turn number information is used for indicating the number of track turns of the satellite when each task execution opportunity occurs;
the window task mapping relation is used for representing the corresponding relation between the data transmission window and the task in time;
task conflict information for representing conflicts between tasks due to time constraints;
task value information for representing the weight or value of the corresponding task; and
task order information for indicating the order of task opportunities in a task execution opportunity list;
the environmental state variables include:
task availability information used for indicating whether the tasks of the multi-star multi-station system are available in the task planning process;
window storage information for representing the size of the remaining downloadable data amount of the data transmission window;
the satellite storage information is used for representing the residual storage space on the satellite before the satellite enters the next data transmission window;
window availability information for indicating whether the data transmission window can download the task;
satellite energy allowance information for indicating the number of tasks that can be performed by a single satellite; and
task remaining execution times information for indicating the number of times the task corresponding to a task opportunity can still be executed in the remaining planning time.
2. The method of claim 1, wherein the environmental state constants and the environmental state variables are represented as tensors.
3. The method of claim 1, characterized in that:
the task identification information is represented by a second-order tensor of m rows and n columns, denoted the task identification tensor T_id(m, n), wherein each row of the tensor corresponds to one satellite, the m rows correspond to m satellites, n corresponds to the maximum number of task execution opportunities of a single satellite, and the value of element T_id[i, j] is the identification of the task that can be executed by the i-th satellite;
the task storage information is represented by a second-order tensor of m rows and n columns, denoted the task storage tensor T_s(m, n), and the value of element T_s[i, j] is the size of the data generated when the i-th satellite executes the j-th task;
the task turn number information is represented by a second-order tensor of m rows and n columns, denoted the task turn number tensor T_e(m, n), and the value of element T_e[i, j] is the orbit number in which the j-th task opportunity of the i-th satellite occurs;
the window task mapping relation is represented by a second-order tensor of m rows and l columns, denoted the window task mapping tensor T_w(m, l), wherein l corresponds to the maximum number of data transmission windows of a single satellite in the planning period, and the value of element T_w[i, j] is the last task opportunity that the i-th satellite can complete before entering the j-th data transmission window;
the task conflict information is represented by a third-order tensor of size m × n × n, denoted the task conflict tensor T_conflict(m, n, n), wherein m corresponds to the number of satellites, the time conflict relation between the executable tasks of each satellite is described by a second-order tensor of size n × n, and a value of 1 for element T_conflict[i, j, k] indicates that a time conflict exists between the j-th task and the k-th task of the i-th satellite;
the task value information is represented by a second-order tensor of m rows and n columns, denoted the task value tensor T_v(m, n), and the value of element T_v[i, j] is the weight or value of the corresponding task when the i-th satellite executes the j-th task opportunity;
the task order information is represented by a second-order tensor of m rows and n columns, denoted the task mapping tensor T_map(m, n), and the value of element T_map[i, j] is the position, in the task execution opportunity list, of the j-th task opportunity of the i-th satellite.
4. The method of claim 1, characterized in that:
the task availability information is represented by a second-order tensor of m rows and n columns, denoted the task availability tensor T_available(m, n), wherein each row of the tensor corresponds to one satellite, the m rows correspond to m satellites, n corresponds to the maximum number of task execution opportunities of a single satellite, and element T_available[i, j] = 1 indicates that the j-th task of the i-th satellite can be executed;
the window storage information is represented by a second-order tensor of m rows and l columns, denoted the data transmission window communication allowance tensor W_s(m, l), wherein l corresponds to the maximum number of data transmission windows of a single satellite in the planning period, and the value of element W_s[i, j] is the remaining downloadable data amount of the j-th data transmission window of the i-th satellite;
the satellite storage information is represented by a second-order tensor of m rows and l columns, denoted the on-satellite storage allowance tensor S_s(m, l), and the value of element S_s[i, j] is the remaining storage space on the i-th satellite after it leaves the (j-1)-th data transmission window and before it enters the j-th data transmission window;
the window availability information is represented by a third-order tensor of size m × l × n, denoted the data transmission window availability tensor W_available(m, l, n), and element W_available[i, j, k] = 1 indicates that the j-th data transmission window of the i-th satellite can completely download the data generated by the k-th task;
the satellite energy allowance information is represented by a second-order tensor of m rows and S columns, denoted the satellite energy allowance tensor S_e(m, S), wherein S corresponds to the number of orbits, and the value of element S_e[i, j] is the number of tasks that the i-th satellite can execute in its j-th orbit;
the task remaining execution times information is represented by a second-order tensor of m rows and n columns, denoted the task remaining total execution times tensor T_remain(m, n), and the value of element T_remain[i, j] is the number of times the task corresponding to the j-th task opportunity of the i-th satellite can still be executed in the remaining planning time.
5. The method of claim 1, wherein the satellite mission resource information comprises: satellite resource information, ground station resource information, mission requirement information, and mission planning period information.
6. The method of claim 5, wherein calculating, according to the satellite task resource information, the data transmission window list of each satellite to the ground stations in the multi-satellite multi-station system and the task execution opportunity list of each satellite for each task target comprises:
calculating the data transmission window list according to the satellite resource information, the ground station resource information, and the task planning period information, wherein the data transmission window list comprises: the start time, end time, data transmission bandwidth, and window storage information of each satellite-to-ground-station data transmission window.
7. The method of claim 5, wherein calculating, according to the satellite task resource information, the data transmission window list of each satellite to the ground stations in the multi-satellite multi-station system and the task execution opportunity list of each satellite for each task target comprises:
calculating the task execution opportunity list according to the satellite resource information, the task demand information, and the task planning period information, wherein the task execution opportunity list comprises: the start time, end time, corresponding satellite orbit number, satellite attitude, task weight, task value, and generated data volume of each time window in which a satellite can execute a task.
8. The method of claim 1, wherein the start time of the task plannable period is the end time of the first data transmission window of each satellite, the end time of the task plannable period is the entry time of the last data transmission window of each satellite, and all task opportunities within the task plannable period are valid task opportunities.
9. The method of claim 1, wherein updating the environmental state comprises: updating the data transmission window availability tensor, the satellite energy allowance tensor, and the task availability tensor in the environmental state variables according to the task identification tensor, the task storage tensor, and the task conflict tensor in the environmental state constants.
10. A simulation environment construction device for a large-scale remote sensing task system, characterized by comprising:
a first calculation module, configured to calculate, according to satellite task resource information, a data transmission window list of each satellite to the ground stations in a multi-satellite multi-station system and a task execution opportunity list of each satellite for each task target;
a second calculation module, configured to calculate, according to the data transmission window list and the task execution opportunity list, a task plannable period of each satellite and an environmental state feature space of the multi-satellite multi-station system, wherein the environmental state feature space comprises the following parameters: the total number of satellites in the system, the maximum number of task execution opportunities of a single satellite, the maximum number of data transmission windows of a single satellite, and the maximum number of orbits of a single satellite during the planning period;
a generating module, configured to generate, within the task plannable period of each satellite, an environmental state of the multi-satellite multi-station system according to the parameters in the environmental state feature space, wherein the environmental state is represented by environmental state constants and environmental state variables required in the satellite task planning process;
an updating module, configured such that, within the task plannable period, a decision system selects a task to be executed according to the task execution opportunity list, selects a corresponding data download window according to the data transmission window list, and updates the environmental state; wherein,
The environmental state constants include:
task identification information used for representing the unique identification of the task corresponding to the task opportunity;
task storage information for representing the size of data generated when the satellite performs tasks;
the task turn number information is used for indicating the number of track turns of the satellite when each task execution opportunity occurs;
the window task mapping relation is used for representing the corresponding relation between the data transmission window and the task in time;
task conflict information for representing conflicts between tasks due to time constraints;
task value information for representing the weight or value of the corresponding task; and
task order information for indicating the order of task opportunities in a task execution opportunity list;
the environmental state variables include:
task availability information used for indicating whether the tasks of the multi-star multi-station system are available in the task planning process;
window storage information for representing the size of the remaining downloadable data amount of the data transmission window;
the satellite storage information is used for representing the residual storage space on the satellite before the satellite enters the next data transmission window;
window availability information for indicating whether the data transmission window can download the task;
satellite energy allowance information for indicating the number of tasks that can be performed by a single satellite; and
task remaining execution times information for indicating the number of times the task corresponding to a task opportunity can still be executed in the remaining planning time.
11. An electronic device, the electronic device comprising: a processor and a memory storing computer program instructions; the electronic device, when executing the computer program instructions, implements the method of any of claims 1-9.
12. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon computer program instructions which, when executed by a processor, implement the method according to any of claims 1-9.
CN202310868770.7A 2023-07-17 2023-07-17 Simulation environment construction method and device for large-scale remote sensing task system Active CN116599575B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310868770.7A CN116599575B (en) 2023-07-17 2023-07-17 Simulation environment construction method and device for large-scale remote sensing task system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310868770.7A CN116599575B (en) 2023-07-17 2023-07-17 Simulation environment construction method and device for large-scale remote sensing task system

Publications (2)

Publication Number Publication Date
CN116599575A CN116599575A (en) 2023-08-15
CN116599575B true CN116599575B (en) 2023-10-13

Family

ID=87601246

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310868770.7A Active CN116599575B (en) 2023-07-17 2023-07-17 Simulation environment construction method and device for large-scale remote sensing task system

Country Status (1)

Country Link
CN (1) CN116599575B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111651905A (en) * 2020-07-09 2020-09-11 中国人民解放军国防科技大学 Agile satellite scheduling method considering time-dependent conversion time
CN115795775A (en) * 2022-06-16 2023-03-14 中国人民解放军国防科技大学 Point group target-oriented satellite task planning method, system and device
CN116090743A (en) * 2022-12-01 2023-05-09 数字太空(北京)科技股份公司 Satellite task allocation method and device, electronic equipment and storage medium
CN116227856A (en) * 2023-02-02 2023-06-06 北京信息科技大学 Task planning method, device and storage medium for multi-star multi-task

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
IT201700056428A1 (en) * 2017-05-24 2018-11-24 Telespazio Spa INNOVATIVE SATELLITE SCHEDULING METHOD BASED ON GENETIC ALGORITHMS AND SIMULATED ANNEALING AND RELATIVE MISSION PLANNER
CA3017007A1 (en) * 2018-09-10 2020-03-10 Telesat Canada Resource deployment optimizer for non-geostationary communications satellites

Non-Patent Citations (2)

Title
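Research on efficiency optimization of multi-satellite area observation tasks; 李曦; 祝江汉; 毛赤龙; Computer Simulation (计算机仿真) (No. 12); full text *
李曦; 祝江汉; 毛赤龙. Research on efficiency optimization of multi-satellite area observation tasks. Computer Simulation (计算机仿真). 2006, (No. 12), full text. *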

Also Published As

Publication number Publication date
CN116599575A (en) 2023-08-15

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant