CN113271338B - Intelligent preloading method for mobile augmented reality scene - Google Patents

Intelligent preloading method for mobile augmented reality scene

Info

Publication number
CN113271338B
CN113271338B (application CN202110445941.6A)
Authority
CN
China
Prior art keywords
user
content
sub
state
algorithm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110445941.6A
Other languages
Chinese (zh)
Other versions
CN113271338A (en)
Inventor
吴俊
韩雨琪
胡蝶
刘典
徐跃东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fudan University
Original Assignee
Fudan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fudan University filed Critical Fudan University
Priority to CN202110445941.6A priority Critical patent/CN113271338B/en
Publication of CN113271338A publication Critical patent/CN113271338A/en
Application granted granted Critical
Publication of CN113271338B publication Critical patent/CN113271338B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L 67/50 - Network services
    • H04L 67/55 - Push-based network services
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 - Protocols
    • H04L 67/10 - Protocols in which an application is distributed across nodes in the network
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 - Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 - Protocols
    • H04L 67/131 - Protocols for games, networked simulations or virtual reality

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention belongs to the technical field of wireless transmission, and particularly relates to an intelligent preloading algorithm for mobile augmented reality scenes. With this intelligent preloading algorithm, the edge server side learns the trajectory of the user and pushes the corresponding file to the user before the user reaches a given piece of holographic content. The push algorithm uses idle bandwidth to transmit holographic content, improving the transmission efficiency of the edge base station. When the motion trajectory of the user is not known in advance, the intelligent preloading algorithm treats the motion trajectory as a Markov decision process and adaptively learns the optimal preloading strategy. The mobile device selectively stores the received content in its own cache space to serve future requests. In particular, to solve the problem of non-convergent learning caused by sparsely deployed holographic content in the scene, a state-dependent Q learning algorithm is provided.

Description

Intelligent preloading method for mobile augmented reality scene
Technical Field
The invention belongs to the technical field of wireless transmission, and particularly relates to a cache deployment method for an Augmented Reality (AR) scene.
Background
While edge computing is generally considered useful for optimizing the AR experience of multiple users, specific solutions for such settings are still lacking. Because multi-user mobile AR applications require holographic content of high resolution and great diversity, solving the holographic content transmission problem of mobile AR devices becomes necessary. Currently, most mobile devices store all holographic content on the device when the AR application is downloaded, which on the one hand requires sufficiently large storage space on the user device and on the other hand makes it difficult to cope with real-time updates of three-dimensional AR content. To enable the continued growth of mobile AR, existing work utilizes edge servers so that the correct hologram is loaded onto the user device at the appropriate time. An edge server located near the mobile AR devices assists the AR experience: it stores and provides the user with the holographic content that the user may need within a particular area.
Disclosure of Invention
The invention aims to provide an intelligent preloading algorithm for mobile Augmented Reality (AR) scenes with high transmission efficiency and low computational complexity.
The user obtains an immersive virtual viewing experience through the mobile AR device. However, due to practical wireless transmission bandwidth limitations, when the number of user requests is large the base station cannot deliver the three-dimensional holographic content to every user on demand. As shown in Fig. 1, users follow different paths and access different holograms; before a user accesses a hologram, the edge base station may send the holographic content to the device side in advance, a step referred to as the preloading procedure.
In the cache deployment for mobile AR scenes, an intelligent preloading algorithm is adopted: the edge server side pushes the file to the user before the user reaches a given piece of holographic content. This preloading utilizes spare bandwidth to transmit holographic content, improving the transmission efficiency of the edge base stations. With this intelligent preloading strategy, in scenarios where the motion trajectory of the user is not known in advance, the motion trajectory is treated as a Markov decision process so that the optimal preloading strategy is learned adaptively.
The intelligent preloading algorithm formulates the problem as a Markov decision model; to learn in sparsely deployed AR scenes, the invention provides a state-dependent Q learning algorithm with the following specific steps:
(1) establishing a motion model of a user by sampling a motion track of the user and utilizing a Markov decision model;
(2) sampling the user motion model to obtain the current position of the user, and predicting the position of the user at the next moment according to the transition probability after continuously accumulating knowledge;
(3) discretizing the current position of the user into a plurality of sub-regions, each sub-region representing a state s_t, which is also a node of the Markov model; an edge represents the transition probability between two adjacent sub-regions, i.e., two states;
(4) when receiving the reward value, updating the Q table by using the reward value;
(5) when the reward value is not received, updating the Q value of the current state by using the state transition probabilities to the adjacent sub-regions, i.e., the Markov decision model.
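For illustration, a minimal Python sketch of steps (1)-(3) is given below: the motion model is estimated empirically from sampled trajectory steps, with a visit counter and a transition estimate named after the quantities L_{m,d,i} and P(m, d, i, j) used later in this description. The frequency-count estimator and the class interface are assumptions made for the example, since the patent gives the actual transition-probability formula only as an image.

```python
from collections import defaultdict

class MotionModel:
    """Empirical Markov motion model of one user over discretized sub-regions.

    L[(d, i)]    : number of times the user entered sub-region i moving in direction d
    C[(d, i)][j] : how many of those visits were followed by a move to sub-region j
    """

    def __init__(self):
        self.L = defaultdict(int)
        self.C = defaultdict(lambda: defaultdict(int))

    def observe(self, d, i, j):
        """Record one sampled step: moving in direction d, the user went from sub-region i to j."""
        self.L[(d, i)] += 1
        self.C[(d, i)][j] += 1

    def P(self, d, i, j):
        """Estimated transition probability to the adjacent sub-region j (0 until knowledge accumulates)."""
        visits = self.L[(d, i)]
        return self.C[(d, i)][j] / visits if visits > 0 else 0.0

    def predict_next(self, d, i):
        """Most likely sub-region at the next moment, or None if (d, i) has never been observed."""
        if self.L[(d, i)] == 0:
            return None
        return max(self.C[(d, i)], key=self.C[(d, i)].get)
```

In this sketch a state in the sense of step (3) is simply the sub-region index i; the travel direction d is carried alongside it so that the estimates line up with the P(m, d, i, j) notation of the later steps.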
The invention provides an intelligent preloading algorithm for a mobile augmented reality scene, which comprises the following specific steps:
(1) the edge server side learns the user's trajectory, treats the motion trajectory of the user as a Markov decision process, and adaptively learns the optimal preloading strategy. The whole area is divided into a plurality of sub-regions, and the state s_t of user m at time slot t is determined by the current sub-region i and the set of cacheable content. The behavior trace and direction of the user are recorded, and the transition probability P(m, d, i, j) of moving to the new state is calculated:
[transition probability formula, shown in the original as image BDA0003036903810000023]
(2) if the user receives pushed content, the user selects one item among the pushed content and all cached contents to discard; action a_t represents the holographic content discarded at each moment, and a_t = n denotes that file n is discarded;
(3) when the user equipment displays holographic content, the corresponding caching action obtains a reward; if holographic content n is displayed on the screen, the action a_t = n (discarding file n) should not be selected;
(4) updating the Q table: when the user device receives the reward r(s_t, a_t), then given the attenuation factor γ, the action-value function of the device is updated as:
[Q-value update formula, shown in the original as image BDA0003036903810000021]
ε-greedy exploration is adopted to avoid sub-optimal behavior caused by insufficient knowledge: given 0 < ε < 1, the device selects a random strategy with probability ε and the current best strategy with probability 1 - ε;
if no reward is received, the Q table is updated according to the current sub-region position i and each cached content n, using the Q values and the transition probabilities P(m, d, i, j) of the adjacent sub-regions j, where d denotes the direction of travel and L_{m,d,i} denotes the number of times that user equipment m has entered i from the current direction; the formula for updating the Q value is:
[Q-value update formula for the no-reward case, shown in the original as image BDA0003036903810000022]
here a dependency factor b, 0 < b < 1, models how strongly the Q table is updated from the Q tables of the neighboring sub-regions.
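A minimal Python sketch of the two update rules in step (4) follows. The reward branch uses the textbook Q-learning update with a learning rate alpha, which the patent does not name explicitly (only the attenuation factor γ is mentioned), and the no-reward branch blends the Q values of the adjacent sub-regions through the transition probabilities and the dependency factor b; since both formulas appear only as images in the original, the exact expressions here are illustrative assumptions.

```python
import random

# Q is assumed to be a nested dict: Q[state][action] -> float.

def select_action(Q, state, actions, eps):
    """epsilon-greedy exploration: random action with probability eps, otherwise the current best."""
    if random.random() < eps:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[state][a])

def update_with_reward(Q, s, a, r, s_next, actions, alpha, gamma):
    """Reward received: standard Q-learning update with attenuation (discount) factor gamma."""
    best_next = max(Q[s_next][a2] for a2 in actions)
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])

def update_without_reward(Q, i, a, neighbours, P, b):
    """No reward received: borrow knowledge from the adjacent sub-regions j, weighted by the
    transition probabilities P[(i, j)] and the dependency factor 0 < b < 1 (assumed blending rule)."""
    borrowed = sum(P[(i, j)] * Q[j][a] for j in neighbours)
    Q[i][a] = (1.0 - b) * Q[i][a] + b * borrowed
```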
The innovation points of the invention are as follows: in real scenes AR content is often deployed sparsely, so a traditional reinforcement learning algorithm fails to converge, and because the Q table is updated only when user m obtains a reward, the learning speed drops. The present invention therefore proposes a state-dependent Q learning algorithm: for a content n, the Q value of n can be obtained from the Q values of the adjacent sub-regions even when m has gathered no knowledge of its own. Once the available knowledge of sub-region i is sufficient, the Q value can be updated when the user enters sub-region i without borrowing from the neighboring sub-regions. The invention uses the dependency factor b to model how strongly the Q table is updated from the Q tables of the adjacent sub-regions.
Drawings
Figure 1 shows a multi-user mobile AR system. The users in the figure follow different paths and access different holograms. Before a user accesses a hologram, the edge base station may send the holographic content to the device side in advance; this process is called the preloading process.
FIG. 2 is an embodiment environment illustration.
Figure 3 is a graphical representation of the performance of the present invention.
Detailed Description
The invention applies the intelligent preloading algorithm for mobile augmented reality scenes to carry out the caching strategy, and provides a state-dependent Q learning algorithm to solve the non-convergence problem caused by sparse cache deployment. The motion model of the user is established by sampling the user's motion trajectory and using a Markov decision model; the current position of the user is obtained through sampling, and the position of the user at the next moment is predicted from the transition probabilities. At the same time, the user's current position is discretized into a plurality of sub-regions, each sub-region representing a state s_t, which is also a node of the Markov model; an edge represents the transition probability between two adjacent sub-regions, i.e., two states.
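As a small illustration of the discretization just described, a state can be represented as the pair of the sub-region index and the set of currently cached contents; the 1 × 1 cell size and 10 × 10 grid anticipate the embodiment below, and the helper name is hypothetical.

```python
def make_state(x, y, cached, cell=1.0, grid_w=10):
    """State s_t = (sub-region index, cached-content set) for a continuous position (x, y).
    Grid and cell sizes follow the embodiment; the flat row-major index is an assumption."""
    col = min(int(x // cell), grid_w - 1)
    row = min(int(y // cell), grid_w - 1)
    return (row * grid_w + col, frozenset(cached))
```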
When a reward value is received, the state-dependent Q learning algorithm updates the Q table using that reward value.
When no reward value is received, the state-dependent Q learning algorithm updates the Q value of the current state using the adjacent sub-regions, i.e., the state transition probabilities of the Markov decision model.
Embodiment:
(1) Assume a 10 × 10 mobile AR area in which each 1 × 1 cell is treated as a sub-region; at each time t the user moves to another sub-region, with 9 possible directions: up, down, left, right, upper-left, upper-right, lower-left, lower-right, and staying in place. There are 50 users and 8 AR contents in the network. As shown in Fig. 2, the black portions represent obstacles such as walls in the simulated real scene. User m determines its state s_t at time slot t; the state s_t is determined by the current sub-region i and the set of cacheable content. The numbers 1-8 in the figure represent the indices of the AR contents.
(2) The user's cache space is set to 2. If the user receives pushed content, the user selects one item among the current pushed content and all cached contents to discard. Action a_t represents the holographic content discarded at each instant rather than the content cached from time period t. Due to limited wireless bandwidth, device m will not receive holographic content at every moment; if m receives no holographic content at time t, it keeps the currently cached content and performs no discard action.
(3) When the user equipment displays holographic content, the corresponding caching action obtains a reward; the reward can be set according to parameters such as the number of holographic contents in the scene and the importance of each holographic content. If holographic content n is displayed on the screen, the action a_t = n (discarding file n) should not be selected.
(4) Updating the Q table: when the user device receives the reward r(s_t, a_t), the action-value function of the device is updated as follows:
[Q-value update formula, shown in the original as image BDA0003036903810000041]
ε-greedy exploration is adopted to avoid sub-optimal behavior caused by insufficient knowledge: given 0 < ε < 1, the device selects a random strategy with probability ε and the current best strategy with probability 1 - ε.
Otherwise, the Q table is updated as follows:
[Q-value update formula for the no-reward case, shown in the original as image BDA0003036903810000042]
The dependency factor b, 0 < b < 1, models how strongly the Q table is updated from the Q tables of the neighboring sub-regions.
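To make the embodiment concrete, the following Python sketch mocks up the environment described in steps (1)-(3) above: a 10 × 10 grid with obstacles, 8 holographic contents pinned to cells, a device cache of size 2, and a reward when the displayed content is already cached. The random content placement, the reward value of 1, and the helper names are illustrative assumptions; the actual layout is fixed in Fig. 2 and the reward is a tunable parameter.

```python
import random

# The 9 movement directions of the embodiment: the 8 neighbouring cells plus staying in place.
DIRECTIONS = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)]

class AREnvironment:
    """Toy version of the embodiment environment (assumed layout, not the one in Fig. 2)."""

    def __init__(self, size=10, n_contents=8, obstacles=()):
        self.size = size
        self.obstacles = set(obstacles)
        free = [(x, y) for x in range(size) for y in range(size) if (x, y) not in self.obstacles]
        # Pin each of the 8 contents (indices 1..8) to a random free cell.
        self.content_at = {cell: n for n, cell in enumerate(random.sample(free, n_contents), start=1)}

    def step(self, pos, direction):
        """Move one sub-region in the given direction, staying put at borders and obstacles."""
        x = min(max(pos[0] + direction[0], 0), self.size - 1)
        y = min(max(pos[1] + direction[1], 0), self.size - 1)
        return pos if (x, y) in self.obstacles else (x, y)

    def reward(self, pos, cache):
        """Reward 1 if the content displayed at this sub-region is already cached (value is an assumption)."""
        n = self.content_at.get(pos)
        return 1.0 if n is not None and n in cache else 0.0

def apply_push(cache, pushed, discard):
    """Device-side rule with cache space 2: on receiving a push, one item among the pushed
    content and the cached contents (the chosen action `discard`) is dropped; no push, no change."""
    new_cache = set(cache) | {pushed}
    new_cache.discard(discard)
    return new_cache
```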
The performance is shown in Figure 3. The preloading algorithm is compared under different numbers of iterations by evaluating the cumulative number of responded users Acc. The JPC strategy [1], the nearest-distance push (NDT) strategy, and the LRUC strategy [2] are used as benchmark algorithms; the preloading algorithm provided by the invention achieves a higher number of responded users Acc.
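For completeness, a hedged sketch of how the cumulative number of responded users Acc could be tallied, reusing the environment sketch above. The random push policy and the User class are placeholders invented for the example; the learned preloading policy or the JPC, NDT, and LRUC baselines would take their place.

```python
import random
from dataclasses import dataclass, field

@dataclass
class User:
    pos: tuple = (0, 0)
    cache: set = field(default_factory=set)

def random_push(n_contents=8, push_prob=0.3):
    """Placeholder push policy: occasionally push a random content index."""
    return random.randint(1, n_contents) if random.random() < push_prob else None

def evaluate_acc(env, users, n_iterations, cache_size=2):
    """Acc: cumulative count of requests that could be answered from the local cache."""
    acc = 0
    for _ in range(n_iterations):
        for u in users:
            u.pos = env.step(u.pos, random.choice(DIRECTIONS))
            if env.reward(u.pos, u.cache) > 0:
                acc += 1  # the displayed content was cached, so this user counts as responded
            pushed = random_push()
            if pushed is not None and pushed not in u.cache:
                u.cache.add(pushed)
                if len(u.cache) > cache_size:
                    # discard one of the previously cached items to respect the cache limit
                    u.cache.remove(random.choice(list(u.cache - {pushed})))
    return acc
```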
References:
[1] W. Chen and H. V. Poor, "Content Pushing With Request Delay Information," IEEE Transactions on Communications, 2017.
[2] D. Lee et al., "LRFU: a spectrum of policies that subsumes the least recently used and least frequently used policies," IEEE Transactions on Computers, 2001.

Claims (1)

1. An intelligent preloading method for a mobile augmented reality scene, characterized in that the edge server side pushes a file to the user before the user reaches a certain piece of holographic content; in order to learn the AR sparse deployment scene, a state-dependent Q learning algorithm is adopted, comprising the following steps:
(1) establishing a motion model of a user by sampling a motion track of the user and utilizing a Markov decision model;
(2) sampling the user motion model to obtain the current position of the user, and predicting the position of the user at the next moment according to the transition probability after continuously accumulating knowledge;
(3) discretizing the current position of the user into a plurality of sub-regions, each sub-region representing a state s_t, which is also a node of the Markov model; an edge represents the transition probability between two adjacent sub-regions, i.e., two states;
(4) when receiving the reward value, updating the Q table by using the reward value;
(5) when the reward value is not received, updating the Q value of the current state by using the adjacent sub-regions, i.e., the state transition probabilities in the Markov decision model;
the method comprises the following specific steps:
(1) the edge server side learns the user's trajectory, treats the motion trajectory of the user as a Markov decision process, and adaptively learns the optimal preloading strategy; the whole area is divided into a plurality of sub-regions, and the state s_t of user m at time slot t is determined by the current sub-region i and the cacheable content set; the behavior trace and direction of the user are recorded, and the transition probability P(m, d, i, j) of moving to the new state is calculated:
[transition probability formula, shown in the original as image FDA0003434248010000011]
(2) if the user receives pushed content, the user selects one item among the pushed content and all cached contents to discard; action a_t represents the holographic content discarded at each moment, and a_t = n denotes that file n is discarded;
(3) when the user equipment displays holographic content, the corresponding caching action obtains a reward; if holographic content n is displayed on the screen, the action a_t = n (discarding file n) should not be selected;
(4) updating the Q table: when the user device receives the reward r(s_t, a_t), then given the attenuation factor γ, the action-value function of the device is updated as:
[Q-value update formula, shown in the original as image FDA0003434248010000012]
ε-greedy exploration is adopted: given 0 < ε < 1, the device selects a random strategy with probability ε and the current optimal strategy with probability 1 - ε;
if no reward is received, updating the Q table according to the current sub-region position i and each cached content n, and updating the Q value by using the Q values and the transition probabilities P(m, d, i, j) of the adjacent sub-regions j according to the following formula:
[Q-value update formula for the no-reward case, shown in the original as image FDA0003434248010000013]
wherein d represents the direction of travel, L_{m,d,i} represents the number of times that user equipment m has entered i from the current direction, and β is a dependency factor, 0 < β < 1, modeling how strongly the Q table is updated from the Q tables of the adjacent sub-regions.
CN202110445941.6A 2021-04-25 2021-04-25 Intelligent preloading method for mobile augmented reality scene Active CN113271338B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110445941.6A CN113271338B (en) 2021-04-25 2021-04-25 Intelligent preloading method for mobile augmented reality scene

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110445941.6A CN113271338B (en) 2021-04-25 2021-04-25 Intelligent preloading method for mobile augmented reality scene

Publications (2)

Publication Number Publication Date
CN113271338A CN113271338A (en) 2021-08-17
CN113271338B true CN113271338B (en) 2022-04-12

Family

ID=77229392

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110445941.6A Active CN113271338B (en) 2021-04-25 2021-04-25 Intelligent preloading method for mobile augmented reality scene

Country Status (1)

Country Link
CN (1) CN113271338B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110569443A (en) * 2019-03-11 2019-12-13 北京航空航天大学 Self-adaptive learning path planning system based on reinforcement learning
CN110989614A (en) * 2019-12-18 2020-04-10 电子科技大学 Vehicle edge calculation transfer scheduling method based on deep reinforcement learning

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102199093B1 (en) * 2017-02-10 2021-01-06 닛산 노쓰 아메리카, 인크. Self-driving vehicle operation management, including operating a partially observable Markov decision process model instance
US11586974B2 (en) * 2018-09-14 2023-02-21 Honda Motor Co., Ltd. System and method for multi-agent reinforcement learning in a multi-agent environment
CN109587519B (en) * 2018-12-28 2021-11-23 南京邮电大学 Heterogeneous network multipath video transmission control system and method based on Q learning
CN110968816B (en) * 2019-12-23 2023-11-28 广东技术师范大学 Content caching method and device based on reinforcement learning and storage medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110569443A (en) * 2019-03-11 2019-12-13 北京航空航天大学 Self-adaptive learning path planning system based on reinforcement learning
CN110989614A (en) * 2019-12-18 2020-04-10 电子科技大学 Vehicle edge calculation transfer scheduling method based on deep reinforcement learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Samrat Nath, "Deep reinforcement learning for dynamic computation offloading and resource allocation in cache-assisted mobile edge computing systems," IEEE, 2020-11-30, full text. *
臧兆祥 et al., "Zeroth-order classifier system based on an average-reward reinforcement learning algorithm," Computer Engineering and Applications, 2016-06-17 (No. 21), full text. *

Also Published As

Publication number Publication date
CN113271338A (en) 2021-08-17

Similar Documents

Publication Publication Date Title
CN113094982B (en) Internet of vehicles edge caching method based on multi-agent deep reinforcement learning
US6834329B2 (en) Cache control method and cache apparatus
CN104796439B (en) Web page push method, client, server and system
CN110430440A (en) Video transmission method, system, computer equipment and storage medium
US11373062B1 (en) Model training method, data processing method, electronic device, and program product
CN112752308B (en) Mobile prediction wireless edge caching method based on deep reinforcement learning
CN114973673B (en) Task unloading method combining NOMA and content cache in vehicle-road cooperative system
Isaacman et al. Low-infrastructure methods to improve internet access for mobile users in emerging regions
CN117221403A (en) Content caching method based on user movement and federal caching decision
CN116321307A (en) Bidirectional cache placement method based on deep reinforcement learning in non-cellular network
CN112996058A (en) User QoE (quality of experience) optimization method based on multi-unmanned aerial vehicle network, unmanned aerial vehicle and system
EP3193490B1 (en) Method and system for distributed optimal caching of content over a network
Li et al. DQN-enabled content caching and quantum ant colony-based computation offloading in MEC
CN112911614B (en) Cooperative coding caching method based on dynamic request D2D network
CN115361710A (en) Content placement method in edge cache
CN113141634B (en) VR content caching method based on mobile edge computing network
CN113271338B (en) Intelligent preloading method for mobile augmented reality scene
Wang et al. Edge Caching with Federated Unlearning for Low-latency V2X Communications
CN115904731A (en) Edge cooperative type copy placement method
CN111901833A (en) Unreliable channel transmission-oriented joint service scheduling and content caching method
JP7174372B2 (en) Data management method, device and program in distributed storage network
CN111901394A (en) Method and system for caching moving edge by jointly considering user preference and activity degree
CN112822726B (en) Modeling and decision-making method for Fog-RAN network cache placement problem
CN117939505B (en) Edge collaborative caching method and system based on excitation mechanism in vehicle edge network
Wang et al. Multi-Agent Deep Reinforcement Learning for Cooperative Edge Caching via Hybrid Communication

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant