WO2021104132A1 - Data access method and device based on a cloud virtual machine - Google Patents

Data access method and device based on a cloud virtual machine

Info

Publication number
WO2021104132A1
Authority
WO
WIPO (PCT)
Prior art keywords
file
user
virtual machine
cloud virtual
data set
Prior art date
Application number
PCT/CN2020/129865
Other languages
English (en)
French (fr)
Inventor
邬国权
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Publication of WO2021104132A1

Classifications

    • G06F 9/452 Remote windowing, e.g. X-Window System, desktop virtualisation
    • G06F 9/45533 Hypervisors; Virtual machine monitors
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • G06F 2009/45583 Memory management, e.g. access or allocation (under G06F 9/45558 Hypervisor-specific management and integration aspects)
    • G06F 2009/45595 Network integration; Enabling network access in virtual machine instances (under G06F 9/45558 Hypervisor-specific management and integration aspects)

Definitions

  • This application relates to the field of cloud desktop technology, and in particular to a data access method and device based on a cloud virtual machine.
  • A cloud desktop service system can use virtualization technology to create one or more cloud virtual machines on each cloud server. After a user logs in to a cloud virtual machine through a user terminal, the cloud desktop of that cloud virtual machine is displayed on the user terminal. By operating the displayed cloud desktop, the user can use the functions of the cloud virtual machine that was successfully logged in to.
  • Cloud desktop service systems generally create multiple cloud virtual machines on a cloud server by cloning.
  • At present, the more common cloning method for cloud virtual machines is linked cloning.
  • The system disk (i.e., the C drive) of a cloud virtual machine created through linked cloning consists of a system master disk and a differential disk.
  • The system master disk stores the data shared by all cloud virtual machines created on the cloud server through linked cloning, and this data is not cleared when a cloud virtual machine is shut down; in other words, all cloud virtual machines created through linked cloning share one system master disk.
  • Each cloud virtual machine also has its own differential disk.
  • The differential disk stores the files generated after a user logs in to the cloud virtual machine, and these files are invisible to other cloud virtual machines. When the user shuts down the logged-in cloud virtual machine, the data in that cloud virtual machine's differential disk is cleared.
  • To preserve these files, the files generated in the differential disk after the user logs in to the cloud virtual machine are uploaded to a remote storage device, such as a network attached storage (NAS) device.
  • When the user later logs in to a cloud virtual machine again through the user terminal, the user may need to access the files generated during previous sessions, so the cloud virtual machine obtains the files generated by the user in the cloud virtual machine from the NAS disk.
  • However, downloading the files from the NAS disk after the user logs in to the cloud virtual machine takes a relatively long time, resulting in a relatively long delay when the user accesses these files, which affects the user experience.
  • the present application provides a data access method and device based on a cloud virtual machine, so as to reduce the time delay for the user to perform certain operations when using the cloud virtual machine, and improve the user experience.
  • In a first aspect, this application provides a data access method based on a cloud virtual machine.
  • The method includes: the cloud virtual machine receives a user's login request; in response to the login request, the cloud virtual machine determines at least one file that the user may access after logging in to the cloud virtual machine, and downloads the determined at least one file from a remote storage device to the local cloud virtual machine.
  • In this way, when the cloud virtual machine responds to the user's login request, on the one hand it establishes a connection with the user's user terminal, and on the other hand it determines at least one file that the user may access after logging in to the cloud virtual machine and downloads the determined file or files from the remote storage device. After the user logs in to the cloud virtual machine, there is no need to wait for the cloud virtual machine to download these files from the remote storage device, which reduces the delay for the user to access them and improves the user experience.
  • In a possible implementation, the cloud virtual machine may obtain the at least one file corresponding to the user through its own prediction. When the prediction is made by the cloud virtual machine, it needs to obtain the user's historical data record and, based on the obtained historical data record, predict at least one file that the user may access after logging in to the cloud virtual machine.
  • The historical data record records the files accessed by the user in the cloud virtual machine before the cloud virtual machine received the login request.
  • In this way, the cloud virtual machine can make the prediction by itself, which reduces interaction with other devices and saves resources.
  • When a cloud virtual machine or another device predicts at least one file that a user may access after logging in to the cloud virtual machine,
  • the prediction can be made in the following way:
  • for each file contained in the historical data record, obtain multiple parameters of the file, where each parameter is preset with a weight value; according to a preset algorithm, calculate from the multiple parameters and their preset weight values the probability that the file will be accessed after the user logs in to the cloud virtual machine; and determine the at least one target file according to the probability values.
  • In this way, the probability that each file contained in the historical data record will be accessed is calculated according to the preset algorithm, so as to determine at least one file that the user may access.
  • This prediction method is simple, does not need to distinguish between logged-in users, and is applicable to many scenarios; when deployed on a cloud virtual machine, it does not occupy too many of the cloud virtual machine's resources and has strong applicability.
  • When a cloud virtual machine or another device predicts at least one file that a user may access after logging in to the cloud virtual machine,
  • the prediction can also be made in the following way (a sketch of this loop is given after this list):
  • construct a first data set, which is the set of the first file accessed during each of the user's N consecutive logins to the cloud virtual machine; input the first data set into a first prediction model so that the first prediction model makes a prediction based on the first data set and outputs the first file that the user may access after logging in to the cloud virtual machine;
  • form a second data set from the first file output by the first prediction model and the multiple files accessed in sequence during each of the user's M consecutive logins to the cloud virtual machine recorded in the historical data record;
  • input the second data set into a second prediction model to predict the second file that the user may access after logging in to the cloud virtual machine; add the second file to the second data set to update the second data set; and return to the step of inputting the second data set into the second prediction model until the M files that the user may access after logging in to the cloud virtual machine have been predicted.
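  • As a rough illustration of the loop described above, the following Python sketch shows how a first prediction model and a second prediction model could be chained until M files have been predicted. The model objects, their call signatures, and the construction of the data sets are hypothetical placeholders and are not specified by this application.

```python
from typing import Callable, List

def predict_m_files(first_model: Callable[[List[str]], str],
                    second_model: Callable[[List[str]], str],
                    first_data_set: List[str],
                    login_sequences: List[List[str]],
                    m: int) -> List[str]:
    """Sketch of the chained prediction loop described above.

    first_data_set: the first file accessed in each of N consecutive logins.
    login_sequences: the files accessed in order during each of M consecutive logins.
    Each model is assumed to map a sequence of file identifiers to the next predicted file.
    """
    predicted: List[str] = []

    # The first prediction model outputs the first file the user may access.
    first_file = first_model(first_data_set)
    predicted.append(first_file)

    # The second data set combines the recorded access sequences with that first file.
    second_data_set = [f for seq in login_sequences for f in seq] + [first_file]

    # Feed the second data set back into the second prediction model until M files are predicted.
    while len(predicted) < m:
        next_file = second_model(second_data_set)
        predicted.append(next_file)
        second_data_set.append(next_file)  # update the second data set

    return predicted
```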
  • The first prediction model is used to predict the first file that the user may access after logging in to the cloud virtual machine.
  • The second prediction model is used to predict, based on the file output by the first prediction model (the first file that the user may access), the other files that the user may subsequently access in sequence.
  • The second prediction model is trained on training samples consisting of the multiple files accessed in sequence during each of the user's consecutive logins to the cloud virtual machine recorded in the user's historical data record; that is, the second prediction model can predict the file that may be accessed next based on the regularity of the user's consecutive file accesses.
  • The second prediction model could also be used to predict the first file that the user may access after logging in to the cloud virtual machine, but the first prediction model is trained on training samples consisting of the first file accessed each time the user logs in to the cloud virtual machine, as recorded in the historical data record.
  • Therefore, the first prediction model is more accurate for predicting the first file that a user may access after logging in to the cloud virtual machine.
  • In addition, the first prediction model and the second prediction model can handle a larger amount of data: when the user's historical data record contains many records of files accessed by the user on the cloud virtual machine, using the first prediction model and the second prediction model for prediction yields higher prediction efficiency.
  • In a possible implementation, the first prediction model determines at least one associated file of the last file in the first data set, and determines the degree of association between the last file and each associated file. The files in the first data set are arranged in order of access time from earliest to most recent, and an associated file is a file that, according to the records in the first data set, the user accessed after accessing the last file. The first prediction model outputs the associated file with the highest degree of association with the last file. The degree of association is determined by the first prediction model according to the interval between the associated file and the last file, and/or the frequency with which, in the first data set, the user accessed the associated file after accessing the last file.
  • Similarly, the second prediction model determines at least one associated file of the last file in the second data set and the degree of association between the last file and each associated file. The files in the second data set are arranged in order of access time from earliest to most recent, and an associated file is a file that, according to the records in the second data set, the user accessed after accessing a file identical to the last file. The second prediction model outputs the associated file with the highest degree of association with the last file, where the degree of association is determined according to the interval between the associated file and the last file and/or the frequency with which the user accessed the associated file after accessing a file identical to the last file in the second data set. The associated file output by the second prediction model is added to the end of the second data set, so that the output associated file becomes the last file of the updated second data set. A minimal sketch of this association-degree selection is given below.
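  • Below is a minimal sketch of the association-degree selection described above, assuming, as one possible reading, that the degree of association grows with the frequency with which an associated file follows the last file and shrinks with the interval between the two accesses. The scoring rule and the fallback behaviour are illustrative assumptions, not taken from this application.

```python
from collections import defaultdict
from typing import List

def most_associated_file(data_set: List[str]) -> str:
    """Return the associated file with the highest degree of association to the last file.

    data_set is ordered from the earliest access to the most recent. An associated file
    is a file accessed after an occurrence of the same file as the last file in the set.
    """
    last = data_set[-1]
    scores = defaultdict(float)

    for i, accessed in enumerate(data_set[:-1]):
        if accessed == last:
            # Every later access is a candidate associated file; closer accesses score higher.
            for j in range(i + 1, len(data_set)):
                interval = j - i
                scores[data_set[j]] += 1.0 / interval  # combines frequency and interval

    if not scores:
        # Fallback when nothing in the data set follows the last file.
        return data_set[-2] if len(data_set) > 1 else last
    return max(scores, key=scores.get)
```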
  • In a possible implementation, the cloud virtual machine may also predict the at least one target file to be accessed after the user logs in to the cloud virtual machine in the following manner: select at least one target file identifier from the multiple file identifiers contained in the user's historical data in descending order of the probability that each file will be accessed, and take the files corresponding to the target file identifiers as the target files to be accessed after the user logs in to the cloud virtual machine. When it is detected that the historical data record meets a preset condition, instead: input the first data set into the first prediction model so that the first prediction model makes a prediction based on the first data set and outputs the first file that the user may access after logging in to the cloud virtual machine; form the second data set from the first file output by the first prediction model and the multiple files accessed during each of the user's M consecutive logins recorded in the historical data record; and input the second data set into the second prediction model to predict the second file that the user may access after logging in to the cloud virtual machine, continuing as described above until the M files have been predicted.
  • In other words, by default the target files are determined from the probability values calculated by the preset algorithm; if the historical data record is detected to meet the preset condition, the first prediction model and the second prediction model are used to predict the target files. For example, if the preset condition is that the number of file identifiers contained in the historical data record exceeds a preset threshold, the first prediction model and the second prediction model are used for prediction; if the number of file identifiers contained in the historical data record does not exceed the preset threshold, the preset algorithm is used for calculation. This avoids consuming a large amount of computing resources and improves prediction efficiency. A sketch of this dispatch logic is given below.
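  • The dispatch between the two approaches might look like the following sketch, in which the preset condition is assumed to be "the number of distinct file identifiers in the historical data record exceeds a threshold"; the threshold value and the two helper callables are illustrative stand-ins for the methods described in this application.

```python
from typing import Callable, Iterable, List, Mapping

def choose_target_files(history_records: Iterable[Mapping[str, str]],
                        preset_algorithm: Callable[[List[Mapping[str, str]]], List[str]],
                        model_based: Callable[[List[Mapping[str, str]]], List[str]],
                        threshold: int = 1000) -> List[str]:
    """Pick a prediction strategy based on the size of the historical data record."""
    records = list(history_records)
    file_ids = {record["file_id"] for record in records}

    if len(file_ids) > threshold:
        # Large history: use the first and second prediction models.
        return model_based(records)
    # Small history: use the preset weighting algorithm to save computing resources.
    return preset_algorithm(records)
```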
  • In a possible implementation, the cloud virtual machine may also generate a prediction request and send the prediction request to an electronic device.
  • The prediction request is used to instruct the electronic device to predict at least one file that the user may access after logging in to the cloud virtual machine; the cloud virtual machine then receives the at least one file predicted by the electronic device.
  • In this way, the cloud virtual machine can also trigger the electronic device to make the prediction and obtain the prediction result from the electronic device. The prediction does not need to be made on the cloud virtual machine, that is, the prediction function does not need to be deployed on the cloud virtual machine, thereby avoiding occupying a large amount of the cloud virtual machine's resources.
  • the embodiments of the present application also provide a device, which includes a plurality of functional units, and these functional units can perform the functions performed by each step in the method of the first aspect.
  • These functional units can be implemented by hardware or software.
  • the device includes a transmission unit and a processing unit. Regarding the beneficial effects achieved by the device, please refer to the description of the first aspect, which will not be repeated here.
  • In addition, an embodiment of the present application also provides a device, which includes a processor and a memory; the memory stores program instructions, and the processor runs the program instructions in the memory to implement the method provided in the first aspect.
  • the present application also provides a computer-readable storage medium in which instructions are stored, which when run on a computer, cause the computer to execute the method provided in the above-mentioned first aspect.
  • the present application also provides a computer chip, which is connected to a memory, and the chip is used to read and execute a software program stored in the memory, and execute the method provided in the above first aspect.
  • Figure 1 is a schematic diagram of the system architecture of an existing cloud virtual machine
  • Figure 2 is a schematic diagram of the system disk structure of the linked clone cloud virtual machine
  • Figure 3 is a schematic diagram of module interaction between the existing cloud virtual machine and the NAS disk
  • FIG. 4 is a schematic diagram of module interaction between the cloud virtual machine and the NAS disk provided by an embodiment of the application
  • Figure 5 is a schematic diagram of an application scenario provided by this application.
  • Figure 6 is a schematic diagram of another application scenario provided by this application.
  • FIG. 7 is a schematic diagram of another application scenario provided by this application.
  • FIG. 8 is a schematic flowchart of a data access method based on a cloud desktop according to an embodiment of the application.
  • Figure 9 is a schematic diagram of the algorithm structure of the LSTM model
  • FIG. 10 is a schematic diagram of a process of training and predicting an LSTM model provided by an embodiment of the application.
  • FIG. 11 is a schematic flowchart of a third cloud desktop-based data access method provided by an embodiment of the application.
  • FIG. 12 is a schematic diagram of a data access device based on a cloud virtual machine provided by an embodiment of the application.
  • FIG. 13 is a schematic diagram of a data access device based on a cloud virtual machine provided by an embodiment of the application.
  • Cloud desktop, also known as desktop virtualization or cloud computer, is a new model that replaces the traditional computer. After adopting the cloud desktop, users can access their personal cloud desktop over the network at any place and at any time through different terminal devices.
  • Refer to FIG. 1, which is a schematic diagram of a cloud system architecture to which the embodiments of this application can be applied.
  • The left side of Figure 1 shows the virtual desktop clients, usually called thin clients; the client devices can be an ordinary computer 101a, a tablet PC 101b, a smart phone 101c, and so on. They access the remote desktop service through the network 102 using the remote desktop protocol 103.
  • The servers 204a...204n provide the carrier of the remote desktop, and the user's virtual desktop exists on the servers in the form of virtual machines 205a, 205b...205n.
  • the virtual desktop management system 106 is used to provide functions such as mapping between the user's client and the virtual machine.
  • the client first connects to the virtual desktop management system 106.
  • After receiving the virtual machine connection request (hereinafter referred to as the login request) sent by the client, the virtual desktop management system 106 allocates a virtual machine to the client from the virtual machine resource pool. The client obtains the address of the virtual machine allocated by the virtual desktop management system 106 and then connects to that virtual machine.
  • the virtual desktop management system 106 may be a server or a common personal computer, which is not specifically limited in the present invention.
  • the user accesses the virtual desktop (ie, virtual machine) assigned to the user on the server through the client, and the virtual desktop transmits the content accessed by the user to the user's client (also referred to as user terminal) for display.
  • the server 204a...204n can virtualize multiple cloud virtual machines using virtualization technology.
  • the desktop of each cloud virtual machine is the cloud desktop.
  • The user logs in to a cloud virtual machine through the user terminal and performs cloud desktop operations on the logged-in cloud virtual machine.
  • Those skilled in the art can understand that a user logging in to a cloud virtual machine means that the user logs in to the cloud desktop. Therefore, in the embodiments of the present application, "cloud virtual machine" and "cloud desktop" are sometimes used interchangeably, as are "login" and "connection". It should be pointed out that where the distinction is not emphasized, the intended meaning is the same.
  • the cloud virtual machines produced by server virtualization include linked clone virtual machines and full clone virtual machines.
  • the full clone virtual machine occupies a completely independent virtual disk space.
  • A full clone virtual machine is bound to the user who first logs in to it; after binding, the full clone virtual machine serves only that user, that is, it will not be assigned to other users. Therefore, the full clone virtual machine can permanently store all of the bound user's cloud desktop operation data in its own virtual disk, and the user's operation data is not erased when the cloud virtual machine is shut down.
  • the virtual disks of each full clone virtual machine are independent of each other and do not interfere with each other.
  • However, a full clone virtual machine occupies dedicated virtual disk resources; when the storage space of the server is limited, cloud virtual machine services cannot be provided to more users.
  • the linked clone virtual machine came into being, and the linked clone virtual machine was cloned through a source virtual machine (or called a virtual machine template/parent virtual machine).
  • The characteristic of linked clone virtual machines is that multiple linked clone virtual machines share the same virtual disk (hereinafter referred to as the system master disk); that is, linked clone virtual machines configured with the same system master disk can have the same operating system, application systems, and so on.
  • the linked clone virtual machine will not be bound to the user. After the user initiates a login request through the user terminal, the virtual machine management system randomly allocates an idle linked clone virtual machine for the user.
  • In addition, the server allocates to each linked clone virtual machine its own dedicated portion of the virtual disk (hereinafter referred to as the differential disk), and the combination of the system master disk and the differential disk forms the system disk (i.e., the C drive) of the linked clone virtual machine.
  • The operation data generated after the user logs in to the cloud virtual machine is temporarily stored in the differential disk of the cloud virtual machine.
  • That is, the differential disk of the linked clone virtual machine temporarily stores the operation data generated by the user on the cloud virtual machine, which includes but is not limited to: all of the user's files, the attribute information of each file, the user's configuration parameters for the system or application programs, and so on, for example Word files created by the user, downloaded videos and audios, and the color configuration of the Word software window.
  • the attribute information of the file includes but is not limited to: the icon of the file, the location of the file (storage path), the size of the file, the file type and other information.
  • Because the linked clone virtual machine is not bound to a user and its allocation is random, in order to ensure that the user's operation data is not viewed by other users, the data saved in the virtual machine's differential disk also needs to be cleared when the user shuts down the virtual machine.
  • Therefore, the cloud virtual machine synchronously backs up the data saved in the differential disk to the NAS disk.
  • Specifically, the user's operations on the cloud desktop generate user data, such as created Word files, modified Excel files, downloaded videos and audios, and so on.
  • the above user data generated by the user on the cloud desktop will be temporarily stored in the differential disk of the cloud virtual machine, and synchronized to the user's account in the NAS disk.
  • the NAS disk can also store user data of other cloud virtual machine users, and the user data includes all file data of the user, file attribute information, and system or application configuration information.
  • In this way, when the user logs in again, the cloud virtual machine can download all of the user's user data from the NAS disk to the local disk, allowing the user to switch seamlessly.
  • However, the cloud virtual machine can only be used by the user after all of the user's user data has been downloaded, so this approach affects the time it takes for the user to start the cloud virtual machine.
  • one implementation is to store the attribute information of all the files of the user on a NAS disk, where the attribute information of the file includes but is not limited to the size, file type, storage location and File icons etc.
  • When the user logs in, the cloud virtual machine obtains the attribute information of all of the user's files from the NAS disk and constructs a virtualized file system for the user according to the obtained attribute information. The virtualized file system is the same as the user's actual file system on the cloud desktop; the user can browse it and access files through it.
  • However, the files in the virtualized file system do not exist locally; they still reside on the NAS. That is, when the user needs to access a certain file, the cloud virtual machine sends a file access request and downloads the file corresponding to the file access request from the NAS disk to the local cloud virtual machine.
  • FIG. 3 is a schematic diagram of the module interaction between the cloud virtual machine and the NAS disk in the above implementation.
  • the cloud virtual machine includes but is not limited to the bottom layer and the application layer.
  • The bottom layer includes the file management driver, and the application layer includes the file management application and the configuration file application.
  • When a user logs in to a cloud virtual machine, the file management driver triggers the configuration file application to download the attribute information of all of the user's files from the NAS disk.
  • Based on the attribute information of all the files downloaded by the configuration file application, the file management driver constructs the user's virtualized file system; once the virtualized file system has been constructed, the user has logged in to the cloud virtual machine.
  • the user After logging in to the cloud virtual machine, the user can operate through the virtualized file system, and the file management application can respond to the user's file operation on the virtualized file system and generate I/O requests. For example, when a user accesses a file in the virtualized file system, the file management application responds to the user's operation of reading the file, and generates an I/O request for reading the file, and carries the file identifier of the file in the I/O request.
  • the file management driver can monitor the I/O requests generated by the file management application, and trigger the file management application to perform corresponding operations according to the monitored I/O request.
  • For example, when the file management driver detects that an I/O request generated by the file management application is a file read request and determines that the file corresponding to the file identifier carried in the I/O request does not exist locally, that is, the file the user is accessing has not yet been downloaded,
  • the file management driver triggers the configuration file application to download the file corresponding to the file identifier carried in the read request from the NAS disk to the local differential disk of the cloud virtual machine.
  • For another example, when the user creates or saves a file named "work" on the cloud desktop, the file management application responds to the user's operation and generates an I/O request for writing the file.
  • The I/O request carries information about the file named "work", and the information includes the attribute information of the file and the data in the file.
  • The file management driver detects the write request generated by the file management application and triggers the configuration file application to synchronously upload the information of the file named "work" carried in the write request to the NAS disk. Later, if the user deletes or modifies the file named "work" on the cloud desktop, the file management driver also triggers the configuration file application to synchronously update the file named "work" on the NAS disk. A minimal sketch of this monitoring logic is given below.
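  • The sketch below illustrates this monitoring logic: a driver checks whether a requested file is already on the differential disk and otherwise fetches it from the NAS, and mirrors writes back to the NAS. The class and method names are illustrative and do not correspond to any real driver interface.

```python
import os
import shutil

class FileManagementDriver:
    """Illustrative stand-in for the driver that monitors I/O requests."""

    def __init__(self, local_root: str, nas_root: str):
        self.local_root = local_root  # differential disk of the cloud virtual machine
        self.nas_root = nas_root      # user's directory on the NAS disk

    def on_read_request(self, file_id: str) -> str:
        """Handle a read I/O request: download from the NAS only if the file is not local."""
        local_path = os.path.join(self.local_root, file_id)
        if not os.path.exists(local_path):
            # File exists only on the NAS: fetch it into the differential disk
            # (standing in for the configuration file application's download).
            shutil.copy(os.path.join(self.nas_root, file_id), local_path)
        return local_path

    def on_write_request(self, file_id: str, data: bytes) -> None:
        """Handle a write I/O request: save locally and synchronously mirror to the NAS."""
        for root in (self.local_root, self.nas_root):
            with open(os.path.join(root, file_id), "wb") as f:
                f.write(data)
```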
  • In the above implementation, when the cloud virtual machine is started, it can construct the virtualized file system based on the attribute information of all the files obtained, without downloading the files stored on the NAS to the local cloud virtual machine, which shortens the startup time of the cloud virtual machine.
  • However, when the user accesses a file, the accessed file first needs to be downloaded from the NAS to the local cloud virtual machine, which prolongs the time for the user to access the file. Especially for relatively large files, the time for the cloud virtual machine to download the file from the NAS disk is relatively long, resulting in a relatively long delay for the user to access the file and affecting the user experience.
  • this application provides a cloud desktop-based data access method.
  • When a user logs in to a cloud virtual machine, the files that the user may access after logging in are predicted according to the historical data record of the files generated by the user in the cloud virtual machine, and the predicted files are downloaded from the NAS to the cloud virtual machine.
  • In this way, when the user accesses these files after logging in to the cloud virtual machine, they are already available locally, which shortens the access delay.
  • FIG. 4 is a schematic diagram of the interaction of the software modules in this application; compared with FIG. 3, a prediction system is added in FIG. 4.
  • The prediction system is a prediction service, and can also be understood as an application program that implements the prediction function in this application.
  • the following takes the cloud virtual machine in FIG. 4 as the cloud virtual machine in FIG. 1 as an example to describe the process of this application in detail:
  • When the user logs in to the cloud virtual machine, the file management driver obtains the attribute information of all of the user's files from the NAS disk through the configuration file application and constructs the user's virtualized file system based on the obtained attribute information.
  • For details, refer to the related description of FIG. 3, which is not repeated here.
  • In addition, the prediction system obtains the user's historical data record and, based on it, predicts the user's target files, that is, the files that the user will access after logging in to the cloud virtual machine. It should be understood that what the prediction system actually predicts is the file identifiers of the target files. The specific prediction method is described in detail below.
  • the prediction system triggers the configuration file application to download the file corresponding to the predicted file identifier from the NAS to the local cloud virtual machine, and save the downloaded file to the corresponding storage path of the file in the virtualized file system.
  • In this way, when the user reads such a file, the file management application responds to the user's read operation and the user can access the file directly.
  • Specifically, the file management application generates an I/O request for the user's reading of the file.
  • The file management driver detects the read request generated by the file management application and determines that the file is already saved locally, so after detecting the user's read request it does not trigger the configuration file application to download the file from the NAS disk, thereby shortening the delay for the user to access the file.
  • the historical data records include records of all files accessed by the user on the cloud virtual machine before receiving the login request.
  • The historical data records of users in the embodiments of the present application may be stored in a cloud server, or in a storage medium other than the cloud server, such as an independent server other than the cloud server or a cloud storage resource (NAS, 360 cloud disk, etc.); the user data of different users in this application may likewise be stored in other storage media.
  • Correspondingly, the prediction system in this application can be deployed on a cloud virtual machine, or on an electronic device other than the cloud virtual machine, such as a cloud server or another storage medium independent of the cloud server and the cloud virtual machine; the prediction system and the historical data can be deployed on the same device or storage medium, or on different devices or storage media.
  • As shown in Figure 5, the virtual machine management system responds to the login request triggered by the user through the user terminal, allocates a cloud virtual machine to the user's user terminal (after the user's identity is authenticated), and forwards the user's login request to the cloud virtual machine.
  • the cloud virtual machine establishes a connection with the user terminal.
  • On the other hand, the cloud virtual machine obtains the user's historical data record according to the user's identity information (such as the account name and password) carried in the login request, predicts, based on the prediction system deployed on the cloud virtual machine and the obtained historical data record, the file identifiers of the target files that the user may access after logging in to the cloud virtual machine, and downloads the files corresponding to the predicted file identifiers from the NAS disk to the local differential disk of the cloud virtual machine.
  • the user terminal After the user terminal is connected to the cloud virtual machine, the user performs operations on the cloud desktop through the user terminal.
  • For the parts of FIG. 6 that are the same as FIG. 5, refer to the specific introduction of the relevant parts in FIG. 5, which is not repeated here.
  • The difference between Figure 6 and Figure 5 is that in Figure 6 the prediction system is not deployed on the cloud virtual machine but on the cloud server. Therefore, when the cloud virtual machine in Figure 6 responds to the user's login request, it also needs to trigger the prediction system deployed on the cloud server to predict the target files that the user may access after logging in to the cloud virtual machine.
  • The prediction system obtains the user's historical data and makes predictions based on it, and then sends the file identifiers of the predicted target files to the cloud virtual machine; the cloud virtual machine downloads the files corresponding to the file identifiers from the NAS disk to its local differential disk.
  • Trigger method 1: triggering by the cloud virtual machine.
  • When the cloud virtual machine receives the user's login request forwarded by the virtual machine management system, it triggers the prediction system to predict the files that the user may access after logging in to the cloud virtual machine.
  • Specifically, after receiving the user's login request, the cloud virtual machine sends a prediction request to the prediction system through the configuration file application.
  • The prediction request carries the user's identity information, for example, the user's account information.
  • The prediction system receives the prediction request sent by the configuration file application of the cloud virtual machine, obtains the user's historical data from the historical data storage medium, and predicts, based on the historical data, the file identifiers of the files that the user may access after logging in to the cloud virtual machine.
  • Optionally, when the cloud virtual machine triggers the prediction system to make predictions, the cloud virtual machine may itself obtain the user's historical data and send the obtained historical data to the prediction system together with the prediction request.
  • Trigger method 2: triggering by the virtual machine management system.
  • After the user's identity information passes authentication, the virtual machine management system sends a prediction request to the prediction system, and the prediction request carries the user's identity information.
  • The prediction system receives the prediction request sent by the virtual machine management system, obtains the user's historical data from the historical data storage medium, and predicts, based on the historical data, the file identifiers of the files that the user may access after logging in to the cloud virtual machine.
  • the cloud virtual machine updates the user's historical data record in the historical data storage medium and updates the user's user data in the NAS according to the user's operation.
  • the prediction system is deployed on a storage medium other than the cloud virtual machine and the cloud server (hereinafter referred to as the prediction system storage medium)
  • The specific process of this example is introduced below.
  • For the parts of FIG. 7 that are the same as FIG. 6 or FIG. 5, refer to the specific introduction of the relevant parts in FIG. 6 or FIG. 5.
  • The following takes Figure 6 as an example to introduce the difference between Figure 7 and Figure 6.
  • The difference between Figure 7 and Figure 6 is that the prediction system in Figure 7 is not deployed on the cloud server, but on a storage medium independent of the cloud virtual machine and the cloud server.
  • Therefore, when the cloud virtual machine in Figure 7 responds to the user's login request, it also needs to trigger the prediction system to predict the target files for the user, so that the prediction system obtains the user's historical data record and, based on the acquired historical data record, predicts the file identifiers of the target files that the user may access after logging in to the cloud virtual machine.
  • The prediction system sends the file identifiers of the predicted target files to the cloud virtual machine, and the cloud virtual machine downloads the files corresponding to the predicted file identifiers from the NAS disk to its local differential disk.
  • It should be noted that "completing the prediction" described in this application may mean that each time the file identifier of a target file is predicted it is sent to the cloud virtual machine, or that after the file identifiers of a preset number of target files have all been predicted, all the predicted file identifiers are sent to the cloud virtual machine together.
  • the embodiment of the present application does not limit the interaction mode between the prediction system and the cloud virtual machine.
  • an embodiment of the present application provides a data access process based on a cloud virtual machine.
  • The user terminal, virtual desktop management system, cloud virtual machine, NAS disk, and historical data storage medium in the process may respectively be the user terminal, virtual desktop management system, cloud virtual machine, NAS disk, and historical data storage medium in FIG. 7 above. The process includes the following steps.
  • Step 800: The user sends a login request through the user terminal.
  • Specifically, the user operates the user terminal to send a login request, and the login request carries the user's identity information, for example, the account name and password for logging in to the cloud virtual machine entered by the user at the user terminal.
  • Step 801: After receiving the login request sent by the user, the virtual desktop management system authenticates the user's identity information and, if the authentication passes, allocates a cloud virtual machine to the user's user terminal.
  • Specifically, after receiving the user's login request, the virtual desktop management system authenticates the user's identity information; if the authentication passes, it allocates to the user's user terminal an idle cloud virtual machine selected from the virtual machine pool; if the authentication fails, the user's login request is not responded to and the flow is exited.
  • Step 802 The virtual desktop management system sends the user's login request to the assigned cloud virtual machine.
  • Step 803 The cloud virtual machine receives the login request of the user.
  • Step 804: The cloud virtual machine responds to the user's login request, establishes a connection with the user terminal, and obtains the user's historical data record.
  • Specifically, when the allocated cloud virtual machine receives the login request sent by the virtual desktop management system, on the one hand, the cloud virtual machine establishes a connection (for example, an HDP connection) with the user's user terminal. The process of establishing the connection includes: the cloud virtual machine obtains the attribute information of all of the user's files from the NAS and constructs a virtualized file system according to the obtained attribute information; when the virtualized file system has been constructed, the connection between the cloud virtual machine and the user terminal is completed, the user has logged in to the cloud virtual machine, and the user can then perform cloud desktop operations through the virtualized file system.
  • On the other hand, the cloud virtual machine obtains the user's historical data record and predicts, based on the obtained historical data record, the files that the user may access after logging in to the cloud virtual machine.
  • For the specific prediction process, refer to the descriptions of step 805 and step 806.
  • It should be noted that the prediction system can also be deployed on a storage medium other than the cloud virtual machine. If the prediction system is deployed on a storage medium other than the cloud virtual machine, then in step 804, when responding to the user's login request, the cloud virtual machine also needs to trigger the prediction system to predict the target files for the user.
  • For the trigger method, refer to the specific description of trigger method 1 or trigger method 2 above, which is not repeated here.
  • Step 805: The prediction system deployed on the cloud virtual machine predicts, based on the acquired historical data record of the user, the file identifiers of the target files that the user may access after logging in to the cloud virtual machine.
  • If the prediction system is not deployed on the cloud virtual machine, then after completing the prediction, the prediction system also needs to send the predicted file identifiers corresponding to the user to the cloud virtual machine.
  • Step 806 The cloud virtual machine downloads the file corresponding to the file identifier notified by the prediction system from the NAS to the local differential disk of the cloud virtual machine.
  • It should be noted that, regarding step 806, this application does not limit when the prediction system predicts the file identifiers of the target files that the user may access. It can be during the process of the user logging in to the cloud virtual machine, after the user has logged in to the cloud virtual machine, while the user is using the cloud virtual machine, or after the user has finished using the cloud virtual machine and the user terminal has exited it. For example, when the user's user terminal exits the cloud virtual machine, the prediction system obtains the user's historical data record, predicts the files that the user may access the next time the user logs in to the cloud virtual machine, and saves the file identifiers of the predicted files.
  • When the user next logs in to the cloud virtual machine, after receiving the user's login request the cloud virtual machine directly obtains the saved file identifiers corresponding to the user, without the prediction system having to start the prediction after the cloud virtual machine receives the user's login request, which further shortens the delay for the user to access certain files.
  • The embodiment of the application does not limit this. A minimal sketch of the cloud virtual machine side of this flow is given below.
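  • The following sketch summarizes steps 803 to 806 from the cloud virtual machine's point of view. The classes, the predictor, and the file contents are toy placeholders used only to show the order of operations.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

@dataclass
class NasDisk:
    """Toy stand-in for the NAS disk holding file attributes and file contents."""
    attributes: Dict[str, dict] = field(default_factory=dict)  # file_id -> attribute info
    contents: Dict[str, bytes] = field(default_factory=dict)   # file_id -> file data

@dataclass
class CloudVirtualMachine:
    nas: NasDisk
    differential_disk: Dict[str, bytes] = field(default_factory=dict)
    virtual_file_system: Dict[str, dict] = field(default_factory=dict)

    def handle_login_request(self, user: str, predict: Callable[[str], List[str]]) -> None:
        """Sketch of steps 803-806: connect, predict, and prefetch the target files."""
        # Step 804 (one hand): build the virtualized file system from attribute information.
        self.virtual_file_system = dict(self.nas.attributes)

        # Step 804 (other hand) and step 805: predict the files the user may access.
        target_ids = predict(user)

        # Step 806: download the predicted files into the local differential disk.
        for file_id in target_ids:
            if file_id in self.nas.contents:
                self.differential_disk[file_id] = self.nas.contents[file_id]

# Example with a trivial predictor that always expects "report.docx" to be opened first.
nas = NasDisk(attributes={"report.docx": {"size": 3_000_000, "directory_level": 2}},
              contents={"report.docx": b"..."})
vm = CloudVirtualMachine(nas=nas)
vm.handle_login_request("alice", predict=lambda user: ["report.docx"])
```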
  • Prediction method 1: calculate the access weight of each file contained in the historical data record according to a preset algorithm, and determine the target files according to the magnitude of each file's access weight.
  • Specifically, the user's historical data record contains all file access records generated by the user in the process of using the cloud desktop, including, for example, the time at which the user accessed a file, the file identifier (for example, the file name) of the accessed file, the directory level information of the accessed file, and the size of the accessed file.
  • For example, one file access record contained in the historical data record indicates that the user accessed the file identified as A on the cloud desktop at 1:30:06 on December 1, 1998, that the size of the file is 3 MB, and that its directory level is 2.
  • The prediction system obtains, according to the historical data record, the access parameters of the file corresponding to each file identifier; the access parameters include but are not limited to the last access time of the file, the number of accesses, the directory level, and the file size.
  • The prediction system then calculates the access weight of the file corresponding to each file identifier according to the weight assigned to each access parameter, and determines the target files according to the magnitude of the access weights of the files corresponding to the file identifiers contained in the historical data record.
  • The prediction system can be deployed on a cloud server or on a cloud virtual machine.
  • When deployed on the cloud virtual machine, the cloud virtual machine predicts the target files for the user who is about to log in and, once the prediction is completed, downloads the files directly from the NAS based on the predicted file identifiers, which reduces interaction between systems.
  • When deployed on the cloud server, the cloud server also needs to notify the cloud virtual machine allocated to the user of the predicted file identifiers, and the cloud virtual machine then downloads the corresponding files from the NAS according to the server's notification.
  • The interaction between the cloud server and the cloud virtual machine wastes resources, and the interaction process also increases the delay.
  • Therefore, this application preferably deploys the prediction system based on prediction method 1 in the cloud virtual machine.
  • the historical data storage medium stores all file access records generated by different users when using the cloud desktop, that is, the historical data record consists of each file access record of the user.
  • the historical data records of the user acquired by the cloud virtual machine are all file access records generated by the user in the cloud virtual machine before the cloud virtual machine receives the user's login request.
  • A file access record includes but is not limited to some or all of the following: the time at which the user accessed the file, the file identifier of the accessed file, the directory level of the accessed file, or the size of the accessed file.
  • Table 1 shows an example of the historical data record of a user obtained by the cloud virtual machine.
  • It should be noted that when the cloud virtual machine predicts the file identifiers of the target files corresponding to the user, it does not need to obtain all of the user's historical data records; it may instead obtain the user's historical data records over a period of time, which is not limited here. For example, the cloud virtual machine obtains the file access records of the six months preceding the current time and makes predictions based on the user's file access records within those six months, thereby reducing the amount of calculation of the prediction system.
  • The cloud virtual machine calculates, according to a preset algorithm, the access weight of the file corresponding to each file identifier contained in the acquired historical data record, determines the file identifiers of the target files according to the magnitude of each file's access weight, and downloads the files corresponding to the determined file identifiers from the NAS to the local differential disk.
  • The preset algorithm calculates the access weight of a file according to the file's access parameters and the weights assigned to those access parameters, where the access parameters include but are not limited to some or all of the following: last access time, number of accesses, directory level, or file size.
  • In algorithm 1, Weight represents the access weight of the file; α1 represents the number of times the file has been accessed, and β1 is the weight coefficient of α1; α2 represents the last access time of the file, and β2 is the weight coefficient of α2; α3 represents the directory level of the file, and β3 is the weight coefficient of α3; α4 represents the size of the file, and β4 is the weight coefficient of α4.
  • Regarding the directory level of a file, users tend to place frequently accessed files at a relatively shallow level, so the shallower the directory level of a file, the higher its probability of being accessed compared with files at deeper directory levels. Regarding the file size, the larger the file, the longer it takes the cloud virtual machine to download it from the NAS and the more likely the download is to increase the delay when the user accesses the file, which affects the user experience. For example, if two files have identical access parameters apart from the file size, the larger of the two files is more likely to be determined as a target file.
  • It should be noted that the access time of a file recorded in the historical data record is an absolute time; for example, 9:12 on November 2, 2018 is an absolute time, and such absolute times are not directly comparable.
  • Therefore, when the embodiments of this application use algorithm 1 to calculate the access weight of each file, the access parameter "last access time" of each file needs to be converted using UTC as the time base. The specific conversion is as follows: the starting point of UTC time is 00:00:00.000 on January 1, 1970; the offset between the last access time (an absolute time) of the file obtained from the historical data record and 00:00:00.000 on January 1, 1970 is calculated, and this offset is used as the last access time of the file when calculating the file's access weight. For example, the offset between 01:01:00.000 on January 1, 1970 and the UTC starting point is 61 minutes. A minimal sketch of this conversion, together with one possible form of algorithm 1, is given below.
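  • The exact form of algorithm 1 is not reproduced in this text. The sketch below assumes, purely for illustration, a weighted sum of roughly normalized access parameters, with the last access time expressed as an offset from 00:00:00.000 on January 1, 1970; the weight coefficients, the normalization constants, and the sign conventions are assumptions rather than values from this application.

```python
from datetime import datetime, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def last_access_offset_minutes(last_access: datetime) -> float:
    """Convert an absolute last-access time into an offset (in minutes) from the UTC epoch."""
    return (last_access - EPOCH).total_seconds() / 60.0

def access_weight(num_accesses: int, last_access: datetime,
                  directory_level: int, file_size: int,
                  betas=(0.4, 0.3, 0.2, 0.1)) -> float:
    """Assumed instance of algorithm 1: a weighted sum of normalized access parameters.

    Normalization keeps the result roughly in [0, 1] so it can be compared against a
    threshold such as 0.8; the scaling constants are arbitrary examples.
    """
    b1, b2, b3, b4 = betas
    a1 = min(num_accesses / 100.0, 1.0)                                   # more accesses -> heavier
    now_offset = last_access_offset_minutes(datetime.now(timezone.utc))
    a2 = min(last_access_offset_minutes(last_access) / now_offset, 1.0)   # more recent -> heavier
    a3 = 1.0 / max(directory_level, 1)                                    # shallower level -> heavier
    a4 = min(file_size / 1_000_000_000, 1.0)                              # larger file -> prefetch earlier
    return b1 * a1 + b2 * a2 + b3 * a3 + b4 * a4

# Example: the offset of 01:01:00.000 on January 1, 1970 from the epoch is 61 minutes.
print(last_access_offset_minutes(datetime(1970, 1, 1, 1, 1, tzinfo=timezone.utc)))  # 61.0
```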
  • The cloud virtual machine determines, through algorithm 1 above, the access weights of the files corresponding to all the file identifiers contained in the acquired historical data record, and selects the files with the largest access weights, in descending order of access weight, as the target files.
  • For example, for each file identifier the cloud virtual machine traverses Table 1 to count the number of file access records containing that file identifier, that is, the number of times the file corresponding to the file identifier has been accessed. It then takes, among the file access records containing the identifier, the access time closest to the current time as the last access time of the corresponding file, and uses the file access record with that last access time to determine the directory level and size of the file.
  • T_A is the offset between the last access time of the file corresponding to file identifier A in Table 2 (9:12 on November 2, 2018) and the UTC starting point; T_B is the offset between the last access time of the file corresponding to file identifier B in Table 2 (9:40 on November 3, 2018) and the UTC starting point; T_C is the offset between the last access time of the file corresponding to file identifier C in Table 2 (12:14 on November 1, 2018) and the UTC starting point.
  • The files are sorted from largest to smallest according to the value of each file's access weight, and the files with the larger access weight values are selected as the target files.
  • There are many ways for the cloud virtual machine to determine the number of target files; the following examples illustrate this.
  • One achievable way is to preset the number of target files. For example, suppose the preset number of target files is 3; the prediction system then selects the first 3 file identifiers in the sorted order as the file identifiers of the target files, that is, it determines the file identifiers of the target files to be D, B, and A, and the cloud virtual machine stops downloading after the files with identifiers D, B, and A have been downloaded.
  • Another achievable way is to determine the target files according to a preset access weight threshold. For example, if the preset access weight threshold is 80%, the files whose access weight value is not less than 0.8 are determined to be the target files.
  • Suppose the cloud virtual machine determines the order of the access weight values of the files to be D > B > A > C > E, where the access weight value of the file corresponding to file identifier C is 0.8; the cloud virtual machine then determines the file identifiers of the target files to be D, B, A, and C, and stops downloading after the files with identifiers D, B, A, and C have been downloaded in sequence.
  • the order of the target files can indicate the order in which the user accesses the files after logging in to the cloud virtual machine.
  • the file corresponding to file identifier D is the file that the user is predicted to access first after logging in to the cloud virtual machine;
  • the file corresponding to file identifier B is the file predicted to be accessed next, after the user accesses the file corresponding to file identifier D;
  • the file corresponding to file identifier A is the file predicted to be accessed after the file corresponding to file identifier B, and so on.
  • correspondingly, the cloud virtual machine downloads the files from the NAS disk in the order of their access weights: it first downloads the file with file identifier D; after that download completes, it downloads the file with file identifier B; after that download completes, it downloads the file with file identifier A, and so on, until all the predicted target files have been downloaded in order of access weight.
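  • The ordered download can be sketched as a simple sequential loop; download_from_nas below is a hypothetical helper standing in for the configuration file application program's download path, not an actual interface of the system.

```python
# Sketch of downloading the predicted target files one by one, in descending
# order of access weight; download_from_nas() is a hypothetical helper.
def download_targets_in_order(target_ids, download_from_nas):
    for file_id in target_ids:            # e.g. ["D", "B", "A"]
        download_from_nas(file_id)        # next file starts only after this one completes
```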
  • the above introduction is a specific example of prediction by prediction method 1.
  • the prediction system can also perform prediction by prediction method 2.
  • the following embodiment introduces the specific prediction process of prediction method 2.
  • Prediction method 2: predict the target files through a deep learning algorithm model.
  • the embodiments of this application can also train a deep learning algorithm model (hereinafter referred to as the prediction model) on training samples obtained from the user's historical data records, so that the trained prediction model captures the user's file-access habits and the prediction system can predict, based on the user's prediction model, the file identifiers of the target files that the user may access after logging in to the cloud virtual machine.
  • the prediction model (a deep learning algorithm model)
  • the prediction model in the embodiments of this application is a deep learning algorithm model obtained after training a deep learning algorithm.
  • the deep learning algorithm models that can be applied in the embodiments of this application include, but are not limited to: LSTM (long short-term memory), SVM (support vector machine), Naive Bayes, Decision Tree, Neural Network, RNN (recurrent neural network) and DBSCAN (density-based spatial clustering of applications with noise).
  • the LSTM model is a time-recursive neural network suitable for processing and predicting important events with relatively long intervals and delays in a time series; that is, it can predict the event that will occur at the next time point based on the events that occurred at each time point over a long historical period.
  • the events that occur at each time point are recorded by numbering the time point, that is, through the time sequence. For example, in the historical data record shown in Table 1, the time sequence of the user's access to file A recorded in the first row of Table 1 is 1. The time sequence of user access to file B recorded in the second row of Table 1 is 2, and the time sequence of user access to file C recorded in the third row of Table 1 is 3, and so on.
  • the LSTM model can estimate the file that the user will access at the next time point based on the files accessed by the user in each time sequence of the historical data records. For example, for Table 1, based on the files accessed by the user in time sequences 1 to 9, the LSTM model can estimate the file that the user may access at the next time point, that is, in time sequence 10.
  • the prediction model in the embodiment of the present application is preferably the LSTM model.
  • the structure and principle of the LSTM model are first introduced:
  • FIG. 9 is a schematic diagram of the structure and prediction process of the LSTM model.
  • first, the structure of the LSTM is introduced.
  • the LSTM model includes a forget gate, an input gate, an output gate and a cell state C.
  • the cell state C is the algorithmic structure that implements the time recursion.
  • there are two inputs to the LSTM, namely x_t and h_{t-1}, where x_t represents the time sequence to be predicted and h_{t-1} is the file number of the file accessed by the user in the previous time sequence x_{t-1}.
  • usually the input of the LSTM model is a numeric value; therefore, a file number needs to be predefined for each file actually accessed by the user.
  • for the same user, a file number is the unique numeric identifier of a file, and file identifiers and file numbers have a one-to-one correspondence. For example, the file number of the file with file identifier a is 1, the file number of the file with file identifier b is 2, and so on; the predefinition is performed based on all the file identifiers in the user's historical data records, assigning each file identifier a unique corresponding numeric number.
  • the output h_t of the LSTM is the file number of the file that the user is predicted to access in time sequence x_t.
  • the output of the output gate is o_t, and h_t is obtained by element-wise multiplication of o_t and tanh(c_t).
  • the calculation of o_t is given by Algorithm 5, and the calculation of h_t is given by Algorithm 6.
  • W_xi, W_hi, W_ci, b_i, W_xf, W_hf, W_cf, b_f, W_xc, W_hc, b_c, W_xo, W_ho, W_co, b_o and σ in the above algorithms are static parameters of the LSTM model, which can be obtained by training the LSTM model.
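  • Since Algorithms 2 to 6 themselves are not reproduced in this section, the following sketch shows the conventional (peephole) LSTM cell written with the parameter names listed above; the exact equations are therefore an assumption based on the standard LSTM formulation, not a quotation of the patent's algorithms.

```python
# Minimal sketch of a standard peephole LSTM step using the static parameters
# named above; the peephole weights (W_ci, W_cf, W_co) are applied element-wise.
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, p):
    """One LSTM step; p is a dict holding the static parameters W_* and b_*."""
    # Input gate (roughly the patent's Algorithm 2)
    i_t = sigmoid(p["W_xi"] @ x_t + p["W_hi"] @ h_prev + p["W_ci"] * c_prev + p["b_i"])
    # Forget gate (roughly Algorithm 3)
    f_t = sigmoid(p["W_xf"] @ x_t + p["W_hf"] @ h_prev + p["W_cf"] * c_prev + p["b_f"])
    # Cell state update (roughly Algorithm 4)
    c_t = f_t * c_prev + i_t * np.tanh(p["W_xc"] @ x_t + p["W_hc"] @ h_prev + p["b_c"])
    # Output gate (Algorithm 5)
    o_t = sigmoid(p["W_xo"] @ x_t + p["W_ho"] @ h_prev + p["W_co"] * c_t + p["b_o"])
    # Hidden output: element-wise product of o_t and tanh(c_t) (Algorithm 6)
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t
```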
  • the embodiments of this application use the file numbers of the files accessed by the user in different time sequences of the historical period as training samples, so that the trained model predicts the file number of the file the user will access in the time sequence following the current one.
  • FIG. 10(a) is a schematic diagram of the process of training the LSTM model, where x represents the time sequence and h_1 to h_n represent the file numbers of the files actually accessed by the user; the files accessed in different time sequences may be the same, that is, in the input training samples the file numbers of different time sequences may be identical.
  • the file number has a corresponding relationship with the file identifier of the file.
  • for example, suppose the user accesses 3 different files in time sequences x_1 to x_n, and the names of the files are "abc", "123" and "learning"; then the correspondence between these 3 files and their file numbers can be as shown in Table 3 below.
  • the above corresponding relationship is only an example, and the file number corresponding to each file name can be defined randomly, or defined in ascending order or descending order. Since the files created by each user are different, the correspondence between the file identifier (file name) and the file number contained in the historical data record of each user is different, but the file number corresponding to the same file identifier of the same user is the same.
  • for example, the user accessed the file "abc" in time sequence x_1, and the file number of "abc" is 1; the user accessed the file "123" in time sequence x_2, and the file number of "123" is 9; the user accessed the file "learning" in time sequence x_3, and the file number of "learning" is 5. The output h_t represents the file number of the file that the LSTM model predicts will be accessed in the corresponding time sequence.
  • when the h_3 corresponding to time sequence x_3 is predicted: if, based on [x_1, 1] and [x_2, 9], the predicted h_3 corresponding to x_3 is 5, there is no need to adjust the static parameters of the LSTM model; if the predicted h_3 is not 5, the static parameters of the LSTM model are adjusted so that the output h_3 becomes 5.
  • the LSTM model is trained on a large number of such training samples, so that the algorithm of the LSTM model learns the pattern, or habit, with which the user accesses files.
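  • The training step described above can be sketched as follows; the framework choice (PyTorch), the layer sizes, the embedding step and the toy history are illustrative assumptions rather than the patent's training procedure.

```python
# Minimal sketch of training an LSTM to predict the next file number from the
# sequence of previously accessed file numbers.
import torch
import torch.nn as nn

class NextFileLSTM(nn.Module):
    def __init__(self, num_files, embed_dim=16, hidden_dim=32):
        super().__init__()
        self.embed = nn.Embedding(num_files + 1, embed_dim)   # file numbers start at 1
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, num_files + 1)       # scores per file number

    def forward(self, seq):                                     # seq: (batch, time)
        out, _ = self.lstm(self.embed(seq))
        return self.head(out[:, -1, :])                         # predict the next file number

# Toy training data built from a history like Table 1: the user accessed the
# files numbered 1, 9, 5, ... in time sequences x1, x2, x3, ...
history = [1, 9, 5, 1, 9, 5, 1, 9, 5]
inputs = torch.tensor([history[i:i + 2] for i in range(len(history) - 2)])
targets = torch.tensor([history[i + 2] for i in range(len(history) - 2)])

model = NextFileLSTM(num_files=10)
optimizer = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
for _ in range(200):                       # adjust the static parameters until the
    optimizer.zero_grad()                  # predicted file numbers match the samples
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    optimizer.step()
```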
  • in other words, the LSTM model trained in this application reflects, through mathematical laws, the user's habit of accessing files as shown in the user's access history, and uses the mathematical laws summarized from that history to predict the file (file number) that the user will access after the next login.
  • for example, every time a user logs in to the cloud virtual machine he accesses a folder named "work", which contains all the files the user has created since starting the job.
  • it can further be derived from the user's historical file accesses that the user creates 1-2 new Word files in the "work" folder every other week, and that every time the user logs in to the cloud virtual machine during that week he randomly opens one of the 1-2 Word files created that week. Therefore, when the LSTM model is trained with the training samples corresponding to such data, the model can reflect the user's file-access pattern through mathematical laws, and this pattern is used to determine the correlation between the file numbers.
  • this correlation can be embodied in how close the time sequence in which the same file was last accessed is to the current time sequence, and/or in the probability values with which other file numbers occur after that file number.
  • according to this rule, the LSTM model can obtain the file number with the highest degree of association with the file accessed by the user in the latest time sequence and output that file number, which is the file number of the file the user may access in the next time sequence. That is, the LSTM model can learn, from the input data, the rules with which the user accesses files in different time sequences.
  • these rules can be embodied as determining the correlation between file numbers according to the file numbers corresponding to the input time sequences, finding the file number most relevant to the file number of the time sequence immediately preceding the next time sequence, and thereby predicting the file (file number) that the user may access in the next time sequence.
  • the use process (or called the prediction process) of the LSTM model is introduced below.
  • in the prediction process, x represents the time sequence;
  • h_1 to h_n represent the file numbers of the files actually accessed by the user;
  • h_{n+1} represents the file number of the file that the user is predicted to access in time sequence x_{n+1}.
  • for example, the user accessed file 1 in time sequence x_1, and the file number of file 1 is a; the user accessed file 2 in time sequence x_2, and the file number of file 2 is b; the user accessed file 3 in time sequence x_3, and the file number of file 3 is c.
  • the trained LSTM model predicts, based on the mathematical law summarized from the user's habit of accessing files, the file number of the file to be accessed at time node x_4, which may, for example, be one of a, b and c.
  • a single LSTM model can be used to make predictions based on all the file access records contained in the user's historical data records, to predict the file (file number) that the user may access in the next time sequence.
  • alternatively, two LSTM models can be used: one LSTM model is used to predict the first file (file number) that the user accesses after logging in to the cloud virtual machine, and the other LSTM model is used to predict the files that the user will access sequentially after that first file.
  • the LSTM model that predicts the first file accessed by the user is trained on training samples consisting of the first file accessed each time the user logs in to the cloud virtual machine, so it can quickly learn the rule governing the first file the user accesses after each login; the training process is therefore easier, and the prediction accuracy of the two-model scheme is higher.
  • the embodiment of the present application preferably uses two LSTM models to make predictions, and the two LSTM models are described in detail below.
  • the prediction model in the embodiments of the present application includes a first prediction model and a second prediction model. The first prediction model is trained on the file numbers of the first file accessed each time after the user logs in to the cloud virtual machine, and is used to predict the first file the user will access after logging in to the cloud virtual machine; the second prediction model is trained on the file numbers of all the files accessed sequentially each time the user logs in to the cloud virtual machine, and is used to predict, starting from the first file predicted by the first prediction model, the other files that will be accessed sequentially after that first file.
  • the prediction system obtains the historical data record of the user, and obtains the training sample data of each prediction model according to the historical data record.
  • the training sample data of the first prediction model in this application is the data set A1
  • the training sample data of the second prediction model is the data set A2.
  • Data set A1 contains the file number of the file accessed for the first time after the user logs in to the cloud virtual machine each time, and the files in the data set A1 are arranged from the farthest to the most recent when they were accessed;
  • the data set A2 contains the file numbers of all files accessed sequentially after the user logs in to the cloud virtual machine each time, and the files in the data set A2 are also arranged according to the time of access from the farthest to the most recent.
  • X represents the time sequence obtained after numbering the user login to the cloud virtual machine in chronological order
  • X1 refers to the number of the user logging in to the cloud virtual machine for the first time
  • X2 represents the user logging in to the cloud virtual machine for the second time
  • X3 represents The user logs in to the cloud virtual machine for the third time, and so on.
  • Filexn represents the nth file accessed sequentially when the user logs in to the cloud virtual machine for the Xth time.
  • File11 represents the first file accessed by the user when the user logs in to the cloud virtual machine for the first time
  • File12 represents the second file that the user accesses in sequence, after accessing File11, when logging in to the cloud virtual machine for the first time;
  • File13 represents the third file that the user accesses in sequence, after accessing File12, when logging in to the cloud virtual machine for the first time, and so on.
  • the first file accessed by the user after each login can be determined from the historical data records.
  • in the historical data records, the first file that the user accesses each time the user logs in to the cloud virtual machine carries a corresponding mark; the mark can be, but is not limited to, one or more of words, numbers or symbols.
  • the aforementioned Filexn is the access sequence of files embodied in two dimensions, and does not indicate a file identifier.
  • File11, File21, File31, and File41 may be the same file.
  • Table 5 shows the file identifiers of the files actually accessed by the user in the different time sequences of Table 4.
  • the file identifier of each file corresponds to a file number.
  • the files (file identifiers) accessed by the user include a, b, c, d, e, h, r and y; it is assumed that the file number corresponding to each file identifier is as shown in Table 6.
  • the training sample data A1 of the first prediction model is obtained as follows:
  • the training sample data of the first prediction model consists of the first file accessed by the user each time the user logs in to the cloud virtual machine, namely [(X1, File11), (X2, File21), (X3, File31), (X4, File41)].
  • according to Table 5, [(X1, File11), (X2, File21), (X3, File31), (X4, File41)] corresponds to the file identifiers [(X1, a), (X2, e), (X3, h), (X4, c)].
  • during training, the first prediction model predicts the file number h2 corresponding to time sequence X2 based on (X1, 1). If h2 is 5 (the file number of the file actually accessed by the user at time X2), there is no need to adjust the static parameters of the first prediction model; if h2 is not 5, the static parameters of the first prediction model are adjusted until the h2 calculated by the model is 5. The model then predicts the h3 corresponding to X3: if the predicted h3 is 6 (the file number of the file actually accessed by the user at time X3), there is no need to adjust the static parameters; if h3 is not 6, the static parameters are adjusted until the h3 calculated by the first prediction model is 6, and so on.
  • the form of the above-mentioned data set input to the LSTM model is an example, and the present application is not limited to the form of the above-mentioned data set.
  • the data set input to the LSTM model can also contain only file numbers.
  • the LSTM model then determines the time sequence corresponding to each file number according to the order of the file numbers. For example, taking data set A1 as an example, the data set A1 input to the model can also be [1, 5, 6, 3], that is, containing only file numbers.
  • the LSTM model determines the time sequence corresponding to each file number in the order of file numbers, and predicts the output of the next time sequence according to the determined time sequence.
  • the input to the prediction model may be a data set containing only file numbers, excluding the time sequence to be predicted.
  • the time sequence to be predicted may be determined by the model itself. This is not limited.
  • the LSTM model is trained over the training sample data from the beginning, readjusting the static parameters in the model, until the accuracy of the first prediction model reaches an ideal value, for example an accuracy rate of 90%; when the accuracy rate of the model reaches the ideal value, the training of the model can be considered complete, that is, the model can be used for prediction.
  • the training sample data A2 of the second prediction model is obtained as follows:
  • A2 = [(S1, File11), (S2, File12), (S3, File13), (S4, File14), (S5, File15), (S6, File16), (S7, File17), (S8, File18), (S9, File21), (S10, File22), (S11, File23), (S12, File24), (S13, File25), (S14, File31), (S15, File32), (S16, File33), (S17, File34), (S18, File35), (S19, File36), (S20, File37), (S21, File41), (S22, File42), (S23, File43), (S24, File44), (S25, File45), (S26, File46)].
  • here S represents the time sequence obtained by numbering the accesses from the first file accessed during the user's first login to the cloud virtual machine up to the last file accessed during the most recent login; for Table 4, it can also be understood as the time sequence of all the files accessed during the user's 4 logins to the cloud virtual machine.
  • Table 5 is the file identifier of the file actually accessed by the user corresponding to Table 4. Based on Table 5, the training sample data A2 of the second prediction model is obtained, as follows:
  • A2 = [(S1, a), (S2, b), (S3, c), (S4, d), (S5, b), (S6, h), (S7, e), (S8, d), (S9, e), (S10, b), (S11, e), (S12, c), (S13, d), (S14, h), (S15, c), (S16, d), (S17, r), (S18, h), (S19, e), (S20, b), (S21, c), (S22, h), (S23, y), (S24, d), (S25, e), (S26, d)].
  • expressed with file numbers, A2 = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 2), (6, 6), (7, 5), (8, 4), (9, 5), (10, 2), (11, 5), (12, 3), (13, 4), (14, 6), (15, 3), (16, 4), (17, 7), (18, 6), (19, 5), (20, 2), (21, 3), (22, 6), (23, 8), (24, 4), (25, 5), (26, 4)].
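  • As an illustration, the data sets A1 and A2 above can be assembled from the per-login access histories as in the following sketch; the login histories and the identifier-to-number mapping mirror the example lists above, while the helper names are assumptions.

```python
# Minimal sketch: build data set A1 (first file of each login) and data set A2
# (all files of all logins, in order) from per-login access histories.
logins = [  # file identifiers accessed sequentially during logins X1..X4 (Table 5)
    ["a", "b", "c", "d", "b", "h", "e", "d"],
    ["e", "b", "e", "c", "d"],
    ["h", "c", "d", "r", "h", "e", "b"],
    ["c", "h", "y", "d", "e", "d"],
]
file_number = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5, "h": 6, "r": 7, "y": 8}  # Table 6

A1 = [file_number[login[0]] for login in logins]          # -> [1, 5, 6, 3]
A2 = [file_number[f] for login in logins for f in login]  # 26 file numbers, S1..S26
```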
  • the trained first prediction model can determine at least one associated file of the last file in data set A1, determine the degree of association between the last file and each associated file, and, based on the degree of association, predict the file that the user is likely to access in the next time sequence, that is, the first file that the user may access after logging in to the cloud virtual machine.
  • an associated file is a file that, according to the records in data set A1, the user accessed after accessing a file identical to the last file. For example, assume data set A1 is [(1, 1), (2, 2), (3, 3), (4, 2), (5, 3), (6, 4), (7, 5), (8, 3)]; the file number of the last file in the data set is 3, and data set A1 shows that, in the previous use of the cloud desktop, the user accessed the file with file number 2 and the file with file number 4 after accessing the file with file number 3; therefore, the file with file number 2 and the file with file number 4 are the associated files of the file with file number 3.
  • the first prediction model can determine the degree of association between file number 3 and each of its associated files according to the learned rules of the user's file access. For example, the first prediction model may determine the degree of association between two files according to the interval between them: if a user accessed file B after accessing file A a year ago, and accessed file C after accessing file A a week ago, then both file B and file C are associated files of file A, but the degree of association between file C and file A is higher than that between file B and file A. Alternatively, if the user accessed file B after accessing file A a year ago and later accessed file C after accessing file A, but during the long subsequent period of cloud desktop use no longer accessed file B while frequently accessing file C, then the degree of association between file C and file A is likewise higher than that between file B and file A.
  • as another example, the first prediction model may determine the degree of association between two files based on the frequency with which the user accesses the associated file after accessing a file identical to the last file in the first data set.
  • for example, if the user's associated files after accessing file 1 are file 2 and file 3, but the data set shows that the frequency of accessing file 2 after file 1 is lower than the frequency of accessing file 3 after file 1, then the degree of association between file 3 and file 1 is higher than the degree of association between file 2 and file 1.
  • the first prediction model can also combine the above two methods to determine the degree of association between two files: for example, determine association degree 1 for the two files according to the method of the first example, determine association degree 2 for the same two files according to the method of the second example, and then weight the two values to obtain the final degree of association.
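  • A sketch of such a combined interval-and-frequency association score follows; the normalization of the recency term and the equal combining weights are assumptions made only for illustration.

```python
# Sketch: score each associated file of the last file in a data set by combining
# (1) how recently it followed the last file and (2) how frequently it followed it.
def association_degrees(sequence, w_recency=0.5, w_frequency=0.5):
    last = sequence[-1]
    followers = [sequence[i + 1] for i in range(len(sequence) - 1) if sequence[i] == last]
    if not followers:
        return {}
    positions = {}   # associated file -> most recent position at which it followed `last`
    counts = {}      # associated file -> how often it followed `last`
    for i in range(len(sequence) - 1):
        if sequence[i] == last:
            nxt = sequence[i + 1]
            positions[nxt] = i + 1
            counts[nxt] = counts.get(nxt, 0) + 1
    return {
        f: w_recency * (positions[f] / (len(sequence) - 1))    # later position = more recent
           + w_frequency * (counts[f] / len(followers))
        for f in positions
    }

# For the example sequence [1, 2, 3, 2, 3, 4, 5, 3], files 2 and 4 follow file 3;
# file 4 followed it more recently, so it receives the higher degree here.
print(association_degrees([1, 2, 3, 2, 3, 4, 5, 3]))
```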
  • similarly, based on at least one associated file of the last file in data set A2, the trained second prediction model can determine the degree of association between the last file and each associated file and, based on the degree of association, predict the file that the user may access in the next time sequence;
  • for details, refer to the above introduction to the trained first prediction model, which is not repeated here.
  • the first prediction model after training can be used to predict the first file that the user will access after logging in to the cloud virtual machine
  • the trained second prediction model is used to predict the files that the user will access sequentially after the first accessed file. Therefore, when the second prediction model makes predictions, the file number of the first file that the user may access, as predicted by the first prediction model, is appended in chronological order to the end of data set A2, and the second file that the user may access is predicted based on that first file;
  • the data set A2 is then updated with the file number of the predicted second file, the third file that the user may access is predicted based on the predicted second file, and so on.
  • assume the time sequence of the user's next login to the cloud virtual machine is X5;
  • the first prediction model is used to predict the file number of the first file that the user may access in time sequence X5.
  • the data set A1 and the time sequence X5 are input into the first prediction model, and the first prediction model outputs h_5, where h_5 is the file number of the file that the user may access in time sequence X5.
  • after the predicted first file is appended to the end of data set A2, A2 = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 2), (6, 6), (7, 5), (8, 4), (9, 5), (10, 2), (11, 5), (12, 3), (13, 4), (14, 6), (15, 3), (16, 4), (17, 7), (18, 6), (19, 5), (20, 2), (21, 3), (22, 6), (23, 8), (24, 4), (25, 5), (26, 4), (27, 8)].
  • next, the file number of the file that the user will access in time sequence S28 is predicted:
  • the updated data set A2 and the time sequence S28 are input into the second prediction model,
  • and the second prediction model predicts the file number of the file that the user will access at S28, denoted h_6.
  • the end condition can take many forms, several of which are listed below:
  • for example, if the preset number is 3, then when the prediction models have determined the file numbers of three target files (for example, h_5, h_6 and h_7), the prediction process ends.
  • another end condition is that a predicted target file is duplicated, for example, the file number h_5 is e and the file number h_7 is also e, or the file corresponding to file number h_5 and the file corresponding to file number h_7 are the same file.
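  • Putting the two models together, the iterative prediction and its end conditions can be sketched as follows; first_model and second_model are placeholders for the trained prediction models, each taking a sequence of file numbers and returning the predicted next file number.

```python
# Sketch of the two-model prediction loop with the two end conditions above.
def predict_target_files(A1, A2, first_model, second_model, preset_count=3):
    predictions = [first_model(A1)]              # first file after the next login
    sequence = list(A2) + [predictions[0]]       # append it to the end of A2
    while len(predictions) < preset_count:
        nxt = second_model(sequence)             # next file in the sequence
        if nxt in predictions:                   # end condition: duplicate prediction
            break
        predictions.append(nxt)
        sequence.append(nxt)                     # update A2 with the new prediction
    return predictions
```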
  • each cloud virtual machine user has its own corresponding prediction model. Since linked-clone cloud virtual machines are allocated randomly, deploying on a cloud virtual machine the prediction models of some or all of the users to whom that cloud virtual machine might be allocated would occupy a large amount of the cloud virtual machine's storage resources. Therefore, in the embodiments of the present application, when the target files are predicted through prediction models, the prediction models are preferably deployed in a storage medium other than the cloud virtual machine, such as a cloud server, or an independent device or storage medium other than the cloud server.
  • the above introduces specific examples in which a cloud virtual machine makes predictions through prediction method 1 or prediction method 2 individually.
  • the embodiments of the present application may also combine prediction method 1 and prediction method 2: for example, when the number of historical data records generated by the user's use of the cloud desktop is small, prediction method 1 is used; when the number of the user's historical data records is large, prediction method 2 is used. The following describes the process of predicting the file identifiers of the target files by combining the two prediction methods.
  • Step 1100 the user triggers the user terminal to send a login request
  • Step 1101 After receiving the login request sent by the user, the virtual desktop management system authenticates the user's identity information, and if the authentication passes, allocates a cloud virtual machine to the user terminal of the user;
  • Step 1102: The cloud virtual machine obtains the user's historical data records from the historical data storage medium, and determines whether the prediction condition for currently using prediction method 1 (hereinafter referred to as the first prediction condition) is met; if so, execute step 1103, otherwise go to step 1104;
  • the first prediction condition is the condition for using prediction method 1.
  • the first prediction condition may include some or all of the following (a simple check is sketched after this list):
  • the number of files corresponding to the file identifiers contained in the historical data records is less than or equal to a preset number;
  • the cumulative time for which the user has used the cloud virtual machine is less than or equal to a preset cumulative time.
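  • A sketch of the step 1102 decision follows; whether the sub-conditions are combined with "or" or "and", and the threshold values themselves, are assumptions, since the text only says the condition may include some or all of the items above.

```python
# Sketch of step 1102: pick prediction method 1 when the first prediction
# condition holds, otherwise fall back to the model-based prediction method 2.
PRESET_FILE_COUNT = 50          # illustrative maximum distinct-file count for method 1
PRESET_CUMULATIVE_HOURS = 100   # illustrative maximum cumulative usage time for method 1

def first_prediction_condition_met(history, cumulative_hours):
    distinct_files = {record["file_id"] for record in history}
    return (len(distinct_files) <= PRESET_FILE_COUNT
            or cumulative_hours <= PRESET_CUMULATIVE_HOURS)

def choose_method(history, cumulative_hours):
    return "method 1" if first_prediction_condition_met(history, cumulative_hours) else "method 2"
```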
  • the execution subject that determines in step 1102 whether the first prediction condition is currently met may also be a cloud server; according to the judgment result, the cloud server notifies the entity on which the corresponding prediction method is deployed to make the prediction. For example, if the cloud server determines that the first prediction condition is currently met, it notifies the cloud virtual machine to make the prediction and sends the user's historical data records to the cloud virtual machine; if the cloud server determines that the first prediction condition is not currently met, it notifies the prediction model storage medium to make the prediction using the prediction model corresponding to the user, and sends the user's historical data records to the prediction model storage medium.
  • the embodiment of this application can flexibly perform information interaction between storage media with different functions.
  • in the embodiments of this application, the prediction is ultimately made, based on the acquired historical data records of the user, by the entity into which the prediction system is integrated; the manner and process of data interaction between the parts of the system are not limited.
  • Step 1103: The cloud virtual machine calculates, according to a preset algorithm, the access weight of the file corresponding to each file identifier contained in the historical data records, determines the file identifiers of the target files according to the access weights of the files, and downloads the files corresponding to those file identifiers from the NAS to the local differential disk.
  • for step 1103, reference may be made to the specific execution steps introduced above for prediction method 1, which are not repeated here.
  • Step 1104: Based on the acquired historical data records, the prediction model predicts the file identifiers of the target files that the user will access after logging in to the cloud virtual machine and sends the predicted file identifiers to the cloud virtual machine, and the cloud virtual machine downloads the files corresponding to those file identifiers from the NAS to the local differential disk.
  • for step 1104, reference may be made to the specific execution steps introduced above for prediction method 2, which are not repeated here.
  • while the cloud virtual machine downloads the target files from the NAS,
  • the user can also use the cloud desktop normally, for example, access files through the virtual machine file system or create new files locally on the cloud virtual machine.
  • when the user accesses a downloaded target file on the cloud virtual machine, there is no need to wait for the cloud virtual machine to download the file from the NAS disk, which effectively reduces the delay for the user to access these files, improves the user experience and has strong applicability.
  • download process 1 downloads from the NAS the files actually accessed by the user;
  • download process 2 downloads the predicted target files from the NAS disk.
  • the prediction function can also re-predict, based on the files actually accessed by the user, the files that the user will access next;
  • the cloud virtual machine then downloads from the NAS disk, through process 2, the latest target files predicted by the prediction system.
  • the embodiment of the application also provides a device for executing the method executed by the cloud virtual machine in the above method embodiment.
  • the device includes a processing unit 1201 and a transmission unit 1202.
  • the processing unit 1201 is configured to receive a user's login request and, in response to the login request, predict at least one file that the user may access after logging in to the cloud virtual machine; the processing unit 1201 may be the prediction system in FIG. 4 and executes the method corresponding to step 805 performed by the prediction system in the embodiment shown in FIG. 8.
  • the transmission unit 1202 is configured to download the at least one target file from the remote storage device to the cloud virtual machine.
  • the transmission unit 1202 may be the configuration file application program in FIG. 4, and execute the method corresponding to step 803, step 804, and step 806 in the embodiment shown in FIG. 8.
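  • A structural sketch of this device follows; the method names and the way the two units call each other are assumptions, since the patent defines the units only functionally.

```python
# Structural sketch of the device: a processing unit that predicts target files in
# response to a login request, and a transmission unit that downloads them.
class TransmissionUnit:                      # corresponds to unit 1202
    def download(self, file_ids, remote_storage, local_disk):
        for file_id in file_ids:
            local_disk[file_id] = remote_storage[file_id]

class ProcessingUnit:                        # corresponds to unit 1201
    def __init__(self, predictor, transmission_unit):
        self.predictor = predictor           # e.g. the prediction logic of FIG. 4
        self.transmission_unit = transmission_unit

    def on_login_request(self, user, history, remote_storage, local_disk):
        targets = self.predictor(history)    # at least one file the user may access
        self.transmission_unit.download(targets, remote_storage, local_disk)
        return targets
```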
  • when responding to the login request, the processing unit 1201 may obtain the historical data records of the user and, according to the user's historical data records, predict at least one file that the user may access after logging in to the cloud virtual machine; the historical data records record the files that the user accessed in the cloud virtual machine before the cloud virtual machine received the login request.
  • when predicting, based on the user's historical data records, at least one file that the user may access after logging in to the cloud virtual machine, the processing unit 1201 is specifically configured to obtain the historical data records.
  • in a possible implementation, the processing unit is specifically configured to: obtain a first data set from the historical data records, the first data set being the set of the first file accessed each time during the user's N consecutive logins to the cloud virtual machine; input the first data set into the first prediction model, so that the first prediction model makes a prediction based on the first data set and outputs the first file that the user may access after logging in to the cloud virtual machine; form a second data set from the first file output by the first prediction model and the multiple files accessed sequentially each time during the user's M consecutive logins to the cloud virtual machine recorded in the historical data; input the second data set into a second prediction model to predict the second file that the user may access after logging in to the cloud virtual machine; add the second file to the second data set to update the second data set; and return to the step of inputting the second data set into the second prediction model until M files that the user may access after logging in to the cloud virtual machine have been predicted.
  • when inputting the first data set into the first prediction model, so that the first prediction model makes a prediction based on the first data set and outputs the first file that the user may access after logging in to the cloud virtual machine, the processing unit is specifically configured to: control the first prediction model to determine at least one associated file of the last file in the first data set, determine the degree of association between the last file and the at least one associated file, and output the associated file with the highest degree of association with the last file; wherein the files in the first data set are sorted from the farthest to the most recent access time, the associated file is a file that, according to the records in the first data set, the user accessed after accessing the last file, and the degree of association is determined by the first prediction model according to the interval between the associated file and the last file and/or the frequency with which, in the first data set, the user accesses the associated file after accessing the last file.
  • when inputting the second data set into the second prediction model to predict the second file that the user may access after logging in to the cloud virtual machine, the processing unit is specifically configured to: control the second prediction model to determine at least one associated file of the last file in the second data set, determine the degree of association between the last file and the at least one associated file, output the associated file with the highest degree of association with the last file, and add the output associated file to the end of the second data set, so that the output associated file becomes the last file of the updated second data set; wherein the files in the second data set are arranged in order of access time from farthest to most recent, the associated file is a file that, according to the records in the second data set, the user accessed after accessing a file identical to the last file, and the degree of association is determined by the second prediction model according to the interval between the associated file and the last file and/or the frequency with which, in the second data set, the user accesses the associated file after accessing a file identical to the last file.
  • in another possible implementation, the processing unit is specifically configured to:
  • select at least one target file identifier from the multiple file identifiers contained in the user's historical data, and take the files corresponding to the target file identifiers as the files that the user may access after logging in to the cloud virtual machine; when it is detected that the historical data records meet a preset condition, input the first data set into the first prediction model, so that the first prediction model makes a prediction based on the first data set and outputs the first file that the user may access after logging in to the cloud virtual machine; form a second data set from the first file output by the first prediction model and the multiple files accessed sequentially each time during the user's M consecutive logins to the cloud virtual machine recorded in the historical data; input the second data set into the second prediction model to predict the second file that the user may access after logging in to the cloud virtual machine; add the second file to the second data set to update the second data set; and return to the step of inputting the second data set into the second prediction model until M files that the user may access after logging in to the cloud virtual machine have been predicted.
  • the processing unit is specifically configured to generate a prediction request and send the prediction request to an electronic device through the transmission unit, the prediction request being used to instruct the electronic device to predict at least one file that the user may access after logging in to the cloud virtual machine;
  • the at least one file, predicted by the electronic device, that the user may access after logging in to the cloud virtual machine is then received through the transmission unit.
  • this application further provides a device 1300, which can be applied to the cloud virtual machine in the scenario shown in FIG. 5, FIG. 6 or FIG. 7.
  • the device 1300 may include a processor 1301 and a memory 1302. Further, the device may further include a communication interface 1304, and the communication interface may be a transceiver. Further, the device may also include a bus system 1303.
  • the processor 1301, the memory 1302, and the communication interface 1304 can be connected via a bus system 1303.
  • the memory 1302 can store instructions.
  • the processor 1301 can be used to execute the instructions stored in the memory 1302 to control the communication interface 1304 to receive or send signals, so as to complete the steps performed by the cloud virtual machine in the method shown in FIG. 8 above.
  • the memory 1302 may be integrated in the processor 1301, or may be a different physical entity from the processor 1301.
  • the function of the communication interface 1304 may be implemented by a transceiver circuit or a dedicated chip for transceiver.
  • the processor 1301 may be implemented by a dedicated processing chip, a processing circuit, a processor, or a general-purpose chip.
  • a computer may be considered to implement the functions of the first computing node provided in the embodiments of the present application.
  • the program code for realizing the functions of the processor 1301 and the communication interface 1304 is stored in the memory 1302, and the general-purpose processor can realize the functions of the processor 1301 and the communication interface 1304 by executing the codes in the memory.
  • the device 1300 may be used to execute the steps in the process shown in FIG. 8 with the cloud virtual machine as the main body.
  • the communication interface 1304 may receive a user's login request and download the at least one target file from a remote storage device to the cloud virtual machine; the processor 1301 may respond to the login request by predicting at least one file that the user may access after logging in to the cloud virtual machine.
  • the embodiments of the present application also provide a computer storage medium; the storage medium stores a software program which, when read and executed by one or more processors, can implement the method provided by any one or more of the above embodiments.
  • the computer storage medium may include: a USB flash drive, a removable hard disk, a read-only memory, a random access memory, a magnetic disk, an optical disc, or other media that can store program code.
  • the embodiments of the present application also provide a computer program product.
  • the computer program product includes computer instructions.
  • when the computer instructions are executed by a computer, the computer executes the method provided by any one or more of the above embodiments.
  • the embodiments of the present application also provide a chip, which includes a processor for implementing the functions involved in any one or more of the above embodiments, such as acquiring or processing the information or messages involved in the above methods.
  • the chip further includes a memory for storing program instructions and data executed by the processor.
  • the chip may also include chips and other discrete devices.
  • the processor may be a central processing unit (CPU), and may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or another programmable logic device, a transistor logic device, a discrete hardware component, etc.
  • the general-purpose processor may be a microprocessor or any conventional processor.
  • the memory may include a read-only memory and a random access memory, and provides instructions and data to the processor.
  • a part of the memory may also include a non-volatile random access memory.
  • the bus system may also include a power bus, a control bus, and a status signal bus.
  • various buses are marked as bus systems in the figure.
  • each step of the above method can be completed by an integrated logic circuit of hardware in the processor or instructions in the form of software.
  • the steps of the method disclosed in combination with the embodiments of the present application may be directly embodied as being executed and completed by a hardware processor, or executed and completed by a combination of hardware and software modules in the processor.
  • the software module can be located in a mature storage medium in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, or electrically erasable programmable memory, registers.
  • the storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method in combination with its hardware. To avoid repetition, it will not be described in detail here.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A data access method and device based on a cloud virtual machine. The method includes: a cloud virtual machine receives a user's login request and, in response to the user's login request, predicts at least one file that the user may access after logging in to the cloud virtual machine, and downloads the predicted at least one file from a remote storage device to the cloud virtual machine; after the user logs in to the cloud virtual machine, these files can be accessed directly, which avoids the cloud virtual machine having to download a file from the remote storage device only when the user accesses it, thereby shortening the delay with which the user accesses these files, improving the user experience, and providing strong applicability.

Description

一种基于云虚拟机的数据访问方法及设备 技术领域
本申请涉及云桌面技术领域,尤其涉及一种基于云虚拟机的数据访问方法及设备。
背景技术
云桌面服务系统可以采用虚拟化技术在每台云服务器上虚拟出一台或多台云虚拟机,用户在通过用户终端登录云虚拟机后,可以在用户终端上显示登录成功的云虚拟机对应的云桌面。用户通过操作显示的云桌面,就可以实现通过登录成功的云虚拟机实现一些操作功能。
云桌面服务系统一般通过克隆方式在云服务器上虚拟出多台云虚拟机,目前,较为常见的云虚拟机的克隆方式为链接克隆,通过链接克隆创建的云虚拟机的系统盘(即C盘)均包含系统母盘和差分盘。系统母盘内存储有通过链接克隆在该云服务器下创建的所有云虚拟机共同访问的数据,该数据不会随着关闭云虚拟机而被清除,也就是通过链接克隆创建出的各个云虚拟机共用一个系统母盘。各云虚拟机还有自身单独使用的差分盘,差分盘用于存储用户登录云虚拟机后产生的文件,该文件对其他云虚拟机不可见,当用户关闭登录的该云虚拟机后,该云虚拟机的差分盘内的数据随即被清除。
为了保存用户在关闭登录的云虚拟机后该云虚拟机的差分盘内的文件,现有会将用户登录云虚拟机后在差分盘内的文件上传至远端存储设备,例如网络附属存储(network attached storage,NAS)设备中。当用户通过用户终端再次登录对应云虚拟机时,可能需要访问之前登陆所述云虚拟机时生成的文件,所以云虚拟机会从NAS盘中获取该用户在该云虚拟机中生成的文件。如果文件对应的数据量比较大,则本次登录云虚拟机后从NAS盘下载文件的时间就比较长,从而导致用户在访问这些文件时,时延较大,影响用户体验。
发明内容
本申请提供一种基于云虚拟机的数据访问方法及设备,用以实现在使用云虚拟机时减少用户执行某些操作的时延,提高用户体验。
第一方面,本申请提供一种基于云虚拟机的数据访问方法,该方法包括:云虚拟机接收用户的登录请求,并响应该用户的登录请求预测所述用户登录到云虚拟机后可能访问的至少一个文件,并从远端存储设备中下载确定的所述至少一个文件至云虚拟机本地。
通过上述方法,云虚拟机在响应用户的登录请求时,一方面与该用户的用户终端建立连接,另一方面,确定该用户登录到云虚拟机后可能访问的至少一个文件,并从远端存储设备中下载该确定的至少一个文件,当用户登录到云虚拟机后,不需要再等待云虚拟机从远端存储设备下载该文件的时间,从而缩短了用户访问这些文件的时延,提高了用户体验。
在一种可能的实现方式中,云虚拟机在确定用户登录到云虚拟机后可能访问的至少一个文件时,可以通过自身进行预测得到该用户对应的至少一个文件。其中,通过云虚拟机进行预测时,需要获取该用户的历史数据记录,根据获取的该用户的历史数据记录,预测该用户登录到云虚拟机后可能访问的至少一个文件。
其中,所述历史数据记录记录了在云虚拟机接收到所述登陆请求之前,该用户在所述云虚拟机中所访问的文件的记录。
通过上述方法,云虚拟机可以通过自身进行预测,减少与其他设备之间的交互,节省资源。
在一种可能的实现方式中,云虚拟机或其他设备在预测用户登录云虚拟机后可能访问的至少一个文件时,可以通过下列方式进行预测:
获取所述历史数据记录中记录的每个文件的多个参数值,其中每个参数值预设一个权重值;根据预设算法对所述多个参数及所述多个参数的预设权重值计算每个文件在用户登录所述云虚拟机后可能被访问的概率值;根据所述概率值确定所述至少一个目标文件。
通过上述方法,根据预设算法计算历史数据记录中包含的每个文件可能被访问的概率值,从而确定用户可能访问的至少一个文件,该预测方式简单,不需要区分登录的用户,适用场景多,部署于云虚拟机上时,不会占用云虚拟机过多的资源,应用性强。
在一种可能的实现方式中,云虚拟机或其他设备在预测用户登录云虚拟机后可能访问的至少一个文件时,还可以通过下列方式进行预测:
从所述历史数据记录中获取第一数据集合,所述第一数据集合为用户连续N次登陆所述云虚拟机时,每次所访问的第一个文件的集合;将所述第一数据集合输入第一预测模型,以使所述第一预测模型根据所述第一数据集合进行预测,输出所述用户登录所述云虚拟机后可能访问的第一个文件;将所述第一预测模型输出的第一个文件及所述历史数据中记录的用户连续M次登陆所述云虚拟机时,每次顺序访问的多个文件构成第二数据集合;将所述第二数据集合输入第二预测模型,以预测用户登录所述云虚拟机后可能访问的第二个文件;将所述第二个文件加入所述第二数据集合以更新所述第二数据集合;返回所述将所述第二数据集合输入第二预测模型的步骤,直到预测出用户登陆所述云虚拟机后可能访问的M个文件。
通过上述方法,通过第一预测模型预测用户登录云虚拟机后可能访问的首个文件,并通过第二预测模型基于第一预测模型输出的文件(用户可能访问的首个文件)来预测用户可能顺序访问其他文件。第二预测模型是根据用户的历史数据记录中记录的用户连续登录云虚拟机时每次顺序访问的多个文件的训练样本进行训练的,也就是,第二预测模型能够根据用户连续访问文件的规律预测下一次可能访问的文件。其中,第二预测模型也可以用于预测用户登录云虚拟机后可能访问的首个文件,但第一预测模块为根据历史数据记录中,用户每次登录云虚拟机后访问的首个文件的训练样本进行训练的,因此,第一预测模型用于预测用户登录云虚拟机后可能访问的首个文件的准确性更高,另外,第一预测模型和第二预测模型可以处理的数据量较大,当用户的历史数据记录中记录的用户在云虚拟机上所访问的文件的记录较多时,使用第一预测模型和第二预测模型进行预测,预测效率较高。
在一种可能的实现方式中,所述第一预测模型确定所述第一数据集合中最后一个文件的至少一个关联文件,并确定所述最后一个文件与所述至少一个关联文件的关联度;其中,所述第一数据集合中的文件按照被访问的时间由远及近进行排列,所述关联文件为所述第一数据集合中记录的用户在访问完与所述最后一个文件之后访问的文件;所述第一预测模型输出与所述最后一个文件关联度最高的关联文件;所述关联度,为所述第一预测模型根据所述关联文件与所述最后一个文件的间隔,和/或在所述第一数据集合中 用户在访问完与所述最后一个文件之后访问所述关联文件的频率确定的。
在一种可能的实现方式中,所述第二预测模型确定所述第二数据集合中最后一个文件的至少一个关联文件,并确定所述最后一个文件与所述至少一个关联文件的关联度;其中,所述第二数据集合中的文件按照访问时间由远及近的顺序排列,所述关联文件为所述第二数据集合中记录的用户在访问完与所述最后一个文件相同的文件之后访问的文件;所述第二预测模型输出与所述最后一个文件关联度最高的关联文件;其中,所述关联度,为所述第二预测模型根据所述关联文件与所述最后一个文件的间隔,和/或在所述第二数据集合中用户在访问完与所述最后一个文件相同的文件之后访问所述关联文件的频率确定的;所述第二预测模型将输出的所述关联文件添加至所述第二数据集的尾端,使输出的所述关联文件为更新后的第二数据集的最后一个文件。
在一种可能的实现方式中,云虚拟机还可以通过下列方式预测所述用户登录所述云虚拟机后待访问的至少一个目标文件:按照所述文件被访问的概率值从大到小的顺序,在所述用户的历史数据包含的多个文件标识中选择至少一个目标文件标识,将所述目标文件标识对应的文件作为所述用户登录所述云虚拟机后待访问的目标文件;当侦测到历史数据记录符合预设条件时,则:将所述第一数据集合输入第一预测模型,以使所述第一预测模型根据所述第一数据集合进行预测,输出所述用户登录所述云虚拟机后可能访问的第一个文件;将所述第一预测模型输出的第一个文件及所述历史数据中记录的用户连续M次登陆所述云虚拟机时,每次顺序访问的多个文件构成第二数据集合;将所述第二数据集合输入第二预测模型,以预测用户登录所述云虚拟机后可能访问的第二个文件;将所述第二个文件加入所述第二数据集合以更新所述第二数据集合;返回所述将所述第二数据集合输入第二预测模型的步骤,直到预测出用户登陆所述云虚拟机后可能访问的M个文件;根据第一预测模型输出的所述用户登录所述云虚拟机后可能访问的第一个文件及第二预测模型输出的所述用户登录所述云虚拟机后可能访问的M个文件确定所述用户登录所述云虚拟机后可能访问的目标文件。
通过上述方法,当侦测到历史数据记录不符合预设条件时,使用预设算法确定的概率值来确定目标文件,若侦测到历史数据记录符合预设条件时,则通过第一预测模型和第二预测模型预测目标文件,例如,预设条件为历史数据记录中包含的文件标识超过预设阈值,则使用第一预测模型和第二预测模型进行预测,若历史数据记录中包含的文件标识未超过预设阈值,则使用预设算法进行计算,避免耗费大量的计算资源,提高预测效率。
在一种可能的实现方式中,云虚拟机还可以生成一个预测请求,并将所述预测请求发送至一电子设备,所述预测请求用于指示所述电子设备对所述用户登录到云虚拟机后可能访问的至少一个文件进行预测;接收所述电子设备所预测的所述电子设备对所述用户登录到云虚拟机后可能访问的至少一个文件。
通过上述方法,云虚拟机也可以触发电子设备进行预测,获取电子设备的预测结果,云虚拟机上不需要进行预测,也就是不需要在云虚拟机上部署实现该预测的功能,从而避免占用云虚拟机大量的资源。
第二方面,本申请实施例还提供了一种设备,该设备包括多个功能单元,这些功能单元可以执行第一方面的方法中各个步骤所执行的功能。这些功能单元可以通过硬件实现,也可以通过软件实现。在一个可能的设计中,该设备包括传输单元以及处理单元。关于该设备实现的有益效果,请参考第一方面的描述,在此不再赘述。
第三方面,本申请实施例还提供了一种设备,该设备包括处理器和存储器,所述存储器中存储有程序指令,所述处理器运行所述存储器中的程序指令以实现第一方面所提供的方法。关于该设备实现的有益效果,请参考第一方面的描述,在此不再赘述。
第四方面,本申请还提供一种计算机可读存储介质,计算机可读存储介质中存储有指令,当其在计算机上运行时,使得计算机执行上述第一方面所提供的方法。
第五方面,本申请还提供一种计算机芯片,芯片与存储器相连,芯片用于读取并执行存储器中存储的软件程序,执行上述第一方面所提供的方法。
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a schematic diagram of the system architecture of an existing cloud virtual machine;
FIG. 2 is a schematic diagram of the system disk structure of a linked-clone cloud virtual machine;
FIG. 3 is a schematic diagram of the module interaction between an existing cloud virtual machine and a NAS disk;
FIG. 4 is a schematic diagram of the module interaction between a cloud virtual machine and a NAS disk provided by an embodiment of this application;
FIG. 5 is a schematic diagram of an application scenario provided by this application;
FIG. 6 is a schematic diagram of another application scenario provided by this application;
FIG. 7 is a schematic diagram of yet another application scenario provided by this application;
FIG. 8 is a schematic flowchart of a cloud desktop-based data access method provided by an embodiment of this application;
FIG. 9 is a schematic diagram of the algorithm structure of the LSTM model;
FIG. 10 is a schematic diagram of the process of training the LSTM model and making predictions with it, provided by an embodiment of this application;
FIG. 11 is a schematic flowchart of a third cloud desktop-based data access method provided by an embodiment of this application;
FIG. 12 is a schematic diagram of a cloud virtual machine-based data access device provided by an embodiment of this application;
FIG. 13 is a schematic diagram of a cloud virtual machine-based data access device provided by an embodiment of this application.
具体实施方式
下面将结合附图对本申请的实施方式进行详细描述。首先对本发明实施例可以应用到的系统架构和一些基本概念进行描述,以便本领域技术人员理解。
云桌面又称桌面虚拟化、云电脑,是替代传统电脑的一种新模式。采用云桌面后,用户可以通过不同终端设备,在任何地点,任何时间访问在网络上的属于我们个人的云桌面。
如图1所示,为本申请实施例可以应用到的一种云系统架构示意图,图1中左侧是虚拟桌面的客户端,通常称为瘦客户端,客户端的设备形式可以是普通计算机101a、平板电脑101b、智能手机101c等。它们通过网络102使用远程桌面协议103访问远程桌面服务。服务器204a….204n提供了远程桌面的载体,用户的虚拟桌面(Virtual Desktop)以虚拟机205a、205b¨¨205n的形式存在于服务器上。虚拟桌面管理系统106,用于提供用户的客户端与虚拟机的映射等功能。客户端首先连接到虚拟桌面管理系统106,虚拟桌面管理系统106在接收到客户端发送的连接虚拟机请求(下文称为登录请求)后,从虚拟机资源池中为该客户端分配虚拟机,客户端获取虚拟桌面管理系统106分配的虚拟机的虚拟机地址,进而连接到虚拟机,虚拟桌面管理系统106可以为服务器,也可以为普通个人计算机等,本发明对此不作具体限定。用户通过客户端访问服务器上分配给该用户的虚拟桌面(即虚拟机),该虚拟桌面将用户访问的内容传输到用户的客户端(也称为用户终端)进行显示。
简单来说,服务器204a….204n使用虚拟化技术能够虚拟出多台云虚拟机,每台云虚拟机的桌面便是云桌面,用户通过用户终端登录云虚拟机,在登录的云虚拟机进行云桌面操作。其中,本领域技术人员可以理解,所谓的用户登录云虚拟机,就是用户登录云桌面,因此,本申请实施例中,云虚拟机和云桌面有时可以混用,登录和连接可以混用,应当指出的是,在不强调其区别时,其所要表达的含义是一致的。
目前,服务器虚拟化出的云虚拟机包括链接克隆虚拟机和完整克隆虚拟机,完整克隆虚拟机占据完全独立的虚拟磁盘空间,在使用过程中,完整克隆虚拟机会与首次登录该完整克隆虚拟机的用户进行绑定,绑定后该完整克隆虚拟机只能为该用户服务,也就是,该完整克隆虚拟机不会被分配给其他用户使用。因此,完整克隆虚拟机可以将与自身绑定的用户在云桌面的全部操作数据可以永久存储于自身的虚拟磁盘内,该用户的操作数据不会随着关闭云虚拟机而清除。然而,每个完整克隆虚拟机的虚拟磁盘互相独立,互不干扰,完整克隆虚拟机占用专属的虚拟磁盘资源,当服务器的存储空间有限时,无法为更多的用户提供云虚拟机服务。
因此,链接克隆虚拟机便应运而生,链接克隆虚拟机是通过一个源虚拟机(或称为虚拟机模板/父虚拟机)克隆出的。链接克隆虚拟机的特点是多台链接克隆虚拟机共用一个相同的虚拟磁盘(下文称为系统母盘),即链接克隆虚拟机可具有同一系统母盘的配置的相同的操作系统,应用系统等。在使用过程中,链接克隆虚拟机不会与用户进行绑定。当用户通过用户终端发起登录请求后,虚拟机管理系统会为该用户随机分配一台闲置的链接克隆虚拟机。如图2所示,为了满足用户登录到链接克隆虚拟机后的个性化操作,服务器为每个链接克隆虚拟机划分各自专用的部分虚拟磁盘(下文称差分盘),并将系统母盘和差分盘组合为一个链接克隆虚拟机的系统盘(即C盘)。也就是说,用户登录云虚拟机后产生的操作数据会被临时保存在该云虚拟机的差分盘内,换而言之,链接克隆虚拟机的差分盘可以临时存储用户在该云虚拟机上产生的操作数据,该操作数据包括但不限于:该用户的所有文件、各文件的属性信息,用户对系统或应用程序的配置参数等,例如,用户创建的word文件,下载的视频、音频,对word软件的窗口颜色配置等。其中,文件的属性信息包括但不限于:该文件的图标,文件所在位置(存储路径),文件的大小,文件类型等信息。
一方面,由于链接克隆虚拟机不会与用户进行绑定,其分配的方式是随机的,为了保证该用户的操作数据不被其他用户查看到,当该用户关闭虚拟机后,该虚拟机的差分盘内保存的数据也需要被清除。
另一方面,为了使该用户下一次通过用户终端登录云虚拟机后,仍然能够访问之前在云桌面上的操作数据,目前,在用户使用云虚拟机的过程中,云虚拟机会将差分盘内保存的数据同步备份至NAS盘内。示例性的,用户在云桌面的操作会产生用户数据,比如,创建的word文件、修改后的Excel文件、下载的视频和音频等。用户在云桌面产生的以上用户数据会被暂时存储在该云虚拟机的差分盘中,并同步备份到NAS盘中该用户的账户中。其中,该NAS盘还可以存储其他云虚拟机用户的用户数据,该用户数据包含有该用户的所有文件数据、文件的属性信息以及系统或应用的配置信息等。
相应的,当该用户通过用户终端再次登录云虚拟机时,云虚拟机可以从NAS盘将该用户的全部用户数据下载至本地,以供用户无缝切换。但该方式下,云虚拟机只有将该用户的全部用户数据下载完成后,才能供用户使用,因此该方式会影响用户开启云虚拟机的时间。
目前,为了缩短云虚拟机的启动时间,一种实现方式为,由NAS盘保存用户的所有文件的属性信息,其中,文件的属性信息包括但不限于该文件的大小、文件类型、保存位置和文件图标等。当用户登录到云虚拟机时,云虚拟机从NAS盘中获取该用户的所有文件的属性信息,并根据获取到的所有文件的属性信息虚拟化出该用户的文件系统,其中,虚拟化出的文件系统与用户实际在云桌面上的文件系统相同,能够供用户浏览以及提供用户访问文件的入口,但该虚拟化出的文件系统中的文件没有存在本地,还是存在NAS中,也就是,当该用户需要访问某个文件时,云虚拟机发送访问文件的请求,去NAS盘下载该访问文件请求对应的文件至云虚拟机本地。
如图3所示,为上述实现方式中云虚拟机与NAS盘的模块交互示意图,云虚拟机包括但不限于底层和应用层,底层包括文件管理驱动(file monitor driver),应用层包括文件管理应用程序(exlporer app)和配置文件应用程序(profile app)。
当用户登录云虚拟机时,文件管理驱动触发配置文件应用程序,从NAS盘下载该用户的所有文件的属性信息,文件管理驱动基于配置文件应用程序下载的所述所有文件的属性信息,虚拟化出该用户的虚拟化文件系统,当虚拟化文件系统构建完成时,用户便登录到云虚拟机上。
当登录到云虚拟机后,用户可以通过虚拟化文件系统进行操作,文件管理应用程序能够响应用户在虚拟化文件系统上对文件的操作,并产生I/O请求。比如,用户访问虚拟化文件系统中的某文件时,文件管理应用程序响应用户读文件的操作,并产生读文件的I/O请求,并在该I/O请求中携带该文件的文件标识。
文件管理驱动可以监听文件管理应用程序产生的I/O请求,并根据监听到的I/O请求触发文件管理应用程序执行相应的操作。
示例性的,如文件管理驱动监听到文件管理应用程序产生的I/O请求为读文件请求,且确定本地不存在该I/O请求中携带的文件标识对应的文件,也就是,用户访问了本地不存在的文件时,文件管理驱动触发配置文件应用程序从NAS盘下载所述读文件的I/O请求中携带的文件标识对应的文件至云虚拟机本地差分盘内。
作为另一种示例,假设用户在云桌面创建了名称为“工作”的word文件,文件管理应用程序响应用户的该操作,并产生写文件的I/O请求,并在所述写文件的I/O请求中携带所述名称为“工作”的文件的信息,所述信息包括该文件的属性信息和该文件中的数据。文件管理驱动监听到文件管理应用程序产生的该写文件的I/O请求,文件管理驱动触发配置文件应用程序将所述写文件的I/O请求中携带的名称为“工作”的文件的信息同步上传至NAS盘中。后续,若用户在云桌面内删除或修改了所述名称为“工作”的文件时,文件管理驱动也会触发配置文件应用程序同步更新NAS盘内所述名称为“工作”的文件。
通过上述实现方式,由于云虚拟机在启动的时候,可以根据获取到的所有文件的属性信息构建虚拟化文件系统,而不需要把NAS中存储的文件下载至云虚拟机本地,所以缩短了云虚拟机的启动时间,但是,由于用户在访问某个文件时,首先需要从NAS中下载所访问的文件至云虚拟机本地,所以延长了用户访问文件的时间,尤其对于比较大的文件,云虚拟机从NAS盘下载该文件的时间就比较长,导致用户访问该文件的时延较大,从而影响用户体验。
鉴于此,本申请提供了一种基于云桌面的数据访问方法,在用户登录云虚拟机时,根据用户在云虚拟机中所生成文件的历史数据记录,预测该用户登录到云虚拟机后可能会访问的文件,云虚拟机并将所预测的文件从NAS下载至云虚拟机,这样,在用户登陆 虚拟机后,由于可能访问的文件已经下载至云虚拟机,所以缩短了这些文件的访问时延。
下面结合具体的附图对本申请技术方案进行具体介绍。
如图4所示,为本申请中的软件模块架构中各软件模块交互的示意图,与图3相比,在图4中增加预测系统。该预测系统为一预测服务,也可以理解为能够实现本申请中预测功能的应用程序。下面以图4中的云虚拟机为图1中的云虚拟机为例,详细说明本申请的过程:
当用户登录云虚拟机时,一方面,文件管理驱动通过配置文件应用程序,从NAS盘获取该用户的所有文件的属性信息,并基于获取的所述所有文件的属性信息,虚拟化出该用户的虚拟化文件系统。具体步骤请参见对图3中的相关介绍,此处不再赘述。
另一方面,预测系统获取该用户的历史数据记录,并基于该用户的历史数据记录预测该用户的目标文件,即该用户登录到云虚拟机后将会访问的文件,应理解,实际上预测系统预测出的为目标文件的文件标识,具体预测方法将在下文做详细描述。预测完成后,预测系统触发配置文件应用程序从NAS中下载预测到的文件标识对应的文件至云虚拟机本地,即将下载的文件保存至虚拟化文件系统中该文件的对应存储路径下,当用户通过虚拟化文件系统访问该文件时,文件管理应用程序响应用户读取该文件的操作,用户可以直接访问该文件,同时文件管理应用程序针对该用户读取该文件的操作产生读文件的I/O请求。文件管理驱动监听到文件管理应用程序产生的读文件的I/O请求,并确定本地保存有该文件,避免当监听到用户读文件的I/O请求后,再触发配置文件应用程序从NAS盘下载待读取的文件,以此,缩短了用户访问该文件的时延。
其中,历史数据记录包含了在接收到登录请求之前,该用户在云虚拟机上所访问的所有的文件的记录。
需要说明的是,本申请实施例中用户的历史数据记录可以存储于云服务器中,也可以存储于云服器之外的一存储介质中,例如云服务器之外的一台独立的服务器,或云存储资源(NAS或360云盘等),其中,本申请中不同用户的用户数据还可以存储于其他存储介质中。以及本申请中的预测系统可以部署于云虚拟机上,也可以部署于云虚拟机之外的电子设备上,例如云服务器或者独立于云服务器或云虚拟机之外的另一存储介质中,其中,预测系统和历史数据可以部署于同一装置或同一存储介质上,也可以部署于不同的装置或不同的存储介质上。下面针对不同部署方式对应的应用场景进行介绍说明:
如图5所示,为本申请实施例提供的一云虚拟机上部署该预测系统的应用场景的具体示例。该示例中,虚拟机管理系统响应用户通过用户终端触发的登录请求,为该用户(鉴权通过)的用户终端分配云虚拟机,并将该用户的登录请求转发给该云虚拟机,一方面,云虚拟机与用户终端建立连接,另一方面,云虚拟机根据登录请求中携带的该用户的身份信息(例如账户名称和密码),获取该用户的历史数据记录,并基于云虚拟上部署的预测系统以及获取的历史数据记录,预测该用户登录到云虚拟机后可能访问的目标文件的文件标识,并从NAS盘中下载预测到的文件标识对应的文件至云虚拟机本地差分盘中。当用户终端连接到云虚拟机后,用户通过用户终端在云桌面进行操作。
如图6的所示,为本申请提供的又一应用场景的具体示例,该示例中,预测系统部署在云服务器上,下面对该示例的具体流程进行介绍:
图6与图5相同的部分请参见对图5中相关部分的具体介绍,此处不再赘述。图6与图5不同的是,图6中的云虚拟机上未部署预测系统,而是部署于云服务器上,因此,图6中在云虚拟机响应用户的登录请求时,还需要触发云服务器上部署的预测系统对该 用户登录到云虚拟机后可能访问的目标文件进行预测,即,预测系统获取该用户的历史数据,并基于获取到的历史数据进行预测,将预测到的目标文件的文件标识发送给云虚拟机,云虚拟机从NAS盘中下载所述文件标识对应的文件至云虚拟机本地差分盘中。
其中,触发预测系统进程预测的方式有多种,下面列举两种:
触发方式一,通过云虚拟机来触发;
云虚拟机接收到虚拟机管理系统转发的该用户的登录请求时,云虚拟机触发预测系统对该用户登录到云虚拟机后可能访问的文件进行预测。
示例性的,云虚拟机接收到该用户的登录请求后,通过配置文件应用程序向预测系统发送预测请求,该预测请求携带该用户的身份信息,例如,该用户的账户信息。对应的,预测系统接收云虚拟机的配置文件应用程序发送的预测请求,从历史数据存储介质中获取该用户的历史数据,并基于该历史数据对该用户登录到云虚拟机后可能访问的文件的文件标识进行预测。
作为又一种示例,当由云虚拟机来触发预测系统进行预测时,可以由云虚拟机来获取该用户的历史数据,云虚拟机在向预测系统发送预测请求时,还可以将获取的该用户的历史数据发送给预测系统。
触发方式二,通过虚拟机管理系统来触发;
虚拟机管理系统对该用户的身份信息鉴权通过后,向预测系统发送预测请求,该预测请求携带有该用户的身份信息。对应的,预测系统接收虚拟机管理系统发送的预测请求,从历史数据存储介质中获取该用户的历史数据,并基于该历史数据对该用户登录到云虚拟机后可能访问的文件的文件标识进行预测。
当用户终端连接到云虚拟机后,用户通过用户终端进行云桌面的操作,云虚拟机根据用户操作更新历史数据存储介质中该用户的历史数据记录,以及更新NAS中该用户的用户数据。
如图7所示,为本申请提供的又一应用场景的具体示例,该实例中,预测系统部署于除云虚拟机和云服务器之外的其他一存储介质(下文称为预测系统存储介质)中,下面对该示例的具体流程进行介绍:
图7与图6或图5相同的部分请参见对图6或图5中相关部分的具体介绍,此处不再赘述。下面以图6为例,介绍图7与图6的不同之处,图7与图6不同之处在于,图7的预测系统并未部署于云服务器上,而是部署于云虚拟机和云服务器之外的一单独的存储介质上,因此,图7中在云虚拟机响应用户的登录请求时,还需要触发预测系统对该用户进行目标文件的预测,以使预测系统获取该用户的历史数据记录,并基于获取的历史数据记录预测该用户登录到云虚拟机后可能访问的目标文件的文件标识。预测完成后,预测系统将预测到的目标文件的文件标识发送给云虚拟机,云虚拟机再从NAS盘中下载预测到的文件标识对应的文件至云虚拟机本地差分盘中。
需要说明的是,本申请所述的预测完成,可以是每预测到一个目标文件的文件标识后,将该预测到的文件标识发送给云虚拟机,或将预设数量的目标文件的文件标识全部预测完成后,将预测的全部文件标识一起发送给云虚拟机。本申请实施例并不作限定预测系统与云虚拟机的交互方式。
下面以图5所示的应用场景为例,对本申请实施例的技术方案进行具体介绍。
如图8所示,本申请实施例提供了一种基于云虚拟机的数据访问的流程,该流程中的用户终端、虚拟桌面管理系统、云虚拟机、NAS盘和历史数据存储介质可分别为上述 图7中的用户终端、虚拟桌面管理系统、云虚拟机、NAS盘和历史数据存储介质,该流程包括:
步骤800,用户通过用户终端发送登录请求。
用户操作用户终端发送登录请求,该登录请求中携带该用户的身份信息,例如,该用户在用户终端输入的登录云虚拟机的账户名称和密码。
步骤801,虚拟桌面管理系统接收该用户发送的登录请求后,对该用户的身份信息进行鉴权,若鉴权通过,则为该用户的用户终端分配云虚拟机。
虚拟桌面管理系统接收到该用户发送的登录请求后,对该用户的身份信息进行鉴权,若鉴权通过后,则为该用户的用户终端分配从虚拟机池中选择的空闲的云虚拟机,若鉴权不通过,则不响应该用户的登录请求,退出该流程。
步骤802,虚拟桌面管理系统将该用户的登录请求发送给该被分配的云虚拟机。
步骤803,云虚拟机接收该用户的登录请求。
步骤804,云虚拟机响应该用户的登录请求,与该用户终端建立连接,并获取该用户的历史数据记录。
被分配的云虚拟机在接收到虚拟机桌面管理系统发送的登录请求时,一方面,该云虚拟机与该用户的用户终端建立连接(例如HDP连接),其中,建立连接的过程包括:云虚拟机从NAS中获取该用户的所有文件的属性信息,根据获取到的所有文件的属性信息构建虚拟化文件系统,当虚拟化文件系统构建完成时,即云虚拟机与用户终端建立连接完成,该用户便登录到云虚拟机上,继而可以通过虚拟化文件系统进行云桌面的操作。另一方面,云虚拟机获取用户的历史数据记录,并根据获取的历史数据记录对该用户登录到云虚拟机后可能访问的文件进行预测,对于具体的预测过程请参见步骤805和步骤806的介绍。
需要说明的是,预测系统还可以部署于云虚拟机之外的存储介质上,若预测系统部署于云虚拟机之外的存储介质上时,则对于步骤804,云虚拟机在响应该用户的登录请求,还需要触发预测系统对该用户进行目标文件的预测。其触发方式可以参见上述触发方式一或触发方式二的具体描述,此处不再赘述。
步骤805,云虚拟机上部署的预测系统基于获取的该用户的历史数据记录,预测该用户登录到云虚拟机后可能访问的目标文件的文件标识。
若预测系统未部署于云虚拟机上时,当预测系统预测完成后,还需要将预测到的该用户对应的文件标识发送给云虚拟机。
步骤806,该云虚拟机从NAS中下载预测系统通知的文件标识对应的文件至云虚拟机本地差分盘中。
需要对于步骤806进行说明的是,本申请并不限定预测系统预测用户可能访问的目标文件的文件标识的时间,可以是在用户登录云虚拟机的过程中,也可以是用户登录到云虚拟机后,或用户在使用云虚拟机的过程中,还可以是在用户使用完云虚拟机,用户终端退出云虚拟机后,例如当该用户的用户终端退出云虚拟机后,预测系统获取该用户的历史数据记录,对该用户下次登录云虚拟机可能访问的文件进行预测,并保存预测到的文件的文件标识,当该用户下一次登录云虚拟机时,云虚拟机在接收到该用户的登录请求后,直接获取保存的预测到的该用户对应的文件的文件标识,而不需要在云虚拟机接收到该用户的登录请求后,预测系统再开始预测,进一步缩短了用户访问某些文件的时延。本申请实施例对此不作限定。
其中,本申请实施例中的预测系统预测目标文件的方式有多种,下面列举两种:
预测方式一:根据预设算法计算历史数据记录中包含的各文件的被访问权重,并根据各文件的被访问权重的大小确定目标文件。
本申请实施例中,用户的历史数据记录包含该用户在使用云桌面过程中产生的所有的文件访问记录,比如,该用户访问文件的时间信息、被访问的文件的文件标识(例如文件名称)、被访问的文件的目录层级信息、被访问的文件的大小等信息。示例性的,历史数据记录包含的一条文件访问记录为,该用户在1998年12月1日1:30:06访问了云桌面中文件标识为A的文件,该文件的大小为3M,目录层级为2层。
针对历史数据记录中包含的任意一个文件标识,预测系统根据历史数据记录获取该文件标识对应的文件的访问参数,其中,访问参数包括但不限于:文件的最新访问时间、访问次数、目录层级和文件大小等,预测系统根据各访问参数被分配的权重,计算该文件标识对应的文件的被访问权重,进一步,再根据该历史数据记录中包含的各文件标识对应的文件的被访问权重的大小确定目标文件。
需要说明的是,对于预测方式一对应的预测系统而言,该预测系统可以部署于云服务器上,或部署于云虚拟机上。示例性的,若将基于预测方式一对应的预测系统部署于云虚拟机上时,则云虚拟机针对将登录的用户进行目标文件的预测,预测完成后直接根据预测到的目标文件的文件标识从NAS中进行下载,减少系统间的交互。
作为又一示例,若将预测方式一对应的预测系统部署于云服务器时,云服务器还需要将预测到的目标文件的文件标识通知给该用户被分配的云虚拟机,云虚拟机再根据云服务器的通知,去NAS中下载对应的文件。在该场景下,一方面,云服务器和云虚拟机的交互会造成资源浪费。另一方面,交互过程也会增加时延。另外,由于该预测方式一的计算方式简单,不需要占用较多的存储资源,因此,本申请优选将基于预测方式一的预测系统部署于云虚拟机。
下面以将基于预测方式一的预测系统部署于云虚拟机为例,对预测方式一的具体流程进行详细介绍:
示例性的,历史数据存储介质中存储有不同用户在使用云桌面时产生的所有的文件访问记录,也就是,历史数据记录由该用户的每一条文件访问记录组成,换而言之,本申请云虚拟机获取到的该用户的历史数据记录,为在云虚拟机接收到该用户的登录请求之前,该用户在云虚拟机中产生的所有的文件访问记录。其中,文件访问记录包含但不限于下列中的部分或全部:
该用户访问文件的时间、被访问的文件的文件标识、被访问的文件的目录层级或被访问的文件的大小。
示例性的,如下表1所示,为云虚拟机获取到的该用户的历史数据记录。
表1
(表1以图片形式记载,内容为该用户的各条文件访问记录,包括访问时间、被访问的文件的文件标识、目录层级和文件大小等)
需要说明的是,云虚拟机在预测该用户对应的目标文件的文件标识时,并非需要获取该用户的全部历史数据记录,也可以获取一段时间内该用户的历史数据记录,本申请实施例对此并不作限定。例如,云虚拟机获取当前时间之前6个月的文件访问记录,并基于该6个月内该用户的文件访问记录进行预测,以此减少预测系统的计算量。
云虚拟机根据预设算法计算获取的历史数据记录中包含的各文件标识对应的文件的被访问权重,并根据各文件的被访问权重的大小确定目标文件的文件标识,并从NAS中下载确定的文件标识对应的文件至本地差分盘内。
具体的,预设算法为,根据文件的各访问参数以及各访问参数被分配的权重来计算该文件的被访问权重,其中,该访问参数包括但不限于下列中的部分或全部:
最后访问时间、被访问次数、目录层级或文件的大小。
假设预设算法为下述算法1。
weight=μ1×β1+μ2×β2+μ3×β3+μ4×β4    算法1
其中,weight表示该文件的被访问权重,μ1表示文件的被访问次数,β1为μ1的权重系数,μ2表示文件的最后访问时间,β2为μ2的权重系数,μ3表示文件的目录层级,β3为μ3的权重系数,μ4表示文件的大小,β4为μ4的权重系数。
应理解的是,若文件的被访问次数越大,则用户再次访问该文件的概率也就越高;文件的最后访问时间离当前时间越近,则用户再次访问该文件的概率也就越高;对于文件的目录层级而言,用户可能将经常访问的文件置于比较浅的层级中,因此,若文件的目录层级较浅,则其被访问的概率相对于目录层级较深的文件也就越高;文件的大小越大,则云虚拟机从NAS中下载该文件的用时越长,越容易导致用户访问某些文件的时延增加,影响用户体验。例如,两个文件除文件的大小之外的其他访问参数都相同,则两个文件中,较大的文件被作为目标文件的概率更高。
需要说明的是,历史数据记录中记录的文件的访问时间为绝对时间,比如,2018年11月2日9:12为一个绝对时间,该时间不具有可比性,本申请实施例通过算法1计算各文件的被访问权重时,需要将各文件的访问参数中的最后访问时间进行UTC转换,即以UTC为时间基准,对绝对时间进行转换,具体转换方式如下:
其中,UTC时间起点为1970年1月1日的00:00:00.000,以根据历史数据记录得到的该文件的最后访问时间(绝对时间)距1970年1月1日的00:00:00.000的偏移量,作为计算该文件的被访问权重时,该文件的最后访问时间。例如,1970年1月1日的01:01:00.000与世界标准时间的偏移量为61分钟。
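下面给出一段将绝对访问时间转换为距UTC时间起点的偏移量的Python示意,其中的时间格式与函数名均为本文假设,实际实现不限于此:

```python
from datetime import datetime, timezone

def to_utc_offset_minutes(last_access_time: str) -> float:
    """将形如"1970-01-01 01:01"的绝对时间转换为距1970年1月1日00:00:00.000(UTC)的分钟偏移量。"""
    dt = datetime.strptime(last_access_time, "%Y-%m-%d %H:%M").replace(tzinfo=timezone.utc)
    return dt.timestamp() / 60.0

# 例如:1970年1月1日01:01与UTC时间起点的偏移量为61分钟
print(to_utc_offset_minutes("1970-01-01 01:01"))  # 输出 61.0
```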
云虚拟机通过上述算法1确定获取到的历史数据记录中包含的所有文件标识对应的文件的被访问权重,并根据各文件的被访问权重的值从大到小的顺序,选取被访问权重较大的文件作为目标文件。
通过比较文件的访问参数和表1(历史数据记录)可知,历史数据记录中并没有记载文件的被访问次数,因此,在通过算法1计算各文件的被访问权重之前,应根据获取到的历史数据记录确定历史数据记录中包含的各文件标识对应的文件的访问参数,再根据确定的访问参数,使用算法1计算各文件的被访问权重。
下面以表1为例,对根据历史数据记录确定文件的访问参数的过程进行详细说明:
针对表1中包含的任一文件标识,云虚拟机遍历表1,以统计包含该文件标识的文件访问记录的条数,即为该文件标识对应的文件的被访问次数,再将包含该文件标识的文件访问记录中,与当前时间最接近的访问时间作为该文件标识对应的文件的最后访问时间,根据记录有最后访问时间的文件访问记录确定该文件标识对应的文件的目录层级和文件的大小。
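下面给出一段按上述方式从文件访问记录中统计访问参数、并按算法1计算各文件的被访问权重的Python示意,其中的数据结构、字段名与示例数值均为本文假设,权重系数仅为举例:

```python
# 每条文件访问记录:(访问时间的UTC偏移量(分钟,数值仅为示意), 文件标识, 目录层级, 文件大小(MB))
records = [
    (25673352.0, "A", 2, 2),
    (25674820.0, "B", 3, 4),
    (25672034.0, "C", 3, 4),
]

def access_params(records):
    """遍历历史数据记录,统计每个文件标识的被访问次数、最后访问时间、目录层级和文件大小。"""
    params = {}
    for t, fid, level, size in records:
        p = params.setdefault(fid, {"count": 0, "last": 0.0, "level": level, "size": size})
        p["count"] += 1
        if t > p["last"]:  # 取与当前时间最接近的访问时间作为最后访问时间,并取该条记录的目录层级和大小
            p["last"], p["level"], p["size"] = t, level, size
    return params

def weight(p, b1=0.4, b2=0.3, b3=0.2, b4=0.1):
    """算法1:weight = μ1×β1 + μ2×β2 + μ3×β3 + μ4×β4。"""
    return p["count"] * b1 + p["last"] * b2 + p["level"] * b3 + p["size"] * b4

weights = {fid: weight(p) for fid, p in access_params(records).items()}
```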
如下表2所示,为云虚拟机基于上述方式从表1中统计出的各文件的访问参数。
表2
(表2以图片形式记载,内容为从表1中统计出的各文件标识对应的文件的被访问次数、最后访问时间、目录层级和文件大小)
下面针对表2所示的各文件的访问参数,以算法1为例,对使用预设算法计算表2中各文件的被访问权重的过程进行详细介绍。
假设β1的值为0.4,β2的值为0.3,β3的值为0.2,β4的值为0.1,则云虚拟机确定文件标识A对应的文件的被访问权重满足weightA=0.4×2+0.3×T_A+0.2×2+0.1×2;文件标识B对应的文件的被访问权重满足weightB=0.4×3+0.3×T_B+0.2×3+0.1×4;文件标识C对应的文件的被访问权重满足weightC=0.4×2+0.3×T_C+0.2×3+0.1×4。其中,上述T_A为表2中文件标识A对应的文件的最后访问时间2018年11月2日9:12与世界标准时间之间的偏移量;T_B为表2中文件标识B对应的文件的最后访问时间2018年11月3日9:40与世界标准时间之间的偏移量;同样的,T_C为表2中文件标识C对应的文件的最后访问时间2018年11月1日12:14与世界标准时间之间的偏移量。依次类推,通过上述方式计算文件标识D和文件标识E对应的文件的被访问权重。
当确定出各文件标识对应的文件的被访问权重后,根据各文件的被访问权重的值按照从大到小进行排序,选取被访问权重的值较大的文件作为目标文件,其中,云虚拟机确定目标文件的数量的方式有多种。下面举例说明:
假设各文件按照被访问权重的值的排序为D>B>A>C>E,一种可实现的方式为,通过预设目标文件的数量的方式进行确定,例如,假设预设目标文件的数量为3,则预测系统选取排序中前3个文件标识作为目标文件的文件标识,即预测系统确定目标文件的文件标识为D、B和A,云虚拟机在顺序下载完文件标识为D、B和A的文件后即停止下载。另一种可实现的方式为,根据预设的被访问权重的阈值进行确定,比如,预设的被访问权重的阈值为0.8,则被访问权重的值不小于0.8的文件为目标文件。例如,云虚拟机确定各文件的被访问权重的值的排序为D>B>A>C>E,其中,文件标识C对应的文件的被访问权重的值为0.8,则云虚拟机确定目标文件的文件标识为D、B、A和C,云虚拟机在顺序下载完文件标识为D、B、A和C的文件后停止下载。
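结合上述两种确定目标文件数量的方式,下面给出一段根据被访问权重选取目标文件的文件标识的Python示意,其中的权重数值与阈值均为本文假设:

```python
def select_targets(weights: dict, top_n: int = None, threshold: float = None):
    """按被访问权重从大到小排序,按预设数量或按权重阈值选取目标文件的文件标识。"""
    ranked = sorted(weights, key=weights.get, reverse=True)
    if top_n is not None:
        return ranked[:top_n]                                   # 方式一:选取排序中前top_n个文件标识
    return [f for f in ranked if weights[f] >= threshold]      # 方式二:选取权重不小于阈值的文件

# 假设排序为 D > B > A > C > E(数值仅为示意)
weights = {"D": 0.95, "B": 0.9, "A": 0.85, "C": 0.8, "E": 0.6}
print(select_targets(weights, top_n=3))        # ['D', 'B', 'A']
print(select_targets(weights, threshold=0.8))  # ['D', 'B', 'A', 'C']
```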
需要说明的是,目标文件的排序可以表示用户登录到云虚拟机后访问文件的顺序,例如,预测系统确定目标文件的文件标识为D、B、A和C,则文件标识D对应的文件即为预测到的该用户登录到云虚拟机后首个访问的文件,文件标识B对应的文件为预测得出的用户访问文件标识D对应的文件之后顺次访问的文件,文件标识A对应的文件为预测得出的用户访问文件标识B对应的文件之后顺次访问的文件,依次类推。对应的,云虚拟机按照该访问权重的排序,顺次从NAS盘中下载文件标识为D的文件,文件标识为D的文件下载完成后,再下载文件标识为B的文件,文件标识为B的文件下载完成后,再下载文件标识为A的文件,依次类推,直到按照被访问权重的顺序将所有预测到的目标文件全部下载完成为止。
上述介绍的为通过预测方式一进行预测的具体示例,在一种可选的场景中,预测系统还可以通过预测方式二进行预测,下面通过下述实施例对预测方式二的具体预测流程进行介绍。
预测方式二:通过深度学习算法模型预测目标文件。
本申请实施例还可以基于用户的历史数据记录得到的训练样本,对深度学习算法模型(下文称预测模型)进行训练,以使训练后的预测模型掌握该用户访问文件的访问习惯,预测系统基于该用户的预测模型预测该用户在登录到云虚拟机后可能访问的目标文件的文件标识。
本申请实施例中的预测模型为对深度学习算法进行训练后得到的深度学习算法模型,本申请实施例可以应用的深度学习算法模型包括但不限于:LSTM、SVM(支持向量机,support vector machine)、朴素贝叶斯、决策树、神经网络、RNN(Recursive Neural Network,递归神经网络/Recurrent Neural Network,循环神经网络)、DBSCAN(Density-Based Spatial Clustering of Applications with Noise)等。
其中,LSTM模型是一种时间递归型神经网络,适用于处理和预测时间序列中间隔和延迟相对较长的重要事件,也就是,能够基于较长一段历史时间内每个时间点发生的事件,来预测未来某个时间点可能会发生的事件。通常,通过为时间点编号,即通过时间次序来记录各个时间点发生的事件,比如,表1所示的历史数据记录中,表1第一行记录的用户访问文件A的时间次序为1,表1第二行记录的用户访问文件B的时间次序为2,表1第三行记录的用户访问文件C的时间次序为3,依次类推。LSTM模型能够根据历史数据记录中各时间次序对应的用户访问的文件来推测下一时间点该用户将访问的文件,例如,针对表1而言,LSTM模型能够针对时间次序1~9对应的用户访问的各文件,来推测下一时间点,即时间次序10,用户可能访问的文件。
因此,本申请实施例中的预测模型优选LSTM模型。在介绍本申请通过LSTM模型进行目标文件的预测前,首先对LSTM模型的结构和原理进行介绍:
如图9所示,为LSTM模型的结构和预测过程示意图。
首先对LSTM的结构进行介绍,如图9所示,LSTM模型包含遗忘门(forget)、输入门(input)和输出门(output)以及单元状态C,其中单元状态C为实现时间递归的算法结构。
其中,LSTM的输入有两个,分别为输入x_t和h_(t-1),其中,x_t表示待预测的时间次序,另一输入h_(t-1)为用户在上一时间次序x_(t-1)访问的文件的文件编号。通常,LSTM模型的输入为数值,因此,需要为用户实际访问的文件预定义文件编号,该文件编号为该文件针对同一用户的唯一数字编号,文件标识与文件编号具有对应关系,例如,文件标识为a的文件的文件编号为1,文件标识为b的文件的文件编号为2,依次类推,基于该用户的历史数据记录中的所有文件标识进行预定义,分别为每个文件标识定义唯一对应的数字编号。
LSTM的输出h_t,为预测的用户在时间次序x_t时访问的文件的文件编号。
下面对LSTM的预测流程进行介绍,参见图9,假设用户在时间次序x_(t-3)访问了文件标识为file_(t-3)的文件,file_(t-3)对应的文件编号为h_(t-3);在时间次序x_(t-2)访问了文件标识为file_(t-2)的文件,file_(t-2)对应的文件编号为h_(t-2);在时间次序x_(t-1)访问了文件标识为file_(t-1)的文件,file_(t-1)对应的文件编号为h_(t-1);h_(t-4)为用户在时间次序x_(t-4)访问的文件的文件编号。在图9的示例中,LSTM当前要预测时间次序x_t用户将要访问的文件的文件编号h_t,则图9中LSTM模型分别执行①②③④步骤,以对h_t进行预测。
从图9可知,输入门input的输出为i_t,i_t和模型左侧的tanh的输出进行点积相乘后输入至单元状态C。其中,i_t的算法参见算法2。
i_t=σ(W_xi×x_t+W_hi×h_(t-1)+W_ci×c_(t-1)+b_i)    算法2
遗忘门forget的输出为f_t,f_t与c_(t-1)点积相乘后输入至单元状态C,用于计算c_t,其中,f_t的算法参见算法3,c_t的算法参见算法4。
f_t=σ(W_xf×x_t+W_hf×h_(t-1)+W_cf×c_(t-1)+b_f)    算法3
c_t=f_t⊙c_(t-1)+i_t⊙tanh(W_xc×x_t+W_hc×h_(t-1)+b_c)    算法4
通过算法4可以得出,c_t的取值依赖于c_(t-1),而c_(t-1)依赖于c_(t-2),c_(t-2)又依赖于c_(t-3),依次类推,也就是,单元状态c_t保留了对未来输出有影响的历史信息,单元状态C能够根据时间次序以及每个时间次序访问的文件的文件标识,计算决定h_t的c_t。
输出门output的输出为o_t,o_t与tanh(c_t)点积相乘后得到h_t。其中,o_t的算法参见算法5,h_t的算法参见算法6。
o_t=σ(W_xo×x_t+W_ho×h_(t-1)+W_co×c_(t-1)+b_o)    算法5
h_t=o_t⊙tanh(c_t)    算法6
其中,上述算法中包含的W_xi、W_hi、W_ci、b_i、W_xf、W_hf、W_cf、b_f、W_xc、W_hc、b_c、W_xo、W_ho、W_co、b_o为LSTM模型中的静态参数,可以通过训练LSTM模型得出;σ表示sigmoid激活函数。
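为便于理解算法2至算法6之间的关系,下面给出单个LSTM时间步计算的一段numpy示意,静态参数以随机标量代替,仅用于说明公式结构;实际参数需通过训练得到,且实际实现中通常为矩阵和向量:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, P):
    """按算法2~算法6计算一个时间步:输入x_t、上一时刻输出h_(t-1)与单元状态c_(t-1),返回h_t与c_t。"""
    i_t = sigmoid(P["W_xi"] * x_t + P["W_hi"] * h_prev + P["W_ci"] * c_prev + P["b_i"])   # 算法2
    f_t = sigmoid(P["W_xf"] * x_t + P["W_hf"] * h_prev + P["W_cf"] * c_prev + P["b_f"])   # 算法3
    c_t = f_t * c_prev + i_t * np.tanh(P["W_xc"] * x_t + P["W_hc"] * h_prev + P["b_c"])   # 算法4
    o_t = sigmoid(P["W_xo"] * x_t + P["W_ho"] * h_prev + P["W_co"] * c_prev + P["b_o"])   # 算法5
    h_t = o_t * np.tanh(c_t)                                                              # 算法6
    return h_t, c_t

# 静态参数以标量随机数示意
rng = np.random.default_rng(0)
P = {k: rng.normal() for k in ["W_xi", "W_hi", "W_ci", "b_i", "W_xf", "W_hf", "W_cf",
                               "b_f", "W_xc", "W_hc", "b_c", "W_xo", "W_ho", "W_co", "b_o"]}
h, c = 0.0, 0.0
for x in [1, 2, 3]:  # 依次输入时间次序,单元状态c保留了历史信息
    h, c = lstm_step(x, h, c, P)
```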
以上为对LSTM模型结构和原理的基本介绍,下面对LSTM模型的训练过程进行介绍:
本申请实施例是使用用户在历史时间内不同时间次序访问的文件的文件编号作为训练样本进行训练,以预测用户在距当前的时间次序之后的下一时间次序将会访问的文件的文件编号。
如图10(a)所示,为训练LSTM模型过程的示意图,其中,x表示时间次序,h_1~h_n表示用户实际访问的文件的文件编号,且不同时间次序访问的文件可能相同,也就是输入的训练样本中,不同时间次序的文件编号可能相同。
其中,文件编号与文件的文件标识具有对应关系,例如,用户在时间次序x_1~x_n访问的不同文件有3个,各文件的名称分别为“abc”、“123”、“学习”,则该3个文件与文件编号的对应关系可以如下表3所示。
表3
文件名称 文件编号
abc 1
123 9
学习 5
需要说明的是,上述对应关系仅为举例,各文件名称对应的文件编号可以随机定义,或按照由大到小或由小到大的顺序定义。由于每个用户建立的文件不同,每个用户的历史数据记录中包含的文件标识(文件名称)与文件编号的对应关系不同,但同一个用户的同一文件标识对应的文件编号相同。
假设用户在时间次序x_1访问了文件“abc”,基于上述对应关系可知,文件“abc”的文件编号为1;用户在时间次序x_2访问了文件“123”,文件“123”的文件编号为9;用户在时间次序x_3访问了文件“学习”,文件“学习”的文件编号为5;输出h_t表示LSTM模型预测的用户在时间次序x_t访问的文件的文件编号。
在训练LSTM模型时,将[x_1,1]、[x_2,9]、[x_3,5]作为训练样本输入至LSTM模型,输出h_n为LSTM模型预测的用户在时间次序x_n可能访问的文件的文件编号。若该LSTM模型基于文件编号1预测出x_2对应的文件编号h_2为9,则无须调整LSTM模型的各静态参数;若预测得到的h_2不为9,则调整LSTM模型的各静态参数,以使h_2的输出为9。同样的,在预测x_3时间次序对应的h_3时,若基于[x_1,1]和[x_2,9]预测出x_3对应的h_3为5,则无须调整LSTM模型的各静态参数;若预测得到的h_3不为5,则调整LSTM模型的各静态参数,以使h_3的输出为5。依次类推,通过大量的训练样本训练LSTM模型,以使该LSTM模型的算法规律能够学习该用户访问文件的规律,或者说是访问文件的习惯。
简单来说,本申请训练后的LSTM模型基于该用户的历史访问文件的记录,通过数学规律体现该用户访问文件的规律,通过对用户的历史访问文件的记录总结出的数学规律来预测用户下一次登录后将会访问的文件(文件编号)。
比如,用户每次登录云虚拟机后都会访问文件名称为“工作”的文件夹,该文件夹下包含的为用户从工作以来创建的所有文件,通过用户的历史访问文件的记录可以得出,用户每隔一周便会在“工作”的文件夹下创建1~2个新的word文件,且用户在该周内每次登录云虚拟机都会随机打开该周创建的该1~2个word文件。因此,当通过上述数据对应的训练样本对LSTM模型进行训练时,可以让该LSTM模型通过数学规律体现该用户访问文件的规律,该规律用于确定每个文件编号之间的关联性,该关联性可以体现为与当前时间次序最接近的时间次序(两个时间次序访问的文件相同)和/或在该文件编号之后其他文件编号出现的概率值之间的关联。当用户下一次登录云虚拟机时,LSTM模型可以根据该规律,得到与当前最新的时间次序下用户访问的文件关联度最高的文件编号,并输出该文件编号,该文件编号即为下一时间次序用户可能访问的文件的文件编号。也就是,LSTM模型能够根据输入的数据,学习该用户在不同时间次序访问文件的规律, 该规律可以体现为,根据输入的各时间次序对应的文件编号,确定各文件编号之间的关联性,并预测与下一时间次序之前的一个时间次序对应的文件编号的关联性最高的文件编号,通过该规律预测下一时间次序用户可能访问的文件(文件编号)。
下面对该LSTM模型的使用过程(或称为预测过程)进行介绍。
如图10(b)所示,为使用训练后的LSTM模型进行预测的过程。x表示时间次序,h_1~h_n表示用户实际访问的文件的文件编号,h_(n+1)表示预测出的用户在时间次序x_(n+1)将会访问的文件的文件编号。
假设用户访问了3个文件,分别为在时间次序x_1访问了文件file_1,文件file_1的文件编号为a;在时间次序x_2访问了文件file_2,文件file_2的文件编号为b;在时间次序x_3访问了文件file_3,文件file_3的文件编号为c。若要基于训练后的LSTM模型预测时间次序x_4时用户将会访问的文件的文件编号,则将x_4和x_4之前的时间次序以及各时间次序访问的文件的文件编号输入至训练后的LSTM模型,训练后的LSTM模型基于对用户访问文件的习惯总结出的数学规律,预测用户在时间次序x_4将会访问的文件的文件编号,比如可能为a、b、c中的一个。
以上为对LSTM模型的简要介绍,下面结合本申请的技术方案对本申请基于LSTM的预测模型进行预测的过程进行详细介绍。
本申请实施例可以通过一个LSTM模型,基于用户的历史数据记录包含的所有文件访问记录,预测该用户在下一时间次序可能访问的文件(文件编号)。为了提高精度,也可以通过两个LSTM模型进行预测,即通过其中一个LSTM模型预测用户登录云虚拟机后访问的首个文件(文件编号),以及通过另一个LSTM模型预测用户在访问完首个文件后顺序访问的其他文件(文件编号)。由于预测用户访问的首个文件(文件编号)的LSTM模型是基于用户每次登录云虚拟机后访问的首个文件的训练样本进行训练的,该LSTM模型能够较快学习到用户每次登录云虚拟机后访问的首个文件的规律,模型训练的过程更加简便,因此,通过两个LSTM模型分别预测的方式,预测精度更高,训练过程更加简便。本申请实施例优选通过两个LSTM模型进行预测,下面对该两个LSTM模型进行详细介绍。
本申请实施例中的预测模型包含第一预测模型和第二预测模型,其中,第一预测模型是根据用户每次登录云虚拟机后首个访问的文件的文件编号进行训练得到的,用于预测用户登录云虚拟机后首个访问的文件;第二预测模型是根据用户每次登录云虚拟机顺序访问的所有文件的文件编号进行训练得到的,用于根据第一预测模型预测出的首个访问的文件,预测在首个访问的文件之后将会顺序访问的其他文件。
下面以一个用户为例,对训练该用户对应的预测模型的过程进行详细介绍:
预测系统获取该用户的历史数据记录,根据历史数据记录得到各预测模型的训练样本数据。示例性的,本申请中第一预测模型的训练样本数据为数据集A1,第二预测模型的训练样本数据为数据集A2。
数据集A1包含该用户每次登录云虚拟机后首次访问的文件的文件编号,数据集A1中的文件按照被访问的时间由远及近进行排列;
数据集A2包含该用户每次登录云虚拟机后顺序访问的所有文件的文件编号,数据集A2中的文件也是按照被访问的时间由远及近进行排列。
由于历史数据记录为如表1所示的文件访问记录的形式,假设根据获取到的某用户的历史数据记录整理出如下表4所示的数据。
表4
登录云虚拟机的次序 X1 X2 X3 X4
登录云虚拟机后首个访问的文件 File11 File21 File31 File41
顺序访问的第2个文件 File12 File22 File32 File42
顺序访问的第3个文件 File13 File23 File33 File43
顺序访问的第4个文件 File14 File24 File34 File44
顺序访问的第5个文件 File15 File25 File35 File45
顺序访问的第6个文件 File16   File36 File46
顺序访问的第7个文件 File17   File37  
顺序访问的第8个文件 File18      
顺序访问的第9个文件        
…… …… …… …… ……
顺序访问的第n个文件        
其中,X表示按照时间顺序对用户每次登录云虚拟机进行编号后得到的时间次序,X1即表示该用户第一次登录云虚拟机,X2表示该用户第二次登录云虚拟机,X3表示该用户第三次登录云虚拟机,依次类推。对应的,X1=1,X2=2,X3=3,X4=4,依此类推。
其中,Filexn表示用户在第X次登录云虚拟机时顺序访问的第n个文件,例如,File11表示用户在第1次登录云虚拟机时访问的首个文件,File12表示用户在第1次登录云虚拟机时,在访问File11之后顺序访问的第2个文件,File13表示用户在第1次登录云虚拟机时,在访问File12之后顺序访问的第3个文件,依次类推。
需要说明的是,在确定用户每次登录后访问的首个文件时,可以通过历史数据记录来确定,比如,历史数据记录中记录的用户每次登录云虚拟机时访问的首个文件会有相应标识,该标识可以是但不限于文字、数字或符号中的一项或多项。另外,上述Filexn为通过两个维度来体现的文件的访问顺序,并非表示文件标识,例如,File11、File21、File31和File41可能为同一个文件。示例性的,如下表5所示,为表4中不同时间次序该用户实际访问的文件的文件标识。
表5
登录云虚拟机的次序 X1 X2 X3 X4
登录云虚拟机后首个访问的文件 a e h c
顺序访问的第2个文件 b b c h
顺序访问的第3个文件 c e d y
顺序访问的第4个文件 d c r d
顺序访问的第5个文件 b d h e
顺序访问的第6个文件 h   e d
顺序访问的第7个文件 e   b  
顺序访问的第8个文件 d      
其中,每个文件的文件标识对应一个文件编号,该用户访问的文件(文件标识)包括a、b、c、d、e、h、r、y,假设各文件标识对应的文件编号如下表6所示。
表6
文件标识 文件编号
a 1
b 2
c 3
d 4
e 5
h 6
r 7
y 8
下面基于表5和表6,对第一预测模型的训练过程进行介绍:
基于上述表5和表6得到第一预测模型的训练样本数据A1,如下:
根据表4可知第一预测模型的训练样本数据为用户每次登录云虚拟机访问的首个文件,即[(X1,File11),(X2,File21),(X3,File31),(X4,File41)]。根据表5可知,[(X1,File11),(X2,File21),(X3,File31),(X4,File41)]=[(X1,a),(X2,e),(X3,h),(X4,c)]。
由于LSTM模型输入为数值,不能直接将文件标识输入至LSTM模型中,因此根据表6可以确定,输入到第一预测模型中的训练样本数据A1=[(1,1),(2,5),(3,6),(4,3)]。
将上述数据集A1输入至第一预测模型中,第一预测模型基于(X1,1)预测时间次序X2对应的文件编号h2,若h2为5(用户在时间次序X2实际访问的文件的文件编号),则不需调整第一预测模型内的各静态参数,若h2不为5,则调整第一预测模型内的各静态参数,直到第一预测模型计算出的h2为5;继而对时间次序X3对应的h3进行预测,若预测出的h3为6(用户在时间次序X3实际访问的文件的文件编号),则不需调整第一预测模型内的各静态参数,若h3不为6,则调整第一预测模型内的各静态参数,直到第一预测模型计算出的h3为6,依次类推。
需要说明的是,上述输入至LSTM模型的数据集的形式为举例,本申请并不限定于上述数据集的形式。例如,输入至LSTM模型的数据集也可以只包含文件编号,由LSTM模型按照文件编号的排序确定各文件编号对应的时间次序,例如,以数据集A1为例,输入到模型的数据集A1还可以为[1,5,6,3],即仅包含文件编号,由LSTM模型按照文件编号的顺序确定各文件编号对应的时间次序,并根据确定的时间次序,预测下一时间次序的输出,在一种可选的场景中,输入至预测模型的可以为仅包含文件编号的数据集,而不包括待预测的时间次序,该待预测的时间次序可以由模型自身确定,本申请实施例对此不作限定。
在训练的过程中,根据用户在使用过程中实际的文件访问记录更新该用户的历史数据记录,并根据该用户的更新后的历史数据记录更新数据集A1,并通过上述方式使用更新后的数据集A1对该用户的第一预测模型进行持续训练。需要说明的是,当输入的训练样本数据的数量足够大时,LSTM模型会统计预测的错误率,当错误率达到预设数值时,该LSTM模型会针对该训练样本数据从头开始训练,以重新调整模型内的静态参数,直到该第一预测模型的准确率达到理想值,比如,准确率达到90%,该模型的准确率达到理想值时,便可认为该模型训练完成,即可以使用该模型进行预测。
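下面给出一段使用PyTorch训练第一预测模型的简化示意,以数据集A1=[1,5,6,3]为例,网络结构、嵌入维度、隐藏层大小、迭代次数等均为本文假设的超参数,本申请并不限定具体的LSTM实现框架:

```python
import torch
import torch.nn as nn

class FilePredictor(nn.Module):
    """以文件编号序列为输入、预测下一文件编号的LSTM网络(结构为本文假设的一种简化形式)。"""
    def __init__(self, num_files: int, emb: int = 16, hidden: int = 32):
        super().__init__()
        self.embed = nn.Embedding(num_files + 1, emb)
        self.lstm = nn.LSTM(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, num_files + 1)

    def forward(self, seq):                 # seq: [batch, 序列长度],元素为文件编号
        h, _ = self.lstm(self.embed(seq))
        return self.out(h)                  # 每个位置输出下一文件编号的打分

# 数据集A1 = [1, 5, 6, 3]:用前k个文件编号预测第k+1个文件编号
a1 = torch.tensor([[1, 5, 6, 3]])
model = FilePredictor(num_files=8)
opt = torch.optim.Adam(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
for _ in range(200):                        # 反复训练,调整模型内的静态参数
    logits = model(a1[:, :-1])
    loss = loss_fn(logits.reshape(-1, logits.size(-1)), a1[:, 1:].reshape(-1))
    opt.zero_grad(); loss.backward(); opt.step()
next_file = model(a1)[:, -1].argmax(-1)     # 预测用户下一次登录后首个访问的文件编号
```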
下面对第二预测模型的训练过程进行介绍:
基于上述表4得到第二预测模型的训练样本数据A2,如下:
A2=[(S1,File11),(S2,File12),(S3,File13),(S4,File14),(S5,File15),(S6,File16),(S7,File17),(S8,File18),(S9,File21),(S10,File22),(S11,File23),(S12,File24),(S13,File25),(S14,File31),(S15,File32),(S16,File33),(S17,File34),(S18,File35),(S19,File36),(S20,File37),(S21,File41),(S22,File42),(S23,File43),(S24,File44),(S25,File45),(S26,File46)]。其中,S(n)为按照时间顺序对用户在云虚拟机上所访问的所有文件的时间点进行编号后得到的时间次序,假设S1=1,则S26=26。换而言之,按照访问顺序,从用户第一次登录云虚拟机访问的首个文件开始,到最后一次登录访问的最后一个文件为止进行排序,得到的编号即为时间次序,例如,针对表4而言,也可以理解为用户在4次登录访问云虚拟机过程中所访问的所有文件的时间次序。
表5为表4对应的用户实际访问的文件的文件标识,基于表5得到第二预测模型的训练样本数据A2,如下:
A2=[(S1,a),(S2,b),(S3,c),(S4,d),(S5,b),(S6,h),(S7,e),(S8,d),(S9,e),(S10,b),(S11,e),(S12,c),(S13,d),(S14,h),(S15,c),(S16,d),(S17,r),(S18,h),(S19,e),(S20,b),(S21,c),(S22,h),(S23,y),(S24,d),(S25,e),(S26,d)]。
根据表6所示的对应关系,确定第二预测模型的训练样本数据A2为:
A2=[(1,1),(2,2),(3,3),(4,4),(5,2),(6,6),(7,5),(8,4),(9,5),(10,2),(11,5),(12,3),(13,4),(14,6),(15,3),(16,4),(17,7),(18,6),(19,5),(20,2),(21,3),(22,6),(23,8),(24,4),(25,5),(26,4)]。
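下面给出一段根据表5所示的各次登录的访问序列以及表6所示的文件标识与文件编号的对应关系,构造训练样本数据集A1和A2的Python示意,数据直接取自上文示例:

```python
# 各次登录顺序访问的文件标识(对应表5),以及文件标识到文件编号的对应关系(对应表6)
logins = [
    ["a", "b", "c", "d", "b", "h", "e", "d"],   # 第1次登录
    ["e", "b", "e", "c", "d"],                  # 第2次登录
    ["h", "c", "d", "r", "h", "e", "b"],        # 第3次登录
    ["c", "h", "y", "d", "e", "d"],             # 第4次登录
]
file_no = {"a": 1, "b": 2, "c": 3, "d": 4, "e": 5, "h": 6, "r": 7, "y": 8}

# 数据集A1:每次登录后首个访问的文件的(时间次序, 文件编号)
A1 = [(x + 1, file_no[seq[0]]) for x, seq in enumerate(logins)]
# 数据集A2:按访问顺序对所有访问的文件统一编号得到的(时间次序, 文件编号)
A2 = [(s + 1, file_no[f]) for s, f in enumerate(f for seq in logins for f in seq)]
```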
使用上述最终确定的数据集A2对第二预测模型进行训练,具体训练方式请参见上述对第一预测模型的训练步骤,此处不再赘述。
综上,训练后的第一预测模型能够确定数据集A1中最后一个文件的至少一个关联文件,并确定该最后一个文件与各关联文件的关联度,再根据关联度预测用户在下一时间次序可能访问的文件,即用户登录到云虚拟机后可能访问的第一个文件。其中,关联文件为数据集A1中记录的用户在访问完与所述最后一个文件相同的文件之后访问的文件。例如,假设数据集A1为[(1,1),(2,2),(3,3),(4,2),(5,3),(6,4),(7,5),(8,3)],则该数据集中的最后一个文件的文件编号为3,根据数据集A1可知,用户在之前使用云桌面过程中,在访问完文件编号为3的文件之后还访问过文件编号为2的文件以及文件编号为4的文件,因此,文件编号为2的文件和文件编号为4的文件为文件编号为3的文件的关联文件。
第一预测模型能够根据学习到的用户访问文件的规律,确定文件编号3与各关联文件之间的关联度。示例性的,第一预测模型根据两个文件被访问的间隔来确定两者的关联度,例如,用户在一年前访问完文件A后访问了文件B,在一周前访问完文件A后访问了文件C,则文件B和文件C都是文件A的关联文件,但文件C与文件A的关联度高于文件B与文件A的关联度;或者,用户在一年前访问完文件A后访问了文件B,之后用户在访问完文件A后又访问了文件C,但之后用户在使用云桌面过程中的很长一段时间内不再访问文件B,但会频繁访问文件C,则文件C与文件A的关联度高于文件B与文件A的关联度。
作为另一种示例,第一预测模型根据第一数据集合中,用户在访问完与所述最后一个文件相同的文件之后访问所述关联文件的频率,确定两个文件之间的关联度。例如,文件1的关联文件为文件2和文件3,但根据第一数据集合确定用户在访问完文件1后访问文件2的频率低于用户在访问完文件1后访问文件3的频率,则文件3与文件1的关联度高于文件2与文件1的关联度。
作为又一种示例,第一预测模型还可以结合上述两种方式来确定两个文件之间的关联度,例如,按照第一种示例的方式确定出两个文件的关联度1,按照第二种示例的方式确定出相同的两个文件的关联度2,并为两个关联度的值分配权重,以确定最终的关联度的值。
对应的,训练后的第二预测模型也能够确定数据集A2中最后一个文件的至少一个关联文件,并确定该最后一个文件与各关联文件的关联度,再根据关联度预测用户在下一时间次序可能访问的文件,具体请参见上述对训练后的第一预测模型的介绍,此处不再赘述。
其中,训练后的第一预测模型可以用于预测用户登录云虚拟机后将访问的首个文件,训练后的第二预测模型则用于预测用户在访问首个文件之后还会顺序访问的文件。因此,第二预测模型在进行预测时,需要将第一预测模型预测到的用户可能访问的第一个文件的文件编号按照时间次序添加至数据集A2的尾端,并基于第一预测模型预测的第一个文件预测用户可能访问的第二个文件,再将预测到的第二个文件的文件编号添加至数据集A2的尾端以更新数据集A2,并基于预测到的第二个文件来预测用户可能访问的第三个文件,以此类推。
需要说明的是,上述仅为举例说明,实际上LSTM模型中的静态参数有多种,且并不限于本申请所示的算法中包含的参数,本申请实施例可以通过上述训练方式训练LSTM模型,使其能够确定各文件之间的关联度,且确定关联度的方式并不限于上述示例。
下面对使用第一预测模型和第二预测模型进行预测的过程进行具体介绍:
结合上述表4所示的例子,按照当前数据集A1内的排序,用户再次登录云虚拟机的时间次序为X5,下面通过第一预测模型预测用户在时间次序X5可能访问的首个文件的文件编号。
将数据集A1和X5输入至第一预测模型,第一预测模型输出h_5,h_5为用户在时间次序X5可能访问的文件的文件编号。
通过第二预测模型预测在h_5之后用户将顺序访问的文件的文件编号。需要说明的是,本申请实施例的第二预测模型需要结合第一预测模型预测的用户访问的首个文件来预测顺序访问的文件,因此,在使用第二预测模型预测用户在访问完h_5(对应的文件)之后顺序访问的文件时,需要根据第一预测模型预测的用户可能访问的首个文件的文件编号更新数据集A2。假设h_5(文件编号)的值为8,则更新后的数据集A2为:
A2=[(1,1),(2,2),(3,3),(4,4),(5,2),(6,6),(7,5),(8,4),(9,5),(10,2),(11,5),(12,3),(13,4),(14,6),(15,3),(16,4),(17,7),(18,6),(19,5),(20,2),(21,3),(22,6),(23,8),(24,4),(25,5),(26,4),(27,8)]。
其中,将h_5(即文件编号8)添加至数据集A2时,按照数据集A2当前的排序,h_5的时间次序为顺序递增的,即S27,按照排序S27=27。
根据更新后的数据集A2预测用户在时间次序S28将会访问的文件的文件编号,即将更新后的数据集A2和时间次序S28输入至第二预测模型中,第二预测模型预测用户在S28将会访问的文件的文件编号,假设为h_6。
若要继续预测用户在时间次序S28之后的时间次序S29将会访问的文件的文件编号,则按照时间次序将(S28,h_6)添加至数据集A2的尾端,以更新数据集A2,继续使用更新后的数据集A2预测用户在时间次序S29将会访问的文件的文件编号(假设为h_7),依次类推,直到满足结束条件。
其中,该结束条件可以为多种形式,下面列举几种:
1)确定当前预测出的目标文件的文件编号达到预设数量后,输出该目标文件的文件编号。例如:预设数量为3,则当该预测模型确定出3个目标文件的文件编号后(例如上述的h_5、h_6、h_7),确定结束预测的流程。
2)预测出的目标文件重复,比如,文件编号h_5与文件编号h_7相同,或者文件编号h_5对应的文件与文件编号h_7对应的文件是同一文件。
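结合上述预测流程与结束条件,下面给出一段先用第一预测模型预测首个文件、再用第二预测模型迭代预测后续文件的Python示意,其中model1、model2及其predict接口均为本文假设的抽象,并非本申请限定的实现:

```python
def predict_targets(model1, model2, A1, A2, max_files=3):
    """先预测用户登录后首个访问的文件编号,再迭代预测后续顺序访问的文件编号,
    直到达到预设数量或预测结果重复为止(对应上述两种结束条件)。"""
    targets = []
    first = model1.predict(A1, next_order=len(A1) + 1)   # 预测登录后首个访问的文件编号
    targets.append(first)
    A2 = A2 + [(len(A2) + 1, first)]                     # 将预测结果按时间次序添加至A2尾端
    while len(targets) < max_files:
        nxt = model2.predict(A2, next_order=len(A2) + 1)  # 预测下一时间次序访问的文件编号
        if nxt in targets:                                # 结束条件:预测出的目标文件重复
            break
        targets.append(nxt)
        A2 = A2 + [(len(A2) + 1, nxt)]                    # 更新第二数据集合
    return targets
```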
需要说明的是,对预测方式二对应的预测系统而言,由于不同用户具有不同的访问文件的使用习惯,因此,通过不同的训练样本训练出的深度学习算法模型的静态参数也就不相同,也就是,每个云虚拟机的用户都有自身对应的预测模型。由于链接克隆云虚拟机具有随机被分配的特性,因此,若在云虚拟机上部署各用户对应的预测模型,则需要在云虚拟机上部署该云虚拟机可能会被分配给的部分或全部用户对应的预测模型,会占用云虚拟机大量的存储资源。因此,本申请实施例,在通过预测模型预测目标文件时,优选的,将预测模型部署于云虚拟机之外的存储介质中,例如云服务器中,或云服务器之外的独立的装置或存储介质。
上述介绍的为云虚拟机通过预测方式一或预测方式二进行单独预测的具体示例,本申请实施例还可以结合预测方式一和预测方式二进行预测,例如,当用户使用云桌面产生的历史数据记录较少时,通过预测方式一进行预测,当用户的历史数据记录的数量较大时,使用预测方式二进行预测。下面对结合上述两种预测方式预测目标文件的文件编号的流程进行介绍。
下面结合图11说明结合预测方式一和预测方式二进行预测确定目标文件的流程,在图11所示的示例中,云服务器上部署有预测方式一对应的预测系统,预测模型部署于与云服务器独立的另一服务器上,该流程包括:
步骤1100,用户触发用户终端发送登录请求;
步骤1101,虚拟桌面管理系统接收该用户发送的登录请求后,对该用户的身份信息进行鉴权,若鉴权通过,则为该用户的用户终端分配云虚拟机;
步骤1102,云虚拟机从历史数据存储介质中获取该用户的历史数据记录,并判断当前是否满足使用预测方式一进行预测的条件(以下简称为第一预测条件),如果是,则执行步骤1103,否则执行步骤1104;
其中,第一预测条件可以包括下列中的部分或全部:
1)历史数据记录内包含的文件标识对应的文件的数量小于或等于预设数量;
2)所述用户登录云虚拟机的次数小于或等于预设次数;
3)所述用户使用云虚拟机的累积时间小于或等于预设累积时间。
需要说明的是,若历史数据存储介质部署于云服务器,则步骤1102中判断当前是否满足第一预测条件的执行主体还可以是云服务器,由云服务器根据判断结果通知部署了对应预测方式的对象进行预测,比如,云服务器确定当前满足第一预测条件,则通知该云虚拟机进行预测,并将该用户的历史数据记录发送给云虚拟机;若云服务器确定当前不满足第一预测条件,则通知预测模型存储介质使用该用户对应的预测模型进行预测,并将该用户的历史数据记录发送给预测模型存储介质。
本申请实施例,可以灵活执行具有不同功能的存储介质之间的信息交互,本申请实施例最终是以集成有预测系统的主体基于获取到的该用户的历史数据记录进行预测的,并不限定于该系统内各部分进行数据交互的方式和过程。
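下面给出一段根据第一预测条件在预测方式一与预测方式二之间进行选择的Python示意,此处以任一条件满足即采用预测方式一为例,其中的阈值、字段名以及predict_by_weight、predict_by_model等函数名均为本文假设:

```python
def meets_first_condition(history, login_count, usage_hours,
                          max_files=50, max_logins=20, max_hours=100):
    """判断是否满足第一预测条件(阈值均为假设值):历史数据记录中文件数量、
    登录次数或累计使用时间较小时,采用计算量较小的预测方式一。"""
    distinct_files = len({rec["file_id"] for rec in history})
    return (distinct_files <= max_files
            or login_count <= max_logins
            or usage_hours <= max_hours)

def predict_target_files(history, login_count, usage_hours):
    if meets_first_condition(history, login_count, usage_hours):
        return predict_by_weight(history)   # 预测方式一:按预设算法计算被访问权重(假设的函数名)
    return predict_by_model(history)        # 预测方式二:使用该用户对应的预测模型(假设的函数名)
```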
步骤1103,云虚拟机根据预设算法计算历史数据记录中包含的各文件标识对应的文件的被访问权重,并根据各文件的被访问权重的大小确定目标文件的文件标识,从NAS中下载确定的文件标识对应的文件至本地差分盘内。
对于步骤1103可以参见上述预测方式一中介绍的具体执行步骤,此处不再赘述。
步骤1104,预测模型基于获取的历史数据记录预测该用户登录到云虚拟机后的目标文件的文件标识,并将预测到的文件标识发送给云虚拟机,云虚拟机从NAS中下载该文件标识对应的文件至本地差分盘内。
对于步骤1104可以参见上述预测方式二中介绍的具体执行步骤,此处不再赘述。
优选的,云虚拟机在从NAS中下载目标文件时,用户还可以正常使用云桌面,例如,通过虚拟化文件系统访问文件,或在云虚拟机本地建立新的文件。当用户在云虚拟机上访问已下载的目标文件时,不需要再等待云虚拟机从NAS盘中下载该文件的这段时间,有效地缩短了用户访问这些文件的时延,提高了用户体验,具有较强的应用性。
本申请实施例在用户终端登录到云虚拟机后,若用户通过虚拟化文件系统实际访问的文件不是预测系统预测到的目标文件,则可以同时通过两个并列的进程(进程1和进程2)分别执行下载操作,例如,进程1从NAS中下载用户实际访问的文件,进程2从NAS盘下载预测出的目标文件。
在用户使用云桌面过程中,预测系统还可以基于用户实际访问的文件重新预测该用户将会顺序访问的文件,云虚拟机通过进程2从NAS盘下载预测系统最新预测出的目标文件。
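下面以线程模拟上述两个并列的下载进程,给出一段Python示意,其中的文件标识与下载实现均为占位假设:

```python
import threading

def download_from_nas(file_ids, to_local_disk):
    """从NAS盘下载file_ids对应的文件到本地差分盘(此处以回调函数占位)。"""
    for fid in file_ids:
        to_local_disk(fid)

# 进程/线程1:下载用户实际访问、但尚未预取的文件;进程/线程2:下载预测出的目标文件
t1 = threading.Thread(target=download_from_nas, args=(["X"], print))          # 用户实际访问的文件
t2 = threading.Thread(target=download_from_nas, args=(["D", "B", "A"], print))  # 预测出的目标文件
t1.start(); t2.start()
t1.join(); t2.join()
```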
基于与方法实施例同一发明构思,本申请实施例还提供了一种设备,用于执行上述方法实施例中云虚拟机执行的方法,相关特征可参见上述方法实施例,此处不再赘述,如图12所示,该设备包括处理单元1201以及传输单元1202。
处理单元1201,用于接收用户的登陆请求,并响应所述登陆请求预测所述用户登录到云虚拟机后可能访问的至少一个文件;所述处理单元1201可以为图4中的预测系统,并执行该预测系统在图8所述的实施例中的步骤805对应的方法。
传输单元1202,用于从远端存储设备中下载所述至少一个目标文件至所述云虚拟机。所述传输单元1202可以为图4中的配置文件应用程序,并执行图8所示的实施例中的步骤803、步骤804、以及步骤806对应的方法。
在一种可能的实施方式中,处理单元1201在响应所述登录请求时,可以获取所述用户的历史数据记录,根据所述用户的历史数据记录,预测所述用户登录所述云虚拟机后可能访问的至少一个文件;其中,所述历史数据记录记录了在云虚拟机接收到所述登陆请求之前,所述用户在所述云虚拟机中所访问的文件的记录。
在一种可能的实施方式中,处理单元1201在根据所述用户的历史数据记录,预测所述用户登录所述云虚拟机后可能访问的至少一个文件时具体用于,获取所述历史数据记录中记录的每个文件的多个参数值,其中每个参数值预设一个权重值;根据预设算法对所述多个参数及所述多个参数的预设权重值计算每个文件在用户登录所述云虚拟机后可能被访问的概率值;根据所述概率值确定所述至少一个目标文件。
在一种可能的实施方式中,所述处理单元具体用于:从所述历史数据记录中获取第一数据集合,所述第一数据集合为用户连续N次登陆所述云虚拟机时,每次所访问的第一个文件的集合;将所述第一数据集合输入第一预测模型,以使所述第一预测模型根据所述第一数据集合进行预测,输出所述用户登录所述云虚拟机后可能访问的第一个文件;将所述第一预测模型输出的第一个文件及所述历史数据中记录的用户连续M次登陆所述云虚拟机时,每次顺序访问的多个文件构成第二数据集合;将所述第二数据集合输入第二预测模型,以预测用户登录所述云虚拟机后可能访问的第二个文件;将所述第二个文件加入所述第二数据集合以更新所述第二数据集合;返回所述将所述第二数据集合输入第二预测模型的步骤,直到预测出用户登陆所述云虚拟机后可能访问的M个文件。
在一种可能的实施方式中,所述处理单元在将所述第一数据集合输入第一预测模型,以使所述第一预测模型根据所述第一数据集合进行预测,输出所述用户登录所述云虚拟机后可能访问的第一个文件时,具体用于:控制所述第一预测模型确定所述第一数据集合中最后一个文件的至少一个关联文件,并确定所述最后一个文件与所述至少一个关联文件的关联度,并输出与所述最后一个文件关联度最高的关联文件;其中,所述第一数据集合中的文件按照被访问的时间由远及近进行排列,所述关联文件为所述第一数据集合中记录的用户在访问完与所述最后一个文件相同的文件之后访问的文件;所述关联度,为所述第一预测模型根据所述关联文件与所述最后一个文件的间隔,和/或在所述第一数据集合中用户在访问完与所述最后一个文件相同的文件之后访问所述关联文件的频率确定的。
在一种可能的实施方式中,所述处理单元在将所述第二数据集合输入第二预测模型,以预测用户登录所述云虚拟机后可能访问的第二个文件时具体用于:控制所述第二预测模型确定所述第二数据集合中最后一个文件的至少一个关联文件,并确定所述最后一个文件与所述至少一个关联文件的关联度,并输出与所述最后一个文件关联度最高的关联文件,将输出的所述关联文件添加至所述第二数据集的尾端,使输出的所述关联文件为更新后的第二数据集的最后一个文件。其中,所述第二数据集合中的文件按照访问时间由远及近的顺序排列,所述关联文件为所述第二数据集合中记录的用户在访问完与所述最后一个文件相同的文件之后访问的文件;所述关联度,为所述第二预测模型根据所述关联文件与所述最后一个文件的间隔,和/或在所述第二数据集合中用户在访问完与所述最后一个文件相同的文件之后访问所述关联文件的频率确定的。
在一种可能的实施方式中,所述处理单元具体用于:
按照所述文件被访问的概率值从大到小的顺序,在所述用户的历史数据包含的多个文件标识中选择至少一个目标文件标识,将所述目标文件标识对应的文件作为所述用户登录所述云虚拟机后可能访问的文件;当侦测到历史数据记录符合预设条件时,则:将所述第一数据集合输入第一预测模型,以使所述第一预测模型根据所述第一数据集合进 行预测,输出所述用户登录所述云虚拟机后可能访问的第一个文件;将所述第一预测模型输出的第一个文件及所述历史数据中记录的用户连续M次登陆所述云虚拟机时,每次顺序访问的多个文件构成第二数据集合;将所述第二数据集合输入第二预测模型,以预测用户登录所述云虚拟机后可能访问的第二个文件;将所述第二个文件加入所述第二数据集合以更新所述第二数据集合;返回所述将所述第二数据集合输入第二预测模型的步骤,直到预测出用户登陆所述云虚拟机后可能访问的M个文件;根据第一预测模型输出的所述用户登录所述云虚拟机后可能访问的第一个文件及第二预测模型输出的所述用户登录所述云虚拟机后可能访问的M个文件确定所述用户登录所述云虚拟机后可能访问的文件。
在一种可能的实施方式中,所述处理单元具体用于:生成一个预测请求,并将所述预测请求通过传输单元发送至一电子设备,所述预测请求用于指示所述电子设备对所述用户登录到云虚拟机后可能访问的至少一个文件进行预测;
通过传输单元接收所述电子设备所预测的所述电子设备对所述用户登录到云虚拟机后可能访问的至少一个文件。
与上述构思相同,如图13所示,本申请提供一种设备1300,设备1300可应用于上述图5、图6或图7所示场景中的云虚拟机上。
设备1300可包括处理器1301和存储器1302。进一步的,该装置还可包括通信接口1304,该通信接口可为收发器。进一步的,该装置还可包括总线系统1303。
其中,处理器1301、存储器1302和通信接口1304可通过总线系统1303相连,该存储器1302可用于存储指令,该处理器1301可用于执行该存储器1302存储的指令,以控制通信接口1304接收或发送信号,完成上述图8所示方法中以云虚拟机为主体的步骤。
其中,存储器1302可以集成在处理器1301中,也可以是与处理器1301不同的物理实体。
作为一种实现方式,通信接口1304的功能可以考虑通过收发电路或收发的专用芯片实现。处理器1301可以考虑通过专用处理芯片、处理电路、处理器或通用芯片实现。
作为另一种实现方式,可以考虑使用计算机的方式,来实现本申请实施例提供的设备1300的功能。即将实现处理器1301和通信接口1304功能的程序代码存储在存储器1302中,通用处理器可通过执行存储器中的代码来实现处理器1301和通信接口1304的功能。
该设备1300所涉及的与本申请提供的技术方案相关的概念、解释和详细说明以及其他步骤,可参见前述方法或其它实施例中关于这些内容的描述,此处不作赘述。
在本申请的一示例中,所述设备1300可用于执行上述图8所示流程中,以云虚拟机为主体的步骤。比如,通信接口1304可以接收用户的登陆请求,以及从远端存储设备中下载所述至少一个目标文件至所述云虚拟机;处理器1301可以响应所述登陆请求预测所述用户登录到云虚拟机后可能访问的至少一个文件。
关于处理器1301和通信接口1304的介绍,可参见上述图8所示流程的介绍,在此不再赘述。
基于以上实施例,本申请实施例还提供了一种计算机存储介质,该存储介质中存储软件程序,该软件程序在被一个或多个处理器读取并执行时可实现上述任意一个或多个实施例提供的方法。该计算机存储介质可以包括:U盘、移动硬盘、只读存储器、随机存取存储器、磁碟或者光盘等各种可以存储程序代码的介质。
基于以上实施例,本申请实施例还提供了一种计算机程序产品,所述计算机程序产品中包括计算机指令,当所述计算机指令被计算机执行时,使得所述计算机执行上述任意一个或多个实施例提供的方法。
基于以上实施例,本申请实施例还提供了一种芯片,该芯片包括处理器,用于实现上述任意一个或多个实施例所涉及的功能,例如获取或处理上述方法中所涉及的信息或者消息。可选地,该芯片还包括存储器,该存储器,用于存储处理器所执行的程序指令和数据。该芯片,也可以包含芯片和其他分立器件。
应理解,在本申请实施例中,处理器可以是中央处理单元(central processing unit,CPU),该处理器还可以是其他通用处理器、数字信号处理器(digital signal processor,DSP)、专用集成电路(application-specific integrated circuit,ASIC)、现成可编程门阵列(field programmable gate array,FPGA)或者其他可编程逻辑器件、晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器,也可以是任何常规的处理器等。
该存储器可以包括只读存储器和随机存取存储器,并向处理器提供指令和数据。存储器的一部分还可以包括非易失性随机存取存储器。
该总线系统除包括数据总线之外,还可以包括电源总线、控制总线和状态信号总线等。但是为了清楚说明起见,在图中将各种总线都标为总线系统。在实现过程中,上述方法的各步骤可以通过处理器中的硬件的集成逻辑电路或者软件形式的指令完成。结合本申请实施例所公开的方法的步骤可以直接体现为硬件处理器执行完成,或者用处理器中的硬件及软件模块组合执行完成。软件模块可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器,处理器读取存储器中的信息,结合其硬件完成上述方法的步骤。为避免重复,这里不再详细描述。
在本申请的各个实施例中,如果没有特殊说明以及逻辑冲突,不同的实施例之间的术语和/或描述具有一致性、且可以相互引用,不同的实施例中的技术特征根据其内在的逻辑关系可以组合形成新的实施例。
可以理解的是,在本申请的实施例中涉及的各种数字编号仅为描述方便进行的区分,并不用来限制本申请的实施例的范围。上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定。

Claims (16)

  1. 一种基于云虚拟机的数据访问方法,其特征在于,包括:
    云虚拟机接收用户的登陆请求,并响应所述登陆请求预测所述用户登录到云虚拟机后可能访问的至少一个文件;
    所述云虚拟机从远端存储设备中下载所述至少一个文件至所述云虚拟机。
  2. 如权利要求1所述的方法,其特征在于,所述云虚拟机预测所述用户登录到云虚拟机后可能访问的至少一个文件,包括:
    所述云虚拟机获取所述用户的历史数据记录,根据所述用户的历史数据记录,预测所述用户登录所述云虚拟机后可能访问的至少一个文件;所述历史数据记录记录了在云虚拟机接收到所述登陆请求之前,所述用户在所述云虚拟机中所访问的文件的记录。
  3. 如权利要求2所述的方法,其特征在于,所述根据所述用户的历史数据记录,预测所述用户登录所述云虚拟机后可能访问的至少一个文件,包括:
    获取所述历史数据记录中记录的每个文件的多个参数值,其中每个参数值预设一个权重值;
    根据预设算法对所述多个参数及所述多个参数的预设权重值计算每个文件在用户登录所述云虚拟机后可能被访问的概率值;
    根据所述概率值确定所述至少一个文件。
  4. 如权利要求1所述的方法,其特征在于,所述根据所述用户的历史数据记录,预测所述用户登录所述云虚拟机后可能访问的至少一个文件,包括:
    从所述历史数据记录中获取第一数据集合,所述第一数据集合为用户连续N次登陆所述云虚拟机时,每次所访问的第一个文件的集合;
    将所述第一数据集合输入第一预测模型,以使所述第一预测模型根据所述第一数据集合进行预测,输出所述用户登录所述云虚拟机后可能访问的第一个文件;
    将所述第一预测模型输出的第一个文件及所述历史数据中记录的用户连续M次登陆所述云虚拟机时,每次顺序访问的多个文件构成第二数据集合;
    将所述第二数据集合输入第二预测模型,以预测用户登录所述云虚拟机后可能访问的第二个文件;
    将所述第二个文件加入所述第二数据集合以更新所述第二数据集合;
    返回所述将所述第二数据集合输入第二预测模型的步骤,直到预测出用户登陆所述云虚拟机后可能访问的M个文件。
  5. 如权利要求4所述的方法,其特征在于,所述将所述第一数据集合输入第一预测模型,以使所述第一预测模型根据所述第一数据集合进行预测,输出所述用户登录所述云虚拟机后可能访问的第一个文件包括:
    所述第一预测模型确定所述第一数据集合中最后一个文件的至少一个关联文件,并确定所述最后一个文件与所述至少一个关联文件的关联度;其中,所述第一数据集合中的文件按照被访问的时间由远及近进行排列,所述关联文件为所述第一数据集合中记录的用户在访问完与所述最后一个文件相同的文件之后访问的文件;
    所述第一预测模型输出与所述最后一个文件关联度最高的关联文件;
    所述关联度,为所述第一预测模型根据所述关联文件与所述最后一个文件的间隔,和/或在所述第一数据集合中用户在访问完与所述最后一个文件相同的文件之后访问所述关联文件的频率确定的。
  6. 如权利要求4所述的方法,其特征在于,所述将所述第二数据集合输入第二预测模型,以预测用户登录所述云虚拟机后可能访问的第二个文件包括:
    所述第二预测模型确定所述第二数据集合中最后一个文件的至少一个关联文件,并确定所述最后一个文件与所述至少一个关联文件的关联度;其中,所述第二数据集合中的文件按照访问时间由远及近的顺序排列,所述关联文件为所述第二数据集合中记录的用户在访问完与所述最后一个文件相同的文件之后访问的文件;
    所述第二预测模型输出与所述最后一个文件关联度最高的关联文件;其中,所述关联度,为所述第二预测模型根据所述关联文件与所述最后一个文件的间隔,和/或在所述第二数据集合中用户在访问完与所述最后一个文件相同的文件之后访问所述关联文件的频率确定的;
    所述第二预测模型将输出的所述关联文件添加至所述第二数据集的尾端,使输出的所述关联文件为更新后的第二数据集的最后一个文件。
  7. 如权利要求3所述的方法,其特征在于,所述预测所述用户登录所述云虚拟机后待访问的至少一个文件,包括:
    按照所述文件被访问的概率值从大到小的顺序,在所述用户的历史数据包含的多个文件标识中选择至少一个目标文件标识,将所述目标文件标识对应的文件作为所述用户登录所述云虚拟机后可能访问的文件;
    当侦测到历史数据记录符合预设条件时,则:
    将所述第一数据集合输入第一预测模型,以使所述第一预测模型根据所述第一数据集合进行预测,输出所述用户登录所述云虚拟机后可能访问的第一个文件;
    将所述第一预测模型输出的第一个文件及所述历史数据中记录的用户连续M次登陆所述云虚拟机时,每次顺序访问的多个文件构成第二数据集合;
    将所述第二数据集合输入第二预测模型,以预测用户登录所述云虚拟机后可能访问的第二个文件;
    将所述第二个文件加入所述第二数据集合以更新所述第二数据集合;
    返回所述将所述第二数据集合输入第二预测模型的步骤,直到预测出用户登陆所述云虚拟机后可能访问的M个文件;
    根据第一预测模型输出的所述用户登录所述云虚拟机后可能访问的第一个文件及第二预测模型输出的所述用户登录所述云虚拟机后可能访问的M个文件确定所述用户登录所述云虚拟机后可能访问的文件。
  8. 如权利要求1所述的方法,其特征在于,所述云虚拟机预测所述用户登录到云虚拟机后可能访问的至少一个文件,包括:
    生成一个预测请求,并将所述预测请求发送至一电子设备,所述预测请求用于指示所述电子设备对所述用户登录到云虚拟机后可能访问的至少一个文件进行预测;
    接收所述电子设备所预测的所述电子设备对所述用户登录到云虚拟机后可能访问的至少一个文件。
  9. 一种设备,其特征在于,该设备包括处理单元、传输单元:
    所述处理单元,用于接收用户的登陆请求,并响应所述登陆请求预测所述用户登录到云虚拟机后可能访问的至少一个文件;
    传输单元,从远端存储设备中下载所述至少一个文件至所述云虚拟机。
  10. 如权利要求9所述的设备,其特征在于,所述处理单元具体用于:
    获取所述用户的历史数据记录,根据所述用户的历史数据记录,预测所述用户登录所述云虚拟机后可能访问的至少一个文件;所述历史数据记录记录了在云虚拟机接收到所述登陆请求之前,所述用户在所述云虚拟机中所访问的文件的记录。
  11. 如权利要求10所述的设备,其特征在于,所述处理单元在根据所述用户的历史数据记录,预测所述用户登录所述云虚拟机后可能访问的至少一个文件时,具体用于:
    获取所述历史数据记录中记录的每个文件的多个参数值,其中每个参数值预设一个权重值;
    根据预设算法对所述多个参数及所述多个参数的预设权重值计算每个文件在用户登录所述云虚拟机后可能被访问的概率值;
    根据所述概率值确定所述至少一个文件。
  12. 如权利要求9所述的设备,其特征在于,所述处理单元具体用于:
    从所述历史数据记录中获取第一数据集合,所述第一数据集合为用户连续N次登陆所述云虚拟机时,每次所访问的第一个文件的集合;
    将所述第一数据集合输入第一预测模型,以使所述第一预测模型根据所述第一数据集合进行预测,输出所述用户登录所述云虚拟机后可能访问的第一个文件;
    将所述第一预测模型输出的第一个文件及所述历史数据中记录的用户连续M次登陆所述云虚拟机时,每次顺序访问的多个文件构成第二数据集合;
    将所述第二数据集合输入第二预测模型,以预测用户登录所述云虚拟机后可能访问的第二个文件;
    将所述第二个文件加入所述第二数据集合以更新所述第二数据集合;
    返回所述将所述第二数据集合输入第二预测模型的步骤,直到预测出用户登陆所述云虚拟机后可能访问的M个文件。
  13. 如权利要求12所述的设备,其特征在于,所述处理单元在将所述第一数据集合输入第一预测模型,以使所述第一预测模型根据所述第一数据集合进行预测,输出所述用户登录所述云虚拟机后可能访问的第一个文件时,具体用于:
    控制所述第一预测模型确定所述第一数据集合中最后一个文件的至少一个关联文件,并确定所述最后一个文件与所述至少一个关联文件的关联度,并输出与所述最后一个文件关联度最高的关联文件;
    其中,所述第一数据集合中的文件按照被访问的时间由远及近进行排列,所述关联文件为所述第一数据集合中记录的用户在访问完与所述最后一个文件相同的文件之后访问的文件;所述关联度,为所述第一预测模型根据所述关联文件与所述最后一个文件的间隔,和/或在所述第一数据集合中用户在访问完与所述最后一个文件相同的文件之后访问所述关联文件的频率确定的。
  14. 如权利要求12所述的设备,其特征在于,所述处理单元在将所述第二数据集合输入第二预测模型,以预测用户登录所述云虚拟机后可能访问的第二个文件时,具体用于:
    控制所述第二预测模型确定所述第二数据集合中最后一个文件的至少一个关联文件,并确定所述最后一个文件与所述至少一个关联文件的关联度,并输出与所述最后一个文件关联度最高的关联文件,将输出的所述关联文件添加至所述第二数据集的尾端,使输出的所述关联文件为更新后的第二数据集的最后一个文件;
    其中,所述第二数据集合中的文件按照访问时间由远及近的顺序排列,所述关联文件为所述第二数据集合中记录的用户在访问完与所述最后一个文件相同的文件之后访问的文件;所述关联度,为所述第二预测模型根据所述关联文件与所述最后一个文件的间隔,和/或在所述第二数据集合中用户在访问完与所述最后一个文件相同的文件之后访问所述关联文件的频率确定的。
  15. 如权利要求11所述的设备,其特征在于,所述处理单元具体用于:
    按照所述文件被访问的概率值从大到小的顺序,在所述用户的历史数据包含的多个文件标识中选择至少一个目标文件标识,将所述目标文件标识对应的文件作为所述用户登录所述云虚拟机后可能访问的文件;
    当侦测到历史数据记录符合预设条件时,则:
    将所述第一数据集合输入第一预测模型,以使所述第一预测模型根据所述第一数据集合进行预测,输出所述用户登录所述云虚拟机后可能访问的第一个文件;
    将所述第一预测模型输出的第一个文件及所述历史数据中记录的用户连续M次登陆所述云虚拟机时,每次顺序访问的多个文件构成第二数据集合;
    将所述第二数据集合输入第二预测模型,以预测用户登录所述云虚拟机后可能访问的第二个文件;
    将所述第二个文件加入所述第二数据集合以更新所述第二数据集合;
    返回所述将所述第二数据集合输入第二预测模型的步骤,直到预测出用户登陆所述云虚拟机后可能访问的M个文件;
    根据第一预测模型输出的所述用户登录所述云虚拟机后可能访问的第一个文件及第二预测模型输出的所述用户登录所述云虚拟机后可能访问的M个文件确定所述用户登录所述云虚拟机后可能访问的文件。
  16. 如权利要求9所述的设备,其特征在于,所述处理单元具体用于:
    生成一个预测请求,并将所述预测请求通过传输单元发送至一电子设备,所述预测请求用于指示所述电子设备对所述用户登录到云虚拟机后可能访问的至少一个文件进行预测;
    通过传输单元接收所述电子设备所预测的所述电子设备对所述用户登录到云虚拟机后可能访问的至少一个文件。
PCT/CN2020/129865 2019-11-29 2020-11-18 一种基于云虚拟机的数据访问方法及设备 WO2021104132A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911205728.7 2019-11-29
CN201911205728.7A CN111158807A (zh) 2019-11-29 2019-11-29 一种基于云虚拟机的数据访问方法及设备

Publications (1)

Publication Number Publication Date
WO2021104132A1 true WO2021104132A1 (zh) 2021-06-03

Family

ID=70556304

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/129865 WO2021104132A1 (zh) 2019-11-29 2020-11-18 一种基于云虚拟机的数据访问方法及设备

Country Status (2)

Country Link
CN (1) CN111158807A (zh)
WO (1) WO2021104132A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111158807A (zh) * 2019-11-29 2020-05-15 华为技术有限公司 一种基于云虚拟机的数据访问方法及设备
CN112862099B (zh) * 2021-03-12 2023-11-07 云知声智能科技股份有限公司 企业级神经网络模型处理方法、装置、电子设备和存储介质
CN113946853B (zh) * 2021-10-29 2024-01-30 苏州浪潮智能科技有限公司 一种文件过滤方法、装置、电子设备及存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130226837A1 (en) * 2012-02-23 2013-08-29 Microsoft Corporation Content Pre-fetching for Computing Devices
CN108710528A (zh) * 2018-05-09 2018-10-26 深圳安布斯网络科技有限公司 桌面云虚拟机的访问、控制方法、装置、设备及存储介质
WO2019001463A1 (zh) * 2017-06-30 2019-01-03 华为技术有限公司 数据处理方法及装置
CN110020310A (zh) * 2017-12-05 2019-07-16 广东欧珀移动通信有限公司 资源加载的方法、装置、终端及存储介质
CN111158807A (zh) * 2019-11-29 2020-05-15 华为技术有限公司 一种基于云虚拟机的数据访问方法及设备

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7437438B2 (en) * 2001-12-27 2008-10-14 Hewlett-Packard Development Company, L.P. System and method for energy efficient data prefetching
CN108418871B (zh) * 2018-02-09 2022-02-11 国家电网公司 一种云存储性能优化方法和系统

Also Published As

Publication number Publication date
CN111158807A (zh) 2020-05-15


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20892418

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20892418

Country of ref document: EP

Kind code of ref document: A1