WO2023236444A1 - Data reading method, apparatus and electronic device - Google Patents

Data reading method, apparatus and electronic device

Info

Publication number
WO2023236444A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
read
read data
page
target application
Prior art date
Application number
PCT/CN2022/130597
Other languages
English (en)
French (fr)
Inventor
杨正
Original Assignee
杨正
卢聪
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 杨正, 卢聪
Publication of WO2023236444A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/445 Program loading or initiating
    • G06F9/44505 Configuring for program initiating, e.g. using registry, configuration files
    • G06F9/44521 Dynamic linking or loading; Link editing at or after load time, e.g. Java class loading

Definitions

  • The present application relates to the field of communication processing technology, and in particular to a data reading method, apparatus and electronic device.
  • An application may take a long time to read data, so users wait a long time on a page before they see the content they want. For example, after downloading a game APP, a user often has to wait a long time to enter a game scene or to use certain props and skills during play, and therefore frequently experiences game lag.
  • a first aspect of this application provides a data reading method, which method includes:
  • the user behavior data includes multiple read data sequences, and the method further includes:
  • the user behavior data includes multiple read data sequences. Based on multiple page screenshots and the user behavior data, the pre-read data to be read by the target application next time is determined, including:
  • the read-ahead data to be read by the target application next time is determined.
  • determine the pre-read data to be read by the target application next time including:
  • the prediction model is obtained by training the target neural network with multiple joint training samples as input and the actual read data sample corresponding to each joint training sample as the true value;
  • Each of the joint training samples includes multiple page screenshot samples and user behavior data samples, and the actual read data samples are data actually read by the target application.
  • the method also includes:
  • the target application reads data for the i-th time
  • the actual read data sequence actually read for the i-th time and the corresponding pre-read data sequence for the i-th time are obtained; where i is an integer greater than or equal to 1;
  • the prediction model is updated based on the target user behavior data, the target page screenshot and the actual read data sequence.
  • the method also includes:
  • the prediction model is updated, including:
  • the prediction model is updated with the newly added incremental samples in the current cycle as input and the corresponding actual read data sequence as the true value.
  • the user behavior data samples include multiple read data sequence samples
  • the prediction model is trained in the following manner:
  • the parameters of the target neural network are updated multiple times to obtain the prediction model.
  • the method also includes:
  • determine the pre-read data to be read by the target application next time including:
  • the plurality of page screenshots and the user behavior data are sent to the prediction model configured in the terminal to obtain the pre-read data to be read next;
  • the plurality of page screenshots and the user behavior data are sent to the second server to obtain the pre-read data to be read next.
  • the method includes:
  • when the target application is not installed, obtain in advance the startup running package and the startup image package of the target application; wherein the startup image package includes the startup data of the target application;
  • the data corresponding to the read request is read from the startup image package and/or the first server; wherein the first server includes all original data of the target application.
  • the embodiment of the present application also discloses a data reading device, the device includes:
  • the data acquisition module is used to obtain multiple page screenshots and user behavior data of the target application before the current moment; wherein the page screenshots are obtained by taking screenshots of the display page of the target application;
  • a data prediction module configured to determine the pre-read data to be read by the target application next time based on multiple screenshots of the page and the user behavior data;
  • a download module configured to download the pre-read data from the first server and store the pre-read data locally, so that when the next read request of the target application hits the pre-read data, the pre-read data is read locally.
  • An embodiment of the present application also discloses an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor. When executed by the processor, the computer program implements the data reading method described in the first aspect.
  • An embodiment of the present application also discloses a computer-readable storage medium, which stores a computer program that causes the processor to execute the data reading method described in the first aspect of the present application.
  • An embodiment of the present application also discloses a computer program product, which includes a computer program/instructions.
  • Multiple page screenshots and user behavior data of the target application before the current moment can be obtained; based on the multiple page screenshots and the user behavior data, the pre-read data to be read by the target application next time is determined; the pre-read data is then downloaded from the first server and stored in the local cache, so that when the next read request of the target application hits the pre-read data, the pre-read data is read from the local cache.
  • Because the pre-read data to be read by the user next time is obtained from the page screenshots and user behavior data, and is downloaded from the first server in advance and stored locally on the terminal, the target application can read the data directly from local storage the next time it reads data.
  • Reading the data locally greatly shortens the reading path, which greatly improves the data reading efficiency of the target application, shortens the data reading time and avoids lag in the target application during use.
  • The user behavior data can reflect the user's reading behavior when using the target application, and the display page is generally the interface the user is currently viewing, which is correlated with the user's reading behavior.
  • For example, in a game APP, when the user slides up, down, left and right in a game scene page, the user's viewing behavior in the game scene can be recorded through screenshots of the display page, while the user behavior data generally includes behaviors such as scene switching and prop purchase and replacement, which are closely related to the viewing behavior in the game scene.
  • Figure 1 is a diagram of the software and hardware environment in which a data reading method is executed in an embodiment of the present application
  • Figure 2 is a step flow chart of a data reading method in an embodiment of the present application
  • Figure 3 is a flow chart of steps for determining pre-read data in the embodiment of the present application.
  • Figure 4 is a schematic diagram of the overall flow of model training in the embodiment of the present application.
  • Figure 5 is a schematic flowchart of self-iterative update of the prediction model in the embodiment of the present application.
  • Figure 6a is a software and hardware environment diagram in server mode in the embodiment of the present application.
  • Figure 6b is a software and hardware environment diagram in client mode in the embodiment of the present application.
  • Figure 7 is a schematic structural framework diagram of a data reading device in an embodiment of the present application.
  • Embodiments of the present invention can be applied to the various operating systems of terminals.
  • The terminals include PC terminals and mobile terminals.
  • The operating systems include PC-side operating systems such as Windows, Linux, Unix and virtual machine simulation systems, and mobile terminal operating systems such as Android and iOS.
  • The target application in the embodiments of the present invention may refer to an application with a large software installation package and data package, such as 3D games, PS and other applications; the target application may be a PC-side application or a mobile terminal application (Application, APP).
  • Taking a mobile terminal as an example, the method and system of the embodiments of the present invention are described below.
  • This application proposes a data pre-reading scheme. Specifically, screenshots of the display page of the target application can be taken before the current moment to obtain multiple page screenshots. Then, based on the multiple page screenshots and the user behavior data before the current moment, the pre-read data that the target application will read next is determined; the pre-read data can be downloaded from the first server in advance and stored locally on the terminal, thereby shortening the time the target application takes to read data next time and improving data reading efficiency.
  • This improvement plan can speed up the download, installation and startup of the target application and ensure its operation, as detailed below:
  • For target applications with larger data packages, it takes longer to download and install the target application.
  • Running such a target application also requires more bandwidth resources and places higher performance requirements on the terminal. Therefore, an application with a larger data package requires a very long waiting time when users first use it, and its requirements for terminal performance and network bandwidth resources are also high during use.
  • The startup running package and the startup image package of the target application can be obtained in advance when the target application is not installed, and the target application can be started through the startup running package.
  • After the target application is started, in response to a read request of the target application, the data corresponding to the read request is read from the startup image package and/or the first server; wherein the first server includes all original data of the target application.
  • The startup running package is used to start the target application.
  • The startup running package contains the most basic files for starting the target application.
  • During installation, the startup running package completes the installation of the basic components of the target application and configures the interaction between the target application and the terminal; the amount of data contained in the startup running package is therefore very small, which greatly improves the efficiency with which the terminal downloads it.
  • The startup image package includes the startup data of the target application.
  • The terminal obtains the startup image package while obtaining the startup running package.
  • When the target application starts, it can obtain the data required for startup directly from local storage, which greatly reduces the startup time of the target application.
  • The data corresponding to a read request can be obtained from the startup image package and/or the first server; when the startup image package contains the data corresponding to the read request, that data can be obtained directly from the startup image package.
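  • The lookup order described above can be illustrated with a minimal Python sketch. It assumes the startup image package can be modelled as an in-memory dictionary keyed by a data identifier, and that the first server is reached through a caller-supplied function; `read_data`, `startup_image` and `fetch_from_first_server` are hypothetical names that do not appear in the application.

```python
from typing import Callable, Dict, List


def read_data(data_id: str,
              startup_image: Dict[str, bytes],
              fetch_from_first_server: Callable[[str], bytes]) -> bytes:
    """Serve a read request from the startup image package when possible,
    otherwise fall back to the first server (which holds all original data)."""
    if data_id in startup_image:
        return startup_image[data_id]
    return fetch_from_first_server(data_id)


# Hypothetical data: the image package holds only startup data.
startup_image = {"splash.png": b"splash"}
server_store = {"splash.png": b"splash", "level1.map": b"level-1"}
server_requests: List[str] = []


def fetch(data_id: str) -> bytes:
    # Stand-in for a network call to the first server.
    server_requests.append(data_id)
    return server_store[data_id]


local_result = read_data("splash.png", startup_image, fetch)
remote_result = read_data("level1.map", startup_image, fetch)
```

  • In this sketch only the second request reaches the server; the first is answered from the locally stored startup image package.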
  • The supplier of the target application can divide the original data of the target application into a startup running package and a startup image package according to the function the data plays in the target application.
  • The data outside the startup package is packaged into the original image package, and the startup running package, startup image package and original image package can be uploaded to the first server. Therefore, when the user downloads and installs the target application, the user can first download the startup running package and the startup image package.
  • Because the startup running package and startup image package are small and download quickly, they help users quickly start the target application on the terminal.
  • The running speed of the target application can thus be improved, allowing users to use the target application as quickly as possible.
  • The game APP can be quickly downloaded and started.
  • When the user plays the game APP, the local startup image package and the original image package on the first server support data acquisition during play, so the user's waiting time is greatly reduced without affecting gameplay.
  • Because only the startup running package and startup image package of the target application are stored in the terminal, and the corresponding pre-read data is downloaded from the first server only while the target application is in use, the occupation of the terminal's storage resources is reduced compared with storing all the data of the target application in the terminal, which safeguards the performance of the terminal.
  • Because the pre-read data is read from the first server in advance and stored locally in the terminal, the data stored locally grows as the target application is used, and the probability of reading data directly from local storage increases. As a result, the more the target application is used, the faster it responds and the smoother the user's experience becomes.
  • Referring to FIG. 1, a schematic diagram of the software and hardware environment of the present application is shown. As shown in Figure 1, the environment includes a terminal, a first server and a second server; the first server stores all original data required to run the target application (the original image package), and the second server is used to determine the pre-read data to be read next by the target application.
  • The terminal is configured with a pre-reading module, a file management module and an application screenshot module corresponding to the target application.
  • The file management module obtains the user behavior data of the target application, and the application screenshot module obtains page screenshots of the display page.
  • The pre-reading module has a data path to each of the file management module and the application screenshot module, through which it receives the user behavior data sent by the file management module and the multiple page screenshots sent by the application screenshot module.
  • The pre-reading module can package the user behavior data and the multiple page screenshots and send them to the second server. The second server determines the pre-read data to be read next by the target application based on the user behavior data and page screenshots, and feeds back the identifier of the pre-read data to the pre-reading module; the pre-reading module then downloads the pre-read data from the first server.
  • The pre-reading module can communicate with the first server through multiple interfaces, and obtains the determined pre-read data from the first server through these interfaces in sequence; the multiple interfaces can include a content distribution network (CDN) interface, a peer-to-peer (P2P) transmission interface and an origin server interface.
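  • One possible reading of "through multiple interfaces in sequence" is an ordered fallback: try the CDN first, then P2P, then the origin server. The Python sketch below illustrates that reading only; the application does not specify the ordering or the failure handling, and `download_pre_read` and the three stub fetchers are hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple

# A fetcher returns the payload, returns None if it does not hold the data,
# or raises OSError (e.g. ConnectionError) if the interface is unreachable.
Fetcher = Callable[[str], Optional[bytes]]


def download_pre_read(data_id: str,
                      interfaces: Sequence[Tuple[str, Fetcher]]) -> Tuple[str, bytes]:
    """Try each interface in order and return (interface name, payload)."""
    for name, fetch in interfaces:
        try:
            payload = fetch(data_id)
        except OSError:
            continue  # interface unreachable, try the next one
        if payload is not None:
            return name, payload
    raise LookupError(f"{data_id} is unavailable on every interface")


# Hypothetical stubs standing in for real network clients.
def cdn_fetch(data_id: str) -> Optional[bytes]:
    return None  # simulated cache miss on the CDN


def p2p_fetch(data_id: str) -> Optional[bytes]:
    raise ConnectionError("no peer available")


def origin_fetch(data_id: str) -> Optional[bytes]:
    return b"payload:" + data_id.encode()


source, payload = download_pre_read(
    "tex_42", [("cdn", cdn_fetch), ("p2p", p2p_fetch), ("origin", origin_fetch)]
)
```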
  • The file management module can be initialized when the target application is started through the startup running package.
  • The application screenshot module can obtain page screenshots in the following ways:
  • One way is to directly take screenshots of the display page of the target application at a preset frequency to obtain multiple page screenshots.
  • Another way is to use the screenshot capture component configured in the terminal's operating system to take a screenshot of the display page, then obtain the captured image frame from the display buffer of the video memory through the calling interface with the terminal's video memory, and use the obtained image frame as a page screenshot.
  • For example, the application screenshot module can call the Nvidia Game Stream interface to extract the relevant image frames from the GPU memory and the display buffer corresponding to the current 3D game. Since the frame rate of a game is generally 60-120 fps, image frames can be extracted at a certain sampling interval to obtain multiple page screenshots.
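  • As a rough illustration of sampling frames at an interval, the Python sketch below converts a frame rate and a desired screenshot interval into a frame stride. The 100-200 ms interval is the preset time mentioned elsewhere in this application; `sampling_stride` and `sample_frames` are hypothetical helpers, and real frame capture is not shown.

```python
from typing import List, Sequence


def sampling_stride(frame_rate_fps: float, screenshot_interval_ms: float) -> int:
    """Number of rendered frames between two sampled frames (at least 1)."""
    frames_per_interval = frame_rate_fps * screenshot_interval_ms / 1000.0
    return max(1, round(frames_per_interval))


def sample_frames(frames: Sequence[int], stride: int) -> List[int]:
    """Keep every `stride`-th frame, starting with the first."""
    return list(frames[::stride])


stride_60 = sampling_stride(60, 100)     # 60 fps, one screenshot per 100 ms
stride_120 = sampling_stride(120, 200)   # 120 fps, one screenshot per 200 ms
one_second_of_frames = list(range(60))   # frame indices for 1 s at 60 fps
sampled = sample_frames(one_second_of_frames, stride_60)
```

  • With these inputs, one second of a 60 fps game yields ten page screenshots.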
  • This application can preprocess the screenshot of the display page to obtain the required page screenshot.
  • The preprocessing can include adjusting the direction and size of the screenshot and marking the screenshot with a timestamp.
  • When adjusting the direction, the screenshot can be rotated to the forward direction.
  • When adjusting the size, the screenshot can be compressed to the target size.
  • Preprocessing the screenshot of the display page may also include cropping away the edges of the screenshot that do not belong to the display page of the target application.
  • This applies when the screenshot mechanism takes a full-screen screenshot; for example, some screenshot software configured in the operating system captures the entire display screen, so the screenshot inevitably includes content other than the display page of the target application.
  • For example, when such a mechanism takes a full-screen screenshot of the terminal, the screenshot includes the taskbar of the operating system and the parts of the desktop that are not covered by the game page.
  • In this case, the image area that belongs to the display page of the target application can be cropped out of the screenshot, removing the edge regions that do not belong to the target application, and the cropped image area is used as the page screenshot described in this application.
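  • The preprocessing steps above (cropping to the application window, compressing, timestamping) can be sketched in plain Python. This is a toy illustration under stated assumptions: an image is modelled as a list of rows of grayscale values, the window rectangle `app_box` is assumed to be known, compression is naive subsampling, and direction adjustment is omitted. All names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

Image = List[List[int]]  # rows of grayscale pixel values


@dataclass
class PageScreenshot:
    pixels: Image
    timestamp_ms: int  # moment at which the display page was captured


def crop(image: Image, box: Tuple[int, int, int, int]) -> Image:
    """Keep rows top..bottom-1 and columns left..right-1."""
    left, top, right, bottom = box
    return [row[left:right] for row in image[top:bottom]]


def downscale(image: Image, factor: int) -> Image:
    """Naive compression: keep every `factor`-th pixel in both directions."""
    return [row[::factor] for row in image[::factor]]


def preprocess(full_screen: Image, app_box: Tuple[int, int, int, int],
               factor: int, timestamp_ms: int) -> PageScreenshot:
    """Crop away non-application edges, compress, and attach the timestamp."""
    return PageScreenshot(downscale(crop(full_screen, app_box), factor), timestamp_ms)


# A 6x6 "full-screen" capture whose inner 4x4 region is the application page.
full_screen = [[10 * r + c for c in range(6)] for r in range(6)]
shot = preprocess(full_screen, app_box=(1, 1, 5, 5), factor=2,
                  timestamp_ms=1_700_000_000_100)
```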
  • Referring to FIG. 2, a flow chart of the steps of a data reading method in an embodiment is shown. As shown in Figure 2, the method may specifically include the following steps:
  • Step S201 Obtain multiple page screenshots and user behavior data of the target application before the current moment.
  • the page screenshot is obtained by taking a screenshot of the display page of the target application.
  • The display page of the target application can be screenshotted once every preset time, such as every 100-200 ms, thereby obtaining page screenshots.
  • Multiple page screenshots can be obtained by taking multiple screenshots of the display page of the target application within a preset period before the current moment.
  • The user behavior data can likewise be the behavior data obtained within the preset period before the current moment.
  • The preset period can be set to 10 s.
  • User behavior data can record the user's operating behavior in the target application, including but not limited to a data reading log.
  • The data reading log records the data read each time within the preset period and the reading time of each read.
  • The data reading log can include multiple data reading records.
  • Each data reading record can include the identifier of the data read and the reading time of that read.
  • The reading time can be the timestamp corresponding to the reading moment.
  • Each page screenshot can also include a timestamp corresponding to the page screenshot.
  • The timestamp represents the moment at which the display page was captured. For example, if a screenshot of the display page was taken at 12:12:34 and 100 milliseconds, the timestamp corresponding to that moment can be recorded in the resulting page screenshot.
  • user behavior data not only includes data reading records, but also includes user jump behavior data between pages, data writing behavior data, etc.
  • jump behavior data can reflect when the user jumped from page to page.
  • Step S202 Based on a plurality of page screenshots and the user behavior data, determine the pre-read data to be read by the target application next time.
  • Each data reading record in the data reading log can be compared with the page screenshots, thereby obtaining the association between each page screenshot and the data reading records.
  • In this way, the association between the display page viewed by the user in the target application and the data reading behavior performed by the user while viewing that display page can be established.
  • For example, in a game, multiple page screenshots can represent which game scenes the user has watched, the data reading records can represent which data the user has read, and the association can represent which data was read while the user was watching those game scenes.
  • A page screenshot may not have a corresponding data reading record.
  • the resource data that the user is interested in within a preset period of time can be obtained.
  • the pre-read data that the user will read next can be predicted based on the resource data that the user is interested in.
  • Step S203 Download the pre-read data from the first server and store the pre-read data locally, so that when the next read request of the target application hits the pre-read data, the pre-read data is read locally.
  • The determined pre-read data can be represented as an identifier of the pre-read data, such as an ID of the pre-read data, so that the pre-read data can be downloaded from the first server based on this identifier.
  • The pre-read data is stored locally, for example in the external memory of the terminal; it can of course also be stored in the memory of the terminal. In this way, when the next read request of the target application hits the pre-read data, the pre-read data can be read directly from local storage, which greatly shortens the data reading path and achieves fast data reading.
  • When the pre-read data includes only part of the data required by the next read request, the remaining data that misses the pre-read data can be downloaded from the first server; the remaining data downloaded from the first server and the hit data in the pre-read data are then encapsulated into a data packet and fed back to the target application.
  • In this case the amount of data to be downloaded in real time is still reduced, so the real-time efficiency of reading data is also improved.
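  • The partial-hit case can be sketched as follows. This is a minimal Python illustration, assuming the local pre-read store is a dictionary keyed by data ID and that the "data packet" is simply a mapping from ID to payload; `serve_read_request` and its parameters are hypothetical names.

```python
from typing import Callable, Dict, List, Sequence, Tuple


def serve_read_request(
    requested_ids: Sequence[str],
    local_pre_read: Dict[str, bytes],
    fetch_from_first_server: Callable[[str], bytes],
) -> Tuple[Dict[str, bytes], List[str]]:
    """Answer a read request from local pre-read data where it hits and
    download only the missed items; returns (packet, missed IDs)."""
    missed = [data_id for data_id in requested_ids if data_id not in local_pre_read]
    downloaded = {data_id: fetch_from_first_server(data_id) for data_id in missed}
    # Assemble the packet in the order the target application asked for.
    packet = {
        data_id: local_pre_read[data_id] if data_id in local_pre_read
        else downloaded[data_id]
        for data_id in requested_ids
    }
    return packet, missed


# Hypothetical example: two of three requested items were pre-read.
pre_read = {"a": b"A", "b": b"B"}
packet, missed = serve_read_request(
    ["a", "c", "b"], pre_read, lambda data_id: data_id.upper().encode()
)
```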
  • The duration of downloading the pre-read data from the first server can be limited to a target duration. If the pre-read data has not been downloaded within the target duration, its download is abandoned.
  • The target duration can be determined based on the interval between every two consecutive data reads by the user; for example, it can be the average of those intervals. In this way, when the target application initiates the next read request while the pre-read data has not yet been successfully read, a conflict between the pre-read and the actual read is avoided and the next read request of the target application can be responded to independently, which ensures the data reading efficiency of the target application.
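  • The average-interval rule for the target duration can be written down directly. The Python sketch below assumes read timestamps are available in milliseconds and in chronological order; the function names are hypothetical, and the actual download cancellation mechanism is not shown.

```python
from statistics import mean
from typing import Sequence


def target_duration_ms(read_timestamps_ms: Sequence[int]) -> float:
    """Average interval between every two consecutive data reads."""
    intervals = [later - earlier
                 for earlier, later in zip(read_timestamps_ms, read_timestamps_ms[1:])]
    if not intervals:
        raise ValueError("at least two reads are needed to measure an interval")
    return mean(intervals)


def should_abandon_download(elapsed_ms: float, target_ms: float) -> bool:
    """Stop pre-reading once the download has used up the target duration."""
    return elapsed_ms >= target_ms


# Hypothetical log: reads at 0 ms, 400 ms, 1000 ms and 1200 ms.
target = target_duration_ms([0, 400, 1000, 1200])  # intervals: 400, 600, 200
```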
  • The target application can read data directly from the local terminal the next time it reads data.
  • Reading the data locally greatly shortens the reading path, and thus the data reading time; this improves the data reading efficiency of the target application, avoids lag during use and makes the target application smoother to use.
  • Because the pre-read data that the user will read next is obtained from the page screenshots and user behavior data, the correlation between the user's page viewing behavior and data reading behavior can be established. Based on this correlation, the data the user will read next can be accurately predicted, which increases the probability that the next read request hits the pre-read data and ensures that each read request can be responded to quickly.
  • The user behavior data may include a data reading log, and the data reading log may include multiple data reading records. A data reading record can include the identifier of the data read, such as the ID of the data, and each read generally covers not just one piece of data but possibly several; each data reading record therefore represents an ID sequence. A data reading record can thus be called a read data sequence, and multiple data reading records are multiple read data sequences.
  • The multiple page screenshots and the multiple read data sequences can first be aligned, that is, a page screenshot is mapped to a read data sequence so that the user's viewing behavior and data reading behavior are accurately matched.
  • the multiple page screenshots and the multiple read data sequences can be aligned based on the respective times of the multiple page screenshots and the respective times of the multiple read data sequences.
  • the read data sequence matching the timestamp can be used as the read data sequence aligned with the page screenshot based on the timestamp corresponding to the page screenshot.
  • the read data sequence matching the timestamp corresponding to the page screenshot refers to the read data sequence in which the time difference between the timestamp corresponding to the reading moment and the timestamp corresponding to the page screenshot is within the target time difference range.
  • Before aligning the page screenshots with the read data sequences, the read data sequences can be preprocessed, for example by deduplicating read data sequences that were retransmitted due to network or system abnormalities.
  • A missing read data sequence can be completed based on the time of the page screenshot.
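  • The deduplication and timestamp alignment described above can be sketched in Python. The sketch assumes that a retransmitted record is an exact duplicate (same timestamp and same IDs) and that, when several sequences fall inside the target time difference range, the closest one is chosen; both choices, and all names, are illustrative assumptions rather than details given in the application.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class ReadSequence:
    timestamp_ms: int           # reading moment
    data_ids: Tuple[str, ...]   # IDs of the data read in this record


def deduplicate(sequences: Sequence[ReadSequence]) -> List[ReadSequence]:
    """Drop exact duplicates while preserving the original order."""
    seen = set()
    unique = []
    for seq in sequences:
        if seq not in seen:
            seen.add(seq)
            unique.append(seq)
    return unique


def align(screenshot_times_ms: Sequence[int], sequences: Sequence[ReadSequence],
          max_diff_ms: int) -> List[Tuple[int, Optional[ReadSequence]]]:
    """Pair each screenshot with the closest read sequence within max_diff_ms,
    or with None when no sequence falls inside the range."""
    aligned = []
    for shot_time in screenshot_times_ms:
        candidates = [s for s in sequences
                      if abs(s.timestamp_ms - shot_time) <= max_diff_ms]
        best = min(candidates, key=lambda s: abs(s.timestamp_ms - shot_time),
                   default=None)
        aligned.append((shot_time, best))
    return aligned


log = [ReadSequence(105, ("a", "b")), ReadSequence(105, ("a", "b")),
       ReadSequence(310, ("c",))]
unique_log = deduplicate(log)
pairs = align([100, 200, 300], unique_log, max_diff_ms=50)
```

  • The screenshot at 200 ms ends up with no aligned sequence, matching the remark that a page screenshot may not have a corresponding data reading record.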
  • the process of determining read-ahead data may include the following steps:
  • Step S301 Based on each aligned page screenshot and the corresponding read data sequence, determine the attention score between each page screenshot and the plurality of read data sequences.
  • Although each page screenshot has a read data sequence aligned with it, in order to accurately reflect the impact of display page jumps on the data read, the attention score between each page screenshot and the multiple read data sequences can be determined, that is, the attention score between each page screenshot and each read data sequence.
  • The attention score characterizes the degree of correlation between the display page currently viewed by the user and the data read before and after. In this way, based on the viewing behavior and data reading behavior in chronological order, the degree of correlation between the display page viewed by the user and the data read can be better expressed, that is, the influence of the display page viewed by the user on the data read by the user can be obtained.
  • The attention score between each page screenshot and the multiple read data sequences can be determined through a BERT (Bidirectional Encoder Representations from Transformers) model.
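  • To make the notion of an attention score concrete, the Python sketch below computes scaled dot-product attention weights between one page-screenshot embedding and several read-data-sequence embeddings. It is not the BERT model named above: BERT computes attention internally with learned projections over many heads and layers, whereas this toy assumes fixed, hand-written embedding vectors and a hypothetical `attention_scores` helper.

```python
import math
from typing import List, Sequence


def attention_scores(page_vec: Sequence[float],
                     seq_vecs: Sequence[Sequence[float]]) -> List[float]:
    """Softmax of scaled dot products between one page embedding (query)
    and each read-data-sequence embedding (keys)."""
    scale = math.sqrt(len(page_vec))
    logits = [sum(p * s for p, s in zip(page_vec, vec)) / scale for vec in seq_vecs]
    peak = max(logits)  # subtract the maximum for numerical stability
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical 2-dimensional embeddings.
page_embedding = [1.0, 0.0]
sequence_embeddings = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
scores = attention_scores(page_embedding, sequence_embeddings)
```

  • The scores sum to 1, and the sequence whose embedding points in the same direction as the page embedding receives the highest score.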
  • Step S302 Based on the attention score, determine the pre-read data to be read by the target application next time.
  • Since the attention score characterizes the degree of correlation between the display page currently viewed by the user and the data read before and after, a probability transition matrix for the display pages and a probability transition matrix for the read data sequences can be constructed from the multiple attention scores corresponding to each page screenshot (one attention score per read data sequence), and the pre-read data that the user will read next is determined based on the probability transition matrices.
  • The probability transition matrix is used in this application to reflect the probability distribution of the data reading behavior moving from one state to another; in other words, the regularities of the user's data reading behavior are learned and expressed as a probability transition matrix, from which the pre-read data to be read next can be predicted.
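  • The idea of predicting the next read from a probability transition matrix can be illustrated with a simple first-order Markov estimate. Note the substitution: the application builds its matrices from attention scores, while this Python sketch estimates transition probabilities from observed counts in a history of states, purely to show how such a matrix is used for prediction. The state names and helper functions are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Optional, Sequence


def build_transition_matrix(history: Sequence[str]) -> Dict[str, Dict[str, float]]:
    """Estimate P(next state | current state) from consecutive pairs."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for current, following in zip(history, history[1:]):
        counts[current][following] += 1
    return {
        state: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for state, row in counts.items()
    }


def predict_next(matrix: Dict[str, Dict[str, float]], current: str) -> Optional[str]:
    """Return the most probable next state, or None for an unseen state."""
    row = matrix.get(current)
    if not row:
        return None
    return max(row, key=row.get)


# Hypothetical history of read data sequences, labelled by scene.
history = ["lobby", "shop", "lobby", "arena", "lobby", "shop"]
matrix = build_transition_matrix(history)
next_read = predict_next(matrix, "lobby")
```

  • Here "lobby" is followed by "shop" in two of three observed transitions, so the data associated with "shop" would be the pre-read candidate.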
  • a further explanation is: since multiple page screenshots record the user's viewing behavior, they contain rich information such as the scene area, the relationship between the display page and resource data distribution; and in the design of the target application, the scene area, function page and The resource data distribution of related components is basically a one-to-one correspondence.
  • a prediction model for determining pre-read data can be constructed based on the idea of machine learning.
  • the prediction model can be obtained by training the target neural network based on the corresponding training samples; in this way, the page screenshots and read data sequences can be directly input into the prediction model to determine the pre-read data to be read next.
  • the training samples used to train the prediction model can include multiple joint training samples.
  • Each joint training sample includes multiple page screenshot samples and user behavior data samples.
  • the actual read data samples are the data actually read by the target application; the actual read data sample serves as the true value in the training process.
  • This true value is used as the supervised-learning target for constructing a loss function, so that the loss value keeps decreasing during training and the parameters of the target neural network are continuously updated until the network converges, yielding the prediction model.
  • a prediction model with strong prediction performance can be trained through a large number of training samples, thereby improving the accuracy of the prediction model in determining pre-read data.
  • the target neural network includes a first feature extraction module, a second feature extraction module, a splicing module and a prediction module.
  • Figure 4 takes the prediction module as a Transformer model as an example.
  • the training process of the prediction model is as follows:
  • the user behavior data sample includes multiple read data sequence samples
  • the multiple page screenshot samples are input to the first feature extraction module to obtain the first feature vectors corresponding to the multiple page screenshot samples, and the multiple read data sequence samples are input to the second feature extraction module to obtain the second feature vectors corresponding to the multiple read data sequence samples;
  • the joint vector obtained by splicing the first feature vector and the second feature vector is input to the prediction module to obtain the pre-read data sequence output by the prediction module, wherein the splicing module is used to splice the first feature vector and the second feature vector.
  • the parameters of the target neural network are updated multiple times to obtain the prediction model.
  • the first feature extraction module is used to extract features of page screenshot samples to form a feature map.
  • the first feature extraction module may include multiple sequentially connected convolution layers. Through these convolution layers, features at different scales can be extracted from the page screenshots to obtain feature maps, which are then converted into feature vectors to obtain the first feature vector.
  • the second feature extraction module can be used to vectorize each read data sequence sample in a specified format to obtain a second feature vector.
  • the prediction module can be a Transformer model.
  • the Transformer model uses a large number of multi-head Self-Attention (self-attention) mechanisms.
  • the algorithm used by the Transformer model is a time-series algorithm based on the attention mechanism, so the prediction model can learn the degree of correlation between the display pages of the target application and the read data from the time-series continuous joint vectors.
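To make the attention step concrete, the following is a minimal single-head sketch of scaled dot-product self-attention over a short sequence of joint vectors. It is a generic illustration of the mechanism, not the network described in the application: the dimensions, the random projection matrices and the use of a single head (the text mentions multi-head attention) are assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(joint, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    joint: (T, d) array, one joint vector per time step.
    Returns the attended vectors (T, d_v) and the attention weights (T, T).
    """
    q, k, v = joint @ w_q, joint @ w_k, joint @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])  # pairwise attention scores
    weights = softmax(scores, axis=-1)       # each row sums to 1
    return weights @ v, weights

rng = np.random.default_rng(0)
T, d = 5, 8  # assumed: 5 time steps, 8-dimensional joint vectors
joint = rng.normal(size=(T, d))
w_q, w_k, w_v = (rng.normal(size=(d, d)) for _ in range(3))
context, weights = self_attention(joint, w_q, w_k, w_v)
```

Row t of `weights` can be read as how strongly time step t attends to every other time step, which is the kind of correlation the paragraph above refers to.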
  • a loss function can be constructed to obtain the loss value of the target neural network based on the pre-read data sequence and the actual read data sequence included in the actual read data sample, so that the parameters of the target neural network are updated based on the loss value.
  • the convergence of the target neural network can be characterized, and the resulting prediction model can be deployed to online applications.
  • the prediction model is continuously trained through a large number of training samples, the prediction accuracy of the prediction model for the pre-read data can be improved.
  • multiple page screenshots and the read data sequences included in the user behavior data can be input to the prediction model to obtain the pre-read data that the target application will read next.
  • the read data sequences and the page screenshots are first aligned by timestamp, and the aligned page screenshots and read data sequences are input to the first feature extraction module and the second feature extraction module respectively to obtain the first feature vectors and the second feature vectors. The splicing module then splices the time-ordered first feature vectors and second feature vectors into multiple time-series continuous joint vectors. Finally, these joint vectors are input to the prediction module to obtain the pre-read data to be read by the target application next time.
  • the prediction performance of the prediction model is related to the richness of the training samples.
  • the richer the training samples the better the generalization ability and robustness of the prediction model.
  • training samples can be continuously collected to dynamically update the prediction model to improve the prediction accuracy of the prediction model.
  • a "self"-driven iteration mechanism for the prediction model can be constructed. Through this mechanism, the prediction model enters an automatic iteration cycle: as the number of users of the target application increases and more and more training samples are collected, the predictions of the model become more and more accurate, and only a little manual intervention is needed for the prediction model to complete its self-update iterations.
  • FIG. 5 shows a schematic flow chart of the self-iterative updating of the prediction model.
  • when the target application reads data for the i-th time, the actual read data sequence actually read for the i-th time and the pre-read data sequence corresponding to the i-th time can be obtained; when the difference between the actual read data sequence of the i-th time and the pre-read data sequence output at the (i-1)-th time (which is in fact the predicted data sequence for the i-th read) exceeds the target difference, the prediction model is updated.
  • i is an integer greater than or equal to 1.
  • the i-th read data may refer to any read request of the target application. That is, every time the target application issues a read request and needs to read data, it can determine the data actually to be read by the target application through the read request.
  • the actual data to be read is the actual read data sequence. As mentioned above, it can be expressed as an ID sequence. The identifiers of the data to be read this time can be carried in the read request, so the actual read data sequence of the i-th read can be obtained by parsing the read request.
  • since the pre-read data corresponding to the read request, that is, the pre-read data sequence, has been predicted in advance, the difference between the actual read data sequence to be read by the i-th read request and the corresponding pre-read data sequence can be determined.
  • for example, the target application issues a read request for the fifth time;
  • the actual read data sequence to be read by this read request is "01-02-03-06-08";
  • the pre-read data sequence predicted at the fourth time for this read request is "01-03-04-07-08".
  • the difference between the actual read data sequence to be read and the corresponding pre-read data sequence may include inconsistencies in the read data sequence and inconsistencies in the reading order of the read data sequence.
  • the difference between the actual read data sequence and the corresponding pre-read data sequence can be represented by the error rate.
  • the target difference can be represented by the target error rate.
  • the error rate can be expressed as the proportion of positions at which the actual read data sequence and the pre-read data sequence are inconsistent. For example, "01-02-03-06-08" and "01-03-04-07-08" differ at three of five positions (the second, third and fourth), so the error rate is 60%.
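The error-rate comparison in this example can be sketched as a position-by-position mismatch count. This is only one possible reading of "inconsistent data"; the function name, the handling of sequences of unequal length and the 30% threshold are assumptions introduced for illustration.

```python
def error_rate(actual: str, predicted: str, sep: str = "-") -> float:
    """Fraction of positions where the two ID sequences disagree.

    Positions present in only one sequence count as mismatches (assumed rule).
    """
    a = actual.split(sep)
    p = predicted.split(sep)
    length = max(len(a), len(p))
    mismatches = sum(
        1 for i in range(length)
        if i >= len(a) or i >= len(p) or a[i] != p[i]
    )
    return mismatches / length

TARGET_ERROR_RATE = 0.3  # hypothetical target difference

rate = error_rate("01-02-03-06-08", "01-03-04-07-08")
needs_update = rate > TARGET_ERROR_RATE
```

With the two sequences from the example, three of five positions differ, so the rate exceeds the assumed threshold and the sample would trigger a model update.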
  • the page screenshots and user behavior data used to predict the i-th pre-read data can be re-entered into the prediction model, and the actual read data sequence of the i-th read is used as the true value to train the prediction model, updating its parameters so that the model re-learns the samples it predicted inaccurately.
  • the user behavior data and page screenshots obtained online in real time can be used to dynamically update the prediction model, so that the prediction model continuously learns the behavior of the current user group and the prediction accuracy of the pre-read data is maintained while the target application remains in use by that group.
  • the target user behavior data and target page screenshots corresponding to the i-th time can be added to the incremental sample pool as incremental samples; thereafter, the incremental samples newly added in the current cycle are periodically obtained from the incremental sample pool and used as training samples, with the corresponding actual read data sequences as the true values, to update the prediction model.
  • the prediction model can be dynamically updated using user behavior data and page screenshots obtained online in real time; in a specific implementation, the prediction model can be updated periodically, so that whenever the difference between the pre-read data sequence and the corresponding actual read data sequence is determined to be greater than the target difference, the corresponding target user behavior data and target page screenshots are added to the incremental sample pool as incremental samples.
  • a corresponding index table can be constructed for the incremental sample pool.
  • the index table can record the index of the incremental samples newly added in the current period. In this way, every time the prediction model needs to be updated, the index table can be used to determine the new incremental samples in the current period. The incremental samples added in the current period are used as training samples to update the prediction model.
  • for example, if the prediction model is updated once a week, then when the update time arrives, such as every Sunday, the incremental samples added from Monday of that week to the current time can be retrieved and input into the prediction model, and the prediction model continues to be trained through the above training process to obtain an updated prediction model.
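The incremental sample pool and its index table can be pictured with a small in-memory sketch. The class and field names are hypothetical, and keying the index by ISO year and week is an assumption chosen to match the weekly example; a real deployment would presumably use the database component described later.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class IncrementalSample:
    screenshots: list      # target page screenshots (placeholders here)
    behavior: list         # target user behavior data
    actual_sequence: list  # actual read data sequence, used as the true value
    added_on: date

@dataclass
class IncrementalSamplePool:
    samples: list = field(default_factory=list)
    # Index table: (ISO year, ISO week) -> positions of samples added that week.
    index: dict = field(default_factory=dict)

    def add(self, sample: IncrementalSample) -> None:
        iso = sample.added_on.isocalendar()
        self.index.setdefault((iso[0], iso[1]), []).append(len(self.samples))
        self.samples.append(sample)

    def samples_for_week(self, day: date) -> list:
        """Return the incremental samples added in the week containing `day`."""
        iso = day.isocalendar()
        return [self.samples[i] for i in self.index.get((iso[0], iso[1]), [])]

pool = IncrementalSamplePool()
pool.add(IncrementalSample(["s1.png"], ["01-02"], ["01", "02"], date(2024, 1, 1)))
pool.add(IncrementalSample(["s2.png"], ["03"], ["03"], date(2024, 1, 3)))
pool.add(IncrementalSample(["s3.png"], ["04"], ["04"], date(2024, 1, 8)))

# On Sunday 2024-01-07, fetch only the samples added since that week's Monday.
weekly_batch = pool.samples_for_week(date(2024, 1, 7))
```

The index table lets the weekly update pick out new samples without scanning the whole pool.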
  • the full training sample set is the set of all incremental samples from the current time point back to a time point half a year or a year earlier, or it can be a set formed from an existing basic training sample set plus the incremental samples collected within a period of time.
  • the training samples in the full training sample set are then input to the prediction model, and the prediction model continues to be trained through the above training process to obtain an updated prediction model; compared with the previous training method, this training method can minimize the decline of the model in specific scenarios and improve its generalization ability.
  • the updated prediction model can be used to predict the read-ahead data of the target application.
  • the prediction model can also be evaluated.
  • the model evaluation process of the related art can be used, such as constructing an evaluation data set, inputting the evaluation data set into the prediction model for inference, and determining from the prediction results (pre-read data sequences) output by the prediction model whether they meet the target indicators. If they do, it can be determined that the prediction model can be run online.
  • the updated prediction model is evaluated to determine whether it performs better on the target indicators. If so, the updated prediction model is run online as the latest version of the model. If not, the updated prediction model is not released, and the prediction model before the update continues to run online.
  • the target indicators can include the average prediction accuracy, recall rate, prediction range coverage, etc.
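The release decision described above can be sketched as a simple gate over the target indicators. The text does not define what "better" means across several indicators; the rule used here (no indicator regresses and at least one improves) and all metric values are assumptions for illustration.

```python
TARGET_INDICATORS = ("accuracy", "recall", "coverage")

def should_release(current: dict, updated: dict,
                   indicators=TARGET_INDICATORS) -> bool:
    """Release the updated model only if it is no worse on every indicator
    and strictly better on at least one (assumed rule)."""
    no_regression = all(updated[k] >= current[k] for k in indicators)
    some_gain = any(updated[k] > current[k] for k in indicators)
    return no_regression and some_gain

# Hypothetical evaluation results on the same evaluation data set.
online = {"accuracy": 0.81, "recall": 0.74, "coverage": 0.90}
candidate = {"accuracy": 0.85, "recall": 0.74, "coverage": 0.92}
release = should_release(online, candidate)
```

If the gate returns `False`, the model before the update simply keeps running online, as the text describes.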
  • the prediction model when the prediction model is used to determine the pre-read data, the prediction model can be deployed on the second server, or can also be deployed on the terminal running the target application.
  • the mode of determining the pre-read data through the prediction model on the second server may be called the server mode
  • the mode of determining the pre-read data through the prediction model on the terminal may be called the client mode.
  • in server mode, since the computing power of the server is greater than that of the terminal, occupying computing resources on the terminal is avoided, which reduces the performance consumption of the terminal and preserves its battery life.
  • Figure 6a shows a software and hardware environment diagram in server mode
  • Figure 6b shows a software and hardware environment diagram in client mode.
  • in addition to the prediction model, the environment also includes a first component 601 related to determining pre-read data and a second component 602 related to iterative update of the prediction model;
  • the first component 601 may include a data packet separation component, a log filtering component, an alignment component, and a preprocessing component.
  • the second component 602 may include an incremental sample mining component, a database component, an update component, and an evaluation component.
  • the above-mentioned first component 601, second component 602 and prediction model are all deployed in the second server.
  • the above-mentioned first component 601 and the prediction model are deployed on the terminal, while the second component 602 is deployed on the second server.
  • the terminal chooses to enter the server mode or the client mode according to its own performance configuration parameters.
  • the terminal can also switch between the server mode and the client mode according to its own performance configuration parameters during use.
  • the current performance configuration parameters of the terminal running the target application can be obtained, and when the current performance configuration parameters meet the target conditions, multiple page screenshots and user behavior data are sent to the prediction model configured in the terminal to obtain The pre-read data to be read next time; when the current performance configuration parameters do not meet the target conditions, multiple page screenshots and user behavior data are sent to the second server to obtain the pre-read data to be read next time.
  • the performance configuration parameters can reflect the hardware configuration, software configuration and operating system configuration of the terminal, thereby reflecting the performance of the terminal as a whole.
  • the performance of the terminal changes dynamically as it continuously runs other applications and stores data. Therefore, the current performance configuration parameters reflect how much of the terminal's software and hardware capacity is consumed by the applications currently running and the data currently stored.
  • the target condition can be any one of, or a combination of several of, the following conditions: the operating system being the target version of the operating system, the CPU occupancy rate of the terminal being lower than the target occupancy rate, the remaining capacity of the terminal's memory being not lower than the target capacity, and so on. The more conditions the target condition contains, the higher the performance configuration parameters of the terminal need to be before the prediction model configured on the terminal can be used to predict the pre-read data.
  • when installing the target application, whether to deploy the prediction model on the terminal may be determined based on the current performance configuration parameters of the terminal. For example, if the current performance configuration parameters meet the target conditions, the prediction model can be deployed on the terminal; otherwise, the prediction model is not deployed on the terminal.
  • in this way, the prediction of the pre-read data is matched to the computing power of the terminal itself.
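The choice between client mode and server mode can be sketched as a check of the current performance configuration parameters against a target condition. The field names, units and thresholds below are hypothetical; the application only lists the kinds of conditions (operating system version, CPU occupancy, remaining memory) that may be combined.

```python
from dataclasses import dataclass

@dataclass
class PerfConfig:
    os_version: str
    cpu_usage: float      # fraction between 0.0 and 1.0
    free_memory_mb: int

@dataclass
class TargetCondition:
    os_versions: frozenset   # acceptable operating system versions
    max_cpu_usage: float     # CPU occupancy must be below this value
    min_free_memory_mb: int  # remaining memory must be at least this value

def choose_mode(cfg: PerfConfig, cond: TargetCondition) -> str:
    """Return "client" when the terminal meets every condition, else "server"."""
    meets_all = (
        cfg.os_version in cond.os_versions
        and cfg.cpu_usage < cond.max_cpu_usage
        and cfg.free_memory_mb >= cond.min_free_memory_mb
    )
    return "client" if meets_all else "server"

cond = TargetCondition(frozenset({"OS-12"}), max_cpu_usage=0.6, min_free_memory_mb=512)
idle_mode = choose_mode(PerfConfig("OS-12", 0.35, 2048), cond)
busy_mode = choose_mode(PerfConfig("OS-12", 0.90, 2048), cond)
```

Because the parameters change while the terminal is in use, the same check can be repeated to switch modes at run time.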
  • the output end of the data packet separation component is connected to the log filtering component and the alignment component
  • the output end of the log filtering component is connected to the input end of the alignment component
  • the output end of the alignment component is connected to the input end of the preprocessing component
  • the output end of the preprocessing component is used to input the preprocessed page screenshots and user behavior data to the prediction model.
  • the second component includes an incremental sample mining component, a database component, an update component and an evaluation component, where the input end of the incremental sample mining component can be connected to the output end of the alignment component and the output end of the prediction model, so as to receive the read data sequences and page screenshots output by the alignment component as well as the pre-read data determined by the prediction model; for example, the pre-read data determined last time can be compared with the read data sequence output by the alignment component this time.
  • the output end of the incremental sample mining component is connected to the database component, and is used to store the read data sequence and page screenshots output by the alignment component, as well as the pre-read data determined by the prediction model, into the database;
  • the update component is connected to the database, and is used to store the read data sequence and page screenshots output by the alignment component in the database.
  • the second component 602 is deployed in the second server.
  • the prediction model in the second server is updated through the second component 602.
  • the prediction model can be delivered to the terminal, so that the terminal deploys the latest version of the prediction model.
  • page screenshots are transmitted in the form of short videos on the path from the target application to the preprocessing component; that is, multiple page screenshots are encoded into a video, transmitted along the path to the preprocessing component, and then decoded by the preprocessing component to recover the multiple page screenshots.
  • the file management module obtains the read data log of the target application.
  • the read data log includes multiple read data sequences. Each read data sequence has its own corresponding timestamp and is sent to the pre-read module.
  • the pre-reading module finds, in the read data log, the read data sequences that fall within the time range of short video A (start timestamp to end timestamp), packages them together with short video A into a log package A, and sends it to the data packet separation component in the second server.
  • the data packet separation component separates the log package A into short video A and several read data sequences; it sends short video A to the alignment component and sends the read data sequences to the log filtering component;
  • the log filtering component checks the received read data sequences for repeated read data sequences and removes the duplicates; after filling in the read data sequences missing in a certain time period, it sends the processed read data sequences to the alignment component.
  • the alignment component again aligns the time of each image frame in short video A with the time of the read data sequences, packages the aligned short video A and read data sequences into data group B, sends a copy of data group B to the incremental sample mining component, and sends data group B to the preprocessing component.
  • the preprocessing component separates the received data group B into short video A and the read data sequences, decodes short video A to obtain several page screenshots, and finds the matching page screenshot based on the timestamp of each read data sequence, thereby realigning the read data sequences with the page screenshots.
  • S8 The second server feeds back the identification of the pre-read data to be read next to the pre-read module.
  • the pre-reading module encapsulates a data download request based on the identification of the pre-read data to be read next, and sends the data download request to the first server.
  • the pre-reading module receives the pre-reading data returned by the first server and stores the pre-reading data in the external memory of the terminal.
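The log filtering and timestamp alignment steps in the flow above can be sketched as follows. This is a simplified illustration: read data sequences are represented as `(timestamp, sequence)` pairs, frames by their timestamps only, and "alignment" is taken to mean pairing each read data sequence with the nearest frame in time, which is an assumption about how matching is done. Filling in missing sequences is not covered.

```python
from bisect import bisect_left

def dedupe(read_log):
    """Drop repeated (timestamp, sequence) entries, keeping time order."""
    seen = set()
    filtered = []
    for entry in sorted(read_log):
        if entry not in seen:
            seen.add(entry)
            filtered.append(entry)
    return filtered

def align(frame_times, read_log):
    """Pair each read data sequence with the index of the nearest frame.

    frame_times must be sorted in ascending order and non-empty.
    """
    pairs = []
    for ts, seq in read_log:
        pos = bisect_left(frame_times, ts)
        candidates = [i for i in (pos - 1, pos) if 0 <= i < len(frame_times)]
        nearest = min(candidates, key=lambda i: abs(frame_times[i] - ts))
        pairs.append((nearest, seq))
    return pairs

# Hypothetical data: three decoded frames and a log with one duplicate entry.
frame_times = [0.0, 1.0, 2.0]
read_log = [(0.1, "01-02"), (1.9, "03"), (0.1, "01-02")]
aligned = align(frame_times, dedupe(read_log))
```

Each resulting pair links a frame index to the read data sequence recorded closest to it in time.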
  • FIG. 7 shows a schematic framework diagram of the data reading device. As shown in Figure 7, it includes the following modules:
  • the data acquisition module 701 is used to obtain multiple page screenshots and user behavior data of the target application before the current moment; wherein the page screenshots are obtained by taking screenshots of the display page of the target application;
  • the data prediction module 702 is used to determine the pre-read data to be read by the target application next time based on multiple screenshots of the page and the user behavior data;
  • the download module 703 is configured to download the pre-read data from the first server and store the pre-read data locally, so that when the next read request of the target application hits the pre-read data, the pre-read data is read locally.
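The hit-or-miss behavior of the download module can be sketched with a small local cache in front of a remote fetch function. The class name, the `fetch_remote` callback and the in-memory dictionary standing in for local storage are all assumptions; the application itself only states that pre-read data is stored locally and read locally on a hit.

```python
class PreReadCache:
    """Serve reads from locally stored pre-read data when possible."""

    def __init__(self, fetch_remote):
        self._fetch_remote = fetch_remote  # hypothetical call to the first server
        self._local = {}                   # stands in for local storage
        self.hits = 0
        self.misses = 0

    def prefetch(self, data_ids):
        """Download the predicted pre-read data ahead of the next read request."""
        for data_id in data_ids:
            if data_id not in self._local:
                self._local[data_id] = self._fetch_remote(data_id)

    def read(self, data_id):
        """Read locally on a hit; fall back to the first server on a miss."""
        if data_id in self._local:
            self.hits += 1
            return self._local[data_id]
        self.misses += 1
        return self._fetch_remote(data_id)

# A dictionary plays the role of the first server in this example.
first_server = {"01": b"scene", "03": b"prop", "06": b"skill"}
cache = PreReadCache(first_server.__getitem__)
cache.prefetch(["01", "03"])
results = [cache.read("01"), cache.read("06")]
```

The better the prediction, the more reads land in the hit branch and avoid a round trip to the first server.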
  • the user behavior data includes multiple read data sequences
  • the device further includes:
  • An alignment module configured to align multiple page screenshots and multiple read data sequences based on the respective times of multiple page screenshots and the respective times of multiple read data sequences;
  • the user behavior data includes multiple read data sequences, and the data prediction module 702 includes:
  • An attention score determination unit is used to determine the attention score between each page screenshot and multiple read data sequences based on the aligned page screenshots and corresponding read data sequences;
  • a prediction unit configured to determine pre-read data to be read by the target application next time based on the attention score.
  • data prediction module 702 includes:
  • An input unit configured to input a plurality of page screenshots and user behavior data into a prediction model to obtain pre-read data to be read by the target application next time;
  • the prediction model is obtained by training the target neural network with multiple joint training samples as input and the actual read data sample corresponding to each joint training sample as the true value;
  • Each of the joint training samples includes multiple page screenshot samples and user behavior data samples, and the actual read data samples are data actually read by the target application.
  • the device also includes:
  • the first data sequence acquisition module is used to acquire, when the target application reads data for the i-th time, the actual read data sequence actually read for the i-th time and the pre-read data sequence corresponding to the i-th time, where i is an integer greater than or equal to 1;
  • the second data sequence acquisition module is used to acquire, when the difference between the actual read data sequence and the pre-read data sequence exceeds the target difference, the target user behavior data and target page screenshots on which the determination of the pre-read data sequence corresponding to the i-th time was based;
  • a model update module configured to update the prediction model based on the target user behavior data, the target page screenshot, and the actual read data sequence.
  • the device also includes:
  • the sample adding module is used to add the i-th corresponding target user behavior data and target page screenshots as incremental samples to the incremental sample pool;
  • the model update module includes:
  • a sample update unit configured to periodically obtain incremental samples newly added in the current cycle from the incremental sample pool
  • the model update unit is used to update the prediction model by taking the newly added incremental samples in the current cycle as input and the corresponding actual read data sequence as the true value.
  • the user behavior data samples include multiple read data sequence samples
  • the prediction model is trained in the following manner:
  • the parameters of the target neural network are updated multiple times to obtain the prediction model.
  • the device also includes:
  • a configuration parameter acquisition module used to acquire the current performance configuration parameters of the terminal running the target application
  • Data prediction module 702 includes:
  • a first prediction mode unit configured to send the plurality of page screenshots and the user behavior data to the prediction model configured in the terminal when the current performance configuration parameters meet the target conditions, so as to obtain the pre-read data to be read next time;
  • a second prediction mode unit configured to send the plurality of page screenshots and the user behavior data to the second server when the current performance configuration parameters do not meet the target conditions, so as to obtain the pre-read data to be read next time.
  • the device also includes:
  • a data package acquisition module configured to obtain the startup running package and the startup image package of the target application in advance when the target application is not installed; wherein the startup image package includes the startup data of the target application;
  • An application startup module is configured to, after starting the target application through the startup run package, respond to a read request of the target application and read the read request from the startup image package and/or the first server. Corresponding data; wherein, the first server includes all original data of the target application.
  • the device embodiment is similar to the method embodiment, so the description is relatively simple. For relevant details, please refer to the method embodiment.
  • An embodiment of the present application also discloses an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor. When the processor executes the computer program, the data reading method is implemented.
  • An embodiment of the present application also discloses a computer-readable storage medium, which stores a computer program that causes the processor to execute the data reading method described in the present application.
  • An embodiment of the present application also discloses a computer program product, which includes a computer program/instructions; when the computer program/instructions are executed by a processor, the data reading method is implemented.
  • the description is relatively simple. For relevant details, please refer to the partial description of the method embodiment.
  • embodiments of the present application may be provided as methods, devices, or computer program products. Therefore, embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment that combines software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
  • Embodiments of the present application are described with reference to flowcharts and/or block diagrams of methods, terminal devices (systems), and computer program products according to embodiments of the present application. It will be understood that each process and/or block in the flowcharts and/or block diagrams, and combinations of processes and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing terminal device produce means for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
  • These computer program instructions may also be stored in a computer-readable memory that causes a computer or other programmable data processing terminal equipment to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
  • These computer program instructions can also be loaded onto a computer or other programmable data processing terminal equipment, so that a series of operating steps are performed on the computer or other programmable terminal equipment to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable terminal equipment provide steps for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
  • any reference signs placed between parentheses shall not be construed as limiting the claim.
  • the word “comprising” does not exclude the presence of elements or steps not listed in a claim.
  • the word “a” or “an” preceding an element does not exclude the presence of a plurality of such elements.
  • the present disclosure may be implemented by means of hardware comprising several different elements and by means of a suitably programmed computer. In the element claim enumerating several means, several of these means may be embodied by the same item of hardware.
  • the use of the words first, second, third, etc. does not indicate any order. These words can be interpreted as names.

Abstract

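Embodiments of the present application provide a data reading method, apparatus, and electronic device. The method includes: obtaining multiple page screenshots and user behavior data of a target application before the current moment, wherein the page screenshots are obtained by taking screenshots of the display page of the target application; determining, based on the multiple page screenshots and the user behavior data, the pre-read data to be read by the target application next time; and downloading the pre-read data from a first server and storing the pre-read data locally, so that when the next read request of the target application hits the pre-read data, the pre-read data is read locally. The data reading method provided by the present application can improve the data reading efficiency of the target application and shorten the duration of each data read performed by the target application.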

Description

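Data reading method, apparatus, and electronic device
This application claims priority to Chinese patent application No. 202210653784.2, entitled "Data reading method, apparatus, and electronic device", filed with the China Patent Office on June 10, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of communication processing technology, and in particular to a data reading method, apparatus, and electronic device.
Background
With the popularization of various applications, users generally download applications to their terminals for use; for example, many apps are installed on a mobile phone.
However, when users use these applications, in some cases reading data takes a long time, so the user has to wait a long time on a page before seeing the desired content. For example, after downloading a game app, a user often has to wait a long time during play to enter a game scene or to use certain game props and skills, and therefore frequently experiences game lag.
Summary
In view of the above problems, the embodiments of the present application are proposed in order to overcome the above problems or at least partially solve them.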
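To solve the above problems, a first aspect of the present application provides a data reading method, the method including:
acquiring multiple page screenshots of a target application taken before the current moment, together with user behavior data, where each page screenshot is obtained by capturing a display page of the target application;
determining, based on the multiple page screenshots and the user behavior data, the pre-read data that the target application will read next;
downloading the pre-read data from a first server and storing the pre-read data locally, so that when the next read request of the target application hits the pre-read data, the pre-read data is read locally.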
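Optionally, the user behavior data includes multiple read-data sequences, and the method further includes:
aligning the multiple page screenshots with the multiple read-data sequences based on the respective times of the page screenshots and of the read-data sequences;
in which case determining, based on the multiple page screenshots and the user behavior data, the pre-read data that the target application will read next includes:
determining, based on each aligned page screenshot and its corresponding read-data sequence, attention scores between each page screenshot and the multiple read-data sequences;
determining, based on the attention scores, the pre-read data that the target application will read next.
Optionally, determining, based on the multiple page screenshots and the user behavior data, the pre-read data that the target application will read next includes:
inputting the multiple page screenshots and the user behavior data into a prediction model to obtain the pre-read data that the target application will read next;
the prediction model being obtained by training a target neural network with multiple joint training samples as input and the actual read-data sample corresponding to each joint training sample as ground truth;
where each joint training sample includes multiple page screenshot samples and a user behavior data sample, and the actual read-data sample is the data actually read by the target application.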
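Optionally, the method further includes:
when the target application reads data for the i-th time, acquiring the actual read-data sequence actually read the i-th time and the pre-read data sequence corresponding to the i-th time, where i is an integer greater than or equal to 1;
when the difference between the actual read-data sequence and the pre-read data sequence exceeds a target difference, acquiring the target user behavior data and target page screenshots on which the pre-read data sequence corresponding to the i-th time was determined;
updating the prediction model based on the target user behavior data, the target page screenshots, and the actual read-data sequence.
Optionally, the method further includes:
adding the target user behavior data and target page screenshots corresponding to the i-th time to an incremental sample pool as an incremental sample;
in which case updating the prediction model based on the target user behavior data, the target page screenshots, and the actual read-data sequence includes:
periodically acquiring, from the incremental sample pool, the incremental samples newly added in the current period;
updating the prediction model with the incremental samples newly added in the current period as input and the corresponding actual read-data sequences as ground truth.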
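Optionally, the user behavior data sample includes multiple read-data sequence samples, and the prediction model is trained as follows:
acquiring a first feature vector for each of the page screenshot samples and a second feature vector for each of the read-data sequence samples;
concatenating the first feature vectors and the second feature vectors to obtain joint vectors;
obtaining, based on the joint vectors, the pre-read data sequence output by the target neural network;
updating the parameters of the target neural network multiple times based on the pre-read data sequence and the actual read-data sample to obtain the prediction model.
Optionally, the method further includes:
acquiring the current performance configuration parameters of the terminal running the target application;
in which case determining, based on the multiple page screenshots and the user behavior data, the pre-read data that the target application will read next includes:
when the current performance configuration parameters satisfy a target condition, sending the multiple page screenshots and the user behavior data to the prediction model configured in the terminal to obtain the pre-read data to be read next;
when the current performance configuration parameters do not satisfy the target condition, sending the multiple page screenshots and the user behavior data to a second server to obtain the pre-read data to be read next.
Optionally, the method includes:
acquiring in advance, before the target application is installed, a startup run package and a startup image package of the target application, where the startup image package includes the startup data of the target application;
after the target application is started through the startup run package, reading, in response to a read request of the target application, the data corresponding to the read request from the startup image package and/or the first server, where the first server holds all the original data of the target application.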
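Embodiments of the present application further disclose a data reading apparatus, the apparatus including:
a data acquisition module, configured to acquire multiple page screenshots of a target application taken before the current moment, together with user behavior data, where each page screenshot is obtained by capturing a display page of the target application;
a data prediction module, configured to determine, based on the multiple page screenshots and the user behavior data, the pre-read data that the target application will read next;
a download module, configured to download the pre-read data from a first server and store the pre-read data locally, so that when the next read request of the target application hits the pre-read data, the pre-read data is read locally.
Embodiments of the present application further disclose an electronic device including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the program, implements the data reading method of the first aspect.
Embodiments of the present application further disclose a computer-readable storage medium storing a computer program that causes a processor to execute the data reading method of the first aspect of the present application.
Embodiments of the present application further disclose a computer program product including a computer program/instructions that, when executed by a processor, implement the data reading method of the first aspect.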
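With the data reading method of the embodiments of the present application, multiple page screenshots of the target application taken before the current moment and user behavior data can be acquired; the pre-read data that the target application will read next is determined from them; and the pre-read data is downloaded from the first server and stored in a local cache, so that when the next read request of the target application hits the pre-read data, it is read from the local cache.
The data reading method of the embodiments of the present application has the following advantages:
On the one hand, because the data the user will read next is predicted from page screenshots and user behavior data, and is downloaded from the first server to the terminal ahead of time, the target application can read it directly from local storage on its next read. Compared with downloading the data from the first server on demand, a local read has a much shorter read path, which greatly improves the data reading efficiency of the target application, shortens read time, and avoids stuttering during use.
On the other hand, page screenshots are captures of the target application's display pages, and user behavior data reflects the reads the user triggers while using the application. A display page is normally what the user is currently looking at, so it is correlated with the user's read behavior. For example, in a game app, when the user swipes around within a game scene, screenshots of the display page record what the user is viewing, while the user behavior data typically consists of scene switches, item purchases and exchanges, and similar actions, which are closely related to what is being viewed. Predicting the next read from both page screenshots and user behavior data therefore links the user's viewing behavior with the data-read behavior. This link allows the next read to be predicted accurately, raises the probability that the next read request hits the pre-read data, and keeps every read fast.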
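Brief Description of the Drawings
To explain the technical solutions of the embodiments of the present application more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a diagram of the software and hardware environment in which a data reading method according to an embodiment of the present application runs;
FIG. 2 is a flowchart of the steps of a data reading method according to an embodiment of the present application;
FIG. 3 is a flowchart of the steps of determining pre-read data according to an embodiment of the present application;
FIG. 4 is a schematic diagram of the overall model training flow according to an embodiment of the present application;
FIG. 5 is a schematic flow diagram of self-iterative updating of the prediction model according to an embodiment of the present application;
FIG. 6a is a diagram of the software and hardware environment in server mode according to an embodiment of the present application;
FIG. 6b is a diagram of the software and hardware environment in client mode according to an embodiment of the present application;
FIG. 7 is a schematic structural diagram of a data reading apparatus according to an embodiment of the present application.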
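Detailed Description
The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the scope of protection of the present application.
The embodiments of the present application can be applied to the various operating systems of a terminal. Terminals include PCs and mobile terminals; operating systems include PC operating systems such as Windows, Linux, and Unix, as well as virtual machine emulation systems, and mobile operating systems such as Android and iOS.
The target application in the embodiments of the present application may be an application with a large installation package and data package, such as a 3D game or PS. It may be a PC application or a mobile application (APP). The method and system are described below using a mobile terminal as an example.
In the related art, applications sometimes take a long time to read data. To solve this technical problem, the present application proposes a data pre-read scheme: the display pages of the target application before the current moment are captured to obtain multiple page screenshots; the pre-read data that the target application will read next is then determined from the page screenshots and the user behavior data before the current moment; and the pre-read data is downloaded from the first server and stored on the terminal ahead of time. This shortens the target application's next data read and improves data reading efficiency.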
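On this basis, in some implementation scenarios, a further improvement is proposed for target applications with large data packages in order to make them run more smoothly. Specifically, the improvement speeds up downloading, installing, and starting the target application while keeping it running, as described below.
For a target application with a large data package, both downloading and installing take a long time, and using the application requires substantial bandwidth and places high demands on terminal performance. The user therefore waits a very long time before first use, and the requirements on terminal performance and network bandwidth during use are also high.
In view of this, an improvement is proposed that allows the target application to be used while it is still being downloaded. Specifically, before the target application is installed, a startup run package and a startup image package of the target application can be obtained in advance; after the target application is started through the startup run package, the data corresponding to a read request of the target application is read, in response to that request, from the startup image package and/or the first server, where the first server holds all the original data of the target application.
The startup run package is used to start the target application and contains only the most basic files needed to do so. When the terminal runs the target application from the startup run package, the package installs the basic components required during installation and configures the interaction between the target application and the terminal. The startup run package therefore contains very little data, which greatly speeds up its download.
The startup image package contains the startup data of the target application. The terminal obtains it together with the startup run package, so when the target application starts, the data needed for startup is available locally, which greatly reduces startup time.
When the running target application needs to read data, the data corresponding to the read request can be obtained from the startup image package and/or the first server: if the startup image package contains the requested data, the data is read directly from it; if it does not, the data is read from the first server; if it contains only part of the requested data, the complete data is read jointly from the first server and the startup image package.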
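The three cases above (full hit in the startup image package, full miss, partial hit) can be sketched as a single lookup loop. This is a minimal illustration, not the applicant's implementation: `read_data`, the dictionary standing in for the startup image package, and the `fetch_from_server` callback are all hypothetical names introduced here.

```python
from typing import Callable, Dict, Iterable, List, Tuple


def read_data(
    ids: Iterable[str],
    boot_image: Dict[str, bytes],
    fetch_from_server: Callable[[str], bytes],
) -> Tuple[Dict[str, bytes], List[str]]:
    """Serve a read request from the startup image package and/or the first server.

    Returns the requested data plus the list of IDs that had to be fetched remotely.
    """
    result: Dict[str, bytes] = {}
    fetched_remotely: List[str] = []
    for data_id in ids:
        if data_id in boot_image:
            # Local hit: the startup image package already holds this item.
            result[data_id] = boot_image[data_id]
        else:
            # Miss: fall back to the first server, which holds all original data.
            result[data_id] = fetch_from_server(data_id)
            fetched_remotely.append(data_id)
    return result, fetched_remotely
```

A request for IDs "01" and "02" against an image that only contains "01" would therefore trigger exactly one remote fetch.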
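In this implementation, the vendor of the target application can divide the application's original data by function into a startup run package and a startup image package, and pack the original data other than the startup run package into an original image package. The startup run package, startup image package, and original image package can be uploaded to the first server. When downloading and installing the target application, the user first downloads the startup run package and startup image package; because both are small and download quickly, the target application can be started on the terminal very soon. After startup, the data needed by read requests is obtained from the startup image package and/or the first server (the original image package), so the target application runs normally.
This speeds up the target application and lets the user start using it as early as possible. For example, for a game app with an extremely large data package, the app can start quickly once the user taps download; while the user plays, data is served from the startup image package already on the terminal and from the original image package on the first server, so the user's waiting time is greatly reduced without affecting gameplay.
Combining the data pre-read scheme described above with this improvement has several benefits. First, the user can start and use the target application quickly, and because data is pre-read during use, data is read faster and display pages render smoothly.
Second, the terminal stores only the startup run package and startup image package, and downloads pre-read data from the first server only as the target application is used. Compared with storing all data of the target application on the terminal, this reduces the use of terminal storage and protects terminal performance.
Third, because pre-read data is fetched from the first server and stored on the terminal ahead of time, the locally stored data grows as the target application is used, and the probability of reading directly from local storage rises. The longer the application is used, the faster it responds and the smoother it feels.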
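The software and hardware environment of the data reading method, and how data is pre-read, are described in detail below.
FIG. 1 shows a schematic diagram of a software and hardware environment of the present application. It includes a terminal, a first server, and a second server. The first server stores all the original data needed to run the target application (the original image package), and the second server determines the pre-read data that the target application will read next. The terminal is configured with a pre-read module, a file management module, and an application screenshot module corresponding to the target application.
The file management module has a data path to the target application and can obtain its user behavior data. The application screenshot module obtains page screenshots of the display page. The pre-read module has separate data paths to the file management module and the application screenshot module, and receives the user behavior data sent by the former and the page screenshots sent by the latter.
In one implementation, the file management module sends the user behavior data to the pre-read module, and the application screenshot module sends the page screenshots to the pre-read module. The pre-read module packs the user behavior data and page screenshots and sends them to the second server, which determines from them the pre-read data that the target application will read next and returns the identifier of that pre-read data to the pre-read module. The pre-read module then downloads the pre-read data from the first server.
The pre-read module can communicate with the first server through multiple interfaces and obtain the determined pre-read data from the first server through these interfaces in turn. The interfaces may include a content delivery network (CDN) interface, a peer-to-peer (P2P) interface, and an origin-server interface.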
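One way to read "through these interfaces in turn" is an ordered fallback: try CDN first, then P2P, then the origin server. The sketch below assumes that interpretation; `download_with_fallback` and the channel callables are hypothetical and do not correspond to any real CDN or P2P client API.

```python
from typing import Callable, Optional, Sequence, Tuple

# A channel fetcher returns the payload, or None if the channel does not have it.
Fetcher = Callable[[str], Optional[bytes]]


def download_with_fallback(
    data_id: str, channels: Sequence[Tuple[str, Fetcher]]
) -> Tuple[str, bytes]:
    """Try each channel in order and return (channel name, payload) from the first success."""
    for name, fetch in channels:
        try:
            payload = fetch(data_id)
        except OSError:
            # Network-level failure on this channel: move on to the next one.
            continue
        if payload is not None:
            return name, payload
    raise LookupError(f"{data_id} is not available on any channel")
```

The order of the `channels` sequence encodes the preference; placing the origin server last keeps load off it unless the cheaper channels fail.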
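The file management module may be initialized when the target application is started through the startup run package. The application screenshot module may obtain page screenshots as follows.
One approach is to capture the display page of the target application directly at a preset frequency, obtaining multiple page screenshots.
Another approach is to capture the display page with the screenshot component provided by the terminal's operating system, then use a call interface to the terminal's video memory to fetch the captured image frames from the display buffer, and use those image frames as page screenshots.
Taking a 3D game as an example, the application screenshot module can call the Nvidia Game Stream interface to extract image frames from the display buffer in GPU memory that corresponds to the current 3D game. A game typically runs at 60-120 fps, so image frames can be extracted at a certain sampling interval to obtain multiple page screenshots.
It should be noted that the captures of the display page may be preprocessed to obtain the required page screenshots. Preprocessing may include adjusting the orientation and size of a capture and stamping it with its timestamp. When adjusting orientation, the capture is rotated upright; when adjusting size, it is compressed to a target size.
In still other embodiments, preprocessing may also include cropping away the borders of a capture that do not belong to the target application's display page. This suits full-screen capture mechanisms: some operating system screenshot tools capture the entire display, so the capture inevitably contains content other than the target application's display page.
For example, if the user is not running a game app in full screen, some capture mechanisms still capture the whole screen, and the capture then includes the operating system taskbar and the parts of the desktop not covered by the game page.
To avoid this, the image region belonging to the target application's display page can be cut out of the capture, removing the border content that does not belong to it, and the cut-out region is used as the page screenshot described in the present application.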
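The geometry behind the cropping and resizing steps can be sketched without any image library. This is an illustrative sketch only: `Rect`, `crop_box`, and `fit_size` are hypothetical helpers, and it assumes the window rectangle of the target application is already known from the operating system.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Rect:
    left: int
    top: int
    right: int
    bottom: int


def crop_box(screen_w: int, screen_h: int, window: Rect) -> Rect:
    """Clamp the application's window rectangle to the screen, giving the region to keep."""
    return Rect(
        max(0, window.left),
        max(0, window.top),
        min(screen_w, window.right),
        min(screen_h, window.bottom),
    )


def fit_size(width: int, height: int, target_long_side: int) -> Tuple[int, int]:
    """Scale (width, height) so the longer side equals target_long_side, keeping aspect ratio."""
    scale = target_long_side / max(width, height)
    return max(1, round(width * scale)), max(1, round(height * scale))
```

The resulting box and size would then be handed to whatever image routine actually crops and resamples the pixels.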
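The data reading method of the present application is described in detail below.
FIG. 2 shows a flowchart of the steps of a data reading method in one embodiment, which may include the following steps.
Step S201: acquire multiple page screenshots of the target application taken before the current moment, together with user behavior data.
Each page screenshot is obtained by capturing a display page of the target application.
In the embodiments of the present application, the display page of the target application can be captured at a preset interval, for example every 100-200 ms, to obtain page screenshots. The capture process is as described above and is not repeated here.
The multiple page screenshots may be obtained by capturing the display page several times within a preset period before the current moment; correspondingly, the user behavior data may be the behavior data within the same preset period. The preset period may be set to 10 s.
The user behavior data records the user's operations in the target application and includes, but is not limited to, a data read log. The data read log records the data read each time within the preset period and when each item was read. Specifically, it may contain multiple data read records, each including the identifiers of the data read and the read time, which may be the timestamp of the moment of reading.
Correspondingly, each page screenshot may carry a timestamp indicating when the display page was captured. For example, if the display page is captured at 12:12:34.100, the timestamp of that moment is recorded in the resulting page screenshot.
User behavior data is not limited to data read records; it may also include page jump behavior data, data write behavior data, and so on. Jump behavior data reflects when the user jumped between pages and which display pages were shown before and after the jump. A display page generally corresponds to a backend resource (the resource is processed into displayable page data, which forms the display page). Combined with the timestamps of the page screenshots, this reveals how long the user stayed on each display page, and hence which resources the user was interested in during the preset period, which helps predict the data the user will read next.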
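Step S202: determine, based on the multiple page screenshots and the user behavior data, the pre-read data that the target application will read next.
In this embodiment, using the timestamps of the read moments and of the capture moments, each data read record in the data read log can be matched against the page screenshots, giving the association between each page screenshot and the data read records. This association links the display pages the user views in the target application with the data reads performed while viewing each page.
For example, in a game app, the page screenshots indicate which game scenes the user viewed, the data read records indicate which data the user read, and the association indicates which data was read while each game scene was being viewed.
In some cases a page screenshot has no matching data read record. It can then be assigned a data read record with a fixed identifier, such as "00-00-00", so that every page screenshot has a data read record and the association is complete.
From the association between page screenshots and data read records, it can be determined whether the user read data on a display page and which data was read. From the chronological order of the page screenshots and data read records, the transition trajectory between display pages and the transition trajectory between data read records can also be determined. This yields a transition probability matrix for display pages and a transition probability matrix for user behavior data. The two matrices capture the regularities of the user's read behavior in the target application, and the pre-read data the user will read next can be predicted from them. Put differently: the page screenshots and behavior data reveal the resources the user was interested in during the preset period, and these are used to predict what the user will read next.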
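As a rough illustration of the transition-matrix idea, the sketch below estimates first-order transition probabilities from a chronological list of read-data sequences and picks the most likely successor of the latest one. The source describes two matrices (display pages and user behavior data); this sketch covers only one, and the first-order Markov assumption and the function names are choices made here, not taken from the application.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Optional


def transition_probabilities(history: List[str]) -> Dict[str, Dict[str, float]]:
    """Estimate P(next | current) by counting adjacent pairs in the history."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for current, following in zip(history, history[1:]):
        counts[current][following] += 1
    return {
        current: {nxt: n / sum(row.values()) for nxt, n in row.items()}
        for current, row in counts.items()
    }


def predict_next(history: List[str], probs: Dict[str, Dict[str, float]]) -> Optional[str]:
    """Return the most probable successor of the latest entry, or None if it was never seen."""
    row = probs.get(history[-1])
    if not row:
        return None
    return max(row, key=row.get)
```

The same counting could be applied to a list of page identifiers to obtain the display-page matrix; how the two matrices are combined is left open here.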
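Step S203: download the pre-read data from the first server and store it locally, so that when the next read request of the target application hits the pre-read data, the pre-read data is read locally.
In this embodiment, the determined pre-read data may be expressed as an identifier, such as the ID of the pre-read data. Based on this identifier, the pre-read data is downloaded from the first server and stored locally, for example in the terminal's external storage or, alternatively, in its memory. When the next read request of the target application hits the pre-read data, the data is read directly from local storage, which greatly shortens the read path and makes the read fast.
If the pre-read data covers only part of what the next read request needs, the remaining data that missed the pre-read data is downloaded from the first server; the downloaded remainder and the hit portion of the pre-read data are then packed into one data packet and returned to the target application. Because only part of the data is downloaded in real time, the amount downloaded on demand is smaller and the real-time read is still faster.
In some implementations, the time allowed for downloading pre-read data from the first server is limited to a target duration; if the pre-read data has not been downloaded within that duration, the download is abandoned. The target duration can be derived from the interval between the user's consecutive reads, for example the average interval between adjacent reads. This avoids a conflict between the pre-read and the actual read when the target application issues its next read request before the pre-read has completed, so the next read request can be served on its own and the data reading efficiency of the target application is preserved.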
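The time budget described above can be sketched as follows. The budget is the average gap between adjacent reads, and the prefetch loop stops starting new downloads once the budget is spent. This is a simplified sketch under stated assumptions: `prefetch` checks the deadline only between items rather than aborting an in-flight download, and the injected `clock_ms` callable is a hypothetical stand-in for a monotonic clock.

```python
from typing import Callable, Dict, List, Optional, Sequence


def average_read_interval(read_times_ms: Sequence[int]) -> Optional[float]:
    """Average gap between adjacent reads, used here as the prefetch time budget."""
    if len(read_times_ms) < 2:
        return None
    gaps = [b - a for a, b in zip(read_times_ms, read_times_ms[1:])]
    return sum(gaps) / len(gaps)


def prefetch(
    ids: Sequence[str],
    download: Callable[[str], bytes],
    cache: Dict[str, bytes],
    budget_ms: float,
    clock_ms: Callable[[], float],
) -> List[str]:
    """Download pre-read items into the cache until the budget is used up."""
    start = clock_ms()
    stored: List[str] = []
    for data_id in ids:
        if clock_ms() - start >= budget_ms:
            break  # Budget exhausted: leave the remaining items to the on-demand path.
        cache[data_id] = download(data_id)
        stored.append(data_id)
    return stored
```

Items that were not prefetched in time are simply fetched on demand when the real read request arrives.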
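With the technical solution of the embodiments of the present application, on the one hand, pre-read data is fetched from the first server and stored on the terminal ahead of time, so the target application can read it directly from the terminal on its next read. Compared with downloading from the first server, a local read has a much shorter path, which greatly shortens read time, improves the data reading efficiency of the target application, avoids stuttering, and makes the application smoother to use. On the other hand, predicting the next read from page screenshots and user behavior data links the user's page viewing behavior with the data read behavior; this link allows accurate prediction, raises the probability that the next read request hits the pre-read data, and ensures that every read request is answered quickly.
In one implementation, FIG. 3 shows a flowchart of the steps of determining pre-read data. As described above, the user behavior data may include a data read log containing multiple data read records, and each record may include the identifiers, such as IDs, of the data read. A single read usually fetches several data items rather than one, so each data read record appears as an ID sequence. A data read record can therefore be called a read-data sequence, and multiple records are multiple read-data sequences.
For example, if one read needs the data with IDs "01", "02", and "03", the data read record appears as the ID sequence "01-02-03".
As described above, the association between page screenshots and read-data sequences must be determined. In practice, to map the user's viewing of display pages accurately onto data read behavior, the page screenshots and read-data sequences are first aligned, so that each page screenshot maps to one read-data sequence.
Specifically, the page screenshots and read-data sequences can be aligned based on their respective times.
In a specific implementation, for each page screenshot, the read-data sequence whose timestamp matches the screenshot's timestamp is taken as the sequence aligned with that screenshot.
For example, if page screenshot 1 corresponds to 34 s 200 ms, a read-data sequence read near 34 s 200 ms is aligned with it, where "near" means that the time difference between the read moment and 34 s 200 ms is within a target range. In other words, the read-data sequence that matches a page screenshot's timestamp is one whose read timestamp differs from the screenshot's timestamp by no more than the target time difference.
In one implementation, the read-data sequences may be preprocessed before alignment, for example by deduplicating sequences that were retransmitted because of network or system faults. During alignment, if a page screenshot has no matching read-data sequence, the missing sequence is filled in for that screenshot's moment.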
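The deduplication, tolerance-based matching, and gap filling described above can be sketched in a few lines. The placeholder "00-00-00" comes from the description; the tuple layout `(timestamp_ms, sequence)`, the nearest-match rule, and the function names are assumptions made for this illustration.

```python
from typing import List, Sequence, Tuple

PLACEHOLDER = "00-00-00"  # Fixed identifier used when a screenshot has no matching read.

Record = Tuple[int, str]  # (read timestamp in ms, read-data sequence such as "01-02-03")


def deduplicate(records: Sequence[Record]) -> List[Record]:
    """Drop exact repeats (e.g. retransmissions) while keeping the original order."""
    seen = set()
    unique: List[Record] = []
    for record in records:
        if record not in seen:
            seen.add(record)
            unique.append(record)
    return unique


def align(shot_times_ms: Sequence[int], records: Sequence[Record], tolerance_ms: int) -> List[Record]:
    """Pair each screenshot time with the closest read within the tolerance, else the placeholder."""
    aligned: List[Record] = []
    for shot_ts in shot_times_ms:
        candidates = [
            (abs(ts - shot_ts), seq) for ts, seq in records if abs(ts - shot_ts) <= tolerance_ms
        ]
        sequence = min(candidates)[1] if candidates else PLACEHOLDER
        aligned.append((shot_ts, sequence))
    return aligned
```

With screenshots at 34200, 34300, and 34400 ms and a 50 ms tolerance, a read at 34190 ms aligns with the first screenshot, and a screenshot with no nearby read receives the placeholder.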
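As shown in FIG. 3, determining the pre-read data may include the following steps.
Step S301: determine, based on each aligned page screenshot and its corresponding read-data sequence, the attention scores between each page screenshot and the multiple read-data sequences.
In this embodiment, every page screenshot has an aligned read-data sequence. To reflect accurately how display page jumps affect data reads, an attention score is determined between each page screenshot and each of the read-data sequences. The attention score characterizes how strongly the display page currently viewed by the user is associated with the data read before and after it. Using viewing and read behavior in chronological order in this way better expresses the association between the pages viewed and the data read, that is, the influence of the viewed display page on the user's data reads.
In an optional implementation, the attention scores between each page screenshot and the multiple read-data sequences may be determined with a BERT (Bidirectional Encoder Representations from Transformers) model.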
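The description does not spell out how an attention score is computed. For orientation only, the sketch below shows the scaled dot-product form used inside Transformer-style models such as BERT: a screenshot embedding acts as the query and each read-data sequence embedding as a key. The embeddings themselves are assumed to come from elsewhere, and `attention_scores` is a hypothetical helper, not part of any BERT library.

```python
import math
from typing import List, Sequence


def attention_scores(query: Sequence[float], keys: Sequence[Sequence[float]]) -> List[float]:
    """Softmax of scaled dot products between one query vector and several key vectors."""
    scale = math.sqrt(len(query))
    logits = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    peak = max(logits)  # Subtract the maximum for numerical stability.
    weights = [math.exp(logit - peak) for logit in logits]
    total = sum(weights)
    return [w / total for w in weights]
```

The scores for one screenshot sum to 1, so they can be read as how that screenshot's attention is distributed over the read-data sequences.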
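Step S302: determine, based on the attention scores, the pre-read data that the target application will read next.
In this embodiment, because the attention scores characterize how strongly the display page currently viewed is associated with the data read before and after it, the multiple attention scores of each page screenshot (one per read-data sequence) can be used to build a transition probability matrix for display pages and a transition probability matrix for read-data sequences, from which the pre-read data the user will read next is determined.
In the present application, a transition probability matrix reflects the probability distribution of the dynamic process in which read behavior moves from one state to another. It captures the regularities of the user's read behavior in the form of a matrix, from which the next pre-read data can be predicted.
To explain further: the page screenshots record the user's viewing behavior and carry rich information about scene regions and about how display pages relate to the distribution of resource data. In the design of a target application, scene regions, functional pages, and the resource data of the related components are essentially in one-to-one correspondence. Once the user's current scene region has been determined from the page screenshots, it can be inferred which part of the target application's resource data is being accessed. Combined with the user behavior data, the combined probabilities of jumping from the current scene region (display page) to other related scene regions (other display pages) can be ranked, giving a probability ranking of the resources the user will access next. The pre-read data is determined from this ranking.
It should be noted that the process shown in FIG. 3 expresses the essence of how the present application determines pre-read data from page screenshots and read-data sequences; any specific technical implementation that adopts this essence may be considered to fall within the scope of protection of the present application.
Based on this essence, an optional technical implementation is proposed in which, following a machine learning approach, a prediction model for determining pre-read data is built by training a target neural network on suitable training samples. The page screenshots and read-data sequences can then be input directly into the prediction model to determine the pre-read data to be read next.
The training samples used to obtain the prediction model may include multiple joint training samples, each comprising multiple page screenshot samples and a user behavior data sample; the actual read-data sample is the data actually read by the target application. The actual read-data sample serves as ground truth during training: a loss function is built with this ground truth as the supervised learning target, so that the loss value keeps decreasing during training and the parameters of the target neural network are updated until the network converges, yielding the prediction model.
When the prediction model is built by training a target neural network, a large number of training samples can produce a model with strong predictive performance, improving the accuracy with which pre-read data is determined.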
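FIG. 4 shows a schematic diagram of the overall model training flow of the present application. As shown in FIG. 4, the target neural network includes a first feature extraction module, a second feature extraction module, a concatenation module, and a prediction module. FIG. 4 uses a Transformer model as the prediction module by way of example. With reference to FIG. 4, the prediction model is trained as follows.
First, the user behavior data sample includes multiple read-data sequence samples. The page screenshot samples are input into the first feature extraction module to obtain a first feature vector for each page screenshot sample, and the read-data sequence samples are input into the second feature extraction module to obtain a second feature vector for each read-data sequence sample.
Next, the joint vectors obtained by concatenating the first and second feature vectors are input into the prediction module, which outputs a pre-read data sequence. The concatenation module performs the concatenation of the first and second feature vectors.
Then, the parameters of the target neural network are updated multiple times based on the pre-read data sequence and the actual read-data sample, yielding the prediction model.
In this embodiment, the first feature extraction module extracts features from the page screenshot samples to form feature maps. It may include several convolutional layers connected in sequence, which extract features from a page screenshot at different scales to produce a feature map; the feature map is then converted into a feature vector, giving the first feature vector.
The second feature extraction module may vectorize each read-data sequence sample into a specified format, giving the second feature vector.
Each first feature vector is then concatenated with its corresponding second feature vector to give a joint vector, and multiple temporally consecutive joint vectors are input into the prediction module. The prediction module may be a Transformer model, which makes heavy use of multi-head self-attention; its algorithm is an attention-based sequence algorithm, so the prediction model can learn, from temporally consecutive joint vectors, how strongly the target application's display pages are associated with the data read.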
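To make the concatenation step concrete, the sketch below vectorizes each read-data sequence and appends it to the matching screenshot feature vector. The description only says "a specified format" for the second feature vector; the multi-hot encoding over a fixed ID vocabulary used here is an assumption (it discards read order), and the image features are taken as given rather than produced by convolutional layers.

```python
from typing import List, Sequence


def encode_sequence(sequence: str, vocabulary: Sequence[str]) -> List[float]:
    """Multi-hot encode an ID sequence such as "01-03" over a fixed, ordered vocabulary."""
    ids = set(sequence.split("-"))
    return [1.0 if data_id in ids else 0.0 for data_id in vocabulary]


def joint_vectors(
    image_features: Sequence[Sequence[float]],
    sequences: Sequence[str],
    vocabulary: Sequence[str],
) -> List[List[float]]:
    """Concatenate each screenshot feature vector with its aligned sequence vector, in time order."""
    if len(image_features) != len(sequences):
        raise ValueError("screenshots and read-data sequences must be aligned one-to-one")
    return [
        list(features) + encode_sequence(sequence, vocabulary)
        for features, sequence in zip(image_features, sequences)
    ]
```

The resulting list of joint vectors, ordered by time, is what a sequence model such as a Transformer would consume.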
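Once the prediction module has output a pre-read data sequence, a loss function is built from that sequence and the actual read-data sequence contained in the actual read-data sample to obtain the loss value of the target neural network, and the network's parameters are updated based on the loss value.
After multiple rounds of training, when the loss value is below the target loss value and the loss curve behaves as expected, the target neural network is considered to have converged, and the resulting prediction model is deployed online.
In this implementation, continual training on a large number of samples improves the accuracy with which the prediction model predicts pre-read data. Once the model is trained, the page screenshots and the read-data sequences in the user behavior data are input into it to obtain the pre-read data that the target application will read next.
In a specific implementation, the read-data sequences are first aligned with the page screenshots by timestamp. The aligned page screenshots and read-data sequences are input into the first and second feature extraction modules respectively, producing the first and second feature vectors. The concatenation module then concatenates the temporally consecutive first and second feature vectors into temporally consecutive joint vectors, which are finally input into the prediction model to obtain the pre-read data that the target application will read next.
When pre-read data is determined with a prediction model, the model's predictive performance depends on the richness of the training samples: the richer the samples, the better the model's generalization and robustness. Training samples can therefore be collected continuously while the target application is in use and used to update the prediction model dynamically, improving its accuracy.
Accordingly, in one implementation, a self-driven iteration mechanism is built for the prediction model, which puts the model into an automatic iteration loop. As more users use the target application and more training samples are collected, the model's predictions become more accurate, and the model can complete its own update iterations with only a little manual intervention.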
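FIG. 5 shows a schematic flow diagram of self-iterative updating of the prediction model. With reference to FIG. 5, in a specific implementation, when the target application reads data for the i-th time, the actual read-data sequence read the i-th time and the pre-read data sequence corresponding to the i-th time are acquired. When the difference between the i-th actual read-data sequence and the pre-read data sequence output at the (i-1)-th prediction (which is in fact the predicted sequence for the i-th read) exceeds a target difference, the target user behavior data and target page screenshots on which that (i-1)-th pre-read data sequence was determined are acquired. The prediction model is then updated based on the target user behavior data, the target page screenshots, and the actual read-data sequence. Here i is an integer greater than or equal to 1.
In this implementation, the i-th read may be any read request of the target application. Each time the target application issues a read request, the data it actually wants to read can be determined from the request; this is the actual read-data sequence, which, as described above, may appear as an ID sequence. The identifiers of the data to be read may be carried in the read request, so the actual read-data sequence for the i-th read can be obtained by parsing the request.
Because the pre-read data for each read request, that is, the pre-read data sequence, is predicted before the request arrives, the difference between the actual read-data sequence of the i-th read request and the corresponding pre-read data sequence can be determined.
For example, when the target application issues its 5th read request, the actual read-data sequence is "01-02-03-06-08", while the pre-read data sequence predicted at the 4th prediction (to be compared with the 5th actual read-data sequence) is "01-03-04-07-08". The difference between "01-02-03-06-08" and "01-03-04-07-08" can then be determined.
The difference between the actual read-data sequence and the corresponding pre-read data sequence may include mismatches in the data read as well as mismatches in the order in which the data is read.
The difference can be expressed as an error rate, and the target difference as a target error rate. Specifically, the error rate is the proportion of items that differ between the actual read-data sequence and the pre-read data sequence. Between "01-02-03-06-08" and "01-03-04-07-08", three items differ, a proportion of 60%, so the error rate is 60%.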
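The 60% figure in the example is reproduced by a position-by-position comparison: positions 2, 3, and 4 differ, while positions 1 and 5 match. The sketch below assumes that position-wise reading of "inconsistent data" and treats any length difference as additional mismatches; `error_rate` and `needs_update` are hypothetical names.

```python
def error_rate(actual: str, predicted: str) -> float:
    """Share of positions at which the actual and predicted ID sequences differ."""
    actual_ids = actual.split("-")
    predicted_ids = predicted.split("-")
    length = max(len(actual_ids), len(predicted_ids))
    mismatches = sum(
        1
        for i in range(length)
        if i >= len(actual_ids) or i >= len(predicted_ids) or actual_ids[i] != predicted_ids[i]
    )
    return mismatches / length


def needs_update(actual: str, predicted: str, target_error_rate: float) -> bool:
    """True when the prediction missed badly enough to become an incremental training sample."""
    return error_rate(actual, predicted) > target_error_rate
```

For the pair in the example, `error_rate("01-02-03-06-08", "01-03-04-07-08")` evaluates to 0.6, so with a target error rate of 0.5 the sample would be queued for retraining.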
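When the difference exceeds the target difference, the prediction for the i-th read was not accurate enough. The page screenshots and user behavior data used to predict the i-th read are then input into the prediction model again, with the actual read-data sequence of the i-th read as ground truth, and the model is trained to update its parameters, so that it relearns the samples it predicted poorly.
In this implementation, the prediction model can be updated dynamically while it is in service, using user behavior data and page screenshots collected online in real time. The model thus keeps learning the behavior of the current user population, and the accuracy of pre-read prediction is maintained for as long as the target application remains in use.
In one implementation, the target user behavior data and target page screenshots corresponding to the i-th time are added to an incremental sample pool as an incremental sample. The incremental samples newly added in the current period are then fetched from the pool periodically, and the prediction model is updated with those samples as training samples and the corresponding actual read-data sequences as ground truth.
As described above, the prediction model can be updated dynamically while in service using data collected online. In a specific implementation, the model is updated periodically: each time the difference between a pre-read data sequence and the corresponding actual read-data sequence is found to exceed the target difference, the corresponding target user behavior data and target page screenshots are added to the incremental sample pool as an incremental sample.
An index table can be built for the incremental sample pool to record the indexes of the incremental samples added in the current period. Whenever the prediction model needs updating, the newly added incremental samples of the current period are identified through the index table and used as training samples to update the model.
For example, if the prediction model is updated weekly, then at the start of each new week, say every Sunday, the incremental samples added from Monday of that week up to the current time are retrieved and input into the prediction model, and the model continues training through the process described above to produce the updated model.
As another example, with weekly updates, the incremental samples added from Monday of that week up to the current time are retrieved every Sunday and merged into a full training sample set. This set may consist of all incremental samples from the current time back to a point half a year or a year earlier, or of an existing base training set plus the incremental samples from a certain period. The training samples in the full set are then input into the prediction model, which continues training through the process described above to produce the updated model. Compared with the previous approach, this one minimizes the model's degradation in specific scenarios and improves generalization.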
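A minimal sketch of the incremental sample pool follows. A cursor plays the role of the index table by remembering where the current period's samples begin, and the second method mirrors the variant that merges new samples into a larger training set. The class and method names are hypothetical, and real samples would carry screenshots and behavior data rather than the plain placeholders used here.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple

Sample = Tuple[Any, Any, str]  # (page screenshots, user behavior data, actual read-data sequence)


@dataclass
class IncrementalSamplePool:
    samples: List[Sample] = field(default_factory=list)
    cursor: int = 0  # Index of the first sample that has not yet been used for an update.

    def add(self, screenshots: Any, behavior: Any, actual_sequence: str) -> None:
        """Store a poorly predicted case together with its ground-truth sequence."""
        self.samples.append((screenshots, behavior, actual_sequence))

    def take_new(self) -> List[Sample]:
        """Return the samples added since the last update and advance the cursor."""
        fresh = self.samples[self.cursor:]
        self.cursor = len(self.samples)
        return fresh

    def full_training_set(self, base: List[Sample]) -> List[Sample]:
        """Variant: a base training set plus every incremental sample collected so far."""
        return list(base) + list(self.samples)
```

A weekly job would call `take_new()` (or `full_training_set(...)`) and pass the result to the training routine.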
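The updated prediction model can then be used to predict pre-read data for the target application.
In some implementations, the prediction model may also be evaluated, following the model evaluation process of the related art: an evaluation data set is built and input into the prediction model for inference, and the prediction results (pre-read data sequences) output by the model are checked against target metrics. If they meet the metrics, the model can go online.
Accordingly, each time the prediction model is updated with incremental samples, the updated model is evaluated to determine whether it performs better on the target metrics. If so, the updated model is released online as the latest version; if not, the latest version is not released and the model from before the update continues to run online.
The target metrics may include average prediction accuracy, recall, prediction range coverage, and so on.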
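The release decision can be sketched as a comparison of metric dictionaries. The description only says the updated model must perform "better" on the target metrics; the rule used here (no metric regresses and at least one improves) is one possible reading, and the metric keys are illustrative names.

```python
from typing import Dict, Sequence, TypeVar

Model = TypeVar("Model")


def is_better(
    old: Dict[str, float],
    new: Dict[str, float],
    metrics: Sequence[str] = ("accuracy", "recall", "coverage"),
) -> bool:
    """Assumed rule: the new model must not regress on any metric and must improve on one."""
    no_regression = all(new[m] >= old[m] for m in metrics)
    some_gain = any(new[m] > old[m] for m in metrics)
    return no_regression and some_gain


def select_online_model(
    old_model: Model, new_model: Model, old_scores: Dict[str, float], new_scores: Dict[str, float]
) -> Model:
    """Release the updated model only if it wins the comparison; otherwise keep the old one."""
    return new_model if is_better(old_scores, new_scores) else old_model
```

A stricter or weighted rule could be substituted without changing the surrounding update loop.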
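With reference to the software and hardware environment of FIG. 1, when a prediction model is used to determine pre-read data, the model may be deployed on the second server or on the terminal running the target application. Determining pre-read data with the model on the second server is called server mode; determining it with the model on the terminal is called client mode.
In client mode, the terminal determines the pre-read data itself, so the page screenshots and user behavior data need not be sent to the second server. This avoids consuming network resources and depending on network bandwidth, removes the remote round trip, and makes pre-read determination more efficient. In server mode, the server has more computing power than the terminal, so the terminal's computing resources are not consumed, which lowers the performance cost on the terminal and preserves its battery life.
FIG. 6a shows the software and hardware environment in server mode, and FIG. 6b shows it in client mode. As both figures show, in addition to the prediction model there is a first component 601 related to determining pre-read data and a second component 602 related to iteratively updating the prediction model.
With reference to FIG. 6a and FIG. 6b, the first component 601 may include a data packet separation component, a log filtering component, an alignment component, and a preprocessing component. The second component 602 may include an incremental sample mining component, a database component, an update component, and an evaluation component.
As shown in FIG. 6a, in server mode the first component 601, the second component 602, and the prediction model are all deployed on the second server. As shown in FIG. 6b, in client mode the first component 601 and the prediction model are deployed on the terminal, while the second component 602 is deployed on the second server.
The terminal chooses server mode or client mode according to its own performance configuration parameters, and can also switch between the two modes according to how those parameters change while it is in use.
In a specific implementation, the current performance configuration parameters of the terminal running the target application are acquired. When they satisfy a target condition, the page screenshots and user behavior data are sent to the prediction model configured in the terminal to obtain the pre-read data to be read next; when they do not, the page screenshots and user behavior data are sent to the second server to obtain the pre-read data to be read next.
In this embodiment, the performance configuration parameters reflect the terminal's hardware configuration, software configuration, and operating system configuration, and thus its overall performance. Terminal performance changes dynamically as other applications run and stored data grows, so the current performance configuration parameters reflect how much of the terminal's software and hardware capacity is consumed by current applications and stored data.
The target condition may be any one or a combination of conditions such as: the operating system is a target version, the terminal's CPU usage is below a target usage, and the remaining memory capacity is not below a target capacity. The more conditions the target condition contains, the higher the bar the terminal's performance configuration parameters must clear before the on-terminal prediction model is used to predict pre-read data.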
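The mode decision can be sketched as a predicate over a snapshot of the terminal's state. All thresholds below are placeholders, not values from the application, and treating "target version" as a minimum OS version is an assumption; `DeviceState`, `TargetCondition`, and `choose_mode` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DeviceState:
    os_version: Tuple[int, int]
    cpu_usage: float  # Fraction in [0, 1].
    free_memory_mb: int


@dataclass(frozen=True)
class TargetCondition:
    # Placeholder thresholds chosen for illustration only.
    min_os_version: Tuple[int, int] = (10, 0)
    max_cpu_usage: float = 0.6
    min_free_memory_mb: int = 1024


def choose_mode(state: DeviceState, condition: TargetCondition) -> str:
    """Return "client" when the terminal can run the model locally, otherwise "server"."""
    meets_condition = (
        state.os_version >= condition.min_os_version
        and state.cpu_usage < condition.max_cpu_usage
        and state.free_memory_mb >= condition.min_free_memory_mb
    )
    return "client" if meets_condition else "server"
```

Re-evaluating `choose_mode` before each prediction gives the dynamic switching between client mode and server mode described above.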
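In some embodiments, whether to deploy the prediction model on the terminal can be decided when the target application is installed, according to the terminal's current performance configuration parameters. For example, if the parameters satisfy the target condition, the model is deployed on the terminal; otherwise it is not.
Correspondingly, because the terminal can also switch dynamically between client mode and server mode during use according to its current performance configuration parameters, the prediction of pre-read data is matched to the terminal's own computing power.
The first and second components shown in FIG. 6a and FIG. 6b are described in turn below.
In the first component, the output of the data packet separation component is connected to the log filtering component and the alignment component; the output of the log filtering component is connected to the input of the alignment component; the output of the alignment component is connected to the input of the preprocessing component; and the output of the preprocessing component feeds the preprocessed page screenshots and user behavior data into the prediction model.
The second component includes the incremental sample mining component, database component, update component, and evaluation component. The input of the incremental sample mining component may be connected to the output of the alignment component and the output of the prediction model. It receives the read-data sequences and page screenshots output by the alignment component, and the pre-read data determined by the prediction model, such as the pre-read data determined last time, so that it can be compared with the read-data sequence currently output by the alignment component.
The output of the incremental sample mining component is connected to the database component and stores the read-data sequences and page screenshots output by the alignment component, together with the pre-read data determined by the prediction model, in the database. The update component is connected to the database; when the database contains newly added incremental samples, it reads them out and uses them to update the prediction model. The evaluation component evaluates the prediction model.
The second component 602 is deployed on the second server. When the prediction model on the second server has been updated by the second component 602 and evaluation shows that the target metrics have improved, the prediction model can be pushed to the terminal so that the terminal deploys the latest version.
It should be noted that in server mode, page screenshots are transmitted as short videos along the path from the target application to the preprocessing component: multiple page screenshots are encoded into a video, transmitted along the path to the preprocessing component, and then decoded there into multiple page screenshots.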
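With reference to FIG. 6a, the overall process of the data reading method is described below, using a game app on a mobile terminal as an example.
S1: While the game is running, the display page of the current scene is sampled at a preset frequency. The Nvidia Game Stream interface is called to fetch the currently buffered image frames from the display buffer in GPU memory; each extracted frame undergoes simple orientation and size adjustment and is stamped with a timestamp. The processed frames are encoded into a video, which is cut into short videos A of 10-30 s and sent to the pre-read module. The image frames in short video A are the page screenshots needed for pre-reading.
S2: The file management module obtains the target application's read-data log, which contains multiple read-data sequences, each with its own timestamp, and sends it to the pre-read module.
S3: Using the time range of short video A (start timestamp to end timestamp), the pre-read module finds the read-data sequences in the read-data log that fall within that range, packs them together with short video A into a log package A, and sends it to the data packet separation component on the second server.
S4: The data packet separation component splits log package A into short video A and several read-data sequences, sends short video A to the alignment component, and sends the read-data sequences to the log filtering component.
S5: The log filtering component checks the received read-data sequences for duplicates and removes them, fills in read-data sequences missing for any time period, and sends the processed read-data sequences to the alignment component.
S6: The alignment component aligns the time of each image frame in short video A with the times of the read-data sequences, packs the aligned short video A and read-data sequences into data group B, sends a copy of data group B to the incremental sample mining component, and sends data group B to the preprocessing component.
S7: The preprocessing component splits the received data group B into short video A and the read-data sequences, decodes short video A into page screenshots, and finds the matching page screenshot for each read-data sequence by timestamp, aligning read-data sequences and page screenshots once more.
The page screenshots and their corresponding read-data sequences are then passed to the prediction model for inference, which outputs the identifier of the pre-read data that the target application will read next.
S8: The second server returns the identifier of the pre-read data to be read next to the pre-read module.
S9: The pre-read module wraps the identifier of the pre-read data to be read next in a data download request and sends the request to the first server.
S10: The pre-read module receives the pre-read data returned by the first server and stores it in the terminal's external storage.
S11: When the target application's next read request arrives, the data it needs is determined. If that data is already in external storage, it is read from there; if only part of it is in external storage, a new data download request is built for the remaining data, which is obtained from the first server.
The updating of the prediction model is described in the embodiments above and is not illustrated again here.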
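Step S3 of the flow above amounts to a time-range filter followed by packaging. The sketch below illustrates it with a plain dictionary standing in for log package A; the field names, the inclusive range check, and `build_log_package` itself are assumptions made for this example.

```python
from typing import Dict, List, Sequence, Tuple

Record = Tuple[int, str]  # (timestamp in ms, read-data sequence)


def build_log_package(
    clip_id: str, clip_start_ms: int, clip_end_ms: int, read_log: Sequence[Record]
) -> Dict[str, object]:
    """Bundle a short video clip reference with the read-data sequences inside its time range."""
    matched: List[Record] = [
        (ts, seq) for ts, seq in read_log if clip_start_ms <= ts <= clip_end_ms
    ]
    return {
        "clip_id": clip_id,
        "clip_range_ms": (clip_start_ms, clip_end_ms),
        "read_sequences": matched,
    }
```

On the server side, the data packet separation component of step S4 would reverse this bundling before filtering and alignment.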
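Based on the same inventive concept, the present application further provides a data reading apparatus. FIG. 7 shows a schematic structural diagram of the data reading apparatus, which includes the following modules:
a data acquisition module 701, configured to acquire multiple page screenshots of a target application taken before the current moment, together with user behavior data, where each page screenshot is obtained by capturing a display page of the target application;
a data prediction module 702, configured to determine, based on the multiple page screenshots and the user behavior data, the pre-read data that the target application will read next;
a download module 703, configured to download the pre-read data from a first server and store the pre-read data locally, so that when the next read request of the target application hits the pre-read data, the pre-read data is read locally.
Optionally, the user behavior data includes multiple read-data sequences, and the apparatus further includes:
an alignment module, configured to align the multiple page screenshots with the multiple read-data sequences based on the respective times of the page screenshots and of the read-data sequences;
in which case the data prediction module 702 includes:
an attention score determination unit, configured to determine, based on each aligned page screenshot and its corresponding read-data sequence, attention scores between each page screenshot and the multiple read-data sequences;
a prediction unit, configured to determine, based on the attention scores, the pre-read data that the target application will read next.
Optionally, the data prediction module 702 includes:
an input unit, configured to input the multiple page screenshots and the user behavior data into a prediction model to obtain the pre-read data that the target application will read next;
the prediction model being obtained by training a target neural network with multiple joint training samples as input and the actual read-data sample corresponding to each joint training sample as ground truth;
where each joint training sample includes multiple page screenshot samples and a user behavior data sample, and the actual read-data sample is the data actually read by the target application.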
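Optionally, the apparatus further includes:
a first data sequence acquisition module, configured to acquire, when the target application reads data for the i-th time, the actual read-data sequence actually read the i-th time and the pre-read data sequence corresponding to the i-th time, where i is an integer greater than or equal to 1;
a second data sequence acquisition module, configured to acquire, when the difference between the actual read-data sequence and the pre-read data sequence exceeds a target difference, the target user behavior data and target page screenshots on which the pre-read data sequence corresponding to the i-th time was determined;
a model update module, configured to update the prediction model based on the target user behavior data, the target page screenshots, and the actual read-data sequence.
Optionally, the apparatus further includes:
a sample adding module, configured to add the target user behavior data and target page screenshots corresponding to the i-th time to an incremental sample pool as an incremental sample;
the model update module including:
a sample update unit, configured to periodically acquire, from the incremental sample pool, the incremental samples newly added in the current period;
a model update unit, configured to update the prediction model with the incremental samples newly added in the current period as input and the corresponding actual read-data sequences as ground truth.
Optionally, the user behavior data sample includes multiple read-data sequence samples, and the prediction model is trained as follows:
acquiring a first feature vector for each of the page screenshot samples and a second feature vector for each of the read-data sequence samples;
concatenating the first feature vectors and the second feature vectors to obtain joint vectors;
obtaining, based on the joint vectors, the pre-read data sequence output by the target neural network;
updating the parameters of the target neural network multiple times based on the pre-read data sequence and the actual read-data sample to obtain the prediction model.
Optionally, the apparatus further includes:
a configuration parameter acquisition module, configured to acquire the current performance configuration parameters of the terminal running the target application;
the data prediction module 702 including:
a first prediction mode unit, configured to send, when the current performance configuration parameters satisfy a target condition, the multiple page screenshots and the user behavior data to the prediction model configured in the terminal to obtain the pre-read data to be read next;
a second prediction mode unit, configured to send, when the current performance configuration parameters do not satisfy the target condition, the multiple page screenshots and the user behavior data to a second server to obtain the pre-read data to be read next.
Optionally, the apparatus further includes:
a data package acquisition module, configured to acquire in advance, before the target application is installed, a startup run package and a startup image package of the target application, where the startup image package includes the startup data of the target application;
an application startup module, configured to read, after the target application is started through the startup run package and in response to a read request of the target application, the data corresponding to the read request from the startup image package and/or the first server, where the first server holds all the original data of the target application.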
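It should be noted that the apparatus embodiments are similar to the method embodiments and are therefore described briefly; for related details, refer to the method embodiments.
Embodiments of the present application further disclose an electronic device including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the program, implements the data reading method described above.
Embodiments of the present application further disclose a computer-readable storage medium storing a computer program that causes a processor to execute the data reading method described in the present application.
Embodiments of the present application further disclose a computer program product including a computer program/instructions that, when executed by a processor, implement the data reading method described above.
Because the apparatus embodiments are substantially similar to the method embodiments, they are described briefly; for related details, refer to the corresponding parts of the method embodiments.
The embodiments in this specification are described progressively. Each embodiment focuses on its differences from the others, and the same or similar parts of the embodiments may be cross-referenced.
Those skilled in the art should understand that the embodiments of the present application may be provided as a method, an apparatus, or a computer program product. Accordingly, the embodiments may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the embodiments may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing computer-usable program code.
The embodiments of the present application are described with reference to flowcharts and/or block diagrams of methods, terminal devices (systems), and computer program products according to the embodiments. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks, can be implemented by computer program instructions. These instructions can be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor, or other programmable data processing terminal device to produce a machine, such that the instructions executed by the processor produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing terminal device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal device, so that a series of operational steps is performed on the computer or other programmable terminal device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable terminal device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.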
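Although preferred embodiments of the present application have been described, those skilled in the art can make further changes and modifications to these embodiments once they learn of the basic inventive concept. The appended claims are therefore intended to be construed as covering the preferred embodiments and all changes and modifications falling within the scope of the embodiments of the present application.
Finally, it should also be noted that relational terms such as first and second are used herein only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between them. Furthermore, the terms "include", "comprise", and any variants thereof are intended to be non-exclusive, so that a process, method, article, or terminal device that includes a list of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or terminal device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or terminal device that includes it.
The data reading method, apparatus, and device provided by the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the embodiments is intended only to help in understanding the method and its core idea. A person of ordinary skill in the art may, following the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.
Other embodiments of the present disclosure will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. The present disclosure is intended to cover any variations, uses, or adaptations that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the disclosure indicated by the following claims. It should be understood that the present disclosure is not limited to the precise structures described above and shown in the drawings, and that various modifications and changes can be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.
References herein to "one embodiment", "an embodiment", or "one or more embodiments" mean that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment of the present disclosure. Note also that instances of the phrase "in one embodiment" do not necessarily all refer to the same embodiment.
Numerous specific details are set forth in the description provided herein. It is understood, however, that embodiments of the present disclosure may be practiced without these specific details. In some instances, well-known methods, structures, and techniques are not shown in detail so as not to obscure the understanding of this specification.
In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The present disclosure may be implemented by means of hardware comprising several different elements and by means of a suitably programmed computer. In a unit claim enumerating several means, several of these means may be embodied by the same item of hardware. The use of the words first, second, third, etc. does not indicate any order; these words may be interpreted as names.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present disclosure, not to limit them. Although the present disclosure has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in those embodiments may still be modified, or some of their technical features replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure.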

Claims (10)

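  1. A data reading method, characterized in that the method comprises:
    acquiring multiple page screenshots of a target application taken before the current moment, together with user behavior data, wherein each page screenshot is obtained by capturing a display page of the target application;
    determining, based on the multiple page screenshots and the user behavior data, the pre-read data that the target application will read next;
    downloading the pre-read data from a first server and storing the pre-read data locally, so that when the next read request of the target application hits the pre-read data, the pre-read data is read locally.
  2. The method according to claim 1, characterized in that the user behavior data comprises multiple read-data sequences, and the method further comprises:
    aligning the multiple page screenshots with the multiple read-data sequences based on the respective times of the page screenshots and of the read-data sequences;
    wherein determining, based on the multiple page screenshots and the user behavior data, the pre-read data that the target application will read next comprises:
    determining, based on each aligned page screenshot and its corresponding read-data sequence, attention scores between each page screenshot and the multiple read-data sequences;
    determining, based on the attention scores, the pre-read data that the target application will read next.
  3. The method according to claim 1 or 2, characterized in that determining, based on the multiple page screenshots and the user behavior data, the pre-read data that the target application will read next comprises:
    inputting the multiple page screenshots and the user behavior data into a prediction model to obtain the pre-read data that the target application will read next;
    the prediction model being obtained by training a target neural network with multiple joint training samples as input and the actual read-data sample corresponding to each joint training sample as ground truth;
    wherein each joint training sample comprises multiple page screenshot samples and a user behavior data sample, and the actual read-data sample is the data actually read by the target application.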
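  4. The method according to claim 3, characterized in that the method further comprises:
    when the target application reads data for the i-th time, acquiring the actual read-data sequence actually read the i-th time and the pre-read data sequence corresponding to the i-th time, wherein i is an integer greater than or equal to 1;
    when the difference between the actual read-data sequence and the pre-read data sequence exceeds a target difference, acquiring the target user behavior data and target page screenshots on which the pre-read data sequence corresponding to the i-th time was determined;
    updating the prediction model based on the target user behavior data, the target page screenshots, and the actual read-data sequence.
  5. The method according to claim 4, characterized in that the method further comprises:
    adding the target user behavior data and target page screenshots corresponding to the i-th time to an incremental sample pool as an incremental sample;
    wherein updating the prediction model based on the target user behavior data, the target page screenshots, and the actual read-data sequence comprises:
    periodically acquiring, from the incremental sample pool, the incremental samples newly added in the current period;
    updating the prediction model with the incremental samples newly added in the current period as input and the corresponding actual read-data sequences as ground truth.
  6. The method according to claim 3, characterized in that the user behavior data sample comprises multiple read-data sequence samples, and the prediction model is trained as follows:
    acquiring a first feature vector for each of the page screenshot samples and a second feature vector for each of the read-data sequence samples;
    concatenating the first feature vectors and the second feature vectors to obtain joint vectors;
    obtaining, based on the joint vectors, the pre-read data sequence output by the target neural network;
    updating the parameters of the target neural network multiple times based on the pre-read data sequence and the actual read-data sample to obtain the prediction model.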
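  7. The method according to claim 3, characterized in that the method further comprises:
    acquiring the current performance configuration parameters of the terminal running the target application;
    wherein determining, based on the multiple page screenshots and the user behavior data, the pre-read data that the target application will read next comprises:
    when the current performance configuration parameters satisfy a target condition, sending the multiple page screenshots and the user behavior data to the prediction model configured in the terminal to obtain the pre-read data to be read next;
    when the current performance configuration parameters do not satisfy the target condition, sending the multiple page screenshots and the user behavior data to a second server to obtain the pre-read data to be read next.
  8. The method according to any one of claims 1-7, characterized in that the method comprises:
    acquiring in advance, before the target application is installed, a startup run package and a startup image package of the target application, wherein the startup image package comprises the startup data of the target application;
    after the target application is started through the startup run package, reading, in response to a read request of the target application, the data corresponding to the read request from the startup image package and/or the first server, wherein the first server holds all the original data of the target application.
  9. A data reading apparatus, characterized in that the apparatus comprises:
    a data acquisition module, configured to acquire multiple page screenshots of a target application taken before the current moment, together with user behavior data, wherein each page screenshot is obtained by capturing a display page of the target application;
    a data prediction module, configured to determine, based on the multiple page screenshots and the user behavior data, the pre-read data that the target application will read next;
    a download module, configured to download the pre-read data from a first server and store the pre-read data locally, so that when the next read request of the target application hits the pre-read data, the pre-read data is read locally.
  10. An electronic device, characterized by comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the data reading method according to any one of claims 1-8.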
PCT/CN2022/130597 2022-06-10 2022-11-08 Data reading method, apparatus, and electronic device WO2023236444A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210653784.2 2022-06-10
CN202210653784.2A CN115202748A (zh) 2022-06-10 2022-06-10 数据读取方法、装置及电子设备

Publications (1)

Publication Number Publication Date
WO2023236444A1 true WO2023236444A1 (zh) 2023-12-14

Family

ID=83575303

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/130597 WO2023236444A1 (zh) 2022-06-10 2022-11-08 数据读取方法、装置及电子设备

Country Status (2)

Country Link
CN (1) CN115202748A (zh)
WO (1) WO2023236444A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117472801A (zh) * 2023-12-28 2024-01-30 苏州元脑智能科技有限公司 显存带宽优化方法、装置、系统及bmc系统

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115202748A (zh) * 2022-06-10 2022-10-18 杨正 数据读取方法、装置及电子设备
CN117435112B (zh) * 2023-12-20 2024-04-05 摩尔线程智能科技(成都)有限责任公司 数据处理方法、系统及装置、电子设备和存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020119544A1 (zh) * 2018-12-13 2020-06-18 深圳壹账通智能科技有限公司 网络传输模拟方法、装置、计算机设备及存储介质
CN111666497A (zh) * 2020-06-16 2020-09-15 腾讯科技(上海)有限公司 应用程序的加载方法、装置、电子设备及可读存储介质
CN111729305A (zh) * 2020-06-23 2020-10-02 网易(杭州)网络有限公司 地图场景预加载方法、模型训练方法、设备及存储介质
US20210311774A1 (en) * 2020-04-02 2021-10-07 Citrix Systems, Inc. Contextual Application Switch Based on User Behaviors
CN115202748A (zh) * 2022-06-10 2022-10-18 杨正 数据读取方法、装置及电子设备

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117472801A (zh) * 2023-12-28 2024-01-30 苏州元脑智能科技有限公司 显存带宽优化方法、装置、系统及bmc系统
CN117472801B (zh) * 2023-12-28 2024-03-01 苏州元脑智能科技有限公司 显存带宽优化方法、装置、系统及bmc系统

Also Published As

Publication number Publication date
CN115202748A (zh) 2022-10-18

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22945573

Country of ref document: EP

Kind code of ref document: A1