WO2023046059A1 - Cache warmup method and apparatus, and computer device and storage medium - Google Patents

Cache warmup method and apparatus, and computer device and storage medium

Info

Publication number
WO2023046059A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
access information
business
model
cache
Prior art date
Application number
PCT/CN2022/120815
Other languages
French (fr)
Chinese (zh)
Inventor
谢磊
王相玲
Original Assignee
中国第一汽车股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中国第一汽车股份有限公司
Publication of WO2023046059A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data
    • G06Q30/0202 Market predictions or forecasting for commercial activities

Definitions

  • The embodiments of the present application relate to computer technology, for example, to a cache preheating method and device, a computer device, and a storage medium.
  • Caches are essential in large-scale websites and high-concurrency processing systems, and cache warming is one of the important measures for improving data query performance.
  • In the related art, cache preheating is realized mainly by fixedly initializing all data of a service according to the number of visits, or by manually selecting data according to service characteristics.
  • As a result, the cache warm-up methods in the related art leave a large amount of useless data in the cache and suffer from a low cache hit rate, a large warm-up range, low warm-up accuracy, and high labor cost.
  • Embodiments of the present application provide a cache preheating method and device, a computer device, and a storage medium, so as to preheat cache data, improve cache preheating accuracy, and reduce labor costs.
  • the embodiment of the present application provides a cache preheating method, including:
  • hotspot data is selected from the plurality of data in the database for cache preheating.
  • the embodiment of the present application also provides a cache preheating device, including:
  • a business collection and access information acquisition module, configured to obtain business collection and access information of a historical time period;
  • a business forecast access information acquisition module, configured to input the business collection and access information of the historical time period into a pre-trained business prediction model to obtain business forecast access information of a target time period;
  • a predicted hotspot evaluation value acquisition module, configured to input the business forecast access information into a pre-trained data prediction model to obtain multiple predicted hotspot evaluation values of multiple data in the database for the target time period;
  • a cache preheating module, configured to screen out hot data from the multiple data in the database according to the multiple predicted hotspot evaluation values and perform cache preheating.
  • the embodiment of the present application also provides a computer device, and the computer device includes:
  • one or more processors;
  • a storage device configured to store one or more programs;
  • when the one or more programs are executed by the one or more processors, the one or more processors implement the cache warming method provided in the embodiments of the present application.
  • the embodiment of the present application also provides a storage medium including computer-executable instructions, and the computer-executable instructions are used to execute the cache warming method provided in the embodiment of the present application when executed by a computer processor.
  • FIG. 1 is a flow chart of a cache preheating method in Embodiment 1 of the present application
  • FIG. 2 is a flow chart of a cache preheating method in Embodiment 2 of the present application.
  • FIG. 3 is a flow chart of a cache preheating method in Embodiment 3 of the present application.
  • FIG. 4 is a schematic diagram of an application scenario of a warm-up cache method in Embodiment 4 of the present application.
  • FIG. 5 is a schematic structural diagram of a cache preheating device in Embodiment 5 of the present application.
  • FIG. 6 is a schematic structural diagram of a computer device in Embodiment 6 of the present application.
  • FIG. 1 is a flow chart of a cache preheating method provided in Embodiment 1 of the present application.
  • This embodiment is applicable to cache preheating scenarios. The method can be executed by a cache preheating device, which can be implemented in software and/or hardware and configured in a computer device. The computer device may be a server device or a client device; for example, the client device may be a mobile phone, a tablet computer, a vehicle terminal, or a desktop computer.
  • the cache warming method includes the following steps:
  • the historical time period refers to a time period before the current time, for example, it may be the previous 8 hours, the previous day, or the previous week.
  • Business collected visit information refers to the business type and visit volume of the visit information collected in the historical time period.
  • the business type may be order placing, return and refund, etc., where the order placing business type includes business data such as product information, product images, and recommended product lists.
  • the number of visits refers to the number of visits to data. Exemplarily, when a user opens a web page with detailed content of a product, the number of visits to data such as product information, product images, and recommended product lists displayed on the web page increases.
  • Business types can also be divided in more detail.
  • For example, the order placing business type can be subdivided into business types such as food, clothing, and daily necessities. The level of detail of the division can be set according to actual needs and is not limited here.
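As a rough illustration of how visit volume might be tallied per business type and per data item, the sketch below uses a hypothetical `record_page_view` helper and an assumed `PAGE_ITEMS` mapping from a page to the data items it displays. None of these names come from the application.

```python
from collections import Counter

# Hypothetical mapping (an assumption for illustration): data items shown on each page.
PAGE_ITEMS = {
    "product_detail": ["product_info", "product_image", "recommended_list"],
}

business_visits = Counter()  # visit volume per business type
data_visits = Counter()      # visit volume per data item


def record_page_view(business_type, page):
    """Count one visit for the business type and for every data item on the page."""
    business_visits[business_type] += 1
    for item in PAGE_ITEMS.get(page, []):
        data_visits[item] += 1


# Two users open a product detail page while placing an order.
record_page_view("order", "product_detail")
record_page_view("order", "product_detail")
```

Counters of this kind, collected over a historical time period, are one possible form of the business type and visit volume described above.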
  • the target time period is a time period that needs to be predicted, that is, a period of time after the current time period. Exemplarily, it may be 8 hours, a day, or a week after the current time.
  • The business forecast access information refers to the predicted visit volume of each business in a future time period.
  • the business forecasting model refers to a model in which the input is the service collection and access information in the historical time period, and the output is the service forecasting access information in the target time period.
  • the service prediction model may be a convolutional neural network model, a time series neural network model, an extreme learning machine model, or an autoencoder neural network model.
  • A pre-trained business prediction model is a model that has been trained on sample data in advance and meets a preset prediction accuracy condition, that is, the prediction accuracy of the model is greater than or equal to a preset accuracy threshold. The preset accuracy threshold is a manually set value.
  • The data prediction model refers to a model whose input is the business forecast access information of the target time period and whose output is multiple predicted hotspot evaluation values of multiple data in the database for the target time period.
  • the data prediction model may be a convolutional neural network model, a time series neural network model, an extreme learning machine model, or an autoencoder neural network model.
  • a database refers to a collection of data associated with a business.
  • The hotspot evaluation value indicates the degree of attention that the data receives.
  • The hotspot evaluation value may be the probability that the data is accessed.
  • The method for calculating the hotspot evaluation value can be set manually. For example, the visit volume of a piece of data within a period of time can be divided by the total data visit volume, within the same period, of the business to which the data belongs, and the quotient taken as the hotspot evaluation value of the data.
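The example calculation above (visits to one piece of data divided by the total data visits of its business) can be sketched as follows. The function name, the input dictionaries, and the sample figures are illustrative assumptions, not part of the application.

```python
from collections import Counter


def hotspot_values(visits_by_item, business_of):
    """Return item -> share of its business's total data visits.

    visits_by_item: dict mapping a data item to its visit volume in the period.
    business_of:    dict mapping a data item to the business it belongs to.
    """
    totals = Counter()
    for item, visits in visits_by_item.items():
        totals[business_of[item]] += visits
    return {
        item: (visits / totals[business_of[item]] if totals[business_of[item]] else 0.0)
        for item, visits in visits_by_item.items()
    }


# Made-up sample figures for illustration only.
visits = {"product_info": 60, "product_image": 30, "recommended_list": 10, "refund_form": 5}
business = {
    "product_info": "order",
    "product_image": "order",
    "recommended_list": "order",
    "refund_form": "refund",
}
values = hotspot_values(visits, business)
```

Under this definition the values of all data belonging to one business sum to 1, which matches the reading of the value as an access probability.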
  • Hot data refers to data with a high predicted access probability.
  • Cache warming refers to loading data from the database into the cache in advance. After the multiple predicted hotspot evaluation values are obtained, a numerical range is determined for the predicted hotspot evaluation value, and this range is used as the condition for screening hot data.
  • Screening hot data from the multiple data in the database refers to selecting the data in the database whose predicted hotspot evaluation value falls within that numerical range; the selected data is the hot data.
  • Screening out hot data from the multiple data in the database for cache preheating according to the multiple predicted hotspot evaluation values includes: screening out, from the multiple data in the database, the data whose predicted hotspot evaluation value is greater than or equal to a preset evaluation value threshold, and determining the screened-out data as hot data; and storing the hot data in the cache.
  • That is, the data whose predicted hotspot evaluation value is greater than or equal to the preset evaluation value threshold is taken as hot data.
  • The preset evaluation value threshold is a value set manually in advance and used to screen hot data: when the predicted hotspot evaluation value of a piece of data is greater than or equal to the threshold, the data is determined to be hot data. All data in the database whose predicted hotspot evaluation value is greater than or equal to the threshold is screened out and loaded into the cache.
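A minimal sketch of the threshold screening and cache loading described above, assuming the database and the cache can both be modelled as plain dictionaries. The name `warm_cache` and the sample values are hypothetical.

```python
def warm_cache(predicted_values, database, cache, threshold):
    """Load every item whose predicted value is >= threshold into the cache.

    predicted_values: dict mapping a data key to its predicted hotspot evaluation value.
    database, cache:  dict-like stores (an assumption made for illustration).
    Returns the list of keys that were treated as hot data.
    """
    hot_keys = [key for key, value in predicted_values.items() if value >= threshold]
    for key in hot_keys:
        cache[key] = database[key]
    return hot_keys


database = {"product_info": "info...", "product_image": "image...", "recommended_list": "list..."}
predicted = {"product_info": 0.82, "product_image": 0.70, "recommended_list": 0.15}
cache = {}
hot = warm_cache(predicted, database, cache, threshold=0.7)
```

Because the comparison is "greater than or equal to", an item whose value is exactly at the threshold (here `product_image`) is also loaded.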
  • In the technical solution of this embodiment, the business prediction model predicts the business forecast access information of the target time period from the business collection and access information of the historical time period, the data prediction model predicts multiple predicted hotspot evaluation values of multiple data in the database, and hot data is screened out according to the predicted hotspot evaluation values for cache preheating.
  • This solves the problems in the related art of low cache preheating accuracy and high labor costs caused by fixedly initializing all data of a business or manually selecting data according to business characteristics. Hot data is predicted from the business collection and access information of the historical time period, and predicting the business forecast access information and the hotspot evaluation values with two models refines the prediction process, improves the accuracy of hot data prediction, and reduces labor costs.
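The two-stage flow of this embodiment can be sketched as a small pipeline. The two "models" below are deliberately naive stand-ins (a repeat-the-last-period forecast and a fixed per-business share table), chosen only so the example runs. The application itself names neural network models for both stages, and every identifier here is an assumption.

```python
def predict_hot_data(history, business_model, data_model, threshold):
    """Stage 1: business forecast; stage 2: per-data values; then threshold screening."""
    forecast = business_model(history)   # business forecast access information
    values = data_model(forecast)        # predicted hotspot evaluation values
    return {key for key, value in values.items() if value >= threshold}


def naive_business_model(history):
    # Stand-in assumption: the target period repeats the historical visit volumes.
    return dict(history)


# Stand-in assumption: fixed share of each business's visits going to each data item.
SHARES = {"order": {"product_info": 0.6, "product_image": 0.3, "recommended_list": 0.1}}


def naive_data_model(forecast):
    total = sum(forecast.values())
    values = {}
    if total == 0:
        return values
    for business_type, volume in forecast.items():
        for item, share in SHARES.get(business_type, {}).items():
            values[item] = share * volume / total
    return values


history = {"order": 900, "refund": 100}
hot = predict_hot_data(history, naive_business_model, naive_data_model, threshold=0.25)
```

Keeping the two stages behind separate callables mirrors the split into a business prediction model and a data prediction model, so either stand-in could be replaced by a trained model without touching the screening step.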
  • FIG. 2 is a flow chart of a cache preheating method provided in Embodiment 2 of the present application.
  • the technical solution of this embodiment is described on the basis of the above technical solution.
  • Before the business forecast access information is input into the pre-trained data prediction model to obtain multiple predicted hotspot evaluation values of multiple data in the database for the target time period, the method further includes: obtaining data training samples, the data training samples including database access information, business access information, and multiple detected hotspot evaluation values of multiple data in the database; training a first model according to the data training samples; and, when training of the first model is completed, determining the first model at the current moment as the data prediction model.
  • the method includes:
  • S220 Input the service collection and access information of the historical time period into the pre-trained service prediction model to obtain the service forecast visit information of the target time period.
  • Data training samples are used to train data prediction models.
  • the data training samples are database access information, business access information, and multiple detection hotspot evaluation values of multiple data in the database within a period of time.
  • the database access information refers to the number of visits to each of the multiple data in the database.
  • The data in the database includes product information, product images, recommended product lists, and the like; the detected hotspot evaluation value of each piece of data in the database can be obtained by manual labeling.
  • Business access information refers to the access volume of each data under each business in various businesses.
  • the data training samples further include cache access information.
  • the cache access information is the visit volume of each of the multiple data in the cache, and the data in the cache is a source of information for a user when accessing a website.
  • Including cache access information enriches the content of the data training samples, makes them more representative, and improves the prediction accuracy of the data prediction model.
  • Acquiring the data training samples includes: acquiring cache access information, database access information, and business access information in a sampling time period, and determining multiple detected hotspot evaluation values of multiple data in the database; and forming the data training samples from the cache access information, database access information, business access information, and multiple detected hotspot evaluation values of the sampling time period.
  • The multiple detected hotspot evaluation values of the multiple data in the database can be calculated.
  • For example, the visit volume of each of the multiple data in the database can be counted and then normalized to obtain the detected hotspot evaluation value of each piece of data.
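The text does not say which normalization is used. The sketch below assumes min-max scaling to the range 0 to 1, which is only one possible choice, and the function name is hypothetical.

```python
def normalize_visits(visits):
    """Min-max scale visit counts to [0, 1] (assumed normalization, not from the source).

    visits: dict mapping a data key to its counted visit volume.
    """
    if not visits:
        return {}
    lowest, highest = min(visits.values()), max(visits.values())
    if highest == lowest:
        # All items were visited equally often; no item stands out.
        return {key: 0.0 for key in visits}
    span = highest - lowest
    return {key: (count - lowest) / span for key, count in visits.items()}


detected = normalize_visits({"a": 10, "b": 60, "c": 110})
```

Dividing each count by the total visit volume, as in the earlier example calculation, would be an equally valid normalization; the two differ in that min-max scaling always maps the least visited item to 0.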
  • the first model is an intelligent algorithm model.
  • the first model may include a linear model, a decision tree model, or a deep learning model.
  • the data training samples are used as input data to train the first model, and the training is a process in which the first model continuously adjusts the parameters of the first model according to the training results.
  • Determining the multiple detected hotspot evaluation values of the multiple data in the database includes: inputting the data access information and business access information of the sampling time period into the first model to determine the multiple detected hotspot evaluation values of the multiple data in the database.
  • Exemplarily, the constructed model may be a linear model, a decision tree model, a deep learning model, or the like.
  • The training samples are constructed manually and used to train the constructed model; the detected hotspot evaluation value of each piece of data in the training samples is labeled manually.
  • The training samples are divided into a training set and a validation set. Exemplarily, 80% of the training samples may be used as the training set and the remaining 20% as the validation set.
  • The training set is used to train the constructed model, and the validation set is used to verify whether the output accuracy of the constructed model has reached the expected level.
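The 80/20 division can be sketched with the standard library alone. Shuffling before the cut and the fixed seed are assumptions added here so that the split is unbiased and reproducible; the function name is hypothetical.

```python
import random


def split_samples(samples, train_fraction=0.8, seed=0):
    """Shuffle a copy of the samples and cut it into (training set, validation set)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


training_set, validation_set = split_samples(range(100))
```

The input sequence is left untouched, and every sample ends up in exactly one of the two sets.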
  • An output accuracy threshold can be preset for the model and used to judge whether the constructed model has finished training. When the output accuracy of the model is greater than or equal to the preset output accuracy threshold, training is complete and the first model is obtained; when the output accuracy is less than the threshold, training continues until the output accuracy is greater than or equal to the threshold, at which point training ends and the first model is obtained.
  • the data access information and service access information of the sampling time period are input into the first model, and the first model outputs multiple detection hotspot evaluation values of multiple data in the database.
  • the completion of the first model training means that the prediction accuracy of the first model is greater than or equal to the preset accuracy threshold or the set rounds of training have been completed.
  • the preset accuracy rate threshold may be a value set manually, and the number of setting rounds refers to the number of times set manually.
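The completion condition (accuracy at or above a preset threshold, or a set number of rounds finished) can be sketched as a generic loop. `train_round` and `evaluate` are hypothetical callables standing in for one training pass and one validation pass of whatever model is used.

```python
def train_until_done(train_round, evaluate, accuracy_threshold, max_rounds):
    """Train until the accuracy threshold is met or max_rounds have been run.

    Returns (rounds_run, last_accuracy).
    """
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    accuracy = 0.0
    for round_no in range(1, max_rounds + 1):
        train_round()            # adjust the model parameters once
        accuracy = evaluate()    # accuracy on the validation set
        if accuracy >= accuracy_threshold:
            return round_no, accuracy
    return max_rounds, accuracy


# Fake training run for illustration: the validation accuracy follows a fixed sequence.
fake_accuracies = iter([0.60, 0.72, 0.81, 0.90])
result = train_until_done(lambda: None, lambda: next(fake_accuracies), 0.80, 10)
```

The model state at the moment the loop returns corresponds to "the first model at the current moment" that is saved as the data prediction model.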
  • the current moment refers to the moment when the training of the first model is completed, and the first model at the current moment is saved and determined as the data prediction model.
  • S260 Input the business forecast access information into the pre-trained data prediction model to obtain multiple predicted hotspot evaluation values of multiple data in the database for the target time period.
  • In this embodiment, data training samples consisting of database access information, business access information, and multiple detected hotspot evaluation values of multiple data in the database are acquired, and the first model is trained on them to determine the data prediction model.
  • Because the data training samples include multiple kinds of information, their coverage and representativeness increase, which improves the prediction accuracy of the data prediction model and of the cached data prediction and saves labor costs.
  • FIG. 3 is a flow chart of a cache preheating method provided in Embodiment 3 of the present application.
  • the technical solution of this embodiment is described on the basis of the above technical solution.
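  • Before the business collection and access information of the historical time period is input into the pre-trained business prediction model, the method further includes: obtaining the business collection and access information of a first day and of a second day, and generating business training samples; training a second model according to the business training samples; and, when training of the second model is completed, determining the second model at the current moment as the business prediction model.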
  • the method includes:
  • The business collection and access information of the first day is the input data of a training sample, and the business collection and access information of the second day is the output data; together they form a business training sample.
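One way to form such samples from a longer log is to pair each day with the following day. The sketch below assumes the log is a chronologically ordered list of per-day visit dictionaries; the function name and figures are hypothetical.

```python
def build_business_samples(daily_visits):
    """Pair each day's visit information (input) with the next day's (expected output).

    daily_visits: list of per-day records in chronological order.
    """
    return [
        (daily_visits[i], daily_visits[i + 1])
        for i in range(len(daily_visits) - 1)
    ]


# Made-up three-day log for illustration.
log = [
    {"order": 900, "refund": 100},
    {"order": 950, "refund": 80},
    {"order": 1000, "refund": 120},
]
samples = build_business_samples(log)
```

A log of n days yields n - 1 samples, so a single pair of days (first day, second day) gives exactly one training sample.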
  • the second model is an intelligent algorithm model.
  • the second model may be a convolutional neural network model, a time series neural network model, an extreme learning machine model, or an autoencoder neural network model.
  • the business training samples are used as input to train the second model, and the training is the process in which the second model continuously adjusts the parameters of the second model according to the training results.
  • the completion of the second model training means that the prediction accuracy of the second model is greater than or equal to the preset accuracy threshold or the set rounds of training have been completed.
  • the preset accuracy rate threshold may be a value set manually, and the number of setting rounds refers to the number of times set manually.
  • the current moment refers to the moment when the second model finishes training, and the second model at the current moment is saved and determined as the service prediction model.
  • S350 Input the service collection and access information of the historical time period into the pre-trained service prediction model to obtain the service forecast visit information of the target time period.
  • In this embodiment, the business collection and access information of the first day and of the second day is used as business training samples to train the second model and determine the business prediction model, so that the business collection and access information of the next day can be predicted from that of the previous day, which improves the accuracy of the business prediction model and saves labor costs.
  • FIG. 4 is a schematic diagram of an application scenario of a warm-up cache method provided in Embodiment 4 of the present application.
  • the technical solution of this embodiment is applicable to the application scenario of the cache preheating method, and the schematic diagram includes:
  • the database access information module 410, the service access information module 420 and the cache access information module 430 are respectively configured to acquire database access information, service access information and cache access information in the sampling period.
  • the data training sample module 460 is configured to use the acquired database access information, service access information and cache access information data to form a data training sample, and the data training sample is used to train the data prediction model 470 .
  • the service access information acquired by the service access information module 420 in the sampling period is used to train the service prediction model 440 .
  • the business forecast model 440 is used to obtain business forecast visit information according to the input business visit information.
  • Exemplarily, the business forecast model 440 can predict the business visit information of day D+1 from the input business visit information, where D+1 denotes the day number plus one; that is, the business forecast model 440 predicts the business forecast visit information for the time 24 hours after the input business visit information.
  • the business forecast visit information module 450 is configured to input the business forecast visit information obtained from the business forecast model 440 into the data forecast model 470 as input data.
  • The data prediction model 470 is used to obtain the predicted hotspot evaluation values from the business forecast access information.
  • Exemplarily, the data prediction model 470 predicts the data access information and hotspot evaluation values of day D+1 from the business forecast visit information of day D+1.
  • The input of the data prediction model 470 is the business forecast access information, and the output is the data access information and hotspot evaluation values.
  • The hotspot evaluation values are normalized so that they fall between 0 and 1.
  • The predicted hotspot evaluation value module 480 is configured to screen hot data according to the obtained hotspot evaluation values. Exemplarily, the predicted hotspot evaluation values greater than or equal to 0.7 are selected, and the data corresponding to them is taken as hot data.
  • the cache preheating module 490 is configured to load hotspot data into the cache, perform cache preheating, and set the valid time to 24 hours.
  • The cache destruction module 411 is configured to destroy the cache periodically: cached data is cleared once its storage time in the cache equals the set valid time, making room for new hot data, and the cleared content is loaded into the data training sample module 460 to train the data prediction model, so that the data prediction model 470 is updated in time and the accuracy of the hotspot evaluation values predicted for the next time period improves.
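A minimal sketch of a warm cache with a 24-hour valid time and a destruction step that hands the cleared entries' hit counts back for training. The class name, the idea of tracking hit counts as the "cleared content", and the explicit `now` parameter (used instead of a real clock so the example is deterministic) are all assumptions.

```python
class WarmCache:
    """In-memory cache whose entries are destroyed after a fixed valid time."""

    def __init__(self, ttl_seconds=24 * 3600):
        self.ttl = ttl_seconds
        self.entries = {}  # key -> [value, loaded_at, hit_count]

    def load(self, key, value, now):
        """Preheat one piece of hot data at time `now` (seconds)."""
        self.entries[key] = [value, now, 0]

    def get(self, key):
        """Return the cached value and count the hit, or None on a miss."""
        entry = self.entries.get(key)
        if entry is None:
            return None
        entry[2] += 1
        return entry[0]

    def destroy_expired(self, now):
        """Clear entries stored for at least the valid time; return key -> hit count."""
        expired = {k: e for k, e in self.entries.items() if now - e[1] >= self.ttl}
        for key in expired:
            del self.entries[key]
        return {key: entry[2] for key, entry in expired.items()}


cache = WarmCache()
cache.load("product_info", {"name": "example"}, now=0)
cache.get("product_info")
cache.get("product_info")
cleared = cache.destroy_expired(now=24 * 3600)
```

The dictionary returned by `destroy_expired` is one possible form of the cache access information that is fed back into the data training samples before new hot data is loaded.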
  • the functional modules of this embodiment can be developed through Java code.
  • The embodiment of the present application predicts multiple predicted hotspot evaluation values of multiple data in the database from the obtained business forecast access information, screens out the hot data, and sets a valid time for the cache.
  • When the time for which cached data has been stored in the cache equals the set valid time, the cached data is used for data prediction model training and the cached data is updated. This keeps the cached data up to date, improves the accuracy of the data prediction model, and improves the accuracy and real-time performance of cache preheating.
  • FIG. 5 is a schematic structural diagram of a cache preheating device provided in Embodiment 5 of the present application.
  • Embodiment 5 is a corresponding device for implementing the cache preheating method provided in the above embodiments of the present application.
  • the device can be implemented in software and/or hardware, and can generally be integrated into a computer device.
  • The cache preheating device includes: a business collection and access information acquisition module 510, configured to obtain business collection and access information of a historical time period; a business forecast access information acquisition module 520, configured to input the business collection and access information of the historical time period into a pre-trained business prediction model to obtain business forecast access information of a target time period; a predicted hotspot evaluation value acquisition module 530, configured to input the business forecast access information into a pre-trained data prediction model to obtain multiple predicted hotspot evaluation values of multiple data in the database for the target time period; and a cache preheating module 540, configured to screen out hot data from the multiple data in the database according to the multiple predicted hotspot evaluation values and perform cache preheating.
  • In the technical solution of this embodiment, the business prediction model predicts the business forecast access information of the target time period from the business collection and access information of the historical time period, the data prediction model predicts multiple predicted hotspot evaluation values of multiple data in the database, and hot data is screened out according to the predicted hotspot evaluation values for cache preheating.
  • This solves the problems in the related art of low cache preheating accuracy and high labor costs caused by fixedly initializing all data of a business or manually selecting data according to business characteristics. Hot data is predicted from the business collection and access information of the historical time period, and predicting the business forecast access information and the hotspot evaluation values with two models refines the prediction process, improves the accuracy of hot data prediction, and reduces labor costs.
  • The cache preheating device further includes: a training sample acquisition module, configured to acquire data training samples, the data training samples including database access information, business access information, and multiple detected hotspot evaluation values of multiple data in the database; a first model training module, configured to train a first model according to the data training samples; and a data prediction model determination module, configured to determine the first model at the current moment as the data prediction model when training of the first model is completed.
  • the data training samples further include cache access information.
  • The training sample acquisition module is configured to: acquire cache access information, database access information, and business access information in the sampling time period, and determine multiple detected hotspot evaluation values of multiple data in the database; and form a data training sample from the cache access information, database access information, business access information, and multiple detected hotspot evaluation values of the sampling time period.
  • The training sample acquisition module determines the multiple detected hotspot evaluation values of the multiple data in the database in the following manner: the data access information and business access information of the sampling time period are input into the first model to determine the multiple detected hotspot evaluation values of the multiple data in the database.
  • The cache preheating device also includes: a business training sample generation module, configured to obtain the business collection and access information of the first day and of the second day and generate business training samples; a second model training module, configured to train a second model according to the business training samples; and a business prediction model determining module, configured to determine the second model at the current moment as the business prediction model when training of the second model is completed.
  • The cache preheating module is configured to: screen out, from the multiple data in the database, the data whose predicted hotspot evaluation value is greater than or equal to the preset evaluation value threshold, and determine the screened-out data as hot data; and store the hot data in the cache.
  • the above-mentioned device can execute the cache warm-up method provided in the embodiment of the present application, and has corresponding functional modules and effects for executing the cache warm-up method.
  • FIG. 6 is a schematic structural diagram of a computer device provided in Embodiment 6 of the present application.
  • FIG. 6 shows a block diagram of an exemplary computer device 12 suitable for implementing embodiments of the present application.
  • the computer device 12 shown in FIG. 6 is only an example, and should not limit the functions and scope of use of the embodiment of the present application.
  • computer device 12 takes the form of a general-purpose computing device.
  • Components of computer device 12 may include, but are not limited to: one or more processors or processing units 16 , system memory 28 , bus 18 connecting various system components including system memory 28 and processing unit 16 .
  • the computer device 12 may be a bus-attached device.
  • Bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures.
  • bus structures include but are not limited to Industry Standard Architecture (Industry Standard Architecture, ISA) bus, Micro Channel Architecture (Micro Channel Architecture, MCA) bus, Enhanced ISA bus, Video Electronics Standards Association (Video Electronics Standards Association, VESA) local bus and Peripheral Component Interconnect (PCI) bus.
  • Computer device 12 includes a variety of computer system readable media. These media can be any available media that can be accessed by computer device 12 and include both volatile and nonvolatile media, removable and non-removable media.
  • System memory 28 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 30 and/or cache memory 32 .
  • Computer device 12 may include other removable/non-removable, volatile/nonvolatile computer system storage media.
  • storage system 34 may be configured to read and write to non-removable, non-volatile magnetic media (not shown in FIG. 6, commonly referred to as a "hard drive").
  • A magnetic disk drive configured to read from and write to a removable non-volatile magnetic disk (such as a "floppy disk") may be provided, as well as an optical disc drive configured to read from and write to a removable non-volatile optical disc (such as a Compact Disc Read-Only Memory (CD-ROM), a Digital Video Disc Read-Only Memory (DVD-ROM), or other optical media).
  • System memory 28 may include at least one program product having a set (eg, at least one) of program components configured to perform the functions of various embodiments of the present application.
  • Programs/utilities 40, each having a set (at least one) of program components 42, may be stored, for example, in system memory 28. Such program components 42 include, but are not limited to, an operating system, one or more application programs, other program components, and program data; each of these examples, or some combination of them, may include an implementation of a network environment.
  • Program components 42 generally perform the functions and/or methodologies of the embodiments described herein.
  • The computer device 12 may also communicate with one or more external devices 14 (e.g., a keyboard, a pointing device, a display 24, etc.), with one or more devices that enable a user to interact with the computer device 12, and/or with any device (e.g., a network card, a modem, etc.) that enables the computer device 12 to communicate with one or more other computing devices.
  • This communication can be performed through an input/output (Input/Output, I/O) interface 22 .
  • Computer device 12 can also communicate with one or more networks (such as a Local Area Network (LAN) or a Wide Area Network (WAN)) through network adapter 20.
  • Network adapter 20 communicates with the other components of computer device 12 via bus 18.
  • Computer device 12 may be used in conjunction with other hardware and/or software components, including but not limited to: microcode, device drivers, additional processing units, external disk drive arrays, Redundant Arrays of Inexpensive Disks (RAID) systems, tape drives, and data backup storage systems.
  • the processing unit 16 executes various functional applications and data processing by running programs stored in the system memory 28 , such as implementing the cache warming method provided in any embodiment of the present application.
  • Embodiment 7 of the present application provides a computer-readable storage medium on which a computer program is stored.
  • When the program is executed by a processor, the cache warmup method provided in all embodiments of the present application is implemented; that is, the following is implemented when the program is executed by the processor: acquiring service collection access information of a historical time period; inputting the service collection access information of the historical time period into a pre-trained service prediction model to obtain service prediction access information of a target time period; inputting the service prediction access information into a pre-trained data prediction model to obtain a plurality of predicted hotspot evaluation values of a plurality of pieces of data in a database for the target time period; and selecting, according to the plurality of predicted hotspot evaluation values, hotspot data from the database and performing cache warmup.
  • the computer storage medium in the embodiments of the present application may use any combination of one or more computer-readable media.
  • the computer readable medium may be a computer readable signal medium or a computer readable storage medium.
  • A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof.
  • Computer-readable storage media can include (a non-exhaustive list): an electrical connection with one or more conductors, a portable computer disk, a hard disk, RAM, Read-Only Memory (ROM), Erasable Programmable Read-Only Memory (EPROM), flash memory, an optical fiber, a portable CD-ROM, an optical storage device, a magnetic storage device, or any combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in conjunction with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal in baseband or as a carrier wave, and the computer-readable signal medium carries computer-readable program code. Such propagated data signals may take many forms, including - but not limited to - electromagnetic signals, optical signals, or any combination of the above.
  • A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transport a program for use by or in conjunction with an instruction execution system, apparatus, or device.
  • the program code contained on the computer readable medium can be transmitted by any medium, including—but not limited to—wireless, wire, optical cable, radio frequency (Radio Frequency, RF), etc., or any combination of the above.
  • Computer program code for carrying out the operations of the present application may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as "C" or similar programming languages.
  • the program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer can be connected to the user computer through any kind of network, including a LAN or WAN, or, alternatively, can be connected to an external computer (eg, via the Internet using an Internet service provider).

Abstract

A cache warmup method and apparatus, and a computer device and a storage medium. The cache warmup method comprises: acquiring service collection access information within a historical time period (S110); inputting the service collection access information within the historical time period into a pre-trained service prediction model, so as to obtain service prediction access information within a target time period (S120); inputting the service prediction access information into a pre-trained data prediction model, so as to obtain a plurality of predicted hotspot evaluation values of a plurality of pieces of data in a database within the target time period (S130); and according to the plurality of predicted hotspot evaluation values, selecting hotspot data from the plurality of pieces of data in the database, and performing cache warmup (S140).

Description

Cache warmup method and apparatus, computer device, and storage medium
This application claims priority to Chinese patent application No. 202111121831.0, filed with the China Patent Office on September 24, 2021, the entire content of which is incorporated herein by reference.
Technical Field
Embodiments of the present application relate to computer technology, for example, to a cache warmup method and apparatus, a computer device, and a storage medium.
Background
Caching is indispensable in large websites and high-concurrency processing systems, and cache warmup is one of the important measures for improving data query performance.
In the related art, cache warmup is mainly implemented either by initializing, in a fixed manner, all data of one service according to a ranking of access volume, or by manually selecting data according to service characteristics.
However, cache warmup methods in the related art suffer from low warmup accuracy, caused by a low cache hit rate, a large warmup range, and a large amount of useless data occupying the cache, as well as from high labor costs.
Summary
Embodiments of the present application provide a cache warmup method and apparatus, a computer device, and a storage medium, so as to warm up cached data, improve the accuracy of cache warmup, and reduce labor costs.
An embodiment of the present application provides a cache warmup method, including:
acquiring service collection access information of a historical time period;
inputting the service collection access information of the historical time period into a pre-trained service prediction model to obtain service prediction access information of a target time period;
inputting the service prediction access information into a pre-trained data prediction model to obtain a plurality of predicted hotspot evaluation values of a plurality of pieces of data in a database for the target time period; and
selecting, according to the plurality of predicted hotspot evaluation values, hotspot data from the plurality of pieces of data in the database, and performing cache warmup.
An embodiment of the present application further provides a cache warmup apparatus, including:
a service collection access information acquisition module, configured to acquire service collection access information of a historical time period;
a service prediction access information acquisition module, configured to input the service collection access information of the historical time period into a pre-trained service prediction model to obtain service prediction access information of a target time period;
a predicted hotspot evaluation value acquisition module, configured to input the service prediction access information into a pre-trained data prediction model to obtain a plurality of predicted hotspot evaluation values of a plurality of pieces of data in a database for the target time period; and
a cache warmup module, configured to select, according to the plurality of predicted hotspot evaluation values, hotspot data from the plurality of pieces of data in the database, and to perform cache warmup.
An embodiment of the present application further provides a computer device, including:
one or more processors; and
a storage apparatus, configured to store one or more programs;
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the cache warmup method provided in the embodiments of the present application.
An embodiment of the present application further provides a storage medium containing computer-executable instructions which, when executed by a computer processor, are used to perform the cache warmup method provided in the embodiments of the present application.
Brief Description of the Drawings
FIG. 1 is a flowchart of a cache warmup method in Embodiment 1 of the present application;
FIG. 2 is a flowchart of a cache warmup method in Embodiment 2 of the present application;
FIG. 3 is a flowchart of a cache warmup method in Embodiment 3 of the present application;
FIG. 4 is a schematic diagram of an application scenario of a cache warmup method in Embodiment 4 of the present application;
FIG. 5 is a schematic structural diagram of a cache warmup apparatus in Embodiment 5 of the present application;
FIG. 6 is a schematic structural diagram of a computer device in Embodiment 6 of the present application.
Detailed Description
The present application is described below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the present application and do not limit it. It should also be noted that the drawings show only some, rather than all, of the structures related to the present application.
Embodiment 1
FIG. 1 is a flowchart of a cache warmup method provided in Embodiment 1 of the present application. This embodiment is applicable to cache warmup scenarios. The method may be performed by a cache warmup apparatus, which may be implemented in software and/or hardware and configured in a computer device. The computer device may be a server device or a client device; for example, a client device may be a mobile phone, a tablet computer, a vehicle-mounted terminal, or a desktop computer. The cache warmup method includes the following steps:
S110. Acquire service collection access information of a historical time period.
The historical time period is a time period before the current time, for example, the previous 8 hours, the previous day, or the previous week. The service collection access information refers to the service types and access volumes of the access information collected within the historical time period. A service type may be order placement, return, refund, and so on; the order placement service type includes service data such as product information, product images, and recommended product lists. The access volume is the number of times data is accessed. For example, when a user opens a web page showing the details of a product, the access volumes of the data displayed on that page, such as the product information, product images, and recommended product list, increase. It should be noted that service types may also be divided more finely; for example, the order placement service type may be subdivided into service types such as food, clothing, and daily necessities. The granularity of the division may be set according to actual needs, and the present application does not limit it.
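As a minimal illustrative sketch of S110, the collection step can be pictured as counting access events per service type inside a time window. The record layout (`AccessEvent`), its field names, and the function `collect_service_access` are assumptions made for this example; the application does not specify a data format.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class AccessEvent:
    """One observed access; all field names are hypothetical."""
    timestamp: datetime
    service_type: str  # e.g. "order", "return", "refund"
    data_key: str      # e.g. "product:1:info"


def collect_service_access(events, now, window):
    """Count accesses per service type within [now - window, now)."""
    start = now - window
    counts = Counter()
    for event in events:
        if start <= event.timestamp < now:
            counts[event.service_type] += 1
    return dict(counts)
```

With a window of `timedelta(hours=8)`, events older than 8 hours are ignored, which corresponds to the "previous 8 hours" example of a historical time period.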
S120. Input the service collection access information of the historical time period into a pre-trained service prediction model to obtain service prediction access information of a target time period.
The target time period is the time period to be predicted, that is, a period of time after the current time, for example, the 8 hours, the day, or the week following the current time. The service prediction access information refers to the predicted access volumes of the services accessed in the future time period. The service prediction model is a model whose input is the service collection access information of the historical time period and whose output is the service prediction access information of the target time period. For example, the service prediction model may be a convolutional neural network model, a time-series neural network model, an extreme learning machine model, or an autoencoder neural network model. The pre-trained service prediction model is a model that has been trained in advance on sample data and satisfies a preset prediction accuracy condition, where satisfying the prediction accuracy condition means that the prediction accuracy of the service prediction model is greater than or equal to a preset accuracy threshold, and the preset accuracy threshold is a manually preset value.
S130. Input the service prediction access information into a pre-trained data prediction model to obtain a plurality of predicted hotspot evaluation values of a plurality of pieces of data in a database for the target time period.
The data prediction model is a model whose input is the service prediction access information of the target time period and whose output is the plurality of predicted hotspot evaluation values of the plurality of pieces of data in the database for the target time period. For example, the data prediction model may be a convolutional neural network model, a time-series neural network model, an extreme learning machine model, or an autoencoder neural network model. The database is the collection of data associated with the services. A hotspot evaluation value indicates the degree of attention paid to a piece of data; for example, it may be the probability that the data is accessed. The method for calculating the hotspot evaluation value may be set manually; for example, within a time period, the access volume of a piece of data divided by the total data access volume of the service to which the data belongs may be used as the hotspot evaluation value of that data.
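The example calculation above (the access volume of a piece of data divided by the total data access volume of its service) can be sketched as follows. The function name and the two dictionary arguments are assumptions made for illustration; returning 0.0 for a service with no accesses is also an assumption, since the application does not address that case.

```python
def hotspot_evaluation_values(visits_by_key, service_of_key):
    """Return visits / total visits of the owning service for each data key.

    visits_by_key:  {data_key: access volume in the time period}
    service_of_key: {data_key: service type the data belongs to}
    """
    # Sum the access volume of all data belonging to each service.
    service_totals = {}
    for key, visits in visits_by_key.items():
        service = service_of_key[key]
        service_totals[service] = service_totals.get(service, 0) + visits

    # Divide each data item's access volume by its service total.
    values = {}
    for key, visits in visits_by_key.items():
        total = service_totals[service_of_key[key]]
        values[key] = visits / total if total else 0.0
    return values
```

Under this definition the values of all data within one service sum to 1, so a value can be read as that data's share of the service's accesses.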
S140. Select, according to the plurality of predicted hotspot evaluation values, hotspot data from the plurality of pieces of data in the database, and perform cache warmup.
Hotspot data is data with a high predicted access probability. Cache warmup means loading data from the database into the cache in advance. After the plurality of predicted hotspot evaluation values are obtained, a numerical range for the predicted hotspot evaluation value is determined and used as the condition for selecting hotspot data. Selecting hotspot data from the plurality of pieces of data in the database means selecting the data in the database whose predicted hotspot evaluation values fall within that numerical range; the selected data is the hotspot data.
In an optional embodiment, selecting, according to the plurality of predicted hotspot evaluation values, hotspot data from the plurality of pieces of data in the database and performing cache warmup includes: selecting, from the plurality of pieces of data in the database, the data whose predicted hotspot evaluation values are greater than or equal to a preset evaluation value threshold, and determining the selected data as hotspot data; and storing the hotspot data in the cache.
According to the obtained predicted hotspot evaluation values, the data corresponding to predicted hotspot evaluation values greater than or equal to the preset evaluation value threshold is taken as hotspot data. The preset evaluation value threshold is a value set manually in advance for selecting hotspot data; that is, when the predicted hotspot evaluation value of a piece of data is greater than or equal to the preset evaluation value threshold, the data is determined to be hotspot data. All data in the database whose predicted hotspot evaluation values are greater than or equal to the preset evaluation value threshold is selected and loaded into the cache.
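The threshold-based selection and loading described in this optional embodiment might look like the sketch below. Plain dictionaries stand in for the database and the cache, and the function names are hypothetical; a real deployment would use its own database client and cache store.

```python
def select_hotspot_keys(predicted_values, threshold):
    """Keep the keys whose predicted hotspot evaluation value is >= threshold."""
    return [key for key, value in predicted_values.items() if value >= threshold]


def load_into_cache(cache, database, hot_keys):
    """Copy the selected records from the database into the cache."""
    for key in hot_keys:
        if key in database:  # skip keys that no longer exist in the database
            cache[key] = database[key]
    return cache
```

A value exactly equal to the threshold is selected, matching the "greater than or equal to" condition in the text.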
Selecting hotspot data according to the predicted hotspot evaluation values reduces the cache space occupied by useless data, which improves the accuracy of hotspot data selection and the utilization of cache space. With cache warmup, when a user requests data, the data can be read directly from the cache rather than from the database, which reduces the load on the database and saves system performance overhead.
In the embodiments of the present application, the service prediction model predicts the service prediction access information of the target time period from the service collection access information of the historical time period, the data prediction model predicts the plurality of hotspot evaluation values of the plurality of pieces of data in the database, and hotspot data is selected according to the plurality of predicted hotspot evaluation values for cache warmup. This solves the problems in the related art of low cache warmup accuracy and high labor costs caused by initializing all data of one service in a fixed manner or manually selecting data according to service characteristics. Hotspot data is predicted from the service collection access information of the historical time period, and the two models separately predict the service prediction access information and the hotspot evaluation values, which refines the prediction process, improves the accuracy of hotspot data prediction, and reduces labor costs.
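To show how the two models chain together across S110 to S140, the following sketch wires the steps into one function. The two models are passed in as plain callables and are stand-ins only; the application describes neural network models, which are not reproduced here. The function name, argument names, and dictionary-based database and cache are all assumptions.

```python
def warm_up_cache(history, service_model, data_model, database, cache, threshold):
    """Run the two-stage prediction and warm the cache; return the hot keys.

    history:       service collection access information (S110 output)
    service_model: callable, history -> predicted service access information
    data_model:    callable, predicted service access -> {data_key: value}
    """
    # S120: predict service-level access for the target time period.
    predicted_service_access = service_model(history)
    # S130: predict a hotspot evaluation value for each piece of data.
    predicted_values = data_model(predicted_service_access)
    # S140: select data at or above the threshold and load it into the cache.
    hot_keys = [
        key for key, value in predicted_values.items()
        if value >= threshold and key in database
    ]
    for key in hot_keys:
        cache[key] = database[key]
    return hot_keys
```

Keeping the models as parameters reflects the two-stage design: the service-level forecast and the data-level forecast can be trained and replaced independently.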
Embodiment 2
FIG. 2 is a flowchart of a cache warmup method provided in Embodiment 2 of the present application. The technical solution of this embodiment is described on the basis of the foregoing technical solution. Before the service prediction access information is input into the pre-trained data prediction model to obtain the plurality of predicted hotspot evaluation values of the plurality of pieces of data in the database for the target time period, the method further includes: acquiring data training samples, the data training samples including database access information, service access information, and a plurality of detected hotspot evaluation values of the plurality of pieces of data in the database; training a first model according to the data training samples; and, when the training of the first model is completed, determining the first model at the current moment as the data prediction model. The method includes:
S210. Acquire service collection access information of a historical time period.
S220. Input the service collection access information of the historical time period into a pre-trained service prediction model to obtain service prediction access information of a target time period.
S230. Acquire data training samples, where the data training samples include database access information, service access information, and a plurality of detected hotspot evaluation values of the plurality of pieces of data in the database.
The data training samples are used to train the data prediction model. The data training samples are the database access information, the service access information, and the plurality of detected hotspot evaluation values of the plurality of pieces of data in the database within a time period. The database access information is the access volume of each of the plurality of pieces of data in the database; for example, the data in the database includes product information, product images, and recommended product lists. The detected hotspot evaluation value of each piece of data in the database may be obtained by manual labeling. The service access information is the access volume of each piece of data under each of a plurality of services.
In an optional embodiment, the data training samples further include cache access information.
The cache access information is the access volume of each of the plurality of pieces of data in the cache; the data in the cache is the source of the information a user sees when accessing the website.
Using the cache access information as part of the data training samples enriches the content of the data training samples, makes them more representative, and improves the prediction accuracy of the data prediction model.
In an optional embodiment, acquiring the data training samples includes: acquiring cache access information, database access information, and service access information of a sampling time period, and determining the plurality of detected hotspot evaluation values of the plurality of pieces of data in the database; and forming the data training samples from the cache access information, the data access information, and the service access information of the sampling time period and the plurality of detected hotspot evaluation values of the plurality of pieces of data in the database.
From the cache access information, the database access information, and the service access information of the sampling time period, the plurality of detected hotspot evaluation values of the plurality of pieces of data in the database can be calculated. For example, the access volume of each of the plurality of pieces of data in the database may be counted from the access volumes of the data associated with the service access information, the cache access information, and the data access information, and the access volume of each piece of data may then be normalized to obtain the detected hotspot evaluation value of each of the plurality of pieces of data in the database.
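The counting and normalization example above can be sketched as follows. The application does not state which normalization is used; dividing by the largest total access volume, so that values fall in [0, 1], is an assumption of this sketch, as are the function name and the per-key dictionary inputs.

```python
from collections import Counter


def detected_hotspot_values(service_visits, cache_visits, database_visits):
    """Sum per-key access volumes from three sources, then normalize.

    Each argument maps a data key to its access volume in the sampling period.
    Normalization by the maximum total is an assumed choice.
    """
    total = Counter()
    for source in (service_visits, cache_visits, database_visits):
        total.update(source)  # Counter.update adds counts key by key

    peak = max(total.values(), default=0)
    if peak == 0:
        return {key: 0.0 for key in total}
    return {key: visits / peak for key, visits in total.items()}
```

Other normalizations, such as dividing by the sum of all access volumes, would fit the description equally well; only the relative ordering of the data matters for threshold-based selection once the threshold is chosen on the same scale.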
将缓存访问信息、数据访问信息、业务访问信息和预先为所述数据库中的多个数据标注的多个检测热点评价值作为训练样本,增大数据训练样本数据覆盖范围,增加数据训练样本数据的代表性,可以提高训练得到的数据预测模型的准确率。Using cache access information, data access information, business access information, and multiple detection hot comment values pre-marked for multiple data in the database as training samples, increasing the coverage of data training sample data and increasing the coverage of data training sample data Representativeness can improve the accuracy of the trained data prediction model.
S240,根据所述数据训练样本对第一模型进行训练。S240. Train the first model according to the data training samples.
第一模型为智能算法模型,示例性的,第一模型可以包括线性模型、决策树模型或深度学习模型等。将数据训练样本作为输入数据训练第一模型,训练即第一模型根据训练结果不断调整第一模型参数的过程。The first model is an intelligent algorithm model. Exemplarily, the first model may include a linear model, a decision tree model, or a deep learning model. The data training samples are used as input data to train the first model, and the training is a process in which the first model continuously adjusts the parameters of the first model according to the training results.
在一个可选实施例中,所述确定所述数据库中的多个数据的多个检测热点评价值,包括:将所述采样时间段的数据访问信息和业务访问信息输入至所述第一模型中,确定所述数据库中的多个数据的多个检测热点评价值。In an optional embodiment, the determining the values of multiple detected hotspots of multiple data in the database includes: inputting data access information and business access information in the sampling time period into the first model In the method, a plurality of detection hotspot evaluation values of a plurality of data in the database are determined.
Before the multiple detected hotspot evaluation values of the multiple pieces of data in the database are determined, a model is first constructed and trained to obtain the first model. Exemplarily, the constructed model may be a linear model, a decision tree model, a deep learning model, or the like. Training samples are constructed manually and used to train the constructed model; the detected hotspot evaluation value of each piece of data in the training samples is labeled manually. The training samples are divided into a training set and a validation set. Exemplarily, 80% of the training samples may be used as the training set and 20% as the validation set. The training set is used to train the constructed model, and the validation set is used to verify whether the output accuracy of the constructed model meets expectations. Exemplarily, an output accuracy threshold may be preset to measure whether the constructed model has finished training: when the output accuracy of the model is greater than or equal to the preset threshold, training is complete and the first model is obtained; when the output accuracy is less than the preset threshold, training continues until the output accuracy is greater than or equal to the threshold, at which point training ends and the first model is obtained. After the first model is obtained, the data access information and service access information of the sampling time period are input into the first model, and the first model outputs the multiple detected hotspot evaluation values of the multiple pieces of data in the database.
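Exemplarily, the 80%/20% split and the train-until-threshold loop described above may be sketched as follows. This is a minimal sketch under stated assumptions: the function names `split_samples` and `train_until_accurate`, the one-parameter linear toy model, the tolerance-based definition of accuracy, and the `max_rounds` safety guard are all hypothetical and are not specified by the present application.

```python
import random


def split_samples(samples, train_ratio=0.8, seed=0):
    """Shuffle labeled samples and split them into a training set and a validation set."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def train_until_accurate(train_round, evaluate, accuracy_threshold, max_rounds=1000):
    """Run training rounds until the validation accuracy reaches the preset threshold.

    max_rounds is a safety guard assumed for this sketch only.
    """
    for round_no in range(1, max_rounds + 1):
        train_round()
        accuracy = evaluate()
        if accuracy >= accuracy_threshold:
            return round_no, accuracy
    raise RuntimeError("accuracy threshold not reached within max_rounds")


# Toy data (assumed): one feature x and a manually labeled score of 0.5 * x.
samples = [(i / 100, 0.5 * i / 100) for i in range(100)]
train_set, validation_set = split_samples(samples)

model = {"w": 0.0}  # one-parameter linear model: score = w * x


def train_round(learning_rate=0.5):
    """One pass of stochastic gradient descent over the training set."""
    for x, label in train_set:
        model["w"] += learning_rate * (label - model["w"] * x) * x


def evaluate(tolerance=0.01):
    """Fraction of validation samples predicted within the tolerance of their label."""
    hits = sum(abs(model["w"] * x - label) <= tolerance for x, label in validation_set)
    return hits / len(validation_set)


rounds, accuracy = train_until_accurate(train_round, evaluate, accuracy_threshold=0.95)
print(rounds, accuracy)
```

The validation set is never used for parameter updates in this sketch; it only decides when training stops, which mirrors the role the text assigns to it.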
By inputting the data access information and service access information of the sampling time period into the first model, the multiple detected hotspot evaluation values of the multiple pieces of data in the database can be determined. Obtaining these values through the first model improves the accuracy of the detected hotspot evaluation value of each piece of data in the database, and in turn the accuracy of hotspot evaluation value prediction.
S250. When training of the first model is completed, determine the first model at the current moment as the data prediction model.
Completion of the first model's training means that the prediction accuracy of the first model is greater than or equal to a preset accuracy threshold, or that a set number of training rounds has been completed. The preset accuracy threshold may be a manually set value, and the set number of rounds is a manually set count. The current moment refers to the moment at which the first model completes training; the first model at the current moment is saved and determined as the data prediction model.
S260. Input the predicted service access information into the pre-trained data prediction model to obtain multiple predicted hotspot evaluation values of the multiple pieces of data in the database for the target time period.
S270. According to the multiple predicted hotspot evaluation values, select hotspot data from the multiple pieces of data in the database for cache warmup.
In this embodiment of the present application, the first model is trained and the data prediction model is determined by acquiring data training samples comprising database access information, service access information, and multiple detected hotspot evaluation values of multiple pieces of data in the database. Because the data training samples contain several kinds of information, their coverage and representativeness are increased, which improves the prediction accuracy of the data prediction model, improves the accuracy of cached-data prediction, and saves labor costs.
Embodiment 3
FIG. 3 is a flowchart of a cache warmup method provided in Embodiment 3 of the present application. The technical solution of this embodiment is described on the basis of the above technical solutions. Before the collected service access information of the historical time period is input into the pre-trained service prediction model to obtain the predicted service access information of the target time period, the method further includes: acquiring the collected service access information of a first day and the collected service access information of a second day, and generating service training samples; training a second model according to the service training samples; and, when training of the second model is completed, determining the second model at the current moment as the service prediction model. The method includes:
S310. Acquire collected service access information of a historical time period.
S320. Acquire the collected service access information of a first day and the collected service access information of a second day, and generate service training samples.
The collected service access information of the first day is the input data of a training sample, and the collected service access information of the second day is the output data of that training sample; together they form a service training sample.
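Exemplarily, the pairing of first-day input with second-day output may be sketched as follows. The function name `build_service_training_samples`, the record layout, and the service names are hypothetical and serve only to illustrate the pairing; the present application does not prescribe a data format.

```python
def build_service_training_samples(daily_access):
    """Pair each day's collected access record (input) with the next day's record (output)."""
    return list(zip(daily_access[:-1], daily_access[1:]))


# Hypothetical daily records: access counts per service, ordered by date.
daily_access = [
    {"order_query": 1200, "login": 5400},  # day 1
    {"order_query": 1350, "login": 5100},  # day 2
    {"order_query": 1500, "login": 5600},  # day 3
]

samples = build_service_training_samples(daily_access)
for day_input, day_output in samples:
    print(day_input, "->", day_output)
```

With N consecutive days of records this yields N-1 samples, so each day except the first and last serves once as an output and once as an input.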
S330. Train a second model according to the service training samples.
The second model is an intelligent algorithm model. Exemplarily, the second model may be a convolutional neural network model, a time series neural network model, an extreme learning machine model, an autoencoder neural network model, or the like. The second model is trained with the service training samples as input; training is the process in which the parameters of the second model are continuously adjusted according to the training results.
S340. When training of the second model is completed, determine the second model at the current moment as the service prediction model.
Completion of the second model's training means that the prediction accuracy of the second model is greater than or equal to a preset accuracy threshold, or that a set number of training rounds has been completed. The preset accuracy threshold may be a manually set value, and the set number of rounds is a manually set count. The current moment refers to the moment at which the second model completes training; the second model at the current moment is saved and determined as the service prediction model.
S350. Input the collected service access information of the historical time period into the pre-trained service prediction model to obtain the predicted service access information of the target time period.
S360. Input the predicted service access information into the pre-trained data prediction model to obtain multiple predicted hotspot evaluation values of the multiple pieces of data in the database for the target time period.
S370. According to the multiple predicted hotspot evaluation values, select hotspot data from the multiple pieces of data in the database for cache warmup.
In this embodiment of the present application, the collected service access information of the first day and of the second day is used as service training samples to train the second model and determine the service prediction model, so that the collected service access information of the following day can be predicted from that of the preceding day, which improves the accuracy of the service prediction model and saves labor costs.
Embodiment 4
FIG. 4 is a schematic diagram of an application scenario of a cache warmup method provided in Embodiment 4 of the present application. The technical solution of this embodiment is applicable to application scenarios of the cache warmup method. The schematic diagram includes:
A database access information module 410, a service access information module 420, and a cache access information module 430, respectively configured to acquire database access information, service access information, and cache access information in a sampling time period.
A data training sample module 460, configured to form data training samples from the acquired database access information, service access information, and cache access information; the data training samples are used to train the data prediction model 470.
The service access information of the sampling time period acquired by the service access information module 420 is used to train the service prediction model 440. The service prediction model 440 is used to obtain predicted service access information from input service access information. Exemplarily, the service prediction model 440 may predict the service access information of day D+1 from the service access information, where D+1 denotes the day number plus one; that is, the service prediction model 440 predicts the service access information 24 hours after the input service access information.
A predicted service access information module 450, configured to input the predicted service access information obtained from the service prediction model 440 into the data prediction model 470 as input data.
The data prediction model 470 is used to obtain predicted hotspot evaluation values from the predicted service access information. Exemplarily, the data prediction model 470 predicts the data access information and hotspot evaluation values of day D+1 from the predicted service access information of day D+1. The input of the data prediction model 470 is the predicted service access information, and its output is the data access information and hotspot evaluation values. The hotspot evaluation values are normalized so that they fall within the range 0 to 1.
A predicted hotspot evaluation value module 480, configured to select hotspot data according to the obtained hotspot evaluation values. Exemplarily, predicted hotspot evaluation values greater than or equal to 0.7 are selected, and the data corresponding to the selected values is taken as hotspot data.
A cache warmup module 490, configured to load the hotspot data into the cache to perform cache warmup, with a validity period of 24 hours.
A cache destruction module 411, configured to destroy the cache periodically: when cached data has been stored in the cache for a time equal to the set validity period, the cache content is cleared so that new hotspot data can be loaded, and the cleared content is loaded into the data training sample module 460 for training the data prediction model, so that the data prediction model 470 is updated in time and the accuracy of the predicted hotspot evaluation values for the next time period is improved.
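Exemplarily, the normalization of hotspot evaluation values to the range 0 to 1 and the selection of values greater than or equal to 0.7 may be sketched as follows. Min-max scaling is an assumption of this sketch, since the normalization method is not specified above; the function names and the data keys are likewise hypothetical.

```python
def normalize_scores(raw_scores):
    """Min-max normalize raw hotspot scores into the range 0 to 1 (assumed method)."""
    if not raw_scores:
        return {}
    low, high = min(raw_scores.values()), max(raw_scores.values())
    if high == low:
        # All scores equal: there is no ranking information, so map everything to 0.0.
        return {key: 0.0 for key in raw_scores}
    return {key: (score - low) / (high - low) for key, score in raw_scores.items()}


def select_hot_keys(normalized_scores, threshold=0.7):
    """Keep the keys whose normalized score is greater than or equal to the threshold."""
    return [key for key, score in normalized_scores.items() if score >= threshold]


# Hypothetical raw scores output by a data prediction model.
raw = {"vehicle:1001": 10.0, "vehicle:1002": 45.0, "vehicle:1003": 50.0, "vehicle:1004": 20.0}
normalized = normalize_scores(raw)
print(select_hot_keys(normalized))
```

Because min-max scaling is relative to the current batch, the highest-scoring item always normalizes to 1 and is always selected; a deployment that needs an absolute notion of "hot" would need a different normalization.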
The functional modules of this embodiment may be developed in Java.
In this embodiment of the present application, the multiple predicted hotspot evaluation values of the multiple pieces of data in the database are predicted from the obtained predicted service access information, hotspot data is selected, and a validity period is set for the cache. Once cached data has been stored in the cache for a time equal to the set validity period, it is used for training the data prediction model and the cached data is updated. This keeps the cached data up to date, improves the accuracy of the data prediction model, and improves the accuracy and timeliness of cache warmup.
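Exemplarily, a cache with a 24-hour validity period whose expired content is handed back for model training may be sketched as follows. The class name `WarmupCache`, its methods, and the injectable `clock` parameter are hypothetical; the sketch uses an in-memory dictionary and does not represent any particular cache product.

```python
import time


class WarmupCache:
    """In-memory cache whose entries expire after a fixed validity period."""

    def __init__(self, ttl_seconds=24 * 3600, clock=time.time):
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so that expiry can be exercised without waiting
        self._entries = {}       # key -> (value, stored_at)

    def warm_up(self, hot_data):
        """Load the selected hotspot data into the cache."""
        now = self._clock()
        for key, value in hot_data.items():
            self._entries[key] = (value, now)

    def get(self, key):
        """Return the cached value, or None if the key is not cached."""
        entry = self._entries.get(key)
        return entry[0] if entry else None

    def destroy_expired(self):
        """Remove entries stored for at least the validity period and return them.

        The returned content can be passed on as additional training samples.
        """
        now = self._clock()
        expired = {
            key: value
            for key, (value, stored_at) in self._entries.items()
            if now - stored_at >= self._ttl
        }
        for key in expired:
            del self._entries[key]
        return expired


cache = WarmupCache()
cache.warm_up({"vehicle:1002": {"model": "A"}, "vehicle:1003": {"model": "B"}})
print(cache.get("vehicle:1002"))
```

Returning the removed entries from `destroy_expired` rather than discarding them is what lets a caller feed the cleared content back into training, as the embodiment describes.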
Embodiment 5
FIG. 5 is a schematic structural diagram of a cache warmup apparatus provided in Embodiment 5 of the present application. Embodiment 5 is the apparatus corresponding to the cache warmup method provided in the above embodiments of the present application; the apparatus may be implemented in software and/or hardware and may generally be integrated into a computer device. The cache warmup apparatus includes: a collected service access information acquisition module 510, configured to acquire collected service access information of a historical time period; a predicted service access information acquisition module 520, configured to input the collected service access information of the historical time period into a pre-trained service prediction model to obtain predicted service access information of a target time period; a predicted hotspot evaluation value acquisition module 530, configured to input the predicted service access information into a pre-trained data prediction model to obtain multiple predicted hotspot evaluation values of multiple pieces of data in a database for the target time period; and a cache warmup module 540, configured to select, according to the multiple predicted hotspot evaluation values, hotspot data from the multiple pieces of data in the database for cache warmup.
In this embodiment of the present application, the service prediction model predicts the predicted service access information of the target time period from the collected service access information of the historical time period, the data prediction model predicts the multiple hotspot evaluation values of the multiple pieces of data in the database, and hotspot data is selected according to the multiple predicted hotspot evaluation values for cache warmup. This solves the problems of low cache warmup accuracy and high labor costs in the related art, which are caused by initializing all data of one service in a fixed manner or by manually selecting data according to service characteristics. Hotspot data is predicted from the collected service access information of the historical time period; the two models separately predict the predicted service access information and the hotspot evaluation values, which refines the prediction process, improves the accuracy of hotspot data prediction, and reduces labor costs.
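Exemplarily, the chaining of the two prediction stages followed by selection and loading may be sketched as follows. The function `warm_up_cache` and the two stand-in models are hypothetical placeholders for trained models; their internal logic is invented for illustration only and is not the prediction method of the present application.

```python
def warm_up_cache(history, service_model, data_model, load_from_database, cache, threshold=0.7):
    """Chain the two prediction stages and load the selected hotspot data into the cache."""
    predicted_service_access = service_model(history)        # stage 1: service prediction
    predicted_scores = data_model(predicted_service_access)  # stage 2: hotspot evaluation
    hot_keys = [key for key, score in predicted_scores.items() if score >= threshold]
    for key in hot_keys:                                      # cache warmup
        cache[key] = load_from_database(key)
    return hot_keys


def service_model(history):
    # Stand-in: assume the target day repeats the most recent day's traffic.
    return history[-1]


def data_model(predicted_service_access):
    # Stand-in: score each record by its share of the busiest service's predicted traffic.
    peak = max(predicted_service_access.values())
    return {f"{name}:config": count / peak for name, count in predicted_service_access.items()}


database = {"order_query:config": "oq", "login:config": "lg", "report:config": "rp"}
history = [
    {"order_query": 800, "login": 1000},
    {"order_query": 900, "login": 1000, "report": 200},
]

cache = {}
hot_keys = warm_up_cache(history, service_model, data_model, database.__getitem__, cache)
print(hot_keys, cache)
```

Passing the two models in as callables keeps the stages independent, so either model can be retrained or replaced without touching the selection and loading logic.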
Optionally, the cache warmup apparatus further includes: a training sample acquisition module, configured to acquire data training samples, the data training samples including database access information, service access information, and multiple detected hotspot evaluation values of multiple pieces of data in the database; a first model training module, configured to train a first model according to the data training samples; and a data prediction model determination module, configured to determine, when training of the first model is completed, the first model at the current moment as the data prediction model.
Optionally, the data training samples further include cache access information.
Optionally, the training sample acquisition module is configured to: acquire cache access information, database access information, and service access information of a sampling time period, and determine multiple detected hotspot evaluation values of the multiple pieces of data in the database; and form the data training samples according to the cache access information, data access information, and service access information of the sampling time period and the multiple detected hotspot evaluation values of the multiple pieces of data in the database.
Optionally, the training sample acquisition module determines the multiple detected hotspot evaluation values of the multiple pieces of data in the database in the following manner: inputting the data access information and service access information of the sampling time period into the first model to determine the multiple detected hotspot evaluation values of the multiple pieces of data in the database.
Optionally, the cache warmup apparatus further includes: a service training sample generation module, configured to acquire the collected service access information of a first day and the collected service access information of a second day, and generate service training samples; a second model training module, configured to train a second model according to the service training samples; and a service prediction model determination module, configured to determine, when training of the second model is completed, the second model at the current moment as the service prediction model.
Optionally, the cache warmup module is configured to: select, from the multiple pieces of data in the database, the data whose predicted hotspot evaluation value is greater than or equal to a preset evaluation value threshold, and determine the selected data as hotspot data; and store the hotspot data in the cache.
The above apparatus can execute the cache warmup method provided in the embodiments of the present application, and has the functional modules and effects corresponding to executing the cache warmup method.
Embodiment 6
FIG. 6 is a schematic structural diagram of a computer device provided in Embodiment 6 of the present application. FIG. 6 shows a block diagram of an exemplary computer device 12 suitable for implementing embodiments of the present application. The computer device 12 shown in FIG. 6 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in FIG. 6, the computer device 12 takes the form of a general-purpose computing device. Components of the computer device 12 may include, but are not limited to: one or more processors or processing units 16, a system memory 28, and a bus 18 connecting different system components (including the system memory 28 and the processing unit 16). The computer device 12 may be a device attached to a bus.
The bus 18 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus structures. For example, these bus structures include, but are not limited to, an Industry Standard Architecture (ISA) bus, a Micro Channel Architecture (MCA) bus, an Enhanced ISA bus, a Video Electronics Standards Association (VESA) local bus, and a Peripheral Component Interconnect (PCI) bus.
The computer device 12 includes a variety of computer-system-readable media. These media may be any available media that can be accessed by the computer device 12, including volatile and non-volatile media, and removable and non-removable media.
The system memory 28 may include computer-system-readable media in the form of volatile memory, such as a random access memory (RAM) 30 and/or a cache memory 32. The computer device 12 may include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, a storage system 34 may be configured to read from and write to a non-removable, non-volatile magnetic medium (not shown in FIG. 6, commonly referred to as a "hard disk drive"). Although not shown in FIG. 6, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (for example, a "floppy disk") and an optical disk drive for reading from and writing to a removable non-volatile optical disk (for example, a compact disc read-only memory (CD-ROM), a digital video disc read-only memory (DVD-ROM), or other optical media) may be provided. In these cases, each drive may be connected to the bus 18 through one or more data media interfaces. The system memory 28 may include at least one program product having a set of (for example, at least one) program components configured to perform the functions of the embodiments of the present application.
A program/utility 40 having a set of (at least one) program components 42 may be stored, for example, in the system memory 28. Such program components 42 include, but are not limited to, an operating system, one or more application programs, other program components, and program data; each or some combination of these examples may include an implementation of a network environment. The program components 42 generally perform the functions and/or methods of the embodiments described in the present application.
The computer device 12 may also communicate with one or more external devices 14 (for example, a keyboard, a pointing device, a display 24, and the like), with one or more devices that enable a user to interact with the computer device 12, and/or with any device (for example, a network card, a modem, and the like) that enables the computer device 12 to communicate with one or more other computing devices. Such communication may take place through an input/output (I/O) interface 22. In addition, the computer device 12 may communicate with one or more networks (for example, a local area network (LAN) or a wide area network (WAN)) through a network adapter 20. As shown in the figure, the network adapter 20 communicates with the other components of the computer device 12 through the bus 18. Although not shown in FIG. 6, other hardware and/or software components may be used in conjunction with the computer device 12, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, Redundant Arrays of Inexpensive Disks (RAID) systems, tape drives, and data backup storage systems.
The processing unit 16 executes various functional applications and data processing by running programs stored in the system memory 28, for example, implementing the cache warmup method provided in any embodiment of the present application.
Embodiment 7
Embodiment 7 of the present application provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the program implements the cache warmup method provided in all embodiments of the present application; that is, when executed by a processor, the program implements the following method: acquiring collected service access information of a historical time period; inputting the collected service access information of the historical time period into a pre-trained service prediction model to obtain predicted service access information of a target time period; inputting the predicted service access information into a pre-trained data prediction model to obtain multiple predicted hotspot evaluation values of multiple pieces of data in a database for the target time period; and selecting, according to the multiple predicted hotspot evaluation values, hotspot data from the database for cache warmup.
The computer storage medium of the embodiments of the present application may be any combination of one or more computer-readable media. A computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. Computer-readable storage media may include (as a non-exhaustive list): an electrical connection with one or more wires, a portable computer disk, a hard disk, a RAM, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a flash memory, an optical fiber, a portable CD-ROM, an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program can be used by or in conjunction with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, and the computer-readable signal medium carries computer-readable program code. Such a propagated data signal may take many forms, including, but not limited to, an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and that computer-readable medium can send, propagate, or transmit a program for use by or in conjunction with an instruction execution system, apparatus, or device.
The program code contained on a computer-readable medium may be transmitted over any medium, including, but not limited to, wireless, wire, optical cable, radio frequency (RF), and the like, or any suitable combination of the above.
Computer program code for carrying out the operations of the present application may be written in one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. Where a remote computer is involved, the remote computer may be connected to the user's computer through any kind of network, including a LAN or a WAN, or may be connected to an external computer (for example, through the Internet using an Internet service provider).

Claims (10)

  1. A cache warmup method, comprising:
    acquiring collected service access information of a historical time period;
    inputting the collected service access information of the historical time period into a pre-trained service prediction model to obtain predicted service access information of a target time period;
    inputting the predicted service access information into a pre-trained data prediction model to obtain multiple predicted hotspot evaluation values of multiple pieces of data in a database for the target time period;
    selecting, according to the multiple predicted hotspot evaluation values, hotspot data from the multiple pieces of data in the database for cache warmup.
  2. The method according to claim 1, further comprising, before inputting the predicted service access information into the pre-trained data prediction model to obtain the multiple predicted hotspot evaluation values of the multiple pieces of data in the database for the target time period:
    acquiring data training samples, the data training samples comprising database access information, service access information, and multiple detected hotspot evaluation values of the multiple pieces of data in the database;
    training a first model according to the data training samples;
    when training of the first model is completed, determining the first model at the current moment as the data prediction model.
  3. The method according to claim 2, wherein the data training samples further comprise cache access information.
  4. The method according to claim 3, wherein acquiring the data training samples comprises:
    acquiring cache access information, database access information, and service access information of a sampling time period, and determining the multiple detected hotspot evaluation values of the multiple pieces of data in the database;
    forming the data training samples according to the cache access information, data access information, and service access information of the sampling time period and the multiple detected hotspot evaluation values of the multiple pieces of data in the database.
  5. The method according to claim 4, wherein determining the plurality of detected hotspot evaluation values for the plurality of pieces of data in the database comprises:
    inputting the data access information and the business access information for the sampling time period into the first model to determine the plurality of detected hotspot evaluation values for the plurality of pieces of data in the database.
  6. The method according to claim 1, further comprising, before inputting the collected business access information for the historical time period into the pre-trained business prediction model to obtain the predicted business access information for the target time period:
    acquiring collected business access information for a first day and collected business access information for a second day, and generating business training samples;
    training a second model according to the business training samples; and
    when training of the second model is complete, determining the second model at the current moment as the business prediction model.
  7. The method according to any one of claims 1-6, wherein selecting, according to the plurality of predicted hotspot evaluation values, hotspot data from the plurality of pieces of data in the database for cache warmup comprises:
    selecting, from the plurality of pieces of data in the database, data whose predicted hotspot evaluation value is greater than or equal to a preset evaluation value threshold, and determining the selected data as hotspot data; and
    storing the hotspot data in a cache.
  8. A cache warmup apparatus, comprising:
    a collected business access information acquisition module, configured to acquire collected business access information for a historical time period;
    a predicted business access information acquisition module, configured to input the collected business access information for the historical time period into a pre-trained business prediction model to obtain predicted business access information for a target time period;
    a predicted hotspot evaluation value acquisition module, configured to input the predicted business access information into a pre-trained data prediction model to obtain a plurality of predicted hotspot evaluation values for a plurality of pieces of data in a database for the target time period; and
    a cache warmup module, configured to select, according to the plurality of predicted hotspot evaluation values, hotspot data from the plurality of pieces of data in the database for cache warmup.
  9. A computer device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the program, implements the cache warmup method according to any one of claims 1-7.
  10. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the cache warmup method according to any one of claims 1-7.
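
The flow recited in claims 1, 6 and 7 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the claims do not specify model types, feature shapes, or a cache backend, so every name below (`fit_business_model`, `toy_data_model`, `warm_cache`, the item keys, the threshold of 30.0) is a hypothetical stand-in. The business prediction model is assumed here to be a single least-squares growth factor fitted from a first-day/second-day pair, the data prediction model is a fixed-weight placeholder for a pre-trained model, and the database and cache are plain dictionaries.

```python
from typing import Callable, Dict, List, Sequence

Forecast = Callable[[Sequence[float]], List[float]]
Scorer = Callable[[Sequence[float]], Dict[str, float]]


def fit_business_model(first_day: Sequence[float], second_day: Sequence[float]) -> Forecast:
    """Claim 6 stand-in (assumption): fit g so that second_day ~= g * first_day."""
    g = sum(x * y for x, y in zip(first_day, second_day)) / sum(x * x for x in first_day)
    return lambda history: [g * x for x in history]


def warm_cache(
    history: Sequence[float],
    business_model: Forecast,
    data_model: Scorer,
    database: Dict[str, str],
    cache: Dict[str, str],
    threshold: float,
) -> List[str]:
    # Claim 1: forecast business access for the target time period.
    predicted_access = business_model(history)
    # Claim 1: predicted hotspot evaluation value for each piece of data.
    scores = data_model(predicted_access)
    # Claim 7: keep data whose value is >= the preset threshold, then cache it.
    hot_keys = [key for key, score in scores.items() if score >= threshold]
    for key in hot_keys:
        cache[key] = database[key]
    return hot_keys


def toy_data_model(predicted_access: Sequence[float]) -> Dict[str, float]:
    # Hypothetical placeholder for the pre-trained data prediction model:
    # fixed weights map two business access counts to per-item scores.
    orders, catalog = predicted_access
    return {"order:1": 0.5 * orders, "sku:7": 0.5 * catalog, "sku:9": 0.05 * catalog}


business_model = fit_business_model(first_day=[10.0, 20.0], second_day=[20.0, 40.0])
database = {"order:1": "row-a", "sku:7": "row-b", "sku:9": "row-c"}
cache: Dict[str, str] = {}
hot = warm_cache([100.0, 40.0], business_model, toy_data_model, database, cache, threshold=30.0)
print(hot)  # ['order:1', 'sku:7']
```

With these toy numbers the fitted growth factor is 2.0, the predicted access is [200.0, 80.0], and only the two items scoring at or above the threshold are loaded into the cache. A real deployment would replace both models with trained predictors and the dictionaries with an actual database and cache client.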
PCT/CN2022/120815 2021-09-24 2022-09-23 Cache warmup method and apparatus, and computer device and storage medium WO2023046059A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111121831.0 2021-09-24
CN202111121831.0A CN113849532A (en) 2021-09-24 2021-09-24 Cache preheating method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2023046059A1 (en) 2023-03-30

Family

ID=78979216

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/120815 WO2023046059A1 (en) 2021-09-24 2022-09-23 Cache warmup method and apparatus, and computer device and storage medium

Country Status (2)

Country Link
CN (1) CN113849532A (en)
WO (1) WO2023046059A1 (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113849532A (en) * 2021-09-24 2021-12-28 中国第一汽车股份有限公司 Cache preheating method and device, computer equipment and storage medium
CN114415965B (en) * 2022-01-25 2024-05-28 中国农业银行股份有限公司 Data migration method, device, equipment and storage medium
CN117453751B (en) * 2023-12-22 2024-03-26 中国海洋大学 Ocean big data cache loading system, operation method, device and medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180183891A1 (en) * 2016-12-28 2018-06-28 Google Inc. Optimizing user interface data caching for future actions
CN109284236A (en) * 2018-08-28 2019-01-29 北京三快在线科技有限公司 Data preheating method, device, electronic equipment and storage medium
CN110019362A (en) * 2017-11-08 2019-07-16 中移(苏州)软件技术有限公司 A kind of method and device accessing database
CN110334036A (en) * 2019-06-28 2019-10-15 京东数字科技控股有限公司 A kind of method and apparatus for realizing data cached scheduling
US20210073127A1 (en) * 2019-09-05 2021-03-11 Micron Technology, Inc. Intelligent Optimization of Caching Operations in a Data Storage Device
CN112685634A (en) * 2020-12-29 2021-04-20 平安普惠企业管理有限公司 Data query method and device, electronic equipment and storage medium
CN113076339A (en) * 2021-03-18 2021-07-06 北京沃东天骏信息技术有限公司 Data caching method, device, equipment and storage medium
CN113849532A (en) * 2021-09-24 2021-12-28 中国第一汽车股份有限公司 Cache preheating method and device, computer equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117270794A (en) * 2023-11-22 2023-12-22 成都大成均图科技有限公司 Redis-based data storage method, medium and device
CN117270794B (en) * 2023-11-22 2024-02-23 成都大成均图科技有限公司 Redis-based data storage method, medium and device

Also Published As

Publication number Publication date
CN113849532A (en) 2021-12-28

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22872103

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE