CN117119052B - Data processing method, device, electronic equipment and computer readable storage medium - Google Patents

Data processing method, device, electronic equipment and computer readable storage medium

Info

Publication number
CN117119052B
Application number
CN202311392617.8A
Authority
CN (China)
Prior art keywords
media data, source, data, heat, server
Legal status (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Active
Other languages
Chinese (zh)
Other versions
CN117119052A (en)
Inventor
周雯程
Current and original assignee (the listed assignees may be inaccurate; Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list)
Tencent Technology Shenzhen Co Ltd
Events
Application CN202311392617.8A filed by Tencent Technology Shenzhen Co Ltd; publication of CN117119052A; application granted; publication of CN117119052B. Legal status: active.


Classifications

    • H04L 67/568 — Storing data temporarily at an intermediate stage, e.g. caching
    • H04L 67/5681 — Pre-fetching or pre-delivering data based on network characteristics
    • H04N 21/23106 — Content storage operation, e.g. caching movies for short term storage, replicating data over plural servers, prioritizing data for deletion; involving caching operations
    • H04N 21/232 — Content retrieval operation locally within server, e.g. reading video streams from disk arrays
    • H04N 21/2393 — Interfacing the upstream path of the transmission network, e.g. prioritizing client content requests; involving handling client requests
    • H04N 21/8586 — Linking data to content, e.g. by linking an URL to a video object, by creating a hotspot; by using a URL
    • Y02D 10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application provides a data processing method, a data processing apparatus, an electronic device and a computer-readable storage medium. The method includes: in response to at least one edge server not having cached the media data, obtaining back-source (back-to-origin) information of the media data; determining the heat of the media data based on the back-source information; and, in response to the heat indicating that the media data is hot data, sending a cache request to at least some of the edge servers, the cache request instructing those edge servers to cache the media data in advance. The method reduces the number of back-source requests for media data and improves the cache hit rate of the edge servers.

Description

Data processing method, device, electronic equipment and computer readable storage medium
Technical Field
The present disclosure relates to data processing technologies, and in particular, to a data processing method, apparatus, electronic device, and computer readable storage medium.
Background
In a content delivery network (Content Delivery Network, CDN), when a terminal requests certain media data, the network selects an optimal edge server, based on the terminal's geographic location and network conditions, to respond to the request. If that edge server has the media data cached, it returns the data to the terminal directly, which speeds up downloading and transmission and improves the user experience. If it has not cached the media data, it must go back to the origin server (a "back-source" request) to download the media data before returning it to the end user, which slows downloading and transmission and degrades the user experience.
It follows that, during media data transmission over a content delivery network, whether a terminal's request for media data hits the edge server's cache has a great influence on transmission efficiency. However, because the amount of media data is huge and end users request it at high frequency, the probability that a request hits the edge server's cache is very low; the edge servers must frequently go back to the origin server, which slows the downloading and transmission of media data and degrades the user experience.
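The hit/miss flow described above can be sketched as a toy example. This is an illustrative sketch only, not the patent's implementation; the class name `EdgeServer` and the `origin` callable are assumptions:

```python
# Minimal sketch of an edge server's cache lookup with back-source
# (back-to-origin) fallback. All names here are illustrative.

class EdgeServer:
    def __init__(self, origin):
        self.cache = {}       # media_id -> media bytes
        self.origin = origin  # callable: media_id -> bytes (origin pull)

    def handle_request(self, media_id):
        if media_id in self.cache:      # cache hit: fast path
            return self.cache[media_id], "hit"
        data = self.origin(media_id)    # cache miss: back-source pull
        self.cache[media_id] = data     # keep it for later requests
        return data, "miss"

origin_store = {"video-1": b"...video bytes..."}
edge = EdgeServer(lambda mid: origin_store[mid])

print(edge.handle_request("video-1")[1])  # first request goes back to origin
print(edge.handle_request("video-1")[1])  # second request is served locally
```

The first request misses and triggers an origin pull; every later request for the same item is served from the edge cache, which is exactly the latency gap the patent targets.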
Disclosure of Invention
The embodiments of the present application provide a data processing method, a data processing apparatus, an electronic device and a computer-readable storage medium, which can reduce the number of back-source requests for media data and improve the cache hit rate of edge servers.
The technical solution of the embodiments of the present application is implemented as follows:
An embodiment of the present application provides a data processing method, including:
in response to at least one edge server not having cached the media data, obtaining back-source information of the media data;
determining the heat of the media data based on the back-source information of the media data;
and, in response to the heat of the media data indicating that the media data is hot data, sending a cache request to at least some of the edge servers, the cache request instructing those edge servers to cache the media data in advance.
An embodiment of the present application further provides a data processing method, including:
receiving a cache request for media data sent by an origin server, the cache request being generated by the origin server, based on the heat of the media data, upon determining that the edge server has not cached the media data;
and pre-caching the media data according to the cache request.
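The two methods above (origin side and edge side) can be sketched together as a hedged toy model. The sliding window, the threshold of three back-source requests, and all names (`OriginServer`, `on_back_source`, `pre_cache`) are assumptions for illustration, not the patent's actual scheme:

```python
# Toy model: the origin server counts back-source requests per media item
# inside a time window; once the count crosses a threshold, the item is
# treated as hot data and the remaining edge servers are asked to pre-cache
# it. Window length and threshold are assumed values.
import time
from collections import defaultdict, deque

class OriginServer:
    def __init__(self, edges, window_s=60.0, hot_threshold=3):
        self.edges = edges                          # edge servers to notify
        self.window_s = window_s                    # preset time period
        self.hot_threshold = hot_threshold          # assumed hot-data cutoff
        self.back_source_log = defaultdict(deque)   # media_id -> timestamps

    def on_back_source(self, media_id, requesting_edge, now=None):
        now = time.monotonic() if now is None else now
        log = self.back_source_log[media_id]
        log.append(now)                             # record back-source info
        while log and now - log[0] > self.window_s:
            log.popleft()                           # drop stale entries
        heat = len(log)                             # heat = recent back-source count
        if heat >= self.hot_threshold:              # media data is hot
            for edge in self.edges:
                if edge is not requesting_edge:
                    edge.pre_cache(media_id)        # cache request to other edges
```

On the edge side, `pre_cache` would simply pull the item into the local cache before any terminal asks for it, so later requests hit the cache directly.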
An embodiment of the present application provides a data processing apparatus, including:
an acquisition module configured to obtain back-source information of media data in response to at least one edge server not having cached the media data;
a determination module configured to determine the heat of the media data based on the back-source information of the media data;
and a sending module configured to send a cache request to at least some of the edge servers in response to the heat of the media data indicating that the media data is hot data, the cache request instructing those edge servers to cache the media data in advance.
An embodiment of the present application further provides a data processing apparatus, including:
a receiving module configured to receive a cache request for media data sent by an origin server, the cache request being generated by the origin server, based on the heat of the media data, upon determining that the edge server has not cached the media data;
and a caching module configured to pre-cache the media data according to the cache request.
An embodiment of the present application provides an electronic device, including:
a memory for storing computer executable instructions or computer programs;
and a processor configured to implement the data processing method provided in the embodiments of the present application when executing the computer-executable instructions or the computer program stored in the memory.
An embodiment of the present application provides a computer-readable storage medium storing computer-executable instructions or a computer program which, when executed by a processor, implements the data processing method provided in the embodiments of the present application.
An embodiment of the present application provides a computer program product including computer-executable instructions or a computer program which, when executed by a processor, implements the data processing method provided in the embodiments of the present application.
The embodiment of the application has the following beneficial effects:
according to the embodiment of the application, when the edge server does not cache the media data, the heat of the media data is determined through the source returning information of the media data, and when the heat of the media data can represent that the media data is hot data, a request for caching the media data is initiated to part of the edge servers, so that part of the edge servers can cache the media data in advance. When the media data are numerous and the data request frequency is high, the edge server is triggered to cache the media data in advance through the heat of the media data, so that the request for the media data is easier to hit the cache of the edge server, and the cache hit rate of the edge server is improved. Thus, when responding to the request for the media data, the downloading and transmission speed of the media data is increased, and the request is responded more quickly.
Drawings
FIG. 1 is a schematic diagram of an architecture of a data processing system 100 provided in an embodiment of the present application;
fig. 2A is a schematic structural diagram of an origin server 200 according to an embodiment of the present application;
fig. 2B is a schematic structural diagram of an edge server 500 according to an embodiment of the present application;
FIG. 3A is a first flowchart of a data processing method provided in an embodiment of the present application;
FIG. 3B is a second flowchart of a data processing method provided in an embodiment of the present application;
FIG. 3C is a third flowchart of a data processing method provided in an embodiment of the present application;
FIG. 3D is a fourth flowchart of a data processing method provided in an embodiment of the present application;
FIG. 3E is a fifth flowchart of a data processing method provided in an embodiment of the present application;
FIG. 3F is a sixth flowchart of a data processing method provided in an embodiment of the present application;
FIG. 3G is a seventh flowchart of a data processing method provided in an embodiment of the present application;
FIG. 3H is an eighth flowchart of a data processing method provided in an embodiment of the present application;
FIG. 3I is a ninth flowchart of a data processing method provided in an embodiment of the present application;
fig. 4 is a schematic diagram of a video transmission process in a CDN network according to an embodiment of the present application;
fig. 5 is an interactive flowchart of a video transmission method provided in an embodiment of the present application.
Detailed Description
To make the objects, technical solutions and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings. The described embodiments should not be construed as limiting the present application, and all other embodiments obtained by those skilled in the art without inventive effort fall within the scope of the present application.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is to be understood that "some embodiments" can be the same subset or different subsets of all possible embodiments and can be combined with one another without conflict.
In the following description, the terms "first", "second", "third" and the like merely distinguish similar objects and do not imply a particular ordering. It should be understood that, where permitted, "first", "second" and "third" may be interchanged in a specific order or sequence, so that the embodiments of the application described herein can be practiced in orders other than those illustrated or described.
Unless defined otherwise, all technical and scientific terms used in the embodiments of the present application have the same meaning as commonly understood by one of ordinary skill in the art. The terminology used in the embodiments of the application is for the purpose of describing the embodiments of the application only and is not intended to be limiting of the application.
Before the embodiments of the present application are described in further detail, the terms and expressions involved in the embodiments are explained; the following interpretations apply to them.
1) Hot data: critical data that is accessed frequently in a business or application. It typically must be accessed or retrieved quickly and efficiently, and is therefore usually stored on a high-performance, low-latency storage medium so that data requests can be answered quickly.
2) Content delivery network (Content Delivery Network, CDN): a service architecture for network content that delivers and serves content to applications according to the efficiency requirements, quality requirements and ordering of content access. In general, by publishing site content to a large number of acceleration nodes throughout the world, users can obtain the content they need from a nearby node, avoiding problems such as unstable networks and high access latency caused by network congestion and cross-operator, cross-region or cross-border access, effectively improving download speed, reducing response time and providing a smooth user experience.
3) An edge server is a device or computer that resides at the logical extremity or "edge" of the network. The edge servers may provide entry points into the content distribution network, often serving as connections between different networks, in order to store content as close as possible to the requesting client, thereby reducing network latency and shortening the time to respond to client content requests.
4) Origin server: the server storing content resources in a content delivery network, which responds to external content requests, such as back-source requests from edge servers or even clients, to provide them with content.
The embodiments of the present application provide a data processing method, apparatus, device, computer-readable storage medium and computer program product, which can reduce the number of back-source requests for media data and improve the cache hit rate of edge servers.
The following describes exemplary applications of the electronic device provided in the embodiments of the present application, where the device provided in the embodiments of the present application may be implemented as a notebook computer, a tablet computer, a desktop computer, a set-top box, a mobile device (for example, a mobile phone, a portable music player, a personal digital assistant, a dedicated messaging device, a portable game device), a smart phone, a smart speaker, a smart watch, a smart television, a vehicle-mounted terminal, and other various types of user terminals, and may also be implemented as a server.
With reference to fig. 1, fig. 1 is a schematic architecture diagram of a data processing system 100 according to an embodiment of the present application, including a terminal 400, an edge server 500, an origin server 200, a network 300, and other edge servers 600. The terminal 400 connects to the edge server 500 through the network 300, which may be a wide area network, a local area network, or a combination of the two.
The terminal 400 runs various applications (APP) related to media data, such as an instant messaging APP, a reading APP, a video APP, a game APP, or other software programs. When a user initiates a request for media data through the network 300 from an application on the terminal 400, the edge server 500 receives the request and determines whether the media data is cached. When the edge server 500 has not cached the media data, it requests the media data from the origin server 200. The origin server 200 receives the edge server 500's acquisition request, obtains the back-source information of the media data, and then determines the heat of the media data based on that information. When the heat indicates that the media data is hot data, the origin server sends a cache request to the other edge servers 600, instructing them to pre-cache the media data. Finally, the origin server 200 returns the media data to the edge server 500, which, after obtaining it, returns it to the terminal 400 through the network 300, so that the user can browse the media data in the application running on the terminal 400.
In some embodiments, the origin server 200, the edge server 500 and the other edge servers 600 shown in fig. 1 may be independent physical servers, a server cluster or distributed system formed from multiple physical servers, or cloud servers providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), big data and artificial intelligence platforms. The terminal 400 may be, but is not limited to, a smart phone, tablet computer, notebook computer, desktop computer, smart speaker, smart watch or in-vehicle terminal. The terminal and the server may be connected directly or indirectly through wired or wireless communication, which is not limited in the embodiments of the present application.
The embodiments of the present application may be implemented by means of artificial intelligence (Artificial Intelligence, AI) technology, a theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the functions of perception, reasoning and decision-making.
Taking the server provided in the embodiment of the present application as an example, for example, a server cluster may be deployed in the cloud, so as to open an artificial intelligence cloud Service (AI as a Service, AIaaS) to a user or a developer, where the AIaaS platform splits several types of common AI services and provides independent or packaged services in the cloud. This service model is similar to an AI theme mall, where all users or developers can access one or more artificial intelligence services provided using the AIaaS platform by way of an application programming interface.
For example, a cloud server encapsulates a program implementing the data processing method provided in the embodiments of the present application. The user invokes the data processing service in the cloud service through the terminal (which runs an APP such as an instant messaging APP or a reading APP), so that the server deployed in the cloud invokes the encapsulated data processing program. When a user initiates a media data request at a terminal, an edge server receives the request and determines whether the media data is cached. When the edge server has not cached the media data, it requests the media data from the cloud server. The cloud server receives the edge server's request for the media data, obtains the back-source information of the media data, and determines the heat of the media data based on that information. When the heat indicates that the media data is hot data, it sends a cache request to the other edge servers, instructing them to pre-cache the media data. Finally, the cloud server returns the media data to the requesting edge server, which returns it to the terminal, and the user can browse the media data in the APP running on the terminal.
Referring to fig. 2A, fig. 2A is a schematic structural diagram of an origin server 200 provided in an embodiment of the present application, and the origin server 200 shown in fig. 2A includes: at least one processor 410, a memory 450, at least one network interface 420. The various components in origin server 200 are coupled together by bus system 440. It is understood that the bus system 440 is used to enable connected communication between these components. The bus system 440 includes a power bus, a control bus, and a status signal bus in addition to the data bus. But for clarity of illustration the various buses are labeled as bus system 440 in fig. 2A.
The processor 410 may be an integrated circuit chip with signal-processing capabilities, such as a general-purpose processor (e.g., a microprocessor or any conventional processor), a digital signal processor (Digital Signal Processor, DSP), another programmable logic device, discrete gate or transistor logic, or discrete hardware components.
Memory 450 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid state memory, hard drives, optical drives, and the like. Memory 450 optionally includes one or more storage devices physically remote from processor 410.
Memory 450 includes volatile memory or nonvolatile memory, and may also include both volatile and nonvolatile memory. The non-volatile Memory may be a Read Only Memory (ROM), and the volatile Memory may be a random access Memory (Random Access Memory, RAM). The memory 450 described in the embodiments herein is intended to comprise any suitable type of memory.
In some embodiments, memory 450 is capable of storing data to support various operations, examples of which include programs, modules and data structures, or subsets or supersets thereof, as exemplified below.
An operating system 451 including system programs, e.g., framework layer, core library layer, driver layer, etc., for handling various basic system services and performing hardware-related tasks, for implementing various basic services and handling hardware-based tasks;
a network communication module 452 for accessing other electronic devices via one or more (wired or wireless) network interfaces 420, the exemplary network interface 420 comprising: bluetooth, wireless compatibility authentication (WiFi), and universal serial bus (Universal Serial Bus, USB), etc.;
in some embodiments, the apparatus provided in the embodiments of the present application may be implemented in software, and fig. 2A shows a data processing apparatus 453 stored in a memory 450, which may be software in the form of a program and a plug-in, and includes the following software modules: the acquisition module 4531, the determination module 4532 and the transmission module 4533 are logical, and thus may be arbitrarily combined or further split according to the implemented functions. The functions of the respective modules will be described hereinafter.
Referring to fig. 2B, fig. 2B is a schematic structural diagram of an edge server 500 according to an embodiment of the present application. The edge server 500 includes: at least one processor 510, a bus 540, a memory 550, at least one network interface 520. The various components in edge server 500 are coupled together by bus system 540. It is appreciated that the bus system 540 is used to enable connected communications between these components. The bus system 540 includes a power bus, a control bus, and a status signal bus in addition to the data bus. But for clarity of illustration, the various buses are labeled as bus system 540 in fig. 2B.
In some embodiments, memory 550 is capable of storing data to support various operations, examples of which include programs, modules and data structures, or subsets or supersets thereof, as exemplified below.
An operating system 551 including system programs for handling various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and handling hardware-based tasks;
network communication module 552 is used to reach other electronic devices via one or more (wired or wireless) network interfaces 520, exemplary network interfaces 520 include: bluetooth, wireless compatibility authentication (WiFi), and universal serial bus (Universal Serial Bus, USB), etc.;
Fig. 2B shows a data processing device 553 stored in a memory 550, which may be software in the form of programs and plug-ins, etc., comprising the following software modules: a receiving module 5531 and a buffering module 5532. The modules are logical, and thus may be arbitrarily combined or further split depending on the functions implemented, the functions of each module being described in relevant detail below.
In some embodiments, a terminal or server may implement the data processing methods provided in embodiments of the present application by running various computer-executable instructions or computer programs. For example, the computer-executable instructions may be commands at the micro-program level, machine instructions, or software instructions. The computer program may be a native program or a software module in an operating system; a local (Native) Application (APP), i.e. a program that needs to be installed in an operating system to run, such as a live APP or an instant messaging APP; or an applet that can be embedded in any APP, i.e., a program that can be run only by being downloaded into a browser environment. In general, the computer-executable instructions may be any form of instructions and the computer program may be any form of application, module, or plug-in.
The data processing method provided by the embodiment of the present application will be described with reference to exemplary applications and implementations of the server provided by the embodiment of the present application.
Referring to fig. 3A, fig. 3A is a schematic flow chart of a data processing method according to an embodiment of the present application, and the source server 200 shown in fig. 1 is taken as an execution body, and the steps shown in fig. 3A will be described.
In step 101, in response to at least one edge server not having cached the media data, back-source information of the media data is obtained.
In a content delivery network, when an end user requests certain media data, the network selects an optimal edge server, based on the terminal's geographic location and network conditions, to respond to the request. The request is generally a request to download or obtain media data, which may be memory-intensive content such as video, pictures, documents or voice packets to be played or browsed on the terminal. The edge servers in the content delivery network respond to the requests of individual end users and provide them with the media data.
When an end user initiates a request for media data, the content distribution network selects the optimal edge server to respond, i.e., the edge server closest to the end user's geographic location or with the best network conditions. That edge server queries its database for the corresponding media data. If the media data is found to be cached, its original data is transmitted directly to the end user through the network nodes of the content distribution network, so that the request is answered quickly, the download and transmission of the media data are accelerated, and the user experience is improved. If the edge server does not find the media data in its database, the media data is not cached, and the edge server initiates a back-to-source request to the source station (i.e., source server) of the media data; that is, it goes back to the source server to download and acquire the original data resources of the media data.
Each time an edge server lacks a cached copy of the media data, the source server must accept its back-to-source request, and corresponding back-source information is recorded every time such a request is served. Therefore, in the embodiment of the present application, the back-source information of the media data is obtained in response to at least one edge server not caching the media data. The back-source information may be the number of back-to-source requests for the media data within a preset time period, or the back-to-source requests themselves, which are subsequently used to count the number of back-to-source requests and other statistics.
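The bookkeeping described above can be sketched as follows. This is a minimal, hypothetical illustration of how a source server might record back-to-source requests per media item; the class and method names are invented here and do not come from the patent.

```python
import time
from collections import defaultdict

class BackSourceLog:
    """Hypothetical sketch of back-source information kept by a source
    server; names are illustrative, not taken from the patent."""

    def __init__(self):
        # media_id -> list of timestamps at which a back-to-source
        # request for that media data was served
        self.records = defaultdict(list)

    def record(self, media_id, timestamp=None):
        # called each time an edge server misses its cache and requests
        # the original data of the media from the source server
        self.records[media_id].append(
            timestamp if timestamp is not None else time.time())

    def count_in_window(self, media_id, start, end):
        # number of back-to-source requests for media_id in [start, end)
        return sum(1 for t in self.records[media_id] if start <= t < end)
```

A call such as `log.count_in_window("video-1", t0, t0 + 3600)` would then yield the back-to-source count for the hour starting at `t0`, which is the raw input to the heat computation in step 102.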
In step 102, the popularity of the media data is determined based on the back source information of the media data.
After the back-source information of the media data is acquired, the heat of the media data is determined based on it. First, the number of back-to-source requests for the media data is determined from the acquired back-source information; from this, characteristics of the media data can be derived, such as whether it is accessed frequently or popular among end users. These characteristics can be uniformly characterized as the heat of the media data, and whether the media data is hot data can then be judged from this heat.
In some embodiments, referring to fig. 3B, step 102 shown in fig. 3A may be implemented by the following steps 1021A through 1022A, which are described in detail below.
In step 1021A, at least one back-source parameter of the media data is determined based on the back-source information of the media data.
After the back-source information of the media data is obtained, at least one back-source parameter of the media data can be determined from it. Because the back-source information records the number of back-to-source requests within the preset time period, this number can be used directly as a back-source parameter of the media data. Further back-source parameters can be derived from it, such as the growth rate of the number of back-to-source requests, or the multiple by which that number exceeds its average. Features of the media data can be obtained from these back-source parameters to determine whether it is hot data. In the embodiment of the present application, at least one back-source parameter of the media data is first determined from the back-source information, and the heat of the media data is then calculated from the back-source parameters.
In the embodiment of the present application, the source-back parameter of the media data may be at least one of the following five parameters:
The first is a times parameter, i.e., the number of back-to-source requests within a preset time period, for example, the number of times an edge server went back to source for the media data over the past hour (e.g., 500 times).
The second is a growth rate parameter, i.e., the growth rate of the number of back-to-source requests within a preset time period, for example, over the past hour. Specifically, the numbers of back-to-source requests in the first half hour (e.g., 600) and in the second half hour (e.g., 660) are counted separately, the back-to-source growth speeds of the two half hours are then calculated (e.g., 20 times per minute and 22 times per minute), and finally the growth rate over the hour is calculated as (22 − 20) / 20, i.e., 10%.
The third is a proportion parameter, i.e., the ratio of the number of back-to-source requests for the media data within a preset time period to the total number of back-to-source requests of the corresponding edge server, for example, over the past hour. Because the edge server also goes back to source for content data other than this media data, each back-to-source request can be recorded, so that the edge server's total number of back-to-source requests can be counted. The number of back-to-source requests for the media data (e.g., 300) is then counted and divided by the edge server's total number of back-to-source requests (e.g., 600), giving the ratio 300 / 600, i.e., 50%.
The fourth is a multiple parameter, i.e., the multiple by which the number of back-to-source requests within a preset time period exceeds the average number of back-to-source requests, for example, over the past hour. Specifically, the total number of back-to-source requests for the media data over the past 24 hours (e.g., 600) is counted, the hourly average is derived (e.g., 25 per hour), and the multiple by which the past hour's count (e.g., 75) exceeds that average is then calculated as (75 − 25) / 25, i.e., 2.
The fifth is a percentage parameter, i.e., the percentage of the number of back-to-source requests for the media data within a preset time period out of the total number of back-to-source requests of all edge servers, for example, over the past hour. Because the media data may be back-sourced at multiple edge servers, and the edge servers also back-source other data, each edge server can be regarded as an edge node of the content distribution network. The back-to-source requests of all edge servers (e.g., 10 servers) for all content data are counted first (e.g., 1000 requests), i.e., the back-to-source requests of all edge nodes in the content distribution network, and used as the total. The back-to-source requests of the edge servers that back-sourced the media data (e.g., 5 servers) are then counted (e.g., 800 requests), and the percentage of the total is finally determined as 800 / 1000, i.e., 80%.
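The five back-source parameters above reduce to simple arithmetic over the recorded counts. The following sketch captures each formula with the numbers from the examples; the function names are illustrative, not from the patent.

```python
def times_parameter(back_source_count):
    # the raw back-to-source count in the window, e.g. 500
    return back_source_count

def growth_rate(first_half_per_min, second_half_per_min):
    # e.g. speeds of 20 and 22 times/min give (22 - 20) / 20 = 10%
    return (second_half_per_min - first_half_per_min) / first_half_per_min

def proportion(media_count, edge_total_count):
    # e.g. 300 of the edge server's 600 back-to-source requests gives 50%
    return media_count / edge_total_count

def multiple_over_average(window_count, average_count):
    # e.g. 75 in the past hour vs an hourly average of 25 gives (75 - 25) / 25 = 2
    return (window_count - average_count) / average_count

def percentage_of_total(media_count_all_edges, total_count_all_edges):
    # e.g. 800 of 1000 requests across all edge nodes gives 80%
    return media_count_all_edges / total_count_all_edges
```

Any subset of these can then be fed into the mapping of step 1022A to produce a heat value.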
In step 1022A, the heat of the media data is determined based on the at least one back source parameter.
After at least one back-source parameter of the media data is determined, the heat of the media data is determined based on it. In the embodiment of the present application, each back-source parameter is mapped to a mapping value by judging whether it exceeds a parameter threshold, and the heat of the media data is then determined from the mapping values, as described in detail below.
In some embodiments, referring to fig. 3C, step 1022A shown in fig. 3B may also be implemented by the following steps 10221A to 10222A, which are specifically described below.
In step 10221A, when the back source parameter of the media data is one, mapping processing is performed on the back source parameter to obtain the heat of the media data.
First, the number of back-source parameters of the media data is determined, i.e., whether one or more of the above five parameters are used. When there is a single back-source parameter, it is one of the five parameters, and mapping processing is performed on it directly to obtain the heat of the media data.
Specifically, in the embodiment of the present application, five parameter thresholds are set for the five parameters (the times parameter, growth rate parameter, proportion parameter, multiple parameter, and percentage parameter), namely a times threshold, a growth rate threshold, a proportion threshold, a multiple threshold, and a percentage threshold. It is then judged whether the single back-source parameter of the media data exceeds the corresponding threshold. If it does, the parameter is mapped to a mapping value (any positive integer, e.g., 100), which is used as the heat of the media data. If it does not, the parameter is mapped to 0 as the heat of the media data.
For example, suppose the single back-source parameter of some media data is a growth rate parameter, i.e., the growth rate of the number of back-to-source requests within the preset time period is 20%. If the corresponding growth rate threshold is 10%, the back-source parameter 20% exceeds the threshold 10%, and it is mapped to the mapping value 100 as the heat of the media data. As another example, suppose the single back-source parameter of some media data is a percentage parameter, i.e., the percentage of its back-to-source requests out of the corresponding edge server's total is 70%. If the corresponding percentage threshold is 80%, the back-source parameter 70% does not exceed the threshold 80%, and it is mapped to the mapping value 0 as the heat of the media data.
In step 10222A, when there are multiple back-source parameters of the media data, mapping processing is performed on each back-source parameter to obtain multiple mapping values, and the sum of the multiple mapping values is used as the heat of the media data.
When there are multiple back-source parameters of the media data, i.e., several of the above five parameters, mapping processing is performed on each back-source parameter, so that each one is mapped to a mapping value, and the sum of the resulting mapping values is taken as the heat of the media data.
Specifically, when the number of the back source parameters of the media data is multiple, it may be determined whether each back source parameter exceeds the corresponding parameter threshold, if so, the corresponding back source parameter is mapped to a mapping value (for example, any positive integer, 100), if not, the corresponding back source parameter is mapped to 0, and finally, the multiple mapping values corresponding to the multiple back source parameters are summed to obtain a total mapping value as the heat of the media data.
For example, suppose some media data has three back-source parameters: a times parameter, a proportion parameter, and a multiple parameter. The first, the times parameter, is 600 back-to-source requests within the preset time period; the corresponding times threshold is 500, so the parameter 600 exceeds the times threshold 500 and is mapped to the mapping value 100. The second, the proportion parameter, is 60%; the corresponding proportion threshold is 50%, so the parameter 60% exceeds the proportion threshold 50% and is mapped to the mapping value 100. The third, the multiple parameter, is 2.5; the corresponding multiple threshold is 2, so the parameter 2.5 exceeds the multiple threshold 2 and is mapped to the mapping value 100. The three mapping values 100, 100, and 100 are then summed to obtain a total mapping value of 300, which is used as the heat of the media data.
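Steps 10221A and 10222A share the same mechanism, differing only in how many parameters are summed. A minimal sketch, with threshold values taken from the examples above (they are illustrative, not fixed by the patent):

```python
# Illustrative thresholds and mapping value; the concrete numbers are
# assumptions drawn from the worked examples, not fixed by the patent.
THRESHOLDS = {"times": 500, "growth_rate": 0.10, "proportion": 0.50,
              "multiple": 2.0, "percentage": 0.80}
MAP_VALUE = 100  # value a parameter maps to when it exceeds its threshold

def heat_from_parameters(params):
    """params: dict of parameter name -> value, for the one or more
    back-source parameters in use. Each parameter exceeding its
    threshold maps to MAP_VALUE, otherwise to 0; heat is the sum."""
    return sum(MAP_VALUE if value > THRESHOLDS[name] else 0
               for name, value in params.items())
```

With a single parameter the sum is just that parameter's mapping value, so the same function covers both branches.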
In some embodiments, referring to fig. 3D, step 102 shown in fig. 3A may also be implemented by the following steps 1021B through 1023B, which are described in detail below.
In step 1021B, the original data of the media data is acquired based on the back source information of the media data.
In some embodiments, the heat of the media data may also be obtained by predicting directly from the features of its original data. After the back-source information of the media data is acquired, the original data of the media data is acquired based on it. Specifically, according to the back-source information, the original data corresponding to the media data is queried from the database. For example, if the media data is a video, picture, document, or voice packet, the corresponding original data is video data, picture data, document data, or voice data, respectively.
In step 1022B, feature extraction processing is performed on the original data of the media data, so as to obtain the heat feature of the original data.
After the original data of the media data is obtained, feature extraction processing is performed on it to obtain the heat features of the original data. Specifically, a pre-trained heat prediction model (e.g., a neural network or a convolutional neural network) may be invoked to extract features from the original data. For example, if the media data is a video, picture, or document, the video features or picture features of the corresponding video or picture data, or the text features of the document data, are extracted. The video features may be features such as the video's topic and content identified by machine frame extraction, and the text features may be features such as characters (e.g., subtitles) in pictures or video frames. The media data may also be a voice packet, in which case voice features of the corresponding voice data are extracted, such as audio features of voice types like music, lessons, or broadcasts.
After the video, text, or voice features are extracted, the pre-trained heat prediction model (e.g., a hidden layer of a neural network or a convolution layer of a convolutional neural network) performs preliminary processing on them, for example, mapping the video, text, or voice features through the weight parameters of the hidden layer into heat features of the video, picture, or voice packet. The heat feature values may be values of features such as the access count, play count, browse count, listen count, watch (or listen) duration, like count, comment count, and forward/share count of the media data.
In step 1023B, the heat characteristic is predicted to obtain the heat of the media data.
Finally, after the heat features of the media data are determined, the pre-trained heat prediction model is invoked to predict on them and obtain the heat of the media data. Specifically, the heat feature values are mapped through the model (e.g., a fully connected layer of a neural network or a downstream classifier) into a heat value, which can represent the heat of the media data.
For example, suppose the media data is a voice (e.g., music, a lesson, or a broadcast). The original data of the voice, i.e., the voice data, is first queried from the database based on the back-source information. The heat prediction model is then invoked to perform feature extraction on the voice data to obtain its audio features, and preliminary processing, i.e., feature mapping, is performed on the audio features to obtain the heat features of the voice, whose values are features such as the listen count and play count. Popular music identified by the heat prediction model yields very high mapped heat feature values (e.g., for listen count and play count), while content that hardly anyone accesses yields very low values. Finally, prediction is performed on the heat feature values of the voice to obtain the heat value of the media data.
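The final prediction step, mapping heat feature values to a single heat value, can be illustrated with a toy stand-in for the model. The real system uses a trained neural network; here the "prediction" is a fixed weighted sum, and both the feature names and the weights are invented for illustration.

```python
# Toy stand-in for the pre-trained heat prediction model's output layer;
# feature names and weights are assumptions made for this sketch only.
FEATURE_WEIGHTS = {"play_count": 0.5, "listen_count": 0.3, "like_count": 0.2}

def predict_heat(heat_features):
    """heat_features: dict of heat-feature name -> value (e.g. play
    count, listen count). Returns one heat value for the media data,
    here a weighted sum standing in for a fully connected layer."""
    return sum(FEATURE_WEIGHTS.get(name, 0.0) * value
               for name, value in heat_features.items())
```

A fully connected layer computes exactly this kind of weighted sum per output unit, which is why a linear combination is a reasonable miniature of the mapping described above.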
In some embodiments, how frequently the media data is back-sourced may also reflect its heat. Thus, step 102 shown in fig. 3A may also be implemented as follows: determining a plurality of back-to-source time points of the media data within a set time period according to its back-source information, and sorting the back-to-source time points in chronological order to obtain a first back-to-source time sequence; performing de-duplication processing on the first back-to-source time sequence to obtain a second back-to-source time sequence; and determining the back-to-source time interval between every two adjacent time points in the second back-to-source time sequence, and taking the average of these intervals as the heat of the media data within the set time period.
Specifically, in a certain preset time period, a plurality of time points when the media data is back-sourced, namely the back-sourced time points, are counted according to the back-sourced information of the media data. And then sequencing the multiple source returning time points in the preset time period according to the time sequence to obtain a first source returning time sequence. If the first back source time sequence has repeated back source time points, performing de-duplication processing on the repeated back source time points, and reserving only one back source time point to obtain a second back source time sequence. Next, for any two adjacent source time points in the second source time sequence, determining a time interval between the two adjacent source time points, namely a source time interval, and determining an average value of all source time intervals, wherein the average value represents the source frequency of the media data, namely the heat of the media data.
For example, within one hour (from 4:00 to 5:00), all back-to-source time points obtained from the back-source information are 4:10, 4:17, 4:58, 4:41, and 4:17. These are sorted chronologically to obtain the first back-to-source time sequence, and the repeated time point 4:17 is removed to obtain the second back-to-source time sequence 4:10, 4:17, 4:41, 4:58. The time intervals (in minutes) between adjacent time points in the second sequence are then calculated as the back-to-source time intervals, giving 7, 24, and 17, and finally the average of 7, 24, and 17 is calculated as 16, which is taken as the heat of the media data.
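The sort, de-duplicate, difference, and average steps above can be sketched directly; the function name is illustrative.

```python
def interval_heat(back_source_minutes):
    """Average gap between consecutive distinct back-to-source time
    points (given here as minutes past the hour). For this measure a
    smaller value means hotter data."""
    ordered = sorted(set(back_source_minutes))  # chronological, de-duplicated
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    return sum(gaps) / len(gaps)
```

For the worked example, the time points 10, 17, 58, 41, 17 minutes past 4:00 reduce to the gaps 7, 24, 17 and an average of 16.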
In some embodiments, the timeliness of the media data is considered: the heat of media data decays as time advances, i.e., media data may have high heat in one period while its heat drops rapidly in the next, so determining the heat only from a single preset time period has limitations.
Based on this, to ensure that the heat of the media data is timely, the embodiment of the present application may also determine the heat by counting the back-source information over multiple time periods. Step 102 shown in fig. 3A may therefore also be implemented as follows: determining a plurality of set time periods before the current moment and, for each set time period, determining the number of back-to-source requests for the media data in that period based on its back-source information; determining the back-to-source frequency of the media data in the set time period from that number; determining a weight for the back-to-source frequency, where the weight is negatively correlated with the time interval between the set time period and the current moment; and weighting each back-to-source frequency by its weight to obtain the weighted back-to-source frequency of each set time period, summing the weighted back-to-source frequencies, and taking the average of the sum as the heat of the media data.
Specifically, a plurality of (more than one) time periods are preset before the current moment, and the number of back-to-source requests for the media data in each period is determined from its back-source information. For each period, the back-to-source frequency of the media data is determined, which may be obtained by dividing the number of back-to-source requests in the period by the period's duration. A weight is then set for the back-to-source frequency of each period, negatively correlated with the time interval between that period and the current moment. That is, the weights follow this rule: the closer a period is to the current moment, the higher the weight of its back-to-source frequency; conversely, the farther a period is from the current moment, the lower the weight. The weight may be determined by the following formula (1):
w(i) = e^i (1)
In the above formula (1), w(i) is the weight corresponding to the back-to-source frequency of the i-th time period, where periods are indexed from the earliest (i = 1) to the one nearest the current moment (i = N), and N is the total number of time periods.
After the weight of each period's back-to-source frequency is determined, the frequency of each period is weighted by its weight to obtain the weighted back-to-source frequency of the period. The weighted back-to-source frequencies of all periods are then summed and averaged, and the resulting average is taken as the heat of the media data.
For example, suppose the current moment is 5:00, and the numbers of back-to-source requests for some media data in the three periods 2:00 to 3:00, 3:00 to 4:00, and 4:00 to 5:00 are counted as 240, 300, and 240, respectively. The back-to-source frequencies (in times per minute) of the media data in the three periods are then 4, 5, and 4. According to formula (1), the weights of the three periods' frequencies are e, e², and e³. Each period's frequency is weighted accordingly, giving weighted back-to-source frequencies of 4e, 5e², and 4e³; the three are summed, and the average of the sum, (4e + 5e² + 4e³) / 3 ≈ 42.7, is taken as the heat of the media data.
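The weighting scheme can be sketched as follows. Note that w(i) = e^i is a reconstruction of the partially garbled formula (1) in the source text; the exponential form is consistent with the worked example, but the exact expression in the original patent may differ.

```python
import math

def weighted_frequency_heat(frequencies):
    """frequencies: back-to-source frequency of each set time period,
    earliest first. Uses the weight w(i) = e**i (a reconstruction of
    formula (1)), so more recent periods count exponentially more."""
    weighted = [f * math.e ** i for i, f in enumerate(frequencies, start=1)]
    return sum(weighted) / len(weighted)
```

For the frequencies 4, 5, 4 this gives (4e + 5e² + 4e³) / 3, roughly 42.7.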
With continued reference to fig. 3A, in step 103, a cache request is sent to at least a portion of the edge servers in response to the heat characterizing the media data as hot data.
After the heat of the media data is determined, whether the media data is hot data can be judged from it. In the embodiment of the present application, if the heat is obtained from the back-source parameters of the media data or from the heat prediction model, a heat threshold may be preset: if the heat of the media data is less than the preset threshold, the media data is determined not to be hot data; if the heat is not less than the preset threshold, the media data is determined to be hot data.
In some embodiments, if the average of the back-to-source time intervals of the media data is taken as its heat, a time interval threshold may be preset as the heat threshold according to the actual scenario. For this measure, a smaller value means hotter data: when the average of the back-to-source time intervals is smaller than the preset time interval threshold, i.e., the heat of the media data is smaller than the preset heat threshold, the media data is determined to be hot data; when the average is not smaller than the preset time interval threshold, the media data is determined not to be hot data.
For example, within one hour (4:00 to 5:00), all back-to-source time points obtained from the back-source information are 4:10, 4:17, 4:58, 4:41, and 4:17; the back-to-source time intervals (in minutes) of the media data are calculated as 7, 24, and 17, and their average, 16, is taken as the heat of the media data. A time interval threshold of 20 is preset as the heat threshold, i.e., the average back-to-source time interval of the media data must not exceed 20 minutes for it to be determined as hot data. Here the heat of the media data (the average back-to-source time interval) is 16 minutes, less than the heat threshold 20, so the media data is determined to be hot data.
In some embodiments, if the average weighted back-to-source frequency of the media data over multiple set time periods is taken as its heat, a frequency threshold may be preset as the heat threshold according to the actual scenario. When the average weighted back-to-source frequency is greater than the preset frequency threshold, i.e., the heat of the media data is greater than the preset heat threshold, the media data is determined to be hot data. When the average is not greater than the preset frequency threshold, the media data is determined to be non-hot data.
For example, suppose the current moment is 5:00 and the weighted back-to-source frequencies (in times per minute) of some media data in the three periods 2:00 to 3:00, 3:00 to 4:00, and 4:00 to 5:00 are 4e, 5e², and 4e³, respectively; the three are summed, and the average of the sum, (4e + 5e² + 4e³) / 3 ≈ 42.7, is taken as the heat of the media data. A frequency threshold of 20 is preset as the heat threshold, i.e., the average weighted back-to-source frequency of the media data over the three periods must exceed 20 times per minute for it to be determined as hot data. Here the heat of the media data (the average weighted back-to-source frequency) is about 42.7, greater than the preset heat threshold 20, indicating that the media data is hot data.
In some embodiments, considering that media data items are numerous and their heats differ, the heat threshold is difficult to set reasonably in some scenarios. Therefore, when there are multiple media data items (e.g., 100) and the heat of each has been determined according to step 102 shown in fig. 3A, the media data items may be sorted in descending order of heat to obtain a media data sequence: media data with high heat is ranked first, and media data with low heat is ranked last. The K (e.g., 10) media data items with the highest heat, i.e., the first K items of the sequence, are then selected directly and determined to be hot data. The remaining media data in the sequence is determined to be non-hot data. In this way, hot data can also be determined from among multiple media data items.
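The top-K selection above is a one-line sort; a minimal sketch with invented names:

```python
def top_k_hot(media_heats, k):
    """media_heats: dict of media id -> heat. Returns the ids of the K
    hottest media data items (the hot data), hottest first."""
    ranked = sorted(media_heats, key=media_heats.get, reverse=True)
    return ranked[:k]
```

The non-hot data is then simply the remainder of the sequence, i.e., the ids not returned.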
In an actual scenario, the mapping value to which a back-source parameter is mapped when it exceeds its parameter threshold may itself be set as the heat threshold; for example, if each back-source parameter that exceeds its corresponding parameter threshold is mapped to 100, the heat threshold is also set to 100. Then, as long as one of the at least one back-source parameter of the media data exceeds its corresponding parameter threshold, the heat of the media data meets the heat threshold, and the media data can be determined to be hot data.
For example, suppose some media data is a video whose back-source parameters, determined from its back-source information, are a times parameter, a proportion parameter, and a multiple parameter, finally mapped to the three mapping values 100, 100, and 100, which are summed into a total mapping value of 300, the heat of the video. With a preset heat threshold of 100, the heat 300 of the video exceeds the threshold 100, indicating that the video is hot data. This also means that among the three back-source parameters of the video, as long as at least one exceeds its corresponding parameter threshold, the resulting mapping value is at least 100, not less than the preset heat threshold 100, and the video can be determined to be hot data.
In another example, a certain media data item is audio, for example a piece of music. Feature extraction is performed by a pre-trained popularity prediction model to obtain popularity features of the music, such as play count, listen count, number of likes, number of comments, and number of forwards and shares. Prediction is then performed on these popularity features to obtain a heat value of 200, representing that the heat of the music is 200. With a preset heat threshold of 100, the heat of 200 exceeds the threshold of 100, indicating that the music is hot data.
After judging whether the media data is hot data, in response to the heat of the media data indicating that it is hot data, a cache request is sent to at least some of the edge servers. That the heat indicates the media data is hot data, i.e., that the media data has been judged to be hot data, means the media data is frequently accessed by, and popular among, end users. To respond more quickly to end-user demand, it is desirable that the edge servers be able to cache the media data. A cache request for the media data is therefore sent to at least some edge servers, instructing them to cache the media data in advance. Then, when an end user initiates a request for the media data, the edge server's cache is more likely to be hit, which accelerates the download and transmission of the media data, responds to the end user's request more quickly, and improves the user experience.
In some embodiments, referring to fig. 3E, step 103 shown in fig. 3A may also be implemented by the following steps 1031 to 1032, which are specifically described below.
In step 1031, metadata for the media data is queried from the database.
In some embodiments, after the media data is determined to be hot data, a response is made to the edge server's request to download and acquire the media data. In the embodiment of the application, when responding to that request, the metadata of the media data, rather than the media data itself, is returned to the edge server. First, the metadata of the media data is queried from the database. The metadata is not the original data of the media data but intermediate data (such as a resource link or mirror of the media data); because this intermediate data occupies little memory, it reduces the network load and effectively saves storage space on the edge server.
In step 1032, the metadata is sent to at least a portion of the edge servers.
Following the above embodiment, after the metadata of the media data is queried from the database, a cache request may be sent to at least some edge servers. Specifically, the metadata of the media data is sent to at least some edge servers, where the metadata instructs those edge servers to pre-cache the original data of the media data. That is, after receiving the metadata of the media data, an edge server can cache the original data of the media data in advance from the source server according to that metadata. When an end user later sends a request for the media data again, the request can directly hit the edge server's cache, and the edge server can quickly return the pre-cached original data of the media data to the end user.
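Steps 1031 to 1032 can be sketched as follows, assuming a simple in-memory `METADATA_DB` and hypothetical server identifiers; in a real deployment the database query and the transport of the cache request would be network operations.

```python
# Hypothetical metadata store: media id -> lightweight intermediate data.
METADATA_DB = {
    "video-42": {"media_id": "video-42",
                 "url": "https://origin.example/video-42"},
}

def send_cache_request(media_id, edge_servers):
    """Step 1031: query the metadata of the media data from the database.
    Step 1032: send that metadata (not the raw bytes) to each target
    edge server as the cache request."""
    metadata = METADATA_DB[media_id]
    return {server: metadata for server in edge_servers}
```

Only the small metadata record travels to each edge server; every edge server then fetches the original data itself, which keeps the origin-to-edge control traffic light.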
According to the embodiment of the application, when the media data is determined to be hot data, its metadata is sent to some of the edge servers so that those edge servers cache the original data of the media data. A media data request initiated by an end user can therefore hit the edge server's cache more easily, reducing the number of back-to-source operations by the edge server. In addition, what is returned to the edge server is the metadata of the media data rather than its original data; since the metadata is small, this reduces the network resources consumed in transmission and the bandwidth pressure of data transmission between network nodes in the content distribution network. Even if the edge server has already cached the original data of the media data, no additional back-to-source operation is added. Moreover, the media data cache interface of the edge server requires no secondary development; the existing cache interface is reused directly.
In some embodiments, referring to fig. 3F, the following steps 105 to 106 may also be performed before step 103 shown in fig. 3A, as described below.
In step 105, a plurality of edge servers in the content distribution network that establish communication with the origin server are determined based on the topology of the content distribution network in which the origin server is located.
In some embodiments, after the media data is determined to be hot data, all edge servers could be directly triggered to cache the media data, that is, a cache request could be sent directly to all edge servers, instructing all of them to cache the original data of the media data. This, however, places a significant load on the network nodes in the content distribution network. Moreover, constrained by objective conditions such as region and network, each edge server has a different demand for the media data; directly triggering all edge servers to cache the original data of the media data without considering these conditions would waste data resources.
On this basis, the embodiment of the application determines, according to the topology of the content distribution network in which the source server is located, the plurality of edge servers in that network that have established communication with the source server. Specifically, before the cache request is sent to at least some edge servers, it is determined which edge servers the cache request needs to be sent to, so that the cache request is sent, in a targeted manner, only to some edge servers, instructing them to cache the media data. Since only edge servers that have established communication can download and acquire the media data from the source server through the network nodes, the plurality of edge servers that have established communication with the source server is determined according to the topology of the content distribution network in which the source server is located.
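One way to realize step 105, determining from the topology which edge servers have established communication with the source server, is a reachability search over the network graph; the adjacency-map representation of the topology and all names below are illustrative assumptions.

```python
from collections import deque

def communicating_edge_servers(topology, origin, edge_servers):
    """Breadth-first search from the source server over the CDN topology
    (an adjacency map node -> list of neighbour nodes, assumed) and keep
    only the edge servers reachable from it, i.e. those that have
    established communication with the source server."""
    seen, queue = {origin}, deque([origin])
    while queue:
        node = queue.popleft()
        for neighbour in topology.get(node, []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return [s for s in edge_servers if s in seen]
```

Edge servers not reachable from the source server can be excluded up front, since no cache request sent to them could result in a successful download from the source.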
At least a portion of the edge servers are determined from the plurality of edge servers in step 106.
With the above embodiments, when a plurality of edge servers in a content distribution network for establishing communication with an origin server are determined, at least some of the edge servers are determined from the plurality of edge servers, and a method for determining at least some of the edge servers will be described in detail below.
In some embodiments, referring to fig. 3G, step 106 shown in fig. 3F may also be implemented by the following steps 1061A-1062A, described in detail below.
In step 1061A, the communication distance at which each edge server establishes communication with the origin server is determined.
The communication distance between the source server and an edge server affects the speed of media data download and transmission: the closer the communication distance, the faster the data download and transmission, and the faster the response to data requests for the media data; conversely, the farther the communication distance, the slower the data download and transmission, and the slower the response to data requests for the media data. Based on this, at least some edge servers may be determined by the communication distance between the source server and the edge servers. First, in the content distribution network, the communication distance at which each edge server establishes communication with the source server is determined. This is not a distance along a physical route but a network distance at the level of the network communication route. Because there may be multiple network communication routes between the source server and an edge server, there may be multiple candidate distances; in the embodiment of the application, the distance of the shortest communication route is taken as the communication distance between the edge server and the source server.
In step 1062A, when the communication distance is not less than a distance threshold, the edge server is determined to be one of the at least some edge servers.
Following the above embodiment, since a larger communication distance means slower download and transmission, the embodiment of the application selects the edge servers whose communication distance from the source server is not less than a preset distance threshold as the at least some edge servers. Specifically, a communication distance threshold is set; for each of the plurality of edge servers, when it is determined that the communication distance between that edge server and the source server is not less than the communication distance threshold, the edge server is determined to be one of the at least some edge servers. In this way, the edge servers far from the source server can be screened out from the plurality of edge servers, and the cache request for the media data is then sent to those distant edge servers.
Through steps 1061A to 1062A of the embodiment of the application, at least some edge servers are determined from the plurality of edge servers by the communication distance between each edge server and the source server. That is, the metadata of the media data is sent only to the edge servers far from the source server, so that the distant edge servers in communication with the source server cache the original data of the media data in advance. Compared with sending the cache request directly to all edge servers without considering any condition, this reduces the network load pressure in the entire content distribution network, responds to end-user demand more quickly, and does not waste data resources.
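Steps 1061A to 1062A reduce to a simple threshold filter once each edge server's shortest-route distance is known; the hop-count representation of communication distance and the names below are assumptions.

```python
def far_edge_servers(distances, distance_threshold):
    """distances: {edge_server: shortest network distance to the source
    server} (here assumed to be a hop count). Keep the servers whose
    distance is not less than the threshold: the distant servers benefit
    most from pre-caching, since their back-to-source path is slowest."""
    return [server for server, d in distances.items()
            if d >= distance_threshold]
```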
In some embodiments, referring to fig. 3H, step 106 shown in fig. 3F may also be implemented by the following steps 1061B to 1064B, which are specifically described below.
In some embodiments, at least some edge servers may also be determined by predicting the likelihood that each edge server will perform a back-to-source operation for the media data. For each edge server, the back-source probability for the media data is predicted to determine which edge servers the cache request needs to be sent to, as described in detail below.
In step 1061B, a set of media data is acquired.
First, for each edge server, a media data set is acquired, where the media data set is the set of media data for which the current edge server has already performed back-to-source operations. Specifically, all media data (including videos, pictures, documents, voice packets, and so on) for which the current edge server has performed back-to-source operations within a certain time period is collected. The time period may be preset according to the actual scenario, for example, one month. The collection may be performed by querying the response log or data transmission log of the current edge server, because each time the edge server responds to an end user's media data request, it stores a corresponding media data transmission record in the response log or data transmission log.
In step 1062B, feature extraction processing is performed on the media data set to obtain a back source feature of the media data set.
After the media data set for which the current edge server has performed back-to-source operations is obtained, feature extraction may be performed on the media data set to obtain its back-source features. A back-source feature generally characterizes some property of the media data. For example, when the media data is video, the video category (TV drama, short video), subject matter (historical, educational), and genre (news, documentary) can serve as the video's back-source features. If most of the videos in the media data set are short videos, the current edge server is more likely to perform back-to-source operations for media data of the short-video type, and the predicted back-source probability is correspondingly higher.
The feature extraction of the media data set may also be implemented by invoking a back-source prediction model: the media data set is input into the back-source prediction model as the data set for feature extraction. Different media data may invoke the model to extract different types of features; for example, if the media data is a picture, the convolutional layers of the back-source prediction model are invoked to extract the picture's convolutional features, and if the media data is a document, the text encoder of the model is invoked to perform text encoding on the document.
In step 1063B, the back source feature is mapped based on the media data to obtain a back source probability of the edge server for the media data.
Following the above embodiment, after the back-source prediction model is invoked to extract the back-source features of the media data set, those features are mapped based on the media data to obtain the edge server's back-source probability for the media data. Specifically, the feature information of the current media data is extracted first; then, based on that feature information, the back-source prediction model (for example, a fully connected layer) is invoked to map the back-source features of the media data set, yielding the current edge server's back-source probability for the current media data. The higher the correlation between the feature information of the current media data and the back-source features of the media data set, the larger the back-source probability, and vice versa.
For example, suppose the media data set collected by the current edge server is a video set, and after its back-source features are extracted, documentary short videos on historical subjects are found to have the highest proportion. The current edge server is thus prone to performing back-to-source operations for historical-subject documentary short videos. If the feature information of the current video shows that it is itself a historical-subject documentary short video, its features are similar to the back-source features of the video set and the correlation is high; mapping the video set's back-source features against the current video then yields a high back-source probability (for example, 99%). If instead the current video is found to be an educational-subject documentary drama, which differs greatly from the back-source features of the video set (there may be few or even no educational-subject documentary dramas in the set), the mapping yields a low back-source probability (for example, 2%).
In step 1064B, when the back source probability is greater than the probability threshold, the edge servers are determined to be at least part of the edge servers.
Following the above embodiment, once the back-source features have been mapped based on the media data to obtain the current edge server's back-source probability for the media data, the likelihood that the current edge server will perform a back-to-source operation for the media data can be judged from that probability: the higher the probability, the greater the likelihood. Thus, for each edge server, the back-source probability for the media data is predicted to determine this likelihood. A probability threshold (for example, 95%) is preset, and when the back-source probability is greater than the probability threshold, the edge server is determined to be one of the at least some edge servers. A probability above the threshold means the edge server is more likely to go back to the source for the media data, i.e., end users request the media data from it more often. A cache request may then be sent to each edge server whose back-source probability exceeds the threshold, i.e., the metadata of the media data is sent to those edge servers so that they pre-cache the original data of the media data.
Through steps 1061B to 1064B of the embodiment of the present application, for each edge server, the media data set already back-sourced by the current edge server is collected and its back-source features are determined, so as to judge whether the current edge server will perform a back-to-source operation for the current media data. When it is determined that the current edge server will go back to the source for the media data (the back-source probability is greater than the probability threshold), the edge server is included among the at least some edge servers and a cache request is sent to it. When it is determined that the edge server will not go back to the source for the media data (the back-source probability is not greater than the probability threshold), no cache request is sent. Compared with directly triggering a cache request to all edge servers without considering any condition, this reduces the load pressure on network nodes in the entire content distribution network, responds to end-user demand more quickly, and does not waste resources.
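As a toy stand-in for steps 1061B to 1064B, the learned back-source prediction model is replaced here by a simple category-overlap frequency: the "probability" is the fraction of historically back-sourced items sharing the candidate's category. This is an assumption meant only to illustrate the thresholding logic, not the model itself.

```python
def back_source_probability(history_categories, media_category):
    """Fraction of the edge server's back-sourced history that matches the
    candidate media item's category (stand-in for the model's mapping)."""
    if not history_categories:
        return 0.0
    matches = sum(1 for c in history_categories if c == media_category)
    return matches / len(history_categories)

def should_cache(history_categories, media_category, probability_threshold=0.95):
    """Step 1064B: select the edge server only when the predicted
    back-source probability exceeds the probability threshold."""
    return back_source_probability(history_categories, media_category) > probability_threshold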
In some embodiments, after a cache request is sent to at least some edge servers, the corresponding edge servers pre-cache the original data of the media data. However, over a certain period of time the heat of the media data may decrease: end users request the media data less often, the number of back-to-source operations performed by edge servers for the media data necessarily decreases, and the heat of the media data changes accordingly. Because the back-to-source information of the media data changes in real time, the changed heat of the media data within that period can be determined from the changed back-to-source information.
When the number of back-to-source operations for the media data decreases within a certain period, the heat determined from the changed back-to-source information within that period also decreases. The reduced heat may fall below the heat threshold, in which case the heat indicates that the media data is no longer hot data. When the media data is not hot data, the at least some edge servers no longer need to cache it. In the embodiment of the application, in response to the changed heat indicating that the media data is not hot data, a cache release request is sent to the at least some edge servers. The cache release request instructs those edge servers to clear their cache of the media data, i.e., to evict the cached media data, thereby saving storage resources on the edge servers.
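The cache-release path can be sketched as follows; the request format and function name are hypothetical.

```python
def maybe_release(media_id, new_heat, heat_threshold, edge_servers):
    """If the recomputed heat has fallen below the heat threshold, build a
    cache release request for each edge server that was told to cache the
    media data; otherwise do nothing."""
    if new_heat < heat_threshold:
        return [(server, {"action": "release", "media_id": media_id})
                for server in edge_servers]
    return []
```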
The embodiment of the application also provides a data processing method which is applied to the edge server. Referring to fig. 3I, fig. 3I is a flowchart of a data processing method according to an embodiment of the present application, and the edge server 500 shown in fig. 1 is taken as an execution body, and the steps shown in fig. 3I will be described.
In step 107, a cache request for media data sent by an origin server is received.
In some embodiments, when an end user sends a request for media data, the edge server, upon receiving the request, queries the database to determine whether the media data is cached; when it determines that the media data is not found in the database, it initiates a download request for the media data to the source server. The source server eventually sends a cache request to at least some of the edge servers, instructing them to cache the media data. At this point a cache request for the media data sent by the source server may be received, where the cache request is generated by the source server based on the heat of the media data after determining that the edge server has not cached the media data.
In step 108, the media data is pre-cached according to the caching request.
After receiving the cache request from the source server, the media data can be cached in advance according to the cache request. Specifically, receiving the cache request for the media data sent by the source server means receiving the metadata of the media data, which the source server sends as the cache request after querying it from the database. The original data of the media data is downloaded from the source server and cached in advance according to the received metadata, and is later returned to the end user in response to the end user's request for the media data. When a repeated request for the media data from an end user is received, the cache can be hit directly: since the original data of the media data has been cached, it is returned to the end user directly without going back to the source server. This improves the cache hit rate, responds to end-user requests more quickly, and improves the user experience.
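An edge-side sketch of steps 107 to 108, assuming a hypothetical `fetch_from_origin` downloader injected as a dependency; note the guard that avoids a redundant back-to-source when the original data is already cached.

```python
class EdgeCache:
    """Minimal edge-server cache (illustrative; names are assumed)."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin   # downloads raw bytes from origin
        self._store = {}                  # media_id -> cached original data

    def handle_cache_request(self, metadata):
        """Step 107/108: on a cache request (metadata), pre-cache the
        original data via the metadata's resource link, skipping the
        download if the data is already cached."""
        media_id = metadata["media_id"]
        if media_id not in self._store:
            self._store[media_id] = self._fetch(metadata["url"])
        return media_id

    def serve(self, media_id):
        """Return cached original data on a later end-user request
        (cache hit), or None when the data is not cached."""
        return self._store.get(media_id)
```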
In the following, an exemplary application of the embodiments of the present application in a practical application scenario will be described.
In some video scenarios, a user initiates a video request each time the user clicks a video in a video application of a terminal or enters the uniform resource locator (Uniform Resource Locator, URL) address of the video on the terminal. Each time the server responds to the terminal's video request, it reads the video data from the local cache according to the video's URL address; if the corresponding video data is not found in the cache, i.e., the cache is missed, the video data is downloaded from the source server and returned, or failure information for the video request is fed back directly to the end user. However, considering the server's response speed and external network factors, the download and transmission of video data is slow, making the entire video request process very slow and the user experience poor.
In the related art, a CDN network is used to download video, so as to accelerate the response to the terminal's video request. Referring specifically to fig. 4, fig. 4 is a schematic sequence diagram of the video transmission process in the CDN network provided in the embodiment of the present application. As shown in fig. 4, a user clicks a video in a video application of a terminal, or inputs the URL address of the video on the terminal, and sends a video download request. The user's request is first sent to a local Domain Name System (DNS) server for domain name resolution; the local DNS server responds to the terminal's request, performs domain name resolution according to the domain name in the video's URL address, obtains the Internet Protocol (IP) address of the corresponding CDN node, and returns it to the user terminal as the edge server's IP address. Based on the edge server's IP address, the user's request is sent through the CDN network to the nearest edge server to request the video content. The CDN may select an optimal edge server to respond to the user's request according to factors such as the user's geographic location, network conditions, and the CDN network topology. The edge server queries the database for the video content and judges whether a local cache exists; if the edge server has no local cache, it sends a back-to-source request to the source server, that is, it downloads the video from the source server to acquire the video content (the original data of the video). After receiving the back-to-source request, the source server resolves the source server's IP address from the domain name in the video URL, downloads and acquires the video data, and returns the video content to the edge server.
After receiving the video data sent by the source server, the edge server records it in the local cache, i.e., caches the video content locally, so that the cache can be hit on the user's next request and the video content returned to the user directly, which speeds up video downloading and improves the user experience. After recording the local cache, the edge server returns the video content to the user's terminal, i.e., transmits the video content to the user, and the user can watch the video in the terminal's browser. As fig. 4 shows, whether the edge server has a local cache greatly affects the video transmission process. In some scenarios with frequent video requests, if the edge server has no corresponding video cached when responding to each request, i.e., the edge server's cache hit rate is low, it must send a back-to-source request to the source server to acquire the video data. This results in a high back-to-source count for the videos, and for videos with a high back-to-source count the download and transmission speeds are very slow, which affects the user experience.
Based on the above scenario, the embodiment of the application provides a video transmission method. A hot video judgment module, that is, a hot video determination (determination Hot) module, is set between the source server and the edge servers to collect the back-to-source situation of each video reported by the source server, and then to judge whether the video currently being back-sourced by the source server is a hot video. When a hot video is determined, other edge servers are actively triggered to cache the hot video. When different users subsequently keep initiating video requests to the CDN network through their terminals to download the video, the requests can directly hit the cache in the edge server, which returns the video data to the terminal, thereby improving the edge server's cache hit rate as well as the download speed, transmission speed, and user experience of the video.
Specifically, referring to fig. 5, fig. 5 is an interactive flowchart of a video transmission method provided in an embodiment of the present application, and the video transmission method provided in the embodiment of the present application will be described in detail with reference to the steps shown in fig. 5.
In step 501, the URL address of the play video or the input video is clicked.
When a user browses a video at a terminal, the user generally clicks the video in a terminal application program or inputs a URL address of the video at the terminal, and sends a video download request to acquire video data.
In step 502, a video request is initiated to a local DNS server.
If the terminal has locally cached the video data, for example the user has already downloaded the video data in the application and stored it in the terminal's memory, the video data is obtained directly from the terminal and the video is played. If the terminal has no locally cached video data, the video download request sent by the user is generally sent to the local DNS server.
In step 503, domain name resolution is performed on the URL address of the video.
And the local DNS server receives the user request sent by the terminal and performs domain name resolution on the URL address of the video, so that a corresponding CDN node IP address, namely the IP address of the edge server, is obtained.
In step 504, the IP address of the edge server is returned.
After the local DNS server obtains the IP address of the edge server, the IP address of the edge server is returned to the terminal.
In step 505, video content is requested from an edge server.
After receiving the edge server's IP address returned by the local DNS server, the terminal can request the video content from the edge server. An optimal edge server may be selected to respond to the video content request sent by the terminal according to factors such as the end user's geographic location, network conditions, and the CDN network topology.
In step 506, it is determined whether there is a local video buffer, if yes, the process proceeds to step 507, and if no, the process proceeds to step 508.
When the edge server responds to the video content request sent by the terminal, it queries the local cache to judge whether the video data requested by the user is cached locally. If it determines that the video data requested by the user is already cached locally, step 507 is executed; if it determines that the video data is not cached locally, step 508 is executed.
In step 507, the video content is returned directly to the terminal.
After the edge server queries from the local cache, if it is determined that the video data requested by the user is already cached locally, it is indicated that the edge server hits the cache, and corresponding video content is directly returned to the user of the terminal. Therefore, the downloading and transmitting speed of the video is increased, and the user experience is improved.
In step 508, the source server is requested to download the video content.
Accordingly, when the edge server queries from the local cache, it is determined that the video data requested by the user is not cached locally, and this indicates that the edge server does not hit the cache. In order to respond to the video content request initiated by the user at the terminal, the edge server requests to download and acquire the corresponding video content from the source server.
In step 509, the source server IP address is resolved from the URL address of the video and the video content is downloaded.
An origin server is a source of video data, i.e., a repository of videos, in which metadata corresponding to the videos must be stored. When the edge server requests to download the corresponding video content from the source server, domain name resolution is performed according to the URL address of the video to obtain the IP address of the source server, and then the video content is downloaded or acquired to return to the edge server.
In step 510, the source-returning status of the current video is reported.
After the source server provides the corresponding video data, it asynchronously reports the source-returning status of the current video, i.e., the number and frequency of requests for the video, to the hot-video judging module.
In step 511, the video content is returned to the edge server.
While asynchronously reporting the source-returning status of the current video to the hot-video judging module, the source server also returns the corresponding video content to the edge server, so as to respond to the video content request initiated by the user at the terminal.
In step 512, it is determined whether the current video is a hot video; if yes, the process proceeds to step 514, and if no, to step 513.
After the hot-video judging module receives the source-returning status of the current video reported by the source server, it judges in real time, according to that status, whether the current video is a hot video. The judgment is based on the video's source-returning count, i.e., the number of requests the source server has received for the video. If a video's source-returning count is high, the video is requested by many users and is likely a hot video; conversely, if the count is low, the video is requested by few users and is likely not a hot video.
In the embodiment of the present application, whether the current video is a hot video is judged according to the video's source-returning count; the specific judgment conditions include the following five points:
(1) The video's source-returning count within a set period exceeds a threshold; for example, if a video's source-returning count in the past 1 hour exceeds 100, the video can be judged to be a hot video.
(2) The growth rate of the video's source-returning count within a set period exceeds a threshold; for example, if the count's growth rate in the past 1 hour exceeds 10%, the video can be judged to be a hot video.
(3) The proportion of the video's source-returning count to its total request count within a set period exceeds a threshold; for example, if the source-returning count in the past 1 hour exceeds 50% of the total request count for the video, the video can be judged to be a hot video.
(4) The video's source-returning count within a set period exceeds a multiple of its average; for example, if the count in the past 1 hour exceeds 2 times the hourly average over the past 24 hours, the video can be judged to be a hot video.
(5) The video's source-returning requests within a set period come from more than a set percentage of edge nodes; for example, if in the past 1 hour the video was requested back to source by more than 80% of the edge nodes in the CDN, the video can be judged to be a hot video.
If the current video satisfies at least one of the above five conditions, it is determined to be a hot video and execution proceeds to step 514; if it satisfies none of them, it is determined not to be a hot video and execution proceeds to step 513.
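The five conditions above, with the example thresholds from the text, can be sketched as a single predicate. The parameter names and threshold values are illustrative only; the embodiment leaves the concrete thresholds configurable:

```python
def is_hot_video(
    back_count_1h: int,        # source-returning requests in the past hour
    growth_rate_1h: float,     # growth rate of the count, e.g. 0.12 means 12%
    back_ratio_1h: float,      # source-returning count / total request count
    avg_back_count_24h: float, # hourly average count over the past 24 hours
    edge_nodes_hit: int,       # edge nodes that went back to source in the past hour
    total_edge_nodes: int,
) -> bool:
    """A video is judged hot if at least one of the five conditions holds."""
    conditions = [
        back_count_1h > 100,                       # (1) absolute count threshold
        growth_rate_1h > 0.10,                     # (2) growth-rate threshold
        back_ratio_1h > 0.50,                      # (3) share of total requests
        back_count_1h > 2 * avg_back_count_24h,    # (4) multiple of the 24h average
        edge_nodes_hit > 0.80 * total_edge_nodes,  # (5) share of edge nodes
    ]
    return any(conditions)
```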
In step 513, no cache requests need to be initiated to other edge servers.
When the current video is judged not to be a hot video, its source-returning count is low and the edge servers may already have cached its data, so there is no need to initiate a cache request for the current video to the other edge servers.
In step 514, other edge servers are triggered to actively cache the hot video.
When the current video is judged to be a hot video, its source-returning count is high while its data is not yet cached in the edge servers. The high count also indicates that terminal users request the video frequently, so its data needs to be cached in the edge servers. At this point, the hot-video judging module triggers the other edge servers to actively cache the video data of the hot video.
In step 515, the metadata of the hot video is actively cached.
When the other edge servers (i.e., all edge servers) receive the active-caching request triggered by the hot-video judging module, they actively cache the metadata of the hot video. The metadata instructs an edge server to cache the video content in advance and is stored in the edge server's local storage.
Since the download requests initiated by users at terminals have a consistent form, only the metadata differs (i.e., different videos are requested). If an edge server does not hold the corresponding video data, it goes back to source (i.e., downloads the data from the source server) and caches the result, thereby achieving the goal of caching the video. If the edge server already holds the corresponding video data, it returns it directly to the user of the terminal without going back to source. In addition, the edge server needs no secondary development (i.e., no separate cache-video interface); the existing video download interface can be reused directly. Since the bandwidth consumed by the metadata is negligible, requesting only the metadata instead of the complete video content saves bandwidth during video download and transmission and speeds up the edge server's downloading and transmission of video data.
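The interface reuse described above can be sketched as follows. The class and method names are hypothetical; the point is that the pre-caching handler simply calls the existing download path, so a cache miss triggers back-to-source and a later user request hits the cache:

```python
class EdgeServer:
    """Minimal sketch of an edge server that reuses its download interface
    for active pre-caching, assuming a caller-supplied back-to-source fetcher."""

    def __init__(self):
        self.cache = {}  # url -> video bytes

    def download(self, url: str, fetch_from_origin) -> bytes:
        """Existing download interface: serve from cache, or go back to source
        and cache the result before returning it."""
        if url not in self.cache:
            self.cache[url] = fetch_from_origin(url)
        return self.cache[url]

    def on_cache_request(self, metadata: dict, fetch_from_origin) -> None:
        """Active pre-caching reuses download(); no separate cache interface."""
        self.download(metadata["url"], fetch_from_origin)
```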
In step 516, the video content is returned to the terminal.
After all edge servers have actively cached the metadata of the corresponding hot video, when a repeated request for the hot video is initiated by users of different terminals, the optimal edge server is selected directly and the cache is hit: the metadata of the hot video is queried from the optimal edge server's local cache, the content is downloaded and transmitted, and the video content of the hot video is returned to the corresponding terminal.
In the embodiment of the present application, a hot-video judging module is added to the CDN architecture. When an edge server responds to an end user's video request without a local cache of the current video and goes back to the corresponding source server, the module judges in real time, from the source-returning count of the current video reported by the source server, whether the current video is a hot video. When the current video is determined to be a hot video, all edge servers are triggered to actively cache its metadata. In responding to repeated requests for the current video from different end users, this greatly improves the cache hit rate of hot videos and increases video download speed, thereby improving the user experience.
The following continues with an exemplary structure of the data processing apparatus 453 provided in the embodiments of the present application, implemented as software modules. In some embodiments, as shown in Fig. 2A, the software modules of the data processing apparatus 453 stored in the memory 450 may include: an obtaining module 4531, configured to obtain the source-returning information of the media data in response to at least one edge server not caching the media data; a determining module 4532, configured to determine the heat of the media data based on the source-returning information of the media data; and a sending module 4533, configured to send a cache request to at least some edge servers in response to the heat of the media data indicating that the media data is hot data, where the cache request instructs the at least some edge servers to pre-cache the media data.
In some embodiments, the determining module 4532 is further configured to determine at least one back-source parameter of the media data according to the back-source information of the media data; determining a heat of the media data based on the at least one back source parameter;
wherein the back-source parameter is one of the following: the number of source-returning times of the media data within a preset time period; the growth rate of the source-returning times within a preset time period; the proportion of the source-returning times within a preset time period to the total number of source-returning requests of the corresponding edge server; the multiple by which the source-returning times exceed the average source-returning times within a preset time period; and the percentage of the source-returning times within a preset time period relative to the total source-returning times of the corresponding edge server.
In some embodiments, the determining module 4532 is further configured to map the back-source parameter to obtain the heat of the media data when the back-source parameter of the media data is one; when the number of the back source parameters of the media data is multiple, mapping processing is carried out on each back source parameter to obtain multiple mapping values, and the sum of the multiple mapping values is used as the heat of the media data.
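The single-parameter versus multi-parameter heat computation can be sketched as below. The exponential squashing function is a hypothetical choice; the embodiment does not fix a particular mapping:

```python
import math

def map_param(value: float) -> float:
    """Hypothetical mapping: squash a non-negative back-source parameter
    into [0, 1), monotonically increasing in the parameter."""
    return 1.0 - math.exp(-value)

def heat(params: list) -> float:
    """One parameter: its mapped value is the heat.
    Several parameters: the heat is the sum of the mapped values."""
    if len(params) == 1:
        return map_param(params[0])
    return sum(map_param(p) for p in params)
```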
In some embodiments, the determining module 4532 is further configured to obtain original data of the media data based on the source-back information of the media data; performing feature extraction processing on the original data of the media data to obtain the heat feature of the original data; and predicting the heat characteristic to obtain the heat of the media data.
In some embodiments, the sending module 4533 is further configured to query the database for metadata of the media data;
metadata is sent to at least some of the edge servers, wherein the metadata is used to instruct at least some of the edge servers to pre-cache raw data of the media data.
In some embodiments, the sending module 4533 is further configured to determine a plurality of edge servers in the content distribution network that establish communication with the source server according to a topology structure of the content distribution network in which the source server is located; at least a portion of the edge servers are determined from the plurality of edge servers.
In some embodiments, the sending module 4533 is further configured to determine a communication distance for each edge server to establish communication with the source server; when the communication distance is not less than the distance threshold, the edge server is determined to be at least part of the edge server.
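The distance-based selection of target edge servers can be sketched as follows; the distance metric (hops, latency, etc.) is not fixed by the embodiment, and the names here are illustrative:

```python
def select_by_distance(edge_servers: dict, distance_threshold: float) -> list:
    """Keep edge servers whose communication distance to the source server
    is not less than the threshold (farther servers benefit most from
    pre-caching, since their back-to-source cost is highest)."""
    return [name for name, dist in edge_servers.items()
            if dist >= distance_threshold]
```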
In some embodiments, the sending module 4533 is further configured to, for each edge server, perform the following: acquiring a media data set, wherein the media data set is a set of media data of which the source is executed in an edge server; performing feature extraction processing on the media data set to obtain the back source feature of the media data set; mapping the back source characteristics based on the media data to obtain the back source probability of the edge server for the media data; when the back source probability is greater than the probability threshold, the edge server is determined to be at least part of the edge server.
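The probability-based selection can be sketched with a deliberately simplified stand-in for the feature-extraction and mapping steps: here the "back-source probability" is just the fraction of an edge server's past back-to-source events that targeted the video. A real implementation would use a learned model; everything below is a hypothetical illustration:

```python
def back_source_probability(history: list, video_id: str) -> float:
    """Toy stand-in for feature extraction + mapping: the share of this edge
    server's past back-to-source events that were for the given video."""
    if not history:
        return 0.0
    hits = sum(1 for v in history if v == video_id)
    return hits / len(history)

def select_by_probability(edges: dict, video_id: str, threshold: float) -> list:
    """Edge servers whose back-source probability for the video exceeds
    the probability threshold are selected for pre-caching."""
    return [name for name, history in edges.items()
            if back_source_probability(history, video_id) > threshold]
```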
In some embodiments, the sending module 4533 is further configured to, when the back source information of the media data changes, determine the heat after the change of the media data according to the changed back source information; and sending a cache release request to at least part of the edge servers in response to the changed heat indicating that the media data is not hot data, wherein the cache release request is used for indicating at least part of the edge servers to clear the cache of the media data.
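The re-evaluation on changed source-returning information can be sketched as a small dispatcher; the threshold value and callback names are assumptions:

```python
HOT_THRESHOLD = 1.0  # hypothetical heat threshold for "hot data"

def on_back_source_update(new_heat: float, send_cache_request, send_release_request):
    """When updated source-returning information changes the heat:
    still hot -> keep the pre-cache; no longer hot -> ask edge servers
    to release the cached media data."""
    if new_heat >= HOT_THRESHOLD:
        send_cache_request()
    else:
        send_release_request()
```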
In some embodiments, as shown in Fig. 2B, the software modules of the data processing apparatus 553 stored in the memory 550 may include: a receiving module 5531, configured to receive a cache request for media data sent by the source server, where the cache request is generated by the source server based on the heat of the media data after determining that the edge server has not cached the media data; and a caching module 5532, configured to pre-cache the media data according to the cache request.
Embodiments of the present application provide a computer program product comprising a computer program or computer-executable instructions stored in a computer-readable storage medium. The processor of the electronic device reads the computer-executable instructions from the computer-readable storage medium, and the processor executes the computer-executable instructions, so that the electronic device executes the data processing method according to the embodiment of the present application.
The embodiments of the present application provide a computer-readable storage medium storing computer-executable instructions or a computer program which, when executed by a processor, cause the processor to perform the data processing method provided by the embodiments of the present application, for example, the data processing method shown in Figs. 3A to 3H or the data processing method shown in Fig. 3I.
In some embodiments, the computer-readable storage medium may be RAM, ROM, flash memory, magnetic surface memory, an optical disc, or CD-ROM, or may be any device including one of, or any combination of, the above memories.
In some embodiments, computer-executable instructions may be written in any form of programming language, including compiled or interpreted languages, or declarative or procedural languages, in the form of programs, software modules, scripts, or code, and they may be deployed in any form, including as stand-alone programs or as modules, components, subroutines, or other units suitable for use in a computing environment.
As an example, computer-executable instructions may, but need not, correspond to files in a file system, may be stored as part of a file that holds other programs or data, such as in one or more scripts in a hypertext markup language (Hyper Text Markup Language, HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).
As an example, computer-executable instructions may be deployed to be executed on one electronic device or on multiple electronic devices located at one site or, alternatively, on multiple electronic devices distributed across multiple sites and interconnected by a communication network.
In summary, according to the embodiments of the present application, when an edge server has not cached the media data, the heat of the media data is determined from its source-returning information, and when the heat is high, a cache request for the media data is initiated to some of the edge servers so that they cache it. When media data items are numerous and data requests are frequent, triggering edge servers to cache media data according to its heat makes requests for the media data more likely to hit an edge server's cache, improving the edge servers' cache hit rate. Thus, when responding to requests for the media data, download and transmission are faster and requests are answered more quickly.

In addition, when the media data is determined to be hot data, the metadata of the media data is sent to some of the edge servers so that they cache the original data of the media data. Requests for the media data initiated by end users can therefore hit the edge servers' caches more easily, reducing the edge servers' source-returning counts. Moreover, what is sent to the edge servers is the metadata of the media data rather than its original data; since the metadata occupies little memory, the network resources consumed in sending it are reduced, easing the bandwidth pressure of data transmission between network nodes in the content distribution network.
The foregoing is merely exemplary embodiments of the present application and is not intended to limit the scope of the present application. Any modifications, equivalent substitutions, improvements, etc. that are within the spirit and scope of the present application are intended to be included within the scope of the present application.

Claims (14)

1. A data processing method, applied to an origin server, the method comprising:
responding to the fact that at least one edge server does not cache the media data, and acquiring back source information of the media data;
determining a plurality of set time periods before the current moment, and executing the following processing for each set time period:
determining the number of times of source back of the media data in the set time period based on the source back information of the media data;
determining the source-returning frequency of the media data in the set time period according to the source-returning times;
determining the weight of the back source frequency, wherein the weight and the time interval between the set time period of the back source frequency and the current moment are in a negative correlation relationship;
weighting the source return frequency based on the weight to obtain a weighted source return frequency of the set time period;
summing the weighted back source frequencies of each set time period, and taking the average value of the summation result as the heat of the media data;
And sending a cache request to at least part of the edge servers in response to the fact that the heat of the media data represents that the media data is hot data, wherein the cache request is used for indicating at least part of the edge servers to cache the media data in advance.
2. The method of claim 1, wherein, before the sending a cache request to at least part of the edge servers in response to the heat of the media data characterizing the media data as hot data, the method further comprises:
determining at least one back source parameter of the media data according to the back source information of the media data;
determining a heat of the media data based on the at least one back source parameter;
wherein the source-returning parameter is one of the following: the number of source-returning times of the media data within a preset time period; the growth rate of the source-returning times within a preset time period; the proportion of the source-returning times within a preset time period to the total number of source-returning requests of the corresponding edge server; the multiple by which the source-returning times exceed the average source-returning times within a preset time period; and the percentage of the source-returning times within a preset time period relative to the total source-returning times of the corresponding edge server.
3. The method of claim 2, wherein determining the heat of the media data based on the at least one back source parameter comprises:
when the back source parameters of the media data are one, mapping the back source parameters to obtain the heat of the media data;
when the number of the source returning parameters of the media data is multiple, mapping processing is carried out on each source returning parameter respectively to obtain multiple mapping values, and the sum of the multiple mapping values is used as the heat of the media data.
4. The method of claim 1, wherein, before the sending a cache request to at least part of the edge servers in response to the heat of the media data characterizing the media data as hot data, the method further comprises:
acquiring original data of the media data based on the back source information of the media data;
performing feature extraction processing on the original data of the media data to obtain the heat feature of the original data;
and predicting the heat characteristic to obtain the heat of the media data.
5. The method of claim 1, wherein said sending a cache request to at least some of said edge servers comprises:
Querying metadata of the media data from a database;
and sending the metadata to at least part of the edge servers, wherein the metadata is used for indicating at least part of the edge servers to pre-cache the original data of the media data.
6. The method of claim 1, wherein prior to said sending a cache request to at least some of said edge servers, said method further comprises:
determining a plurality of edge servers which are in communication with the source server in the content distribution network according to the topological structure of the content distribution network where the source server is located;
at least a portion of the edge servers are determined from the plurality of edge servers.
7. The method of claim 6, wherein said determining at least a portion of said edge servers from said plurality of edge servers comprises:
determining a communication distance between each edge server and the source server;
and determining the edge server as at least part of the edge server when the communication distance is not less than a distance threshold.
8. The method of claim 6, wherein said determining at least a portion of said edge servers from said plurality of edge servers comprises:
The following processing is performed for each of the edge servers:
acquiring a media data set, wherein the media data set is a set of media data of which the source is executed in the edge server;
performing feature extraction processing on the media data set to obtain a source returning feature of the media data set;
mapping the back source characteristics based on the media data to obtain the back source probability of the edge server for the media data;
and when the source returning probability is larger than a probability threshold, determining the edge server as at least part of the edge server.
9. The method of claim 1, wherein after said sending a cache request to at least some of said edge servers, said method further comprises:
when the back source information of the media data changes, determining the heat of the changed media data according to the changed back source information;
and sending a cache release request to at least part of the edge servers in response to the changed heat representing that the media data is not hot data, wherein the cache release request is used for indicating at least part of the edge servers to clear the cache of the media data.
10. A data processing method, applied to an edge server, the method comprising:
receiving a cache request for media data sent by an origin server, wherein the cache request is generated based on the heat of the media data by the origin server determining that the edge server does not cache the media data;
pre-caching the media data according to the caching request, wherein the heat of the media data is obtained by the following steps: determining a plurality of set time periods before the current moment, and executing the following processing for each set time period:
determining the number of times of source back of the media data in the set time period based on the source back information of the media data;
determining the source-returning frequency of the media data in the set time period according to the source-returning times;
determining the weight of the back source frequency, wherein the weight and the time interval between the set time period of the back source frequency and the current moment are in a negative correlation relationship;
weighting the source return frequency based on the weight to obtain a weighted source return frequency of the set time period;
and adding the weighted back source frequencies of each set time period, and taking the average value of the addition result as the heat of the media data.
11. A data processing apparatus, the apparatus comprising:
the acquisition module is used for responding to the fact that at least one edge server does not cache the media data and acquiring the source returning information of the media data;
a determining module, configured to determine a plurality of set time periods before a current time, and execute the following processing for each set time period: determining the number of times of source back of the media data in the set time period based on the source back information of the media data; determining the source-returning frequency of the media data in the set time period according to the source-returning times; determining the weight of the back source frequency, wherein the weight and the time interval between the set time period of the back source frequency and the current moment are in a negative correlation relationship; weighting the source return frequency based on the weight to obtain a weighted source return frequency of the set time period; summing the weighted back source frequencies of each set time period, and taking the average value of the summation result as the heat of the media data;
and the sending module is used for responding to the heat degree of the media data to represent the media data as hot data and sending a cache request to at least part of the edge servers, wherein the cache request is used for indicating at least part of the edge servers to cache the media data in advance.
12. A data processing apparatus, the apparatus comprising:
the device comprises a receiving module, a processing module and a processing module, wherein the receiving module is used for receiving a cache request for media data sent by an origin server, wherein the cache request is generated based on the heat of the media data, and the origin server determines that an edge server does not cache the media data;
the caching module is used for caching the media data in advance according to the caching request, wherein the heat of the media data is obtained by the following steps: determining a plurality of set time periods before the current moment, and executing the following processing for each set time period: determining the number of times of source back of the media data in the set time period based on the source back information of the media data; determining the source-returning frequency of the media data in the set time period according to the source-returning times; determining the weight of the back source frequency, wherein the weight and the time interval between the set time period of the back source frequency and the current moment are in a negative correlation relationship; weighting the source return frequency based on the weight to obtain a weighted source return frequency of the set time period; and adding the weighted back source frequencies of each set time period, and taking the average value of the addition result as the heat of the media data.
13. An electronic device, the electronic device comprising:
a memory for storing computer executable instructions or computer programs;
a processor for implementing the data processing method of any one of claims 1 to 9 or the data processing method of claim 10 when executing computer executable instructions or computer programs stored in the memory.
14. A computer-readable storage medium storing computer-executable instructions or a computer program, which when executed by a processor implement the data processing method of any one of claims 1 to 9 or the data processing method of claim 10.
CN202311392617.8A 2023-10-25 2023-10-25 Data processing method, device, electronic equipment and computer readable storage medium Active CN117119052B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311392617.8A CN117119052B (en) 2023-10-25 2023-10-25 Data processing method, device, electronic equipment and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN117119052A CN117119052A (en) 2023-11-24
CN117119052B true CN117119052B (en) 2024-01-19

Family

ID=88809693

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311392617.8A Active CN117119052B (en) 2023-10-25 2023-10-25 Data processing method, device, electronic equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN117119052B (en)

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103974138A (en) * 2014-04-15 2014-08-06 上海聚力传媒技术有限公司 Method and device for preloading videos in CDN
CN104683485A (en) * 2015-03-25 2015-06-03 重庆邮电大学 C-RAN based internet content caching and preloading method and system
WO2019057209A1 (en) * 2017-09-25 2019-03-28 中兴通讯股份有限公司 Local content caching method and apparatus, storage medium and electronic apparatus
CN112866724A (en) * 2020-12-31 2021-05-28 山东远桥信息科技有限公司 Video service processing method and system based on software defined network and edge computing technology
CN112929676A (en) * 2019-12-06 2021-06-08 北京金山云网络技术有限公司 Live data stream acquisition method, device, node and system
CN112995251A (en) * 2019-12-13 2021-06-18 北京金山云网络技术有限公司 Source returning method and device, electronic equipment and storage medium
CN113766650A (en) * 2021-08-26 2021-12-07 武汉天地同宽科技有限公司 Internet resource acquisition method and system based on dynamic balance
CN114697683A (en) * 2022-03-25 2022-07-01 腾讯音乐娱乐科技(深圳)有限公司 Intelligent scheduling method, equipment and computer program product for streaming media file
WO2022228390A1 (en) * 2021-04-26 2022-11-03 北京字跳网络技术有限公司 Media content processing method, apparatus and device, and storage medium
WO2023082764A1 (en) * 2021-11-12 2023-05-19 中兴通讯股份有限公司 Content recording method, content playing method, cdn system, and storage medium
CN116321303A (en) * 2023-03-20 2023-06-23 北京航空航天大学 Data caching method, device, equipment and readable storage medium

Also Published As

Publication number Publication date
CN117119052A (en) 2023-11-24

Similar Documents

Publication Publication Date Title
Goian et al. Popularity-based video caching techniques for cache-enabled networks: A survey
US9749400B2 (en) Cooperative loading of webpages based on shared meta information
US8654684B1 (en) Multi-platform video delivery configuration
CN107251525B (en) Distributed server architecture for supporting predictive content pre-fetching services for mobile device users
US8126986B2 (en) Advanced content and data distribution techniques
CN102282804B (en) Adaptive network content distribution system
US9609366B2 (en) Digital television terminal, video file playing method and video file playing system
CN102833293A (en) Method for downloading resources in peer to server and peer (P2SP) network, and client
US10592578B1 (en) Predictive content push-enabled content delivery network
CN102771080A (en) System and methods for efficient media delivery using cache
CN102740159A (en) Media file storage format and adaptive delivery system
CN102469149A (en) Method and device for carrying out self-adaptive adjustment on images by agent
US20120054295A1 (en) Method and apparatus for providing or acquiring the contents of a network resource for a mobile device
Guan et al. Prefcache: Edge cache admission with user preference learning for video content distribution
Haouari et al. QoE-aware resource allocation for crowdsourced live streaming: A machine learning approach
CN103905516B (en) The method and respective server and terminal of sharing data
CN101635831B (en) Method, device and agent system for sharing node data of P2P live video
CN117119052B (en) Data processing method, device, electronic equipment and computer readable storage medium
CN115022660B (en) Parameter configuration method and system for content distribution network
Shen et al. Toward efficient short-video sharing in the YouTube social network
CN115883657A (en) Cloud disk service accelerated scheduling method and system
Gao et al. Measurement study on P2P streaming systems
KR102235622B1 (en) Method and Apparatus for Cooperative Edge Caching in IoT Environment
CN111698539A (en) Method and system for optimizing mobile terminal APP access
Baydeti et al. Scalable Models for Redundant Data Flow Analysis in Online Social Networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant