WO2018049563A1

WO2018049563A1 - Systems and methods for caching

Info

Publication number: WO2018049563A1
Application number: PCT/CN2016/098871
Authority: WO
Inventors: William August HOILES; S M Shahrear TANZIL; Yan DUAN; Vikram Krishnamurthy; Ngoc Dung DAO
Original assignee: Huawei Technologies Co., Ltd.
Priority date: 2016-09-13
Filing date: 2016-09-13
Publication date: 2018-03-22

Abstract

There is provided a network cache management architecture, in accordance with embodiments of the present invention. The network cache management architecture comprises a global network node that is configurable to connect to content service providers, at least one regional network node that is connected to the global network node, and for each regional network node, at least one local network node that is connected to that regional network node. The at least one local network node comprises a local cache that is configurable to store content. Each of said at least one regional network node comprises a regional content database that is configurable to store metadata pertaining to content stored in each local cache of each local network node that is connected to that regional network node, a regional content popularity prediction unit that is configurable to predict a local popularity metric for each content stored at any local cache of each local network node that is connected to the regional network node, and a content placement unit configurable to determine, for each content in all local cache of all local network nodes that are connected to the regional node, in which local cache that content is stored. The global network node comprises a global content database configurable to store metadata pertaining to all content that has been requested by user equipment, and a global content popularity prediction unit that is configurable to predict a regional popularity metric, for each regional network node, for new content that is not stored at any local cache connected to that regional network node.

Description

SYSTEMS AND METHODS FOR CACHING

FIELD OF THE INVENTION

The present invention pertains to the field of network communications, and in particular to systems and methods for caching in communications networks.

BACKGROUND

Network caching generally refers to the storage of commonly accessed data content such as web pages, audio/video content, and images within a communications network. When a user requests a specific piece of data content, for example, it may be delivered from an originating server to the user via the communications network. In some situations, the piece of data content may also be stored within a cache memory of the communications network (i.e., “cached” ) where it may be later retrieved, instead of from the originating server, in the event of a subsequent request for the data content. Accordingly, ‘caching’ certain pieces of data content may provide faster delivery and reduce data traffic within the communications network. However, cache memories have limited storage space in order to provide cost-effectiveness, making efficient management and use of the cache memories a challenging task.

This background information is provided to reveal information believed by the applicant to be of possible relevance to the present invention. No admission is necessarily intended, nor should be construed, that any of the preceding information constitutes prior art against the present invention.

SUMMARY

An object of embodiments of the present invention is to provide an improved method and apparatus for caching in a communications network.

In accordance with embodiments of the present invention, there is provided a network cache management architecture. The network cache management architecture comprises a global network node that is configurable to connect to content service providers, at least one regional network node that is connected to the global network node, and for each regional network node, at least one local network node that is connected to that regional network node. The at least one local network node comprises a local cache that is configurable to store content. Each of said at least one regional network node comprises a regional content database that is configurable to store metadata pertaining to content stored in each local cache of each local network node that is connected to that regional network node, a regional content popularity prediction unit that is configurable to predict a local popularity metric for each content stored at any local cache of each local network node that is connected to the regional network node, and a content placement unit configurable to determine, for each content in all local cache of all local network nodes that are connected to the regional node, in which local cache that content is stored. The global network node comprises a global content database configurable to store metadata pertaining to all content that has been requested by user equipment, and a global content popularity prediction unit that is configurable to predict a regional popularity metric, for each regional network node, for new content that is not stored at any local cache connected to that regional network node.

In accordance with embodiments of the present invention, there is provided a method of storing a new content in cache memory in a network. The network comprises a global network node connected to at least one regional network node and, for each regional network node, at least one local network node connected to that regional network node. Each local network node has cache capabilities for storing content for delivery to requesting user equipment. The method comprises determining, at the global network node, a regional popularity metric of the new content for a regional network node, sending from the global network node to the regional network node the regional popularity metric of the new content, and storing the new content at a local cache physically closest to a user equipment requesting the new content if said popularity metric meets a threshold. A regional popularity metric is based on the new content metadata, historical content metadata for content accessed at any local node connected to the regional node； and past content requests for said historical content at any local network node connected to the regional network node.

In accordance with embodiments of the present invention, there is provided a method of storing a content in cache memory in a network. The network comprises a global network node connected to at least one regional network node, and for each regional network node, at least one local network node connected to that regional network node. Each local network node has cache capabilities for storing content for delivery to requesting user equipment. A first local cache of a first local network node of the at least one local network nodes that is connected to a first regional network node has a stored content. The method comprises a regional cache controller at the first regional network node determining for each local network node connected to the first regional network node a local popularity metric for the stored content, determining for each local network node connected to the first regional network node a local cache infrastructure cost for storing the stored content at a local cache of that local network node, storing the stored content at a second cache located at a second local network node connected to said first regional network node, and sending to the global network node a regional cache report of found past content requests and missed past content requests from user equipment. Each local popularity metric is based on past content requests for the stored content at that local network node. The second local network node is associated with an optimal local cache infrastructure cost for the stored content. The found past content was found to be in any local cache of any local network node connected to that regional network node. The missed past content was not found to be in any local cache of any local network node connected to that regional network node.

BRIEF DESCRIPTION OF THE FIGURES

Further features and advantages of the present invention will become apparent from the following detailed description, taken in combination with the appended drawings, in which:

FIG. 1 is a network diagram illustrating an example of a network cache management architecture, in accordance with an embodiment of the present invention；

FIG. 2 is a network diagram illustrating an example of two caching architectures, in accordance with the network cache management architecture；

FIG. 3 is a flow chart illustrating a method of storing content in cache memory in a network, in accordance with the network cache management architecture；

FIG. 4 is a flow chart illustrating a method of storing content in cache memory in a network, in accordance with the network cache management architecture；

FIG. 5 is a logical architecture diagram illustrating an example of network interfaces and signaling to facilitate operation of a centralized caching system, in accordance with the network cache management architecture；

FIG. 6 shows a schematic of a content centric network；

FIG. 7 shows an example of a segmented last recent used (SLRU) content replacement scheme；

FIG. 8 is a functional schematic of a communications network；

FIG. 9 is a flowchart showing a method for managing a cache memory network；

FIG. 10A is a flowchart showing an ELM method for predicting popularity of a piece of data content；

FIG. 10B illustrates an example of a feature selection algorithm in the form of an exemplary sequential wrapper feature selection algorithm；

FIG. 10C illustrates how the data content can be classified into popular and unpopular categories；

FIGS. 11A-B illustrate respective hit ratio and latency simulations in response to various user requests；

FIG. 12 illustrates an example of pseudo code that may be used to estimate the performance of the ELM method for predicting popularity of a piece of data content；

FIG. 13 illustrates an example of an ELM comprising a hidden-layer feed-forward neural network；

FIG. 14 illustrates an algorithm for handling unbalanced training data and to reduce the effect of outliers；

FIG. 15A is a schematic illustrating an example of an ELM that can perform both prediction of new and published videos；

FIG. 15B shows predicted view counts from the ELM of FIG. 15A on day 1；

FIG. 15C shows predicted view counts from the ELM of FIG. 15A on day 4；

FIG. 16 is a flow chart illustrating an adaptive distributive caching method；

FIG. 17 is a flow chart illustrating a game theoretic learning regret-based algorithm； and

FIG. 18 is a schematic diagram of a hardware component.

It will be noted that throughout the appended drawings, like features are identified by like reference numerals.

DETAILED DESCRIPTION

Communications networks typically include a plurality of servers and nodes that are communicatively interconnected to serve the requests of various users connected to the network via user equipment (UE) . One or more cache memories may be deployed at various locations or nodes of the communications network in order to temporarily and locally store frequently accessed data content, which may then be re-used in a subsequent request without requiring retransmission from the original source location (i.e., “caching” ) . A “cache hit” may be defined as a user request for a specific piece of data content that is found within a cache memory of the communications network, whereas a “cache miss” may be defined as the absence of a requested piece of data content within the communications network, which must then be retrieved from an originating content provider in order to fulfil the request.

Some benefits of efficient caching include the reduction of bandwidth consumption over the communications network, reduction of request processing over the network, and reduction of latency times required for fetching desired user content (i.e., content retrieval) . Cache memories are typically size-limited due to their cost, requiring efficient management of data content within the cache to be effective. Efficient caching becomes more useful as the size of the communications network, and the number of UEs connected to the network increases.

In communications networks having a plurality of cache memories, another problem involves the effective organization of data content amongst the cache memories. For example, a set of data content may be divided into different groups (e.g., videos, news, music, etc. ) , with each specific group of data content being initially cached at a specific geographical location which may reduce distance and transmission latency to an expected user. However, since the popularity of individual or groups of data content may be location and time dependent, the ideal storage location of the data content may also change over time, resulting in an increasing number of cache misses.

Another shortcoming of some communications networks is the failure to utilize and leverage various caching statistics in order to cache, manage, and organize data content within various locations of the communications network. This results in static or semi-static caching operation that fails to adapt to potentially changing and evolving storage conditions of the communications network.

Yet another shortcoming of some communications networks is the failure to communicate or coordinate caching information between various cache memories. This may result in redundant storage of data content, increased latency delivery times, and an increasing number of cache misses.

Embodiments of the present invention are directed towards systems and methods for adaptive and predictive caching in communications networks that at least partially addresses or alleviates one or more of the aforementioned problems.

FIG. 1 shows in a network diagram a network cache management architecture 100, in accordance with an embodiment of the present invention. The centralized network cache management architecture 100 comprises a global network node 110, at least one regional node 120 connected to the global network node 110, and at least one local network node 130 connected to each regional network node 120. The global network node 110 may be located at or near a main network gateway for communication with content service providers via a backbone network. The global network node 110 comprises a global content database 111 and a global content popularity prediction unit 112. The global content database 111 is configurable to store metadata pertaining to all content that has been requested by user equipment (UE) , via any local network in the overall network, from the content service providers. The global content popularity prediction unit 112 is configurable to predict a new content popularity metric for each regional network node 120 for each new content that is not stored at any cache of any local network node 130 connected to that regional network node 120.

The regional network nodes 120may be located at or near local network gateways. For example, a regional network node 120 may be located at or near the local network gateway of a local wireline network. Another regional network node 120 may be located at or near the local network gateway of a local wireless network. A regional network nodes may be located at or near a local network within the overall network. For example, one regional network node may be located at or near a local network gateway of a local wireline network, and another regional network node 120 may be located at or near a local network gateway of a local wireless network. More generally, a plurality of local networks may be connected to a global network node. Each local network may have a regional network node located at or near the gateway of that local network that connects with the global network node.

Regional network nodes comprise a regional content database 121, a regional content popularity prediction unit 122 and a content placement unit 123. The regional content database 121 is configurable to store metadata pertaining to content stored in each local cache of each local network node connected to the regional network node. The regional content popularity prediction unit 122 is configurable to predict a local popularity metric for any content stored at any local cache of any local network nodes connected to the regional network node. The content placement unit 123 is configurable to determine, for each content in all local cache of all local network nodes 130 connected to the regional network node 120, in which cache that content is stored. The local popularity metrics may also be time-dependent as well as local cache-dependent.

The local network nodes 130 may be located at or near local cache locations of the local networks. The local network nodes 130 may be located at or near edge nodes of the local network. For wireless networks, an example of an edge node is a wireless base station. Local network nodes comprise a local cache 131that is configurable to store content. The local network nodes 130 may further comprise caching functionality for the operation of the local caches 131, such as the generation and sending of local cache hit/miss reports to the regional network node 120.

Referring to FIG. 2, a network diagram 200 for two caching architectures is illustrated, in accordance with the network cache management architecture 100. In the diagram, content service providers 211 provide content through a backbone network 205. Backbone routers 207 direct requests to a specified content provider, and direct requested content to the requesting network. In FIG. 2 there are two exemplary networks, network 1 201 and network 2 202. Network 1 201 illustrates a distributed caching architecture, whereas Network 2 202 depicts an example of the network cache management architecture 100 in more detail.

Referring to network 1 201, a first network gateway router 218 facilitates the exchange of content requests from the network 1 201 and requested content in reply to the content requests from the backbone network 205. Network routers 217 direct information to its intended location within the network 1 201. The network 1 201 is depicted as having both a wireline network 1 212 and a mobile wireless network 1 214. Caching servers 210 are distributed at nodes throughout the wireline network 1 212 and the mobile wireless network 1 214. The caching servers 210 being servers with associated cache resources. The caching servers 210 each executing a local content popularity prediction and distributed content placement algorithm. Caching of content at each caching server 210 being primarily determined based upon a local popularity metric as determined by that caching server 210. Communication links between network entities are illustrated by a solid line. The dotted line in network 1 201 indicating a logical link for sharing control information, including content popularity and utility information using the distributed content placement algorithm.

Referring to network 2 202, a second network gateway router 228 facilitates the exchange of content requests from the network 2 202 and requested content in reply to the content requests from the backbone network 205. Local network routers 227 direct information to its intended location within the network 2 202. The local network routers 227 may comprise virtual network routers maintained by a controlling network entity. The network 2 202 is depicted as having both a wireline local network 2 222 and a mobile wireless local network 2 223.

Network 2 202 includes a centralized data analytics server which provides "big data analytics" based upon a global (i.e., local network-wide) popularity metric to provide content popularity prediction at the level of the local network gateway routers 227. The centralized data analytics server is an example of the global network node 110. The centralized data analytics server 110 may comprise a virtualization executing on computing resource located in a network operating center, for instance. Accordingly, the centralized data analytics server 110 is located to identify trending content popularity to predict popularity at the local level in advance of actual realized popularity, based upon local popularity measurements. Thus, the centralized data analytics server 110 may implement a regional content popularity prediction unit 112, and a global content database 111 that stores metadata of content stored within all local networks within the network 2 202. The metadata of the content may come directly from the content service providers 150. Alternatively, the metadata may be extracted by a proxy server that received both the content and metadata description together from the content service providers 150. The proxy server would then send the metadata to the global network node 110 and the content to the requesting local network node 130.

The network 2 202 further includes regional caching servers which provide coordination for caching on the

local networks

222, 223. The regional caching servers are examples of the regional network nodes 120. The regional caching servers 120 execute regional content popularity prediction (i.e., an example of a local content popularity unit 122 using information in a regional repository or regional content database 121) and content placement algorithms (i.e., an example of a content placement unit 123) to support individual network nodes. As with the centralized data analytics server 110, the regional caching servers 120 may comprise virtualizations executing on computing resources located in regional network operating centers, for instance.

The network 2 202 further includes cache resources distributed throughout the network 2 202. The cache resources are examples of the local network nodes 130. The cache resources 130 are managed in part by the regional caching servers 120. The cache resources 130 may further comprise local cache controllers for managing operation of each local cache 131 in the cache resource 130. Alternatively, the cache resources 130 may rely upon the regional caching servers 120 to provide regional cache controllers for management of the operations of each cache resource 130. Alternatively, a combination of local cache control and regional cache control may be employed depending upon network requirements.

In operation, the regional caching servers 120 exchange content-related information with the centralized content placement server 110 and the cache resources 130. Based upon the exchanged content information, the regional caching servers 120 coordinate the caching of content at the level of the cache resources 130 (i.e., the storing of content at a local cache 131 at the local network node 130) . The regional caching servers 120 may be operative to fractionally distribute blocks of content between caching resources 130. The regional network node 120 may further include a regional cache resource 225 which may be used to store content that is not popular enough to be stored at a local cache 131, but popular enough in an aggregation of local network nodes 130 to be stored at the regional network node 120.

Referring to FIG. 3, a flow chart is shown illustrating a method of storing new content in cache memory 131 in a network (300) , in accordance with the network cache management architecture 100. The network may comprise a global network node 110 connected to at least one regional network node 120. Each regional network node 120 is connected to at least one local network node 130. Each local network node 130 has cache capabilities 131 for storing content for delivery to requesting user equipment (UE) . The method (300) comprises determining a regional popularity metric of the new content for a regional network node 120 (310) , sending the regional popularity metric of the new content to the regional network node 120 (320) , and storing the new content at a local cache 131 physically closest to a UE requesting the new content (330) if said popularity metric meets a threshold.

A global network controller at the global network node 110 may implement a regional content popularity prediction unit 112 to determine the regional popularity metric of the new content based on metadata of the new content metadata, historical content metadata in the global content database 111, and past content requests for the historical content at any local network node 130 connected to the regional network node 120. The historical content metadata may pertain to content accessed at any local network node 130 connected to the regional network node 120. Methods for predicting content popularity are described in further detail below.

The method (300) may further comprise the global network node receiving the new content metadata from a content service provider via a backbone network connection. Additionally, and/or alternatively, the method (300) may further comprise deleting other content from said local cache 131 to make room for the new content. The deleted other content may have a lower popularity metric than the new content. The method (300) may also further comprise a controller at the regional network node 120 receiving a local cache report of found past content requests where said content was found in the local cache 131 (i.e., cache hits) and missed past content requests where said content was not found in the local cache 131 (i.e., cache misses) . A regional cache report is updated with the local cache report and the updated regional cache report is sent to the global network node 110 to update the global content database 111. The local cache report of cache hits and cache misses may also be time-dependent where the local cache report is of found past content requests for a specified time period where the content was in the respective local cache and missed past content requests for a specified time period where the content was not in the respective local cache. Past content may have similar metadata attributes to new content. Thus, past content metadata may be used to estimate the popularity of new data for each regional network node. If there is insufficient data for one regional network node, regional content popularity prediction unit 112 may use metadata from other regional nodes where this metadata has similar attributes as the new content metadata.

Referring to FIG. 4, a flow chart is shown illustrating a method of storing content in cache memory in a network (400) , in accordance with the network cache management architecture. The network comprises a global network node 110 connected to at least one regional network node 120. Each regional network node 120 is connected to at least one local network node 130. Each local network node 130 has cache capabilities for storing content for delivery to requesting user equipment (UE) . In this network node hierarchy, before the start of the method (400) , a first local cache 131 of a first network node 130 of the at least one local network nodes 130 connected to a first regional network node 120 has a stored content. The method (400) is performed by a regional cache controller at the first regional network node 120. The method (400) comprises determining a local popularity metric (410) for the stored content for each local network node 130, determining a local cache infrastructure cost (420) for storing the stored content for each cache at each local network node 130, storing the stored content at a second cache (430) that is associated with an optimal local cache infrastructure cost for the stored content, and sending to the global network node 110 a regional cache report of found past content requests and missed past content requests from UE. The local popularity metric is based on past content requests for said stored content at that local network node 130. The found past content in the regional cache reports were found to be in any local cache 131 of any local network node 130 connected to the regional network node 120. The missed past content requests were not found in any local cache 131 of any local network node 130 connected to the regional network node 120.

A regional cache controller at the regional network node 120 may store at the regional content database 121 local cache reports of past content requests for each of any local network node 130 connected to the regional network node 120. Each of said local cache reports may comprise found past content requests where the content was found in that local cache 131 (i.e., cache hits) and missed past content requests where the content was not found in that local cache 131 (i.e., cache misses) . Each of the local cache reports may further comprise found past content requests for a specified time period where said past content was found in that local cache 131 during the specified time (i.e., time-dependent cache hits) and misses past content requests for a specified time period where said past content was not found in that local cache 131 during the specified time (i.e., time-dependent cache misses) .

The method (400) may further comprise deleting other content in said second local cache to make room for the stored content. The deleted other content may have a lower popularity metric than the stored content. The method (400) may further comprise deleting the stored content from the first cache if the local popularity metric for the first cache is not associated with an optimal local cache infrastructure cost. A local cache infrastructure cost may be a ratio of the number of requests for a content at a local network node 130 to the cost for storing said content at the local cache 131 of said local network node 130. A local cache infrastructure cost may be deemed as optimal where the local cache infrastructure cost is below a predetermined level of cost. Alternatively, a local cache infrastructure cost may be deemed as optimal where it is the lowest of a plurality of local cache infrastructure costs.

The local cache infrastructure cost may be determined for each local cache 131 at each local network node 130 for a content. The local cache infrastructure cost may be a function of at least one of a latency cost associate with a local network node 130, an initial file transfer cost associated with that local network node 130, a file storage cost associated with that local network node 130 and a subsequent file transfer cost associated with that local network node 130. The local cache infrastructure cost may be the sum of these costs. The latency cost represents a file transfer latency value between each of the local network nodes 130 connected to the regional network node 120 and that local network node 130. The initial file transfer cost represents an initial file transfer latency value from the global network node 110 to that local network node 130. The file storage cost represents an energy consumption value for storing said stored content at that local network node 130. The subsequent file transfer cost represents an energy consumption value for transferring said file between another local network node 130 and said local network node 130.

Each of the latency cost, initial file transfer cost, file storage cost and subsequent file storage cost may be modified by a weight to establish a priority for that local cache infrastructure cost. The cache infrastructure costs may further be a function of the size of each block of content to be cached. The local cache infrastructure costs are described in further detail below.

The regional network node 120 may further comprise a regional cache 225 for storing content. The regional cache controller may determine a regional popularity metric for the stored content based on an aggregation of local popularity metrics for the local network nodes 130 connected to the regional network node 120 where the local cache 131 at the local network node 130 is not storing the stored content. The regional cache controller may also determine a regional cache infrastructure cost for storing content at the regional cache 225. The stored content may be stored in the regional cache 225 when the regional popularity metric for said stored content is associated with an optimal regional infrastructure cost for the stored content, and when the regional cache infrastructure cost for storing said stored content is associated with an optimal regional cache infrastructure cost. The regional popularity metric may further be based on found past content requests for a specified time period where said stored content was found in the regional cache 225 during the specified time period and missed past content requests where said stored content was not found in the regional cache 225 during the specified time period. I.e., the local popularity metrics may be based on found past content requests for a specified time period where said found past content was found in that local cache 131 during the specified time period (i.e., time-dependent cache hits) and missed past content requests where said missed past content was not found in that local cache during the specified time period (i.e., time-dependent cache misses) .

The optimal regional cache infrastructure cost may equal the optimal local cache infrastructure costs. Alternatively, the optimal regional cache infrastructure cost may be higher (i.e., have a higher threshold) than the optimal local network cache infrastructure costs since the regional cache 225 may be larger than any local cache 131. The regional network node 120 may comprise one of the local network nodes 130. The methods (300, 400) may be combined.

Referring to FIG. 5, a logical architecture diagram 500 illustrating an example of network interfaces and signaling to facilitate operation of a centralized adapting caching system is shown, in accordance with the network cache management architecture 100. In general, the architecture may be considered as divided into three broad domains: content provider domain 515, data caching network domain 510, and the network domain 505.

The content provider domain 515 includes two logical links. The first link involves content 531 being provided to the network domain, and in particular as input to the distributed and hierarchical cache resources 130 available on the local network. The second link is that, for each piece of content 531, a content description 532 (typically provided in metadata from and relating to a content description 532 when the content 531 is first made available) , is provided to the big data analytics analysis (i.e., an example of the global network node 110) maintained within the global content processing 512 portion of the data caching network domain 510. As noted above, the content description 532 (i.e., new content metadata) may be sent to the global network node 110 by the content service providers 150. Alternatively, a proxy server may be given the content 531 together with the content description from the content service providers 150. The proxy server then extracts the content description 532 and sends it to the global network node 110, while sending the content 531 to the local network node 130. The content description 532 is used as input to assist in determining a predicted popularity metric for that content 531. In general, the content description 532 is stored in a big data (or centralized) data content database (i.e., an example of the global content database 111) that matches the content description 532 to the content 531, along with associated content information. The content description 532 and associated content information may be used by a content popularity prediction algorithm (i.e., an example of the regional content popularity prediction unit 112) , such as the “extreme learning machine” algorithm described below, to determine the predicted popularity metric for that content 531. A separate regional popularity metric for that content 531 may be predicted for each regional network node 120 (i.e., for each local network in the overall network) .

The content popularity prediction 112 may take as additional inputs content-associated information, such as regional cache hit/miss reports 537. The regional cache hit/miss reports 537 for similar content (i.e., content sharing one or more metadata attributes) are used to estimate the popularity of the new content 531. For example, if a new content 531 is a movie and its metadata includes its genre and the actors in the new movie, then the popularity of other movies (based on if they have been requested by users as reported in the regional cache hit/miss reports 537) having the same genre and actors may be indicative of the popularity of the new movie. Some or all of the additional inputs may be included in the big data content database 111. In the example of FIG. 5, regional cache hit/miss reports 537 are added to the big data content database 111. The content popularity prediction 112 algorithm outputs to the regional cache controller regional and time-dependent content popularity messages which provide popularity measurements which may be used to determine trends in popularity changes. The regional content popularity unit 112 may output to the regional network node 120 a regional popularity metric for the new content 531 at a specified time. Alternatively, the output may be a set of regional popularity metrics for the new content 531 for several specified times. The output may also be a set of popularity metrics for each local network node 130 for the new content 531 for several specified times.

The regional content processing portion 514 of the data caching network domain 510 provides the coordination for caching on the network. The regional content processing portion 514 includes a regional cache controller which is an example of the regional network node 120. Alternatively, the regional cache controller may be implemented at the regional network node 120. With reference to FIG. 5, the regional cache controller is an example of the regional network node 120. The regional cache controller 120 includes a per-cache and time-dependent content popularity prediction unit (i.e., an example of the local content popularity prediction unit 122) , a content placement optimization (e.g., MILP described below) unit (i.e., an example of the content placement unit 123, the regional content database 121, and a cache selector for content access unit 524.

The per-cache and time-dependent content popularity prediction unit 122 determines a local popularity metric for each content in a local–network node 130 having caching functionality (i.e., local cache 131) . The local popularity metric is determined using information pertaining to local cache hit/miss reports 536 stored in the regional content database 121. The content placement optimization unit 123 determines if there are an optimal local caches 131 in which to store the content 531. Sometimes, the content placement optimization unit 123 may determine that the content 531 is not popular enough to be stored in any local cache 131 in view of the cost. Other times, the content placement unit 123 may determine that the content 531is popular enough to be stored in one or more local caches 131 in view of the cost.

For each local cache 131, the content placement optimization unit 123 obtains a local popularity metric for that content and information pertaining to network topology 542 relating to the local cache 131. The content placement optimization algorithm 123 also takes as input network-based associated content information from a user-plane path selector 557 from a control plane of the network as well as the network management (e.g., Operational Support Systems /Business Support Systems -OSS/BSS) 541 information to obtain information related to network topology, cost of transmission links, energy consumption at the caches and energy consumption for transmission, and other related information 542. For example, for each local cache 131, the user-plane path selector provides to the content placement optimization unit 123 the communication link path to take from one local network node 130 to that local cache 131. This path information in combination with the information obtained from the network management provides the cost information for that local cache 131. The cache selector 524 operates on the user-plane path selector 557 of the control plane of the network.

With the popularity metrics from the prediction unit 122 and the cost information from the mobile network domain 505, the content placement optimization unit 123 determines the local cache infrastructure cost for each local cache 131 for the content 531. The local cache infrastructure costs are compared. If any of the local caches 131 have a cost past a certain level, then those one or more local caches 131 are selected as the cache to store the content. The content placement optimization 123 algorithm outputs the content placement and replacement instructions 534 to the cache resources 131.

The regional content processing portion 514 of the data caching network domain 510 also acts as an aggregator of content-related information for input to the global content processing portion 512. Referring to FIG. 5, the cache resources 130 cache content at the local level (i.e., at local cache 131 of the local network nodes 130) , and report local cache hit/miss reports 536 to the regional cache controller 120. Information pertaining to these reports are stored in the regional content database 121 and aggregated into regional cache hit/miss reports 537 to send to the big data content database 111.

The cache resources 130 also take as input content placement/replacement instructions 534 from the content placement optimization unit 123 which manage the storage and retention of content within each cache resource 130. As indicated above, some portions of cache control may be effected at the local level. Cache control may be managed at the regional level. The cache resources 130 operates on the data-plane of the network to effect the changes for content management and handling.

As is described above, the content placement optimization unit (i.e., local content popularity unit 122) may be a MILP. A mixed integer linear program (MILP) can be formulated for caching popular content in a network, including a content centric network (CCN) , a content delivery network (CDN) , an information centric network (ICN) and other types of networks. One way to cache content based on its associated popularity is to formulate the content placement problem as a MILP where the goal is to reduce content transfer costs in the network or to minimize the latency of delivering requested content to users. The constraints of the MILP are associated with link capacity limits, maximum latency limits for delivering content, content storage size at each server, and cost of transferring files from one server to another. MILPs are non-deterministic polynomial-time (NP) hard to solve. Linear relaxation may be used to solve MILP in polynomial time.

Popularity based caching mechanisms use a prediction of the popularity of content either via historical request statistics or from the meta-level features of the content. Methods to estimate the popularity of content are presented above. A MILP caching mechanism is a centralized caching mechanism based on predicted popularity and formulates the content placement problem as a MILP. The formulation is described in terms of CCN. However, the formulation is also applicable for any other network (e.g., CDN, ICN, etc. ) . Advantageously, for maximizing energy savings, virtualization of content delivery networks (vCDNs) network functions virtualization (NFV) is useful in the case where the video streaming workloads exhibit a difference between prime-time and non-prime-time usage of the infrastructure. The MILP formulation may also be utilized in such setting by solving the MILP at specific time intervals or when prompted by a significant change in the popularity of content.

A latency cost is the latency to fetch contents from other routers. A controller determines the latency cost associated with each local network node 130. The initial file transfer cost is the latency to fetch contents from the gateway server to store contents. The controller determines the initial file transfer cost associated with each local network node 130. For each local network node 130, the initial file transfer cost represents an initial file transfer latency value from the gateway node (i.e., global network node 110) to that local network node 130. The latency cost is zero when a requested content is served by the router itself. If the latency cost to fetch a content from other routers within the network is higher than fetching from the gateway router, then the gateway router will serve the content request. Network latency may be generally expressed as follows:

Latency costs and initial file transfer costs are described further below. Both depend upon network topology and routing strategy.

The file storage cost and the subsequent file transfer cost reflect energy consumption for storage and subsequent file transfer, respectively, of content. The energy consumption metrics translate to the energy consumption for the initial storage of the content and the transportation energy for subsequent file transfers of the content. The controller determines the latency cost associated with each network node. For each network node, the file storage cost represents the energy consumption value for storing a content at that network node. Also for each network node, the subsequent file transfer cost represents the energy consumption value for transferring said file between each of the plurality of network nodes and that node. The energy consumption cost is zero when a requested content is served by the router itself. Energy consumption may be generally expressed as follows:

File storage costs and subsequent file transfer costs are described further below. Both depend upon network topology and routing strategy.

The MILP formulation will be described in further detail in terms of CCN. Let a CCN comprising of one gateway/core router and I edge routers, which are indexed by i ∈ I ＝ {1, 2, …, I } . The set that contains all the edge routers and the gateway router is indexed by

Typically, the gateway router has higher storage than edge routers and has access to the full library of contents via high speed communication link. Each edge router i has a weaker communication link and a limited capacity of N_i (byte) to store contents at the beginning, and is able to make decisions as to which blocks of contents to cache. These blocks may contain popular files of different categories such as music, entertainment, people and blogs, etc., and are indexed by V ＝ {1, 2, …, J } . Let s_j denote the size of the j-th block of contents. Then, considering the storage capacity constraint, the set of feasible caching strategies available to router i can be expressed by the set

where 2^V denotes the power set of set V. The initial file transferring cost via gateway router to router i is denoted by d_gi and the latency between router i and router k is denoted by d_ik (second per byte) . d_gi and d_ik, both depend on network topology and routing strategy. Similarly, energy consumption due to file transportation from router i to router k is denoted by P_transport (ik) which also depends on network topology and routing strategy. P_storage (i) refers to file storage cost in terms of energy consumption. μ^j _k represents the estimated popularity of j block of contents at router k which is described above.

Consider two types of decision variables. Let a^j _i denote the decision variable where a^j _i ＝ 1 when j-th block of contents is cached by router i and a^j _i ＝ 0 otherwise. b^j _ik represents the fraction of content block j is served by router i to router k. b^j _kk ＝ 1 means that router k caches the content block j and serves the request by itself.

A main objective of MILP formulation is to minimize the latency and energy consumption cost, taking into account initial file transfer cost and storage cost in the network while maintaining the storage capacity constraint of each router. The objective function may be formulated as follows.

subject to:

Here, w₁, w₂ are the weight of real-time latency cost and initial file transfer cost in the objective function, respectively. w₃, w₄ are the weight of the storage cost and energy consumption due to file transportation in the objective function, respectively. Constraint (5) ensures that total fraction of j-th block of contents adds to 1. Constraint (6) represents the fact that router I can serve other routers’ requests only when it catches the requested block of contents. Constraint (7) ensures that each router can cache up to its storage capacity. Constraint (9) is introduced so that each router either caches the whole block or the contents or none.

The parameters w₁, w₂, w₃, w₄ in (4) could be mathematically interpreted as weight factors that determine the relative importance of the different factors considered in the formulation. The values of these parameters affect the choice of caching contents. The particular choice of these values will depend on the network specifications. For instance, w₁ is higher than w₂ when the target of caching formulation is to reduce real-time latency. Similarly, w₃ < w₄ when P_storage < P_{transportation}. Since different terms, i.e., energy and latency, are considered together in the MILP formulation, the values of w₁, w₂, w₃ and w₄ are preferably selected in a manner that both terms receive their importance.

Hierarchical caching architectures comprise of routers that are designated into several layers, and storage capacities are designated accordingly. In one example of a hierarchical caching architecture, routers are designated into lower and higher layers where higher layer routers have higher storage capabilities than lower layer routers. Different caching strategies can exist for different architectures. According to an “inclusive cache hierarchy” strategy, the same files can be cached in both upper and lower layers. Alternatively, in an “exclusive hierarchy” strategy, higher layer cache files that are unavailable at lower layers. Neither of these extreme strategies are always optimal as, ideally, hierarchy caching strategy should also factor network topology and file popularities. For example, if category-1 files are popular in both router-1 and router-2, and both routers are directly connected to the same higher layer router with high communication links, then caching category-1 files in the higher layer router may provide more benefits for the network than caching the same files in both router-1 and router-2.

FIG. 6 shows a schematic of a content centric network 600. Routers 1 to 24 are in layer-1 of the cache. These routers directly receive content requests from users and in general have smaller storage and weaker communication links. Routers 25 to 29 are considered to be in layer-2 since each router covers a set of routers of layer-1. Also, these layer-2 routers receive content requests directly from user such as layer-1 routers. Layer-2 routers typically have higher storage and stronger communication links than layer-1 routers. Moreover, layer-2 routers can support a large number of requests and content popularity can be estimated by an extreme learning machine (ELM) algorithm (further described below) . Finally, router-30 is referred to as the gateway router. It is assumed that router-30 has higher storage capacity and can serve all the requests from the main server via high speed Internet. Layer-3 routers receive content requests via layer-2 routers. The connection arrows denote the communication link capabilities between the servers of the routers.

The MILP formulation can be translated into RAN caching architecture. For instance, routers can be replaced by radio networks (RNs) and gateway routers can be considered as gateway/core networks. In particular, routers in layer-1 can be considered as small RNs/small cells with small number of requests. These small cell RNs have lower storage capacity, e.g., cache 2％-5％of the available contents and typically have weaker back-haul communication links. Routers in layer-2 can be interpreted as macro RNs/macro cells. These macro RNs cover a set of small RNs and have higher storage capacity, e.g., cache 5％-10％of the available contents. Typically, macro RNs connect to the core network via strong back-haul links. Users connect to RNs according to communication protocols. Finally, layer-3 routers, i.e., gateway routers, can be interpreted as core networks and have higher storage capacity than macro RNs, e.g., caches 10％-20％of the available contents. Typically, core networks are connected with the main server via high speed optical fiber links.

To deal with newly arrived contents, a dynamic cache replacement scheme with pre-caching may be implemented. Routers initialize their storage according to the pre-caching mechanism, i.e., MILP. However, routers each replace contents within their corresponding cache according to the dynamic cache replacement scheme. To do this, routers cache the popular newly arrived contents along with the popular pre-cached contents. The routers then each promote or demote the contents within their own corresponding caches based upon requests for content received by that router. Contents with minimal requests may be evicted from the cache to make room for newly requested content. For example, FIG. 7 shows an example of a segmented last recent used (SLRU) content replacement scheme 700 having three priority segments within a cache. Level-3, level-2 and level-1 are the highest, middle and lowest priority segments, respectively, within a particular cache. H and T are referred to as head and tail, respectively. When a received request for content results in a cache miss, the router fetches the content to service the request and caches the fetched at the head of the level-1 segment of its cache. When a received request for content results in a cache hit, the requested content is promoted to the head of the next upper priority segment of the cache. In relation to contents that were recently added to the cache, for instance, the contents added to the head of the level-1 segment of the cache would be promoted to the head of the level-2 segment of the cache. For a given cache size, once the cache is full some of the existing content retained within the cache is evicted before new content may be added to the cache. Contents may shift down within a segment as new content is added to the head of that segment. When new content is promoted to the head of that segment, contents that reach the tail of a segment may be evicted from that segment and moved to the head of the next lower priority segment. The content evicted from the level-1 segment is totally evicted from the router.

A centralized caching mechanism may be used for several scenarios. For example, when users are mainly connected to the lower level of cache and lower level of caches are deployed within end user proximity. In a hierarchical caching system, lower level of cache may be referred to as layer-1 cache. In terms of cellular networks, layer-1 cache can be considered as small cell RNs which can support a few number of users and closer to the end users. The MILP formulation may be used when layer-1 cache has limited storage capacity and connects to the upper level of cache via low back-haul communication link.

Another scenario involves the upper level caching system covering a set of lower level caches and a number of directly connected users. In cellular networks, an upper level caching system may be interpreted as a macro RN and core network. In a hierarchical caching system, upper level of cache may be referred to as layer-2 and layer-3 cache. The MILP formulation may be used when upper level caching system has higher storage and back-haul capacity.

Another scenario involves when the popularity of contents at an upper level of the caching system can be estimated by using an ELM algorithm, exploiting the fat that ELM works well for large number of users.

Another scenario involves when the popularity of contents at a lower level of the caching system can be synthetically generated using a constrained linear program which ensures that the estimated content popularities is consistent with upper level popularities. That is, linear programming can be utilized to construct micro level requests that are consistent with the observed macro level requests. In this scenario, the sum of all requests at the micro level nodes is equal to the macro level requests. This allows for the generation of micro level requests to test algorithms for predicting the popularity of content at the micro level (i.e., user level) .

Another scenario involves a combination of the centralized solution and MILP as in MILP caching mechanisms and methods. The centralized solution/MILP determines whether a content should be cached (pre-fetched) and where it should be cached in the micro level. At the macro or micro level, the MILP can be utilized for caching and an auto-regressive model for content popularity prediction. Thus, one content could be cached in multiple places, depending on the available memory of the caches and the contents popularity.

In certain situations, it may be desirable to perform cache management of communications system, for example, using the predicted popularity of various data content. For example, data content that is determined to be relatively more popular, can be stored within proximate cache locations of the communications system, to reduce latency when delivering the data content to a requesting user. Data content determined to be relatively less popular, may be purged from the communications system, or cached at a relatively more distant cache location from potential users. In this way, data content may be efficiently organized within communications system to reduce backhaul load within the network, and to also reduce latency when specific data content is to be delivered to a requesting UE.

As is described above, the global network node 110 estimates an initial popularity metric for new content for a regional network node 120 (i.e., content that is not currently being stored in any local cache 131 of any local network node 130 connected to the regional network node 120) . The regional network node 120 also estimates per-cache and time-dependent content popularity metrics using MILP. Both of these steps involve the use of a popularity prediction algorithm.

Referring to FIG. 8, there is shown a functional schematic of a communications network 800. The communications network 800 includes a caching, computing and forwarding manager (CCFM) 140 communicatively coupled to a cache memory network 160, for managing the first cache 162, second cache 164, and third cache 166 therein. CCFM 140 is further coupled to content providers 211 for retrieving various data content, for example, in response to a user request, or when data content is absent from the cache memory network 160. A CCFM 140 may correspond to network managers 123, 133, the

caches

162, 164, 166 may correspond to the local caches 131, for example.

As shown in FIG. 8, CCFM 140 includes various interface modules 141 including a content provider interface 141a, a virtual network topology manager interface 141b, a CCFM interface 141c, and a cache interface 141d. CCFM 140 further comprises a fresh content (FC) register 142, an FC popularity estimator 143, a most popular content (MPC) register 144, an MPC popularity estimator 145, a least popular content (LPC) register 146, a LPC popularity estimator 147, a content catalogue 148, and a cache performance monitor 149. It is noted that the CCFM may also include other functionalities not illustrated.

The CCFM 140 is communicatively coupled via the cache interface 141d to the cache memory network 160 in order to manage storage of data content within each of the first cache 162, second cache 164, and third cache 166. The CCFM 140 also includes various registers which serve as catalogs or indexes for looking up the location of specific data content cached throughout the communications network 800. Each register may belong to a certain category that indexes data content having a certain criteria or characteristic. The registers may each comprise one or more entries (or pointers, references) each identifying the location of a specific piece of data content within

individual caches

162, 164, 166. The entries within registers may also be sorted, arranged, or organized according to certain criteria, in order to find data content corresponding to desired criteria within each register. In this way, various pieces of data content can be individually associated with different registers through various indexes therein.

Referring again to FIG. 8, the content catalogue 148 is a database of all data content stored in

individual caches

162, 164, 166 (and potentially other cache networks not shown) . Entries in content catalogue 148 may be labelled for example, by content name, content description, content cache location, content popularity, hit count, miss count, and timer. Hit count is a counter indicating the number of times a particular item of data content has been accessed from a certain cache. Miss count is a counter indicating the number of times an item of data content has been requested but not found in the cache. Hit and miss counts can be kept for items of data content and/or for particular content service providers. Timer may indicate the remaining time in which the data content remains valid. Content popularity is a variable indicating the relative popularity of the data content within a geographical area or region of the network. Content cache location identifies where particular data content is stored within

individual caches

162, 164, 166.

The cache performance monitor 149 observes and reports various parameters of

individual caches

162, 164, 166. For example, cache performance monitor 149 may monitor the number of hits or misses for a particular content provider, content category (e.g., movie, music, images) and cache location, where a hit may be defined as a specific request for a particular piece of data content located in a certain geographical location (such as cache network 160, or a specific

individual cache

162, 164, 166) , and a miss defined as a request for a piece of data content item not found in a certain location (such as within the cache network 160) . The cache performance monitor 149 can also monitor storage capacity of various caches or content providers, frequency of content replacement within individual caches, and outgoing traffic volume from individual caches.

FC register 142 is an index for newly arrived data content to the cache network 160. For example, new data content may be sent to CCFM 140 in response to a user request for the new data content. MPC register 144 is an index for data content that is relatively more popular or accessed at a greater rate. LPC register 146 is an index for data content that is relatively less popular or accessed at a lower rate. As will be discussed and illustrated in further detail below, the use of multiple registers for categorizing and indexing data content in

individual caches

162, 164, 166 may improve management and speed in delivering various data content to users.

FC popularity estimator 143, MPC popularity estimator 145, and LPC popularity estimator 147 are functional modules that estimate the popularities of data content referenced by entries in the FC register 142, MPC register 144, and LPC register 146, respectively. For example, popularity may be defined by the number of times a particular item of data content has been accessed, or the frequency at which the data content entry has been used or accessed. Alternatively, the popularity of a data content item may be defined based on the amount of time elapsed since that data content item was last accessed. The FC popularity estimator 143, MPC popularity estimator 145, and LPC popularity estimator 147 may comprise different algorithms or processing functions to provide different treatment of statistics for determining popularity for its respective register. Furthermore,

popularity estimators

143, 145, 147 can be configured to perform spatiotemporal popularity estimation. For example,

popularity estimators

143, 145, 147 can estimate popularity in different geographic locations, network locations, different times of day or week, or the like, or a combination thereof.

Popularity estimators

143, 145, 147 may be implemented in a centralized or distributed manner. For example, a FC popularity estimator 143 can operate at a content service provider server (not shown) . In this case, when new data content is introduced to the content service provider, the FC popularity estimator 143 estimates popularity of the new data content and attaches meta-information to the data content indicative of estimated popularity.

Interface modules 141 are used by CCFM 140 to communicate with other functional components outside of communications system 800 (not shown) . Content Provider interface 141a is communicatively coupled to content provider 211 in order to obtain data content and/or content meta information associated with certain data content (e.g., content type, “time to live” , encoding formats, content type, etc. ) . Virtual network topology manager (VNTM) interface 141b is communicatively coupled to a virtual network topology manager. For example, the virtual network topology manager is configured to deploy network resources to instantiate the various caches and cache controllers at desired network locations. The caches and cache controllers can be deployed using network function virtualization (NFV) , software defined topology (SDT) , and/or software defined protocols (SDP) . For example, to receive information such as assigned resource usage (of physical cache locations, memory sizes, and computing resources) and user-cache associations (for example, radio node-cache connection matrix information) . CCFM interface 141c is communicatively coupled to other CCFM modules (not shown) to exchange various information.

Cache interface 141d is communicatively coupled to

individual caches

162, 164, 166 in order for CCFM 140 to manage store, and update data content within the caches. For example, CCFM 140 may send commands to delete unpopular data content, or to copy certain data content to other individual cache (s) , and to receive memory usage information (i.e., remaining storage capacity) and requests to move content to another cache (for example, if the individual cache is full or reaches a predetermined level) .

Individual caches

162, 164, 166 are cache memories which include cache, computing, and cache forwarding functions.

Individual caches

162, 164, 166 are operatively configured to store, delete or copy data content objects in accordance with commands received from the CCFM 140.

Individual caches

162, 164, 166 can also perform content processing functions (e.g., coding and transcoding) and report maintenance information to the CCFM 140, such as the available capacity left in each cache.

Referring to FIG. 9, there is shown a flowchart of a method 900 for managing a cache memory network, such as of a communications network 800 of FIG. 8. For example, method 900 may be implemented by CCFM 140 of FIG. 8 to manage cache network 160 based on predicted popularity. At step 910, the popularity of a piece of data content is estimated using an extreme learning machine (ELM, described in further detail below) . This may be functionally executed, for example, by any one of

popularity estimators

143, 145, 147 of FIG. 8. The piece of data content may be stored within a particular cache location (such as within cache network 160 of FIG. 8) , or referenced from another location (such as from content provider 211 of FIG. 8) . At step 920, cache management is performed according to the predicted popularity of the data content item. For example, the relative popularity of a piece of data content may determine whether the piece of data content is physically moved (or deleted) between physical cache locations (such as

individual caches

162, 164, 166, for example) , or whether references to pieces of data content should be re-arranged (between

registers

142, 144, 146, for example) . This will be described in further detail below.

Regarding step 910, ELM is a branch of machine learning techniques that use a number of past observations to predict a future characteristic. For example, if given enough data, the ELM can determine a certain correlation to make a prediction from that data. Use of ELM may satisfy universal approximation ability, may be implemented in parallel, and can be trained sequentially. The ELM may comprise a single hidden-layer feed-forward neural network. Also, the ELM may be trained in two steps as follows. First, hidden layer weights are randomly initialized using any continuous probability distribution. For example, a normal distribution may be selected to initialize the weights. Second, the hidden-layer output weights may be computed using a suitable algorithm, such as the Moore-Penrose generalized inverse.

Regarding step 920, cache management may comprise arranging references in

registers

142, 144, 146 which point to various pieces of data content in

cache memories

162, 164, 166. For example, if a piece of data content referenced in FC register 142 is determined to be relatively unpopular, the reference in FC register 142 may be moved to LPC register 146 which indexes relatively unpopular data content. As another example, if a piece of data content referenced in LPC register 146 is determined to be relatively popular, its reference in LPC register 146 may be moved to MPC register 144 which indexes relatively popular data content. In some instances, references within

registers

142, 144, 146 may also be deleted according to the relative popularity of their associated data content. In this way, references within each of FC register 142, MPC register 144, and LPC register 146 may be re-arranged or shuffled according to the relative popularity of their associated data content, so that each

register

142, 144, 146 may be accordingly updated and maintained.

Still regarding step 920, cache management may comprise moving pieces of data content between physical cache locations. As an illustrative example referring to FIG. 8, suppose cache 162 is physically farthest from users, and reserved for storage of relatively unpopular data content, cache 166 is physically closest to users, and reserved for storage of relatively popular data content, and cache 164 is logically between

caches

162, 166. If, for example, a piece of data content item within a cache 164 is determined to be relatively unpopular, it may be moved to cache 162 for storage of relatively unpopular data content. If, alternatively, the same piece of data content is determined to be relatively popular, it may be moved to cache 166 to reduce latency time if subsequently requested by a user. If a piece of data content in cache 162 is determined to be the relatively most unpopular piece of data content within cache network 160, for example, it may be simply deleted from the cache network 160. In this way, data content within

cache memories

162, 164, 166 can be organized in a way that would improve performance and reduce latency upon user requests for various data content.

Referring to FIG. 10A, there is shown an ELM method 1000 for predicting popularity of a piece of data content. Step 910 of FIG. 9 predicting the popularity of data content with an extreme learning machine, for example, may comprise the ELM method 1000. At step 1010, metadata is collected for a piece of data content. Metadata is information which relates to certain features of the data content, for example, titles, thumbnails (for videos or images) , keywords, and so forth. When the data content is a video for example, metadata may also include video quality (for example, standard definition or high definition, video resolution, frame rate) , directors, actors, actresses, production year, awards, producer, or other relevant information associated with the video. When the data content is news, metadata may include time and date, geographical locations, names of reporters, names of photographers, related events, or other relevant information associated with the news. Metadata may be collected for a predetermined period sufficient to compile a dataset relating to the data content. At step 1020, the metadata is computed, for example to provide a numerical parameter. As an example, for title or keyword information, the number of hits for that title/keyword may be computed； for a thumbnail image, a visual characteristic such as contrast or hue may be computed； and for an image, certain features may be converted into a numerical parameter. This step may utilize feature engineering based on domain knowledge of a particular situation. At step 1030, one or more features are selected from the computed metadata. For example, the metadata may relate to features such as title, keyword, or thumbnails. Those features which may be important for popularity prediction may be selected in order to provide more accurate popularity results. View counts and past statistics may also be included. Step 1030 may be performed via a feature selection algorithm which can be used to select meta-level features that may be valuable for prediction. At step 1040, the extreme learning machine is trained using the selected features. Finally, at step 1050, popularity of the piece of data content is predicted using the trained ELM. For example, a binary classifier based on the thresholding the popularity prediction from the extreme learning machine may be used. As an example, consider the application of the feature selection algorithm (pseudo code 1030a in FIG. 10B) and the extreme learning machine for predicting a future view count of a cache memory infrastructure containing approximately 12, 000 videos. FIG. 10C illustrates how the data content can be classified into popular and unpopular categories (i.e., a binary classifier) .

As highlighted above, step 1030 may be carried out via a feature selection algorithm. A feature selection algorithm allows for identification of the relevant metadata features to be used in training the ELM. Reducing the set of potential features to a subset of the most relevant features reduces the computational time and cost for training the ELM, and in many cases results in a more accurate classification by the ELM.

FIG. 10B illustrates an example of a feature selection algorithm in the form of an exemplary sequential wrapper feature selection algorithm 1030a. In step 1056, the “ELM (39) ” refers to:

with β ＝ [β₁, β₂ …, β_L] and h (x) ＝ [h₁ (x) , h₂ (x) , …, h_L (x) ] , wherein θ represents the model parameters of the ELM. In step 1056, “ (40) ” refers to:

with H the hidden-layer output matrix with entries H_kj ＝ h_k (x_j； θ_k) for k ∈ [1, 2, …, L} and j ∈ {1, 2, …, N} , and Y the target output with entries Y ＝ [y₁, y₂, …, y_N] .

Referring to FIGS. 11A-B, there are shown respective hit ratio and latency simulations in response to various user requests, using the cache memory infrastructure management method 900 of FIG. 9, simulated on a communications network. These simulations are discussed further below.

FIG. 12 illustrates an example of pseudo code that may be used to estimate the performance of the ELM method 1000 for predicting popularity of a piece of data content, when applied within the cache memory infrastructure management method 900 of FIG. 9, for example. The cache initialization portion of the pseudo code controls the user defined portion of the cache dedicated to popular content, and that controlled by request statistics methods such as least recently used (LRU) or segmented least recently used (SLRU) algorithms. The parameter d_ {ij} in the content distribution simulator is the associated latency between servers i and j (e.g., nodes or cache memories) . The latency is computed using a named data networking simulator (ndnSIM) simulator package using the following method: First define the link capacities on all the servers. Then a BestRoute (in ndnSIM) forwarding strategy is defined on all the servers to handle the incoming and outgoing packets (e.g., pieces of data content) . To estimate the latency between servers i and j, content requests are generated, for example based on a logged YouTube^TM dataset. An ndnSIM AppDelayTracer is then used to compute the associated delay between sending content from server i to server j. This process is repeated for all servers in the content distribution network. The hit ratio and latency results are computed via the following equations:

As shown in FIGs 11A-B, use of the ELM improves the hit ratio probability, while it also reduces latency time when a user requests a specific piece of data content. In the hit ratio graph shown in FIG 11A, six possible caching strategies are applied (for example, to cache memories 131) , four of which use the predicted popularity values from the extreme learning machine method 1000 above. Note the cache is assumed to contain 3 portions each of which can store a collection of files. First is the (empty LRU) in which the cache begins with no cached content (i.e., an empty cache) . As requests are received the requested data content are cached via the least recently used method. Second is the (empty S3LRU) in which the cache begins with no cached content. As requests are received the requested data content are cached via the segmented least recently used method with 3 segments. Third is the (Top 2 LRU) in which the top two portions of the cache are populated with data content that is predicted to be popular according to the extreme learning machine method 1000 above. As requests are received, the cache content is updated via the least recently used method. Fourth is the (Top2 S3LRU) in which the top two portions of the cache are populated with data content predicted to be popular from the extreme learning machine method 1000 above. As requests are received the cache is updated using the segmented least recently used method with 3 segments. Fifth is the (static LRU) in which the top two portions of the cache are populated with data content predicted to be popular from the extreme learning machine method 1000 above. As requests are received, only the bottom portion of the cache is updated via the least recently used method. Sixth is the (static S2LRU) in which the first portion of the cache is populated with the data content predicted to be popular from the extreme learning machine method 1000 above. As requests are received, the other two portions of the cache are updated via the segmented least recently used method with two segments. As seen, the optimal hit ratio is obtained using (Top2 LRU) . The same setup is used for FIG. 11B, where we see that the optimal setup is also obtained from the (Top2 LRU) .

Referring to FIG. 13, there is shown an example of an ELM comprising a hidden-layer feed-forward neural network, with x∈R_m denoting the feature inputs, h_k (x； θ_k) the transfer function for hidden-layer node k, B_k the output weights, and y∈R the output. The parameters of the hidden-layer may be randomly generated by a distribution, and the subsequent output weights are computed by minimizing the error between the computed output y and the measured output from the dataset D (for example, a dataset D ＝ { (x_i, y_i) } ^N _i＝1 of features x∈R^m and total views y for content i ∈ {1, 2, …, N} ) to construct a model that relates the features x to the total views y based on the dataset D, in predicting popularity of a piece of data content with the use of request statistics (e.g., for user requests for data content) . Each hidden-layer neuron can have a unique transfer function, such as sigmoid, hyperbolic tangent, Gaussian, or any non-linear piecewise function. Accordingly, the ELM in FIG. 13 approximates the functional relationship:

with β ＝ [β_1, β₂ …, β_L]and h (x) ＝ [h₁ (x) , h₂ (x) , …, h_L (x) ] , wherein θ represents the model parameters of the ELM.

The ELM may include tuning the model parameters θ in equation (11) to improve accuracy/efficiency of the ELM. In other words, ELM method 1000 in FIG. 10A may include an optional step 1045 (not shown) of tuning the extreme learning machine model parameters with an adaptive algorithm to reduce error probabilities. The adaptive algorithm may be used to select θ to reduce the probability (denoted as

) of Type-I and Type II errors in predicted popularity of a piece of data content using the ELM. A Type-I error relates to the erroneous prediction of an unpopular piece of data content as popular (e.g., ‘false positive’ ) , while a Type-II error relates to the erroneous prediction of a popular piece of data content as unpopular (e.g., ‘false negative’ ) . When applied to CCFM 140 of FIG. 8, for example, this may result in the shifting of references from the LPC register 146 to the MCP register 144 for a Type-I error, and the shifting of references from the MCP register 144 to the LPC register 146 for a Type-II error.

The accuracy of the ELM as a function of the number of neurons L is stochastic as a result of how each ELM is initialized. The adaptive algorithm, may comprise a simulation based stochastic optimization problem to estimate:

where Ε denotes the expectation with respect to the random variable θ defined in (11) , and

denotes the probability. Since the probability of Type-I and Type-II errors is not known explicitly, (12) is a simulation based stochastic optimization problem. To determine a local minimum value of J (L) , the following simultaneous perturbation stochastic gradient (SPSA) algorithm may be used:

STEP 1: Choose initial ELM parameters L₀ by generating each from the distribution N (0, 1) , and

STEP 2: for iterations k ＝ 1, 2, 3, …, estimate cost J (L_k) in (12) , denoted as J_k (θ _k) , by plugging:

into equation (11) . Then, the following gradient estimate may be computed:

[Corrected under Rule 26, 19.09.2016]

with gradient step size ω > 0. Then the probe vector L_k may be updated with step size μ > 0:

[Corrected under Rule 26, 19.09.2016]

As understood by those skilled in the art, SPSA is a generalization where an explicit formula for the gradient is not available, and needs to be estimated by stochastic simulation. For decreasing step size μ ＝ 1/k, the SPSA algorithm converges with probability one to a local stationary point. For constant step size, it converges weakly (in probability) to a local stationary point.

Feature selection algorithms are geared towards selecting the minimum number of features to achieve a sufficiently accurate prediction. If the feature set is too large, then the generalization error of the predictor will be large. One feature selection algorithm that can be used to improve the performance of an ELM involves handling unbalanced training data. The ELM method 1000 of FIG. 10A may further comprise an optional step for handling unbalanced training data and mitigating outliers in the data set (not shown) . For example, the computed meta data in step 1020 forms a data set formed from past observations, which may be considered ‘unbalanced’ in that it may only include a few requests for popular data content. Additionally, the data set may include outliers which are inconsistent with the remainder of the data set.

FIG. 14 illustrates an algorithm for handling unbalanced training data and to reduce the effect of outliers, which may be applied as the optional step to FIG. 10A for handling unbalanced training data and mitigating outliers in the data set (not shown) . In Step 1, the weights associated with the minority and majority classes are selected. In Step 2, the regularization constant C is selected. The regularization constant C in determines the tradeoff_between the minimization of training errors and the output weights β_ (i.e., maximization of the marginal distance) . Additionally, notice that β is the solution of the minimization of the objective function for constant C. It has been noted that including the regularization constant C increases the stability of the ELM and also enhances the generalization ability. Finally, Steps 3 and 4 estimate which training features are outliers and remove them from the training set.

Another feature selection algorithm, the sequential wrapper feature selection algorithm, relies on computing the optimal features based on the model sensitivity to variations in the features and an estimate of the generalization error of the model. Features are removed sequentially while ensuring the generalization error is sufficiently low. This sequential wrapper feature selection algorithm 1030a is shown in FIG 10B. The algorithm 1030a sequentially removes the least useful features while ensuring that the performance of the ELM is sufficiently high. This is performed by computing the output of the ELM with all features, then computing the output with one of the features held constant at its means (i.e., the null ELM model) . If the output from the ELM and null ELM are similar under some metric, then the feature held constant does not contribute significantly to the predictive performance of the ELM and may be removed. This process may be repeated sequentially until a performance threshold is reached.

FIG. 15A is a schematic illustrating an example of an ELM 1500 that can perform both prediction of new and published videos. The ELM 1500 comprises a feature selection module 1510, a machine learning module (view count) 1520 and a delay module 1530. Meta-data for a video is presented to the feature selection module 1510 which constructs video features

which is composed of subscribers, contrast, overexposure, and previous day viewcount. x_i (t) evolves per day t after the video is posted as new request statistics become available. The predicted viewcount on day t from the ELM is given by v_i (t) .

With no request statistics available, the predicted viewcount from the ELM is very noisy as illustrated in FIG. 15B for v_i (1) . For typical caching algorithms, only the top 10％of content needs to be considered to construct a binary popularity estimator using the ELM by thresholding. As requested statistics arrive, the ELM can be used to make a prediction of the viewcount dynamics as illustrated in FIG. 15C for the viewcount on day 4 (i.e., v_i (4) ) . Therefore, a course estimate of the popularity of videos can be made using the ELM initially. Then, as request statistics arrive, the ELM can be used to provide a high accuracy of the next day population of videos.

FIG. 16 is a flow chart illustrating an adaptive distributive caching method 1600, which may be applied to communications network. At step 1620, cost metrics are determined based on caching statistics of the communications network. For example, the cost metric may comprise the distance or latency for a requested piece of data content to arrive to the requesting user. The cost metric may be the monetary cost to send a piece of data content, or a certain amount of data, over a certain distance of a link. The cost rate of the link (e.g., dollars per kilobyte per kilometer) may be variable depending on the time of day. The caching statistics may comprise cache hits and/or cache misses arising from user requests. For example, a relatively low cost metric for a particular cache hit, may indicate that the associated piece of data content (e.g., the ‘requested’ data content) is in a suitable location, whereas a relatively high cost metric for a particular cache hit, or a cache miss, may indicate that the associated piece of data content is in a non-ideal location that requires additional time/resources for delivery to the requesting user.

Step 1620 may involve inter-cache communication regarding data content stored in individual cache locations, such as directly between individual caches, or through a centralized content catalog to determine the cost metric associated with a particular cache hit or user request. For example, determination of the latency for a particular cache hit, may involve communication between various cache memories to determine where within the communications system the associated data content was stored.

At step 1630, caching is performed for one or more individual caches based on the determined cost metrics in step 1620. For example, if a large number of cache misses are noted for a specific piece of data content, the piece of data content should be fetched and cached within the communications network relatively closer to where the user requests originate. As another example, if the cost metric (such as distance or latency) relating to cache hits for a specific piece of data content is relatively high, the piece of data content can then be moved to a location closer to where one or more user requests originate (e.g., at or closer to the access nodes associated with one or more of the user requests) such that the cost metric would be reduced. The caching strategy may be collaborative (based on statistics/actions for all cache memories of the network) , or individually by each cache memory (considering only the requests, data content associated with an individual cache memory) . The caching strategy may also involve inter-cache communication regarding data content stored in individual cache locations, such as directly between individual caches, or through a centralized content catalog.

Step 1630 may be performed over a pre-determined period according to the type of data content that is generally cached. As an example, for dynamic data content (such as news) the pre-determined period may be shorter (e.g., every 30-60 min) than that compared for static content (movies and videos) having a longer (e.g., daily/weekly) lifespan and thus probability of continued requests. After the pre-determined period for step 1630 has elapsed, the method 1600 may optionally repeat at step 1620, where cost metrics are updated from caching statistics over the previous period. Step 1630 can then again be performed using the updated cost metrics. In this way, caching decisions for the communications system can be made in a dynamic matter reflective of prior caching statistics over the communication system, in a way that reduces or minimizes cost metrics. Previous caching statistics (in step 1620) are used to determine/update cost metric calculations, which are then dynamically leveraged to influence subsequent caching decisions (step 1630) in a way that can reduce cost metrics associated with subsequent user requests.

During network initialization or transitional periods, caching statistics may not yet be available to determine cost metrics (step 1620) . Accordingly, step 1610 of obtaining caching statistics may optionally be first performed. This step may include optionally ‘loading’ the cache memories of the communications network with various data content, and initially caching data content within the communications network based on user requests, according to a pre-determined initial caching method over an initial period. By way of illustration, if 10,000 videos need to be initially distributed (cached) in a communications network having 10 cache memories (not shown) each cache having a capacity of 1000 videos, 10 groups of 1000 videos may be randomly ‘loaded’ into each cache memory. The predetermined initial caching method may then be performed until a sufficient amount of caching statistics generated to advance to the next step.

The initial caching method that may be used in step 1610 may be independently performed by individual cache memories. For example, cache memories may use a least recently used (LRU) caching method to generate caching statistics. Alternatively, cache memories may use a pre-defined caching rule, such as caching any user requested data content at or nearby the cache locations of the access node associated with the user request. As another example, cache memories may pre-fetch certain data content which are likely to be requested by users at a later time. The initial caching method may involve inter-cache communication (for example, of cache catalogs or libraries) , such as determining the existence of a requested piece of data content, and whether caching it would cause redundant storage within the network. The communications network may include a centralized catalog server of the contents of all cache memories, to keep track of cached data content in performance of the initial caching method.

FIG. 17 is a flow chart illustrating a game theoretic learning regret-based algorithm (GTLRBA) 1700, which for example, may be applied as the caching strategy in step 1620 and/or step 1630 of the adaptive distributive caching method 1600 of FIG. 16. GTLRBA 1700 is a class of algorithms that can be applied to individual agents (e.g., cache memories) for optimizing a local utility (e.g., cost metric associated with caching/storage, or delivering data content) in a global system (e.g., communications work) as it converges to sophisticated behaviour towards set of correlated equilibria. For example, each node in a communications network may run GTLRBA 1700 in a recursive loop, to perform independent caching decisions. Although each node may operate GTLRBAs independently, its collective use may ensure that every cache memory picks an action (e.g., whether to cache a particular piece of data content) from a set of correlated equilibria that improves overall network caching performance.

Referring back to FIG. 17, at step 1710, an action is picked, for example by an individual cache memory (or node) . For example, the action may be whether the cache memory should store (e.g., cache) or not store a particular piece of data content in response to a user request. The initial action may be random or predetermined, and then after looping (back to step 1710 from step 1730, as will be explained below) , subsequent actions may be based on the ‘regret’a t a previous time. At step 1720, the utility from the action is measured. Utility for example, may comprise a cost, cost metric, latency, storage, distance, etc., and further depends on the actions taken by other cache memories. For example, if another cache memory performs an action (e.g., it decides to cache a piece of data content) , the utility for the subject cache memory considers this. In this way, actions from all cache memories/nodes affect the utility of other nodes. Finally, at step 1730, the regret is calculated from the action. Regret considers how much worse off the cache memory would be (in terms of utility) had it picked a different action. In other words, it may consider the utility had the cache memory picked a different action. The regret computed in step 1730 may comprise a regret matrix used to determine a running regret average over time. The currently computed regret may then be used to influence the action (step 1710) of the cache memory at a subsequent time.

Where each of the cache memories of a communications network independently apply GTLRBA 1700, because every cache memory picks an action (step 1710) from a strategy with correlated equilibria (e.g., in a way to minimize regret) , overall network operation may converge over time to a behaviour that may improve or optimize the given utility (e.g., cost metrics associated with caching data content) . In this way, application of the GTLRBA 1700 leverages actions made in the past (e.g., caching statistics) to potentially minimize delivery cost metrics for requests made at later times.

Referring to FIG. 18, there is shown a schematic diagram of a hardware component 1300 upon which various functional modules, such as global network node 110 functionalities and regional network node 120 functionalities may be deployed. As shown, the hardware component 1300 includes a processor 1300a, memory 1300b, non-transitory mass storage 1300c, I/O interface 1300d, network interface 1300e, and a transceiver 1300f, all of which are communicatively coupled via bi-directional bus. According to certain embodiments, any or all of the depicted elements may be utilized, or only a subset of the elements. Further, hardware component 1300 may contain multiple instances of certain elements, such as multiple processors, memories, or transceivers. Also, elements of hardware component 1300 may be directly coupled to other elements without the bi-directional bus.

The I/O interface 1300d, and/or transceiver 1300f may be implemented to receive requests from recipient nodes, receive indications and/or data from transmitting nodes, and transmit data to recipient nodes, according to different RAN configurations having wired or wireless links between nodes. The network interface 1300e may be used to communicate with other devices or networks (not shown) in determining forwarding, protocol, and other data delivery decisions to facilitate data transmission between nodes.

The memory 1300b may include any type of non-transitory memory such as static random access memory (SRAM) , dynamic random access memory (DRAM) , synchronous DRAM (SDRAM) , read-only memory (ROM) , any combination of such, or the like. The mass storage element 1300c may include any type of non-transitory storage device, such as a solid state drive, hard disk drive, a magnetic disk drive, an optical disk drive, USB drive, or any computer program product configured to store data and machine executable program code. The memory 1300b or mass storage 1300c may have recorded thereon statements and instructions executable by the processor 1300a for performing the aforementioned functions and steps of the hardware component 1300.

Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

Through the descriptions of the preceding embodiments, the present invention may be implemented by using hardware only or by using software and a necessary universal hardware platform. Based on such understandings, the technical solution of the present invention may be embodied in the form of a software product. The software product may be stored in a non-volatile or non-transitory storage medium, which can be a compact disk read-only memory (CD-ROM) , USB flash disk, or a removable hard disk. The software product includes a number of instructions that enable a computer device (personal computer, server, or network device) to execute the methods provided in the embodiments of the present invention. For example, such an execution may correspond to a simulation of the logical operations as described herein. The software product may additionally or alternatively include number of instructions that enable a computer device to execute operations for configuring or programming a digital logic apparatus in accordance with embodiments of the present invention

lthough the present invention has been described with reference to specific features and embodiments thereof, it is evident that various modifications and combinations can be made thereto without departing from the invention. The specification and drawings are, accordingly, to be regarded simply as an illustration of the invention as defined by the appended claims, and are contemplated to cover any and all modifications, variations, combinations or equivalents that fall within the scope of the present invention.

Claims

A network cache management architecture comprising:

a global network node configurable to connect to content service providers；

at least one regional network node connected to the global network node； and

at least one local network node connected to each regional network node；

wherein said at least one local network node comprises a local cache configurable to store content；

and wherein each of said at least one regional network node comprises:

a regional content database configurable to store metadata pertaining to content stored in each local cache of each local network node connected to that regional network node；

a regional content popularity prediction unit configurable to predict a local popularity metric for each content stored at any local cache of each local network node connected to the regional network node； and

a content placement unit configurable to determine, for each content in all local cache of all local network nodes connected to the regional node, in which local cache that content is stored；

and wherein said global network node comprises:

a global content database configurable to store metadata pertaining to all content that has been requested by UE； and

a global content popularity prediction unit configurable to predict a regional popularity metric, for each regional network node, for new content that is not stored at any local cache connected to that regional network node.
The network cache management architecture as claimed in claim 1, wherein said regional network nodes further comprises a regional cache configurable to store content；

and wherein said global content popularity prediction unit is further configurable to predict a new content popularity metric for new content that is not stored at any regional cache and that is not stored in any local cache.
The network cache management architecture as claimed in claim 1, wherein said local cache nodes are further configured to each prepare local cache reports of past UE content requests, where said requested content was found to be in the local cache of that local cache node and where said requested content was found to not be in the local cache of that local cache node；

and wherein said regional network nodes are each further configured to receive said local cache reports, update said regional cache database based on said local cache reports, and prepare regional cache reports of past UE content requests, where said content was found to be in any local cache of any local network node connected to that regional network node；

and wherein said global network node is further configured to receive said regional cache reports and update said global content database.
A method of storing a new content in cache memory in a network, the network comprising a global network node connected to at least one regional network node, each regional network node connected to at least one local network node, each local network node having cache capabilities for storing content for delivery to requesting user equipment (UE) , the method comprising:

determining, at the global network node, a regional popularity metric of the new content for a regional network node, said regional popularity metric based on:

the new content metadata；

historical content metadata for content accessed at any local node connected to the regional node； and

past content requests for said historical content at any local network node connected to the regional network node；

sending, from the global network node to the regional network node, the regional popularity metric of the new content； and

storing the new content at a local cache physically closest to a UE requesting the new content if said popularity metric meets a threshold.
The method as claimed in claim 4, further comprising the global network node receiving the new content metadata from one of:

a content service provider via a backbone network connection； and

a proxy server that received the new content together with the new content metadata from the content service provider and extracted the new content metadata to send to the global network node.
The method as claimed in claim 4, further comprising deleting other content from said local cache, said other content having a lower popularity metric than said new content.
The method as claimed in claim 4, further comprising the regional network node:

receiving a local cache report of found past content requests where said found past content was found in the local cache and missed past content requests where said missed past content was not found in the local cache；

updating a regional cache report with the local cache report； and

sending the updated regional cache report to the global network node.
The method as claimed in claim 4, wherein a first local cache of a first local network node of the at least one local network nodes connected to a first regional network node has a stored content, the method comprising a regional cache controller at the first regional network node:

determining, for each local network node connected to the first regional network node, a local popularity metric for the stored content, each local popularity metric based on past content requests for said stored content at that local network node；

determining, for each local network node connected to the first regional network node, a local cache infrastructure cost for storing the stored content at a local cache of that local network node； and

storing the stored content at a second cache located at a second local network node connected to said first regional network node, said second local network node associated with an optimal local cache infrastructure cost for the stored content.
A method of storing a content in cache memory in a network, the network comprising a global network node connected to at least one regional network node, each regional network node connected to at least one local network node, each local network node having cache capabilities for storing content for delivery to requesting user equipment (UE) , wherein a first local cache of a first local network node of the at least one local network nodes connected to a first regional network node has a stored content, the method comprising a regional cache controller at the first regional network node:

determining, for each local network node connected to the first regional network node, a local popularity metric for the stored content, each local popularity metric based on past content requests for said stored content at that local network node；

determining, for each local network node connected to the first regional network node, a local cache infrastructure cost for storing the stored content at a local cache of that local network node；

storing the stored content at a second cache located at a second local network node connected to said first regional network node, said second local network node associated with an optimal local cache infrastructure cost for the stored content； and

sending to the global network node a regional cache report of found past content requests and missed past content requests from UE, wherein said found past content was found to be in any local cache of any local network node connected to that regional network node, and wherein said missed past content was not found to be in any local cache of any local network node connected to that regional network node.
The method as claimed in claim 9, wherein determining said local popularity metric comprises the regional cache controller storing, at a repository at the first regional network node, local cache reports of past content requests at each local cache of all local network nodes connected to the first regional network node, wherein each of said local cache reports comprises found past content requests where said found content was found in that local cache and missed past content reports where said missed content was not found in that local cache.
The method as claimed in claim 10, wherein each of said local cache reports further comprise found past content requests for a specified time period where said found content was found in that local cache during the specified time period and missed past content requests for the specified time period where said missed past content was not found in that local cache during the specified time period.
The method as claimed in claim 11, further comprising deleting other content in said second local cache, said other content having a lower popularity metric than said stored content.
The method as claimed in claim 11, further comprising deleting said stored content from the first local cache if the local popularity metric for the first local cache is not associated with the optimal local cache infrastructure cost.
The method as claimed in claim 9, wherein each local cache infrastructure cost of each local cache at each local network node connected to the first regional network node comprises, for each local network node connected to the first regional network node, a function of at least one of:

a latency cost associated with that local network node, the latency cost representing a file transfer latency value between each of the local network nodes connected to the first regional network node and that local network node；

an initial file transfer cost associated with that local network node, the initial file transfer cost representing an initial file transfer latency value from the global network node to that local network node；

a file storage cost associated with that local network node, the file storage cost representing an energy consumption value for storing said stored content at that local network node； and

a subsequent file transfer cost associated with that local network node, the subsequent file transfer cost representing an energy consumption value for transferring a file between another local network node and said local network node.
The method as claimed in claim 14, wherein each of the latency cost, initial file transfer cost, file storage cost and subsequent file storage cost are modified by a weight to establish a priority for that local cache infrastructure cost.
The method as claimed in claim 14, wherein the local cache infrastructure costs are further a function of a size of each block of stored content.
The method as claimed in claim 9, wherein the regional network node further comprises a regional cache for storing said stored content, and further comprising the regional controller:

determining a regional popularity metric for the stored content based on an aggregation of local popularity metrics for the local network nodes connected to the regional network node where the local cache at the local network node is not storing the stored content； and

determining a regional cache infrastructure cost for storing said stored content at the regional cache.
The method as claimed in claim 17, further comprising storing said stored content in the regional cache when the regional cache infrastructure cost for storing said stored content is associated with an optimal regional cache infrastructure cost.
The method as claimed in claim 18, wherein the regional popularity metric is based on found past content requests for a specified time period where said stored content was found in the regional cache during the specified time period and missed past content request where said stored content was not found in the regional cache during the specified time period；

and wherein and the local popularity metrics are based on found past content requests for a specified time period where said stored content was found in that local cache during the specified time period and missed past content request where said missed past content was not found in that local cache during the specified time period.
The method as claimed in claim 9, further comprising:

determining, at the global network node for each regional network node, an initial regional popularity metric of new content that is not stored in any local cache of any local network node connected to that regional network node, said regional popularity metric based on:

the new content metadata；

historical content metadata for content accessed at any local network node connected to that regional network node； and

past content requests for said historical content at any local network node connected to that regional network node；

sending, from the global network node to that regional network node, the regional popularity metric of the new content； and

storing the new content at a local cache physically closest to a UE requesting the new content if said popularity metric meets a threshold.