CN111090653A - Data caching method and device and related products - Google Patents

Data caching method and device and related products

Info

Publication number
CN111090653A
CN111090653A (application CN201911330901.6A)
Authority
CN
China
Prior art keywords
data
graph
cache
node
nodes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911330901.6A
Other languages
Chinese (zh)
Other versions
CN111090653B (en)
Inventor
马忠义
崔朝辉
赵立军
张霞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Neusoft Corp
Original Assignee
Neusoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Neusoft Corp filed Critical Neusoft Corp
Priority to CN201911330901.6A
Publication of CN111090653A
Application granted
Publication of CN111090653B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval of structured data, e.g. relational data
    • G06F 16/22 Indexing; Data structures therefor; Storage structures
    • G06F 16/2228 Indexing structures
    • G06F 16/2237 Vectors, bitmaps or matrices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval of structured data, e.g. relational data
    • G06F 16/24 Querying
    • G06F 16/242 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/20 Information retrieval of structured data, e.g. relational data
    • G06F 16/24 Querying
    • G06F 16/245 Query processing
    • G06F 16/2455 Query execution
    • G06F 16/24552 Database cache management
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application discloses a data caching method, a data caching device and related products. By filling graph data nodes into a graph data cache region before a first application program runs, the time spent traversing data from the graph database while the program runs is saved; if the graph data cache region happens to contain the required data, the data can be obtained directly from the graph data cache region without searching the graph database, which also saves the time needed to warm up the graph database. The graph data nodes filled into the graph data cache region are selected according to the node structure characteristics of the graph database. If certain graph data nodes are known, from those structure characteristics, to cause poor data-exchange performance between the memory medium and the external storage medium, then once those nodes are filled into the graph data cache region they can be obtained from it directly, saving traversal time and improving the speed of data acquisition.

Description

Data caching method and device and related products
Technical Field
The present application relates to the field of data storage, and in particular, to a data caching method and apparatus, and a related product.
Background
With the rapid development of big data industries such as finance, e-commerce and the Internet of Things, the relationships among the data these industries must process grow geometrically along with the data volume. Traditional relational databases struggle to meet practical requirements in scalability, read-write performance and other respects, and non-relational databases (NoSQL, Not Only SQL) such as graph databases have emerged in response.
A graph database is not a database that stores pictures; rather, it stores and queries data using a graph data structure. Currently, a graph database usually resides on an external storage medium such as a hard disk, and when an application program on a device runs, data must be traversed from that graph database in real time. Graph database products, however, have a warm-up mechanism, so the database consumes considerable time in its startup phase, which slows data acquisition while the application program runs.
Disclosure of Invention
In view of the above problems, the present application provides a data caching method, a data caching device and related products, so as to improve the speed of acquiring data when an application program runs.
The embodiment of the application discloses the following technical scheme:
in a first aspect, the present application provides a data caching method, including:
before a first application program runs, determining a graph data cache region corresponding to the first application program in a memory medium of equipment;
and filling graph data nodes into the graph data cache region according to the node structure characteristics of the graph database.
Optionally, filling graph data nodes into the graph data cache region according to the node structure characteristics of the graph database specifically includes:
scoring each graph data node in a graph database according to a preset scoring mode to obtain a score of each graph data node; the preset scoring mode is related to the node structure characteristics of the graph database;
sequencing the graph data nodes according to the scores of the graph data nodes to obtain a node list;
and filling graph data nodes into the graph data cache region according to the node list.
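The three claimed steps (score every node with the preset scoring mode, sort the scores into a node list, fill the cache region in list order) can be sketched as follows. This is a hypothetical illustration; the function and variable names are not from the patent, and the concrete scores are made up.

```python
def fill_cache(nodes, score_fn, cache_capacity, size_of):
    # Step 1: score each graph data node with the preset scoring mode.
    scored = {v: score_fn(v) for v in nodes}
    # Step 2: sort descending by score to obtain the node list.
    node_list = sorted(nodes, key=scored.get, reverse=True)
    # Step 3: fill the cache region in list order while space remains.
    cache, used = [], 0
    for v in node_list:
        if used + size_of(v) > cache_capacity:
            break  # next node would overflow the cache region
        cache.append(v)
        used += size_of(v)
    return cache

# Toy usage: three nodes with assumed precomputed scores, unit node size.
scores = {"A": 3.0, "C": 2.2, "B": 1.4}
print(fill_cache(["A", "B", "C"], scores.get, cache_capacity=2,
                 size_of=lambda v: 1))  # ['A', 'C']
```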
Optionally, the expression of the preset scoring mode is:
score(v)=p*R(v)+q*Q(v);
wherein v represents any graph data node in a node set of the graph database, the node set including all graph data nodes of the graph database; score(v) represents the score of v; R(v) represents the maximum value among the closest distances from each graph data node in the node set to v; Q(v) represents the degree of v; p is a first weight, representing the weight of R(v) in the preset scoring mode; q is a second weight, representing the weight of Q(v) in the preset scoring mode; the expression of R(v) is:
R(v)=max_{t∈S}{P(v,t)};
wherein S represents the node set, and t represents any graph data node in S other than v; P(v,t) represents the closest distance from v to t.
Optionally, the method further comprises: setting the first weight and the second weight according to a history of data queries to the graph database.
Optionally, setting the first weight and the second weight according to a history of data query for the graph database specifically includes:
obtaining the number of depth searches and the number of breadth searches according to the data query history of the graph database;
if the number of depth searches is greater than the number of breadth searches, setting the first weight greater than the second weight; and if the number of depth searches is less than the number of breadth searches, setting the first weight less than the second weight.
Optionally, the graph data cache region includes a fixed cache and a hit cache; the filling of the graph data nodes into the graph data cache region specifically includes: populating the fixed cache with graph data nodes; when the first application is running on the device, the method further comprises:
receiving a data query request, wherein the data query request comprises a data query condition;
judging whether the graph data cache region comprises data meeting the data query condition; if yes, returning the data meeting the data query condition to the initiating end of the data query request; if not, returning the data meeting the data query condition in the graph database to the initiating end of the data query request, and filling the data meeting the data query condition into the hit cache.
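The claimed query flow — check the graph data cache region first, fall back to the graph database on a miss, and remember the miss result in the hit cache — can be sketched like this. The class and method names are illustrative assumptions, not the patent's API.

```python
class GraphDataCacheRegion:
    def __init__(self, fixed_cache):
        self.fixed_cache = dict(fixed_cache)  # pre-filled before the app runs
        self.hit_cache = {}                   # filled on demand from misses

    def query(self, key, graph_db):
        # 1. Try both parts of the graph data cache region first.
        for cache in (self.fixed_cache, self.hit_cache):
            if key in cache:
                return cache[key]
        # 2. Fall back to the graph database, then fill the hit cache.
        value = graph_db[key]
        self.hit_cache[key] = value
        return value

region = GraphDataCacheRegion({"A": "node-A-data"})
db = {"A": "node-A-data", "E": "node-E-data"}
print(region.query("A", db))    # served from the fixed cache
print(region.query("E", db))    # fetched from the DB, then cached
print("E" in region.hit_cache)  # True
```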
Optionally, the filling graph data nodes into the fixed cache specifically includes:
obtaining cache format data corresponding to the graph data node;
judging whether the cache space occupied by the cache format data exceeds the remaining cache space of the fixed cache; if not, filling the cache format data into the fixed cache; and if so, stopping filling cache format data corresponding to any graph data node into the fixed cache.
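A minimal sketch of this fill rule, under the assumption that "cache format data" is a byte string and that the first entry that would overflow the remaining space terminates the whole fill (as the claim states). Names are illustrative.

```python
def fill_fixed_cache(node_list, to_cache_format, capacity):
    """Convert each node to cache-format data and fill the fixed cache
    until the next entry would exceed the remaining cache space."""
    fixed_cache, remaining = [], capacity
    for node in node_list:
        data = to_cache_format(node)
        if len(data) > remaining:  # would exceed remaining cache space
            break                  # per the claim, stop filling entirely
        fixed_cache.append(data)
        remaining -= len(data)
    return fixed_cache

# Toy cache format: each node id repeated four times; 10 bytes of space.
filled = fill_fixed_cache(["A", "B", "C"], lambda n: n.encode() * 4, 10)
print(filled)  # [b'AAAA', b'BBBB'] — 'C' would exceed the space left
```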
In a second aspect, the present application provides a data caching apparatus, including:
a cache region determining module, configured to determine, before a first application program runs, a graph data cache region corresponding to the first application program in a memory medium of the device;
and the data cache module is used for filling graph data nodes into the graph data cache region according to the node structure characteristics of the graph database.
Optionally, the data caching module specifically includes:
the score acquisition unit is used for scoring each graph data node in the graph database according to a preset scoring mode to acquire the score of each graph data node; the preset scoring mode is related to the node structure characteristics of the graph database;
a node list obtaining unit, configured to sort the graph data nodes according to the scores of the graph data nodes, and obtain a node list;
and a data first cache unit, configured to fill graph data nodes into the graph data cache region according to the node list.
Optionally, the above apparatus may further include:
a weight setting module for setting the first weight and the second weight according to a data query history for the graph database.
Optionally, the weight setting module specifically includes:
a search count acquisition unit, configured to obtain the number of depth searches and the number of breadth searches according to the data query history of the graph database;
a setting unit, configured to set the first weight greater than the second weight when the number of depth searches is greater than the number of breadth searches, and to set the first weight less than the second weight when the number of depth searches is less than the number of breadth searches.
Optionally, the graph data cache region includes a fixed cache and a hit cache;
a data cache module, specifically configured to fill graph data nodes into the fixed cache;
when the first application is running on the device, the apparatus may further comprise:
the request receiving module is used for receiving a data query request, wherein the data query request comprises a data query condition;
the judging module is used for judging whether the data meeting the data query condition is included in the graph data cache region;
a data returning module, configured to return the data meeting the data query condition from the graph data cache region to the initiating end of the data query request when the judging module's result is yes, and to return the data meeting the data query condition from the graph database to the initiating end of the data query request when the judging module's result is no;
and the data cache module is further used for filling the data meeting the data query condition into the hit cache when the judgment result of the judgment module is negative.
Optionally, the data caching module specifically includes:
the data acquisition unit is used for acquiring cache format data corresponding to the graph data nodes;
the judging unit is used for judging whether the cache space occupied by the cache format data exceeds the residual cache space of the fixed cache or not;
and a second data cache unit, configured to fill the cache format data into the fixed cache when the judging unit's result is no, and to stop filling cache format data corresponding to any graph data node into the fixed cache when the judging unit's result is yes.
In a third aspect, the present application provides a computer-readable storage medium, in which a computer program is stored, and when the program is executed by a processor, the data caching method provided in the first aspect is implemented.
In a fourth aspect, the present application provides a processor for executing a computer program, where the program executes the data caching method provided in the first aspect.
Compared with the prior art, the method has the following beneficial effects:
in the technical scheme provided by the application, a graph data cache region corresponding to a first application program is determined in a memory medium of the device before the first application program runs, and graph data nodes are filled into the graph data cache region according to the node structure characteristics of a graph database. Because graph data nodes are filled into the graph data cache region in advance, the time spent traversing data from the graph database when the first application program runs is saved; and if the graph data cache region happens to contain the required data, the data can be obtained directly from the cache region without searching the graph database at all, which also saves the time needed to warm up the graph database. In addition, the graph data nodes filled into the graph data cache region are selected according to the node structure characteristics of the graph database. Therefore, if certain graph data nodes are known, from those characteristics, to cause poor data-exchange performance between the memory medium and the external storage medium, then after those nodes are filled into the graph data cache region, they can be obtained from it directly while the first application program runs, with no data exchange between the memory medium and the external storage medium, which further reduces the time spent traversing the graph database and improves the speed of data acquisition.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art that other drawings can be obtained according to the drawings without inventive exercise.
Fig. 1 is a flowchart of a data caching method according to an embodiment of the present application;
FIG. 2 is a diagram illustrating an embodiment of allocating a graph data cache for an application;
FIG. 3 is a schematic diagram of a node structure of a graph database according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a node structure of another graph database according to an embodiment of the present application;
fig. 5 is a flowchart of another data caching method according to an embodiment of the present application;
FIG. 6 is a schematic diagram of a node structure of another graph database according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of a graph data cache area according to an embodiment of the present disclosure;
fig. 8 is a flowchart of another data caching method according to an embodiment of the present application;
fig. 9 is a schematic structural diagram of a data caching apparatus according to an embodiment of the present application;
fig. 10 is a hardware structure diagram of a data caching device according to an embodiment of the present application.
Detailed Description
As mentioned above, when a graph database is currently used to acquire data, the speed of acquiring data while an application program runs is unsatisfactory for various reasons. For example, the data warm-up mechanism of the graph database makes its startup phase time-consuming; and if the graph data is relatively large in scale, traversing it can also slow data acquisition. How to use a graph database effectively while improving data acquisition speed is therefore a problem in urgent need of a solution.
In view of the above problems, the inventors provide a data caching method, apparatus and related products. In the application, before an application program is started, graph data nodes are filled into a graph data cache region allocated to the application program according to the node structure characteristics of a graph database, and the graph data cache region is located in a memory medium of a device for running the application program. Because the graph data nodes are cached in advance, when the application program actually runs, the burden of data exchange between the memory medium and the external memory medium can be reduced, the time for preheating the data is saved, and the data acquisition speed is improved.
In order to make the technical solutions of the present application better understood, the technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only a part of the embodiments of the present application, not all of them. All other embodiments obtained by a person skilled in the art from these embodiments without creative effort shall fall within the protection scope of the present application.
Method embodiment
Referring to fig. 1, this figure is a flowchart of a data caching method according to an embodiment of the present application.
The data caching method illustrated in fig. 1 may be implemented by a device that can run an application program. The device may be a fixed terminal device, such as a desktop computer, or a mobile terminal device, such as a notebook or tablet computer.
The device may be used to run a plurality of applications, such as a first application, a second application, a third application, and the like. The functions performed by these different applications may be similar or different. The data required by different applications at run-time may be provided by the same graph database, for example: when the first application program runs, providing train ticket and air ticket booking services for a user; and when the second application program runs, providing hot spot recommendation service for the user. Furthermore, the data required by different applications at run-time may also be provided by different graph databases, for example: when the first application program runs, providing train ticket and air ticket booking services for a user; and when the third application program runs, providing book recommendation service for the user.
The graph database resides in an external storage medium of the device. In this embodiment, a specific implementation of the method is described by taking a first application as an example, and data cached for the first application is mainly provided by a graph database corresponding to the first application.
As shown in fig. 1, the data caching method provided in this embodiment includes:
step 101: before a first application program runs, determining a graph data cache region M corresponding to the first application program in a memory medium of equipment.
In this embodiment, corresponding graph data cache regions are respectively allocated to multiple application programs that may run on the device in advance. The map data cache regions are located in a memory medium of the device and can be used for caching data provided by the map database. Referring to fig. 2, a diagram of allocating a graph data cache for an application according to an embodiment of the present disclosure is shown. As shown in fig. 2, in this embodiment, a map data buffer M is allocated for a first application, a map data buffer X is allocated for a second application, and a map data buffer W is allocated for a third application.
In order to increase the speed of acquiring data when the first application program runs, in this embodiment, before the first application program runs, the graph data cache region M corresponding to the first application program is first determined, for example, the size and the position of the cache space of the graph data cache region M are determined, so that data is subsequently cached in the graph data cache region M.
Step 102: and filling graph data nodes into the graph data cache region M according to the node structure characteristics of the graph database.
A graph database contains a plurality of graph data nodes, and represents and stores data using graph data nodes, edges, attributes, and the like. Fig. 3 and 4 illustratively provide the node structures of two different graph databases; in both figures, the connecting lines between graph data nodes represent the relationships between the nodes. The node at the tail of an arrow is a parent node, and the node the arrow points to is a child node. A node that is only a parent and never a child is a root node, and a node that is only a child and never a parent is a leaf node.
The greater the distance between the root node and a leaf node, the deeper the graph database; the more child nodes connected to the root node, the wider the graph database. The node structure of the graph database shown in Fig. 3 is depth-dominated; the node structure of the graph database shown in Fig. 4 is breadth-dominated. Of course, some graph databases in practical applications have both depth and breadth features; the node structures of all possible graph databases are not illustrated one by one here.
In the concrete implementation of this step, graph data nodes are selected according to the node structure characteristics of the graph database, and the selected graph data nodes are then filled into the graph data cache region M. For example, if the node structure of the graph database is depth-dominated, graph data nodes at deeper depths may be filled into the graph data cache region M preferentially; if the node structure is breadth-dominated, graph data nodes of wider breadth may be filled preferentially.
In practical applications, data may also be filled with reference to the data query history of the graph database. In one possible implementation, data that is queried frequently and lies at a deeper depth can be identified from the query history; because it is queried often and each query, owing to its depth, burdens data exchange between the external storage medium and the memory medium, such data can be cached into the graph data cache region M preferentially. In another possible implementation, data that is queried frequently and spans a wide breadth can be identified from the query history and preferentially cached into the graph data cache region for the same reason.
In the process of caching data, the cached data is not limited to the graph data node itself, and may also include a relationship between the graph data node and an adjacent node, an attribute of the graph data node, and the like.
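One way to picture such a cache entry — the node itself plus its relationships to adjacent nodes and its attributes — is a small record type. This shape is a hypothetical illustration, not a structure specified by the patent.

```python
from dataclasses import dataclass, field

@dataclass
class CacheEntry:
    node_id: str
    attributes: dict = field(default_factory=dict)  # e.g. labels, properties
    adjacent: list = field(default_factory=list)    # (relation, neighbor id)

entry = CacheEntry("A", {"kind": "root"},
                   [("parent_of", "B"), ("parent_of", "C")])
print(entry.node_id, len(entry.adjacent))  # A 2
```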
The above is the data caching method provided in this embodiment of the application. In the method, before a first application program runs, a graph data cache region corresponding to the first application program is determined in a memory medium of the device, and graph data nodes are filled into the graph data cache region according to the node structure characteristics of a graph database. Because graph data nodes are filled in advance, the time spent traversing data from the graph database when the first application program runs is saved; and if the graph data cache region happens to contain the required data, the data can be obtained directly from the cache region without searching the graph database at all, which also saves the time needed to warm up the graph database. In addition, the graph data nodes filled into the graph data cache region are selected according to the node structure characteristics of the graph database. Therefore, if certain graph data nodes are known, from those characteristics, to cause poor data-exchange performance between the memory medium and the external storage medium, then once those nodes are filled into the graph data cache region, they can be obtained from it directly while the first application program runs, with no data exchange between the memory medium and the external storage medium. This further reduces, or avoids entirely, the time consumed traversing the graph database, and improves the speed of data acquisition.
In the foregoing embodiment, the step 102 has multiple possible implementation manners, and in some possible implementation manners, a priority sequence for caching data to the graph data cache region M is determined in a quantifiable manner, and data caching is implemented according to the sequence, so that the ordering of the cached data is improved. For the sake of understanding, this implementation is described in detail below with reference to the accompanying drawings.
Referring to fig. 5, this figure is a flowchart of another data caching method according to an embodiment of the present application.
As shown in fig. 5, the data caching method provided in this embodiment includes:
step 501: before a first application program runs, determining a graph data cache region M corresponding to the first application program in a memory medium of equipment.
In this embodiment, the implementation manner of step 501 is substantially the same as that of step 101 in the foregoing embodiment, so that reference may be made to the foregoing embodiment for related description of step 501, and details are not repeated here.
Step 502: and scoring each graph data node in the graph database according to a preset scoring mode to obtain the score of each graph data node.
In this embodiment, a method for ranking graph data nodes in a graph database in a manner of scoring each graph data node in the graph database is provided, so as to select and cache the graph data nodes in order. The preset scoring mode applied in this embodiment is related to the node structure characteristics of the graph database. An example scoring method provided by the embodiments of the present application is described below with reference to the drawings and formulas.
Referring to FIG. 6, a schematic diagram of the node structure of another graph database according to an embodiment of the present application is shown. Fig. 6 includes graph data nodes A-H, where node A is the root node; nodes B, C and G are the three child nodes of node A; nodes B and D are child nodes of node C; nodes F and E are child nodes of node D; node F is also a child node of node E; and node H is a child node of node G.
In this embodiment, the node set S of the graph database includes all the graph data nodes of the graph database. With reference to Fig. 6, the node set S = {A, B, C, D, E, F, G, H}. A function R(v) is defined, where R(v) represents the maximum value among the shortest distances from each graph data node in the node set S to any given node v in S; its expression is:
R(v) = max_{t∈S}{P(v, t)}    formula (1)
In formula (1), S represents the node set, and t represents any graph data node in the node set S other than node v; P(v, t) represents the closest distance from node v to node t.
Taking Fig. 6 as an example, node A and node B have two distances: going directly from A to B, the distance is 1; going from A through C to B, the distance is 2. The closest distance between node A and node B is therefore 1. Similarly, the closest distance between node D and node F is 1. In Fig. 6, P(A, B) = 1, P(A, F) = 3 and P(A, H) = 2, and the closest distance between node A and every other node in the node set S does not exceed 3, so according to formula (1), R(A) takes the maximum of these values, i.e., R(A) = 3. It can be understood that, for node A, the greater R(A) is, the deeper the node structure of the graph database.
In addition, this embodiment defines a function Q(v), which represents the degree of any graph data node v in the node set S; its value equals the number of edges associated with node v. Taking Fig. 6 as an example, Q(A) = 3, Q(B) = 2 and Q(D) = 3. It can be understood that, for any node v, the larger Q(v) is, the more relationships node v has with other nodes in the node set S; for node A, the larger Q(A) is, the wider the node structure of the graph database.
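The worked values above can be reproduced with a short breadth-first-search sketch over the Fig. 6 example graph. One assumption is made here: distances are measured ignoring edge direction (the text's distances, e.g. P(A, H) = 2, are consistent with this). Function names mirror the text's notation.

```python
from collections import deque

# Directed edges of the Fig. 6 example graph (arrow: parent -> child).
EDGES = [("A", "B"), ("A", "C"), ("A", "G"), ("C", "B"), ("C", "D"),
         ("D", "F"), ("D", "E"), ("E", "F"), ("G", "H")]

def adjacency(edges):
    """Undirected adjacency: direction is ignored when measuring distance."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def closest_distance(adj, src):
    """BFS shortest-path distances from src: P(src, t) for every reachable t."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def R(adj, v):
    """Formula (1): the largest closest distance from v to any other node."""
    dist = closest_distance(adj, v)
    return max(d for t, d in dist.items() if t != v)

def Q(edges, v):
    """Degree of v: the number of edges associated with v (either endpoint)."""
    return sum(1 for u, w in edges if v in (u, w))

adj = adjacency(EDGES)
print(closest_distance(adj, "A")["B"])              # P(A, B) = 1
print(R(adj, "A"))                                  # R(A) = 3
print(Q(EDGES, "A"), Q(EDGES, "B"), Q(EDGES, "D"))  # 3 2 3
```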
This embodiment further defines a first weight p and a second weight q. The preset scoring mode of the application scores any node v using the functions R(v) and Q(v) in combination, with the first weight p and the second weight q configured for R(v) and Q(v) respectively. The expression of the preset scoring mode is:
score(v) = p*R(v) + q*Q(v)    formula (2)
In formula (2), score(v) represents the score of any graph data node v in the node set S. For the node set S, the first weight p and the second weight q are set uniformly and do not change from one graph data node to another.
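Formula (2) itself is a one-line computation. The weights below (p = 0.6, q = 0.4) are assumed example values, not values from the patent; the R(A) and Q(A) values are the ones derived in the text for Fig. 6.

```python
def score(r_v, q_v, p=0.6, q=0.4):
    """Formula (2): weighted combination of reach R(v) and degree Q(v).
    p and q are illustrative weights, uniform across the node set."""
    return p * r_v + q * q_v

# For node A in Fig. 6: R(A) = 3 and Q(A) = 3, so score(A) = 0.6*3 + 0.4*3.
print(score(3, 3))  # 3.0
```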
It should be noted that, in this embodiment, the first weight p and the second weight q may be set according to the history of data queries to the graph database. In specific implementation, the number of deep searches and the number of breadth searches can be obtained from the data query history of the graph database, and the two numbers are then compared.
As a possible implementation manner, the distance between the nodes involved in each search may be obtained from the data query history of the graph database; the number of searches whose distance is greater than or equal to a first preset distance (for example, 4) is T1, and T1 is taken as the number of deep searches; the number of searches whose distance is less than or equal to a second preset distance (for example, 2) is T2, and T2 is taken as the number of breadth searches. The above is only an example way of determining the number of deep searches and the number of breadth searches; in practical applications, other ways may also be adopted, and the specific implementation is not limited herein. Taking fig. 6 as an example, the search A → C → D → F is determined to be a deep search, while the searches A → G, A → C and A → B are determined to be breadth searches.
If the number of deep searches is greater than the number of breadth searches, the graph database is frequently used to provide data for the deep search service of the first application program, so the first weight p can be set greater than the second weight q, giving the R(v) value of node v a relatively high share in the score calculation; if the number of deep searches is less than the number of breadth searches, indicating that the graph database is frequently used to provide data for the breadth search service of the first application program, the first weight p may be set less than the second weight q, giving the Q(v) value of node v a relatively high share in the score calculation.
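The weight-setting rule above can be sketched as a small policy function. The concrete weight values (1.0 and 2.0) are illustrative assumptions; the embodiment only fixes the ordering p > q or p < q, not the magnitudes.

```python
def set_weights(deep_count, breadth_count, base=1.0, boost=2.0):
    """Return (p, q): the dominant search style gets the larger weight.

    deep_count / breadth_count come from the data query history;
    base and boost are hypothetical magnitudes chosen for illustration.
    """
    if deep_count > breadth_count:
        return boost, base      # p > q: favour R(v), deep node structure
    if deep_count < breadth_count:
        return base, boost      # p < q: favour Q(v), wide node structure
    return base, base           # tie: equal weights, as in the p = q = 1 example
```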
Of course, in practical applications, the first weight p and the second weight q may also be set equal. Taking fig. 6 as an example, if p = q = 1 is set, the scores of the respective graph data nodes can be obtained as follows using formulas (1) and (2):
score(A)=6,score(B)=3,score(C)=5,score(D)=5,score(E)=5,score(F)=5,score(G)=3,score(H)=3。
step 503: and sequencing the graph data nodes according to the scores of the graph data nodes to obtain a node list.
As one possible implementation manner, the scores of the respective graph data nodes in the node set S may be arranged in descending order, forming a node list L(S) in which the graph data nodes are ordered from the largest score to the smallest. Still using the example in fig. 6, when p = q = 1, since score(A) > score(C) = score(D) = score(E) = score(F) > score(B) = score(G) = score(H), the node list L(S) thus obtained is as follows:
L(S)=[A,C,D,E,F,B,G,H]
The node list L(S) includes all the graph data nodes in the node set S. The ordering of the nodes in L(S) can be understood as the priority of caching data into the graph data cache region M: the larger the score of a graph data node, the earlier its position in the node list L(S), and the more preferentially it can be cached into the graph data cache region M compared with graph data nodes with smaller scores.
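The ordering step can be sketched directly from the fig. 6 scores given above. The function below is a hypothetical helper; ties are broken alphabetically only to make the order deterministic, which happens to reproduce the L(S) listed in the text.

```python
def node_list(scores):
    """Sort graph data nodes by descending score; ties fall back to name order."""
    return sorted(scores, key=lambda v: (-scores[v], v))

# Scores of the fig. 6 nodes with p = q = 1, as computed in the text
scores = {"A": 6, "B": 3, "C": 5, "D": 5, "E": 5, "F": 5, "G": 3, "H": 3}
```

With these scores, `node_list(scores)` yields `["A", "C", "D", "E", "F", "B", "G", "H"]`, i.e. the L(S) of the embodiment.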
Step 504: and filling the graph data cache region M with graph data nodes according to the node list.
In a specific implementation of this step, the graph data nodes may be filled into the graph data cache region M according to the order of the graph data nodes in the node list L(S). In addition, a cache format for the graph data nodes may be set in advance, and the data may be cached into the graph data cache region M according to this cache format. An example cache format is provided below:
for any graph data node v in the node list L(S), the node v itself is used as the key, and the set of adjacent nodes of node v is used as the value. Taking node A in fig. 6 as an example, the data in the cache format is <A, [B, C, G]>, where [B, C, G] is the set of adjacent nodes of node A.
In this embodiment, the cache format data corresponding to any graph data node v may be denoted as Item_v; thus, <A, [B, C, G]> can be written as Item_A. In the specific implementation of this step, the cache format data corresponding to each node may be filled into the graph data cache region M according to the node order of the node list L(S).
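A minimal sketch of building the <key, value> cache format data Item_v from an adjacency list, treating the adjacent nodes of v as all nodes connected to v by an edge in either direction. The helper name and the graph encoding are assumptions for illustration.

```python
def cache_item(adj, v):
    """Build Item_v: the node itself as key, its adjacent nodes as value."""
    neighbours = set(adj.get(v, []))                       # out-edge neighbours
    neighbours |= {u for u, ts in adj.items() if v in ts}  # in-edge neighbours
    return (v, sorted(neighbours))                         # sorted for determinism

# Hypothetical fragment of fig. 6: node A points to B, C and G
adj = {"A": ["B", "C", "G"]}
```

Here `cache_item(adj, "A")` gives `("A", ["B", "C", "G"])`, i.e. the <A, [B, C, G]> entry of the text.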
In the above embodiment, the score of each graph data node is obtained by scoring each graph data node in the node set of the graph database, and then a node list used as a data cache selection basis is formed according to the score, so as to finally realize the caching of the data. In this embodiment, a scoring method is used to quantify the priority of data caching, so that the caching process is performed more orderly.
In the foregoing embodiment, the data of the graph database is cached in the graph data cache region corresponding to the first application program before the first application program runs, so that the speed of acquiring data when the first application program runs is increased. In practical applications, there is a possible scenario: when the first application program runs, a data query request does not hit the data cached before, and the data therefore needs to be obtained from the graph database and returned to the initiating end of the data query request. However, after this request, the initiating end may repeatedly initiate the same request, and repeatedly obtaining the data associated with the request from the graph database will obviously also affect the speed of obtaining the data. In response to this problem, the inventor further provides a data caching method, in which the graph data cache region M corresponding to the first application is divided into a fixed cache and a hit cache. The following description is made with reference to the embodiments and the accompanying drawings.
Referring to fig. 7, the figure is a schematic structural diagram of a graph data cache region M according to an embodiment of the present disclosure. As shown in fig. 7, the graph data cache region M specifically includes: a fixed cache M1 and a hit cache M2. The fixed cache M1 is specifically used for caching data before the first application program runs, as described in the foregoing embodiments, while the hit cache M2 is specifically used for caching data while the first application program is running.
In the case where the size of the cache space of the graph data cache region M is fixed, the sum of the cache space of the fixed cache M1 and the cache space of the hit cache M2 is equal to the cache space of the graph data cache region M. In one possible implementation, the cache space sizes of M1 and M2 are each fixed and invariant. In another possible implementation, the cache space sizes of M1 and M2 are both variable, but their sum is still equal to the cache space size of the graph data cache region M. For example, if a part of the cache space of M1 remains unused, this remaining part can be allocated to M2.
Referring to fig. 8, this figure is a flowchart of another data caching method provided in this embodiment of the present application.
As shown in fig. 8, the data caching method includes:
step 801: before the first application program runs, determining a graph data cache region M corresponding to the first application program in a memory medium of the device.
As shown in fig. 7, the graph data cache region M includes a fixed cache M1 and a hit cache M2. This step may specifically include determining the size and location of the cache space of the fixed cache M1 in the graph data cache region M, and determining the size and location of the cache space of the hit cache M2.
Step 802: the fixed cache M1 is populated with graph data nodes according to the node structure characteristics of the graph database.
Before the graph data node is filled, the cache format data of the graph data node may be obtained, and then it is determined whether the cache space occupied by the cache format data exceeds the remaining cache space of the fixed cache M1.
It will be appreciated that, in the process of filling graph data nodes into the fixed cache M1, the remaining cache space of M1 continuously decreases. If the cache space occupied by the next (or next group of) cache format data waiting to be cached exceeds the remaining cache space of the fixed cache M1, that cache format data cannot be filled into the fixed cache M1; if it is less than or equal to the remaining cache space of the fixed cache M1, the fixed cache M1 may continue to be filled with the cache format data. In the former case, a remaining cache space is left in the fixed cache M1, denoted as m1, and m1 can be allocated to the hit cache M2, so that the waste of the remaining cache space m1 is avoided.
The above judgment process can be expressed by formulas (3) and (4), where formula (3) represents the condition under which the cache format data Item_i corresponding to the i-th graph data node in the node list L(S) is cached, and formula (4) represents the condition under which the cache format data Item_{i+1} corresponding to the (i+1)-th graph data node in the node list L(S) is no longer cached:

Size(Item_1) + Size(Item_2) + … + Size(Item_i) ≤ Size(M1)    formula (3)

Size(Item_1) + Size(Item_2) + … + Size(Item_{i+1}) > Size(M1)    formula (4)

In formulas (3) and (4), Size(Item_i) represents the size of the cache space occupied by the cache format data Item_i corresponding to the i-th graph data node in the node list L(S), and Size(M1) represents the size of the cache space of the fixed cache M1; the left-hand sums represent the total cache space occupied by the cache format data corresponding to the first i (or first i+1) graph data nodes in the node list L(S). As expressed by formula (3), when the total cache space occupied by the cache format data corresponding to the first i graph data nodes in L(S) does not exceed the cache space of the fixed cache M1 (that is, the cache format data corresponding to the i-th node fits into the remaining cache space of M1), the cache format data Item_i can be filled into the fixed cache M1. As expressed by formula (4), when the total cache space occupied by the cache format data corresponding to the first i+1 graph data nodes in L(S) exceeds the cache space of the fixed cache M1 (that is, the cache format data corresponding to the (i+1)-th node exceeds the cache space remaining after M1 has been filled with the data of the first i nodes), the operation of filling Item_{i+1} into the fixed cache M1 must be stopped.
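The fill-and-stop behaviour of formulas (3) and (4) can be sketched as a single loop that accumulates Size(Item_j) and stops at the first item that would exceed Size(M1). The item sizes and names are illustrative assumptions; the returned leftover corresponds to the remaining space m1 that may be handed to the hit cache M2.

```python
def fill_fixed_cache(items, capacity):
    """Fill the fixed cache M1 in node-list order.

    items:    [(key, value, size), ...] in the order of the node list L(S)
    capacity: Size(M1), the total cache space of the fixed cache
    """
    cache, used = {}, 0
    for key, value, size in items:
        if used + size > capacity:   # formula (4): Item_{i+1} no longer fits
            break
        cache[key] = value           # formula (3): still within Size(M1)
        used += size
    return cache, capacity - used    # leftover m1, available for the hit cache

# Hypothetical items of size 3 each in a fixed cache of size 7
items = [("A", "vA", 3), ("C", "vC", 3), ("D", "vD", 3)]
```

With these inputs, `fill_fixed_cache(items, 7)` caches A and C, stops before D, and reports a leftover of 1 unit of cache space.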
The data caching operation performed when the first application program runs is described below in conjunction with steps 803 to 806.
Step 803: a data query request is received when a first application is running on a device.
In this embodiment, the data query request is initiated by a user or a tester directly operating the device, or may be initiated by another device in communication connection with the device. When the data query request is initiated by a user or a tester directly operating the device, the user or the tester serves as the initiating end of the request; when the data query request is initiated by another device, that other device serves as the initiating end of the request.
The data query request may include a data query condition f. The data query condition f may be a keyword for searching data, a picture provided for searching data, audio provided for searching data, or the like. The specific form of the data query condition f is not limited herein.
It should be noted that the data query request described in this step refers to any data query request received when the first application runs on the device. If the device has received other data query requests during the previous operation of the first application before receiving the data query request, then it is likely that the hit cache M2 is not empty; if the data query request is the first data query request received by the device during the operation of the first application, the hit cache M2 may be empty.
Step 804: judging whether the image data cache region M comprises data meeting the data query condition f, if so, executing step 805; if not, step 806 is performed.
In practical applications, if the data query request received in step 803 is the first data query request received when the first application program runs, there are two possibilities:
one possibility is: the fixed cache M1 includes data meeting the data query condition f, that is, the data meeting the data query condition f in the fixed cache M1 can be directly used without acquiring data from the graph database of the external storage medium. The operation described in step 805 is performed directly.
Another possibility is: the fixed cache M1 does not include data meeting the data query condition f, so the relevant data needs to be obtained from the graph database of the external storage medium. In the former case, if a data query request including the same data query condition f is subsequently received, the data in the fixed cache M1 can still be used to serve the request initiator; in the latter case, if data query requests including the same data query condition f are subsequently received, repeatedly acquiring the data from the external storage medium will repeatedly affect the speed of acquiring data. For this purpose, step 806 may be executed to cache the first acquired relevant data in the hit cache M2 for subsequent use, so as to increase the data acquisition speed.
If the data query request received at step 803 is not the first data query request received at the runtime of the first application, then there are three possibilities:
the first two of them may be similar to the above two and are not described again;
a third possibility is: the hit cache M2 includes data meeting the data query condition f, that is, the data meeting the data query condition f in the hit cache M2 is directly used without acquiring data from the graph database of the external storage medium. The operation described in step 805 is performed directly.
Step 805: and returning the data meeting the data query condition f to the initiating end of the data query request.
Step 806: and returning the data meeting the data query condition f in the graph database to the initiating end of the data query request, and filling the data meeting the data query condition f into the hit cache M2.
In practical applications, there may be multiple pieces (or multiple sets) of data meeting the data query condition f, and the device may return all of the data meeting the data query condition f to the initiating end through the first application program, or may return only a part of it.
It should be noted that, before returning the data meeting the data query condition f to the originating end of the data query request in step 805 and step 806, the first application program may also process the acquired data, and then return the processed data to the originating end of the request.
For example: the cache format data <A, [B, C]> is data meeting the data query condition f, but the first application program determines that the graph data node A therein is not data required by the initiating end of the data query request; therefore, after processing <A, [B, C]>, the first application program can return only [B, C] to the initiating end.
It will be appreciated that the above is merely one example implementation of processing data for a first application. In practical applications, depending on the difference of the data query request and the difference of the first application program, other manners may be adopted to process the data, so that the specific manner of data processing is not limited herein.
In the above embodiment, steps 803 to 806 can be executed repeatedly, that is, data query requests are continuously received, and the hit cache M2 is updated in the process. The hit cache M2 is filled with more and more data during this caching process, and when the hit cache M2 is full, the updating of the data in the hit cache M2 may be stopped.
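Steps 803 to 806, together with the fixed/hit cache split, can be sketched as a small lookup class. This is an illustrative model, not the patent's implementation: the graph database is stood in for by a dict, and the hit cache simply stops accepting new entries once full, as described above.

```python
class GraphDataCache:
    """Sketch of the two-part cache: fixed cache M1 plus hit cache M2."""

    def __init__(self, fixed, hit_capacity, db):
        self.m1 = dict(fixed)            # filled before the application runs
        self.m2 = {}                     # filled on cache misses at run time
        self.hit_capacity = hit_capacity
        self.db = db                     # stand-in for the graph database

    def query(self, f):
        if f in self.m1:                 # steps 804/805: hit in the fixed cache
            return self.m1[f]
        if f in self.m2:                 # steps 804/805: hit in the hit cache
            return self.m2[f]
        data = self.db[f]                # step 806: fall back to the graph database
        if len(self.m2) < self.hit_capacity:   # stop updating once M2 is full
            self.m2[f] = data
        return data
```

A repeated query for the same condition f then hits M2 instead of going back to the external storage medium, which is the speed-up the embodiment describes.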
In the above embodiment, the hit rate of data queries is increased by continuously updating the data in the hit cache M2. In a big data scenario, the data cached in the graph data cache region M corresponding to the first application program is far less than the data contained in the graph database, and the data read-write speed of the memory medium where M is located is far higher than that of the external storage medium where the graph database is located. The above description of the embodiments shows that some high-probability query requests can perform data query and acquisition directly in the memory medium, and compared with the prior art, the data read-write performance is greatly improved.
In the embodiment of the application, the design goal of the fixed cache M1 is to pre-cache nodes with low performance caused in depth and breadth-first traversal, and avoid a spontaneous data preheating mechanism of a graph database; the design goal of the hit cache M2 is to make use of the statistics of the user query behavior, and for nodes that are not stored in the fixed cache M1 but are likely to appear in the query, the nodes can be put into the cache to improve efficiency, which is a complement and supplement to the fixed cache mechanism.
Based on the data caching method provided by the foregoing embodiment, correspondingly, the present application further provides a data caching device. A specific implementation of the apparatus is described below with reference to the embodiments and the drawings.
Device embodiment
Referring to fig. 9, this figure is a schematic structural diagram of a data caching apparatus according to an embodiment of the present application.
As shown in fig. 9, the apparatus 900 includes:
a buffer determining module 901, configured to determine, before a first application runs, a graph data buffer M corresponding to the first application in a memory medium of a device;
a data caching module 902, configured to fill graph data nodes into the graph data caching area M according to node structural features of the graph database.
By filling graph data nodes into the graph data cache region, the time for traversing data from the graph database can be saved when the first application program runs; and if the graph data cache region happens to contain the required data, the data can be obtained directly from the graph data cache region, without even searching the data in the graph database, which saves the time for preheating the graph database. In addition, in the technical scheme of the present application, the graph data nodes filled into the graph data cache region are selected according to the node structure characteristics of the graph database. Therefore, if some graph data nodes are known, according to the node structure characteristics of the graph database, to be nodes that cause low data exchange performance between the memory medium and the external storage medium, then after these nodes are filled into the graph data cache region, they can be obtained directly from the graph data cache region when the first application program runs, without data exchange between the memory medium and the external storage medium. This further saves the time for traversing the graph database, optimizes the complex graph traversal process into an in-memory computation process, and increases the speed of obtaining data.
In some possible implementation manners, a priority sequence for caching data to the graph data cache region M is determined in a quantifiable manner, and data caching is implemented according to the sequence, so that the ordering of the cached data is improved. The data caching module 902 specifically includes:
the score acquisition unit is used for scoring each graph data node in the graph database according to a preset scoring mode to acquire the score of each graph data node; the preset scoring mode is related to the node structure characteristics of the graph database;
a node list obtaining unit, configured to sort the graph data nodes according to the scores of the graph data nodes, and obtain a node list;
and the first data cache unit is used for filling graph data nodes into the graph data cache region M according to the node list.
Optionally, the expression of the preset scoring mode is:
score(v)=p*R(v)+q*Q(v);
wherein v represents any one graph data node in a set of nodes of the graph database, the set of nodes including all the graph data nodes of the graph database; the score (v) represents the score of v; r (v) represents a maximum value among respective closest distances to v for respective graph data nodes in the node set; said q (v) represents the degree of said v; p is a first weight representing the weight of R (v) in the preset scoring mode; the q is a second weight and represents the weight of the Q (v) in the preset scoring mode; the expression of R (v) is:
R(v) = max_S{P(v, t)};
wherein S represents the set of nodes, and t represents any graph database node other than v in S; the P (v, t) represents the closest distance of the v to the t.
The method comprises the steps of obtaining scores of all graph data nodes by scoring all graph data nodes in a node set of a graph database, forming a node list used as a data cache selection basis according to the scores, and finally caching data. In this embodiment, a scoring method is used to quantify the priority of data caching, so that the caching process is performed more orderly. Optionally, the above apparatus may further include:
a weight setting module for setting the first weight and the second weight according to a data query history for the graph database.
Optionally, the weight setting module specifically includes:
a search frequency acquisition unit for acquiring a depth search frequency and an extent search frequency according to a data query history for the graph database;
a setting unit, configured to set the first weight to be greater than the second weight when the number of deep searches is greater than the number of breadth searches; and the first weight is set to be smaller than the second weight when the depth search times are smaller than the breadth search times.
By caching the data of the graph database into the graph data cache region corresponding to the first application program before the first application program runs, the speed of acquiring data when the first application program runs is improved. In practical applications, there is a possible scenario: when the first application program runs, a data query request does not hit the data cached before, and the data therefore needs to be obtained from the graph database and returned to the initiating end of the data query request. After this request, the initiating end of the data query request may repeatedly initiate the same request, and repeatedly obtaining the data associated with the request from the graph database will obviously also affect the speed of obtaining the data. For this problem, the graph data cache region M corresponding to the first application may be divided into a fixed cache and a hit cache. Optionally, the graph data cache region M includes a fixed cache M1 and a hit cache M2;
a data cache module 902, specifically configured to fill the graph data node into the fixed cache M1;
when the first application is running on the device, the apparatus may further comprise:
the request receiving module is used for receiving a data query request, wherein the data query request comprises a data query condition f;
the judging module is used for judging whether the graph data cache region M includes data meeting the data query condition f;
the data returning module is used for returning the data meeting the data query condition f to the initiating end of the data query request when the judgment result of the judging module is yes; and is also used for returning the data meeting the data query condition f in the graph database to the initiating end of the data query request when the judgment result of the judging module is no;
the data cache module 902 is further configured to, when the determination result of the determining module is negative, fill the data meeting the data query condition f into the hit cache M2.
By continuously updating the data in the hit cache M2, the hit rate of data queries is increased. In a big data scenario, the data cached in the graph data cache region M corresponding to the first application program is far less than the data contained in the graph database, and therefore the performance of the memory medium where M is located is far higher than that of the external storage medium where the graph database is located. The above description of the embodiments shows that some high-probability query requests can perform data query and acquisition directly in the memory medium, and compared with the prior art, the performance is greatly improved.
Optionally, the data caching module 902 specifically includes:
the data acquisition unit is used for acquiring cache format data corresponding to the graph data nodes;
the judging unit is used for judging whether the cache space occupied by the cache format data exceeds the residual cache space of the fixed cache M1;
the second data cache unit is used for filling the cache format data into the fixed cache M1 when the judgment result of the judging unit is no; and for stopping filling the fixed cache M1 with the cache format data corresponding to any graph data node when the judgment result of the judging unit is yes.
Based on the data caching method and device provided by the foregoing embodiments, the embodiments of the present application further provide a computer-readable storage medium.
The storage medium stores a program, and the program implements some or all of the steps of the data caching method protected by the foregoing method embodiments of the present application when executed by the processor.
The storage medium may be a Memory medium, such as a Read-Only Memory (ROM), a Random Access Memory (RAM), or other various media that can store program codes.
Based on the data caching method, device and storage medium provided by the foregoing embodiments, the embodiments of the present application provide a processor. The processor is used for running a program, wherein when the program runs, part or all of the steps of the data caching method protected by the method embodiment are executed.
Based on the storage medium and the processor provided by the foregoing embodiments, the present application also provides a data caching device.
Referring to fig. 10, this figure is a hardware structure diagram of the data caching device provided in this embodiment.
As shown in fig. 10, the data caching apparatus includes: memory 1001, processor 1002, communication bus 1003, and communication interface 1004.
The memory 1001 stores a program that can be executed on the processor, and when the program is executed, part or all of the steps in the data caching method provided in the foregoing method embodiments of the present application are implemented. The memory 1001 may include high speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other volatile solid state storage device.
In this device, the processor 1002 and the memory 1001 transmit signaling, logic instructions, and the like through a communication bus. The device is capable of communicative interaction with other devices via the communication interface 1004.
In the method, before the first application program runs, a graph data cache region corresponding to the first application program in a memory medium of the device is determined, and graph data nodes are filled into the graph data cache region according to the node structure characteristics of the graph database. By filling graph data nodes into the graph data cache region, the time for traversing data from the graph database can be saved when the first application program runs; and if the graph data cache region happens to contain the required data, the data can be obtained directly from the graph data cache region, without even searching the data in the graph database, which saves the time for preheating the graph database. In addition, in the technical scheme of the present application, the graph data nodes filled into the graph data cache region are selected according to the node structure characteristics of the graph database. Therefore, if some graph data nodes are known, according to the node structure characteristics of the graph database, to be nodes that cause low data exchange performance between the memory medium and the external storage medium, then after these nodes are filled into the graph data cache region, they can be obtained directly from the graph data cache region when the first application program runs, without data exchange between the memory medium and the external storage medium, further increasing the speed of obtaining data.
It should be noted that, in the present specification, all the embodiments are described in a progressive manner; the same and similar parts among the embodiments may be referred to each other, and each embodiment focuses on its differences from the other embodiments. In particular, the apparatus and system embodiments are described relatively simply because they are substantially similar to the method embodiments, and reference may be made to the descriptions of the method embodiments for related points. The above-described embodiments of the apparatus and system are merely illustrative; the units described as separate parts may or may not be physically separate, and the parts shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The above description is only one specific embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method for caching data, comprising:
before a first application program runs, determining a graph data cache region corresponding to the first application program in a memory medium of equipment;
and filling graph data nodes into the graph data cache region according to the node structure characteristics of the graph database.
2. The method according to claim 1, wherein the populating the graph data cache area with graph data nodes according to node structure characteristics of a graph database specifically comprises:
scoring each graph data node in a graph database according to a preset scoring mode to obtain a score of each graph data node; the preset scoring mode is related to the node structure characteristics of the graph database;
sorting the graph data nodes according to the scores of the graph data nodes to obtain a node list;
and filling graph data nodes into the graph data cache region according to the node list.
3. The method of claim 2, wherein the preset scoring mode is expressed as:
score(v)=p*R(v)+q*Q(v);
wherein v represents any one graph data node in a node set of the graph database, the node set including all the graph data nodes of the graph database; score(v) represents the score of v; R(v) represents the maximum value among the respective closest distances from v to the other graph data nodes in the node set; Q(v) represents the degree of v; p is a first weight, representing the weight of R(v) in the preset scoring mode; q is a second weight, representing the weight of Q(v) in the preset scoring mode; and the expression of R(v) is:
R(v)=max_{t∈S}{P(v,t)};
wherein S represents the node set, and t represents any graph data node other than v in S; P(v,t) represents the closest distance from v to t.
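The scoring mode of claim 3 can be sketched in a few lines. This is an illustrative reading only, in which the closest distance P(v,t) is taken as the unweighted shortest-path hop count computed by breadth-first search:

```python
from collections import deque

def bfs_distances(adj, src):
    """Closest (unweighted shortest-path) distance from src to every
    reachable node, by breadth-first search."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def score(adj, v, p, q):
    """score(v) = p*R(v) + q*Q(v), where R(v) is the maximum of the
    closest distances from v to the other nodes and Q(v) is v's degree."""
    dist = bfs_distances(adj, v)
    r = max(d for node, d in dist.items() if node != v)   # R(v)
    return p * r + q * len(adj[v])                        # Q(v) = degree

# Path graph a - b - c.
adj = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(score(adj, "b", p=1.0, q=1.0))  # 3.0: R(b)=1, Q(b)=2
print(score(adj, "a", p=1.0, q=1.0))  # 3.0: R(a)=2, Q(a)=1
```

A node with a small R(v) (close to everything) and a large degree Q(v) is cheap to reach and heavily connected, so preloading it avoids the most expensive traversals.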
4. The method of claim 3, further comprising: setting the first weight and the second weight according to a history of data queries to the graph database.
5. The method according to claim 4, wherein said setting the first weight and the second weight according to a history of data queries to the graph database comprises:
obtaining the number of depth searches and the number of breadth searches according to the data query history of the graph database;
if the depth search times are greater than the breadth search times, setting the first weight to be greater than the second weight; and if the depth search times are less than the breadth search times, setting the first weight to be less than the second weight.
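A minimal sketch of the weight-setting rule in claims 4 and 5 follows; the concrete weight values (0.7/0.3) and the tie-breaking behaviour are assumptions the claims leave open:

```python
def set_weights(depth_searches, breadth_searches, strong=0.7, weak=0.3):
    """Choose (p, q) from the query history: depth-heavy workloads favour
    the distance term R(v), breadth-heavy workloads favour the degree Q(v)."""
    if depth_searches > breadth_searches:
        return strong, weak        # first weight p > second weight q
    if depth_searches < breadth_searches:
        return weak, strong        # p < q
    return 0.5, 0.5                # tie: assumed equal weighting

print(set_weights(10, 4))  # (0.7, 0.3)
print(set_weights(2, 9))   # (0.3, 0.7)
```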
6. The method of claim 1, wherein the graph data cache region comprises a fixed cache and a hit cache; the filling of graph data nodes into the graph data cache region specifically comprises: filling the fixed cache with graph data nodes; and when the first application program is running on the device, the method further comprises:
receiving a data query request, wherein the data query request comprises a data query condition;
judging whether the graph data cache region comprises data meeting the data query condition; if yes, returning the data meeting the data query condition to the initiating end of the data query request; if not, returning the data meeting the data query condition in the graph database to the initiating end of the data query request, and filling the data meeting the data query condition into the hit cache.
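The lookup path of claim 6 could look like the following sketch, with the graph database stood in by a plain dictionary (all names here are hypothetical):

```python
def query(fixed_cache, hit_cache, graph_db, predicate):
    """Serve a query from the cache region (fixed cache plus hit cache);
    on a miss, fall back to the graph database and remember the result
    in the hit cache so a repeated query is answered from memory."""
    for data in list(fixed_cache.values()) + list(hit_cache.values()):
        if predicate(data):
            return data                      # cache hit
    for node_id, data in graph_db.items():   # cache miss
        if predicate(data):
            hit_cache[node_id] = data        # fill the hit cache
            return data
    return None

db = {"n1": {"label": "person"}, "n2": {"label": "city"}}
fixed, hits = {"n1": db["n1"]}, {}
query(fixed, hits, db, lambda d: d["label"] == "city")
print(hits)  # {'n2': {'label': 'city'}}
```

The fixed cache holds the nodes chosen up front by the structural scoring, while the hit cache grows from the actual query workload.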
7. The method according to claim 6, wherein the populating the graph data node to the fixed cache specifically includes:
obtaining cache format data corresponding to the graph data node;
judging whether the cache space occupied by the cache format data exceeds the remaining cache space of the fixed cache; if not, filling the cache format data into the fixed cache; and if so, stopping filling cache format data corresponding to graph data nodes into the fixed cache.
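Claim 7's capacity check can be sketched as follows; measuring the cache-format data by its byte length is an assumption made for illustration:

```python
def fill_fixed_cache(fixed_cache, ranked_nodes, capacity_bytes):
    """Fill the fixed cache in score order, stopping as soon as the next
    node's cache-format data would exceed the remaining cache space."""
    used = sum(len(v) for v in fixed_cache.values())
    for node_id, blob in ranked_nodes:
        if used + len(blob) > capacity_bytes:
            break                   # remaining space exceeded: stop filling
        fixed_cache[node_id] = blob
        used += len(blob)
    return fixed_cache

cache = {}
fill_fixed_cache(cache, [("a", b"1234"), ("b", b"5678"), ("c", b"90")], 6)
print(sorted(cache))  # ['a']: b's 4 bytes would overflow the 6-byte budget
```

Note that filling stops at the first node that does not fit, as the claim specifies, rather than skipping it and trying smaller nodes.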
8. A data caching apparatus, comprising:
a cache region determining module, configured to determine, before a first application program runs, a graph data cache region corresponding to the first application program in a memory medium of the device;
and a data caching module, configured to fill graph data nodes into the graph data cache region according to node structure characteristics of a graph database.
9. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the data caching method as claimed in any one of claims 1 to 7.
10. A processor configured to run a computer program which, when run, performs the data caching method as claimed in any one of claims 1 to 7.
CN201911330901.6A 2019-12-20 2019-12-20 Data caching method and device and related products Active CN111090653B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911330901.6A CN111090653B (en) 2019-12-20 2019-12-20 Data caching method and device and related products


Publications (2)

Publication Number Publication Date
CN111090653A true CN111090653A (en) 2020-05-01
CN111090653B CN111090653B (en) 2023-12-15

Family

ID=70395210

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911330901.6A Active CN111090653B (en) 2019-12-20 2019-12-20 Data caching method and device and related products

Country Status (1)

Country Link
CN (1) CN111090653B (en)


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120317121A1 (en) * 2011-06-09 2012-12-13 Salesforce.Com, Inc. Methods and systems for using distributed memory and set operations to process social networks
CN104899156A (en) * 2015-05-07 2015-09-09 中国科学院信息工程研究所 Large-scale social network service-oriented graph data storage and query method
US20170329541A1 (en) * 2016-05-11 2017-11-16 Hitachi, Ltd. Data storage system, process and computer program for such data storage system for reducing read and write amplifications
US20180173755A1 (en) * 2016-12-16 2018-06-21 Futurewei Technologies, Inc. Predicting reference frequency/urgency for table pre-loads in large scale data management system using graph community detection
US20180239796A1 (en) * 2017-02-21 2018-08-23 Linkedin Corporation Multi-tenant distribution of graph database caches
CN110019361A (en) * 2017-10-30 2019-07-16 北京国双科技有限公司 A kind of caching method and device of data
CN109670089A (en) * 2018-12-29 2019-04-23 颖投信息科技(上海)有限公司 Knowledge mapping system and its figure server

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Zuo Yao et al.: "A preloading cache strategy for graph data", vol. 42, no. 5, pages 85-92 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111858612A (en) * 2020-07-28 2020-10-30 平安科技(深圳)有限公司 Data accelerated access method and device based on graph database and storage medium
WO2021139230A1 (en) * 2020-07-28 2021-07-15 平安科技(深圳)有限公司 Method and apparatus for accelerated data access based on graph database
CN111858612B (en) * 2020-07-28 2023-04-18 平安科技(深圳)有限公司 Data accelerated access method and device based on graph database and storage medium
WO2023028780A1 (en) * 2021-08-30 2023-03-09 Siemens Aktiengesellschaft Method, apparatus and system for graph data caching

Also Published As

Publication number Publication date
CN111090653B (en) 2023-12-15


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant