CN113448971A - Data updating method based on distributed system, computing node and storage medium - Google Patents

Publication number
CN113448971A
Authority
CN
China
Prior art keywords
cache object
computing node
updating
characteristic information
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010213176.0A
Other languages
Chinese (zh)
Inventor
李海波
何兰州
侯爽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing ByteDance Network Technology Co Ltd
Original Assignee
Beijing ByteDance Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing ByteDance Network Technology Co Ltd filed Critical Beijing ByteDance Network Technology Co Ltd
Priority to CN202010213176.0A priority Critical patent/CN113448971A/en
Publication of CN113448971A publication Critical patent/CN113448971A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23Updating
    • G06F16/2365Ensuring data consistency and integrity
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/23Updating
    • G06F16/2308Concurrency control
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2455Query execution
    • G06F16/24552Database cache management
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Abstract

The application relates to a data updating method based on a distributed system, a computing node and a storage medium. The method comprises the following steps: a first computing node in a distributed system acquires first update characteristic information of a target cache object, wherein the first update characteristic information indicates that a client has performed an update operation on the target cache object locally cached by a second computing node in the distributed system; the first computing node detects, according to the first update characteristic information, whether a first cache object matching the data identifier of the target cache object exists among its local cache objects; and if such a first cache object is locally cached in the first computing node, the first computing node performs a data update on the first cache object. This avoids the heavy network burden caused by existing full cache data updates, reduces the data update cost, and improves the update efficiency.

Description

Data updating method based on distributed system, computing node and storage medium
Technical Field
The present application relates to data processing technologies, and in particular, to a data updating method, a computing node, and a storage medium based on a distributed system.
Background
In a distributed computing cluster, hot spot data (i.e., frequently accessed objects) is cached as cache objects in the local memory of each computing node, so that repeated reads from an external storage device do not consume a large amount of network resources or increase latency. However, when a computing node updates a local cache object, an efficient scheme is needed to synchronously update the associated cache objects on the other computing nodes so as to achieve data consistency. The existing method for achieving data consistency is as follows:
Each computing node periodically polls the external storage device for the original data corresponding to its cache objects and updates its local cache. That is, the method periodically updates all cache objects in full: even cache objects whose original data has not changed in the external storage device are refreshed in every period.
Disclosure of Invention
In order to solve the problems, the invention provides a data updating method, a computing node and a storage medium based on a distributed system, which only needs to update cache objects with updating operation without periodically updating all cache objects in full, thereby solving the problem of heavy network burden caused by the updating of the existing full cache data, reducing the data updating cost and improving the data updating efficiency.
In a first aspect, an embodiment of the present application provides a data updating method based on a distributed system, including:
a first computing node in a distributed system acquires first update characteristic information of a target cache object, wherein the first update characteristic information indicates that a client has performed an update operation on the target cache object locally cached by a second computing node in the distributed system, and the first update characteristic information at least indicates the data identifier of the target cache object;
the first computing node detects, according to the first update characteristic information, whether a first cache object matching the data identifier of the target cache object exists among its local cache objects;
and if a first cache object matching the data identifier of the target cache object is locally cached in the first computing node, the first computing node performs a data update on the first cache object, so that the updated first cache object in the first computing node matches the updated target cache object in the second computing node.
In this way, since the first computing node can obtain the first update characteristic information of the target cache object having the update operation in the second computing node, the first computing node only needs to update the cache object having the update operation in synchronization, that is, updates the first cache object, and does not need to update all cache objects in the first computing node in a full amount, thereby solving the problem of heavy network load caused by the existing full-amount cache data update, reducing the data update cost, and improving the data update efficiency.
In addition, the cache object in the first computing node in the embodiment of the application does not need to set an expiration time, that is, the first computing node does not delete the cache object that has failed regularly, so that the problem of query failure caused by deletion of the cache object after failure is avoided, a foundation is laid for improving user experience, and further, the problems of increased burden of the external storage device and increased query delay caused by accessing the external storage device (storing original data of the cache object) due to query failure are also avoided.
In a specific embodiment, the obtaining, by a first computing node in the distributed system, first update characteristic information of a target cache object includes:
the first computing node receives the first update characteristic information sent by a local message queue or by a message queue of an external device, wherein the first update characteristic information is sent to the message queue after a client performs an update operation on the target cache object locally cached by the second computing node; or,
the first computing node queries the first update characteristic information from a local shared update component or from a shared update component of an external device, wherein the first update characteristic information is sent to the shared update component after a client performs an update operation on the target cache object locally cached by the second computing node.
Here, the present embodiment provides two ways of obtaining the first update characteristic information. In the first way, the first update characteristic information of the target cache object is received through a message queue. The first computing node does not need to actively fetch the first update characteristic information of the target cache object on which the update operation occurred; it only needs to receive it passively. Data consistency is thus achieved efficiently while the communication cost of the first computing node is reduced, and when update operations are infrequent the communication cost is much lower than in the existing approach. In the second way, the first update characteristic information of the target cache object is obtained through a shared update component, and the first computing node actively queries for it. This lays a foundation for efficiently achieving data consistency; the scheme is broadly applicable and, especially when the first computing node holds many local cache objects, can greatly reduce the update cost and improve the update efficiency.
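The two acquisition modes can be contrasted with a minimal sketch. It is illustrative only: the in-process `queue.Queue` stands in for a real message queue, a plain dictionary stands in for the shared update component, and the record layout (`id`, `update_time`) and both function names are assumptions rather than anything specified by the application.

```python
import queue

def receive_passively(mq: queue.Queue) -> list[dict]:
    """Mode one (passive): collect the update records the message queue has delivered."""
    updates = []
    while True:
        try:
            updates.append(mq.get_nowait())
        except queue.Empty:
            return updates

def query_actively(shared_component: dict[str, float]) -> list[dict]:
    """Mode two (active): ask the shared update component for its recorded updates."""
    return [{"id": data_id, "update_time": ts}
            for data_id, ts in shared_component.items()]

# Hypothetical state after a client updated object "X" on the second computing node.
mq = queue.Queue()
mq.put({"id": "X"})                      # the record carries at least the data identifier
shared_component = {"X": 1700000000.0}   # data identifier -> update time value

passive_updates = receive_passively(mq)
active_updates = query_actively(shared_component)
```

In the passive mode the node performs no work until a record arrives; in the active mode the node decides when to query, for example on a timer.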
In a specific embodiment, the detecting, by the first computing node, whether a first cache object matching the data identifier of the target cache object exists in the local cache objects according to the first update characteristic information includes:
the first computing node traverses the local cache object and compares the data identifier of the local cache object of the first computing node with the data identifier of the target cache object indicated by the first updating characteristic information;
and after traversing is completed, the first computing node determines whether a first cache object matched with the data identifier of the target cache object is locally cached in the first computing node or not based on the comparison result.
Here, in this embodiment, it is determined whether the first cache object matched with the data identifier of the target cache object is cached locally in the first computing node in a manner of traversing all cache objects, that is, it is determined whether the cache object to be updated is locally present in the first computing node in a manner of traversing all local cache objects, so that a basis is laid for efficiently implementing data consistency, and the scheme is strong in universality, and especially when there are many local cache objects in the first computing node, the update cost can be greatly reduced, and the update efficiency is improved.
In an embodiment, the updating the data of the first cache object includes:
after the first computing node determines, based on the comparison result, that a first cache object matching the data identifier of the target cache object is locally cached in the first computing node, the first computing node compares the time characteristic value corresponding to the first cache object with the update time value contained in the first update characteristic information;
and when the first computing node determines that the time characteristic value of the first cache object is smaller than the update time value of the target cache object, the first computing node performs a data update on the first cache object.
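The traversal and the time comparison described above can be sketched as follows. This is a minimal illustration under stated assumptions: the `CacheEntry` type, its field names, and the dictionary layout of the update record are hypothetical, and the time characteristic value is assumed to be a numeric timestamp.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CacheEntry:
    data_id: str
    value: str
    time_value: float  # hypothetical "time characteristic value" of the cached copy

def find_first_cache_object(local_cache: list[CacheEntry],
                            target_id: str) -> Optional[CacheEntry]:
    """Traverse every local cache object and compare data identifiers."""
    matched = None
    for entry in local_cache:
        if entry.data_id == target_id:
            matched = entry
    return matched  # the decision is made only after the traversal completes

def should_update(local_cache: list[CacheEntry], update_info: dict) -> bool:
    """Update only if a match exists and the cached copy is older than the update."""
    entry = find_first_cache_object(local_cache, update_info["id"])
    return entry is not None and entry.time_value < update_info["update_time"]

local_cache = [CacheEntry("X", "old", 10.0), CacheEntry("Y", "y", 50.0)]
```

With this data, an update record for "X" with update time 20.0 triggers an update, while a record for "Y" with the same update time does not, because the cached copy of "Y" is already newer.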
In an embodiment, the updating the data of the first cache object includes:
the first computing node acquires, from the external storage device and according to the first update characteristic information, the original data of the target cache object after the update;
and the first computing node updates the local first cache object by using the original data, so that the updated first cache object in the first computing node is matched with the updated target cache object in the second computing node.
Here, since the first computing node can obtain the first update feature information of the target cache object having the update operation, the first computing node only needs to update the cache object having the update operation synchronously without performing full update on all cache objects in the first computing node, thereby solving the problem of heavy burden on the external storage device due to the existing full cache data update, and laying a foundation for reducing network communication of the external storage device.
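A minimal sketch of this selective reload, assuming the external storage device and the local cache can both be modeled as dictionaries keyed by data identifier (the function name and record layout are hypothetical):

```python
def refresh_from_storage(local_cache: dict, external_storage: dict,
                         update_info: dict) -> bool:
    """Reload one cache object from external storage if it is cached locally."""
    data_id = update_info["id"]
    if data_id not in local_cache:
        return False  # nothing to synchronize on this node, so no read is issued
    local_cache[data_id] = external_storage[data_id]
    return True

# Hypothetical state: "X" was updated elsewhere; "Z" is not cached on this node.
external_storage = {"X": "new", "Y": "y", "Z": "z"}
local_cache = {"X": "old", "Y": "y"}

refreshed_x = refresh_from_storage(local_cache, external_storage, {"id": "X"})
refreshed_z = refresh_from_storage(local_cache, external_storage, {"id": "Z"})
```

Only the object named in the update record is read from external storage; the untouched object "Y" generates no traffic at all, which is the saving over a full periodic refresh.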
In a specific embodiment, the method further comprises:
and the first computing node detects that the client performs an updating operation on a local second cache object, generates and sends second updating characteristic information, wherein the second updating characteristic information is used for indicating that the client performs an updating operation on the local second cache object of the first computing node.
Here, in order to ensure that the second computing node in the distributed system can know the update operation of the first computing node, the first computing node may further generate second update characteristic information for indicating that the client performs an update operation on a second cache object local to the first computing node, so that a foundation is laid for efficiently implementing data consistency between the first computing node and the second computing node.
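The publishing side can be sketched in a few lines. The sketch assumes an in-process `queue.Queue` as a stand-in for whatever transport carries the second update characteristic information; the function name and the `update_time` field are illustrative assumptions.

```python
import queue
import time

def on_client_update(local_cache: dict, outbox: queue.Queue,
                     data_id: str, new_value) -> dict:
    """Apply a client update locally, then generate and send the update record."""
    local_cache[data_id] = new_value
    info = {"id": data_id, "update_time": time.time()}  # second update characteristic information
    outbox.put(info)
    return info

cache = {"K": "old"}
outbox = queue.Queue()
sent = on_client_update(cache, outbox, "K", "new")
```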
In a second aspect, an embodiment of the present application provides a first computing node, including:
an obtaining unit, configured to obtain first update characteristic information of a target cache object, wherein the first update characteristic information indicates that a client has performed an update operation on the target cache object locally cached by a second computing node in the distributed system, and the first update characteristic information at least indicates the data identifier of the target cache object;
the detection unit is used for detecting whether a first cache object matched with the data identifier of the target cache object exists in the local cache objects or not according to the first updating characteristic information;
and the data updating unit is used for determining that a first cache object matched with the data identifier of the target cache object is locally cached, and then performing data updating on the first cache object to enable the updated first cache object in the first computing node to be matched with the updated target cache object in the second computing node.
In a specific embodiment, the obtaining unit is further configured to:
receiving the first update characteristic information sent by a local message queue or by a message queue of an external device, wherein the first update characteristic information is sent to the message queue after a client performs an update operation on the target cache object locally cached by the second computing node; or,
querying the first update characteristic information from a local shared update component or from a shared update component of an external device, wherein the first update characteristic information is sent to the shared update component after a client performs an update operation on the target cache object locally cached by the second computing node.
In a specific embodiment, the detection unit is further configured to:
traversing the local cache object, and comparing the data identifier of the local cache object with the data identifier of the target cache object indicated by the first updating characteristic information;
and after traversing is finished, determining whether the first computing node locally caches a first cache object matched with the data identifier of the target cache object or not based on a comparison result.
In an embodiment, the data updating unit is further configured to:
after determining that a first cache object matched with the data identifier of the target cache object is locally cached in the first computing node based on the comparison result, comparing the time characteristic value corresponding to the first cache object with an update time value contained in first update characteristic information;
and when the time characteristic value of the first cache object is smaller than the update time value of the target cache object, performing a data update on the first cache object.
In an embodiment, the data updating unit is configured to:
acquiring the updated original data of the target cache object from the external storage device according to the first updating characteristic information;
and updating the local first cache object by using the original data, so that the updated first cache object in the first computing node is matched with the updated target cache object in the second computing node.
In a specific embodiment, the detection unit is further configured to:
and detecting that the client performs an updating operation on a local second cache object, and generating and sending second updating characteristic information, wherein the second updating characteristic information is used for indicating that the client performs an updating operation on the local second cache object of the first computing node.
In a third aspect, an embodiment of the present application provides a first computing node, including:
one or more processors;
a memory communicatively coupled to the one or more processors;
one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors, and the one or more application programs are configured to perform the method described above.
In a fourth aspect, the present application provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program implements the method described above.
Therefore, the first computing node can obtain the first updating characteristic information of the target cache object with the updating operation in the second computing node, so that the first computing node only needs to synchronously update the cache object with the updating operation, namely, the first cache object is updated, and all cache objects in the first computing node do not need to be updated in a whole amount, the problem of heavy network burden caused by the existing whole-amount cache data updating is solved, the data updating cost is reduced, and the data updating efficiency is improved.
Drawings
FIG. 1 is a schematic flow chart of an implementation of a data updating method based on a distributed system according to an embodiment of the present invention;
FIG. 2 is a first schematic diagram of a specific application scenario of the data updating method based on the distributed system according to the embodiment of the present application;
FIG. 3 is a second schematic diagram of a specific application scenario of the data updating method based on the distributed system according to the embodiment of the present application;
FIG. 4 is a schematic diagram of a logical unit structure of a first computing node according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a hardware structure of a first computing node according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In some of the flows described in the specification and claims of the present application and in the above-described figures, a number of operations are included that occur in a particular order, but it should be clearly understood that the flows may include more or less operations, and that the operations may be performed sequentially or in parallel.
The embodiments of the present application provide a data updating method based on a distributed system, a computing node, and a storage medium. FIG. 1 is a schematic flow chart of an implementation of the data updating method based on a distributed system according to an embodiment of the present invention. The method is applied to a first computing node of the distributed system and, as shown in FIG. 1, includes:
Step 101: A first computing node in a distributed system acquires first update characteristic information of a target cache object, wherein the first update characteristic information indicates that a client has performed an update operation on the target cache object locally cached by a second computing node in the distributed system, and the first update characteristic information at least indicates the data identifier of the target cache object.
Here, it should be noted that the distributed system includes two or more computing nodes, and the first computing node and the second computing node described in the present application are both computing nodes in the distributed system, where "first" and "second" are only used to distinguish different nodes, and are not limited to different types.
In this embodiment, to ensure data consistency of each computing node in the distributed system, a first computing node further needs to synchronize information of an update operation performed on the first computing node to another second computing node, and specifically, the first computing node detects that a client performs an update operation on a local second cache object, generates and sends second update characteristic information, where the second update characteristic information is used to indicate that the client performs an update operation on the local second cache object of the first computing node. That is to say, the first computing node can also generate second update characteristic information for updating the second cache object of the local cache by the client, so that other second computing nodes can update the corresponding cache object synchronously, and data consistency in the whole distributed system is ensured.
Step 102: The first computing node detects, according to the first update characteristic information, whether a first cache object matching the data identifier of the target cache object exists among its local cache objects.
In practical application, the first computing node stores the data identifier of the local cache object, so that the first computing node matches the data identifier of the local cache object with the data identifier of the target cache object to determine whether the first cache object which needs to be updated synchronously exists. For example, an identifier identical to the data identifier of the target cache object exists in the first computing node, and at this time, the matching is considered to be successful, and the cache object corresponding to the identifier identical to the data identifier of the target cache object is the first cache object to be subjected to data updating; otherwise, the cache object needing to be updated is considered to be not existed.
Step 103: If a first cache object matching the data identifier of the target cache object is locally cached in the first computing node, the first computing node performs a data update on the first cache object, so that the updated first cache object in the first computing node matches the updated target cache object in the second computing node.
In a specific embodiment, the data update may be performed in a manner that, specifically, the first computing node obtains, according to the first update characteristic information (for example, according to a data identifier of a target cache object indicated by the first update characteristic information), original data after the target cache object is updated from an external storage device; and the first computing node updates the local first cache object by using the original data, so that the updated first cache object in the first computing node is matched with the updated target cache object in the second computing node, and thus, the data consistency is realized.
Here, an access failure may occur during the data update. For example, if an access request for the first cache object arrives while the first cache object is being updated, the request may fail because the first cache object cannot be accessed at that moment. To avoid this, the local cache may be updated only after all of the updated original data has been acquired, as described for the update module below.
In the scheme of the present application, the cache data update process can be implemented in two modes. Mode one: passive mode.
Step 1-1: The first computing node receives the first update characteristic information sent by a local message queue or by a message queue of an external device, wherein the first update characteristic information is sent to the message queue after a client performs an update operation on the target cache object locally cached by the second computing node.
Step 1-2: The first computing node detects, according to the first update characteristic information, whether a first cache object matching the data identifier of the target cache object exists among its local cache objects.
Step 1-3: If a first cache object matching the data identifier of the target cache object is locally cached in the first computing node, the first computing node performs a data update on the first cache object, so that the updated first cache object in the first computing node matches the updated target cache object in the second computing node.
In this mode, the types of message queue include, but are not limited to, Kafka, NSQ, RocketMQ, and the like. The message queue may be deployed in the first computing node; of course, it may also be deployed in a device other than the first computing node, so as to reduce the maintenance cost of the computing nodes in the distributed system. If the message queue is deployed in the first computing node, part of the computing resources of the first computing node must be set aside to support the message queue function; in that case the other second computing nodes do not need to support the message queue function, and data consistency of the entire distributed system can still be achieved.
Specifically, this mode is described in further detail with reference to FIG. 2. As shown in FIG. 2, the other computing nodes in the distributed system are notified through the message queue, so that they learn in near real time whether the original data in the external storage device corresponding to a local cache object has been updated. If it has, only the updated cache object is reloaded and the local cache is updated, which achieves data consistency of the distributed system. The specific steps are as follows:
Step A: The client sends an update request, which requests a data update of the cache object whose data identifier (ID) is X in computing node A.
Step B: Computing node A updates the cache object whose ID is X in its local cache. At this time, the original data corresponding to the cache object whose ID is X is synchronously updated in the external storage device.
Step C: Computing node A publishes the update characteristic information of the cache object whose ID is X to the message queue, so that the other computing nodes can receive it from the message queue.
Step D: After receiving the update characteristic information (including at least the ID) of the cache object whose ID is X, each other computing node in the distributed system, such as computing node B, determines whether the cache object whose ID is X is in its local cache.
Step E: If computing node B determines that the cache object is in its local cache, it loads the updated original data of the cache object whose ID is X from the external storage device and updates its local cache.
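Steps A to E can be walked through end to end with a small in-process simulation. It is a sketch under stated assumptions, not the implementation of the application: the `Broker` class is a hypothetical stand-in for a message queue such as Kafka, the external storage device is a shared dictionary, and a third node C is added only to show that nodes without the object ignore the notification.

```python
import queue

class Broker:
    """Stand-in for a message queue that fans each record out to every other node."""
    def __init__(self):
        self.inboxes = {}

    def subscribe(self, name):
        self.inboxes[name] = queue.Queue()
        return self.inboxes[name]

    def publish(self, sender, record):
        for name, inbox in self.inboxes.items():
            if name != sender:
                inbox.put(record)

class Node:
    def __init__(self, name, broker, storage, cache):
        self.name, self.broker, self.storage = name, broker, storage
        self.cache = dict(cache)
        self.inbox = broker.subscribe(name)

    def client_update(self, data_id, value):
        self.cache[data_id] = value                      # Step B: update the local cache
        self.storage[data_id] = value                    # Step B: external storage is updated too
        self.broker.publish(self.name, {"id": data_id})  # Step C: publish the update record

    def drain(self):
        while not self.inbox.empty():
            data_id = self.inbox.get()["id"]             # Step D: receive and check the ID
            if data_id in self.cache:
                self.cache[data_id] = self.storage[data_id]  # Step E: reload and update

storage = {"X": "v1", "Y": "v1"}
broker = Broker()
node_a = Node("A", broker, storage, {"X": "v1"})
node_b = Node("B", broker, storage, {"X": "v1", "Y": "v1"})
node_c = Node("C", broker, storage, {"Y": "v1"})

node_a.client_update("X", "v2")   # Step A: the client's update request reaches node A
node_b.drain()
node_c.drain()
```

After the run, node B holds the new value of X, while node C, which never cached X, has read nothing from external storage.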
In practical applications, a message publishing module can be added to each computing node; when an update operation occurs, the computing node publishes the update characteristic information through its message publishing module. A message receiving module can also be added to each computing node to receive the update characteristic information sent by the message queue and to check whether the ID in the update characteristic information is in the local cache. Finally, an update module can be added to the local cache of each computing node. For each cache object in the local cache that needs a synchronous update, the update module actively queries the external storage device for the corresponding updated original data: it locks the local cache object to be updated, acquires the original data of that cache object, and updates the local cache only after all of the original data has been acquired, so that queries for the cache object being updated do not miss. In this way, the ID (i.e., the update characteristic information) of each cache object that has undergone an update operation is sent to all computing nodes through the message queue at very low communication cost, which reduces the data update cost and improves the data update efficiency.
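The behavior of the update module can be sketched as follows. This is one possible reading, offered as an assumption: the lock serializes concurrent updaters, the new data is fetched in full before any cached entry is replaced, and readers keep seeing the old value until the replacement. The `LocalCache` class and the `load` callback are hypothetical, and the lock-free read relies on single dictionary operations being atomic in CPython.

```python
import threading

class LocalCache:
    def __init__(self, initial):
        self._data = dict(initial)
        self._update_lock = threading.Lock()  # serializes updaters, not readers

    def get(self, data_id):
        """Readers are never blocked; they see the old value until the swap."""
        return self._data.get(data_id)

    def refresh(self, data_ids, load):
        """Fetch all new original data first, then replace the cached entries."""
        with self._update_lock:
            fresh = {i: load(i) for i in data_ids if i in self._data}
            self._data.update(fresh)  # swap only after every fetch has finished
            return sorted(fresh)

cache = LocalCache({"X": "old"})
seen_during_load = []

def load(data_id):
    # Simulates a query that arrives while the original data is being fetched.
    seen_during_load.append(cache.get(data_id))
    return "new"

refreshed = cache.refresh(["X", "Z"], load)
```

The read issued during the fetch returns the old value rather than a miss, which is the property the update module is meant to preserve.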
It should be noted that, in practical applications, the message queue may be deployed in computing node A; in that case, the scheme of the present application can be implemented with only computing node A hosting the message queue in the whole distributed system. Of course, the message queue may also be deployed in a device other than computing node A to reduce the maintenance cost of the distributed system, which is not limited in this application.
Mode two: active mode.
Step 2-1: The first computing node queries (for example, periodically queries) the first update characteristic information from a local shared update component or from a shared update component of an external device, wherein the first update characteristic information is sent to the shared update component after a client performs an update operation on the target cache object locally cached by the second computing node.
Step 2-2: The first computing node traverses the local cache objects and compares the data identifier of each local cache object with the data identifier of the target cache object indicated by the first update characteristic information.
Step 2-3: Step 2-2 is repeated until the traversal is completed, and the first computing node then determines, based on the comparison results, whether a first cache object matching the data identifier of the target cache object is locally cached in the first computing node.
Step 2-4: If a first cache object matching the data identifier of the target cache object is locally cached in the first computing node, the first computing node performs a data update on the first cache object, so that the updated first cache object in the first computing node matches the updated target cache object in the second computing node.
Here, in a specific example, after the first computing node determines, based on the comparison result, that it locally caches a first cache object matching the data identifier of the target cache object, it compares the time characteristic value of the first cache object with the update time value contained in the first update characteristic information. If the time characteristic value of the first cache object is smaller than the update time value of the target cache object, the first computing node performs a data update on the first cache object.
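Steps 2-2 to 2-4, together with the time comparison in the example above, can be sketched as a single check. This is only an illustration: the function name `needs_refresh` and the dictionary layout (`id`, `update_time`, `updated_at`) are assumptions made for this sketch, not names taken from the application.

```python
def needs_refresh(local_cache, update_info):
    """Return True if the local copy of the target cache object is stale.

    local_cache: data ID -> {"value": ..., "updated_at": ...}   (hypothetical layout)
    update_info: {"id": ..., "update_time": ...}                (hypothetical layout)
    """
    target_id = update_info["id"]
    # Steps 2-2 and 2-3: traverse the local cache objects and compare identifiers.
    for data_id, entry in local_cache.items():
        if data_id == target_id:
            # Time check: update only if the local time value is smaller.
            return entry["updated_at"] < update_info["update_time"]
    # No matching first cache object is cached locally, so nothing needs updating.
    return False


local_cache = {
    "X": {"value": "old", "updated_at": 100},
    "Y": {"value": "y", "updated_at": 100},
}
stale = needs_refresh(local_cache, {"id": "X", "update_time": 200})
already_current = needs_refresh(local_cache, {"id": "X", "update_time": 100})
not_cached = needs_refresh(local_cache, {"id": "Z", "update_time": 200})
```

The loop mirrors the traversal described in the steps; with a hash-based local cache, a direct lookup by data identifier would give the same result.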
In this manner, the types of shared update component include, but are not limited to, Redis, Memcached, and databases. The shared update component may be deployed in the first computing node; of course, it may also be deployed in a device other than the first computing node, so as to reduce the maintenance cost of the computing nodes in the distributed system. If the shared update component is deployed in the first computing node, part of the computing resources of the first computing node must be set aside to support it; in that case, the other (second) computing nodes can still achieve data consistency across the whole distributed system without hosting the shared update component themselves.
Specifically, this manner is described in further detail with reference to fig. 3. As shown in fig. 3, the update characteristic information is stored in a cluster shared cache such as Redis, and a timestamp attribute updated_at (that is, the time characteristic value) is attached to every cache object when it is written into the in-memory cache of a computing node.
Further, the steps include:
Step A: The client sends an update request, which requests a data update on the cache object whose data identifier (ID) is X in computing node A.
Step B: Computing node A updates the cache object whose ID is X in its local cache. At this time, the original data corresponding to the cache object whose ID is X is synchronously updated in the external storage device.
Step C: Computing node A writes the update characteristic information of the cache object whose ID is X into the cluster shared cache (that is, the shared update component in the present application). The format of the update characteristic information may specifically be {key = ID, value = update_timestamp}, where update_timestamp is the update timestamp.
Step D: Other computing nodes in the distributed system, such as computing node B, periodically query the cluster shared cache to check whether the IDs of their locally cached objects are present in the shared cache and, if so, to read the corresponding update_timestamp.
Step E: If the locally cached first cache object (the cache object whose ID is X) is found in the cluster shared cache, and the updated_at of the first cache object is smaller than the update_timestamp found for it in the cluster shared cache, computing node B considers that the first cache object has been updated. It then loads the updated original data of the first cache object from the external storage device, updates the local cache, and sets the updated_at of the first cache object to update_timestamp.
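Steps A to E can be sketched as follows. This is a hedged illustration only: plain dictionaries stand in for the cluster shared cache (for example Redis) and for the external storage device, timestamps are passed in explicitly so the example is deterministic, and the names `Node`, `client_update` and `poll` are hypothetical.

```python
# Hypothetical stand-ins: the external storage device and the cluster shared cache.
external_store = {"X": "old", "Y": "y"}
shared_cache = {}  # data ID -> update_timestamp


class Node:
    def __init__(self):
        self.cache = {}  # data ID -> (value, updated_at)

    def load(self, data_id, now):
        """Write a cache object into the local cache with its updated_at attribute."""
        self.cache[data_id] = (external_store[data_id], now)

    def client_update(self, data_id, value, now):
        """Steps A to C: update locally, update the original data, record the timestamp."""
        self.cache[data_id] = (value, now)
        external_store[data_id] = value
        shared_cache[data_id] = now  # {key = ID, value = update_timestamp}

    def poll(self):
        """Steps D and E: look up every local ID in the shared cache and refresh stale ones."""
        refreshed = []
        for data_id, (_, updated_at) in list(self.cache.items()):
            update_timestamp = shared_cache.get(data_id)
            if update_timestamp is not None and updated_at < update_timestamp:
                self.cache[data_id] = (external_store[data_id], update_timestamp)
                refreshed.append(data_id)
        return refreshed


node_a, node_b = Node(), Node()
node_a.load("X", now=100)
node_b.load("X", now=100)
node_b.load("Y", now=100)

node_a.client_update("X", "new", now=200)
first_poll = node_b.poll()
second_poll = node_b.poll()
```

After the first poll, node B holds the new value of X with updated_at equal to update_timestamp, so the second poll finds nothing left to refresh.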
Here, in practical applications, a communication module for communicating with the cluster shared cache is added to each computing node; after an update operation, the update characteristic information, such as the ID and the update time update_timestamp, is written into the cluster shared cache through this module. The update timestamp may be 8 bytes long, so it occupies very little space compared with a real cache object. A polling module is also added to the local cache of each computing node. It periodically queries the cluster shared cache for update characteristic information relating to local cache objects; when such information is found, it calls a comparison function that checks whether the updated_at of the local cache object is smaller than the update_timestamp in the retrieved update characteristic information, and thereby decides whether an update is needed. Finally, the local cache of each computing node also includes an updating module. For each cache object in the local cache that needs to be updated, the updating module actively queries the external storage device for the updated original data: it locks the cache object, acquires the original data, and updates the local cache only after all of the original data has been acquired, so that queries for that cache item do not miss. This manner is more general and is better suited to scenarios with many update operations or large cache objects.
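The behavior of the updating module (lock the entry, fetch all of the original data, then replace the cached value) might look like the sketch below. It is an assumption-laden illustration rather than the application's implementation: `LocalCache`, `refresh` and the `fetch` callback are hypothetical names, and a per-entry `threading.Lock` is just one possible way to lock a cache object.

```python
import threading


class LocalCache:
    def __init__(self):
        self._data = {}
        self._locks = {}
        self._guard = threading.Lock()  # protects the table of per-entry locks

    def _lock_for(self, key):
        with self._guard:
            return self._locks.setdefault(key, threading.Lock())

    def put(self, key, value):
        self._data[key] = value

    def get(self, key):
        # Readers do not take the entry lock, so they keep seeing the old value
        # until the refreshed value is swapped in; the query never misses.
        return self._data.get(key)

    def refresh(self, key, fetch):
        with self._lock_for(key):   # lock the cache object to be updated
            fresh = fetch(key)      # acquire all of the original data first
            self._data[key] = fresh # then replace the entry in one assignment


cache = LocalCache()
cache.put("X", "old")

seen_during_fetch = []


def fetch(key):
    # While the original data is being fetched, a concurrent reader still hits.
    seen_during_fetch.append(cache.get(key))
    return "new"


cache.refresh("X", fetch)
value_after = cache.get("X")
```

The design choice illustrated here is that the old entry is never deleted before the new data arrives, which is what keeps lookups from missing during the update.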
Here, with either of the above manners of the present application, each computing node can know accurately which of its local cache objects have been updated in the external storage device. For example, if the local cache holds 100,000 cache objects and only 100 of them have actually been updated, only those 100 cache objects need to be updated, so the update cost can be reduced to about 1/1000 of that of a full update.
In this way, because the first computing node can obtain the first update characteristic information of the target cache object that was updated in the second computing node, the first computing node only needs to synchronize the cache object that was actually updated, that is, the first cache object, and does not need to update all of its cache objects in full. This solves the problem of heavy network load caused by existing full updates of cached data, reduces the cost of data updating, and improves its efficiency.
In addition, the cache objects in the first computing node in the embodiment of the present application do not need an expiration time; that is, the first computing node does not periodically delete expired cache objects. This avoids query misses caused by deleting expired cache objects, which lays a foundation for a better user experience, and also avoids the extra load on the external storage device (which stores the original data of the cache objects) and the extra query latency that such misses would cause.
An embodiment of the present application further provides a first computing node, as shown in fig. 4, including:
an obtaining unit 41, configured to obtain first update characteristic information of a target cache object, where the first update characteristic information indicates that a client has performed an update operation on the target cache object locally cached by a second computing node in the distributed system, and the first update characteristic information at least indicates the data identifier of the target cache object;
a detecting unit 42, configured to detect, according to the first update characteristic information, whether a first cache object matching the data identifier of the target cache object exists among the local cache objects;
a data updating unit 43, configured to, when it is determined that a first cache object matching the data identifier of the target cache object is locally cached, perform a data update on the first cache object, so that the updated first cache object in the first computing node matches the updated target cache object in the second computing node.
In a specific embodiment, the obtaining unit 41 is further configured to:
receive the first update characteristic information sent by a local message queue or by a message queue of an external device, where the first update characteristic information is sent to the message queue after a client performs an update operation on a target cache object locally cached by the second computing node; or,
query the first update characteristic information from a local shared update component or from a shared update component of an external device, where the first update characteristic information is sent to the shared update component after a client performs an update operation on a target cache object locally cached by the second computing node.
In an embodiment, the detecting unit 42 is further configured to:
traverse the local cache objects and compare the data identifier of each local cache object with the data identifier of the target cache object indicated by the first update characteristic information;
and, after the traversal is complete, determine, based on the comparison result, whether the first computing node locally caches a first cache object matching the data identifier of the target cache object.
In an embodiment, the data updating unit 43 is further configured to:
after determining, based on the comparison result, that a first cache object matching the data identifier of the target cache object is locally cached in the first computing node, compare the time characteristic value of the first cache object with the update time value contained in the first update characteristic information;
and, if the time characteristic value of the first cache object is smaller than the update time value of the target cache object, perform a data update on the first cache object.
In an embodiment, the data updating unit 43 is configured to:
acquire the updated original data of the target cache object from the external storage device according to the first update characteristic information;
and update the local first cache object with the original data, so that the updated first cache object in the first computing node matches the updated target cache object in the second computing node.
In an embodiment, the detecting unit 42 is further configured to:
detect that the client has performed an update operation on a local second cache object, and generate and send second update characteristic information, where the second update characteristic information indicates that the client has performed an update operation on the local second cache object of the first computing node.
Here, it should be noted that the above description of the computing node embodiment is similar to the description of the method and has the same beneficial effects as the method embodiment, so it is not repeated. For technical details not disclosed in the computing node embodiment of the present invention, those skilled in the art should refer to the description of the method embodiment of the present invention; for brevity, they are not detailed here.
An embodiment of the present application further provides a first computing node, including: one or more processors; a memory communicatively coupled to the one or more processors; and one or more applications, where the one or more applications are stored in the memory and configured to be executed by the one or more processors, and the one or more applications are configured to perform the method described above.
In a specific example, the first computing node according to the embodiment of the present application may have the structure shown in fig. 5, and includes at least a processor 51, a storage medium 52, and at least one external communication interface 53; the processor 51, the storage medium 52 and the external communication interface 53 are all connected by a bus 54. The processor 51 may be a microprocessor, a central processing unit, a digital signal processor, a programmable logic array, or another electronic component with processing functions. The storage medium stores computer-executable code capable of performing the method of any of the above embodiments. In practical applications, the obtaining unit 41, the detecting unit 42 and the data updating unit 43 can all be implemented by the processor 51.
Embodiments of the present application also provide a computer-readable storage medium that stores a computer program; when the program is executed by a processor, it implements the method described above.
A computer-readable storage medium can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable storage medium include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CD-ROM). Additionally, the computer-readable storage medium may even be paper or another suitable medium on which the program is printed, since the program can be captured electronically, for instance by optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that all or part of the steps of the methods in the above embodiments may be implemented by a program instructing the related hardware. The program may be stored in a computer-readable storage medium and, when executed, performs one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a separate product, may also be stored in a computer readable storage medium. The storage medium may be a read-only memory, a magnetic or optical disk, or the like.
The embodiments described above are only a part of the embodiments of the present invention, and not all of them. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

Claims (10)

1. A data updating method based on a distributed system is characterized by comprising the following steps:
a first computing node in a distributed system acquires first update characteristic information of a target cache object, wherein the first update characteristic information indicates that a client has performed an update operation on the target cache object locally cached by a second computing node in the distributed system, and the first update characteristic information at least indicates the data identifier of the target cache object;
the first computing node detects, according to the first update characteristic information, whether a first cache object matching the data identifier of the target cache object exists among the local cache objects;
and if a first cache object matching the data identifier of the target cache object is locally cached in the first computing node, performing a data update on the first cache object, so that the updated first cache object in the first computing node matches the updated target cache object in the second computing node.
2. The method of claim 1, wherein obtaining the first updated characteristic information of the target cache object by the first computing node in the distributed system comprises:
the first computing node receives the first update characteristic information sent by a local message queue or by a message queue of an external device, wherein the first update characteristic information is sent to the message queue after a client performs an update operation on a target cache object locally cached by the second computing node; or,
the first computing node queries the first update characteristic information from a local shared update component or from a shared update component of an external device, wherein the first update characteristic information is sent to the shared update component after a client performs an update operation on a target cache object locally cached by the second computing node.
3. The method of claim 1, wherein the detecting, by the first computing node, whether a first cache object matching the data identifier of the target cache object exists in the local cache objects according to the first update characteristic information comprises:
the first computing node traverses the local cache object and compares the data identifier of the local cache object of the first computing node with the data identifier of the target cache object indicated by the first updating characteristic information;
and after traversing is completed, the first computing node determines whether a first cache object matched with the data identifier of the target cache object is locally cached in the first computing node or not based on the comparison result.
4. The method of claim 3, wherein the updating the data of the first cache object comprises:
after the first computing node determines, based on the comparison result, that a first cache object matching the data identifier of the target cache object is locally cached in the first computing node, comparing the time characteristic value of the first cache object with the update time value contained in the first update characteristic information;
and if the time characteristic value of the first cache object is smaller than the update time value of the target cache object, the first computing node performs a data update on the first cache object.
5. The method of claim 1, wherein the updating the data of the first cache object comprises:
the first computing node acquires the updated original data of the target cache object from the external storage device according to the first update characteristic information;
and the first computing node updates the local first cache object by using the original data, so that the updated first cache object in the first computing node is matched with the updated target cache object in the second computing node.
6. The method of claim 1, further comprising:
the first computing node detects that the client has performed an update operation on a local second cache object, and generates and sends second update characteristic information, wherein the second update characteristic information indicates that the client has performed an update operation on the local second cache object of the first computing node.
7. A first computing node, comprising:
an obtaining unit, configured to obtain first update characteristic information of a target cache object, wherein the first update characteristic information indicates that a client has performed an update operation on the target cache object locally cached by a second computing node in the distributed system, and the first update characteristic information at least indicates the data identifier of the target cache object;
a detecting unit, configured to detect, according to the first update characteristic information, whether a first cache object matching the data identifier of the target cache object exists among the local cache objects;
and a data updating unit, configured to, when it is determined that a first cache object matching the data identifier of the target cache object is locally cached, perform a data update on the first cache object, so that the updated first cache object in the first computing node matches the updated target cache object in the second computing node.
8. The computing node of claim 7, wherein the data update unit is configured to:
acquiring the updated original data of the target cache object from the external storage device according to the first updating characteristic information;
and updating the local first cache object by using the original data, so that the updated first cache object in the first computing node is matched with the updated target cache object in the second computing node.
9. A first computing node, comprising:
one or more processors;
a memory communicatively coupled to the one or more processors;
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications being configured to perform the method of any of claims 1-6.
10. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the method according to any one of claims 1 to 6.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010213176.0A CN113448971A (en) 2020-03-24 2020-03-24 Data updating method based on distributed system, computing node and storage medium

Publications (1)

Publication Number Publication Date
CN113448971A true CN113448971A (en) 2021-09-28

Family

ID=77806436

Country Status (1)

Country Link
CN (1) CN113448971A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Applicant after: Douyin Vision Co.,Ltd.

Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Applicant before: Tiktok vision (Beijing) Co.,Ltd.

Address after: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Applicant after: Tiktok vision (Beijing) Co.,Ltd.

Address before: 100041 B-0035, 2 floor, 3 building, 30 Shixing street, Shijingshan District, Beijing.

Applicant before: BEIJING BYTEDANCE NETWORK TECHNOLOGY Co.,Ltd.
