CN113783921A - Method and device for creating cache component - Google Patents

Method and device for creating cache component

Info

Publication number
CN113783921A
Authority
CN
China
Prior art keywords
node
cache
data
nodes
written
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110110654.XA
Other languages
Chinese (zh)
Inventor
郭庆海
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Zhenshi Information Technology Co Ltd
Original Assignee
Beijing Jingdong Zhenshi Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Zhenshi Information Technology Co Ltd filed Critical Beijing Jingdong Zhenshi Information Technology Co Ltd
Priority to CN202110110654.XA priority Critical patent/CN113783921A/en
Publication of CN113783921A publication Critical patent/CN113783921A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 67/1001 Protocols in which an application is distributed across nodes in the network for accessing one among a plurality of replicated servers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/44 Arrangements for executing specific programs
    • G06F 9/445 Program loading or initiating
    • G06F 9/44505 Configuring for program initiating, e.g. using registry, configuration files
    • G06F 9/4451 User profiles; Roaming
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00 Arrangements for program control, e.g. control units
    • G06F 9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46 Multiprogramming arrangements
    • G06F 9/54 Interprogram communication
    • G06F 9/547 Remote procedure calls [RPC]; Web services
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/01 Protocols
    • H04L 67/10 Protocols in which an application is distributed across nodes in the network
    • H04L 67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L 67/00 Network arrangements or protocols for supporting network services or applications
    • H04L 67/50 Network services
    • H04L 67/56 Provisioning of proxy services
    • H04L 67/568 Storing data temporarily at an intermediate stage, e.g. caching

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention discloses a method and a device for creating a cache component, and relates to the technical field of computers. One embodiment of the method comprises: determining a computing cluster for executing a real-time computing task, wherein the computing cluster comprises a plurality of server nodes and each server node serves as a cache node, so that a plurality of cache nodes are obtained; electing one cache node from the plurality of cache nodes as a master node, the cache nodes other than the master node being slave nodes, wherein the master node is used for writing data to be written into a local disk of the master node and synchronizing the data to be written into the local disks of the slave nodes; and creating a cache component based on the master node and the slave nodes. This embodiment reduces network requests and improves read-write performance; when the computing task is restarted, each cache node retains its data and can quickly recover it directly from its local disk.

Description

Method and device for creating cache component
Technical Field
The present invention relates to the field of computer technologies, and in particular, to a method and an apparatus for creating a cache component.
Background
During execution of a real-time computing task, a cache component is required to cache data related to the task. At present, a common approach is to use cache middleware such as Redis as the cache component. When Redis is used as the cache component, a remote request must be sent to the Redis server every time data is queried, and if the Redis server responds with a timeout or another exception, the real-time computing task stops abnormally. When the real-time computing task is restarted, each computing node must read data from the Redis server serially to recover its state, so data recovery is slow and consumes a large amount of time.
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for creating a cache component, in which a distributed cache for a computing task is constructed based on the cache component without depending on an external cache, so that network requests can be reduced and read-write performance improved; when the computing task is restarted, each cache node retains its data and can quickly recover it directly from its local disk; and event subscription is supported, which accommodates data processing in different service scenarios.
To achieve the above object, according to an aspect of the embodiments of the present invention, there is provided a method for creating a cache component, including:
determining a computing cluster for executing a real-time computing task, wherein the computing cluster comprises a plurality of server nodes, and each server node is used as a cache node to obtain a plurality of cache nodes;
selecting one cache node from the plurality of cache nodes as a master node, wherein the cache nodes other than the master node are slave nodes; the master node is used for writing data to be written into a local disk of the master node and synchronizing the data to be written into the local disk of the slave node;
creating a cache component based on the master node and the slave node.
Optionally, selecting one of the cache nodes from the plurality of cache nodes as the master node includes:
selecting one cache node from the plurality of cache nodes as the master node according to a Raft algorithm.
Optionally, the method further comprises: acquiring an IP address of the computing cluster;
synchronizing the data to be written to the local disk of the slave node comprises:
based on the IP address, the master node synchronizes the data to be written into the local disk of the slave node.
Optionally, the method further comprises: configuring a communication port;
synchronizing the data to be written to the local disk of the slave node comprises: based on the communication port, the master node synchronizes the data to be written into a local disk of the slave node.
Optionally, the method further comprises: counting the number of times the data cached in the cache component is queried; determining hotspot data according to the query counts; and storing the hotspot data in a Java virtual machine.
Optionally, the method further comprises: setting a cache elimination strategy, and updating the hotspot data according to the cache elimination strategy.
Optionally, the method further comprises: setting a cache file expiration time; and determining the data to be deleted in the cache component according to the cache file expiration time, the time at which the data to be written was written into the master node, and the current time, and deleting the data to be deleted.
Optionally, the method further comprises: creating a data change event, wherein the data change event is used for triggering an event broadcast after the data to be written is written into a local disk of the master node or synchronized into a local disk of the slave node, so that the computing nodes of the computing cluster can query the data to be written.
Optionally, the method further comprises: constructing a bloom filter based on the data change event.
Optionally, the method further comprises: creating a hotspot data event, wherein the hotspot data event is used for collecting statistics on information related to the hotspot data.
To achieve the above object, according to another aspect of the embodiments of the present invention, there is provided an apparatus for creating a cache component, including:
a determining module, configured to determine a computing cluster for executing a real-time computing task, wherein the computing cluster comprises a plurality of server nodes and each server node serves as a cache node, so that a plurality of cache nodes are obtained;
an election module, configured to elect one cache node from the plurality of cache nodes as a master node, wherein the cache nodes other than the master node are slave nodes; the master node is used for writing data to be written into a local disk of the master node and synchronizing the data to be written into the local disk of the slave node; and
a creating module, configured to create a cache component based on the master node and the slave node.
Optionally, the election module is further configured to elect one cache node from the plurality of cache nodes as the master node according to a Raft algorithm.
Optionally, the apparatus further includes an obtaining module, configured to obtain an IP address of the computing cluster.
Optionally, the apparatus further comprises a configuration module configured to configure the communication port.
Optionally, the apparatus further includes a hotspot module, configured to count the number of times the data cached in the cache component is queried, determine hotspot data according to the query counts, and store the hotspot data in a Java virtual machine.
Optionally, the apparatus further includes an elimination module, configured to set a cache elimination policy, and update the hotspot data according to the cache elimination policy.
Optionally, the apparatus further includes a data cleaning module, configured to set a cache file expiration time, determine the data to be deleted in the cache component according to the cache file expiration time, the time at which the data to be written was written into the master node, and the current time, and delete the data to be deleted.
Optionally, the apparatus further includes an event model creating module, configured to create a data change event, where the data change event is used to trigger an event broadcast after the data to be written is written into the local disk of the master node or synchronized into the local disk of the slave node, so that the computing nodes of the computing cluster query the data to be written.
Optionally, the apparatus further comprises a filter construction module configured to construct a bloom filter based on the data change event.
Optionally, the event model creation module is further configured to: and creating a hotspot data event, wherein the hotspot data event is used for counting the related information of the hotspot data.
To achieve the above object, according to still another aspect of an embodiment of the present invention, there is provided an electronic apparatus including: one or more processors; the storage device is used for storing one or more programs, and when the one or more programs are executed by the one or more processors, the one or more processors realize the method for creating the cache component.
To achieve the above object, according to still another aspect of the embodiments of the present invention, there is provided a computer readable medium on which a computer program is stored, the program implementing the method of creating a cache component of an embodiment of the present invention when executed by a processor.
One embodiment of the above invention has the following advantages or benefits: a computing cluster for executing a real-time computing task is determined, wherein the computing cluster comprises a plurality of server nodes and each server node serves as a cache node, so that a plurality of cache nodes are obtained; one cache node is elected from the plurality of cache nodes as the master node, and the cache nodes other than the master node are slave nodes; the master node writes data to be written into its local disk and synchronizes the data to the local disks of the slave nodes; and a cache component is created based on the master node and the slave nodes. In this way, a distributed cache within the computing task is built on the cache component without depending on an external cache, so network requests are reduced and read-write performance is improved; when the computing task is restarted, each cache node retains its data and can quickly recover it directly from its local disk; and event subscription is supported, which accommodates data processing in different service scenarios.
Further effects of the above optional implementations are described below in connection with the specific embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of a main flow of a method of creating a cache component according to an embodiment of the invention;
FIG. 2 is a schematic diagram of the main modules of an apparatus for creating a cache component according to an embodiment of the present invention;
FIG. 3 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
fig. 4 is a schematic block diagram of a computer system suitable for use in implementing a terminal device or server of an embodiment of the invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a schematic diagram of a main flow of a method for creating a cache component according to an embodiment of the present invention, as shown in fig. 1, the method includes:
step S101: determining a computing cluster for executing a real-time computing task, wherein the computing cluster comprises a plurality of server nodes, and each server node is used as a cache node to obtain a plurality of cache nodes;
step S102: selecting one cache node from the plurality of cache nodes as a master node, wherein the cache nodes other than the master node are slave nodes; the master node is used for writing data to be written into a local disk of the master node and synchronizing the data to be written into the local disk of the slave node;
step S103: creating a cache component based on the master node and the slave node.
For step S101, the real-time computing task may be a distributed computing task. The computing cluster that executes the real-time computing task is composed of a plurality of server nodes (or containers); each server node of the computing cluster serves as a cache node, yielding a plurality of cache nodes, and these cache nodes together form the cache component. Thus, each server node plays two roles: it is a computing node of the computing cluster and a cache node of the cache component.
As to step S102, as a specific example, one cache node may be elected from the plurality of cache nodes as the master node according to a Raft algorithm. In the Raft algorithm, a server node is in one of three states, as follows:
(1) Leader: processes write requests, manages log replication, and maintains a heartbeat with the followers;
(2) Follower: receives and processes messages from the leader, and when the leader node fails, nominates itself as a candidate to run in a new election;
(3) Candidate: sends RequestVote RPC messages to the other nodes to ask for their votes, and if it obtains the votes of a majority of the nodes, it is successfully elected leader. Here, RPC (Remote Procedure Call) means that a client calls an object residing on a remote computer without knowing the details of the call, just as if it were calling an object in a local application.
In order to make the process of electing the master node based on the Raft algorithm clearer, the following description takes server node A, server node B, and server node C as an example (a simplified code sketch of this election loop follows the walkthrough):
(1) in the initial state, all server nodes in the cluster are in the follower state;
(2) the Raft algorithm uses randomized timeouts, which prevents all server nodes from initiating an election at the same time and reduces the probability of a failed election; that is, the time each server node waits for the leader's heartbeat before timing out is random. Suppose the random timeout of server node A is 150 ms, that of server node B is 200 ms, that of server node C is 300 ms, and the initial term number of every node is 0;
(3) server node A times out first, increments its term number, becomes a candidate, votes for itself, and then sends RequestVote RPC messages to the other nodes asking them to elect it as leader;
(4) if the other server nodes receive candidate A's RequestVote RPC message and have not yet voted in term 1, they vote for server node A and update their own term numbers;
(5) if a candidate wins a majority of the votes within the election timeout, it becomes the new leader for the current term. After server node A is elected leader, it periodically sends heartbeat messages to inform the other server nodes that it is the leader.
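To make the timeout-and-vote loop above concrete, the following is a minimal Java sketch of the election logic. It assumes an illustrative RaftNode class; the class, field and method names are not taken from the patent, and the RPC layer is reduced to placeholder methods.
```java
import java.util.concurrent.ThreadLocalRandom;

// Illustrative sketch of the randomized election timeout described above.
// Names and the placeholder RPC methods are assumptions, not from the patent.
enum Role { FOLLOWER, CANDIDATE, LEADER }

class RaftNode {
    volatile Role role = Role.FOLLOWER;   // (1) every node starts as a follower
    volatile long lastHeartbeatMillis = System.currentTimeMillis();
    int currentTerm = 0;                  // term number starts at 0
    final int electionTimeoutMillis =
            ThreadLocalRandom.current().nextInt(150, 301); // (2) randomized per node

    // Called periodically; if no leader heartbeat arrived within the timeout,
    // the node becomes a candidate, increments its term and votes for itself (3).
    void checkElectionTimeout() {
        if (role != Role.LEADER
                && System.currentTimeMillis() - lastHeartbeatMillis > electionTimeoutMillis) {
            role = Role.CANDIDATE;
            currentTerm++;
            int votes = 1 + requestVotesFromPeers(currentTerm); // own vote + peer votes (4)
            if (votes > clusterSize() / 2) {                    // majority wins (5)
                role = Role.LEADER;
                sendHeartbeats();
            }
        }
    }

    // Placeholders standing in for the RPC layer of a real implementation.
    int requestVotesFromPeers(int term) { return 0; }
    int clusterSize() { return 3; }
    void sendHeartbeats() { lastHeartbeatMillis = System.currentTimeMillis(); }
}
```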
The Raft algorithm is a distributed consensus algorithm that ensures data consistency across multiple server nodes. It is decentralized: a master node is elected automatically every time a task starts or the cluster encounters an exception. The implementation is based on the ProtoBuf protocol, which enables efficient transmission of node data, and it supports various external storage modes, so the cache data stored in an external storage device can be checked in real time without affecting the normal execution of the computing task. The ProtoBuf protocol is a platform-independent, language-independent, extensible, lightweight and efficient protocol for serializing structured data, and can be used for network communication and data storage.
After the master node is elected, all cache nodes other than the master node serve as slave nodes. In this embodiment, the master node is responsible for processing write requests: it writes the data to be written into its own local disk, and after the data has been written to the master node's local disk, it synchronizes the data to the local disks of the slave nodes, thereby ensuring data consistency.
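The write path of this embodiment, appending to the master's local disk first and then replicating to each slave, might look roughly like the following Java sketch; the file layout, record format and the SlaveClient interface are assumptions used only for illustration.
```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.*;
import java.util.List;

// Minimal sketch of the master-node write path: append locally first,
// then replicate to every slave. Names and the file layout are assumptions.
class MasterWriter {
    private final Path localCacheFile;
    private final List<SlaveClient> slaves;

    MasterWriter(Path localCacheFile, List<SlaveClient> slaves) {
        this.localCacheFile = localCacheFile;
        this.slaves = slaves;
    }

    void write(String key, String value) throws IOException {
        byte[] record = (key + "=" + value + System.lineSeparator())
                .getBytes(StandardCharsets.UTF_8);
        // 1. Write the data to the master's local disk.
        Files.write(localCacheFile, record,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        // 2. Synchronize the same record to each slave's local disk.
        for (SlaveClient slave : slaves) {
            slave.replicate(record);
        }
    }

    // Stand-in for the RPC/Netty client used to reach a slave node.
    interface SlaveClient {
        void replicate(byte[] record);
    }
}
```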
For step S103, after the master node and the slave nodes are elected and determined, the cache component is composed of the master node and the slave nodes.
In the method for creating a cache component according to the embodiment of the present invention, a computing cluster for executing a real-time computing task is determined, wherein the computing cluster comprises a plurality of server nodes and each server node serves as a cache node; one cache node is elected from the plurality of cache nodes as the master node, and the cache nodes other than the master node are slave nodes; the master node processes data write requests, writes the data to be written into its local disk and, after the data has been written to the master node's local disk, synchronizes it to the local disks of the slave nodes; and a cache component is created based on the master node and the slave nodes. A distributed cache within the computing task is thus built on the cache component without depending on an external cache, so network requests are reduced and read-write performance is improved; when the computing task is restarted, each cache node retains its data and can quickly recover it directly from its local disk; and event subscription is supported, which accommodates data processing in different service scenarios.
The cache component of the embodiment of the invention can be used with different computing engines (Flink, Storm, and the like) and therefore has wide applicability.
In an optional embodiment, the method for creating a cache component further includes obtaining the IP address of the computing cluster that executes the real-time computing task. According to the IP address of the computing cluster, the master node of the cache component synchronizes the data to be written to the slave nodes.
In an alternative embodiment, a configuration file may be preset in which the IP addresses of the computing clusters performing the real-time computing task are recorded. The configuration file is then read, thereby obtaining the IP address of the computing cluster.
In other alternative embodiments, the IP address of the computing cluster may also be obtained through UDP broadcast. The process of obtaining the IP address of the computing cluster through UDP broadcast comprises the following steps (a simplified client-side sketch in Java follows the list):
(1) initializing the server and the clients: the cache node whose local IP address equals the pre-configured UDP server IP address acts as the server, and the other cache nodes act as clients;
(2) after starting, each client sends a registration message to the server every 200 ms until it receives a response message from the server;
(3) after receiving the message from each client, the server locally builds a set of client IP addresses, and when the size of the IP set equals the number of nodes in the computing task cluster, it sends a message containing the IP set to all clients;
(4) after receiving the server's message, each client stores the IP set on its local disk, sends a confirmation message to the server, and closes the connection;
(5) after receiving the confirmation message from every client, the server stops sending response messages and closes the connection.
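The client side of this discovery flow might be sketched in Java as follows; the message format, buffer size and port handling are assumptions, and a real implementation would also persist the received IP set to the local disk as described in step (4).
```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;
import java.net.SocketTimeoutException;
import java.nio.charset.StandardCharsets;

// Simplified client side of the UDP discovery flow above: send a REGISTER
// message every 200 ms until the server's IP-set response arrives.
// Message formats and the port are assumptions, not taken from the patent.
class UdpDiscoveryClient {
    String register(String serverIp, int port) throws Exception {
        try (DatagramSocket socket = new DatagramSocket()) {
            socket.setSoTimeout(200); // wait up to 200 ms for a reply, then re-send
            byte[] register = "REGISTER".getBytes(StandardCharsets.UTF_8);
            DatagramPacket request = new DatagramPacket(
                    register, register.length, InetAddress.getByName(serverIp), port);
            byte[] buffer = new byte[4096];
            while (true) {
                socket.send(request); // step (2): register with the server
                DatagramPacket response = new DatagramPacket(buffer, buffer.length);
                try {
                    socket.receive(response); // step (4): server replies with the IP set
                    return new String(response.getData(), 0, response.getLength(),
                            StandardCharsets.UTF_8);
                } catch (SocketTimeoutException retry) {
                    // no response yet, send the registration again
                }
            }
        }
    }
}
```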
In an optional embodiment, the method for creating a cache component further includes configuring communication ports. Based on the configured ports, the master node of the cache component synchronizes the data to be written into the local disks of the slave nodes. The communication ports include a UDP port and an RPC communication port. UDP (User Datagram Protocol) is a connectionless transport-layer protocol in the OSI (Open System Interconnection) reference model that provides a simple, transaction-oriented, unreliable message transfer service.
In an optional embodiment, the method for creating a cache component further includes: and configuring the IP address and the RPC timeout time of the UDP server.
In an optional embodiment, the method for creating a cache component further includes registering and starting the Netty communication component. Netty is an open-source network I/O communication component developed in Java. In this embodiment, the Netty communication component supports communication between the master node and the slave nodes.
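As an illustration of how a cache node might register and start Netty for master-slave communication, a minimal server bootstrap is sketched below; the port and the empty channel handler are placeholders, since the patent only states that Netty is used to support this communication.
```java
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.*;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;

// Illustrative startup of a Netty server on a cache node so that the master
// and slaves can exchange replication messages. The port and handler are
// assumptions; a real implementation would install codec and replication handlers.
class CacheNodeServer {
    void start(int rpcPort) throws InterruptedException {
        EventLoopGroup boss = new NioEventLoopGroup(1);
        EventLoopGroup workers = new NioEventLoopGroup();
        try {
            ServerBootstrap bootstrap = new ServerBootstrap()
                    .group(boss, workers)
                    .channel(NioServerSocketChannel.class)
                    .childHandler(new ChannelInitializer<SocketChannel>() {
                        @Override
                        protected void initChannel(SocketChannel ch) {
                            // register handlers that decode and apply replicated records
                            ch.pipeline().addLast(new ChannelInboundHandlerAdapter());
                        }
                    });
            ChannelFuture future = bootstrap.bind(rpcPort).sync();
            future.channel().closeFuture().sync(); // block until the server is shut down
        } finally {
            boss.shutdownGracefully();
            workers.shutdownGracefully();
        }
    }
}
```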
In an optional embodiment, the method further comprises: counting the number of times the data cached in the cache component is queried; determining hotspot data according to the query counts; and storing the hotspot data in the Java virtual machine (JVM). In this embodiment, hotspot data is data that is queried heavily within a period of time; storing it in the Java virtual machine speeds up data queries and thereby improves the computing efficiency of the computing task. In this embodiment, data may be stored in the cache component in key-value form.
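One possible way to count queries and promote hot keys into the JVM is sketched below; the promotion threshold, data structures and the DiskCache interface are assumptions rather than details given in the patent.
```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Sketch of per-key query counting with promotion of frequently read keys
// into an in-JVM map. The threshold and structures are assumed values.
class HotspotTracker {
    private static final long HOT_THRESHOLD = 1_000; // assumed promotion threshold

    private final Map<String, LongAdder> queryCounts = new ConcurrentHashMap<>();
    private final Map<String, String> jvmHotCache = new ConcurrentHashMap<>();

    String get(String key, DiskCache disk) {
        queryCounts.computeIfAbsent(key, k -> new LongAdder()).increment();
        String cached = jvmHotCache.get(key);
        if (cached != null) {
            return cached; // served from the Java virtual machine, no disk read
        }
        String value = disk.read(key);
        if (queryCounts.get(key).sum() >= HOT_THRESHOLD) {
            jvmHotCache.put(key, value); // promote hotspot data into the JVM
        }
        return value;
    }

    // Stand-in for the local-disk cache of the node.
    interface DiskCache {
        String read(String key);
    }
}
```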
In an alternative embodiment, the method further comprises: setting a cache elimination strategy, and updating the hotspot data according to the cache elimination strategy. In alternative embodiments, the cache elimination strategy may be an LRU policy or an LFU policy. LRU (Least Recently Used) finds the least recently used data in the cache component and replaces it to make room for new data; its primary measure is the time of last use and its secondary measure is the number of uses. The core idea of LFU (Least Frequently Used) is to evict according to how frequently a key has been accessed recently: keys accessed less often are evicted first, and keys accessed more often are kept. In this embodiment, updating the hotspot data according to the cache elimination strategy includes updating both the hotspot data on the local disks of the cache component and the hotspot data in the Java virtual machine.
As a specific example, the cache component in this embodiment may introduce the third-party open-source components Guava Cache and Caffeine to implement the in-JVM cache and the cache elimination strategy; for example, Guava Cache adopts an LRU policy, while Caffeine is based on an LFU policy. The open-source components Guava Cache and Caffeine are high-performance cache components developed in Java.
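A minimal wiring of the two named components might look like the following; the maximum sizes and expirations are assumed values, and the eviction behaviors (Guava Cache approximating LRU, Caffeine using a frequency-based policy) follow the characterization given above.
```java
import com.github.benmanes.caffeine.cache.Caffeine;
import com.google.common.cache.CacheBuilder;
import java.util.concurrent.TimeUnit;

// Sketch of the two third-party caches mentioned above. Sizes and expirations
// are assumptions; the patent only names the components and their eviction styles.
class HotspotCaches {
    // Guava Cache: size-bounded, evicts in roughly least-recently-used order.
    final com.google.common.cache.Cache<String, String> guavaCache =
            CacheBuilder.newBuilder()
                    .maximumSize(10_000)
                    .expireAfterAccess(10, TimeUnit.MINUTES)
                    .build();

    // Caffeine: frequency-aware eviction (W-TinyLFU), closer to an LFU policy.
    final com.github.benmanes.caffeine.cache.Cache<String, String> caffeineCache =
            Caffeine.newBuilder()
                    .maximumSize(10_000)
                    .expireAfterWrite(10, TimeUnit.MINUTES)
                    .build();
}
```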
In an optional embodiment, the method further comprises: setting a cache file expiration time; and determining the data to be deleted in the cache component according to the cache file expiration time, the time at which the data to be written was written into the master node, and the current time, and deleting the data to be deleted.
As the computing task executes, the data cached by the cache component grows larger and larger, and the remaining storage space of the cache component shrinks. To ensure sufficient storage space, the data to be deleted in the cache component is determined according to the preset cache file expiration time, the time at which the data to be written was written into the master node, and the current time, and is then deleted. That is, data is deleted when the current time minus the time at which it was written into the master node is greater than or equal to the cache file expiration time. In practical applications, expired data in the cache component can be cleaned up by a timed scheduling task.
As a specific example, the data to be written may be named according to the time at which it is written to the master node. For example, a cache file is named yyyy-MM-dd_HH.index, where yyyy denotes the year, MM the month, dd the day, HH the hour, and index the file suffix; an example is 2020-01-28_01.index. After data is written to the cache component, whether it has reached the cache file expiration time is determined from the file name, and once the expiration time is reached the data is automatically marked as cold data. Data marked as cold data can then be cleaned up. In practical applications, a background thread may be started on the master node and on each slave node; a timed task is generated with the scheduled execution service of the JDK, and this task clears data in the cache component that has passed the cache file expiration time.
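A sketch of such a timed cleanup task, assuming the yyyy-MM-dd_HH.index naming described above, is shown below; the scan interval, directory layout and error handling are illustrative assumptions.
```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Background cleanup: a JDK ScheduledExecutorService parses the hour stamp in
// each cache file name (e.g. 2020-01-28_01.index) and deletes files whose age
// has reached the configured expiration. Interval and layout are assumptions.
class CacheFileCleaner {
    void start(Path cacheDir, Duration expiration) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try (DirectoryStream<Path> files = Files.newDirectoryStream(cacheDir, "*.index")) {
                for (Path file : files) {
                    String name = file.getFileName().toString();      // 2020-01-28_01.index
                    LocalDate date = LocalDate.parse(name.substring(0, 10));
                    int hour = Integer.parseInt(name.substring(11, 13));
                    LocalDateTime writtenAt = date.atTime(hour, 0);
                    // delete when (current time - write time) >= cache file expiration
                    if (Duration.between(writtenAt, LocalDateTime.now())
                            .compareTo(expiration) >= 0) {
                        Files.deleteIfExists(file);
                    }
                }
            } catch (IOException | RuntimeException e) {
                // a failed pass is simply retried on the next scheduled run
            }
        }, 1, 1, TimeUnit.HOURS);
    }
}
```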
In an optional embodiment, the method for creating a cache component further includes: creating a data change event, wherein the data change event is used for triggering an event broadcast after the data to be written has been written to the local disk of the master node or synchronized to the local disk of a slave node, so that the computing nodes of the computing cluster can query the data to be written.
In this embodiment, the method creates an event model: the data change event. After the data to be written is written to the master node or to a slave node, an event broadcast is triggered. If a server node of the computing cluster implements the event listener interface, it can obtain the data written to the master node or the slave node.
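A minimal sketch of this event model in Java could look like the following; the listener interface and class names are assumptions, not identifiers from the patent.
```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Minimal data-change event model: compute nodes register a listener, and the
// cache node broadcasts an event after a record has been written to the
// master's disk or synchronized to a slave's disk. Names are assumptions.
class DataChangeEvents {
    interface DataChangeListener {
        void onDataChanged(String key, String value);
    }

    private final List<DataChangeListener> listeners = new CopyOnWriteArrayList<>();

    void subscribe(DataChangeListener listener) {
        listeners.add(listener);
    }

    // Called by the write/synchronization path once the record is on disk.
    void broadcast(String key, String value) {
        for (DataChangeListener listener : listeners) {
            listener.onDataChanged(key, value); // compute nodes can now query the new data
        }
    }
}
```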
In an alternative embodiment, the method may also construct a bloom filter based on the data change event; that is, by subscribing to the data change event, data written to the master node or a slave node is also written into the bloom filter. The bloom filter is mainly used to filter data.
As a specific example, a bloom filter may be created based on the full amount of data over a period of time, such as requiring only data over the last 10 days. In the present invention, the filtering condition of the bloom filter can be flexibly set according to the application scenario, and the present invention is not limited herein.
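Assuming Guava's BloomFilter is used, populating the filter from the data change event might look like the sketch below; the expected insertion count and false-positive rate are illustrative values.
```java
import com.google.common.hash.BloomFilter;
import com.google.common.hash.Funnels;
import java.nio.charset.StandardCharsets;

// Sketch of building the bloom filter from the data-change event: every key
// written to the master or synchronized to a slave is also added to the
// filter. Expected insertions and the false-positive rate are assumptions.
class BloomFilterSubscriber {
    private final BloomFilter<String> filter = BloomFilter.create(
            Funnels.stringFunnel(StandardCharsets.UTF_8), 1_000_000, 0.01);

    // Invoked from the data-change event broadcast (see the listener sketch above).
    void onDataChanged(String key) {
        filter.put(key); // record the key so later lookups can be filtered locally
    }

    boolean mightContain(String key) {
        return filter.mightContain(key);
    }
}
```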
In the prior art, there are generally two ways to construct a bloom filter: (1) each server node constructs a local bloom filter based on Redis keyspace event notifications; (2) a global bloom filter is constructed based on Redis, and all child server nodes access this bloom filter; the Redis ecosystem already provides a bloom filter component that can be called directly. A bloom filter can be used to test whether an element is in a set. In real-time computing it is mainly used to filter data: a hash value is generated as a key from a unique service identifier or a combination of several fields, and the bloom filter is built from these keys. Redis is a high-performance key-value database. However, constructing a bloom filter in the above ways has the following drawbacks: (1) when a local bloom filter is built from Redis keyspace event notifications, the full Redis key space must be scanned when a task starts; this scan is an expensive operation for Redis, must be executed once per server node, and consumes a large amount of time; (2) when a global bloom filter is built on Redis, every task must request the Redis cluster for every piece of data it processes, so a large number of I/O operations affect task execution; if the Redis cluster is under heavy load, Redis requests time out, which ultimately causes the real-time computing task to fail.
In contrast, constructing the bloom filter on the basis of the cache component of the embodiment of the invention reduces network requests and improves read-write performance; when the computing task is restarted, each cache node retains its data and can quickly recover it directly from its local disk.
In an optional embodiment, the method further comprises:
creating a hotspot data event, wherein the hotspot data event is used for collecting statistics on information related to the hotspot data. The related information of the hotspot data may include its name, read count, write time, update time, and the like.
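As an illustration, a hotspot data event carrying these statistics might be modeled as follows; the class and field names are assumptions.
```java
import java.time.Instant;

// Sketch of the information a hotspot-data event might carry, matching the
// statistics listed above (name, read count, write time, update time).
// Field and class names are assumptions.
class HotspotDataEvent {
    final String name;        // name/key of the hotspot data
    final long readCount;     // how many times it has been read
    final Instant writeTime;  // when it was first written
    final Instant updateTime; // when it was last updated

    HotspotDataEvent(String name, long readCount, Instant writeTime, Instant updateTime) {
        this.name = name;
        this.readCount = readCount;
        this.writeTime = writeTime;
        this.updateTime = updateTime;
    }
}
```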
In an alternative embodiment, the above configuration information may be recorded in the configuration file shown in Table 1 below; when the cache component is created, this file can be read directly to obtain the relevant configuration. Such a configuration file is easy to maintain and saves manual effort.
Table 1:
(Table 1, which lists the configuration items of the cache component, is reproduced only as an image in the original publication; its full text is not available here.)
In Table 1, connectorType and connectorConfigJson are used to configure external storage such as MySQL. Through these parameters, the data in the cache component can be written to the external storage for persistence.
Fig. 2 is a schematic diagram of main modules of an apparatus 200 for creating a cache component according to an embodiment of the present invention, as shown in fig. 2, the apparatus 200 includes:
a determining module 201, configured to determine a computing cluster for executing a real-time computing task, wherein the computing cluster comprises a plurality of server nodes and each server node serves as a cache node, so that a plurality of cache nodes are obtained;
an election module 201, configured to elect one cache node from the plurality of cache nodes as the master node, wherein the cache nodes other than the master node are slave nodes; the master node is used for writing data to be written into a local disk of the master node and synchronizing the data to be written into the local disk of the slave node; and
a creating module 202, configured to create a cache component based on the master node and the slave node.
Optionally, the election module 201 is further configured to elect one cache node from the plurality of cache nodes as the master node according to a Raft algorithm.
Optionally, the apparatus further includes an obtaining module, configured to obtain an IP address of the computing cluster.
Optionally, the apparatus further comprises a configuration module configured to configure the communication port.
Optionally, the apparatus further includes a hotspot module, configured to count the number of times the data cached in the cache component is queried, determine hotspot data according to the query counts, and store the hotspot data in a Java virtual machine.
Optionally, the apparatus further includes an elimination module, configured to set a cache elimination policy, and update the hotspot data according to the cache elimination policy.
Optionally, the apparatus further includes a data cleaning module, configured to set a cache file expiration time, determine the data to be deleted in the cache component according to the cache file expiration time, the time at which the data to be written was written into the master node, and the current time, and delete the data to be deleted.
Optionally, the apparatus further includes an event model creating module, configured to create a data change event, where the data change event is used to trigger an event broadcast after the data to be written is written into the local disk of the master node or synchronized into the local disk of the slave node, so that the computing nodes of the computing cluster query the data to be written.
Optionally, the apparatus further comprises a filter construction module configured to construct a bloom filter based on the data change event.
Optionally, the event model creation module is further configured to: and creating a hotspot data event, wherein the hotspot data event is used for counting the related information of the hotspot data.
The device for creating a cache component according to the embodiment of the present invention determines a computing cluster for executing a real-time computing task, wherein the computing cluster comprises a plurality of server nodes and each server node serves as a cache node, so that a plurality of cache nodes are obtained; elects one cache node from the plurality of cache nodes as the master node, the cache nodes other than the master node being slave nodes; the master node writes data to be written into its local disk and synchronizes the data to the local disks of the slave nodes; and creates a cache component based on the master node and the slave nodes. A distributed cache within the computing task is built on the cache component without depending on an external cache, so network requests are reduced and read-write performance is improved; when the computing task is restarted, each cache node retains its data and can quickly recover it directly from its local disk; and event subscription is supported, which accommodates data processing in different service scenarios.
The device can execute the method provided by the embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method. For technical details that are not described in detail in this embodiment, reference may be made to the method provided by the embodiment of the present invention.
Fig. 3 illustrates an exemplary system architecture 300 to which the method of creating a cache component or the apparatus for creating a cache component of embodiments of the invention may be applied.
As shown in fig. 3, the system architecture 300 may include terminal devices 301, 302, 303, a network 304, and a server 305. The network 304 serves as a medium for providing communication links between the terminal devices 301, 302, 303 and the server 305. Network 304 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal device 301, 302, 303 to interact with the server 305 via the network 304 to receive or send messages or the like. The terminal devices 301, 302, 303 may have various communication client applications installed thereon, such as shopping applications, web browser applications, search applications, instant messaging tools, mailbox clients, social platform software, and the like.
The terminal devices 301, 302, 303 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 305 may be a server providing various services, such as a background management server providing support for shopping websites browsed by the user using the terminal devices 301, 302, 303. The background management server may analyze and perform other processing on the received data such as the product information query request, and feed back a processing result (e.g., target push information and product information) to the terminal device.
It should be noted that the method for creating a cache component provided by the embodiment of the present invention is generally executed by the server 305, and accordingly, the apparatus for creating a cache component is generally disposed in the server 305.
It should be understood that the number of terminal devices, networks, and servers in fig. 3 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 4, a block diagram of a computer system 400 suitable for use with a terminal device implementing an embodiment of the invention is shown. The terminal device shown in fig. 4 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 4, the computer system 400 includes a Central Processing Unit (CPU)401 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)402 or a program loaded from a storage section 408 into a Random Access Memory (RAM) 403. In the RAM 403, various programs and data necessary for the operation of the system 400 are also stored. The CPU 401, ROM 402, and RAM 403 are connected to each other via a bus 404. An input/output (I/O) interface 405 is also connected to bus 404.
The following components are connected to the I/O interface 405: an input section 406 including a keyboard, a mouse, and the like; an output section 407 including a display device such as a Cathode Ray Tube (CRT), a Liquid Crystal Display (LCD), and the like, and a speaker; a storage section 408 including a hard disk and the like; and a communication section 409 including a network interface card such as a LAN card, a modem, or the like. The communication section 409 performs communication processing via a network such as the internet. A driver 410 is also connected to the I/O interface 405 as needed. A removable medium 411 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 410 as necessary, so that a computer program read out therefrom is mounted into the storage section 408 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication section 409, and/or installed from the removable medium 411. The computer program performs the above-described functions defined in the system of the present invention when executed by a Central Processing Unit (CPU) 401.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor includes a sending module, an obtaining module, a determining module, and a first processing module. The names of these modules do not in some cases constitute a limitation on the unit itself, and for example, the sending module may also be described as a "module that sends a picture acquisition request to a connected server".
As another aspect, the present invention also provides a computer-readable medium, which may be contained in the apparatus described in the above embodiments or may exist separately without being incorporated into the apparatus. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to:
determining a computing cluster for executing a real-time computing task, wherein the computing cluster comprises a plurality of server nodes, and each server node is used as a cache node to obtain a plurality of cache nodes;
selecting one cache node from the plurality of cache nodes as a master node, wherein the cache nodes other than the master node are slave nodes; the master node is used for writing data to be written into a local disk of the master node and synchronizing the data to be written into the local disk of the slave node;
creating a cache component based on the master node and the slave node.
A distributed cache in a computing task is constructed based on the cache component of the embodiment, and does not depend on an external cache, so that network requests can be reduced, and the read-write performance is improved; under the condition of restarting a computing task, each cache node retains data and can quickly recover the data directly based on a local disk; and event subscription is supported, and data processing of different service scenes is met.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (13)

1. A method of creating a cache component, comprising:
determining a computing cluster for executing a real-time computing task, wherein the computing cluster comprises a plurality of server nodes, and each server node is used as a cache node to obtain a plurality of cache nodes;
selecting one cache node from the plurality of cache nodes as a master node, wherein the cache nodes other than the master node are slave nodes; the master node is used for writing data to be written into a local disk of the master node and synchronizing the data to be written into the local disk of the slave node;
creating a cache component based on the master node and the slave node.
2. The method of claim 1, wherein selecting one of the plurality of cache nodes as the master node comprises:
selecting one cache node from the plurality of cache nodes as the master node according to a Raft algorithm.
3. The method of claim 1, further comprising:
acquiring an IP address of the computing cluster;
the synchronizing, by the master node, the data to be written to the local disk of the slave node includes:
and based on the IP address, the master node synchronizes the data to be written into a local disk of the slave node.
4. The method of claim 1, further comprising: configuring a communication port;
synchronizing the data to be written to the local disk of the slave node comprises:
based on the communication port, the master node synchronizes the data to be written into a local disk of the slave node.
5. The method of claim 1, further comprising:
counting the query times of the data cached in the cache component;
determining hotspot data according to the query times;
and storing the hot spot data into a Java virtual machine.
6. The method of claim 5, further comprising:
and setting a cache elimination strategy, and updating the hotspot data according to the cache elimination strategy.
7. The method of claim 1, further comprising:
setting a cache file expiration time;
and determining the data to be deleted in the cache component according to the cache file expiration time, the time at which the data to be written was written into the master node, and the current time, and deleting the data to be deleted.
8. The method of claim 1, further comprising:
and creating a data change event, wherein the data change event is used for triggering an event broadcast after the data to be written is written into a local disk of the master node or synchronized into a local disk of the slave node, so that the computing nodes of the computing cluster can query the data to be written.
9. The method of claim 8, further comprising:
and constructing a bloom filter based on the data change event.
10. The method of claim 5, further comprising:
and creating a hotspot data event, wherein the hotspot data event is used for counting the related information of the hotspot data.
11. An apparatus for creating a cache component, comprising:
a determining module, configured to determine a computing cluster for executing a real-time computing task, wherein the computing cluster comprises a plurality of server nodes and each server node serves as a cache node, so that a plurality of cache nodes are obtained;
an election module, configured to elect one cache node from the plurality of cache nodes as a master node, wherein the cache nodes other than the master node are slave nodes; the master node is used for writing data to be written into a local disk of the master node and synchronizing the data to be written into the local disk of the slave node; and
a creating module, configured to create a cache component based on the master node and the slave node.
12. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-10.
13. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-10.
CN202110110654.XA 2021-01-27 2021-01-27 Method and device for creating cache component Pending CN113783921A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110110654.XA CN113783921A (en) 2021-01-27 2021-01-27 Method and device for creating cache component

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110110654.XA CN113783921A (en) 2021-01-27 2021-01-27 Method and device for creating cache component

Publications (1)

Publication Number Publication Date
CN113783921A true CN113783921A (en) 2021-12-10

Family

ID=78835476

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110110654.XA Pending CN113783921A (en) 2021-01-27 2021-01-27 Method and device for creating cache component

Country Status (1)

Country Link
CN (1) CN113783921A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115002116A (en) * 2022-05-30 2022-09-02 紫光建筑云科技(重庆)有限公司 Distributed redis cluster on cloud platform and reliability detection method

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060072231A1 (en) * 2004-10-06 2006-04-06 Fischer Jonathan H Current mirrors having fast turn-on time
US20070288526A1 (en) * 2006-06-08 2007-12-13 Emc Corporation Method and apparatus for processing a database replica
CN103377100A (en) * 2012-04-26 2013-10-30 华为技术有限公司 Data backup method, network nodes and system
EP2876563A1 (en) * 2013-11-22 2015-05-27 Sap Se Transaction commit operations with thread decoupling and grouping of I/O requests
CN109639773A (en) * 2018-11-26 2019-04-16 中国船舶重工集团公司第七六研究所 A kind of the distributed data cluster control system and its method of dynamic construction
CN111258822A (en) * 2020-01-15 2020-06-09 广州虎牙科技有限公司 Data processing method, server and computer readable storage medium
CN111586147A (en) * 2020-04-30 2020-08-25 平安科技(深圳)有限公司 Node synchronization method, device, equipment and storage medium of block chain

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060072231A1 (en) * 2004-10-06 2006-04-06 Fischer Jonathan H Current mirrors having fast turn-on time
US20070288526A1 (en) * 2006-06-08 2007-12-13 Emc Corporation Method and apparatus for processing a database replica
CN103377100A (en) * 2012-04-26 2013-10-30 华为技术有限公司 Data backup method, network nodes and system
EP2876563A1 (en) * 2013-11-22 2015-05-27 Sap Se Transaction commit operations with thread decoupling and grouping of I/O requests
CN109639773A (en) * 2018-11-26 2019-04-16 中国船舶重工集团公司第七六研究所 A kind of the distributed data cluster control system and its method of dynamic construction
CN111258822A (en) * 2020-01-15 2020-06-09 广州虎牙科技有限公司 Data processing method, server and computer readable storage medium
CN111586147A (en) * 2020-04-30 2020-08-25 平安科技(深圳)有限公司 Node synchronization method, device, equipment and storage medium of block chain

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
DONGGANG CAO;PEIDONG LIU;WEI CUI;YEHONG ZHONG;BO AN;: "Cluster as a Service: A Resource Sharing Approach for Private Cloud", TSINGHUA SCIENCE AND TECHNOLOGY, no. 06 *
REN YONGJIAN; SHEN ZHIQIANG; ZHANG JILIN; WAN JIAN; YIN YUYU; JIANG CONGFENG: "Research on Block-Level Network Disk Cache Technology in Cloud Computing Systems (云计算系统中的块级别网络磁盘缓存技术研究)", Journal of Chinese Computer Systems (小型微型计算机系统), no. 03 *
查德威克 (CHADWICK): "ASP.NET MVC 4 Web Programming (《ASP.NET MVC 4 Web编程》)", 31 December 2013, pages 211-213 *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115002116A (en) * 2022-05-30 2022-09-02 紫光建筑云科技(重庆)有限公司 Distributed redis cluster on cloud platform and reliability detection method

Similar Documents

Publication Publication Date Title
US11379428B2 (en) Synchronization of client machines with a content management system repository
US11687555B2 (en) Conditional master election in distributed databases
US9514208B2 (en) Method and system of stateless data replication in a distributed database system
US9262324B2 (en) Efficient distributed cache consistency
US10097659B1 (en) High performance geographically distributed data storage, retrieval and update
US11741075B2 (en) Methods and system of tracking transactions for distributed ledger
EP2767912A2 (en) In-memory real-time synchronized database system and method
CN111818117A (en) Data updating method and device, storage medium and electronic equipment
CN112751847A (en) Interface call request processing method and device, electronic equipment and storage medium
CN103607424A (en) Server connection method and server system
JP2012234333A (en) Cluster system, synchronization control method, server device and synchronization control program
US10721335B2 (en) Remote procedure call using quorum state store
US11397632B2 (en) Safely recovering workloads within a finite timeframe from unhealthy cluster nodes
CN115185705A (en) Message notification method, device, medium and equipment
CN110737510B (en) Block device management system
US20060282524A1 (en) Apparatus, system, and method for facilitating communication between an enterprise information system and a client
CN113783921A (en) Method and device for creating cache component
CN112948498A (en) Method and device for generating global identification of distributed system
US11120007B2 (en) Module expiration management
US20190065327A1 (en) Efficient versioned object management
CN108701035B (en) Management of application properties
WO2024066001A1 (en) Data update method and apparatus, storage medium, and electronic apparatus
US20240061754A1 (en) Management of logs and cache for a graph database
CN114610740B (en) Data version management method and device of medical data platform
US20230289347A1 (en) Cache updates through distributed message queues

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination