CN117112189A - Data processing method, device, electronic equipment and storage medium - Google Patents

Info

Publication number
CN117112189A
Authority
CN
China
Prior art keywords
local table
data
attribute information
node
table node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210540576.1A
Other languages
Chinese (zh)
Inventor
迟成龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Zhenshi Information Technology Co Ltd
Original Assignee
Beijing Jingdong Zhenshi Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Zhenshi Information Technology Co Ltd filed Critical Beijing Jingdong Zhenshi Information Technology Co Ltd
Priority to CN202210540576.1A priority Critical patent/CN117112189A/en
Publication of CN117112189A publication Critical patent/CN117112189A/en
Pending legal-status Critical Current

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the invention disclose a data processing method and apparatus, an electronic device, and a storage medium. The method comprises: when a data write request for writing read data to be processed into a local table is received, determining a target local table node according to the current usage attribute information of each local table node; and writing the data to be processed into the target local table corresponding to the target local table node, and updating the current usage attribute information of each local table node according to its current usage attribute information and original configuration information, so that when a subsequent data write request is received, the target local table node is determined based on the updated current usage attribute information of each local table node. The technical solution writes the data to be processed into the local tables in a balanced manner, avoids data inconsistency during writing, and improves both the accuracy and the efficiency of data writing.

Description

Data processing method, device, electronic equipment and storage medium
Technical Field
Embodiments of the invention relate to the field of computer technology, and in particular to a data processing method and apparatus, an electronic device, and a storage medium.
Background
As big data technology develops, real-time data usually has to be computed on. When a large amount of real-time data exists, it is generally written into a distributed table first, and a data writing component provided by the distributed table then writes it into the local tables corresponding to that distributed table, so that queries and processing run against the data stored in the local tables.
In implementing the above approach, the inventors found the following problems:
At present, when the distributed table receives real-time data, it must split the data into several parts and forward them to the corresponding servers for storage. This increases network traffic between the servers and lowers the data write rate. Moreover, because transmission and storage are asynchronous, data may be lost if a server goes down in the meantime. In addition, data writes are unbalanced.
In summary, the conventional way of writing real-time data through a distributed table suffers from low write efficiency, data loss, and unbalanced writes.
Disclosure of Invention
The invention provides a data processing method and apparatus, an electronic device, and a storage medium, which write data to be processed directly into a local table under load balancing, thereby improving data processing efficiency and data consistency.
In a first aspect, an embodiment of the present invention provides a data processing method, including:
when a data write request for writing read data to be processed into a local table is received, determining a target local table node according to the current usage attribute information of each local table node;
and writing the data to be processed into the target local table corresponding to the target local table node, and updating the current usage attribute information of each local table node according to the current usage attribute information and the original configuration information of each local table node, so that when a data write request is received, the target local table node is determined based on the updated current usage attribute information of each local table node.
In a second aspect, an embodiment of the present invention further provides a data processing apparatus, including:
a target local table node determining module, configured to determine a target local table node according to the current usage attribute information of each local table node when a data write request for writing read data to be processed into a local table is received;
and an attribute information updating module, configured to write the data to be processed into the target local table corresponding to the target local table node, and to update the current usage attribute information of each local table node according to the current usage attribute information and the original configuration information of each local table node, so that when a data write request is received, the target local table node is determined based on the updated current usage attribute information of each local table node.
In a third aspect, an embodiment of the present invention further provides an electronic device, including:
one or more processors;
storage means for storing one or more programs,
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the data processing method according to any of the embodiments of the present invention.
In a fourth aspect, embodiments of the present invention also provide a storage medium containing computer-executable instructions which, when executed by a computer processor, are used to perform a data processing method according to any of the embodiments of the present invention.
In the technical solution of the embodiments of the invention, when a data write request for writing read data to be processed into a local table is received, the current usage attribute information of each local table node is determined, and the target local table node is determined from it, so that the data to be processed is written into the target local table corresponding to that node. At the same time, the current usage attribute information of each local table node is updated according to its current usage attribute information and its original configuration information, so that the next data write request selects the target local table node based on the updated values. This addresses three problems of the prior art, in which real-time data is written to local tables through a ClickHouse distributed table. First, the distributed table must split the real-time data into several parts and forward each part to the corresponding server, which increases network traffic between servers and lowers the write rate. Second, because the forwarding is asynchronous, data may be lost or become inconsistent if a server goes down during the transfer. Third, writes are not balanced across the local tables. By selecting the target local table node under load balancing and writing the data to be processed directly into the corresponding local table, the solution achieves balanced writing, avoids data inconsistency during writing, and improves the accuracy and efficiency of data writing.
Drawings
In order to more clearly illustrate the technical solution of the exemplary embodiments of the present invention, a brief description is given below of the drawings required for describing the embodiments. It is obvious that the drawings presented are only drawings of some of the embodiments of the invention to be described, and not all the drawings, and that other drawings can be made according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of a data processing method according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of a data processing method according to an embodiment of the present invention;
FIG. 3 is a schematic flow chart of a data processing method according to an embodiment of the present invention;
FIG. 4 is a schematic flow chart of a data processing method according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a data processing apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the drawings and examples. It is to be understood that the specific embodiments described herein are merely illustrative of the invention and are not limiting thereof. It should be further noted that, for convenience of description, only some, but not all of the structures related to the present invention are shown in the drawings.
Before the technical solution is introduced, its application scenario and the terms used are described. In scenarios that require big data analysis, real-time data can be written into local tables based on this technical solution, so that information for each metric is obtained by analyzing the data written to the local tables, enabling more accurate services. A local table here is a ClickHouse local table, that is, a table in which ClickHouse actually stores data. Each local table has a corresponding distributed node, and the distributed nodes correspond to a ClickHouse distributed table. A ClickHouse distributed table uses ClickHouse's Distributed table engine; it can be understood as a view that does not itself store any real-time data but can be associated with all local tables for operations such as queries.
FIG. 1 is a schematic flow chart of a data processing method according to an embodiment of the present invention. The method is applicable where a target local table node for writing is determined under load balancing, and the data to be processed (real-time data) stored in a data source is then written directly into the target local table corresponding to that node. The method may run on hardware such as an electronic device, for example a mobile terminal, a PC, or a server.
As shown in fig. 1, the method includes:
S110. When a data write request for writing read data to be processed into a local table is received, determine the target local table node according to the current usage attribute information of each local table node.
The data to be processed is data that is to be written into a local table and analyzed. For example, after a user logs in to a platform and browses its web pages, the user's operation behavior data on each page can be determined from event-tracking (instrumentation point) information. By analyzing the operation behavior data of all users, statistical results for each metric can be determined. Data is generally stored in a data source; to analyze it, the data in the data source must be imported into a local table, and the values of the various metrics are then determined from the operation behavior data stored in the local table. The data currently being imported from the data source into the local table is taken as the data to be processed. When data in the data source needs to be written into a local table, a data write request can be generated, and when the request is received, the current usage attribute information of each local table node is obtained. A local table is the table in which ClickHouse stores data, and it supports fast reads. Each local table is represented by a corresponding local table node; that is, each local table node corresponds to a local table. The local table that is finally written to is taken as the target local table, and the node corresponding to it as the target local table node. Each local table node has corresponding usage attribute information, which changes dynamically; the node attribute information of each local table node at the moment the data write request is received is taken as the current usage attribute information.
The current usage attribute information may include a usage weight of each local table node or usage configuration information.
Specifically, when a data write request for writing read data to be processed into a local table is received, the current usage attribute information of each local table node at the current moment can be determined, and the target local table node is determined from it.
In this embodiment, the data write request must be generated before it can be received.
Optionally, the data to be processed is read from each data source based on a preset timing task, and the data write request is generated based on the data to be processed.
The timing task may be implemented in pre-written program code. Optionally, the timing task specifies the interval at which data to be processed is read from the upstream data source, for example every five or ten minutes. The data source may be, but is not limited to, Kafka or HDFS. A data write request may be generated based on the data to be processed obtained from the data source.
For example, if the preset timing task reads the data to be processed from the data source every ten minutes, a data write request may be generated from the data read on each run.
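The timing task described above can be sketched roughly as follows. This is only an illustration under stated assumptions: `WriteRequest`, `poll_source`, and `fake_reader` are hypothetical names that do not appear in the source, and the reader is an in-memory stand-in rather than a real Kafka or HDFS client.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class WriteRequest:
    """Hypothetical request carrying one batch of data to be processed."""
    source: str
    rows: List[dict]


def poll_source(source: str,
                read_batch: Callable[[str], List[dict]]) -> Optional[WriteRequest]:
    """Run one tick of the timing task: read a batch and wrap it in a request.

    Returns None when the source had nothing new, so no request is generated.
    """
    rows = read_batch(source)
    return WriteRequest(source=source, rows=rows) if rows else None


def fake_reader(source: str) -> List[dict]:
    # Stand-in reader (assumption); a real job would consume from Kafka
    # or read HDFS files at this point.
    return [{"user": "u1", "page": "/home"}] if source == "kafka" else []


request = poll_source("kafka", fake_reader)  # a batch was read
empty = poll_source("hdfs", fake_reader)     # nothing to write this tick
```

A scheduler of any kind would call `poll_source` at the configured interval (every five or ten minutes in the text) and hand each non-empty request to the load balancing step.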
In this embodiment, the current usage attribute information includes a current usage weight value. Correspondingly, determining the target local table node according to the current usage attribute information of each local table node includes: obtaining the current usage weight value of each local table node from the attribute cache container, and determining the local table node with the largest current usage weight value as the target local table node.
The attribute cache container mainly stores the node attribute information of each local table node. Whenever data to be processed is written into the target local table, the node attribute information of the local table nodes is updated, so the node attribute information in effect when a data write request is received serves as the current usage attribute information. In other words, after the data is written and the attribute information of each node is recomputed, the result overwrites the previous value in the attribute cache container: the container holds exactly one piece of usage attribute information per local table node, and that information is updated dynamically. Its update frequency can match that of the timing task described above. The current usage weight value characterizes how strongly the corresponding local table node should be preferred; optionally, the larger the value, the more the node should be selected as the target local table node. The local table node with the largest current usage weight value can therefore be taken as the target local table node.
Specifically, the current usage weight value of each local table node is read from the attribute cache container, and the local table node with the largest weight value is taken as the target local table node.
For example, referring to FIG. 2, a Flink SQL job or another data writing component may be developed to implement the ClickHouse calling procedure. The data to be processed is read from the data source, Kafka or HDFS, by a preset timing task, at which point a data write request may be generated. The load balancing agent component (handler) receives the data write request and, based on it, sends the load balancing module an instruction to obtain the current usage attribute information of each local table node. On receiving the instruction, the load balancing module determines from the attribute cache container the local table node with the largest current usage weight value and returns that node's identifier to the load balancing agent component. With the node identifier, the agent component knows the target local table node and can write the data to be processed into the corresponding local table, that is, into the ClickHouse local table.
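The request flow in this example (the agent component asks the load balancing module for a node, then writes to that node's local table) might look like the sketch below. All class and method names are hypothetical, the local tables are plain in-memory lists standing in for ClickHouse local tables, and the tie-breaking rule is an assumption because the source does not specify one. The weight update of step S120 is deliberately left out here.

```python
from typing import Dict, List


class LoadBalancer:
    """Holds the attribute cache: node id -> current usage weight value."""

    def __init__(self, current_weights: Dict[str, int]) -> None:
        self.attribute_cache = dict(current_weights)

    def get_max(self) -> str:
        # Node with the largest current usage weight value. Ties go to the
        # first node in insertion order (assumption, not stated in the source).
        return max(self.attribute_cache, key=self.attribute_cache.get)


class WriteHandler:
    """Load balancing agent: asks the balancer for a node, then writes to it."""

    def __init__(self, balancer: LoadBalancer,
                 local_tables: Dict[str, List[dict]]) -> None:
        self.balancer = balancer
        self.local_tables = local_tables

    def handle(self, rows: List[dict]) -> str:
        node_id = self.balancer.get_max()
        # Stand-in for inserting into the ClickHouse local table on that node.
        self.local_tables[node_id].extend(rows)
        return node_id


tables: Dict[str, List[dict]] = {"ck-1": [], "ck-2": [], "ck-3": []}
balancer = LoadBalancer({"ck-1": 30, "ck-2": 40, "ck-3": 10})
handler = WriteHandler(balancer, tables)
target = handler.handle([{"user": "u1", "page": "/home"}])
```

With these illustrative weights, `target` is `"ck-2"`, and only that node's table receives the rows. The data goes straight to one local table instead of being split and forwarded by a distributed table.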
S120. Write the data to be processed into the target local table corresponding to the target local table node, and update the current usage attribute information of each local table node according to the current usage attribute information and the original configuration information of each local table node, so that when a data write request is received, the target local table node is determined based on the updated current usage attribute information of each local table node.
Each local table node has a corresponding local table, and the local table corresponding to the target local table node is the target local table. The original configuration information may be an original configuration weight set in advance for each local table node. Local tables are typically deployed on electronic devices, and the original configuration information can be set per device according to its performance: optionally, a high-performance device gets a relatively large weight and a low-performance device a relatively small one. The benefit of configuring different original configuration information for different local tables is that the usage attribute information of each local table node can be derived from it, and data can be written to the local tables in a balanced way accordingly. This avoids the imbalance in which more data is written to poorly performing devices and less to well performing ones, which would degrade how well the devices process the data.
Specifically, after the target local table is determined according to the current usage attribute information of each local table node, the data to be processed can be written into it. While the data is being written, the node attribute information of each local table node is updated, and the updated values become the current usage attribute information, so that when the next data write request is received, the target local table node is determined from the updated current usage attribute information of each local table node.
In summary, while data is written into the target local table, the node attribute information of each local table node is updated according to its current usage attribute information and original configuration information, and the result is used as the new current usage attribute information. When a data write request is received, the target local table node is determined based on the updated current usage attribute information of the local table nodes. That is, the current usage attribute information of the local table node corresponding to each local table changes dynamically, and it changes at the moment the data to be processed is written to the target local table.
In the technical solution of this embodiment, when a data write request for writing read data to be processed into a local table is received, the target local table node is determined from the current usage attribute information of each local table node and the data is written into the corresponding target local table. The current usage attribute information of each node is then updated from its current value and its original configuration information, so that the next request is routed using the updated values. Compared with writing through a ClickHouse distributed table, which splits the real-time data and forwards the parts between servers, this avoids the extra network traffic and the resulting low write rate, avoids the data loss and inconsistency that asynchronous forwarding can cause when a server goes down, and avoids unbalanced writes. The data to be processed is written directly and evenly into the local tables, which improves the accuracy and efficiency of data writing.
FIG. 3 is a schematic flow chart of a data processing method provided by an embodiment of the present invention. Building on the foregoing embodiment, the original configuration information of each local table node is determined first, so that the current usage attribute information of each local table node can be adjusted according to it. The original configuration information is also monitored, so that when it changes, the current usage attribute information of the corresponding local table node is adjusted based on the changed value. The target local table node is thus determined in a balanced manner, and the data to be processed is then written into the corresponding local table. Technical terms identical or corresponding to those of the above embodiment are not explained again here.
As shown in FIG. 3, the method includes:
S210. Determine the original configuration information of each local table node.
The original configuration information is determined according to the performance of the device hosting each local table. Optionally, the higher the device performance and computing power, the higher the weight value in the original configuration information; otherwise the weight value is lower.
Note that the weight values in the original configuration information of all local table nodes need not sum to 1.
For example, referring to FIG. 4, local table node information is configured for ClickHouse (the CK node configuration), including the weight values. Suppose there are three local tables and, correspondingly, three local table nodes. The device for local table node 1 has average performance, so its configuration information may be 30; the device for local table node 2 performs better, so its configuration information may be 40; and the device for local table node 3 performs poorly, so its configuration information may be 10. The configuration values of all local table nodes need not sum to 100; the sum may be any value.
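Because the weights need not sum to 1 or 100, only their ratios matter. The short sketch below computes the share each node would receive for the weights 30, 40, and 10 from the example, assuming writes end up distributed in proportion to the weights; the node names are illustrative.

```python
from fractions import Fraction

# Original configuration weights from the example; node names are assumed.
original_weights = {"node-1": 30, "node-2": 40, "node-3": 10}

total = sum(original_weights.values())

# Share of writes each node would receive under proportional distribution.
shares = {node: Fraction(weight, total)
          for node, weight in original_weights.items()}

for node, share in shares.items():
    print(f"{node}: weight {original_weights[node]} -> share {share}")
```

Scaling every weight by the same factor (for example 3, 4, and 1) leaves the shares unchanged, which is why the sum can be any value.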
S220. Store the node identifier of each local table node and the corresponding original configuration information in the load balancing cache container according to a preset data type, so that when data to be processed is detected to have been written into the target local table, the current usage attribute information of the corresponding local table node is updated according to the original configuration information stored in the load balancing cache container.
The load balancing cache container mainly stores the original configuration information of each local table node; it can be understood simply as the storage space holding each node's original configuration. The preset data type may be a map. Data in a map is stored as key-value pairs, optionally as Map<String, loadBalance>, where the key is the node identifier of a local table node and the value is the original configuration information of that node.
Specifically, after the original configuration information of each local table node is determined, the node identifier of the local table node and the corresponding original configuration information may be stored in the map cache container in the preset manner.
For example, with continued reference to FIG. 4, a policy (strategy) pattern defines the load balancing policies, that is, the original configuration information of each local table node. All load balancing policies are preloaded into the policy container, that is, the load balancing cache container: the load balancing weights (the original configuration information of all local table nodes) are stored in the map cache container according to the preset data type.
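A minimal sketch of such a map-typed cache container follows, using a Python `dict` in place of the Java-style map mentioned in the text. The `LoadBalance` fields and the `load_policies` helper are assumptions made for illustration; the source only says that the key is the node identifier and the value is the original configuration information.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class LoadBalance:
    """Original configuration of one local table node (fields are assumed)."""
    node_id: str
    original_weight: int


def load_policies(configured: Dict[str, int]) -> Dict[str, LoadBalance]:
    """Preload every node's original configuration, keyed by node identifier."""
    return {node_id: LoadBalance(node_id, weight)
            for node_id, weight in configured.items()}


# Load balancing cache container: node identifier -> original configuration.
load_balance_cache = load_policies({"node-1": 30, "node-2": 40, "node-3": 10})
```

Keeping the original weights in their own container, separate from the current usage weights, lets the update step read a stable reference value after every write.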
S230. Monitor the original configuration information stored in the load balancing cache container with a monitoring module, and when a change in the original configuration information is detected, update each piece of current usage attribute information in the attribute cache container based on the changed original configuration information.
The monitoring module may be based on ZooKeeper asynchronous monitoring. ZooKeeper monitors whether the original configuration information in the load balancing map container changes and re-initializes it in real time, which decouples the original configuration information from the system. Monitoring may be active or passive. Active monitoring means checking in real time whether the original configuration information stored in the load balancing cache container has changed. Passive monitoring means that the monitoring module is notified when the original configuration information stored in the load balancing cache container changes; the monitoring module then sends the updated original configuration information to the attribute cache container, which updates the current usage attribute information it stores.
For example, with continued reference to FIG. 4, the original configuration information stored in the load balancing container is monitored asynchronously with ZooKeeper. That is, ZooKeeper watches the original configuration information in the load balancing map cache container and re-initializes it in real time, so that when a change is detected, the current usage attribute information of each local table node stored in the attribute cache container is updated based on the updated original configuration information.
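The passive (notification-based) variant can be illustrated with a plain callback, as sketched below. This sketch does not use a ZooKeeper client; `LoadBalanceCache`, `subscribe`, and `on_config_change` are hypothetical names. How exactly the current usage weight is recomputed after a configuration change is not spelled out in the source, so resetting it to the new original weight is an assumption.

```python
from typing import Callable, Dict, List


class LoadBalanceCache:
    """Original weights plus listeners that are told about every change."""

    def __init__(self, weights: Dict[str, int]) -> None:
        self._weights = dict(weights)
        self._listeners: List[Callable[[str, int], None]] = []

    def subscribe(self, listener: Callable[[str, int], None]) -> None:
        self._listeners.append(listener)

    def set_weight(self, node_id: str, weight: int) -> None:
        if self._weights.get(node_id) == weight:
            return  # nothing changed, so nobody is notified
        self._weights[node_id] = weight
        for listener in self._listeners:
            listener(node_id, weight)


# Attribute cache container: node id -> current usage weight value.
attribute_cache: Dict[str, int] = {"node-1": 30, "node-2": 40}


def on_config_change(node_id: str, weight: int) -> None:
    # Assumption: re-initialize the current usage weight to the new
    # original weight when the configuration changes.
    attribute_cache[node_id] = weight


config = LoadBalanceCache({"node-1": 30, "node-2": 40})
config.subscribe(on_config_change)
config.set_weight("node-1", 50)  # triggers the listener
config.set_weight("node-2", 40)  # unchanged value, no notification
```

An active variant would instead poll the configuration on a timer and compare it with the last value seen.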
S240. When a data write request for writing read data to be processed into a local table is received, determine the target local table node according to the current usage attribute information of each local table node.
For example, with continued reference to FIG. 4, when a data write request is received, the load balancing agent module calls the getMax() method to find the local table node with the largest weight value in the current usage attribute information and takes it as the target local table node, so that the data to be processed can be written into the target local table corresponding to that node.
S250, writing the data to be processed into a target local table corresponding to the target local table node, and updating the current use attribute information of each local table node according to the current use attribute information and the original configuration information of each local table node so as to determine the target local table node based on the updated current use attribute information of each local table node when a data writing request is received.
In this embodiment, the updating the current usage attribute information of each local table node according to the current usage attribute information and the original configuration information of each local table node includes: updating the current use attribute information of the target local table node according to the current use attribute information, the original configuration information and the total use attribute information of each local table node of the target local table node; and for each local table node except the target local table node, updating the current use attribute information of the current local table node according to the current use attribute information and the original configuration information of the current local table node.
When data is written into the local tables, in order to select the target local table node in a balanced way, the current usage attribute information of each local table node may be determined dynamically, and the target local table node may then be determined according to the updated current usage attribute information.
In this embodiment, the current usage attribute information of each local table node may be updated while the data to be processed is written into the target local table. However, to further balance the selection of the target local table node, and thus the distribution of the written data, the target local table node and the other local table nodes are updated in different ways.
Specifically, for the target local table node, the current usage attribute information at the time the data to be processed is written and the original configuration information of the target local table node are acquired, and a configuration intermediate value is determined from them. The current usage attribute information of the target local table node is then updated according to the configuration intermediate value and the total usage attribute information, that is, the total configuration value obtained from the original configuration information of all local table nodes. For each non-target local table node, the current usage attribute information is updated according to that node's current usage attribute information and original configuration information.
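The two update rules can be written compactly. The symbols below are introduced only for illustration and do not appear in the original text: w_i is the current usage weight of node i, c_i is its original configured weight, T is the total configured weight of all nodes, and t is the target node selected for the current write.

```latex
T = \sum_{j} c_j, \qquad
w_i' =
\begin{cases}
  w_i + c_i - T, & i = t \\
  w_i + c_i,     & i \neq t
\end{cases}
```

Under these rules every update adds T across all nodes and subtracts T from the target, so the sum of the current weights stays constant from one request to the next.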
Illustratively, suppose there are three local table nodes A, B and C, whose original configuration information, that is, their original configuration weights, are 10, 20 and 30 respectively. When a data writing request is received for the first time, the current usage attribute information of each local table node equals its original configuration information, so local table node C, which has the largest current usage attribute information, is determined to be the target local table node. The data to be processed is written into the target local table corresponding to the target local table node, and the current usage attribute information of each local table node is updated as described above. Since local table node C is the target local table node, its current usage attribute information is 30, its original configuration information is 30, and the total usage attribute information of all local table nodes is 60 (10+20+30), so its updated current usage attribute information is 30+30-60=0. The updated current usage attribute information of local table node B is 20+20=40, and that of local table node A is 10+10=20. When a data writing request for writing data to be processed into the local table is received again, the current usage attribute information of local table nodes A, B and C is 20, 40 and 0 respectively, so the target local table node is determined to be B. When the data to be processed is written into the local table corresponding to target local table node B, the above steps are repeated to update the current usage attribute information of each local table node.
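The arithmetic of this example can be reproduced with a short sketch. The function name `select_and_update` and the dictionary-based storage are assumptions made for illustration, as is the tie-breaking rule (the first node in insertion order wins), which the text does not specify.

```python
from typing import Dict


def select_and_update(current: Dict[str, int], original: Dict[str, int]) -> str:
    """Pick the node with the largest current weight, then refresh all weights."""
    total = sum(original.values())
    target = max(current, key=current.__getitem__)
    # Every node gains its configured weight ...
    for node, configured in original.items():
        current[node] += configured
    # ... and the selected node additionally gives up the total configured weight.
    current[target] -= total
    return target


original = {"A": 10, "B": 20, "C": 30}
current = dict(original)  # first request: current weights equal the configured ones

first = select_and_update(current, original)
after_first = dict(current)
second = select_and_update(current, original)
print(first, after_first, second)  # prints: C {'A': 20, 'B': 40, 'C': 0} B
```

Continuing for six requests in total, this sketch selects A once, B twice and C three times, in line with the 10:20:30 configuration, and the current weights return to their starting values.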
In this embodiment, the updated current usage attribute information is stored in an attribute cache container, so that when a data writing request is received, the current usage attribute information of each local table node is read from the attribute cache container.
It can be understood that the newly determined current usage attribute information of each local table node can be written back into the attribute cache container, so that when a data writing request is received, the current usage attribute information of each local table node is read from the attribute cache container and the target local table node is then determined according to that information.
As an optional implementation of the foregoing embodiment, the overall flow of the technical solution may be as follows. Business code is developed following the normal development flow, a FlinkSQL sink is developed according to business requirements, and data is read from upstream data sources, including but not limited to Kafka and HDFS. Previously, the flow set the parallelism and then accessed ClickHouse and wrote to it directly. Now an agent is added in front of the write: the load balancing agent handler is called, it returns the local node with the largest weight, and the data to be processed is then written into the ClickHouse local table corresponding to that local node. The current usage attribute information of each local table node may be determined by the load balancing agent handler.
The first step configures the ClickHouse (ck) local table nodes, that is, original configuration information (an original configuration weight) is configured for each local table node; for example, the original configuration information of node 1 is 30, that of node 2 is 40, and that of node 3 is 10. The node weight values do not have to sum to 100. Next, the load balancing strategies are loaded: the strategy pattern is used to define the load balancing strategies, and all strategies are preloaded into a strategy container, which is a map container storing them in key-value form: Map<String, LoadBalance>. According to the load balancing strategy, the ck node weight configuration is loaded into an initial weight cache container Map. A corresponding cache container, the current weight cache container Map, records the current weight value of each node and is refreshed in real time after each call.
Meanwhile, asynchronous ZooKeeper monitoring is configured: ZooKeeper is used to monitor changes to the values in the load balancing map container and to re-initialize them in real time. This decouples the weight configuration from the system, so that a change made in one place can be applied at any location. When a request is received, the getMax() method is called to return the node with the largest current weight value, and the current weight cache pool, that is, the attribute cache container, is then refreshed. The refresh follows the weight proportions: the weights of all nodes are recalculated to obtain the new current weights.
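The flow above (strategy container, initial and current weight containers, agent handler in front of the write) can be sketched as follows. All names here (`LoadBalanceProxy`, `max_weight_strategy`, `write_fn`) are hypothetical, the strategy container is a plain dictionary standing in for Map<String, LoadBalance>, and the write itself is a stub rather than a real ClickHouse client call.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical strategy signature: (current weights, configured weights) -> chosen node.
Strategy = Callable[[Dict[str, int], Dict[str, int]], str]


def max_weight_strategy(current: Dict[str, int], original: Dict[str, int]) -> str:
    """Return the node with the largest current weight and refresh the weights."""
    target = max(current, key=current.__getitem__)
    total = sum(original.values())
    for node, configured in original.items():
        current[node] += configured
    current[target] -= total
    return target


class LoadBalanceProxy:
    """Agent placed in front of the write; picks a node, then delegates the write."""

    def __init__(self, original: Dict[str, int],
                 write_fn: Callable[[str, List[str]], None],
                 strategy_name: str = "max_weight") -> None:
        # Strategy container keyed by name (stand-in for Map<String, LoadBalance>).
        self.strategies: Dict[str, Strategy] = {"max_weight": max_weight_strategy}
        self.original = dict(original)  # initial weight cache container
        self.current = dict(original)   # current weight cache container
        self.write_fn = write_fn
        self.strategy_name = strategy_name

    def handle(self, rows: List[str]) -> str:
        node = self.strategies[self.strategy_name](self.current, self.original)
        self.write_fn(node, rows)  # stub for writing to that node's local table
        return node


written: List[Tuple[str, int]] = []
proxy = LoadBalanceProxy({"node1": 30, "node2": 40, "node3": 10},
                         write_fn=lambda node, rows: written.append((node, len(rows))))
for batch in (["r1", "r2"], ["r3"], ["r4"]):
    proxy.handle(batch)
print(written)
```

Keeping the strategies in a keyed container means another selection rule could be registered without touching the handler, which is the usual motivation for the strategy pattern.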
According to the technical solution of this embodiment, when a data writing request for writing the read data to be processed into a local table is received, the current usage attribute information of each local table node is determined, and the target local table node is determined according to that information, so that the data to be processed is written into the target local table corresponding to the target local table node. Meanwhile, the current usage attribute information of each local table node is updated according to its current usage attribute information and the corresponding original configuration information, so that when a data writing request is received again, the target local table node is determined based on the updated current usage attribute information. This addresses the following problems in the prior art: when real-time data is written through a ClickHouse distributed table, the data must be split into several parts and each part forwarded to the corresponding server, which increases network traffic between servers and lowers the write rate; and when data is written based on the ClickHouse distributed table, a server going down can leave the data inconsistent. The solution achieves balanced writing of the data to be processed into the corresponding local tables, avoids data inconsistency during writing, and improves the accuracy and efficiency of data writing.
Fig. 5 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present invention. As shown in Fig. 5, the apparatus includes a target local table node determining module 310 and an attribute information updating module 320.
The target local table node determining module 310 is configured to determine, when a data writing request for writing the read data to be processed into the local table is received, a target local table node according to the current usage attribute information of each local table node. The attribute information updating module 320 is configured to write the data to be processed into the target local table corresponding to the target local table node, and to update the current usage attribute information of each local table node according to the current usage attribute information and the original configuration information of each local table node, so that when a data writing request is received, the target local table node is determined based on the updated node attribute information of each local table node.
On the basis of the above technical solution, the apparatus further includes a data writing request generation module, configured to read the data to be processed from each data source based on a preset timing task and to generate the data writing request based on the data to be processed.
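A rough sketch of such a timed task, under several assumptions: each data source is represented by a hypothetical zero-argument callable returning a batch of rows, `WriteRequest` is an illustrative container rather than a structure defined in this document, and the schedule is a simple fixed-interval loop.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class WriteRequest:
    """Illustrative data writing request: where the rows came from, and the rows."""
    source: str
    rows: List[dict]


def poll_sources(sources: Dict[str, Callable[[], List[dict]]]) -> List[WriteRequest]:
    """One tick of the task: read each source and wrap non-empty batches."""
    requests = []
    for name, read_batch in sources.items():
        rows = read_batch()
        if rows:
            requests.append(WriteRequest(source=name, rows=rows))
    return requests


def run_timed_task(sources: Dict[str, Callable[[], List[dict]]],
                   handle: Callable[[WriteRequest], None],
                   ticks: int,
                   interval_seconds: float = 0.0) -> None:
    """Poll the sources a fixed number of times, pausing between ticks."""
    for _ in range(ticks):
        for request in poll_sources(sources):
            handle(request)
        time.sleep(interval_seconds)


# Stub source that yields one batch per tick; a real source would be Kafka, HDFS, etc.
batches = iter([[{"id": 1}], [], [{"id": 2}, {"id": 3}]])
sources = {"kafka_stub": lambda: next(batches, [])}
received: List[WriteRequest] = []
run_timed_task(sources, received.append, ticks=3)
print([(r.source, len(r.rows)) for r in received])
```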
On the basis of the above technical solution, the target local table node determining module is further configured to:
obtain the current usage weight value of each local table node from the attribute cache container, and determine the local table node with the largest current usage weight value as the target local table node.
On the basis of the above technical solutions, the apparatus further includes:
a configuration information determining module, configured to determine the original configuration information of each local table node; and
a configuration information writing module, configured to store the node identifier of each local table node and the corresponding original configuration information into the load balancing cache container according to a preset data type, so that when it is detected that the data to be processed has been written into the target local table, the current usage attribute information of the corresponding local table nodes is updated according to the original configuration information stored in the load balancing cache container.
On the basis of the above technical solutions, the apparatus further includes:
a monitoring module, configured to monitor the original configuration information stored in the load balancing cache container and, when a change in the original configuration information is detected, update each piece of current usage attribute information in the attribute cache container based on the changed original configuration information.
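One way the monitoring behaviour could look is sketched below. The class `WeightCaches` and its callback `on_config_changed` are hypothetical; no real ZooKeeper client is used, and the callback is simply invoked by hand where a configuration watcher would call it. Resetting the current weights to the new configuration is an assumption, since the text only says the current information is updated based on the changed configuration.

```python
import threading
from typing import Dict


class WeightCaches:
    """Holds the two containers and reacts to configuration changes."""

    def __init__(self, original: Dict[str, int]) -> None:
        self._lock = threading.Lock()
        self.original = dict(original)  # load balancing cache container
        self.current = dict(original)   # attribute cache container

    def on_config_changed(self, new_original: Dict[str, int]) -> None:
        """Callback a configuration watcher would invoke when weights change."""
        with self._lock:
            self.original = dict(new_original)
            # Assumption: re-initialize the current weights from the new configuration.
            self.current = dict(new_original)

    def snapshot(self) -> Dict[str, int]:
        """Return a copy of the current weights for use by the selection step."""
        with self._lock:
            return dict(self.current)


caches = WeightCaches({"A": 10, "B": 20, "C": 30})
caches.on_config_changed({"A": 10, "B": 20, "C": 30, "D": 40})
print(caches.snapshot())
```

The lock is there because the watcher callback and the write path would typically run on different threads; copying on read keeps callers from mutating the shared container by accident.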
On the basis of the above technical solutions, the attribute information updating module includes:
a target local table node updating unit, configured to update the current usage attribute information of the target local table node according to the current usage attribute information and the original configuration information of the target local table node and the total usage attribute information of all local table nodes; and
an other local table node updating unit, configured to, for each local table node other than the target local table node, update the current usage attribute information of that local table node according to its current usage attribute information and original configuration information.
On the basis of the above technical solutions, an attribute information caching unit is configured to store the updated current usage attribute information into an attribute cache container, so that the current usage attribute information of each local table node is read from the attribute cache container when a data writing request is received.
According to the technical solution of this embodiment, when a data writing request for writing the read data to be processed into a local table is received, the current usage attribute information of each local table node is determined, and the target local table node is determined according to that information, so that the data to be processed is written into the target local table corresponding to the target local table node. Meanwhile, the current usage attribute information of each local table node is updated according to its current usage attribute information and the corresponding original configuration information, so that when a data writing request is received again, the target local table node is determined based on the updated current usage attribute information. This addresses the following problems in the prior art: when real-time data is written through a ClickHouse distributed table, the data must be split into several parts and each part forwarded to the corresponding server, which increases network traffic between servers and lowers the write rate; and when data is written based on the ClickHouse distributed table, a server going down can leave the data inconsistent. The solution achieves balanced writing of the data to be processed into the corresponding local tables, avoids data inconsistency during writing, and improves the accuracy and efficiency of data writing.
The data processing device provided by the embodiment of the invention can execute the data processing method provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
It should be noted that each unit and module included in the above apparatus are only divided according to the functional logic, but not limited to the above division, so long as the corresponding functions can be implemented; in addition, the specific names of the functional units are also only for distinguishing from each other, and are not used to limit the protection scope of the embodiments of the present invention.
Fig. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present invention. Fig. 6 shows a block diagram of an exemplary electronic device 40 suitable for use in implementing the embodiments of the present invention. The electronic device 40 shown in fig. 6 is merely an example and should not be construed as limiting the functionality and scope of use of embodiments of the present invention.
As shown in fig. 6, the electronic device 40 is in the form of a general purpose computing device. Components of electronic device 40 may include, but are not limited to: one or more processors or processing units 401, a system memory 402, a bus 403 that connects the various system components (including the system memory 402 and the processing units 401).
Bus 403 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, and a processor or local bus using any of a variety of bus architectures. By way of example, and not limitation, such architectures include the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MCA) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
Electronic device 40 typically includes a variety of computer system readable media. Such media can be any available media that is accessible by electronic device 40 and includes both volatile and non-volatile media, removable and non-removable media.
The system memory 402 may include computer system readable media in the form of volatile memory, such as Random Access Memory (RAM) 404 and/or cache memory 405. Electronic device 40 may further include other removable/non-removable, volatile/nonvolatile computer system storage media. By way of example only, storage system 406 may be used to read from and write to non-removable, nonvolatile magnetic media (not shown in FIG. 6, commonly referred to as a "hard drive"). Although not shown in fig. 6, a magnetic disk drive for reading from and writing to a removable non-volatile magnetic disk (e.g., a "floppy disk"), and an optical disk drive for reading from or writing to a removable non-volatile optical disk (e.g., a CD-ROM, DVD-ROM, or other optical media) may be provided. In such cases, each drive may be coupled to bus 403 through one or more data medium interfaces. Memory 402 may include at least one program product having a set (e.g., at least one) of program modules configured to carry out the functions of embodiments of the invention.
A program/utility 408 having a set (at least one) of program modules 407 may be stored in, for example, memory 402, such program modules 407 including, but not limited to, an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment. Program modules 407 generally perform the functions and/or methods of the described embodiments of the invention.
The electronic device 40 may also communicate with one or more external devices 409 (e.g., keyboard, pointing device, display 410, etc.), one or more devices that enable a user to interact with the electronic device 40, and/or any devices (e.g., network card, modem, etc.) that enable the electronic device 40 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 411. Also, electronic device 40 may communicate with one or more networks such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the Internet, through network adapter 412. As shown, network adapter 412 communicates with other modules of electronic device 40 over bus 403. It should be appreciated that although not shown in fig. 6, other hardware and/or software modules may be used in connection with electronic device 40, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
The processing unit 401 executes various functional applications and data processing by running a program stored in the system memory 402, for example, implements the data processing method provided by the embodiment of the present invention.
Embodiments of the present invention also provide a storage medium containing computer-executable instructions which, when executed by a computer processor, perform a data processing method.
The method comprises the following steps:
when a data writing request for writing the read data to be processed into the local table is received, determining a target local table node according to the current use attribute information of each local table node;
and writing the data to be processed into a target local table corresponding to the target local table node, and updating the current use attribute information of each local table node according to the current use attribute information and the original configuration information of each local table node so as to determine the target local table node based on the updated current use attribute information of each local table node when a data writing request is received.
The computer storage media of embodiments of the invention may take the form of any combination of one or more computer-readable media. The computer readable medium may be a computer readable signal medium or a computer readable storage medium. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium would include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In this document, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Computer program code for carrying out operations for embodiments of the present invention may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, Smalltalk, C++ and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
Note that the above is only a preferred embodiment of the present invention and the technical principle applied. It will be understood by those skilled in the art that the present invention is not limited to the particular embodiments described herein, but is capable of various obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the invention. Therefore, while the invention has been described in connection with the above embodiments, the invention is not limited to the embodiments, but may be embodied in many other equivalent forms without departing from the spirit or scope of the invention, which is set forth in the following claims.

Claims (10)

1. A method of data processing, comprising:
when a data writing request for writing the read data to be processed into the local table is received, determining a target local table node according to the current use attribute information of each local table node;
and writing the data to be processed into a target local table corresponding to the target local table node, and updating the current use attribute information of each local table node according to the current use attribute information and the original configuration information of each local table node so as to determine the target local table node based on the updated current use attribute information of each local table node when a data writing request is received.
2. The method as recited in claim 1, further comprising:
and reading data to be processed from each data source based on a preset timing task, and generating the data writing request based on the data to be processed.
3. The method according to claim 1, wherein the current use attribute information includes a current use weight value, and the determining the target local table node according to the current use attribute information of each local table node includes:
and acquiring the current use weight value of each local table node from the attribute cache container, and taking the local table node with the largest current use weight value as the target local table node.
4. The method as recited in claim 1, further comprising:
determining original configuration information of each local table node;
and storing node identifiers of the local table nodes and corresponding original configuration information into a load balancing cache container according to preset data types, so that when the data to be processed is detected to be written into a target local table, the current use attribute information of the corresponding local table nodes is updated according to the original configuration information stored in the load balancing cache container.
5. The method as recited in claim 4, further comprising:
and monitoring original configuration information stored in the load balancing cache container based on a monitoring module, and updating each piece of current use attribute information in the attribute cache container based on the changed original configuration information when the original configuration information is detected to change.
6. The method of claim 4, wherein updating the current usage attribute information of each local table node based on the current usage attribute information of each local table node and the original configuration information, comprises:
updating the current use attribute information of the target local table node according to the current use attribute information and the original configuration information of the target local table node and the total use attribute information of all local table nodes;
and for each local table node except the target local table node, updating the current use attribute information of the current local table node according to the current use attribute information and the original configuration information of the current local table node.
7. The method as recited in claim 1, further comprising:
and storing the updated current use attribute information into an attribute cache container so as to read the current use attribute information of each local table node from the attribute cache container when a data writing request is received.
8. A data processing apparatus, comprising:
the target local table node determining module is used for determining a target local table node according to the current use attribute information of each local table node when receiving a data writing request for writing the read data to be processed into the local table;
and the attribute information updating module is used for writing the data to be processed into a target local table corresponding to the target local table node, and updating the current use attribute information of each local table node according to the current use attribute information and the original configuration information of each local table node so as to determine the target local table node based on the updated node attribute information of each local table node when a data writing request is received.
9. An electronic device, the electronic device comprising:
one or more processors;
storage means for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the data processing method of any of claims 1-7.
10. A storage medium containing computer executable instructions for performing the data processing method of any of claims 1-7 when executed by a computer processor.
CN202210540576.1A 2022-05-17 2022-05-17 Data processing method, device, electronic equipment and storage medium Pending CN117112189A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210540576.1A CN117112189A (en) 2022-05-17 2022-05-17 Data processing method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210540576.1A CN117112189A (en) 2022-05-17 2022-05-17 Data processing method, device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117112189A true CN117112189A (en) 2023-11-24

Family

ID=88811581

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210540576.1A Pending CN117112189A (en) 2022-05-17 2022-05-17 Data processing method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117112189A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination