CN113392146A - Efficient data merging method

Info

Publication number: CN113392146A
Application number: CN202110471732.9A
Authority: CN
Inventors: 金剑; 张锐; 唐龙涛
Original assignee: Shanghai Wandehonghui Information Technology Co ltd
Current assignee: Shanghai Wandehonghui Information Technology Co ltd
Priority date: 2021-04-29
Filing date: 2021-04-29
Publication date: 2021-09-14
Anticipated expiration: 2041-04-29
Also published as: CN113392146B

Abstract

The invention provides an efficient data merging method. A data publisher publishes data messages through N data sources, where N is greater than or equal to 2; a data merging unit connects to the N data sources simultaneously and merges the N paths of data obtained from them into one path of data for output; and a client uses the data output by the data merging unit. The data merging unit eliminates the repeated data in the N paths of data by using a data merging rule formulated by the data publisher, thereby merging the N paths of data into one path of data for the client to use. The data publisher can flexibly define the data merging rule as needed, and the client, as the data receiving device, does not need to change. The data merging unit performs the merging calculation in real time using the data merging rule, which avoids the data delay and data loss caused by retransmission and source switching.

Description

Efficient data merging method

Technical Field

The invention relates to a data merging method.

Background

For data transmitted in real time, a data provider typically offers two or more data sources for downstream users to access, in order to ensure the stability of the data and to reduce the impact of hardware faults such as network interruptions and machine downtime. A device that accesses the data generally uses one of the following two modes to access dual data sources: the first is the client primary/standby mode; the second is the client dual-source access mode.

As shown in fig. 1, in the client primary/standby mode a client program accesses two data sources, one carrying the primary path data and the other carrying the standby data. If a problem occurs with the primary path data, the client program actively switches to the standby data. This mode can cause a large amount of data to be lost or delayed within a short time.

As shown in fig. 2, in the client dual-source access mode one client program accesses two data sources simultaneously and stores the data from both, so that it can switch quickly when a problem occurs. Although this mode solves the problem of a large amount of data being lost or delayed within a short time, the client program needs to store two copies of the data, which wastes resources.

Disclosure of Invention

The technical problem to be solved by the invention is that the existing data switching modes suffer from untimely switching and data waste.

In order to solve the above technical problem, a technical solution of the present invention is to provide an efficient data merging method, which is characterized by comprising the following steps:

the data publisher publishes data messages through N data sources, where N is greater than or equal to 2; the data merging unit connects to the N data sources simultaneously and merges the N paths of data obtained from them into one path of data for output; and the client uses the data output by the data merging unit, wherein the data merging unit merges the N paths of data into one path of data by the following steps:

step 1, a data merging rule is formulated by the data publisher, the data merging rule consists of M sub-rules, M is greater than or equal to 1, and each sub-rule defines a data type field, a calculation order field and a calculation direction field, wherein: the data type field defines the data type of the current sub-rule; the calculation order field defines the position of the current sub-rule among all M sub-rules during rule evaluation, and the sub-rules are evaluated in sequence from the 1st sub-rule to the Mth sub-rule according to the calculation order field; and the calculation direction field defines the required trend of change between two successive values of the data of the current sub-rule's data type;

the data format of the data messages published by the N data sources comprises a message header field head and a message content field data, wherein: if the current data message needs to be merged, the message header field head is not empty, otherwise the message header field head is empty; the data in the message header field head consists of data of different data types, and the data types used are determined by the data type fields of the M sub-rules; the message content field data stores the data that the data publisher actually needs to publish;

step 2, the data merging unit receives and stores the data merging rules given by the data publisher;

the data merging unit initializes M comparison values corresponding one-to-one to the M sub-rules according to the values of the data type fields and the calculation order fields of the M sub-rules, the data type of the mth comparison value being the data type determined by the value of the data type field of the mth sub-rule, where m = 1, …, M;

step 3, after receiving a data message, the data merging unit judges whether a header field head of the data message is empty, if so, the step 10 is executed, and if not, the step 4 is executed;

step 4, judging whether a data merging rule is stored, if so, entering step 5, otherwise, entering step 10;

step 5, setting m to be 1;

step 6, acquiring the mth sub-rule, and analyzing the message header field head of the current data message according to the data type field of the mth sub-rule to acquire an mth judgment value;

step 7, comparing the mth judgment value with the mth comparison value and judging whether the change from the mth comparison value to the mth judgment value conforms to the trend of change specified by the calculation direction field of the mth sub-rule; if so, entering step 8, and if not, discarding the current data message and returning to step 3 to judge the next data message;

step 8, judging whether m is equal to M; if so, updating the M comparison values with the M judgment values obtained and then entering step 10; otherwise, further judging whether m is smaller than M; if so, entering step 9, and if m is greater than M, updating the M comparison values with the M judgment values obtained and then entering step 10;

step 9, updating m to m +1 and returning to step 6;

and step 10, extracting the data from the message content field data of the current data message, the extracted data serving as the merged data of the N paths of data for the client to use.
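Steps 2 to 10 above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the names DataMergingUnit, SubRule and TRENDS, the use of Python types as the data type field, and the caller-supplied initial comparison values are all hypothetical, and the head is assumed to arrive already split into one raw value per sub-rule.

```python
import operator
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional, Sequence

# Assumed mapping from the calculation direction field to a test on
# (new judgment value, stored comparison value).
TRENDS: Dict[str, Callable[[Any, Any], bool]] = {
    "increase": operator.gt,
    "decrease": operator.lt,
    "unchanged": operator.eq,
    "changed": operator.ne,
}


@dataclass(frozen=True)
class SubRule:
    data_type: type   # data type field (a Python type in this sketch)
    order: int        # calculation order field, 1..M
    direction: str    # calculation direction field, a key of TRENDS


class DataMergingUnit:
    def __init__(self, rules: Sequence[SubRule], initial: Sequence[Any]):
        # Step 2: store the rule and one comparison value per sub-rule.
        # `initial` is given in calculation order (an assumption; the
        # text does not say which starting values are used).
        self.rules = sorted(rules, key=lambda r: r.order)
        self.comparison: List[Any] = list(initial)
        if len(self.comparison) != len(self.rules):
            raise ValueError("need one initial comparison value per sub-rule")

    def accept(self, head: Optional[Sequence[Any]]) -> bool:
        """Return True if the message is forwarded, False if discarded."""
        # Steps 3 and 4: an empty head or a missing rule skips merging.
        if not head or not self.rules:
            return True
        if len(head) != len(self.rules):
            raise ValueError("head must carry one value per sub-rule")
        judgment: List[Any] = []
        # Steps 5 to 9: evaluate the sub-rules in calculation order.
        for rule, raw, stored in zip(self.rules, head, self.comparison):
            value = rule.data_type(raw)                      # step 6
            if not TRENDS[rule.direction](value, stored):    # step 7
                return False                                 # discard
            judgment.append(value)
        # Step 8: every sub-rule passed, so update the comparison values.
        self.comparison = judgment
        return True                                          # go to step 10


unit = DataMergingUnit([SubRule(int, 1, "increase")], initial=[0])
arrivals = [((1,), "a"), ((1,), "a"), ((2,), "b"),
            (None, "heartbeat"), ((2,), "b"), ((3,), "c")]
merged = [data for head, data in arrivals if unit.accept(head)]
print(merged)  # ['a', 'b', 'heartbeat', 'c']
```

With a single increasing sequence-number rule, the duplicate that arrives from the second source is dropped, while a message with an empty head passes straight through. The comparison values are only updated after all M sub-rules have passed, so a message rejected by a later sub-rule leaves the stored state untouched.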

Preferably, the trend of change includes increase, decrease, unchanged, and changed.

Preferably, M different data types are defined by the data type field of the M sub-rules.

The data merging unit in the high-efficiency data merging method provided by the invention eliminates the repeated data in the N paths of data by using the data merging rule formulated by the data publisher, so that the N paths of data are merged into one path of data for the client to use. The data publisher can flexibly define the data merging rules according to needs, and the client as the data receiving device does not need to change. The data merging unit can perform merging calculation in real time by using the data merging rule, and avoids data delay and data loss caused by retransmission and source switching.

Compared with the client primary/standby mode, the invention accesses multiple data sources simultaneously, so the data can still be obtained quickly after a problem occurs with one or more of the data sources, which solves the problem of untimely switching. Compared with the client dual-source access mode, the invention removes the repeated data from the multiple paths of data before providing the data to the client, which solves the problem of data waste.

Drawings

FIG. 1 is a schematic diagram of the client primary/standby mode;

FIG. 2 is a schematic diagram of the client dual-source access mode;

FIG. 3 is a schematic diagram of the system of the present invention;

FIG. 4 is a flow chart of the data merging;

FIG. 5 shows the sub-rules in the embodiment.

Detailed Description

The invention will be further illustrated with reference to the following specific examples. It should be understood that these examples are for illustrative purposes only and are not intended to limit the scope of the present invention. Further, it should be understood that various changes or modifications of the present invention may be made by those skilled in the art after reading the teaching of the present invention, and such equivalents may fall within the scope of the present invention as defined in the appended claims.

The invention addresses the problems of untimely switching and data waste in the existing client primary/standby mode and client dual-source access mode. It accesses multiple sources simultaneously on the client side, performs the merging calculation in real time, and stores a single copy of the data, thereby ensuring the integrity of the data while reducing the extra data delay caused by abnormal conditions such as network jitter.

In this embodiment, the method provided by the invention is further described using the dual-source simultaneous access shown in fig. 3 and, in combination with fig. 4, is as follows:

The data publisher publishes data messages through data source 1 and data source 2. The data merging unit connects to data source 1 and data source 2 simultaneously and merges the two paths of data obtained from them into one path of data for output. The client uses the one path of data output by the data merging unit.

In actual production, the two paths of data provided by data source 1 and data source 2 have different characteristics, so the two paths cannot be guaranteed to be completely consistent at every transmission; the data merging unit therefore has to solve the problem of how to merge the data. The data publisher understands how the data should be merged better than the data receiver does, so in the invention the data publisher defines the merging rule, and the data merging unit merges and stores the two paths of data according to that rule, specifically as follows:

Step 1, a data merging rule is formulated by the data publisher, the data merging rule consists of M sub-rules, M is greater than or equal to 1, and each sub-rule defines a data type field, a calculation order field and a calculation direction field, wherein: the data type field defines the data type of the current sub-rule; the calculation order field defines the position of the current sub-rule among all M sub-rules during rule evaluation, and the sub-rules are evaluated in sequence from the 1st sub-rule to the Mth sub-rule according to the calculation order field; and the calculation direction field defines the required trend of change between two successive values of the data of the current sub-rule's data type, the trend of change including increase, decrease, unchanged, and changed.

In this embodiment, the data merging rule includes the three sub-rules shown in fig. 5: sub-rule R1, sub-rule R2, and sub-rule R3. For sub-rule R1, the data type field is 32-bit integer, the calculation order field is 1, and the calculation direction field is increase; for sub-rule R2, the data type field is character string, the calculation order field is 2, and the calculation direction field is unchanged; for sub-rule R3, the data type field is 32-bit floating point number, the calculation order field is 3, and the calculation direction field is decrease. When the data merging rule is evaluated, the sub-rules are judged in the order given by the calculation order field: sub-rule R1, then sub-rule R2, then sub-rule R3.
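The three sub-rules of this embodiment can be written down as plain data, as in the sketch below. The SubRule class, the string labels used for the field values, and the check that the calculation order fields run from 1 to M without gaps are assumptions made for illustration; the text only states that evaluation follows the calculation order field.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SubRule:
    name: str
    data_type: str   # data type field
    order: int       # calculation order field
    direction: str   # calculation direction field


# Sub-rules R1 to R3 of the embodiment, listed out of order on purpose.
RULES = [
    SubRule("R3", "float32", 3, "decrease"),
    SubRule("R1", "int32", 1, "increase"),
    SubRule("R2", "string", 2, "unchanged"),
]


def evaluation_order(rules: List[SubRule]) -> List[SubRule]:
    """Sort sub-rules by calculation order field.

    Rejecting gaps or duplicates in the order fields is an assumed
    sanity check, not something the text requires.
    """
    ordered = sorted(rules, key=lambda r: r.order)
    if [r.order for r in ordered] != list(range(1, len(ordered) + 1)):
        raise ValueError("calculation order fields must be 1..M without gaps")
    return ordered


print([r.name for r in evaluation_order(RULES)])  # ['R1', 'R2', 'R3']
```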

The data format of the data messages published by data source 1 and data source 2 is shown in the following table:

head | data

The data format of the data message comprises a message header field head and a message content field data, wherein: if the current data message needs to be merged, the message header field head is not empty; otherwise, the message header field head is empty. The data in the message header field head consists of data of different data types, and the data types used are determined by the data type fields of the M sub-rules. The message content field data stores the data that the data publisher actually needs to publish.
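How the head is parsed according to the data type fields can be illustrated with Python's struct module. The byte layout below is purely an assumption, since the text does not specify an encoding: values are laid out in calculation order, little-endian, with a 32-bit integer as 4 bytes, a 32-bit floating point number as 4 bytes, and a string as a 1-byte length followed by UTF-8 bytes. The function names pack_head and parse_head are hypothetical.

```python
import struct
from typing import List, Sequence, Union

Value = Union[int, str, float]


def pack_head(values: Sequence[Value], types: Sequence[str]) -> bytes:
    """Publisher side: encode one value per sub-rule into the head."""
    out = bytearray()
    for value, type_name in zip(values, types):
        if type_name == "int32":
            out += struct.pack("<i", value)
        elif type_name == "float32":
            out += struct.pack("<f", value)
        elif type_name == "string":
            raw = value.encode("utf-8")
            out += struct.pack("<B", len(raw)) + raw
        else:
            raise ValueError(f"unknown data type: {type_name}")
    return bytes(out)


def parse_head(head: bytes, types: Sequence[str]) -> List[Value]:
    """Merging-unit side: decode the judgment values from a non-empty head."""
    values: List[Value] = []
    pos = 0
    for type_name in types:
        if type_name == "int32":
            (value,) = struct.unpack_from("<i", head, pos)
            pos += 4
        elif type_name == "float32":
            (value,) = struct.unpack_from("<f", head, pos)
            pos += 4
        elif type_name == "string":
            (length,) = struct.unpack_from("<B", head, pos)
            pos += 1
            value = head[pos:pos + length].decode("utf-8")
            pos += length
        else:
            raise ValueError(f"unknown data type: {type_name}")
        values.append(value)
    return values


TYPES = ["int32", "string", "float32"]      # data type fields of R1, R2, R3
head = pack_head([7, "SH", 9.5], TYPES)
print(parse_head(head, TYPES))  # [7, 'SH', 9.5]
```

Because both sides derive the layout from the same data type fields, the publisher can change the merging rule without any change to the client, which only ever sees the data field.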

Step 2, the data merging unit receives and stores the data merging rule given by the data publisher. In this embodiment, the data merging unit initializes three comparison values in sequence according to the values of the data type fields of sub-rule R1, sub-rule R2, and sub-rule R3, where the first comparison value is a 32-bit integer, the second comparison value is a character string, and the third comparison value is a 32-bit floating point number.

Step 3, after the data merging unit receives a data message, judging whether a message header field head of the data message is empty, if so, entering step 9, and if not, entering step 4;

step 4, judging whether a data merging rule is stored, if so, entering step 5, otherwise, entering step 9;

step 5, parsing the message header field head of the current data message according to the data type field of sub-rule R1 to obtain the 1st judgment value; comparing the 1st judgment value with the 1st comparison value and judging whether the change from the 1st comparison value to the 1st judgment value conforms to the increase specified by the calculation direction field of sub-rule R1; if so, entering step 6, and if not, discarding the current data message and returning to step 3 to judge the next data message;

step 6, parsing the message header field head of the current data message according to the data type field of sub-rule R2 to obtain the 2nd judgment value; comparing the 2nd judgment value with the 2nd comparison value and judging whether the 2nd judgment value is unchanged relative to the 2nd comparison value, as specified by the calculation direction field of sub-rule R2; if so, entering step 7, and if not, discarding the current data message and returning to step 3 to judge the next data message;

step 7, parsing the message header field head of the current data message according to the data type field of sub-rule R3 to obtain the 3rd judgment value; comparing the 3rd judgment value with the 3rd comparison value and judging whether the change from the 3rd comparison value to the 3rd judgment value conforms to the decrease specified by the calculation direction field of sub-rule R3; if so, entering step 8, and if not, discarding the current data message and returning to step 3 to judge the next data message;

step 8, updating the three comparison values by using the obtained three judgment values, and then entering step 9;

and step 9, extracting the data from the message content field data of the current data message, the extracted data serving as the merged data of the two paths of data for the client to use; the client can use the data immediately or store it for subsequent use.
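The embodiment's steps 3 to 9 can be traced on a small dual-source example, sketched below. The function name merge_two_paths, the sample messages, the arrival order, and the starting comparison values (0, "A", infinity) are all assumptions chosen for illustration; step 4 is implicit because the three sub-rules are hard-coded, and the head is assumed to be already parsed into its three judgment values.

```python
from typing import Iterable, List, Optional, Tuple

Head = Optional[Tuple[int, str, float]]
Message = Tuple[Head, str]   # (head, data)


def merge_two_paths(arrivals: Iterable[Message],
                    initial: Tuple[int, str, float]) -> List[str]:
    c1, c2, c3 = initial              # the three comparison values (step 2)
    merged: List[str] = []
    for head, data in arrivals:
        if head is None:              # step 3: empty head, go to step 9
            merged.append(data)
            continue
        j1, j2, j3 = head             # the three judgment values
        if not j1 > c1:               # step 5: R1 requires an increase
            continue
        if j2 != c2:                  # step 6: R2 requires no change
            continue
        if not j3 < c3:               # step 7: R3 requires a decrease
            continue
        c1, c2, c3 = j1, j2, j3       # step 8: update comparison values
        merged.append(data)           # step 9: hand the data to the client
    return merged


# Both data sources publish the same three messages.
published = [((1, "A", 9.5), "m1"), ((2, "A", 9.0), "m2"), ((3, "A", 8.5), "m3")]
# Assumed arrival order: source 1 leads at first, then source 2 leads.
arrivals = [published[0], published[0],
            published[1], published[1],
            published[2], published[2]]
result = merge_two_paths(arrivals, initial=(0, "A", float("inf")))
print(result)  # ['m1', 'm2', 'm3']
```

Each message is delivered once, from whichever source delivers it first, and the later copy is discarded at step 5 because its sequence value no longer increases. If one source stopped delivering altogether, the same loop would keep forwarding the other source's messages without any explicit switch.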

Claims

1. An efficient data merging method, comprising the steps of:

step 5, setting m to be 1;

step 9, updating m to m +1 and returning to step 6;

2. An efficient data merging method as claimed in claim 1, wherein the trend of change includes increase, decrease, unchanged, and changed.

3. An efficient data merging method as in claim 1 wherein M different data types are defined by the data type field of the M sub-rules.