CN113392146A - Efficient data merging method - Google Patents
- Publication number
- CN113392146A
- Authority
- CN
- China
- Prior art keywords
- data
- rule
- sub
- field
- message
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/252—Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
- G06F16/256—Integrating or interfacing systems involving database management systems in federated or virtual databases
- G06F16/258—Data format conversion from or to a database
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides an efficient data merging method. A data publisher publishes data messages through N data sources, where N is greater than or equal to 2; a data merging unit accesses the N data sources simultaneously and merges the N paths of data obtained from them into one path of data for output; and a client uses the data output by the data merging unit. The data merging unit eliminates the repeated data in the N paths of data by using a data merging rule formulated by the data publisher, thereby merging the N paths of data into one path of data for the client to use. The data publisher can flexibly define the data merging rule as needed, and the client, as the data receiving device, does not need to change. The data merging unit can perform the merging calculation in real time by using the data merging rule, which avoids the data delay and data loss caused by retransmission and source switching.
Description
Technical Field
The invention relates to a data merging method.
Background
For data transmitted in real time, a data provider typically offers two or more data sources for downstream users to access, in order to keep the data stable and to reduce the impact of hardware faults such as network interruptions and machine downtime. A device that accesses the data generally uses one of two approaches to handle dual data sources: the first is client primary/standby; the second is client dual-source access.
As shown in fig. 1, in the client primary/standby mode a client program accesses two data sources, one carrying the primary data and the other carrying the standby data. If the primary data has a problem, the client program actively switches to the standby data. This mode can cause a large amount of data to be lost or delayed within a short time.
As shown in fig. 2, in the client dual-source access mode a client program accesses two data sources simultaneously and stores the data from both, so it can switch quickly when a problem occurs. Although this mode solves the problem of a large amount of data being lost or delayed within a short time, the client program has to store two copies of the data, which wastes resources.
Disclosure of Invention
The technical problem to be solved by the invention is that the existing data switching modes suffer from untimely switching and data waste.
In order to solve the above technical problem, a technical solution of the present invention is to provide an efficient data merging method, which is characterized by comprising the following steps:
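the data publisher publishes data messages through N data sources, N is more than or equal to 2, the data merging unit is simultaneously accessed into the N data sources to merge the N paths of data obtained from the N data sources into one path of data to be output, and the client uses the data output by the data merging unit, wherein the data merging unit merges the N paths of data into one path of data by adopting the following steps:
step 1, the data publisher formulates a data merging rule consisting of M sub-rules, M is more than or equal to 1, and each sub-rule defines a data type field, a calculation order field and a calculation direction field, wherein: the data type field defines the data type of the current sub-rule; the calculation order field defines the position of the current sub-rule among the M sub-rules during rule judgment, and the sub-rules are judged in sequence from the 1st sub-rule to the Mth sub-rule according to this field; the calculation direction field defines the required change trend between two successive values of the data of the current sub-rule's data type;
the data format of the data message issued by the N data sources comprises a message header field head and a message content field data, wherein: if the current data message needs to be merged, the message header field head is not empty, otherwise, the message header field head is empty; the message header field head consists of data of different data types, and the data types used are determined by the data type fields of the M sub-rules; the message content field data is used for storing the data that the data publisher actually needs to publish;
step 2, the data merging unit receives and stores the data merging rule given by the data publisher;
the data merging unit initializes M comparison values corresponding to the M sub-rules one by one according to the values of the data type fields and the values of the calculation order fields of the M sub-rules, the data type of the mth comparison value is the data type determined by the value of the data type field of the mth sub-rule, and m = 1, …, M;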
step 3, after receiving a data message, the data merging unit judges whether a header field head of the data message is empty, if so, the step 10 is executed, and if not, the step 4 is executed;
step 4, judging whether a data merging rule is stored, if so, entering step 5, otherwise, entering step 10;
step 5, setting m to be 1;
step 6, acquiring the mth sub-rule, and analyzing the message header field head of the current data message according to the data type field of the mth sub-rule to acquire an mth judgment value;
step 7, comparing the mth judgment value with the mth comparison value, judging whether the change of the mth comparison value compared with the mth judgment value meets the change trend specified by the calculation direction field of the mth sub-rule, if so, entering step 8, and if not, discarding the current data message and returning to step 3 to continue judging the next data message;
step 8, judging whether m is equal to M, if so, updating the M comparison values by using the obtained M judgment values and then entering step 10, otherwise, further judging whether m is smaller than M, if so, entering step 9, and if m is larger than M, updating the M comparison values by using the obtained M judgment values and then entering step 10;
step 9, updating m to m +1 and returning to step 6;
and step 10, after data are extracted from the message content field data of the current data message, the extracted data are used as data after N paths of data are combined for a client to use.
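The flow of steps 3 to 10 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the names `SubRule`, `Message`, `Merger` and `TRENDS` are hypothetical, the head is assumed to be already parsed into a tuple holding one value per sub-rule, the trend is assumed to be evaluated as the new judgment value relative to the stored comparison value, and the first message with a non-empty head is assumed to pass because the text does not specify initial comparison values.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

# Hypothetical trend checks: (stored comparison value, new judgment value) -> bool.
TRENDS: dict[str, Callable[[Any, Any], bool]] = {
    "increase": lambda old, new: new > old,
    "decrease": lambda old, new: new < old,
    "constant": lambda old, new: new == old,
    "change": lambda old, new: new != old,
}


@dataclass(frozen=True)
class SubRule:
    order: int      # calculation order field
    direction: str  # calculation direction field, a key of TRENDS


@dataclass(frozen=True)
class Message:
    head: Optional[tuple]  # None models an empty head (no merging needed)
    data: Any


class Merger:
    def __init__(self, rules: Optional[list[SubRule]] = None):
        # Step 2: store the rule, sorted by the calculation order field.
        self.rules = sorted(rules or [], key=lambda r: r.order)
        # Comparison values; None means "nothing accepted yet" (assumption).
        self.comparisons: list[Any] = [None] * len(self.rules)

    def process(self, msg: Message) -> Optional[Any]:
        """Return msg.data if the message is output, or None if it is discarded."""
        # Steps 3 and 4: an empty head or a missing rule goes straight to step 10.
        if msg.head is None or not self.rules:
            return msg.data
        judgments = []
        # Steps 5 to 9: evaluate the sub-rules in order, m = 1..M.
        for m, rule in enumerate(self.rules):
            judgment = msg.head[m]  # step 6: the mth judgment value
            previous = self.comparisons[m]
            # Step 7: discard on the first sub-rule whose trend is not met.
            if previous is not None and not TRENDS[rule.direction](previous, judgment):
                return None
            judgments.append(judgment)
        # Step 8: all M sub-rules passed, so update the M comparison values.
        self.comparisons = judgments
        # Step 10: hand the message content to the client.
        return msg.data
```

Because `None` doubles as the "discarded" marker here, a real implementation that allows empty payloads would need a separate sentinel; the sketch keeps the return type simple on purpose.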
Preferably, the change trend is one of: increasing, decreasing, unchanged, or changed.
Preferably, M different data types are defined by the data type field of the M sub-rules.
The data merging unit in the high-efficiency data merging method provided by the invention eliminates the repeated data in the N paths of data by using the data merging rule formulated by the data publisher, so that the N paths of data are merged into one path of data for the client to use. The data publisher can flexibly define the data merging rules according to needs, and the client as the data receiving device does not need to change. The data merging unit can perform merging calculation in real time by using the data merging rule, and avoids data delay and data loss caused by retransmission and source switching.
Compared with the client primary/standby mode, the invention accesses multiple data sources simultaneously and switches quickly when one or more of the data sources has a problem, which solves the problem of untimely switching. Compared with the client dual-source access mode, the invention removes the repeated data from the multiple paths of data before providing them to the client, which solves the problem of data waste.
Drawings
FIG. 1 is a schematic diagram of the client primary/standby mode;
FIG. 2 is a schematic diagram of a client dual-source access method;
FIG. 3 is a schematic diagram of the system of the present invention;
FIG. 4 is a flow chart of the data merging process;
FIG. 5 shows sub-rules in an embodiment.
Detailed Description
The invention will be further illustrated with reference to the following specific examples. It should be understood that these examples are for illustrative purposes only and are not intended to limit the scope of the present invention. Further, it should be understood that various changes or modifications of the present invention may be made by those skilled in the art after reading the teaching of the present invention, and such equivalents may fall within the scope of the present invention as defined in the appended claims.
The invention addresses the problems of untimely switching and data waste in the existing client primary/standby mode and client dual-source access mode. It accesses multiple sources simultaneously on the client side, performs the merging calculation in real time, and stores a single copy of the data, which preserves the integrity of the data while reducing the extra data delay caused by abnormal conditions such as network jitter.
In this embodiment, the method provided by the present invention is further described by using the client dual-source simultaneous access shown in fig. 3, and specifically includes the following steps in combination with fig. 4:
the data publisher publishes the data message through the data source 1 and the data source 2. The data merging unit is simultaneously accessed to the data source 1 and the data source 2, and merges two paths of data obtained from the two paths of data sources into one path of data to be output. The client uses the path data output by the data merging unit.
In actual production, the two paths of data provided by data source 1 and data source 2 have different characteristics, so they cannot be guaranteed to be completely consistent at every transmission; the data merging unit therefore has to solve the problem of how to merge the data. Because the data publisher understands how the data should be merged better than the data receiver does, the invention requires the data publisher to define the merging rule, and the data merging unit merges and stores the two paths of data according to that rule. The specific steps are as follows:
In this embodiment, the data merging rule includes the three sub-rules shown in fig. 5: sub-rule R1, sub-rule R2, and sub-rule R3. For sub-rule R1, the data type field is a 32-bit integer, the calculation order field is 1, and the calculation direction field is increasing; for sub-rule R2, the data type field is a character string, the calculation order field is 2, and the calculation direction field is unchanged; for sub-rule R3, the data type field is a 32-bit floating point number, the calculation order field is 3, and the calculation direction field is decreasing. When the data merging rule is evaluated, the sub-rules are judged in the order given by the calculation order field: sub-rule R1, then sub-rule R2, then sub-rule R3.
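One way to represent the three sub-rules of this embodiment is shown below. This is a sketch only: the class names, enum members, and the `evaluation_order` helper are hypothetical and chosen for illustration; the text itself specifies only the three fields and their values for R1, R2, and R3.

```python
from dataclasses import dataclass
from enum import Enum


class DataType(Enum):
    INT32 = "32-bit integer"
    STRING = "character string"
    FLOAT32 = "32-bit floating point number"


class Direction(Enum):
    INCREASE = "increasing"
    DECREASE = "decreasing"
    CONSTANT = "unchanged"
    CHANGE = "changed"


@dataclass(frozen=True)
class SubRule:
    name: str
    data_type: DataType   # data type field
    order: int            # calculation order field
    direction: Direction  # calculation direction field


# Declared out of order on purpose, to show that the calculation order
# field (not the declaration order) drives the evaluation sequence.
MERGE_RULE = [
    SubRule("R3", DataType.FLOAT32, 3, Direction.DECREASE),
    SubRule("R1", DataType.INT32, 1, Direction.INCREASE),
    SubRule("R2", DataType.STRING, 2, Direction.CONSTANT),
]


def evaluation_order(rule: list[SubRule]) -> list[SubRule]:
    """Return the sub-rules sorted by their calculation order field."""
    return sorted(rule, key=lambda r: r.order)
```

Calling `evaluation_order(MERGE_RULE)` yields the sub-rules in the sequence R1, R2, R3 described above.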
The data formats of the data messages issued by the data source 1 and the data source 2 are shown in the following table:
head | data |
The data format of the data message comprises a message header field head and a message content field data, wherein: if the current data message needs to be merged, the header field head is not empty; otherwise, the header field head is empty. The header field head is composed of data of different data types, and the data types used are determined by the data type fields of the M sub-rules. The message content field data stores the data that the data publisher actually needs to publish.
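The text does not fix a byte layout for the head, so the following is only one possible encoding, written with Python's standard `struct` module. Everything about the wire format is an assumption made for illustration: network byte order, a 32-bit integer followed by an 8-byte NUL-padded ASCII string and a 32-bit float (matching the types of R1, R2, and R3), and a one-byte length prefix that lets the receiver detect an empty head. The function names are hypothetical.

```python
import struct

# Assumed head layout: int32, 8-byte string, float32, network byte order.
HEAD = struct.Struct("!i8sf")


def encode_message(head, data: bytes) -> bytes:
    """head is None (no merging needed) or an (int, str, float) tuple."""
    if head is None:
        head_bytes = b""
    else:
        seq, tag, value = head
        # The "8s" format pads shorter strings with NUL bytes.
        head_bytes = HEAD.pack(seq, tag.encode("ascii"), value)
    # One-byte length prefix: 0 means the head is empty.
    return bytes([len(head_bytes)]) + head_bytes + data


def decode_message(raw: bytes):
    """Return (head, data), where head is None if the head field is empty."""
    head_len = raw[0]
    body = raw[1 + head_len:]
    if head_len == 0:
        return None, body
    seq, tag, value = HEAD.unpack(raw[1:1 + head_len])
    return (seq, tag.rstrip(b"\x00").decode("ascii"), value), body
```

Parsing the head field by field in this way corresponds to reading each judgment value according to the data type field of its sub-rule.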
In this embodiment, the data merging unit initializes three comparison values in sequence according to the values of the data type fields of sub-rule R1, sub-rule R2, and sub-rule R3: the first comparison value is a 32-bit integer, the second is a character string, and the third is a 32-bit floating point number.
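The initialization of the comparison values might look like the sketch below. The text states the type of each comparison value but not its initial content, so the `None` placeholder (meaning "no message accepted yet") is an assumption, as are the type-name strings, the `PY_TYPES` mapping, and the `init_comparisons` helper.

```python
# Hypothetical mapping from data type field values to Python types.
PY_TYPES = {"int32": int, "string": str, "float32": float}


def init_comparisons(rules):
    """Create one typed comparison slot per sub-rule.

    rules: list of (data_type_name, calculation_order) pairs.
    Slots are returned in calculation order; each starts with value None,
    an assumed marker for "nothing to compare against yet".
    """
    ordered = sorted(rules, key=lambda r: r[1])
    return [{"type": PY_TYPES[type_name], "value": None} for type_name, _ in ordered]


# The three sub-rules of this embodiment, listed in arbitrary order.
slots = init_comparisons([("float32", 3), ("int32", 1), ("string", 2)])
```

A typed slot with an explicit "unset" marker avoids choosing a default such as 0 or 0.0, which could wrongly reject the first message under an increasing or decreasing rule.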
Step 3, after the data merging unit receives a data message, judging whether a message header field head of the data message is empty, if so, entering step 9, and if not, entering step 4;
step 4, judging whether a data merging rule is stored, if so, entering step 5, otherwise, entering step 9;
step 5, parsing the message header field head of the current data message according to the data type field of sub-rule R1 to obtain the 1st judgment value; comparing the 1st judgment value with the 1st comparison value and judging whether the change between them matches the increase specified by the calculation direction field of sub-rule R1; if so, entering step 6; if not, discarding the current data message and returning to step 3 to judge the next data message;
step 6, parsing the message header field head of the current data message according to the data type field of sub-rule R2 to obtain the 2nd judgment value; comparing the 2nd judgment value with the 2nd comparison value and judging whether the two are unchanged, as specified by the calculation direction field of sub-rule R2; if so, entering step 7; if not, discarding the current data message and returning to step 3 to judge the next data message;
step 7, parsing the message header field head of the current data message according to the data type field of sub-rule R3 to obtain the 3rd judgment value; comparing the 3rd judgment value with the 3rd comparison value and judging whether the change between them matches the decrease specified by the calculation direction field of sub-rule R3; if so, entering step 8; if not, discarding the current data message and returning to step 3 to judge the next data message;
step 8, updating the three comparison values by using the obtained three judgment values, and then entering step 9;
step 9, after data is extracted from the message content field data of the current data message, the extracted data is used as the data obtained by merging the two paths of data for the client to use, and the client can use the data immediately or store the data for subsequent use.
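The embodiment can be traced on a small interleaved stream, as sketched below. The sample messages, the `ARRIVALS` list, and the `merge` generator are hypothetical; the sketch assumes that each trend is evaluated as the new judgment value relative to the stored comparison value and that the first message with a non-empty head is accepted, since the text does not specify initial comparison values.

```python
import operator

# Checks for R1 (increasing), R2 (unchanged), R3 (decreasing), applied as
# check(new judgment value, stored comparison value) in calculation order.
RULES = [operator.gt, operator.eq, operator.lt]

# Hypothetical arrivals: both sources publish the same three messages, and
# their copies arrive interleaved. Each entry is (head, data); a head of
# None models an empty header field.
ARRIVALS = [
    ((1, "A", 9.5), "tick-1"),  # source 1
    ((1, "A", 9.5), "tick-1"),  # source 2, duplicate
    ((2, "A", 9.0), "tick-2"),  # source 2
    (None, "heartbeat"),        # empty head, passed through unmerged
    ((2, "A", 9.0), "tick-2"),  # source 1, duplicate
    ((3, "A", 8.5), "tick-3"),  # source 1
    ((3, "A", 8.5), "tick-3"),  # source 2, duplicate
]


def merge(stream):
    """Yield the data of every message that survives steps 3 to 9."""
    comparisons = None  # no message accepted yet (assumption)
    for head, data in stream:
        if head is not None and comparisons is not None:
            # Steps 5 to 7: all() stops at the first failing sub-rule.
            passed = all(check(new, old)
                         for check, new, old in zip(RULES, head, comparisons))
            if not passed:
                continue  # discard and wait for the next message
        if head is not None:
            comparisons = head  # step 8: update the three comparison values
        yield data  # step 9: output for the client


merged = list(merge(ARRIVALS))
```

Here `merged` is `["tick-1", "tick-2", "heartbeat", "tick-3"]`: each duplicate fails the R1 check because its integer does not increase, so the client receives a single copy of every message regardless of which source delivered it first.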
Claims (3)
1. An efficient data merging method, comprising the steps of:
the data publisher publishes data information through N data sources, N is more than or equal to 2, the data merging unit is simultaneously accessed into the N data sources to merge N data obtained from the N data sources into one data to be output, and the client uses the data output by the data merging unit, wherein the data merging unit merges the N data into one data by adopting the following steps:
step 1, the data publisher formulates a data merging rule consisting of M sub-rules, M is more than or equal to 1, and each sub-rule defines a data type field, a calculation order field and a calculation direction field, wherein: the data type field defines the data type of the current sub-rule; the calculation order field defines the position of the current sub-rule among the M sub-rules during rule judgment, and the sub-rules are judged in sequence from the 1st sub-rule to the Mth sub-rule according to this field; the calculation direction field defines the required change trend between two successive values of the data of the current sub-rule's data type;
the data format of the data message issued by the N-path data source comprises a message header field head and a message content field data, wherein: if the current data message needs to be merged, the head of the message header field is not empty, otherwise, the head of the message header field is empty, the data of the head of the message header field consists of data of different data types, and the adopted data type is determined according to the data type field of the M sub-rules; the message content field data is used for storing data which are actually required to be published by a data publisher;
step 2, the data merging unit receives and stores the data merging rules given by the data publisher;
the data merging unit initializes M comparison values corresponding to the M sub-rules one by one according to the values of the data type fields and the values of the calculation order fields of the M sub-rules, the data type of the mth comparison value is the data type determined by the value of the data type field of the mth sub-rule, and m = 1, …, M;
step 3, after receiving a data message, the data merging unit judges whether a header field head of the data message is empty, if so, the step 10 is executed, and if not, the step 4 is executed;
step 4, judging whether a data merging rule is stored, if so, entering step 5, otherwise, entering step 10;
step 5, setting m to be 1;
step 6, acquiring the mth sub-rule, and analyzing the message header field head of the current data message according to the data type field of the mth sub-rule to acquire an mth judgment value;
step 7, comparing the mth judgment value with the mth comparison value, judging whether the change of the mth comparison value compared with the mth judgment value meets the change trend specified by the calculation direction field of the mth sub-rule, if so, entering step 8, and if not, discarding the current data message and returning to step 3 to continue judging the next data message;
step 8, judging whether m is equal to M, if so, updating the M comparison values by using the obtained M judgment values and then entering step 10, otherwise, further judging whether m is smaller than M, if so, entering step 9, and if m is larger than M, updating the M comparison values by using the obtained M judgment values and then entering step 10;
step 9, updating m to m +1 and returning to step 6;
and step 10, after data are extracted from the message content field data of the current data message, the extracted data are used as data after N paths of data are combined for a client to use.
2. An efficient data merging method as claimed in claim 1, wherein the change trend is one of: increasing, decreasing, unchanged, or changed.
3. An efficient data merging method as in claim 1 wherein M different data types are defined by the data type field of the M sub-rules.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110471732.9A CN113392146B (en) | 2021-04-29 | 2021-04-29 | Efficient data merging method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113392146A (en) | 2021-09-14 |
CN113392146B (en) | 2024-02-23 |
Family
ID=77617789
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110471732.9A (Active), CN113392146B (en) | Efficient data merging method | 2021-04-29 | 2021-04-29 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113392146B (en) |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2003024036A1 (en) * | 2001-09-12 | 2003-03-20 | Skystream Networks Inc. | Method and system for scheduled streaming of best effort data |
CN103685207A (en) * | 2012-09-21 | 2014-03-26 | 百度在线网络技术(北京)有限公司 | System, apparatus, and method for integrating data spanning data sources |
US20170244799A1 (en) * | 2016-02-24 | 2017-08-24 | Verisign, Inc. | Feeding networks of message brokers with compound data elaborated by dynamic sources |
CN107689999A (en) * | 2017-09-14 | 2018-02-13 | 北纬通信科技南京有限责任公司 | A kind of full-automatic computational methods of cloud platform and device |
CN108769141A (en) * | 2018-05-09 | 2018-11-06 | 深圳市深弈科技有限公司 | A kind of method of multi-source real-time deal market data receiver and merger processing |
Non-Patent Citations (2)
Title |
---|
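Zhang Ping; Qian Peiran; Qi Lixue; Yang Shuxun: "Optimized design method for a multi-route backup data hot standby system", Measurement & Control Technology (测控技术), no. 02, 18 February 2010 (2010-02-18) * |
Dong Mingrui; Shen Limin; Zhao Guangjian: "Research on a user-oriented data integration model", Microcomputer Information (微计算机信息), no. 21, 25 July 2010 (2010-07-25) * |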
Also Published As
Publication number | Publication date |
---|---|
CN113392146B (en) | 2024-02-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11995402B2 (en) | Calculating structural differences from binary differences in publish subscribe system | |
US7826451B2 (en) | Method of stateless group communication and repair of data packets transmission to nodes in a distribution tree | |
US10795744B2 (en) | Identifying failed customer experience in distributed computer systems | |
CN111897878B (en) | Master-slave data synchronization method and system | |
CN110928851A (en) | Method, device and equipment for processing log information and storage medium | |
CN111526188B (en) | System and method for ensuring zero data loss based on Spark Streaming in combination with Kafka | |
US20160285969A1 (en) | Ordered execution of tasks | |
CN114090288A (en) | Data pushing method and device | |
US20070130219A1 (en) | Traversing runtime spanning trees | |
CN113392146B (en) | Efficient data merging method | |
CN111625467B (en) | Automatic testing method and device, computer equipment and storage medium | |
CN112543145A (en) | Method and device for selecting communication path of equipment node for sending data | |
CN111931105A (en) | Kafka consumption appointed push time data processing method | |
CN110336706B (en) | Network message transmission processing method and device | |
US10812355B2 (en) | Record compression for a message system | |
CN112600753B (en) | Equipment node communication path selection method and device according to equipment access amount | |
CN115210694A (en) | Data transmission method and device | |
CN111970340A (en) | Information transmission method, readable storage medium and electronic device | |
CN117596126B (en) | Monitoring method for high-speed network abnormality in high-performance cluster | |
Wu et al. | SUNVE: Distributed Message Middleware towards Heterogeneous Database Synchronization | |
CN109347678B (en) | Method and device for determining routing loop | |
US20050071497A1 (en) | Method of establishing transmission headers for stateless group communication | |
KR20230169743A (en) | Method and apparatus for data communication in federated learning | |
CN116402616A (en) | Time slice based multi-source multi-shot snapshot estrus optimization method, medium and device | |
CN116723142A (en) | Real-time rerouting method, device, equipment, storage medium and product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||