CN113392146A - Efficient data merging method - Google Patents

Info

Publication number: CN113392146A
Authority: CN (China)
Prior art keywords: data, rule, sub, field, message
Legal status: Granted, Active (the listed status is an assumption, not a legal conclusion)
Application number: CN202110471732.9A
Other languages: Chinese (zh)
Other versions: CN113392146B (en)
Inventors: 金剑, 张锐, 唐龙涛
Current Assignee: Shanghai Wandehonghui Information Technology Co ltd (the listed assignee may be inaccurate)
Original Assignee: Shanghai Wandehonghui Information Technology Co ltd
Priority date: 2021-04-29 (an assumption, not a legal conclusion)
Filing date: 2021-04-29
Publication date: 2021-09-14
Application filed by: Shanghai Wandehonghui Information Technology Co ltd
Events: priority to CN202110471732.9A; publication of CN113392146A; application granted; publication of CN113392146B

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25: Integrating or interfacing systems involving database management systems
    • G06F16/252: Integrating or interfacing systems involving database management systems between a Database Management System and a front-end application
    • G06F16/256: Integrating or interfacing systems involving database management systems in federated or virtual databases
    • G06F16/258: Data format conversion from or to a database

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides an efficient data merging method. A data publisher publishes data messages through N data sources, where N ≥ 2. A data merging unit connects to all N data sources simultaneously and merges the N data streams obtained from them into a single output stream, which the client consumes. The data merging unit uses a data merging rule formulated by the data publisher to eliminate duplicate data across the N streams, thereby merging them into one stream for the client. The data publisher can define the data merging rule flexibly as needed, and the client, as the data receiving device, requires no changes. Because the data merging unit applies the rule in real time, it avoids the data delay and data loss caused by retransmission and source switching.

Description

Efficient data merging method
Technical Field
The invention relates to a data merging method.
Background
For data transmitted in real time, a data provider typically offers two or more data sources for downstream users to access, in order to keep the data stable and to limit the impact of hardware faults such as network interruptions and machine downtime. A device that accesses the data generally uses one of two approaches to handle dual data sources: the first is client primary/standby; the second is client dual-source access.
As shown in FIG. 1, in the client primary/standby approach a client program accesses two data sources, one carrying the primary data and the other the standby data. If the primary data has a problem, the client program actively switches to the standby data. This approach can cause a large amount of data to be lost or delayed for a short time.
As shown in FIG. 2, in client dual-source access one client program accesses two data sources simultaneously and stores the data from both, so it can switch quickly when a problem occurs. Although this approach avoids the short-term loss or delay of large amounts of data, the client program must store two copies of the data, which wastes resources.
Disclosure of Invention
The technical problem to be solved by the invention is that existing data-source switching approaches suffer from untimely switching and wasteful duplication of data.
To solve the above technical problem, the technical solution of the invention is an efficient data merging method, characterized by comprising the following steps:
A data publisher publishes data messages through N data sources, where N ≥ 2. A data merging unit connects to the N data sources simultaneously and merges the N data streams obtained from them into one output stream, and the client uses the stream output by the data merging unit. The data merging unit merges the N data streams into one by the following steps:
Step 1: the data publisher formulates a data merging rule consisting of M sub-rules, where M ≥ 1. Each sub-rule defines a data type field, a calculation order field, and a calculation direction field, wherein: the data type field defines the data type of the current sub-rule; the calculation order field defines the position of the current sub-rule among the M sub-rules during rule judgment, so that the sub-rules are judged in sequence from the 1st to the M-th according to this field; and the calculation direction field defines the change trend required between two successive values of the current sub-rule's data.
The data message published by the N data sources consists of a message header field head and a message content field data, wherein: head is non-empty if the current data message needs to be merged and empty otherwise; head is composed of values of different data types, which are determined by the data type fields of the M sub-rules; and data stores the data that the data publisher actually needs to publish.
Step 2: the data merging unit receives and stores the data merging rule provided by the data publisher.
According to the data type fields and calculation order fields of the M sub-rules, the data merging unit initializes M comparison values in one-to-one correspondence with the M sub-rules, where the data type of the m-th comparison value is the data type specified by the data type field of the m-th sub-rule, m = 1, …, M.
Step 3: after receiving a data message, the data merging unit judges whether the message header field head of the data message is empty; if so, go to step 10; if not, go to step 4.
Step 4: judge whether a data merging rule has been stored; if so, go to step 5; otherwise, go to step 10.
Step 5: set m = 1.
Step 6: obtain the m-th sub-rule and parse the message header field head of the current data message according to the data type field of the m-th sub-rule to obtain the m-th judgment value.
Step 7: compare the m-th judgment value with the m-th comparison value and judge whether the change from the m-th comparison value to the m-th judgment value matches the change trend specified by the calculation direction field of the m-th sub-rule; if so, go to step 8; if not, discard the current data message and return to step 3 to judge the next data message.
Step 8: judge whether m equals M; if so, update the M comparison values with the M judgment values obtained and go to step 10; otherwise, further judge whether m is smaller than M; if so, go to step 9; if m is larger than M, update the M comparison values with the M judgment values obtained and go to step 10.
Step 9: update m to m + 1 and return to step 6.
Step 10: extract the data from the message content field data of the current data message and provide it to the client as the merged result of the N data streams.
Preferably, the change trend includes increase, decrease, unchanged, and changed.
Preferably, M different data types are defined by the data type fields of the M sub-rules.
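The four change trends can be read as simple comparisons between the stored comparison value and the newly parsed judgment value. The following is a minimal sketch under stated assumptions: the names `Trend` and `trend_ok` are hypothetical, and "increase" and "decrease" are taken to be strict, so that a repeated value (for example a duplicate sequence number arriving from a second source) does not count as an increase. The source does not state whether the comparison is strict.

```python
from enum import Enum


class Trend(Enum):
    """Hypothetical names for the four values of the calculation direction field."""
    INCREASE = "increase"
    DECREASE = "decrease"
    UNCHANGED = "unchanged"
    CHANGED = "changed"


def trend_ok(comparison, judgment, trend):
    """Return True if going from `comparison` to `judgment` matches `trend`."""
    if trend is Trend.INCREASE:
        return judgment > comparison   # assumption: strictly greater
    if trend is Trend.DECREASE:
        return judgment < comparison   # assumption: strictly smaller
    if trend is Trend.UNCHANGED:
        return judgment == comparison
    if trend is Trend.CHANGED:
        return judgment != comparison
    raise ValueError(f"unknown trend: {trend!r}")
```

With a strict "increase" rule on a sequence number, the second copy of a message fails the check and is discarded, which is how duplicates would be removed in this reading.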
In the efficient data merging method provided by the invention, the data merging unit uses the data merging rule formulated by the data publisher to eliminate duplicate data across the N data streams, so that the N streams are merged into one stream for the client. The data publisher can define the data merging rule flexibly as needed, and the client, as the data receiving device, requires no changes. Because the data merging unit applies the rule in real time, it avoids the data delay and data loss caused by retransmission and source switching.
Compared with the client primary/standby approach, the invention accesses multiple data sources simultaneously, so it can respond quickly when one or more data sources develop a problem, which solves the problem of untimely switching. Compared with the client dual-source access approach, the invention removes the duplicate data from the multiple streams before providing the data to the client, which solves the problem of data waste.
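Steps 2 to 10 can be sketched as a small class. This is an illustration under stated assumptions, not the patented implementation: `SubRule`, `Message`, `MergeUnit`, and `TRENDS` are hypothetical names; head is assumed to arrive already split into one value per sub-rule; each comparison value is assumed to start as the default of its type (0, "", 0.0), since the source only says it is initialized with the matching data type; and increase and decrease are treated as strict.

```python
from dataclasses import dataclass
from typing import Any, Optional, Sequence

# Hypothetical direction names mapped to checks on (comparison value, judgment value).
TRENDS = {
    "increase": lambda old, new: new > old,
    "decrease": lambda old, new: new < old,
    "unchanged": lambda old, new: new == old,
    "changed": lambda old, new: new != old,
}


@dataclass(frozen=True)
class SubRule:
    data_type: type  # data type field, e.g. int, str, float
    order: int       # calculation order field, 1..M
    direction: str   # calculation direction field, a key of TRENDS


@dataclass
class Message:
    head: Optional[Sequence[Any]]  # None or empty: no merging needed
    data: Any                      # payload the publisher wants delivered


class MergeUnit:
    def __init__(self):
        self.rules = []
        self.comparison = []

    def set_rules(self, rules):
        """Step 2: store the sub-rules in calculation order and initialise M comparison values."""
        self.rules = sorted(rules, key=lambda r: r.order)
        # Assumption: each comparison value starts as its type's default (0, "", 0.0).
        self.comparison = [r.data_type() for r in self.rules]

    def process(self, msg):
        """Steps 3 to 10: return the payload to forward, or None if the message is discarded."""
        if not msg.head or not self.rules:       # steps 3 and 4
            return msg.data                      # step 10
        judged = []
        for m, rule in enumerate(self.rules):    # steps 5, 8 and 9 (loop over m)
            value = rule.data_type(msg.head[m])  # step 6: m-th judgment value
            if not TRENDS[rule.direction](self.comparison[m], value):  # step 7
                return None                      # discard and wait for the next message
            judged.append(value)
        self.comparison = judged                 # step 8: update all M comparison values
        return msg.data                          # step 10
```

In this sketch the comparison values are updated only after all M sub-rules pass, mirroring step 8, so a message rejected by a later sub-rule leaves the stored state untouched. Returning None to signal a discard is a simplification that would not suit payloads that can themselves be None.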
Drawings
FIG. 1 is a schematic diagram of the client primary/standby approach;
FIG. 2 is a schematic diagram of the client dual-source access approach;
FIG. 3 is a schematic diagram of the system of the present invention;
FIG. 4 is a flow chart of data merging;
FIG. 5 shows sub-rules in an embodiment.
Detailed Description
The invention will be further illustrated with reference to the following specific examples. It should be understood that these examples are for illustrative purposes only and are not intended to limit the scope of the present invention. Further, it should be understood that various changes or modifications of the present invention may be made by those skilled in the art after reading the teaching of the present invention, and such equivalents may fall within the scope of the present invention as defined in the appended claims.
The method addresses the problems of untimely switching and data waste in the existing client primary/standby and client dual-source access approaches. The invention accesses multiple sources simultaneously, merges them in real time, and stores a single copy of the data, thereby preserving data integrity while reducing the extra data delay caused by abnormal conditions such as network jitter.
In this embodiment, the method provided by the invention is further described using the dual-source simultaneous access shown in FIG. 3 and, with reference to FIG. 4, specifically comprises the following:
The data publisher publishes data messages through data source 1 and data source 2. The data merging unit connects to data source 1 and data source 2 simultaneously and merges the two data streams obtained from them into one output stream. The client uses the stream output by the data merging unit.
In actual production, the two data streams provided by data source 1 and data source 2 have different characteristics, so they cannot be guaranteed to be fully consistent at every transmission; the data merging unit therefore has to solve the problem of how to merge the data. Because the data publisher understands how the data should be merged better than the data receiver does, the invention requires the data publisher to define the merging rule, and the data merging unit merges and stores the two data streams according to that rule, specifically as follows:
Step 1: the data publisher formulates a data merging rule consisting of M sub-rules, where M ≥ 1. Each sub-rule defines a data type field, a calculation order field, and a calculation direction field, wherein: the data type field defines the data type of the current sub-rule; the calculation order field defines the position of the current sub-rule among the M sub-rules during rule judgment, so that the sub-rules are judged in sequence from the 1st to the M-th according to this field; and the calculation direction field defines the change trend required between two successive values of the current sub-rule's data, where the change trend includes increase, decrease, unchanged, and changed.
In this embodiment, the data merging rule comprises the three sub-rules shown in FIG. 5: sub-rule R1, sub-rule R2, and sub-rule R3. For sub-rule R1, the data type field is 32-bit integer, the calculation order field is 1, and the calculation direction field is increase. For sub-rule R2, the data type field is string, the calculation order field is 2, and the calculation direction field is unchanged. For sub-rule R3, the data type field is 32-bit floating-point number, the calculation order field is 3, and the calculation direction field is decrease. When the data merging rule is judged, the calculation order fields give the judgment sequence: sub-rule R1, then sub-rule R2, then sub-rule R3.
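The three sub-rules of this embodiment can be written down as plain records and ordered by their calculation order field. This is a small sketch in which the `SubRule` class and the string labels for types and directions are hypothetical; only the field values (R1: 32-bit integer, order 1, increase; R2: string, order 2, unchanged; R3: 32-bit float, order 3, decrease) come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SubRule:
    name: str
    data_type: str  # data type field
    order: int      # calculation order field
    direction: str  # calculation direction field


# The three sub-rules of the embodiment, deliberately listed out of order.
RULES = [
    SubRule("R3", "float32", 3, "decrease"),
    SubRule("R1", "int32", 1, "increase"),
    SubRule("R2", "string", 2, "unchanged"),
]

# Judgment always proceeds in ascending calculation order.
judgment_order = [r.name for r in sorted(RULES, key=lambda r: r.order)]
print(judgment_order)  # ['R1', 'R2', 'R3']
```

Sorting by the calculation order field means the publisher can deliver the sub-rules in any order and the merging unit still evaluates them as R1, R2, R3.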
The data format of the data messages published by data source 1 and data source 2 is shown in the following table:
head | data
The data message consists of a message header field head and a message content field data, wherein: head is non-empty if the current data message needs to be merged and empty otherwise. The content of head is composed of values of different data types, which are determined by the data type fields of the M sub-rules. The message content field data stores the data that the data publisher actually needs to publish.
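Parsing head "according to the data type fields" can be pictured as decoding one value per sub-rule, in calculation order. The source does not specify a wire encoding, so the layout below is purely an assumption for illustration: little-endian 4-byte integers and floats, and strings stored as a 1-byte length followed by UTF-8 bytes. The function name `parse_head` and the type labels are hypothetical as well.

```python
import struct


def parse_head(head, data_types):
    """Decode one value per sub-rule from `head` (bytes), in calculation order."""
    values, offset = [], 0
    for data_type in data_types:
        if data_type == "int32":
            (value,) = struct.unpack_from("<i", head, offset)
            offset += 4
        elif data_type == "float32":
            (value,) = struct.unpack_from("<f", head, offset)
            offset += 4
        elif data_type == "string":
            length = head[offset]  # assumed 1-byte length prefix
            value = head[offset + 1 : offset + 1 + length].decode("utf-8")
            offset += 1 + length
        else:
            raise ValueError(f"unsupported data type: {data_type}")
        values.append(value)
    return values


# A head carrying the values for R1 (int32), R2 (string) and R3 (float32).
head = struct.pack("<i", 7) + bytes([3]) + b"SSE" + struct.pack("<f", 1.5)
print(parse_head(head, ["int32", "string", "float32"]))  # [7, 'SSE', 1.5]
```

An empty head would never reach this function, because such messages bypass rule judgment and are forwarded directly.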
Step 2: the data merging unit receives and stores the data merging rule provided by the data publisher.
In this embodiment, the data merging unit initializes three comparison values in sequence according to the data type fields of sub-rule R1, sub-rule R2, and sub-rule R3: the first comparison value is a 32-bit integer, the second is a string, and the third is a 32-bit floating-point number.
Step 3: after receiving a data message, the data merging unit judges whether the message header field head of the data message is empty; if so, go to step 9; if not, go to step 4.
Step 4: judge whether a data merging rule has been stored; if so, go to step 5; otherwise, go to step 9.
Step 5: parse the message header field head of the current data message according to the data type field of sub-rule R1 to obtain the 1st judgment value; compare the 1st judgment value with the 1st comparison value and judge whether the change from the 1st comparison value to the 1st judgment value matches the increase specified by the calculation direction field of sub-rule R1; if so, go to step 6; if not, discard the current data message and return to step 3 to judge the next data message.
Step 6: parse the message header field head of the current data message according to the data type field of sub-rule R2 to obtain the 2nd judgment value; compare the 2nd judgment value with the 2nd comparison value and judge whether the 2nd judgment value is unchanged relative to the 2nd comparison value, as specified by the calculation direction field of sub-rule R2; if so, go to step 7; if not, discard the current data message and return to step 3 to judge the next data message.
Step 7: parse the message header field head of the current data message according to the data type field of sub-rule R3 to obtain the 3rd judgment value; compare the 3rd judgment value with the 3rd comparison value and judge whether the change from the 3rd comparison value to the 3rd judgment value matches the decrease specified by the calculation direction field of sub-rule R3; if so, go to step 8; if not, discard the current data message and return to step 3 to judge the next data message.
Step 8: update the three comparison values with the three judgment values obtained, then go to step 9.
Step 9: extract the data from the message content field data of the current data message and provide it to the client as the merged result of the two data streams; the client can use the data immediately or store it for later use.
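The embodiment's steps 3 to 9 can be traced on a small dual-source scenario. This is a sketch under stated assumptions: the function `merge`, the payload names, and the head values are made up for illustration; head is assumed to be already decoded into a tuple; increase and decrease are treated as strict; and the starting comparison values (0, "SH", 100.0) are illustrative seeds, because the source only says the comparison values are initialized with matching types. Python's float is 64-bit, which stands in for the 32-bit float of R3, and the check for a stored rule (step 4) is omitted because the rule is fixed here.

```python
import operator

# (data type, comparison) per sub-rule, already in calculation order R1, R2, R3.
RULES = [
    (int, operator.gt),    # R1: integer, judgment value must be greater (increase)
    (str, operator.eq),    # R2: string, judgment value must be equal (unchanged)
    (float, operator.lt),  # R3: float, judgment value must be smaller (decrease)
]


def merge(arrivals, comparison):
    """Yield the payload of every message that passes all three sub-rules."""
    comparison = list(comparison)
    for head, data in arrivals:
        if head:  # an empty head bypasses rule judgment (step 3)
            judged = [cast(v) for (cast, _), v in zip(RULES, head)]
            # all() stops at the first failing sub-rule, like steps 5 to 7.
            if not all(op(new, old) for (_, op), new, old in zip(RULES, judged, comparison)):
                continue  # discard the message
            comparison = judged  # step 8
        yield data  # step 9


m1 = ((1, "SH", 99.5), "tick-1")
m2 = ((2, "SH", 99.0), "tick-2")
m3 = ((3, "SH", 98.5), "tick-3")
# Source 1 loses m2; source 2 delivers every message slightly later.
arrivals = [m1, m1, m2, m3, m3]
print(list(merge(arrivals, (0, "SH", 100.0))))  # ['tick-1', 'tick-2', 'tick-3']
```

The second copies of m1 and m3 fail the R1 increase check and are dropped, while m2, which only source 2 delivered, still reaches the client. That combination of de-duplication and gap filling is the behavior the embodiment describes.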

Claims (3)

1. An efficient data merging method, comprising the steps of:
a data publisher publishes data messages through N data sources, where N ≥ 2; a data merging unit connects to the N data sources simultaneously and merges the N data streams obtained from them into one output stream; and a client uses the stream output by the data merging unit, wherein the data merging unit merges the N data streams into one by the following steps:
step 1, the data publisher formulates a data merging rule consisting of M sub-rules, where M ≥ 1, each sub-rule defining a data type field, a calculation order field, and a calculation direction field, wherein: the data type field defines the data type of the current sub-rule; the calculation order field defines the position of the current sub-rule among the M sub-rules during rule judgment, so that the sub-rules are judged in sequence from the 1st to the M-th according to this field; and the calculation direction field defines the change trend required between two successive values of the current sub-rule's data;
the data message published by the N data sources consists of a message header field head and a message content field data, wherein: head is non-empty if the current data message needs to be merged and empty otherwise; head is composed of values of different data types, which are determined by the data type fields of the M sub-rules; and data stores the data that the data publisher actually needs to publish;
step 2, the data merging unit receives and stores the data merging rule provided by the data publisher;
according to the data type fields and calculation order fields of the M sub-rules, the data merging unit initializes M comparison values in one-to-one correspondence with the M sub-rules, where the data type of the m-th comparison value is the data type specified by the data type field of the m-th sub-rule, m = 1, …, M;
step 3, after receiving a data message, the data merging unit judges whether the message header field head of the data message is empty; if so, go to step 10; if not, go to step 4;
step 4, judge whether a data merging rule has been stored; if so, go to step 5; otherwise, go to step 10;
step 5, set m = 1;
step 6, obtain the m-th sub-rule and parse the message header field head of the current data message according to the data type field of the m-th sub-rule to obtain the m-th judgment value;
step 7, compare the m-th judgment value with the m-th comparison value and judge whether the change from the m-th comparison value to the m-th judgment value matches the change trend specified by the calculation direction field of the m-th sub-rule; if so, go to step 8; if not, discard the current data message and return to step 3 to judge the next data message;
step 8, judge whether m equals M; if so, update the M comparison values with the M judgment values obtained and go to step 10; otherwise, further judge whether m is smaller than M; if so, go to step 9; if m is larger than M, update the M comparison values with the M judgment values obtained and go to step 10;
step 9, update m to m + 1 and return to step 6;
and step 10, extract the data from the message content field data of the current data message and provide it to the client as the merged result of the N data streams.
2. The efficient data merging method as claimed in claim 1, wherein the change trend includes increase, decrease, unchanged, and changed.
3. The efficient data merging method as claimed in claim 1, wherein M different data types are defined by the data type fields of the M sub-rules.
CN202110471732.9A 2021-04-29 2021-04-29 Efficient data merging method Active CN113392146B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110471732.9A CN113392146B (en) 2021-04-29 2021-04-29 Efficient data merging method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110471732.9A CN113392146B (en) 2021-04-29 2021-04-29 Efficient data merging method

Publications (2)

Publication Number Publication Date
CN113392146A (en) 2021-09-14
CN113392146B (en) 2024-02-23

Family

ID=77617789

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110471732.9A Active CN113392146B (en) 2021-04-29 2021-04-29 Efficient data merging method

Country Status (1)

Country Link
CN (1) CN113392146B (en)

Also Published As

Publication number Publication date
CN113392146B (en) 2024-02-23

Similar Documents

Publication Publication Date Title
US11995402B2 (en) Calculating structural differences from binary differences in publish subscribe system
US7826451B2 (en) Method of stateless group communication and repair of data packets transmission to nodes in a distribution tree
US10795744B2 (en) Identifying failed customer experience in distributed computer systems
CN111897878B (en) Master-slave data synchronization method and system
CN110928851A (en) Method, device and equipment for processing log information and storage medium
CN111526188B (en) System and method for ensuring zero data loss based on Spark Streaming in combination with Kafka
US20160285969A1 (en) Ordered execution of tasks
CN114090288A (en) Data pushing method and device
US20070130219A1 (en) Traversing runtime spanning trees
CN113392146B (en) Efficient data merging method
CN111625467B (en) Automatic testing method and device, computer equipment and storage medium
CN112543145A (en) Method and device for selecting communication path of equipment node for sending data
CN111931105A (en) Kafka consumption appointed push time data processing method
CN110336706B (en) Network message transmission processing method and device
US10812355B2 (en) Record compression for a message system
CN112600753B (en) Equipment node communication path selection method and device according to equipment access amount
CN115210694A (en) Data transmission method and device
CN111970340A (en) Information transmission method, readable storage medium and electronic device
CN117596126B (en) Monitoring method for high-speed network abnormality in high-performance cluster
Wu et al. SUNVE: Distributed Message Middleware towards Heterogeneous Database Synchronization
CN109347678B (en) Method and device for determining routing loop
US20050071497A1 (en) Method of establishing transmission headers for stateless group communication
KR20230169743A (en) Method and apparatus for data communication in federated learning
CN116402616A (en) Time slice based multi-source multi-shot snapshot estrus optimization method, medium and device
CN116723142A (en) Real-time rerouting method, device, equipment, storage medium and product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant