CN116776275A - A multi-thread-based multi-link data reception and fusion method - Google Patents

A multi-thread-based multi-link data reception and fusion method

Info

Publication number
CN116776275A
CN116776275A (application CN202310467072.6A)
Authority
CN
China
Prior art keywords
data
node
protocol
nodes
thread
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310467072.6A
Other languages
Chinese (zh)
Other versions
CN116776275B (en)
Inventor
袁铭
孙渊博
李大伟
冯帆
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Computer Technology and Applications
Original Assignee
Beijing Institute of Computer Technology and Applications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Computer Technology and Applications
Priority to CN202310467072.6A (granted as CN116776275B)
Publication of CN116776275A
Application granted
Publication of CN116776275B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S19/00Satellite radio beacon positioning systems; Determining position, velocity or attitude using signals transmitted by such systems
    • G01S19/38Determining a navigation solution using signals transmitted by a satellite radio beacon positioning system
    • G01S19/39Determining a navigation solution using signals transmitted by a satellite radio beacon positioning system the satellite radio beacon positioning system transmitting time-stamped messages, e.g. GPS [Global Positioning System], GLONASS [Global Orbiting Navigation Satellite System] or GALILEO
    • G01S19/42Determining position
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/546Message passing systems or structures, e.g. queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/54Indexing scheme relating to G06F9/54
    • G06F2209/547Messaging middleware

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Remote Sensing (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a multi-thread-based multi-link data reception and fusion method, belonging to the field of high-availability data fusion. The invention applies to high-reliability remote data transmission and storage and to node service cluster construction, and can serve application environments such as GPS or Beidou object positioning and object-tracking display services, so that data providers need not worry about data transmission being interrupted by server downtime, power outages, or the termination of a node's application process. When one machine room of a multi-room service cluster loses power, users need not worry about interrupted data transmission; this improves the stability of data delivery, reduces operators' operation and maintenance costs, and lets users focus solely on the data display business. The application also offers configurable data source connections, configurable data protocol parsing, and configurable fusion filtering algorithms: users can define their own data source connections, data protocols, and data filtering methods for parsing, filtering, and forwarding data.

Description

A multi-thread-based multi-link data reception and fusion method

Technical field

The invention belongs to the field of high-availability data fusion and specifically relates to a multi-thread-based multi-link data reception and fusion method.

Background art

As Beidou positioning technology has matured and its use has spread, more and more mobile, desktop, and web clients offer users convenient and fast positioning services, and Beidou has become a conventional technology serving political, military, civilian, and other fields. The content of the Beidou data protocol is increasingly diverse, comprising numeric forms such as longitude, latitude, and date-time as well as string forms such as ordinary text. Traditional positioning technology, however, cannot deliver the required positioning accuracy, suffers from long data transmission cycles, and lacks a solution for sustained data delivery, so it is no longer suitable for positioning services that are used ever more frequently and demand high reliability.

To solve the many problems that occur when Beidou positioning technology is applied, such as server downtime, disappearing server application processes, data loss, disordered data timing, and Beidou positioning data drift, and to make data transmission more stable and efficient when Beidou positioning is combined with applications, the invention applies multi-threading technology and fusion filtering algorithms to Beidou data source connection, data protocol parsing, and erroneous-data filtering, in order to solve the various problems that arise while data is received and forwarded.

The invention is a data transmission service for Beidou/GPS positioning data; it chiefly provides sustainable transmission of basic data through multi-link data filtering and fusion, generalized parsing of multiple protocols, and generalized handling of multiple connections. The specific problems addressed include:

Preventing data loss when downtime occurs or an application process disappears: this requires building a node cluster, which increases stability and reduces operation and maintenance costs.

Handling the excessive variety of protocols and new protocols of unknown format: this requires designing a general configuration file for parsing multiple data protocols; following the configuration instructions, users can define their own protocol formats, reducing both the steps users must perform and development costs.

Solving data redundancy and the sending of erroneous data: based on the characteristics of their own data, users can define custom data filtering algorithms to filter out erroneous data.

Solving the problem of complex and diverse data sources.

Solving state synchronization when building a node cluster.

Restoring a node's state when it restarts after recovery.

Handling request failures caused by the shutdown of one or more servers or the loss of processes during state synchronization across all nodes.

Summary of the invention

(1) Technical problems to be solved

The technical problem to be solved by the invention is how to provide a multi-thread-based multi-link data reception and fusion method that resolves the many problems arising when Beidou positioning technology is applied, such as server downtime, disappearing server application processes, data loss, disordered data timing, and Beidou positioning data drift.

(2) Technical solution

To solve the above technical problems, the invention proposes a multi-thread-based multi-link data reception and fusion method. The method is based on a fusion system composed of multiple nodes forming a cluster. Each node connects to three targets: the first is the data source, used for receiving data; the second is the database, a relational database used to record cluster state; the third is the cache middleware Redis, used to execute the data fusion script protocol. The method comprises:

In each node, the receiver form, data protocol format, and data filtering algorithm are defined through a custom configuration file. Multiple threads are created via a thread pool; each thread establishes one receiver and runs data reception and data processing concurrently, and the data processing of each receiver does not affect the others. After parsing, the data from the multiple sources is aggregated for deduplication and filtering.

Each node runs independently and, through multi-threading, connects to multiple data sources in parallel. Each node deduplicates the data received from its sources, then computes the distance and the reception time difference between the two most recent points to judge whether a point is erroneous; if so, it is filtered out. The processed data is then sent to the cache middleware Redis, which uses its script facility and a SET collection to perform deduplication and forwarding; stale cached data can be destroyed on a schedule.

The cache middleware Redis is deployed outside the application cluster, and every node of the cluster connects to it over TCP. Data is sent to a Redis SET collection; if a record is not already in the SET, it is also sent to a blocking queue. Because Redis supports clients in many languages, consumers in different languages can read the blocking queue.

Each node is deployed on a different server, where it runs as a process; the cluster is built on multiple servers, which may be distributed across multiple machine rooms. If one server goes down or restarts and its process disappears, the cluster loses one node, but the other nodes keep running and no data is lost. Because all nodes receive the same data, the data they send is also identical; even if one node's server fails, the other nodes continue to receive and send data. Using the Redis script facility and the no-duplicates property of the SET collection, every node sends its data to the same SET and checks whether a record already exists: if it exists, it has already been received; if not, it has not yet been received. Thus, when one node dies, sending resumes within one second, and the data remains unique.

Every record a node receives carries a data timestamp. Each node first filters out records whose timestamps are out of order; however, the nodes send to Redis at different times, so the Redis script protocol can be customized in two ways. In the first scheme, because each node's data is already ordered, it suffices to set up a SET collection in Redis: all nodes send their data to the SET, and its no-duplicates property determines whether a record has been received. If the record is not in the SET, it is cached in the SET and pushed onto a blocking queue; since the queue is first-in first-out, the data is ultimately consumed in order. In the second scheme, a permission Key with a five-second lifetime is set in Redis (the lifetime can be customized to any positive integer); the Key's value is a node's IP address, and the permission is granted to whichever node sends data first. That node then has five seconds of permission to send data to the SET collection, again checking whether the SET already holds the same record and, if not, caching it in the SET and pushing it onto the blocking queue. After five seconds an empty Key is created anew, and whichever node then sends first has its IP address assigned to the Key. Because each node's data is time-ordered, the data finally gathered in the queue is time-ordered as well.

A user sends a request to node 1; node 1 reads the cluster configuration from the configuration file, forwards the request asynchronously to the other nodes, and synchronizes state so that all nodes of the cluster hold the same receiving and forwarding state at the same time. When synchronization succeeds, the current state is saved in the database. If a node is down or unreachable, it can be removed without affecting the requests of the other nodes. If a node is not down or unreachable but its service request fails, then, to keep all node states consistent, the previous state already stored in the database is read back and used to restore every node to that state. The database stores node state, connection, and forwarding information, so when a node dies and is restarted, its connections and other state can be recovered.
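The state-synchronization step above can be sketched in code. This is only an illustrative, simplified model and not the patent's implementation: the node dictionaries, the `apply` step, and the in-memory stand-in for the relational database are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def _try(fn, arg):
    try:
        fn(arg)
        return True
    except ConnectionError:
        return False

def sync_cluster_state(nodes, new_state, db):
    """Forward new_state to every node in parallel; if any node's request
    fails, roll all reachable nodes back to the last state saved in db."""
    previous = db.get("cluster_state")

    def apply(node):
        if node.get("down"):
            raise ConnectionError(node["addr"])
        node["state"] = new_state

    reachable = [n for n in nodes if not n.get("removed")]
    # Asynchronously send the same request to all nodes, as node 1 does.
    with ThreadPoolExecutor(max_workers=len(reachable)) as pool:
        results = list(pool.map(lambda n: _try(apply, n), reachable))

    if all(results):
        db["cluster_state"] = new_state   # synchronization succeeded
        return True
    for node in reachable:                # restore the previous state
        if not node.get("down"):
            node["state"] = previous
    return False
```

A restarted node would read `db["cluster_state"]` to recover its previous connection and forwarding state.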

(3) Beneficial effects

The invention proposes a multi-thread-based multi-link data reception and fusion method, disclosing a fusion filtering method for multi-thread, multi-link data. Its main advantages are:

1. A multi-link data fusion solution is designed: even when the server shuts down or the application process disappears, data delivery resumes within one second, so customers need not worry about power outages and the like; a startup script lets the application run as soon as the server boots, with no further action required.

2. A cluster-building solution is designed: using multi-threading, a request sent to one node is forwarded asynchronously by that node to all other nodes, so configuration file upload, configuration synchronization, and sending-state synchronization are handled naturally. When a server goes down, the application node on it disappears and the other nodes remove it without affecting their own operation. When a request fails on one node, a rollback of all node states can be triggered so that every node's state stays consistent, reducing operation and maintenance work.

3. A method supporting user-defined data filtering algorithms is designed. Users can specify exactly which data they need and customize their own filtering methods by uploading a binary algorithm file and a configuration file, which is simple and effective.

4. A simple service-building method supporting multiple data sources is designed. Using multi-threading, one node can connect to several data sources; by configuring the basic information of each source, the forwarding destinations, the routing table, and so on, and uploading the configuration file, users can customize the application node they need. The configuration is saved in the database, so a restarted node reuses the previous connection configuration and the file need not be uploaded again.

5. A generalized data protocol parsing solution is designed. Through a customizable configuration file protocol, the application node's protocol handling becomes generic, sparing developers from building support for each new protocol; uploading the configuration file once sets up the environment for the user-defined protocol parsing algorithm, reducing the work of defining data protocols.

6. An overall architecture is designed, comprising the cache middleware Redis, the database, and the node applications, and supporting data access by clients written in multiple languages.

Brief description of the drawings

Figure 1 is the system architecture diagram of the invention;

Figure 2 is the business flow chart;

Figure 3 is the multi-link deployment diagram;

Figure 4 is the node cluster state synchronization diagram;

Figure 5 is the difference data processing flow;

Figure 6 is the flow chart of the difference algorithm.

Detailed description of the embodiments

To make the purpose, content, and advantages of the invention clearer, its specific embodiments are described in further detail below with reference to the accompanying drawings and examples.

The invention belongs to the field of high-availability data fusion: a multi-thread-based multi-link data reception and fusion method involving multi-threading, filtering algorithms, and data protocol parsing.

The invention provides a solution for data fusion and filtering, node state synchronization, and node state recovery, combining Beidou technology with multi-threading; it provides a multi-thread-based multi-link data reception, fusion, and filtering method.

The method applies to high-reliability remote data transmission and storage and to node service cluster construction, and can serve application environments such as GPS or Beidou object positioning and object-tracking display services, so that data providers need not worry about data transmission being interrupted by server downtime, power outages, or the termination of a node's application process. When one machine room of a multi-room service cluster loses power, users need not worry about interrupted data transmission; this improves the stability of data delivery, reduces operators' operation and maintenance costs, and lets users focus solely on the data display business. The application also offers configurable data source connections, configurable data protocol parsing, and configurable fusion filtering algorithms: users can define their own data source connections, data protocols, and filtering methods for parsing, filtering, and forwarding data.

In this multi-thread-based multi-link data reception and fusion method, multiple nodes can be started as a cluster, and every node performs data sending, receiving, and fusion, so no data is sent twice and none is lost; the architecture is shown in Figure 1.

The method is based on a fusion system composed of multiple nodes forming a cluster. Each node connects to three targets: the first is the data source (there may be several), used for receiving data; the second is the database, a relational database such as Oracle, MySQL, or the domestic databases Dameng and Shentong, used to record cluster state; the third is the cache middleware Redis, used to execute the data fusion script protocol. The scheme solves the following problems.

1.1 Multi-threaded processing of multi-source data

In each node, the receiver form, data protocol format, and data filtering algorithm are defined through a custom configuration file. Multiple threads are created via a thread pool; each thread establishes one receiver and runs data reception and data processing concurrently, and the data processing of each receiver does not affect the others. After parsing, the data from the multiple sources is aggregated for deduplication and filtering.
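The one-receiver-per-thread arrangement can be sketched as follows. This is a minimal illustration rather than the patent's implementation; the list-based data sources and the `parser` callable are hypothetical stand-ins for the configured receivers and protocol parsers.

```python
from concurrent.futures import ThreadPoolExecutor

def make_receiver(source, parser):
    """Each thread runs one receiver: it drains its own data source and
    parses records independently of the other receivers."""
    def receive():
        return [parser(raw) for raw in source]
    return receive

def fuse(sources, parser):
    # One pool thread per data source, as described above.
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        batches = pool.map(lambda s: make_receiver(s, parser)(), sources)
        merged = [rec for batch in batches for rec in batch]
    # After parsing, deduplicate the combined multi-source data.
    seen, unique = set(), []
    for rec in merged:
        if rec not in seen:
            seen.add(rec)
            unique.append(rec)
    return unique
```

With two overlapping sources, `fuse([["a", "b"], ["b", "c"]], str.upper)` yields each record once, preserving per-source order.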

1.2 Multi-link data fusion

As shown in Figure 1, the cluster consists of multiple independently running nodes, each of which connects to several data sources via multi-threading and runs in parallel. Because each node receives duplicate and erroneous data, and the data the cluster sends must be neither redundant nor erroneous, each node first deduplicates the data received from its sources, then computes the distance and the reception time difference between the two most recent points to judge whether a point is erroneous; if so, it is filtered out. The processed data is finally sent to the cache middleware Redis, whose script facility and SET collection perform deduplication and forwarding and destroy stale data on a schedule. Thanks to the efficiency of the script facility, data delivery can be restored within one second; the script protocol is shown in Figure 2.
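The distance-and-time-difference check on the two most recent points can be sketched as a plausibility filter. The patent does not specify the formula or threshold; the haversine distance and the 100 m/s speed limit below are assumptions for illustration only.

```python
import math

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    r = 6371000.0  # mean Earth radius in metres
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def is_error(prev, curr, max_speed_mps=100.0):
    """Flag the current point as erroneous if its timestamp is out of
    order or the implied speed from the previous point exceeds the
    (hypothetical) max_speed_mps threshold."""
    dt = curr["t"] - prev["t"]
    if dt <= 0:
        return True   # out-of-order timestamp
    d = haversine_m(prev["lat"], prev["lon"], curr["lat"], curr["lon"])
    return d / dt > max_speed_mps
```

A point about 111 m away after 10 s passes, while a one-degree jump in the same interval is filtered out.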

(1) The invention provides data externally and supports different programming languages.

The cache middleware Redis is deployed outside the application cluster; as long as the network is reachable, it may sit in a different machine room or even a different region. Redis itself is deployed as a cluster, and every application node connects to it over TCP, writing data to the Redis SET collection and blocking queue. Because Redis supports client connections in many languages, systems developed in different languages can consume the data.

(2) The invention has high availability, and the fused data is unique.

Each node is deployed on a different server, where it runs as a process; the cluster is built on multiple servers, which may be distributed across multiple machine rooms. If one server goes down or restarts and its process disappears, the cluster loses one node, but the other nodes keep running and no data is lost, as shown in Figure 3. All nodes receive the same data, so the data they send is also identical; even if one node's server fails, the other nodes continue to receive and send data. Using the Redis script facility and the no-duplicates property of the SET collection, every node sends its data to the same SET and checks whether a record already exists: if it exists, it has already been received; if not, it has not yet been received. Thus, when one node dies, sending resumes within one second and the data remains unique; a system that needs to display object motion then consumes the data and shows the object's movement on a map.
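The exactly-once forwarding that every node performs against Redis can be simulated in memory as follows. In a real deployment this check-and-push would run atomically as a Redis script (e.g. via EVAL) against a SET and a list used as a blocking queue; the class below is only an illustrative stand-in.

```python
from collections import deque

class FusionBuffer:
    """In-memory stand-in for the Redis SET plus blocking queue: a record
    offered by any node is forwarded exactly once, whichever node is first."""
    def __init__(self):
        self.seen = set()      # plays the role of the Redis SET
        self.queue = deque()   # plays the role of the blocking queue

    def offer(self, record):
        # Atomically: if the record is not in the SET, cache it and push
        # it onto the queue; otherwise another node has already sent it.
        if record in self.seen:
            return False
        self.seen.add(record)
        self.queue.append(record)
        return True
```

If two nodes both offer the same record, only the first call enqueues it, which is why a surviving node can take over delivery without duplicating data.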

(3)融合后的数据具有有序性(3) The fused data is orderly

Each datum received by a node carries a data timestamp. As shown in Figure 2, every node first filters out data whose timestamps are out of order. However, the nodes send data to the Redis cache middleware at different times, so there are two options for customizing the Redis script protocol. In the first option, because each node's data is already ordered, a SET is created in the Redis cache middleware and, exploiting the SET's uniqueness property, each datum is offered to the SET to determine whether it has already been received. If the datum is not in the SET, it is cached in the SET and then pushed onto a blocking queue; because the queue is first-in, first-out, the data is ultimately consumed in order. In the second option, a permission Key with a 5-second time-to-live is created in Redis (the 5 seconds can be customized to any positive integer value). The Key's value is a node's IP address, and the permission is granted to whichever node sends data first; that is, the Key holds the IP address of the first sender, which then has permission to send data to the SET for 5 seconds. The SET is checked for the same datum; if it is absent, the datum is cached in the SET and then pushed onto the blocking queue. After 5 seconds an empty Key is created again, and whichever node sends data first has its IP address assigned to the Key. Since each node's data is time-ordered, the data that finally accumulates in the queue is also time-ordered.
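The first option above can be sketched in-process. This is a minimal illustration only: the `HashSet` stands in for the Redis SET (uniqueness check) and the `ArrayDeque` for the blocking queue (FIFO ordering); the class and data format are invented for the example, and in the real system the check-and-push would run atomically inside a Redis script.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

public class DedupQueue {
    private final Set<String> seen = new HashSet<>();       // stand-in for the Redis SET
    private final Deque<String> queue = new ArrayDeque<>(); // stand-in for the blocking queue

    /** Offer one datum; it is enqueued only the first time it is seen. */
    public synchronized boolean offer(String datum) {
        if (seen.add(datum)) {   // SET did not contain it: cache it and enqueue it
            queue.addLast(datum);
            return true;
        }
        return false;            // duplicate from another node: drop it
    }

    /** Consume in FIFO order, preserving each node's time ordering. */
    public synchronized String poll() {
        return queue.pollFirst();
    }

    public static void main(String[] args) {
        DedupQueue q = new DedupQueue();
        // Two nodes forward the same ordered stream; duplicates are dropped.
        for (String d : new String[]{"10:00#A", "10:01#B"}) q.offer(d);            // node 1
        for (String d : new String[]{"10:00#A", "10:01#B", "10:02#C"}) q.offer(d); // node 2
        for (String d; (d = q.poll()) != null; ) System.out.println(d);
    }
}
```

Because every node offers the same ordered stream, only the first copy of each datum reaches the queue, and consumption order matches arrival order.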

1.3 Cluster settings

As shown in Figure 3, the cluster consists of one or more nodes. Every node runs the same data-receiving and data-sending services and connects to the same data sources; there may be several data sources, each of which sends identical data to every node simultaneously. To set this up, configure the IP addresses and ports of all nodes in the application configuration file, following the server-socket notation and joining entries with commas to form the node-cluster configuration, for example 127.0.0.1:8080,127.0.0.1:8081,127.0.0.1:8082
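The comma-joined node list above can be split and validated in a few lines. The class and method names here are illustrative, not from the patent:

```java
import java.util.ArrayList;
import java.util.List;

public class ClusterConfig {
    /** Split "host:port,host:port,..." into trimmed "host:port" entries, validating each port. */
    public static List<String> parseNodes(String config) {
        List<String> nodes = new ArrayList<>();
        for (String entry : config.split(",")) {
            String e = entry.trim();
            int colon = e.lastIndexOf(':');
            if (colon < 1) throw new IllegalArgumentException("expected host:port, got " + e);
            int port = Integer.parseInt(e.substring(colon + 1)); // throws if not numeric
            if (port < 1 || port > 65535) throw new IllegalArgumentException("bad port " + port);
            nodes.add(e);
        }
        return nodes;
    }

    public static void main(String[] args) {
        System.out.println(parseNodes("127.0.0.1:8080,127.0.0.1:8081,127.0.0.1:8082"));
    }
}
```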

1.4 Generic data-protocol parsing scheme

The data-protocol format is configured through the configuration file: the protocol name, the protocol ID, and the byte layout of the protocol, for example byte 1 is the frame header, byte 2 the data type, byte 3 the data count, bytes 4 through 7 the longitude value, and so on. For each byte or byte array of the protocol data, the configuration file names a parsing method; using the programming language's reflection and mapping mechanisms, each name is mapped to an executable method over byte data, and those methods parse the data. Users can also upload their own method files to the database; these are loaded into memory, executable methods are obtained through reflection, and finally the method names and methods are cached in the application via the name-to-method mapping. When data needs to be parsed, the parsing method is looked up by the method name given in the protocol configuration, and the application parses the data through it.
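The reflection-and-mapping idea can be sketched as follows. The parser names (`parseU16`, `parseAscii`) and the registry class are invented for illustration; the patent only specifies that configured method names are resolved to executable methods over byte data and cached:

```java
import java.lang.reflect.Method;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class ByteParserRegistry {
    // Example parsers a user might register; each consumes a byte[] slice of the frame.
    public static int parseU16(byte[] b) { return ((b[0] & 0xFF) << 8) | (b[1] & 0xFF); }
    public static String parseAscii(byte[] b) { return new String(b, StandardCharsets.US_ASCII); }

    private final Map<String, Method> cache = new HashMap<>();

    /** Resolve a configured method name to an executable Method and cache the mapping. */
    public Method resolve(String name) throws NoSuchMethodException {
        Method m = cache.get(name);
        if (m == null) {
            m = ByteParserRegistry.class.getMethod(name, byte[].class);
            cache.put(name, m);
        }
        return m;
    }

    /** Invoke the parser named in the configuration on a byte slice. */
    public Object parse(String methodName, byte[] slice) throws Exception {
        return resolve(methodName).invoke(null, (Object) slice);
    }

    public static void main(String[] args) throws Exception {
        ByteParserRegistry r = new ByteParserRegistry();
        // As in the text: the configuration names the parser for each byte range.
        System.out.println(r.parse("parseU16", new byte[]{0x01, 0x02}));
        System.out.println(r.parse("parseAscii", "GPS".getBytes(StandardCharsets.US_ASCII)));
    }
}
```

User-uploaded method files would be loaded the same way, with `getMethod` looked up on the loaded class instead of the registry itself.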

1.5 Node connection status settings

The database is a relational database, such as Oracle, MySQL, or the domestic databases Dameng (达梦) and Shentong (神通). To use any of them, simply add the database driver and the connection URL in the configuration file. Once the connection target and forwarding target are set in the database, a node reads the database on startup and automatically establishes its receiving and forwarding connections.

1.6 Node status settings

(1) State synchronization mechanism

A user sends a request to node 1; node 1 reads the cluster configuration from the configuration file and asynchronously forwards the request to node 2, node 3, node 4, and so on, synchronizing state so that all nodes in the cluster have the same receiving and forwarding state at the same time. When synchronization succeeds, the current state information is saved to the database.
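The asynchronous fan-out step can be sketched with a thread pool. The peer names and the in-memory state map are illustrative stand-ins for real network calls and for the database write that follows a successful sync:

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class StateSync {
    // Stand-in for the remote nodes' state; a real system would issue network requests.
    static final Map<String, String> peerState = new ConcurrentHashMap<>();

    /** Forward the same state-change request to every peer in parallel. */
    public static void synchronize(String newState, List<String> peers) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, peers.size()));
        for (String peer : peers) {
            pool.submit(() -> peerState.put(peer, newState)); // stand-in for one async request
        }
        pool.shutdown();
        pool.awaitTermination(5, TimeUnit.SECONDS);
        // On success, the current state would be persisted to the database here.
    }

    public static void main(String[] args) throws InterruptedException {
        synchronize("FORWARDING", List.of("node2", "node3", "node4"));
        System.out.println(peerState);
    }
}
```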

(2) Node removal mechanism

When a node goes down or stops responding to requests, it can be removed without affecting the request handling of the other nodes.

(3) State rollback mechanism

The database stores each node's previous state. When a service request fails on a node, the previous state is read from the database and the node rolls back to restore it.

(4) State recovery mechanism

The database stores node state together with connection and forwarding information. When a node crashes and is restarted, its connection state and other state can be restored.

1.7 Customizable data-filtering protocol

(1) Custom data-filtering algorithms

Users can customize the data-filtering algorithm by uploading a binary file containing it. The application loads the binary file into memory and obtains the filtering algorithm through the programming language's reflection facilities, so that the application can filter out exactly the data the user wants filtered.
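A minimal sketch of resolving a user-supplied filter reflectively. `SpeedFilter` and its 340 m/s threshold are invented examples; a real deployment would load the uploaded class bytes (for instance via a class loader) rather than a class compiled into the application:

```java
import java.util.function.Predicate;

public class FilterLoader {
    /** Example filter a user might supply: drop readings with implausible speeds. */
    public static class SpeedFilter implements Predicate<Double> {
        @Override public boolean test(Double speed) { return speed >= 0 && speed < 340.0; }
    }

    /** Instantiate a Predicate by its fully qualified class name via reflection. */
    @SuppressWarnings("unchecked")
    public static Predicate<Double> load(String className) throws Exception {
        return (Predicate<Double>) Class.forName(className)
                .getDeclaredConstructor().newInstance();
    }

    public static void main(String[] args) throws Exception {
        Predicate<Double> f = load("FilterLoader$SpeedFilter");
        System.out.println(f.test(55.0));    // plausible speed: kept
        System.out.println(f.test(1200.0));  // implausible speed: filtered
    }
}
```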

1.8 Custom data-connection scheme

(1) Custom data-source connections

The invention supports multiple data sources. The configuration file specifies the connection protocol (TCP/UDP), the connection mode, whether the data source acts as client or server, and other settings.

(2) Custom TCP sticky-packet/half-packet processing algorithm

The TCP sticky-packet/half-packet processing algorithm is selected through the configuration file.

(3) Custom data-protocol parsing rules

Data-protocol construction and parsing rules are configured through the configuration file, and data is parsed accordingly.

1.9 Startup script

An application startup script is created and placed in the folder the server system runs at boot. When the server starts, it automatically runs the application, the Redis cache middleware, the database, the message-queue middleware, and so on, in a defined execution order.

2.0 Monitoring script

A monitoring script is written to watch the application process name and the Redis and database processes; it is used to start the database, the Redis cache middleware, the message-queue middleware, and so on, in a defined execution order.

Example 1:

The invention is an application developed on open-source Java frameworks and supports multiple databases, multiple data sources, multiple data protocols, and multiple stream-data filtering algorithms. The setup process is: first deploy the database and the Redis cache middleware, start the node applications on multiple servers, upload the configuration files, and finally start the data sources so they send data; the data then undergoes fusion and filtering.

(1) Connector configuration: multiple data-source connections are established with multiple threads. The connector can be configured with several connection protocols, TCP or UDP; UDP supports multicast and unicast modes, while TCP connections support handling of the sticky-packet and half-packet problems. The invention provides two sticky-/half-packet algorithms. In the first, every frame in the data carries a header and a length field, and the header and length are used to judge whether the current data is one complete frame; if not, it is discarded. The second supports a frame header, frame tail, and check bytes used to detect whether the data is one complete frame; sticky packets are split apart, while a half packet is buffered and processed when the next frame of data arrives.
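The first algorithm (header plus length field) can be sketched as a small frame decoder. The one-byte 0x7E header and one-byte length field are invented for illustration; the patent fixes only the "header plus length" scheme, not the field widths:

```java
import java.io.ByteArrayOutputStream;
import java.util.ArrayList;
import java.util.List;

public class FrameDecoder {
    static final byte HEADER = 0x7E;
    private final ByteArrayOutputStream buf = new ByteArrayOutputStream();

    /** Feed one TCP read; returns every complete frame's payload, buffering half packets. */
    public List<byte[]> feed(byte[] chunk) {
        buf.write(chunk, 0, chunk.length);
        List<byte[]> frames = new ArrayList<>();
        byte[] b = buf.toByteArray();
        int pos = 0;
        while (pos + 2 <= b.length) {
            if (b[pos] != HEADER) { pos++; continue; }   // resync: skip a non-header byte
            int len = b[pos + 1] & 0xFF;
            if (pos + 2 + len > b.length) break;         // half packet: wait for more data
            byte[] payload = new byte[len];
            System.arraycopy(b, pos + 2, payload, 0, len);
            frames.add(payload);                         // sticky packets are unpacked one by one
            pos += 2 + len;
        }
        buf.reset();
        buf.write(b, pos, b.length - pos);               // keep the unconsumed remainder
        return frames;
    }

    public static void main(String[] args) {
        FrameDecoder d = new FrameDecoder();
        // Two frames arrive stuck together, the second split across two reads.
        System.out.println(d.feed(new byte[]{0x7E, 2, 'h', 'i', 0x7E, 3, 'g'}).size());
        System.out.println(d.feed(new byte[]{'p', 's'}).size());
    }
}
```

The same accumulate-and-scan structure extends naturally to the second algorithm by also checking the frame tail and verification bytes before emitting a frame.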

(2) Protocol configuration: the protocol name, protocol ID, and the protocol's sticky-/half-packet parsing algorithm are configured, together with the data type and parsing algorithm represented by each byte of the protocol data;

(3) Filter configuration: custom algorithms can be configured; the invention provides two. The first, given time-ordered data, computes the speed of a BeiDou-positioned object, judges from the speed whether the current datum or object is anomalous, and handles the anomaly. The formula is:

distance = R*arccos(sin(latitude1)*sin(latitude2) + cos(latitude1)*cos(latitude2)*cos(longitude1-longitude2))

v = distance/(time2 - time1)

where distance is the great-circle distance between the two points, R is the Earth's radius, latitude1 and longitude1 are the latitude and longitude of the first point, latitude2 and longitude2 those of the second, v is the speed, and time1 and time2 are the times of the first and second points (the angles must be in radians).

A custom algorithm can also be used: if the anomalous data follows a Gaussian distribution, a Gaussian filtering algorithm can establish a confidence interval, keep the data inside the interval, and filter out the erroneous data.
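The distance and speed formulas above translate directly into code. The 340 m/s anomaly threshold in the demo is an invented example value; the patent leaves the threshold to the custom algorithm:

```java
public class SpeedCheck {
    static final double R_KM = 6371.0; // mean Earth radius in kilometres

    /** Great-circle distance in km between two lat/lon points given in degrees
        (spherical law of cosines, matching the formula in the text). */
    public static double distanceKm(double lat1, double lon1, double lat2, double lon2) {
        double p1 = Math.toRadians(lat1), p2 = Math.toRadians(lat2);
        double dl = Math.toRadians(lon1 - lon2);
        return R_KM * Math.acos(Math.sin(p1) * Math.sin(p2)
                + Math.cos(p1) * Math.cos(p2) * Math.cos(dl));
    }

    /** v = distance / (time2 - time1), in metres per second. */
    public static double speedMps(double lat1, double lon1, long t1Sec,
                                  double lat2, double lon2, long t2Sec) {
        return distanceKm(lat1, lon1, lat2, lon2) * 1000.0 / (t2Sec - t1Sec);
    }

    /** Flag the datum as erroneous when the implied speed is implausible. */
    public static boolean isAnomalous(double speedMps, double maxMps) {
        return speedMps > maxMps;
    }

    public static void main(String[] args) {
        // One degree of longitude on the equator is roughly 111 km.
        double v = speedMps(0, 0, 0, 0, 1, 3600);
        System.out.printf("%.1f m/s anomalous=%b%n", v, isAnomalous(v, 340.0));
    }
}
```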

(4) Difference-algorithm configuration: for GPS or BeiDou data, a configuration option can be added. Because multithreading is used, difference calculations for multiple objects are supported concurrently.

(5) Method configuration within the field configuration parses the protocol data; the protocol-data parsing methods built into the node application can also be used.

The invention discloses a data fusion and filtering method for multi-threaded, multi-link data. Its main advantages are the following:

1. A multi-link data-fusion solution is designed: even if the server shuts down or the application process disappears, data transmission resumes within one second, so customers need not worry about power failures and similar problems. A startup script means the application runs as soon as the server is powered on, with no further action needed.

2. A cluster-construction solution is designed: using multithreading, a request is sent to one node, which asynchronously forwards the same request to the other nodes; this readily resolves configuration-file upload, configuration-file synchronization, and sending-state synchronization across the cluster. When a server goes down, the application node on it disappears and the other nodes remove it without affecting their own normal operation. When a request fails on one node, a rollback of all node states can be triggered so that all nodes remain consistent, reducing operations and maintenance work.

3. A method supporting user-defined data-filtering algorithms is designed: users decide what data they need and customize their own filtering method by uploading a binary file and a configuration file, which is simple and effective.

4. A simple service-construction method supporting multiple data sources is designed: with multithreading, one node can connect to several data sources. By configuring the basic information of each data source, the forwarding destinations, the routing table, and so on, and uploading the configuration file, the application node a user needs can be customized and saved to the database. After a restart the node reuses the last configured connection information, so the configuration file need not be uploaded again.

5. A generic data-protocol parsing solution is designed: by customizing the configuration-file protocol, the node's protocol-handling function becomes generic, reducing the development work needed for new protocols. Uploading the configuration file once sets up the environment for a custom protocol-parsing algorithm, reducing the effort of defining data protocols.

6. An overall architecture is designed, comprising the Redis cache middleware, the database, and the node applications, and supporting clients written in multiple languages that connect and fetch data.

The above is only a preferred embodiment of the invention. It should be noted that those of ordinary skill in the art can make several improvements and variations without departing from the technical principles of the invention, and such improvements and variations shall also fall within the protection scope of the invention.

Claims (10)

1. A multi-thread-based multi-link data receiving and fusion method, characterized in that the method is based on a fusion system composed of multiple nodes forming a cluster, each node connecting to three targets: the first target is the data source, used to receive data; the second target is the database, a relational database used to record cluster state; the third target is the Redis cache middleware, used to execute the data-fusion script protocol; the method comprises:

in each node, defining the receiver form, the data-protocol form, and the data-filtering algorithm through a custom configuration file, and creating multiple threads through a thread pool, each thread creating one receiver and running data-receiving and data-processing services concurrently, the data-processing services of the receivers not affecting one another; after parsing, aggregating the data of the multiple data sources and performing deduplication and filtering;

each node being an independently running node that connects to multiple data sources through multithreading and runs in parallel; each node deduplicating the data received from the data sources, then computing the distance and the receive-time difference between the two most recent points to judge whether a datum is erroneous, filtering it if so, and finally sending the processed data to the Redis cache middleware, relying on Redis scripting and a SET to perform deduplication and data sending, with stale cached data destroyed periodically;

the Redis cache middleware being deployed outside the application cluster, every node in the cluster connecting to it over TCP and sending data to the Redis SET; when a datum is not in the SET it is sent to the blocking queue; the Redis cache middleware supports connections from multiple languages, which can consume the blocking-queue data;

each node being deployed on a different server, each node being one process on its server; the cluster is built on multiple servers distributed across multiple machine rooms; if one server goes down or restarts and its process disappears, the cluster loses one node, but the other nodes keep running and no data is lost; all nodes receive the same data, so the data they send is likewise identical, and the failure of one node's server does not prevent the other nodes from receiving and sending data; because of Redis scripting and the uniqueness of the SET, every node's data is sent to the SET and checked for existence: if present, it has already been received; if absent, it has not yet been received; thus when a node crashes, data transmission recovers within 1 second and the data remains unique;

each received datum carrying a data timestamp; each node first filters out data with out-of-order timestamps, but the nodes send to Redis at different times, so the Redis script protocol is customized in one of two ways: in the first, because each node's data is ordered, a SET is created in Redis, all nodes send their data to it, and the SET's uniqueness is used to judge whether a datum has been received; if the SET does not contain it, the datum is cached in the SET and sent to the blocking queue, and because the queue is first-in, first-out, final consumption is ordered; in the second, a permission Key with a 5-second time-to-live is set in Redis, the 5 seconds customizable to any positive integer value, the Key's value being a node's IP address; the permission is granted to the node that sends data first, that is, the Key holds that node's IP address, and the node then has 5 seconds of permission to send data to the SET; the SET is checked for the same datum and, if absent, the datum is cached in the SET and sent to the blocking queue; after 5 seconds an empty Key is created again, and whichever node sends data first has its IP address assigned to the Key; since each node's data is time-ordered, the data finally gathered in the queue is time-ordered;

a user sending a request to node 1, node 1 obtaining the cluster configuration from the configuration file, asynchronously forwarding the request to the other nodes and synchronizing state, so that all nodes in the cluster have the same receiving and forwarding state at the same time; when synchronization succeeds, the current state information is saved in the database; when a node goes down or stops responding, it is removed without affecting the request operations of the other nodes; otherwise, when a service request fails on one node, to keep all node states consistent, the node's previous state already stored in the database is read and used to restore the previous state; node state and connection and forwarding information are stored in the database, so that a crashed node, once restarted, recovers its connection state and other state.

2. The multi-thread-based multi-link data receiving and fusion method of claim 1, characterized in that the relational database is Oracle, MySQL, Dameng, or Shentong.

3. The multi-thread-based multi-link data receiving and fusion method of claim 1, characterized in that multiple data sources send the same data to every node simultaneously; the IP addresses and ports of all nodes are configured in the application configuration file, written in server-socket notation and joined with commas, to form the node-cluster configuration.

4. The multi-thread-based multi-link data receiving and fusion method of claim 1, characterized in that the protocol name, protocol ID, and byte array of the protocol are configured through the data-protocol format of the configuration file; a parsing-method name is configured for each byte or byte array of the protocol data, and the programming language's reflection and mapping mechanisms map each name to an executable method over byte data, through which the data is parsed.

5. The multi-thread-based multi-link data receiving and fusion method of claim 1, characterized in that a user customizes the filtering algorithm by uploading its binary file; the application loads the binary file into memory and obtains the filtering algorithm through the language's reflection, so that the application filters the data the user wants filtered.

6. The multi-thread-based multi-link data receiving and fusion method of claim 1, characterized in that a user customizes the data-source connection: the configuration file specifies the connection protocol (TCP/UDP), the connection mode, and whether the data source is configured as client or server; the TCP sticky-/half-packet processing algorithm is configured through the configuration file; and data-protocol construction and parsing rules are configured through the configuration file to parse the data.

7. The multi-thread-based multi-link data receiving and fusion method of claim 1, characterized in that an application startup script is created and placed in the folder the server system runs at boot; when the server starts, the application, the Redis cache middleware, the database, and the message-queue middleware are run automatically, in a defined execution order.

8. The multi-thread-based multi-link data receiving and fusion method of claim 1, characterized in that a monitoring script is written to watch the application process name and the Redis and database processes, used to start the database, the Redis cache middleware, and the message-queue middleware, in a defined execution order.

9. The multi-thread-based multi-link data receiving and fusion method of any one of claims 1-8, characterized in that multiple data-source connections are established through multithreading; the connector is configured with multiple connection protocols, TCP or UDP; UDP supports multicast and unicast modes, and TCP supports handling of the sticky-packet and half-packet problems; there are two sticky-/half-packet algorithms: in the first, every frame in the data carries a header and a length field, and these determine whether the current data is one complete frame, which is otherwise discarded; the second supports a frame header, frame tail, and check bytes used to detect whether the data is one complete frame; sticky packets are split apart, and a half packet is buffered and processed when the next frame of data arrives.

10. The multi-thread-based multi-link data receiving and fusion method of claim 9, characterized in that the filter configuration comprises configuring custom algorithms: the first, from time-ordered data, computes the speed of a BeiDou-positioned object, judges from the speed whether the current datum or object is anomalous, and handles the anomaly, with the formula:

distance = R*arccos(sin(latitude1)*sin(latitude2) + cos(latitude1)*cos(latitude2)*cos(longitude1-longitude2))

v = distance/(time2 - time1)

where distance is the distance, R is the radius, latitude1 and longitude1 are the latitude and longitude of the first point, latitude2 and longitude2 those of the second, v is the speed, and time1 and time2 are the times of the first and second points;

the second is a custom algorithm: if the anomalous data follows a Gaussian distribution, a Gaussian filtering algorithm establishes a confidence interval, keeps the data inside the interval, and filters out the erroneous data.
CN202310467072.6A 2023-04-26 2023-04-26 A multi-link data receiving fusion method based on multi-threading Active CN116776275B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310467072.6A CN116776275B (en) 2023-04-26 2023-04-26 A multi-link data receiving fusion method based on multi-threading


Publications (2)

Publication Number Publication Date
CN116776275A true CN116776275A (en) 2023-09-19
CN116776275B CN116776275B (en) 2025-04-11

Family

ID=87986820

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310467072.6A Active CN116776275B (en) 2023-04-26 2023-04-26 A multi-link data receiving fusion method based on multi-threading

Country Status (1)

Country Link
CN (1) CN116776275B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117540151A (en) * 2023-12-08 2024-02-09 深圳市亲邻科技有限公司 Data preprocessing method of data pushing system

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190174276A1 (en) * 2017-12-01 2019-06-06 Veniam, Inc. Systems and methods for the data-driven and distributed interoperability between nodes to increase context and location awareness in a network of moving things, for example in a network of autonomous vehicles
CN113407341A (en) * 2021-06-22 2021-09-17 中国工商银行股份有限公司 Load object synchronization method and device based on distributed load balancing
CN113722281A (en) * 2021-08-24 2021-11-30 中国建设银行股份有限公司 Service process processing system and method using multi-level data cache
CN113783921A (en) * 2021-01-27 2021-12-10 北京京东振世信息技术有限公司 Method and apparatus for creating cache components
US20220092024A1 (en) * 2020-09-24 2022-03-24 Commvault Systems, Inc. Container data mover for migrating data between distributed data storage systems integrated with application orchestrators
WO2022222579A1 (en) * 2021-04-23 2022-10-27 焦点科技股份有限公司 Database middleware cluster-based high-availability client load balancing method
CN115811546A (en) * 2022-11-17 2023-03-17 上海大学 System and method for realizing network cooperative distributed processing for scientific and technological service
CN116016671A (en) * 2022-12-22 2023-04-25 中能融合智慧科技有限公司 A method, system, device and medium for processing and forwarding multiple data sources


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Wang Zhiqiang: "Research on a Blockchain-Based Protection Mechanism for Personal Health Data", Master's Electronic Journals, Information Science and Technology Series, 15 January 2023 (2023-01-15) *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117540151A (en) * 2023-12-08 2024-02-09 深圳市亲邻科技有限公司 Data preprocessing method of data pushing system

Also Published As

Publication number Publication date
CN116776275B (en) 2025-04-11

Similar Documents

Publication Publication Date Title
WO2021121370A1 (en) Message loss detection method and apparatus for message queue
US7275177B2 (en) Data recovery with internet protocol replication with or without full resync
US9514208B2 (en) Method and system of stateless data replication in a distributed database system
CN103763396B (en) Energy consumption data acquisition device and its acquisition method based on multi-protocols parallel acquisition technique
US20040267836A1 (en) Replication of snapshot using a file system copy differential
CN106953901A (en) A cluster communication system and method for improving message delivery performance
US20120180070A1 (en) Single point, scalable data synchronization for management of a virtual input/output server cluster
CN107430606B (en) Message Broker System with Parallel Persistence
JP2012528382A (en) Cache data processing using cache clusters in configurable mode
CN104486107A (en) Log collection device and method
CN110209507A (en) Data processing method, device, system and storage medium based on message queue
US11240306B2 (en) Scalable storage system
CN107682169B (en) Method and device for sending message by Kafka cluster
CN109639773A (en) A kind of the distributed data cluster control system and its method of dynamic construction
CN116776275A (en) A multi-thread-based multi-link data reception and fusion method
WO2019089057A1 (en) Scalable storage system
CN104270450A (en) A dual-controller multi-link heartbeat monitoring method using UDP protocol
CN115277375B (en) A switching method, system, device and storage medium for active and standby servers
CN113032477B (en) GTID-based long-distance data synchronization method, device and computing equipment
CN115632947B (en) A configuration sending method and device
CN111143475B (en) State management method and device for Storm data analysis
CN115129521A (en) Data synchronization method, device and system between Redis clusters
CN114826892A (en) Cluster node control method, device, equipment and medium
CN115617536A (en) A data synchronization method, device, network equipment and cluster
Liang et al. Study on Service Oriented Real-Time Message Middleware

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant