CN114793245A - Flexible and configurable streaming information processing method and system - Google Patents

Publication number
CN114793245A
CN114793245A CN202210708707.2A CN202210708707A CN114793245A CN 114793245 A CN114793245 A CN 114793245A CN 202210708707 A CN202210708707 A CN 202210708707A CN 114793245 A CN114793245 A CN 114793245A
Authority
CN
China
Prior art keywords
configuration
routing
module
processing
message
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210708707.2A
Other languages
Chinese (zh)
Other versions
CN114793245B (en)
Inventor
李文宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Upyun Technology Co ltd
Original Assignee
Hangzhou Upyun Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Upyun Technology Co ltd
Priority to CN202210708707.2A
Publication of CN114793245A
Application granted
Publication of CN114793245B
Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08 Configuration management of networks or network elements
    • H04L41/0803 Configuration setting
    • H04L41/0813 Configuration setting characterised by the conditions triggering a change of settings
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/74 Address processing for routing
    • H04L45/742 Route cache; Operation thereof
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention discloses a flexibly configurable streaming information processing method and system. The method comprises the following steps: offline data stored on a server is sent to a message receiving module in batches through HTTP requests, converted into streaming information to be processed, and stored in the message receiving module; the message receiving module passes the HTTP request to a routing module, which determines the routing configuration from the URI of the HTTP request; the routing module passes the determined routing configuration to a message processing module, which converts the streaming information to be processed according to the processing strategy in the routing configuration, sends the result to a back-end service if the conversion succeeds, and discards the message if it fails. The invention can load configuration flexibly and dynamically, execute customized processing logic, avoid stopping the service for updates, and reduce the burden on operations staff.

Description

Flexible and configurable streaming information processing method and system
Technical Field
The invention relates to the field of streaming information processing, and in particular to a flexibly configurable streaming information processing method and system.
Background
A CDN service provider faces a large number of requirements for log customization, analysis, and processing. For clients with strict real-time log requirements, a real-time reporting mode is enabled to collect that client's logs. In the existing scheme, CDN edge access logs are reported to a gateway service in real time, and the gateway's routing function distributes the data to different back-end processing services; for request content in different formats, a separate service has to be developed to implement each specific processing logic.
The existing scheme has a long processing chain and the following defects:
(1) flexible customization is not possible;
(2) the machine cost and the development cost are high;
(3) excessive backend services increase the workload of the operation and maintenance personnel.
Disclosure of Invention
The purpose of the present invention is to provide a flexibly configurable streaming information processing method and system that can load configuration flexibly and dynamically, execute customized processing logic, avoid stopping the service for updates, and reduce the burden on operations staff.
A flexibly configurable method of streaming information processing, comprising the steps of:
1) offline data stored on the server is sent to a message receiving module in batches through HTTP requests; this step converts the offline data into streaming information to be processed and stores it in the message receiving module, which facilitates processing by the subsequent modules;
2) the message receiving module passes the HTTP request to the routing module, and the routing module determines the routing configuration from the URI of the HTTP request; this step establishes a one-to-one mapping between a specific request URI and a specific message processing strategy;
3) the routing module passes the determined routing configuration to the message processing module, which converts the streaming information to be processed according to the processing strategy in the routing configuration; if the conversion succeeds, the streaming information is sent to a back-end service, and if it fails, the message is discarded;
4) the routing module checks the routing configuration in the configuration center service every 1-50 seconds and synchronizes it;
5) the message processing module checks the processing strategy in the configuration center service every 1-50 seconds and synchronizes it.
In step 2), determining the routing configuration through the URI corresponding to the HTTP request specifically includes:
the routing configuration is determined by longest-prefix matching of the URI against a prefix tree.
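A minimal Python sketch of longest-prefix matching on a URI follows. It assumes that prefixes are split on "/" into path segments and that any node of the tree may carry a strategy alias. The class name `RouteTrie`, the example prefixes, and the alias names are hypothetical and are not taken from the invention; Python is used here only for illustration.

```python
from typing import Dict, Optional


class RouteTrie:
    """Prefix tree keyed by URI path segments (hypothetical helper)."""

    def __init__(self) -> None:
        self.children: Dict[str, "RouteTrie"] = {}
        self.strategy_alias: Optional[str] = None

    def insert(self, prefix: str, strategy_alias: str) -> None:
        """Register a URI prefix and the alias of its processing strategy."""
        node = self
        for segment in filter(None, prefix.split("/")):
            node = node.children.setdefault(segment, RouteTrie())
        node.strategy_alias = strategy_alias

    def longest_match(self, uri: str) -> Optional[str]:
        """Return the alias of the deepest registered prefix of `uri`."""
        path = uri.split("?", 1)[0]  # ignore the query string
        node = self
        best = node.strategy_alias
        for segment in filter(None, path.split("/")):
            node = node.children.get(segment)
            if node is None:
                break  # no deeper prefix exists
            if node.strategy_alias is not None:
                best = node.strategy_alias
        return best


# Example routes; the prefixes and aliases are made up for illustration.
routes = RouteTrie()
routes.insert("/logs", "default_strategy")
routes.insert("/logs/nginx", "nginx_strategy")
```

With these routes, a request to `/logs/nginx/access` resolves to `nginx_strategy`, while `/logs/other` falls back to the shorter prefix `/logs`.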
In step 3), the message processing module performs conversion processing on the streaming information to be processed according to the processing policy in the routing configuration, and specifically includes:
3.1) the message processing module interprets the domain-specific language (DSL) of the routing configuration into an executable processing strategy;
the domain-specific language combines JSON text and Lua scripts to describe the message processing flow; JSON is one of the most widely used general-purpose configuration formats, so describing the processing flow in it is easy to understand and lowers the entry barrier for developers; Lua is a lightweight programming language in which developers can quickly learn to write message processing logic; combining the two makes customization work more flexible and convenient, and more agile when requirements change;
the combination of JSON text and Lua scripts specifically comprises:
JSON is used as the configuration description form to combine the message processing rules predefined by the message processing module, which comprise a message segmentation rule, a deserialization rule, a script executor rule, and a message output rule; message processing rules that are not predefined by the message processing module are implemented by programming in Lua script;
3.2) according to the message processing rules defined in step 3.1), the LuaJIT module built into OpenResty performs the message processing computation on the streaming information to be processed;
3.3) the processed messages are serialized into the target format specified in the configuration, completing the conversion.
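Steps 3.1) to 3.3) can be sketched in Python as follows. This is an illustration under stated assumptions, not the invention's implementation: the JSON field names (`split`, `deserialize`, `script`, `output`), the rule registries, and the function `keep_status_200` are all hypothetical, and a Python function stands in for the Lua script that the source uses for rules that are not predefined.

```python
import json
from typing import Any, Callable, Dict, List, Optional

Record = Dict[str, Any]

# Hypothetical strategy text; the field names are illustrative assumptions.
STRATEGY_JSON = """
{
  "split": "newline",
  "deserialize": "json",
  "script": "keep_status_200",
  "output": "json"
}
"""

# Predefined rules that the JSON text selects by name.
SPLIT_RULES: Dict[str, Callable[[str], List[str]]] = {
    "newline": lambda body: [line for line in body.split("\n") if line],
}
DESERIALIZE_RULES: Dict[str, Callable[[str], Record]] = {"json": json.loads}
OUTPUT_RULES: Dict[str, Callable[[Record], str]] = {
    "json": lambda record: json.dumps(record, sort_keys=True),
}


def keep_status_200(record: Record) -> Optional[Record]:
    """Custom rule (a Lua script in the source); None means 'filter out'."""
    return record if record.get("status") == 200 else None


SCRIPT_RULES: Dict[str, Callable[[Record], Optional[Record]]] = {
    "keep_status_200": keep_status_200,
}


def build_pipeline(strategy_text: str) -> Callable[[str], List[str]]:
    """Interpret the JSON strategy into an executable function (step 3.1)."""
    strategy = json.loads(strategy_text)
    split = SPLIT_RULES[strategy["split"]]
    deserialize = DESERIALIZE_RULES[strategy["deserialize"]]
    script = SCRIPT_RULES[strategy["script"]]
    output = OUTPUT_RULES[strategy["output"]]

    def run(body: str) -> List[str]:
        results = []
        for line in split(body):
            try:
                record = script(deserialize(line))  # step 3.2
            except (ValueError, AttributeError):
                continue  # conversion failed: discard the message
            if record is not None:
                results.append(output(record))  # step 3.3
        return results

    return run


process = build_pipeline(STRATEGY_JSON)
```

Because the rules are looked up by name, changing the JSON text changes the behaviour of `process` without touching the surrounding code, which is the property the description attributes to the JSON and Lua combination.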
In step 4), the routing module checks the routing configuration in the configuration center service every 1-50 seconds and synchronizes it, which specifically comprises:
4.1) the routing module queries the storage location of the configuration file in the configuration center service;
4.2) the routing module reads and parses each routing configuration;
4.3) the routing configurations are stored as key-value pairs; the routing module builds them into a prefix tree data structure held in memory, in which the leaf node of each routing configuration points to the alias of a unique message processing strategy, and each routing configuration carries a hash digest to facilitate version management;
4.4) the hash digest of the parsed routing configuration is calculated and compared with the hash digest of the routing configuration in the routing module's memory; if they differ, the routing configuration in memory is stale, so the parsed routing configuration is written into memory and the corresponding worker thread is restarted so that the new routing configuration takes effect.
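The digest comparison in step 4.4) can be sketched as below. The source does not name a hash algorithm or a serialization, so SHA-256 over canonical JSON is an assumption, as are the names `config_digest` and `RouteCache`; the `reloads` counter merely stands in for restarting the worker thread.

```python
import hashlib
import json
from typing import Dict


def config_digest(config: Dict[str, str]) -> str:
    """Digest of a routing configuration (SHA-256 is an assumed choice)."""
    # Sorting keys makes the digest independent of key order.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class RouteCache:
    """In-memory copy of the routing configuration (hypothetical helper)."""

    def __init__(self) -> None:
        self.config: Dict[str, str] = {}
        self.digest = config_digest(self.config)
        self.reloads = 0  # stands in for restarting the worker thread

    def sync(self, fetched: Dict[str, str]) -> bool:
        """Adopt `fetched` if its digest differs; return True on change."""
        new_digest = config_digest(fetched)
        if new_digest == self.digest:
            return False  # the in-memory configuration is still current
        self.config = dict(fetched)
        self.digest = new_digest
        self.reloads += 1
        return True
```

A scheduler would call `sync` with the configuration fetched from the configuration center at the chosen interval; only a changed digest triggers a reload.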
In step 5), the message processing module checks the processing strategy in the configuration center service every 1-50 seconds and synchronizes it, which specifically comprises:
5.1) the message processing module queries the storage location of the configuration file in the configuration center service;
5.2) the aliases of the message processing strategies recorded in the prefix tree data structure built by the routing module, together with their hash digest information, are used to mark the required processing strategies;
5.3) the message processing module queries and reads the required processing strategies and then parses them in a specific manner;
5.4) the hash digest of the parsed processing strategy is calculated and compared with the hash digest of the processing strategy in the message processing module's memory; if they differ, the parsed processing strategy is written into memory and the corresponding worker thread is restarted so that the newly configured processing strategy takes effect.
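Steps 5.2) to 5.4) can be illustrated with a short Python sketch. It assumes that the routing configuration is available as a mapping from URI prefix to strategy alias and that the configuration center can be read like a dictionary from alias to strategy text; both representations, the SHA-256 digest, and the function names are assumptions made for the example.

```python
import hashlib
from typing import Dict, List


def digest(text: str) -> str:
    """Digest of a strategy text (SHA-256 is an assumed choice)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def sync_strategies(
    route_config: Dict[str, str],
    config_center: Dict[str, str],
    loaded: Dict[str, str],
) -> List[str]:
    """Refresh only the strategies that routes refer to; return reloaded aliases."""
    required = set(route_config.values())  # step 5.2: mark required strategies
    reloaded = []
    for alias in sorted(required):
        text = config_center.get(alias)  # step 5.3: read the strategy
        if text is None:
            continue  # alias not published yet; keep whatever is loaded
        # Step 5.4: compare digests and update memory only on a difference.
        if alias not in loaded or digest(loaded[alias]) != digest(text):
            loaded[alias] = text
            reloaded.append(alias)
    return reloaded
```

Strategies that no route refers to are never fetched, so the message processing module holds only what the current routing configuration needs.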
A flexibly configurable streaming information processing system, comprising:
a message receiving module, which receives offline data from the server;
a configuration center service, which is a dynamic-configuration storage and distribution service built on a key/value storage service;
a routing module, which selects different routing processing logic for the received streaming information to be processed according to the routing configuration information;
and a message processing module, which subscribes to the routing configuration information from the configuration center service and converts the streaming information to be processed.
Compared with the prior art, the invention has the following advantages:
first, the flexible configuration method provided by the invention avoids service downtime during updates and keeps the whole system highly available;
second, configuration files are issued by the configuration center, so customized logic is managed and configured centrally, which reduces the operation and maintenance difficulty of multi-instance deployment and the burden on operations staff;
third, the message processing flow is described by a combination of JSON text and Lua scripts, which makes customization work more flexible and convenient, and more agile when requirements change;
fourth, the message processing flow is converted directly, inside the message processing module, into bytecode executable by the Lua virtual machine; this can replace most back-end services with the same function, improve development efficiency, shorten time to release, and reduce the operation and maintenance burden.
Drawings
FIG. 1 is a schematic diagram of the streaming information processing system according to the present invention.
FIG. 2 is a flow chart of configuration synchronization between the message processing module and the configuration center according to the present invention.
FIG. 3 is a flow chart of describing the message processing flow with a combination of JSON text and Lua scripts according to the present invention.
Detailed Description
Example 1
As shown in FIG. 1, this embodiment includes a message sender, a message receiving module, a routing module, a message processing module, a configuration center service, and other back-end services.
The message receiving module is a high-performance gateway; the message processing module is a log message processing service built on the OpenResty project; the configuration center is a dynamic-configuration storage and distribution service built on a key/value storage service.
The message sender sends raw log messages to the message receiving module in batches through HTTP requests. Each log message can be in the standard or a customized NGINX access log format, or a single-line JSON document with whitespace removed. Newline characters separate the log messages in the body of the HTTP request. The number of log messages per batch request depends on the average size of a single log line, and the total message size of a single request is generally required not to exceed 5 MB.
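A sender-side batching routine consistent with this description might look like the Python sketch below. The function name `batch_lines` is hypothetical, and reading "5 MB" as 5 × 1024 × 1024 bytes is an assumption; the sketch only builds the newline-separated request bodies and leaves the HTTP call itself out.

```python
from typing import Iterable, Iterator, List

# Assumed interpretation of the 5 MB limit mentioned above.
MAX_BATCH_BYTES = 5 * 1024 * 1024


def batch_lines(lines: Iterable[str], limit: int = MAX_BATCH_BYTES) -> Iterator[bytes]:
    """Group log lines into newline-separated bodies of at most `limit` bytes.

    A single line longer than `limit` is still emitted, alone, in its own body.
    """
    batch: List[bytes] = []
    size = 0
    for line in lines:
        encoded = line.encode("utf-8")
        extra = len(encoded) + (1 if batch else 0)  # +1 for the newline separator
        if batch and size + extra > limit:
            yield b"\n".join(batch)
            batch, size = [], 0
            extra = len(encoded)
        batch.append(encoded)
        size += extra
    if batch:
        yield b"\n".join(batch)
```

Each yielded value can be used directly as the body of one HTTP request to the message receiving module.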
The routing module matches the URI in the HTTP request to the corresponding message processing module according to the current routing configuration. The routing module synchronizes the routing configuration with the configuration center service at regular intervals, and the routing configuration can be modified only through the configuration center service, so the routing modules in all program instances receive the modified routing configuration. The synchronization interval between the routing module and the configuration center service needs to be chosen appropriately: too short an interval unnecessarily increases load on the server and consumes machine-room bandwidth, while too long an interval delays the configuration from taking effect and reduces release efficiency. A reasonable synchronization interval can be selected according to actual service requirements and experience.
According to the current processing strategy, the message processing module splits, deserializes, and parses the received log messages, applies customized format conversion, filters them by condition, and serializes them into the message format required by the back-end service. The message processing module synchronizes processing strategies with the configuration center service at regular intervals. The processing strategies are described in a domain-specific language that combines JSON text and Lua scripts: predefined processing rules are arranged and combined through the JSON text, while extended processing rules are implemented in Lua scripts and invoked through the JSON configuration.
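As an illustration of the deserialize-then-serialize path for one log line, the Python sketch below parses a line in NGINX's predefined `combined` access log format and emits single-line JSON. The choice of the `combined` format, the regular expression, and the function name `convert_line` are assumptions made for the example; the embodiment itself performs this work in Lua on OpenResty.

```python
import json
import re
from typing import Optional

# NGINX "combined" format:
# $remote_addr - $remote_user [$time_local] "$request" $status
# $body_bytes_sent "$http_referer" "$http_user_agent"
COMBINED = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<http_referer>[^"]*)" "(?P<http_user_agent>[^"]*)"'
)


def convert_line(line: str) -> Optional[str]:
    """Parse one access-log line and serialize it as single-line JSON.

    Returns None when the line does not match, so the caller can discard it.
    """
    match = COMBINED.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    record["body_bytes_sent"] = int(record["body_bytes_sent"])
    return json.dumps(record, sort_keys=True)
```

Returning `None` for a line that fails to parse mirrors the rule that a message whose conversion fails is discarded rather than forwarded.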
The message processing module passes the processed streaming message to other back-end services, where the back-end services are typically distributed message middleware clusters.
Example 2
The method of using the system of Example 1 comprises the following steps:
(1) the routing module checks and synchronizes the routing configuration in the configuration center service every 15 seconds, and the message processing module checks and synchronizes the processing strategy in the configuration center every 15 seconds;
(2) the message sender generates messages to be processed and sends them to the message receiving module in batches through HTTP requests;
(3) the routing module selects the corresponding route according to the URI in the HTTP request, and passes the messages to be processed from the message receiving module to a specific message processing module;
(4) on receiving a message to be processed, the message processing module triggers execution of the message processing strategy;
(5) after message processing is finished, the message is passed to other back-end services.
As shown in FIG. 2, checking and synchronizing the routing configuration information of the configuration center service comprises the following steps:
(1.1) the information processing system queries the configuration file storage location corresponding to the current service;
(1.2) the routing module parses each routing configuration;
(1.3) if the routing configuration differs from the state stored in the memory of the current message processing module, the configuration from the configuration center service is synchronized and the corresponding worker thread is restarted;
(1.4) the message segmentation rules, deserialization rules, script executor rules, and message output rules in the routing configuration are parsed one by one.
In step (1.2), the routing configuration consists of JSON configuration or external Lua scripts; JSON data structures and Lua table data types can be converted into each other.
As shown in FIG. 3, the routing configuration in step (1.4) can be compared with the state in the memory of the current message processing module by hashing the configuration items; if the hashes differ, the configuration is considered to have changed.
In step (4), the processing flow that a request event triggers to execute the configuration in the route is as follows:
(4.1) the URI in the HTTP request is matched to the corresponding route;
(4.2) the message processing module reads the processing strategy and parses it into bytecode that the Lua virtual machine can use for message processing operations;
(4.3) the LuaJIT module built into OpenResty executes the message processing operations;
(4.4) the processed message is serialized into the format specified in the configuration.
Table 1 compares the cost, in terms of manpower, time, and the like, of bringing the same type of customized log project online with the new and the old information processing methods.
TABLE 1
[Table 1: image in the original publication, not reproduced in the text]
The flexibly configurable streaming information processing method and system can load configuration flexibly and dynamically, execute customized processing logic, avoid stopping the service for updates, reduce the burden on operations staff, improve development efficiency, and shorten time to release.

Claims (8)

1. A flexibly configurable method of streaming information processing, comprising the steps of:
1) offline data stored on a server is sent to a message receiving module in batches through HTTP requests, converted into streaming information to be processed, and stored in the message receiving module;
2) the message receiving module passes the HTTP request to a routing module, and the routing module determines a routing configuration from the URI of the HTTP request;
3) the routing module passes the determined routing configuration to a message processing module, and the message processing module converts the streaming information to be processed according to a processing strategy in the routing configuration; if the conversion succeeds, the result is sent to a back-end service, and if it fails, the message is discarded.
2. The flexibly configurable streaming information processing method according to claim 1, further comprising:
4) the routing module checks the routing configuration in the configuration center service every 1-50 seconds and synchronizes it;
5) the message processing module checks the processing strategy in the configuration center service every 1-50 seconds and synchronizes it.
3. The flexibly configurable streaming information processing method according to claim 1, wherein in step 2), determining the routing configuration through the URI corresponding to the HTTP request specifically includes:
the routing configuration is determined by longest-prefix matching of the URI against a prefix tree.
4. The flexibly configurable streaming information processing method according to claim 1, wherein in step 3), the message processing module performs conversion processing on the streaming information to be processed according to a processing policy in the routing configuration, and specifically includes:
3.1) the message processing module interprets the domain-specific language of the routing configuration into an executable processing strategy;
3.2) according to the message processing rules defined in step 3.1), the LuaJIT module built into OpenResty performs the message processing computation on the streaming information to be processed;
3.3) the processed messages are serialized into the format specified in the configuration, completing the conversion.
5. The flexibly configurable streaming information processing method according to claim 4, wherein in step 3.1), the domain-specific language employs a combination of JSON text and Lua script, and specifically comprises:
JSON is used as the configuration description form to combine the message processing rules predefined by the message processing module, which comprise message segmentation rules, deserialization rules, script executor rules, and message output rules; Lua scripts are used for message processing rules that are not predefined by the message processing module.
6. The flexibly configurable streaming information processing method according to claim 2, wherein in step 4), the routing module checks the routing configuration of the configuration center service every 1-50 seconds and synchronizes, specifically comprising:
4.1) the routing module queries the storage location of the configuration file in the configuration center service;
4.2) the routing module reads and parses each routing configuration;
4.3) the routing configurations are stored as key-value pairs; the routing module builds them into a prefix tree data structure held in memory, in which the leaf node of each routing configuration points to the alias of a unique message processing strategy, and each routing configuration carries a hash digest;
4.4) the hash digest of the parsed routing configuration is calculated and compared with the hash digest of the routing configuration in the routing module's memory; if they differ, the routing configuration in memory is stale, so the parsed routing configuration is written into memory and the corresponding worker thread is restarted so that the new routing configuration takes effect.
7. The flexibly configurable streaming information processing method according to claim 2, wherein in step 5), the step of checking, by the message processing module, the processing policy of the configuration center service every 1-50 seconds and synchronizing specifically comprises:
5.1) the message processing module queries the storage location of the configuration file in the configuration center service;
5.2) the aliases of the message processing strategies recorded in the prefix tree data structure built by the routing module, together with their hash digest information, are used to mark the required processing strategies;
5.3) the message processing module queries and reads the required processing strategies and then parses them in a specific manner;
5.4) the hash digest of the parsed processing strategy is calculated and compared with the hash digest of the processing strategy in the message processing module's memory; if they differ, the parsed processing strategy is written into memory and the corresponding worker thread is restarted so that the newly configured processing strategy takes effect.
8. A flexibly configurable streaming information processing system for implementing the method of any one of claims 1 to 7, comprising:
a message receiving module, which receives offline data from the server;
a configuration center service, which is a dynamic-configuration storage and distribution service built on a key/value storage service;
a routing module, which selects different routing processing logic for the received streaming information to be processed according to the routing configuration information;
and a message processing module, which subscribes to the routing configuration information from the configuration center service and converts the streaming information to be processed.
CN202210708707.2A 2022-06-22 2022-06-22 Flexible and configurable streaming information processing method and system Active CN114793245B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210708707.2A CN114793245B (en) 2022-06-22 2022-06-22 Flexible and configurable streaming information processing method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210708707.2A CN114793245B (en) 2022-06-22 2022-06-22 Flexible and configurable streaming information processing method and system

Publications (2)

Publication Number Publication Date
CN114793245A true CN114793245A (en) 2022-07-26
CN114793245B CN114793245B (en) 2022-09-27

Family

ID=82462809

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210708707.2A Active CN114793245B (en) 2022-06-22 2022-06-22 Flexible and configurable streaming information processing method and system

Country Status (1)

Country Link
CN (1) CN114793245B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102035725A (en) * 2010-08-10 2011-04-27 中国科学院计算技术研究所 Relevant technology system for one-way flow uniform resource identifier (URI) under asymmetric routing and method thereof
CN107479878A (en) * 2017-07-26 2017-12-15 北京供销科技有限公司 A kind of lua Modular development methods and Development Framework based on openresty
CN109871502A (en) * 2019-01-18 2019-06-11 北京赛思信安技术股份有限公司 A kind of flow data canonical matching process based on Storm
CN110035117A (en) * 2019-03-15 2019-07-19 启迪云计算有限公司 One kind is based on configurable monitoring script monitoring system and monitoring method
US20190319885A1 (en) * 2018-04-16 2019-10-17 Citrix Systems, Inc. Policy based service routing
CN110996372A (en) * 2019-11-11 2020-04-10 广州爱浦路网络技术有限公司 Message routing method, device and system and electronic equipment
WO2021012663A1 (en) * 2019-07-24 2021-01-28 网宿科技股份有限公司 Access log processing method and device
CN112532425A (en) * 2020-11-04 2021-03-19 浙江大学 Message routing configuration method facing edge calculation
CN113076107A (en) * 2021-04-13 2021-07-06 杭州又拍云科技有限公司 Method for automatically acquiring and fusing logs through finite state machine

Also Published As

Publication number Publication date
CN114793245B (en) 2022-09-27

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant