CN111597058B - Data stream processing method and system - Google Patents

Data stream processing method and system

Info

Publication number
CN111597058B
CN111597058B (application CN202010307212.XA)
Authority
CN
China
Prior art keywords
processing
processing node
data
node
data stream
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010307212.XA
Other languages
Chinese (zh)
Other versions
CN111597058A (en)
Inventor
周源
贾晓捷
冯萌萌
王佳佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Weimeng Chuangke Network Technology China Co Ltd
Original Assignee
Weimeng Chuangke Network Technology China Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Weimeng Chuangke Network Technology China Co Ltd filed Critical Weimeng Chuangke Network Technology China Co Ltd
Priority to CN202010307212.XA priority Critical patent/CN111597058B/en
Publication of CN111597058A publication Critical patent/CN111597058A/en
Application granted granted Critical
Publication of CN111597058B publication Critical patent/CN111597058B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/546Message passing systems or structures, e.g. queues
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/54Interprogram communication
    • G06F9/547Remote procedure calls [RPC]; Web services
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/08Configuration management of networks or network elements
    • H04L41/0803Configuration setting
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/12Discovery or management of network topologies
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00Network architectures or network communication protocols for network security
    • H04L63/06Network architectures or network communication protocols for network security for supporting key management in a packet data network
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/2866Architectures; Arrangements
    • H04L67/288Distributed intermediate devices, i.e. intermediate devices for interaction with other intermediate devices on the same level
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/54Indexing scheme relating to G06F9/54
    • G06F2209/547Messaging middleware

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • Computer Hardware Design (AREA)
  • Computer Security & Cryptography (AREA)
  • Computing Systems (AREA)
  • Information Transfer Between Computers (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the application disclose a data stream processing method and system, which solve the problem that, in existing data stream processing methods, communication between processing nodes depends on a centralized node, making deployment heavyweight and scaling out or in inconvenient. The method comprises the following steps: each processing node obtains its own configuration file, the configuration file comprising topology parameters; each processing node builds a topology architecture according to its own topology parameters; when any processing node receives a data stream, the processing node performs service processing on the data stream according to the topology architecture and outputs processing result information. According to the embodiments of the application, each processing node builds the topology architecture automatically from its own topology parameters; the processing nodes are mutually independent, each concerned only with its own topology parameters and not with the behavior of other processing nodes, and the built topology architecture contains no centralized processing node, so the deployed framework is lighter and easier to scale out or in.

Description

Data stream processing method and system
Technical Field
The embodiment of the application relates to the technical field of data processing, in particular to a data stream processing method and system.
Background
With the rise of Internet big data, the development of big data processing technology has accelerated. Different kinds of data place different requirements on the processing technology. The data handled by a stream processing system is collected in real time, computed in real time, and quickly fed back to the user once computation is complete, achieving fast response, low latency and reliability. A stream processing system is therefore fast, efficient and highly fault tolerant, and can process data information accurately and without error. In practical applications, stream processing systems can be applied to scenarios such as fire alarms and gas leakage alarms.
A common stream processing framework is the Storm distributed real-time computing framework. Among the various platform technologies for big data stream processing, Storm stands out for its good real-time performance and high performance, has characteristics such as high scalability, stability and reliability, and is widely followed and used in industry. As a stream data processing engine, Storm uses a polling algorithm for task scheduling and runs quickly in memory, so that every message is guaranteed to be processed; its fast response makes it well suited to real-time stream processing.
However, Storm realizes mutual discovery among processing nodes through a centralized node; that is, communication between processing nodes depends on that centralized node. For example, starting Storm depends on ZooKeeper. As a result, deployment of Storm is cumbersome, and scaling out or in is inconvenient.
Disclosure of Invention
The embodiments of the application provide a data stream processing method and system to solve the problem that, in existing data stream processing methods, communication between processing nodes depends on a centralized node, making deployment heavyweight and scaling out or in inconvenient.
The embodiment of the application adopts the following technical scheme:
In a first aspect, a data stream processing method is provided, the method comprising:
each processing node obtains its own configuration file, the configuration file comprising topology parameters;
each processing node builds a topology architecture according to its own topology parameters;
when any processing node receives a data stream, the processing node performs service processing on the data stream according to the topology architecture and outputs processing result information.
In a second aspect, a data stream processing system is provided, the system comprising a plurality of processing nodes, each processing node comprising an acquisition module, a building module and a service processing module, wherein:
the acquisition module is used for acquiring a configuration file of the processing node, the configuration file comprising topology parameters;
the building module is used for building a topology architecture according to the topology parameters of the processing node to which it belongs;
the service processing module is used for performing service processing on a data stream according to the topology architecture when the processing node receives the data stream, and outputting processing result information.
In a third aspect, a data stream processing system is provided, comprising: a memory storing computer program instructions;
and a processor; the computer program instructions, when executed by the processor, implement the data stream processing method described above.
In a fourth aspect, a computer readable storage medium is provided, comprising instructions which, when run on a computer, cause the computer to perform a data stream processing method as described above.
The at least one technical solution adopted by the embodiments of the application can achieve the following beneficial effects:
according to the embodiments of the application, each processing node builds the topology architecture automatically from its own topology parameters; the processing nodes are mutually independent, each concerned only with its own topology parameters and not with the behavior of other processing nodes, and the built topology architecture contains no centralized processing node, so the deployed framework is lighter and easier to scale out or in.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
FIG. 1 is a flow chart of a data stream processing method according to an embodiment of the present application;
FIG. 2 is a first schematic diagram of a practical application scenario of the data stream processing method according to an embodiment of the present application;
FIG. 3 is a second schematic diagram of a practical application scenario of the data stream processing method according to an embodiment of the present application;
FIG. 4 is a third schematic diagram of a practical application scenario of the data stream processing method according to an embodiment of the present application;
FIG. 5 is a fourth schematic diagram of a practical application scenario of the data stream processing method according to an embodiment of the present application;
FIG. 6 is a fifth schematic diagram of a practical application scenario of the data stream processing method according to an embodiment of the present application;
FIG. 7 is a sixth schematic diagram of a practical application scenario of the data stream processing method according to an embodiment of the present application;
FIG. 8 is a first schematic structural diagram of a data stream processing system according to an embodiment of the present application;
FIG. 9 is a second schematic structural diagram of a data stream processing system according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application clearer, the technical solutions of the present application will be described clearly and completely below with reference to specific embodiments of the present specification and the corresponding drawings. It is apparent that the described embodiments are only some, not all, of the embodiments of the application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments herein without creative effort fall within the scope of protection of the present application.
The embodiments of the application provide a data stream processing method and system to solve the problem that, in existing data stream processing methods, communication between processing nodes depends on a centralized node, making deployment heavyweight and scaling out or in inconvenient. The embodiments of the application provide a data stream processing method whose execution body may be, but is not limited to, an application program or a system that can be configured to execute the method provided by the embodiments of the application.
Fig. 1 is a flowchart of a data stream processing method according to an embodiment of the present application. The method of fig. 1 may be performed by a system and, as shown in fig. 1, may include:
in step 110, each processing node obtains its own configuration file.
Wherein the configuration file may include topology parameters.
The topology parameters may include: identification information of the processing node, the data to be processed and the processing result data of the processing node, the key value with which the processing node reads from the message middleware and the key list with which the processing node writes to the message middleware, the type of message middleware read by the processing node, the number of threads of the processing node, instructions controlling the processing node to execute service processing, and the like.
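For illustration only, the topology parameters listed above could be represented by a plain Java object such as the following sketch; the class and field names are assumptions made for this description and are not taken from the patent itself:

```java
// Hedged sketch (assumed, not the patent's code): a plain Java object holding the
// topology parameters that each processing node reads from its own configuration file.
public class TopologyParameters {
    String nodeId;                       // identification information of the processing node
    String inputDataName;                // data to be processed by this node
    String outputDataName;               // processing result data produced by this node
    String readKey;                      // key value this node uses to read from the message middleware
    java.util.List<String> emitKeys;     // key list this node uses to write to the message middleware
    String middlewareType;               // e.g. "BlockingQueue", "Kafka", "Redis"
    int threadNum;                       // number of threads for this node
    boolean closed;                      // instruction: shut down business processing
    boolean skip;                        // instruction: skip processing and pass data through
}
```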
In step 120, each processing node builds a topology according to its own topology parameters.
The topology parameters may include, for example, the data to be processed and the processing result data of the processing node, the key value with which the processing node reads from the message middleware, and the key value with which the processing node writes data to the message middleware. Step 120 may specifically include the following:
Each processing node determines the association relationships among the processing nodes according to the data to be processed and the processing result data, so as to generate a topology graph.
The topology graph is a directed acyclic graph, used to determine the direction of data flow, review the performance pressure on processing nodes, and so on.
Illustratively, as shown in fig. 2, assume that:
the processing result data of processing node (hereinafter simply "node") 1 is the data to be processed of node 3 and node 4, the processing result data of node 2 is the data to be processed of node 4 and node 5, the processing result data of node 3 is the data to be processed of node 6, the processing result data of node 4 is the data to be processed of node 5 and node 7, and the processing result data of node 7 is the data to be processed of node 6.
The association relationships among nodes 1 to 7 can then be determined, and a directed acyclic graph, that is, the topology graph, is generated according to these association relationships.
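As a minimal sketch of this step (assuming the node relationships of the example above, including that node 3 feeds node 6, and using illustrative class and key names not taken from the patent), the topology graph can be represented as an adjacency map from each upstream node to its downstream nodes, with one message-middleware key per edge:

```java
// Hedged sketch: building the directed acyclic topology graph from each node's
// input/output relationships in the example above. Data structures are illustrative only.
import java.util.*;

public class TopologyGraphDemo {
    public static void main(String[] args) {
        // edges: upstream node -> downstream nodes whose input is its output
        Map<Integer, List<Integer>> graph = new LinkedHashMap<>();
        graph.put(1, Arrays.asList(3, 4));
        graph.put(2, Arrays.asList(4, 5));
        graph.put(3, Arrays.asList(6));
        graph.put(4, Arrays.asList(5, 7));
        graph.put(7, Arrays.asList(6));

        // each edge corresponds to one message-middleware key, e.g. "Key1-3"
        for (Map.Entry<Integer, List<Integer>> e : graph.entrySet()) {
            for (int downstream : e.getValue()) {
                System.out.println("Key" + e.getKey() + "-" + downstream
                        + " : node " + e.getKey() + " -> node " + downstream);
            }
        }
    }
}
```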
According to the topology graph, among the processing nodes, a first processing node that writes data to the message middleware with a given key value and a second processing node that reads data from the message middleware with the same key value are connected, so that the topology architecture is built.
As shown in FIG. 3, the message middleware is embedded in the system in the form of an interface plug-in; common message middleware includes Kafka, Redis, BlockingQueue (provided by the Java Development Kit, JDK), MemcacheQ, and the like. This plug-in extension mechanism can be used for communication inside a JVM (Java Virtual Machine) as well as for communication across JVMs.
In a specific implementation, the embodiment of the application uses BlockingQueue; when BlockingQueue is used, data can only be transferred within a single JVM. When other message middleware is used in the embodiments of the application, data can be transferred across machines.
The key value adopted by the first processing node for writing the data into the message middleware is consistent with the key value adopted by the second processing node for reading the data from the message middleware.
For example, as shown in fig. 2, assuming that the first processing node is node 1 and the second processing node is node 3, node 1 uses the Key value Key1-3 to write its processing result data into the message middleware, and node 3 uses the same Key value Key1-3 to read its data to be processed from the message middleware. Similarly, assuming that the first processing node is node 4 and the second processing node is node 5, node 4 uses the Key value Key4-5 to write its processing result data into the message middleware, and node 5 uses the Key value Key4-5 to read its data to be processed from the message middleware. In the same way, the processing nodes build the topology architecture by establishing communication through shared key values, as sketched below.
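A minimal sketch of such key-based communication, assuming a BlockingQueue-backed middleware inside one JVM as described above (the class and method names are illustrative, not the patent's actual API):

```java
// Hedged sketch: in-JVM message middleware backed by BlockingQueue, where a writer
// and a reader are connected simply by using the same key.
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

public class BlockingQueueMiddleware {
    private final Map<String, BlockingQueue<String>> queues = new ConcurrentHashMap<>();

    private BlockingQueue<String> queueFor(String key) {
        return queues.computeIfAbsent(key, k -> new LinkedBlockingQueue<>());
    }

    public void write(String key, String data) throws InterruptedException {
        queueFor(key).put(data);          // upstream node writes its processing result
    }

    public String read(String key) throws InterruptedException {
        return queueFor(key).take();      // downstream node pulls data when it is idle
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueueMiddleware mw = new BlockingQueueMiddleware();
        mw.write("Key1-3", "result-of-node-1");     // node 1 writes with key "Key1-3"
        System.out.println(mw.read("Key1-3"));      // node 3 reads with the same key
    }
}
```

The keyed map of queues keeps the two sides decoupled: the writer only needs to know its output key and the reader only needs to know its input key, which matches the pull-mode data transfer described later in this description.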
In step 130, when any processing node receives a data stream, the processing node performs service processing on the data stream according to the topology architecture and outputs processing result information.
In the embodiment of the application, the relationships among processing nodes and the logic of the topology graph can be implemented using Java annotations. Each processing node is concerned only with the data it reads and the data it outputs, not with the behavior of other processing nodes, and the built topology architecture contains no centralized processing node, so the deployed framework is lighter and easier to scale out or in.
As an embodiment, step 110 may be specifically implemented as:
the scanning provides packets of the class of processing nodes to obtain class definitions, annotations of the class definitions of the processing nodes being imposed on the class of processing nodes.
As shown in fig. 3, this step may provide packets of the processing node class by scanning by a class scanner on the processing node.
And analyzing the annotation defined by the class to obtain the processing node class and the processing node class object.
And carrying out instantiation processing on the processing node class.
Before the processing node class is instantiated, step 110 further includes:
judging, according to the class-defining annotation, that the processing node class is legal. Specifically: judge, according to the class-defining annotation, whether the processing node class is legal; if so, instantiate the processing node class.
Instantiation of the processing node class can be performed in two ways: first, ordinary instantiation by the system framework (that is, the ordinary class initialization shown in FIG. 4); second, instantiation in which the processing node class objects that a processing node class depends on through annotations are injected by the Spring framework (that is, the Spring class initialization shown in FIG. 4).
In the embodiment of the application, the processing node class objects that a processing node class depends on through annotations are injected via the Spring framework, so that the processing node class is instantiated.
The Spring framework was created in response to the complexity of software development; it is a lightweight container framework oriented toward inversion of control (IoC) and aspect-oriented programming (AOP).
Spring promotes loose coupling through the technique called inversion of control (IoC). When IoC is applied, the other processing node classes that a processing node class depends on are passed to it passively, rather than the processing node class itself creating or looking up the dependent class objects. IoC can be seen as the opposite of JNDI: instead of the processing node class looking up its dependencies from the Spring framework, the Spring framework actively passes the dependencies to it during initialization of the processing node class, without the processing node class requesting them.
Illustratively, consider the annotation sample @Processor(desc = "private source distribution information processing", readEventName = "direct_message_report", readEventType = QueueType.REDIS, emitEventName = {"origin_report"}, threadNum = 2). It contains fields such as the processing node name, the key with which the processing node reads from the message middleware, the key list with which the processing node writes to the message middleware, the message middleware type, the number of threads of the current processing node, and information such as a shutdown switch and a skip switch.
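For illustration, such an annotation type might be declared as in the following sketch; the shutdown and skip fields, the default values, and the enum constants other than REDIS are assumptions added for this description rather than the patent's actual source:

```java
// Hedged sketch of an annotation carrying the topology parameters described above.
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

enum QueueType { BLOCKING_QUEUE, REDIS, KAFKA, MEMCACHEQ }

@Retention(RetentionPolicy.RUNTIME)   // kept at runtime so the class scanner can read it
@Target(ElementType.TYPE)             // applied to processing node classes
@interface Processor {
    String desc();                    // processing node name / description
    String readEventName();           // key the node reads from the message middleware
    String[] emitEventName();         // key list the node writes to the message middleware
    QueueType readEventType();        // type of the message middleware that is read
    int threadNum() default 1;        // number of threads for the current node
    boolean closed() default false;   // shutdown switch
    boolean skip() default false;     // skip switch
}

// Example use, mirroring the annotation sample quoted above:
@Processor(desc = "private source distribution information processing",
           readEventName = "direct_message_report",
           readEventType = QueueType.REDIS,
           emitEventName = {"origin_report"},
           threadNum = 2)
class DirectMessageReportProcessor {
    String process(String message) { return message; } // placeholder business logic
}
```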
According to the embodiment of the application, the Spring framework can be used to manage the initialization of, and the dependencies among, the processing nodes when they are built. This provides a path for code sharing between processing nodes and solves the problem of sharing code between the stream processing service and non-stream processing services, which greatly facilitates code development.
In a realized project, the number of lines of code was reduced by about 30% in a scenario where the stream processing service and an HTTP service coexist. For deployment, the embodiment of the application realizes data communication among processing nodes through the message middleware (mainly various message queues), which simplifies deployment of the stream processing service and avoids the stream processing service relying on a centralized processing node for discovery. The data transmission mode is a pull mode: the upstream node is only responsible for writing data into the message middleware, and the downstream node actively pulls data from the message middleware when it is idle.
In addition, the stream processing service provided by the embodiment of the application can be deployed together with other types of services, such as HTTP services and RPC services, so that servers can be shared without occupying additional physical server resources, which reduces resource consumption and improves server utilization.
As an embodiment, the topology parameter includes an instruction for controlling the processing node to perform service processing, and before performing step 130, the data stream processing method provided by the embodiment of the present application may further include:
when any processing node receives the data stream, the processing node executes the operation corresponding to the instruction according to the instruction for controlling the processing node to execute the service processing.
Specifically, the instructions for controlling the processing node to execute service processing may include a shutdown instruction and/or a skip instruction, and the processing node executing the operation corresponding to the instruction may be implemented as follows:
the processing node, according to the shutdown instruction, shuts down its own execution of service processing; and/or the processing node, according to the skip instruction, does not execute the service processing operation and outputs the data directly.
Illustratively, as shown in fig. 5, in step 510 it is determined, according to the shutdown instruction, whether the processing node should shut down its own execution of service processing; if so, the flow ends; if not, step 520 is executed.
In step 520, the processing node reads data from the message middleware.
In step 530, it is determined, according to the skip instruction, whether the processing node should not execute the service processing operation; if so, step 550 is executed; if not, step 540 is executed.
In step 540, the processing node executes the service processing operation, outputs the processed data, and writes it into the message middleware.
In step 550, the processing node outputs the input data directly and writes it into the message middleware.
According to the embodiment of the application, instructions for controlling the processing node to execute service processing are provided in the topology parameters, so that each processing node can be flexibly controlled to start or stop its data processing operation, meeting the requirements of various application scenarios.
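The control flow of FIG. 5 can be sketched as the following loop; this is an assumed, simplified single-node demonstration (the queue setup, flag values and single-iteration break are illustrative only):

```java
// Hedged sketch of the per-node loop in FIG. 5: a shutdown instruction ends the node,
// a skip instruction passes data through unprocessed.
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

public class ProcessingNodeLoop {
    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> input = new LinkedBlockingQueue<>();   // read key of this node
        BlockingQueue<String> output = new LinkedBlockingQueue<>();  // write key of this node
        boolean closed = false;   // shutdown instruction from the topology parameters
        boolean skip = true;      // skip instruction from the topology parameters

        input.put("event-1");

        while (!closed) {                                  // step 510: shut down?
            String data = input.take();                    // step 520: read from middleware
            String result = skip ? data                    // steps 530/550: skip -> pass through
                                 : data + "|processed";    // step 540: perform business processing
            output.put(result);                            // write result back to middleware
            break;                                         // single iteration for this demo
        }
        System.out.println(output.take());
    }
}
```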
As an embodiment, the data stream processing method provided by the embodiment of the present application may further include:
when a first processing node inputs source data and outputs first intermediate data, the first processing node performs packet processing on the first intermediate data to obtain a first data packet and outputs the first data packet, wherein the first data packet comprises first identification information.
When the second processing node inputs the first data packet and outputs second intermediate data, the second processing node performs packet processing on the second intermediate data and the first identification information to obtain and output a second data packet, wherein the second data packet comprises second identification information.
Illustratively, assume that the first data packet contains first identification information (currentProcessRequestID 11), as shown in FIG. 6; the identification information (parentProcessRequestID 10) in FIG. 6 is identification information generated by the processing node preceding the first processing node. The second processing node parses the first data packet to obtain the first identification information (currentProcessRequestID 11) generated by the previous processing node, performs packet processing on the second intermediate data together with the first identification information to obtain the second data packet, and generates second identification information (currentProcessRequestID 12).
And obtaining a data stream log generated in the process of converting the source data into the second intermediate data according to the inheritance relation of the first identification information and the second identification information.
Specifically: as shown in fig. 7, continuing the above example, according to the inheritance relationship between the first identification information and the second identification information, and so on, the identification information generated by the first through tenth processing nodes can be obtained, in order: requestID1, requestID2, requestID3, requestID4, requestID5, requestID6, requestID7, requestID8, requestID9, requestID10.
The inheritance relationships among these pieces of identification information are obtained, and finally the data stream log of the process of converting the source data into the intermediate data generated by the tenth processing node is obtained.
In the embodiment of the application, when data is output from the previous processing node, it is packaged and a request ID is recorded in the packet; when the data reaches the current processing node, the packet is unpacked, so the request ID used by the previous processing node is known, and the request ID of the current processing node is generated. The flow direction of the data can be obtained from the inheritance relationship between the two request IDs, and any intermediate data can be traced back to the source data, which makes data searching convenient.
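A minimal sketch of this request-ID packaging follows, with assumed class names (the field names parentProcessRequestID and currentProcessRequestID come from the description above, while the UUID-based ID generation is an assumption):

```java
// Hedged sketch: wrapping intermediate data together with a parent request ID and a newly
// generated current request ID, so that any intermediate result can be traced back to the
// source data through the parent -> current inheritance chain.
import java.util.UUID;

public class LineageDemo {
    static class DataPacket {
        final String parentProcessRequestId;   // request ID of the previous processing node
        final String currentProcessRequestId;  // request ID generated by the current node
        final String payload;                  // intermediate data produced by the current node

        DataPacket(String parentId, String payload) {
            this.parentProcessRequestId = parentId;
            this.currentProcessRequestId = UUID.randomUUID().toString();
            this.payload = payload;
        }
    }

    public static void main(String[] args) {
        // first processing node: source data in, first data packet out
        DataPacket first = new DataPacket(null, "first intermediate data");
        // second processing node: unpack, inherit the first node's request ID, repackage
        DataPacket second = new DataPacket(first.currentProcessRequestId, "second intermediate data");

        // the inheritance relation parent -> current yields the data stream log
        System.out.println(second.parentProcessRequestId + " -> " + second.currentProcessRequestId);
    }
}
```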
The data stream processing method according to the embodiment of the present specification is described in detail above with reference to fig. 1 to 7, and the system according to the embodiment of the present specification is described in detail below with reference to fig. 8.
Fig. 8 shows a schematic structural diagram of a system provided in an embodiment of the present application. As shown in fig. 8, the system 800 may include a plurality of processing nodes, each of which comprises an acquisition module, a building module and a service processing module, wherein:
the obtaining module 810 is configured to obtain a configuration file of the processing node, where the configuration file includes topology parameters;
the building module 820 is configured to build a topology architecture according to the topology parameters of the processing node to which it belongs;
the service processing module 830 is configured to perform service processing on a data stream according to the topology architecture when the processing node receives the data stream, and output processing result information.
In an embodiment, the topology parameter includes data to be processed and processing result data of the processing node, the processing node reads a key value of the message middleware and the processing node writes the data into the key value of the message middleware;
the building module 820 may include:
the determining unit is used for determining the association relationships among the processing nodes according to the data to be processed and the processing result data of the processing nodes, so as to generate a topology graph;
and the establishing unit is used for establishing, according to the topology graph and among the processing nodes, communication between a first processing node that writes data to the message middleware with a key value and a second processing node that reads data from the message middleware with the same key value, so as to build the topology architecture.
In one embodiment, the obtaining module 810 may include:
a class scanner, used for scanning the package that provides the processing node classes to obtain the class definitions, the class-defining annotations being applied to the processing node classes;
the parsing unit, used for parsing the class-defining annotation to obtain the processing node class and the processing node class object;
the processing unit, used for instantiating the processing node class; the processing unit is specifically used for injecting, into the Spring framework, the processing node class objects that the processing node class depends on through annotations.
In one embodiment, the obtaining module 810 may include:
and the judging unit is used for judging that the class of the processing node is legal according to the annotation defined by the class.
In an embodiment, the topology parameter includes an instruction to control a processing node to perform a service process, and the system 800 may include:
and the executing module 840 is configured to, when any processing node receives the data stream, execute an operation corresponding to the instruction according to the instruction for controlling the processing node to execute the service processing.
In an embodiment, the instructions controlling the processing node to execute service processing include a shutdown instruction and/or a skip instruction; the execution module 840 may include:
the execution unit, used for causing the processing node to shut down its own execution of service processing according to the shutdown instruction;
and/or to not execute the service processing operation according to the skip instruction and to output the data directly.
In one embodiment, the system 800 may include:
the first packet processing module 850 is configured to, when a first processing node inputs source data and outputs first intermediate data, perform packet processing on the first intermediate data by using the first processing node to obtain a first data packet and output the first data packet, where the first data packet includes first identification information;
the second packet processing module 860 is configured to, when the second processing node inputs the first data packet and outputs second intermediate data, perform packet processing on the second intermediate data and the first identification information by the second processing node to obtain and output a second data packet, where the second data packet includes second identification information;
and the log obtaining module 870 is configured to obtain a data flow log generated in the process of converting the source data into the second intermediate data according to the inheritance relationship between the first identification information and the second identification information.
According to the embodiment of the application, the topology framework is automatically built by each processing node according to the own topology parameters, each processing node is mutually independent, only the own topology parameters are concerned, the behavior of other processing nodes is not concerned, and the centralized processing nodes are not arranged in the built topology framework, so that the deployed framework is lighter and is convenient for expansion and contraction.
A data stream processing system according to an embodiment of the present application will now be described in detail with reference to fig. 9. Referring to fig. 9, at the hardware level the data stream processing system includes a processor and, optionally, an internal bus, a network interface and a memory. As shown in fig. 9, the memory may include main memory, such as random-access memory (RAM), and may further include non-volatile memory, such as at least one disk storage, and so on. Of course, the data stream processing system may also include hardware required by other services.
The processor, network interface and memory may be interconnected by the internal bus, which may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, among others. The bus may be classified into an address bus, a data bus, a control bus and so on. For ease of illustration, only one bidirectional arrow is shown in fig. 9, but this does not mean that there is only one bus or only one type of bus.
And the memory is used for storing programs. In particular, the program may include program code including computer-operating instructions. The memory may include memory and non-volatile storage and provide instructions and data to the processor.
The processor reads the corresponding computer program from the non-volatile memory into memory and then runs it, forming the data stream processing system at the logical level. The processor executes the programs stored in the memory and is specifically configured to perform the operations of the method embodiments described above.
The methods disclosed in the embodiments shown in fig. 1 to 8 and the methods performed by the data stream processing system may be applied to a processor or implemented by a processor. The processor may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware in a processor or by instructions in the form of software. The processor may be a general-purpose processor, including a central processing unit (Central Processing Unit, CPU), a network processor (Network Processor, NP), etc.; but also digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), field programmable gate arrays (Field-Programmable Gate Array, FPGA) or other programmable logic devices, discrete gate or transistor logic devices, discrete hardware components. The disclosed methods, steps, and logic blocks in the embodiments of the present application may be implemented or performed. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present application may be embodied directly in the execution of a hardware decoding processor, or in the execution of a combination of hardware and software modules in a decoding processor. The software modules may be located in a random access memory, flash memory, read only memory, programmable read only memory, or electrically erasable programmable memory, registers, etc. as well known in the art. The storage medium is located in a memory, and the processor reads the information in the memory and, in combination with its hardware, performs the steps of the above method.
The data stream processing system shown in fig. 8 may also execute the methods of fig. 1 to 7 to implement the functions of the embodiments of the data stream processing methods shown in fig. 1 to 7, and the embodiments of the present application are not described herein again.
Of course, in addition to software implementation, the data stream processing system of the present application does not exclude other implementation, such as a logic device or a combination of software and hardware, etc., that is, the execution subject of the following processing flow is not limited to each logic unit, but may also be hardware or a logic device.
The embodiment of the application also provides a computer readable storage medium, on which a computer program is stored, which when executed by a processor, implements the processes of the above embodiments of the method, and can achieve the same technical effects, and for avoiding repetition, the description is omitted here. Wherein the computer readable storage medium is selected from Read-Only Memory (ROM), random access Memory (Random Access Memory, RAM), magnetic disk or optical disk.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create a server for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction servers which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, random Access Memory (RAM) and/or nonvolatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.
Computer readable media, including both non-transitory and non-transitory, removable and non-removable media, may implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static Random Access Memory (SRAM), dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), read Only Memory (ROM), electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information that can be accessed by a computing device. Computer-readable media, as defined herein, does not include transitory computer-readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article or apparatus that comprises an element.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and variations of the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the application are to be included in the scope of the claims of the present application.

Claims (9)

1. A method of data stream processing, comprising:
each processing node obtains a configuration file of the processing node, wherein the configuration file comprises topology parameters;
each processing node builds a topology structure according to own topology parameters;
when any processing node receives a data stream, the processing node carries out service processing on the data stream according to the topological structure and outputs processing result information;
the topology parameters comprise data to be processed and processing result data of the processing node, and the processing node reads a key value of the message middleware and writes the data into the key value of the message middleware;
each processing node builds a topology structure according to own topology parameters, and the topology structure comprises:
each processing node determines the association relation of each processing node according to the data to be processed and the processing result data, and generates a topological graph;
and according to the topological graph, in each processing node, a first processing node which adopts a key value to write data into the message middleware and a second processing node which adopts the same key value to read data from the message middleware are communicated, so that the topological architecture is built.
2. The method of claim 1, wherein each processing node obtains its own profile, comprising:
scanning the package that provides the processing node classes to obtain the class definitions, the class-defining annotations being applied to the processing node classes;
parsing the class-defining annotation to obtain the processing node class and the processing node class object;
instantiating the processing node class, which specifically comprises: injecting, into the Spring framework, the processing node class objects that the processing node class depends on through annotations.
3. The method of claim 2, further comprising, prior to instantiating the class of processing nodes: and judging the processing node class to be legal according to the annotation defined by the class.
4. The method of claim 1, wherein the topology parameters include instructions for controlling processing nodes to perform traffic processing, comprising, prior to any processing node performing traffic processing on the data stream according to the topology architecture:
when any processing node receives the data stream, the processing node executes the operation corresponding to the instruction according to the instruction for controlling the processing node to execute the service processing.
5. The method according to claim 4, wherein the instructions controlling the processing node to perform traffic processing comprise a shutdown instruction and/or a skip instruction;
the processing node executes operations corresponding to the instructions according to the instructions for controlling the processing node to execute service processing, and the operations comprise:
the processing node executes the operation of closing the self-execution business processing according to the closing instruction;
and/or the processing node does not execute the operation of business processing according to the skip instruction and directly outputs data.
6. The method according to claim 1, characterized in that the method comprises:
when a first processing node inputs source data and outputs first intermediate data, the first processing node performs packet processing on the first intermediate data to obtain a first data packet and outputs the first data packet, wherein the first data packet comprises first identification information;
when a second processing node inputs the first data packet and outputs second intermediate data, the second processing node performs packet processing on the second intermediate data and the first identification information to obtain and output a second data packet, wherein the second data packet comprises second identification information;
and obtaining a data stream log generated in the process of converting the source data into the second intermediate data according to the inheritance relation of the first identification information and the second identification information.
7. A data stream processing system comprising a plurality of processing nodes, each processing node comprising an acquisition module, a building module and a service processing module, wherein:
the acquisition module is used for acquiring a configuration file of the processing node, wherein the configuration file comprises topology parameters;
the building module is used for building a topology framework according to the topology parameters of the affiliated processing nodes;
the service processing module is used for carrying out service processing on the data stream according to the topological architecture when the processing node receives the data stream and outputting processing result information;
the building module of each processing node comprises:
the determining unit is used for determining the association relation of each processing node according to the data to be processed and the processing result data of each processing node so as to generate a topological graph;
and the establishing unit is used for establishing communication between a first processing node which adopts a key value to write data into the message middleware and a second processing node which adopts the same key value to read data from the message middleware in the processing nodes according to the topological graph, so as to establish the topological structure.
8. A data stream processing system, comprising:
a memory storing computer program instructions;
a processor which when executed by the processor implements the data stream processing method according to any one of claims 1 to 6.
9. A computer readable storage medium comprising instructions which, when run on a computer, cause the computer to perform the data stream processing method according to any one of claims 1 to 6.
CN202010307212.XA 2020-04-17 2020-04-17 Data stream processing method and system Active CN111597058B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010307212.XA CN111597058B (en) 2020-04-17 2020-04-17 Data stream processing method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010307212.XA CN111597058B (en) 2020-04-17 2020-04-17 Data stream processing method and system

Publications (2)

Publication Number Publication Date
CN111597058A CN111597058A (en) 2020-08-28
CN111597058B (en) 2023-10-17

Family

ID=72181470

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010307212.XA Active CN111597058B (en) 2020-04-17 2020-04-17 Data stream processing method and system

Country Status (1)

Country Link
CN (1) CN111597058B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114116065B (en) * 2021-11-29 2022-11-15 中电金信软件有限公司 Method and device for acquiring topological graph data object and electronic equipment
CN114281297A (en) * 2021-12-09 2022-04-05 上海深聪半导体有限责任公司 Transmission management method, device, equipment and storage medium for multi-audio stream

Citations (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102065136A (en) * 2010-12-10 2011-05-18 中国科学院软件研究所 P2P (Peer-to-Peer) network safety data transmission method and system
CN103368770A (en) * 2013-06-18 2013-10-23 华中师范大学 Gateway level topology-based self-adaptive ALM overlay network constructing and maintaining method
CN103491129A (en) * 2013-07-05 2014-01-01 华为技术有限公司 Service node configuration method and service node pool logger and system
CN103560943A (en) * 2013-10-31 2014-02-05 北京邮电大学 Network analytic system and method supporting real-time mass data processing
CN104038364A (en) * 2013-12-31 2014-09-10 华为技术有限公司 Distributed flow processing system fault tolerance method, nodes and system
CN105574082A (en) * 2015-12-08 2016-05-11 曙光信息产业(北京)有限公司 Storm based stream processing method and system
CN107678852A (en) * 2017-10-26 2018-02-09 携程旅游网络技术(上海)有限公司 Method, system, equipment and the storage medium calculated in real time based on flow data
WO2018072708A1 (en) * 2016-10-21 2018-04-26 中兴通讯股份有限公司 Cloud platform service capacity reduction method, apparatus, and cloud platform
CN108268305A (en) * 2017-01-04 2018-07-10 中国移动通信集团四川有限公司 For the system and method for virtual machine scalable appearance automatically
CN108595699A (en) * 2018-05-09 2018-09-28 国电南瑞科技股份有限公司 The Stream Processing method of wide-area distribution type data in electric power scheduling automatization system
CN108594810A (en) * 2018-04-08 2018-09-28 百度在线网络技术(北京)有限公司 Method, apparatus, storage medium, terminal device and the automatic driving vehicle of data processing
CN108881369A (en) * 2018-04-24 2018-11-23 中国科学院信息工程研究所 A kind of method for interchanging data and cloud message-oriented middleware system of the cloud message-oriented middleware based on data-oriented content
CN108900320A (en) * 2018-06-04 2018-11-27 佛山科学技术学院 A kind of internet test envelope topological structure large scale shrinkage in size method and device
CN109104318A (en) * 2018-08-23 2018-12-28 广东轩辕网络科技股份有限公司 The dispositions method and system of method for realizing cluster self-adaption deployment, the self-adaption deployment big data cluster based on cloud platform
CN109194919A (en) * 2018-09-19 2019-01-11 图普科技(广州)有限公司 A kind of camera data flow distribution system, method and its computer storage medium
CN109992561A (en) * 2019-02-14 2019-07-09 石化盈科信息技术有限责任公司 Industrial real-time computing technique, storage medium and calculating equipment
CN110113399A (en) * 2019-04-24 2019-08-09 华为技术有限公司 Load balancing management method and relevant apparatus
CN110245108A (en) * 2019-07-15 2019-09-17 北京一流科技有限公司 It executes body creation system and executes body creation method
CN110716744A (en) * 2019-10-21 2020-01-21 中国科学院空间应用工程与技术中心 Data stream processing method, system and computer readable storage medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2591416A4 (en) * 2010-07-05 2014-08-13 Saab Ab Method for configuring a distributed avionics control system
KR102372219B1 (en) * 2016-04-25 2022-03-08 콘비다 와이어리스, 엘엘씨 Data Stream Analytics at the Service Layer
US11238164B2 (en) * 2017-07-10 2022-02-01 Burstiq, Inc. Secure adaptive data storage platform
WO2020014372A1 (en) * 2018-07-10 2020-01-16 Nokia Technologies Oy Dynamic multiple endpoint generation

Patent Citations (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102065136A (en) * 2010-12-10 2011-05-18 中国科学院软件研究所 P2P (Peer-to-Peer) network safety data transmission method and system
CN103368770A (en) * 2013-06-18 2013-10-23 华中师范大学 Gateway level topology-based self-adaptive ALM overlay network constructing and maintaining method
CN103491129A (en) * 2013-07-05 2014-01-01 华为技术有限公司 Service node configuration method and service node pool logger and system
CN103560943A (en) * 2013-10-31 2014-02-05 北京邮电大学 Network analytic system and method supporting real-time mass data processing
CN104038364A (en) * 2013-12-31 2014-09-10 华为技术有限公司 Distributed flow processing system fault tolerance method, nodes and system
CN105574082A (en) * 2015-12-08 2016-05-11 曙光信息产业(北京)有限公司 Storm based stream processing method and system
WO2018072708A1 (en) * 2016-10-21 2018-04-26 中兴通讯股份有限公司 Cloud platform service capacity reduction method, apparatus, and cloud platform
CN108268305A (en) * 2017-01-04 2018-07-10 中国移动通信集团四川有限公司 For the system and method for virtual machine scalable appearance automatically
CN107678852A (en) * 2017-10-26 2018-02-09 携程旅游网络技术(上海)有限公司 Method, system, equipment and the storage medium calculated in real time based on flow data
CN108594810A (en) * 2018-04-08 2018-09-28 百度在线网络技术(北京)有限公司 Method, apparatus, storage medium, terminal device and the automatic driving vehicle of data processing
CN108881369A (en) * 2018-04-24 2018-11-23 中国科学院信息工程研究所 A kind of method for interchanging data and cloud message-oriented middleware system of the cloud message-oriented middleware based on data-oriented content
CN108595699A (en) * 2018-05-09 2018-09-28 国电南瑞科技股份有限公司 The Stream Processing method of wide-area distribution type data in electric power scheduling automatization system
CN108900320A (en) * 2018-06-04 2018-11-27 佛山科学技术学院 A kind of internet test envelope topological structure large scale shrinkage in size method and device
CN109104318A (en) * 2018-08-23 2018-12-28 广东轩辕网络科技股份有限公司 The dispositions method and system of method for realizing cluster self-adaption deployment, the self-adaption deployment big data cluster based on cloud platform
CN109194919A (en) * 2018-09-19 2019-01-11 图普科技(广州)有限公司 A kind of camera data flow distribution system, method and its computer storage medium
CN109992561A (en) * 2019-02-14 2019-07-09 石化盈科信息技术有限责任公司 Industrial real-time computing technique, storage medium and calculating equipment
CN110113399A (en) * 2019-04-24 2019-08-09 华为技术有限公司 Load balancing management method and relevant apparatus
CN110245108A (en) * 2019-07-15 2019-09-17 北京一流科技有限公司 It executes body creation system and executes body creation method
CN110716744A (en) * 2019-10-21 2020-01-21 中国科学院空间应用工程与技术中心 Data stream processing method, system and computer readable storage medium

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
A Spark-based payload parameter parsing and processing method; 张文彬; 王春梅; 王静; 陈托; 智佳; Computer Engineering and Design (No. 02); 587-591 *
Design and research of a Storm-based real-time stream query system for big data; 蒋晨晨; 季一木; 孙雁飞; 王汝传; Journal of Nanjing University of Posts and Telecommunications (Natural Science Edition) (No. 03); 100-105+111 *
An adaptive data stream processing model based on software-defined networking; 王斌; 马颖; Computer Engineering and Design (12); 3601-3604+3627 *
Research on energy-saving game topology of wireless multi-hop networks with selfish nodes; 陈松林; 秦燕; Computer Science (No. 12); 182-185 *

Also Published As

Publication number Publication date
CN111597058A (en) 2020-08-28

Similar Documents

Publication Publication Date Title
CN107450981B (en) Block chain consensus method and equipment
CN109002362B (en) Service method, device and system and electronic equipment
CN110704037B (en) Rule engine implementation method and device
CN111124906A (en) Tracking method, compiling method and device based on dynamic embedded points and electronic equipment
CN111597058B (en) Data stream processing method and system
CN110674105A (en) Data backup method, system and server
CN110457132B (en) Method and device for creating functional object and terminal equipment
CN111142925A (en) Pipeline type data processing method, equipment and storage medium
CN116737130A (en) Method, system, equipment and storage medium for compiling modal-oriented intermediate representation
CN111694639A (en) Method and device for updating address of process container and electronic equipment
CN114546672A (en) Unmanned communication method, device, equipment and storage medium
CN113419952A (en) Cloud service management scene testing device and method
US20070240164A1 (en) Command line pipelining
CN111435320B (en) Data processing method and device
CN111459819A (en) Software testing method and device, electronic equipment and computer readable medium
CN111401020A (en) Interface loading method and system and computing equipment
CN111459474A (en) Templated data processing method and device
CN116668542B (en) Service execution method based on heterogeneous resource binding under enhanced service architecture
CN115023931B (en) Method and network entity for service API release
CN110908898B (en) Method and system for generating test scheme
CN113271235B (en) Fuzzy test method and device for network traffic, storage medium and processor
CN116775441A (en) Test method and device, electronic equipment and computer readable storage medium
CN115454494A (en) Method, system and medium for implementing content management system based on event stream mechanism
CN117827943A (en) Data mapping method and device, electronic equipment and storage medium
CN114637680A (en) Information acquisition method, device and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant