CN111352726A - Streaming data processing method and device based on containerized micro-service - Google Patents

Streaming data processing method and device based on containerized micro-service

Info

Publication number
CN111352726A
CN111352726A
Authority
CN
China
Prior art keywords
container
data
containers
new
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811585463.3A
Other languages
Chinese (zh)
Other versions
CN111352726B (en)
Inventor
刘梦晗 (Liu Menghan)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
3600 Technology Group Co ltd
Original Assignee
Beijing Qihoo Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Qihoo Technology Co Ltd filed Critical Beijing Qihoo Technology Co Ltd
Priority to CN201811585463.3A priority Critical patent/CN111352726B/en
Publication of CN111352726A publication Critical patent/CN111352726A/en
Application granted granted Critical
Publication of CN111352726B publication Critical patent/CN111352726B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a streaming data processing method and device based on containerized microservices. The method comprises the following steps: receiving stream data information in a plurality of containers that each provide a microservice, with each container screening out the data information it needs to process from the received stream data according to its microservice type; processing the screened data information according to preset business logic, and storing the result data in each container's designated storage location; and obtaining the result data from the containers' designated storage locations, aggregating it, and outputting a processing result set for the stream data information. The method increases the scalability of stream computation; because each container screens out and processes only the data relevant to it, the nodes are heterogeneous, consumption of non-computational resources is minimized, computational efficiency is improved, and storage space is saved.

Description

Streaming data processing method and device based on containerized micro-service
Technical Field
The invention relates to the technical field of data processing, in particular to a streaming data processing method and device based on containerized microservice.
Background
At present, stream computation over relatively small volumes of big data that require real-time processing typically relies on the compute-resource management technologies of big-data batch processing. These technologies are oriented toward centralized, large-batch, CPU-intensive computation and consume substantial resources; they are usually deployed on physical machines, so capacity can only be expanded by adding new machines, which takes a long time and cannot be done on demand.
In addition, each stream computing application requires many supporting services, such as a stream computing framework and compute-task scheduling, which occupy a large amount of resources. Meanwhile, because the computing nodes are fully identical, when static data (such as rule bases) needs to be loaded from outside, every node's memory must load all of the static data even though each node usually needs only part of it, wasting resources. And because a large amount of scheduling work precedes the computation itself, the computational efficiency of traditional stream computing frameworks is very low.
Disclosure of Invention
The present invention provides a streaming data processing method and apparatus based on containerized microservice to overcome the above problems or at least partially solve the above problems.
According to one aspect of the invention, a streaming data processing method based on containerized micro-services is provided, which comprises the following steps:
respectively receiving stream data information by a plurality of containers providing micro service services, and screening data information to be processed from the received stream data information according to the type of the micro service services of each container;
processing the data information to be processed according to a preset service logic, and storing the processing result data into the appointed storage position of each container;
and acquiring processing result data from the designated storage positions of the containers, summarizing the processing result data, and outputting a processing result set of the stream data information.
Optionally, before the stream data information is received by the plurality of containers providing different microservices, the method further includes:
deploying a plurality of working nodes executing the same or different working tasks, and packaging a plurality of containers providing the micro service business based on the working nodes; wherein each container comprises at least one work node; the work node includes: the data processing system comprises a data receiving node, a data processing node, a data storage node and/or a data summarizing node;
and distributing unique identification information for each container, and uniformly managing each container through a preset container management platform.
Optionally, after the receiving streaming data information by a plurality of containers providing micro service services, and screening out data information to be processed from the received streaming data information according to the type of the micro service of each container, the method further includes:
if any container monitors that the information of the data to be processed screened by the container exceeds a preset information quantity and/or the storage quantity of the designated storage position exceeds a preset storage quantity, sending a capacity expansion request to the preset container management platform, and starting a new container by the preset container management platform based on the configuration information of the container.
Optionally, after allocating unique identification information to each container and performing unified management on each container through a preset container management platform, the method further includes:
pre-configuring a connection mode of at least one container;
if a new first container is started, the new first container sends its own connection mode to the pre-configured container.
Optionally, after the new first container sends its connection mode to the pre-configured container, the method further includes:
if a new second container is subsequently started, the new second container sends its own connection mode to the pre-configured container;
and the pre-configured container sends the new second container's connection mode to the new first container, and sends the new first container's connection mode to the new second container.
Optionally, after the deploying a plurality of working nodes executing the same or different working tasks and packaging a plurality of containers providing the microservice service based on the working nodes, the method further includes:
and storing the functions supported by the containers and the number of the starting instances in each container, and displaying the state information of the container through an interface.
Optionally, after processing the to-be-processed data information according to a preset service logic and storing the processing result data in the designated storage location of each container, the method further includes:
and marking the processing result data with the storage time exceeding the preset time as overdue data, and recovering the overdue data stored in each container based on preset rules.
According to another aspect of the present invention, there is also provided a streaming data processing apparatus based on containerized microservices, comprising:
the receiving module is configured to respectively receive the stream data information through a plurality of containers providing the micro service services, and screen out the data information to be processed from the received stream data information according to the type of the micro service services of each container;
the processing module is configured to process the data information to be processed according to preset service logic and store the processing result data into the designated storage positions of the containers;
and the summarizing module is configured to acquire processing result data from the designated storage positions of the containers, summarize the processing result data and output a processing result set of the flow data information.
Optionally, the apparatus further comprises:
the deployment module is configured to deploy a plurality of working nodes executing the same or different working tasks, and pack a plurality of containers providing the microservice service based on the working nodes; wherein each container comprises at least one work node; the work node includes: the data processing system comprises a data receiving node, a data processing node, a data storage node and/or a data summarizing node;
and distributing unique identification information for each container, and uniformly managing each container through a preset container management platform.
Optionally, the apparatus further comprises:
the starting module is configured to send a capacity expansion request to the preset container management platform when any container monitors that the information of the data to be processed screened by the container exceeds a preset information amount and/or the storage amount of the designated storage position exceeds a preset storage amount, and the preset container management platform starts a new container based on the configuration information of the container.
Optionally, the deployment module is further configured to:
after the containers are placed under unified management by a preset container management platform, pre-configure the connection mode of at least one container;
if a new first container is started, the new first container sends its own connection mode to the pre-configured container.
Optionally, the deployment module is further configured to:
if a new second container is subsequently started, the new second container sends its own connection mode to the pre-configured container;
and the pre-configured container sends the new second container's connection mode to the new first container, and sends the new first container's connection mode to the new second container.
Optionally, the deployment module is further configured to:
after a plurality of containers for providing the micro service business are packaged based on the working nodes, the functions supported by the containers and the number of the starting examples are stored in each container, and the state information of the containers is displayed through an interface.
Optionally, the processing module is further configured to:
and after the processing result data are stored in the designated storage positions of the containers, marking the processing result data with the storage time exceeding the preset time as overdue data, and recovering the overdue data stored in the containers based on preset rules.
According to another aspect of the present invention, there is also provided a computer storage medium storing computer program code which, when run on a computing device, causes the computing device to execute any one of the above-mentioned streaming data processing methods based on containerized microservices.
According to another aspect of the present invention, there is also provided a computing device comprising:
a processor;
a memory storing computer program code;
the computer program code, when executed by the processor, causes the computing device to perform any of the above-described containerized microservice-based streaming data processing methods.
The invention provides an efficient streaming data processing method and device based on containerized microservices. With this method, containers are used to receive and process the stream data information, removing the dependence on physical machines and greatly improving the scalability of stream computation on a fixed pool of existing resources; in addition, after receiving the stream data information, each container loads only the data it needs to process, realizing heterogeneity, minimizing consumption of non-computational resources, and improving computational efficiency.
The foregoing description is only an overview of the technical solutions of the present invention, and the embodiments of the present invention are described below in order to make the technical means of the present invention more clearly understood and to make the above and other objects, features, and advantages of the present invention more clearly understandable.
The above and other objects, advantages and features of the present invention will become more apparent to those skilled in the art from the following detailed description of specific embodiments thereof, taken in conjunction with the accompanying drawings.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 is a flow chart of a streaming data processing method based on containerized microservices according to an embodiment of the present invention;
FIG. 2 is a flow diagram illustrating a flow of streaming data processing based on containerized microservices in accordance with a preferred embodiment of the present invention;
FIG. 3 is a schematic diagram of a streaming data processing apparatus based on containerized microservices according to an embodiment of the present invention;
fig. 4 is a schematic structural diagram of a streaming data processing apparatus based on containerized microservice according to a preferred embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Stream data computation differs from batch computation, which first collects data and then processes it all at once: stream computation continuously processes data as it arrives, does not persist the raw data, and only stores the results and some intermediate data, passing them onward within a bounded time. In the conventional approach, a data stream is sent to a computing cluster, where it is turned into tasks that are distributed to the cluster's computing nodes, each typically a physical machine; after each node finishes its computation, the results are distributed to other physical-machine nodes for aggregation. For example, Spark Streaming, a commonly used stream computing framework based on micro-batching, is deployed on big-data compute-resource management technology (such as YARN), and its processing logic (such as map and reduce) borrows from big-data computation. Apache Storm, a distributed real-time big-data processing system, receives data with Spouts and processes it with Bolts, which can be arranged in a topology; relative to Spark Streaming it performs true real-time computation. Its deployment includes master and worker nodes and is also typically based on big-data compute-resource management technology such as YARN. These approaches are hard to scale, and every node's memory must load all static data, wasting resources. Moreover, because of the large amount of scheduling work before computation, the computational efficiency of traditional stream computing frameworks is often only a fraction of that of microservices.
Storm's documentation states that the framework can process tens of thousands of records per second; for the same workload, microservices may require only one or two machines, while Storm and Spark Streaming may require several.
An embodiment of the present invention provides a schematic flow chart of an efficient streaming data processing method based on a containerized microservice, and as shown in fig. 1, the streaming data processing method based on the containerized microservice according to the embodiment of the present invention may include:
step S102, receiving stream data information through a plurality of containers providing micro service services respectively, and screening out data information to be processed from the received stream data information according to the type of the micro service services of each container;
step S104, processing the data information to be processed according to a preset service logic, and storing the processing result data into the appointed storage position of each container;
and step S106, acquiring the processing result data from the designated storage positions of the containers, summarizing the processing result data, and outputting a processing result set of the stream data information.
In the streaming data processing method based on containerized microservices provided by the embodiment of the invention, stream data information is distributed to a plurality of containers that each provide a microservice; each container screens out the data information it needs to process, processes it according to preset business logic, and stores the results; finally, the processed data is collected from the containers according to different requirements and aggregated into a processing result set. Based on this method, containers are used to receive and process the stream data information, removing the dependence on physical machines and greatly improving the scalability of stream computation on a fixed pool of existing resources. Moreover, after receiving the stream data information, each container loads only the data it needs, realizing heterogeneity, minimizing consumption of non-computational resources, and improving computational efficiency.
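As a concrete illustration, steps S102, S104 and S106 can be sketched as follows; the function names and the `"type"` field used for screening are illustrative assumptions, not part of the patent:

```python
def screen(stream_records, service_type):
    """S102: each container keeps only the records matching its microservice type."""
    return [r for r in stream_records if r.get("type") == service_type]

def process(records, logic):
    """S104: apply the container's preset business logic to each screened record."""
    return [logic(r) for r in records]

def aggregate(container_stores):
    """S106: pull result data from every container's designated store and merge it."""
    result_set = []
    for store in container_stores:
        result_set.extend(store)
    return result_set
```

A container handling, say, search traffic would call `screen` with its own type, `process` the survivors, and leave `aggregate` to a summarizing node.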
A container is isolated like a virtual machine: it can have a unique ID and a unique human-readable name, and can expose services externally. However, a container is designed to run a single process and occupies relatively fewer resources than a virtual machine or a physical machine: a virtual machine typically emulates a complete hardware and software environment, whereas a container only virtualizes the software environment, making it cheaper than a virtual machine. By using containers to compute over the stream data information, the embodiment of the invention therefore achieves heterogeneity among the processing nodes while improving computational efficiency.
Alternatively, when the streaming data is processed by container calculation, the processing can be completed by different work nodes in the system. That is, before step S102, the method may further include: deploying a plurality of working nodes executing the same or different working tasks, and packaging a plurality of containers providing the micro service business based on the working nodes; wherein each container comprises at least one work node; the work node may include: the data processing system comprises a data receiving node, a data processing node, a data storage node and/or a data summarizing node; and distributing unique identification information for each container, and uniformly managing each container through a preset container management platform.
That is to say, before processing the streaming data, a plurality of worker nodes (Workers) with the same or different work tasks may be deployed. A Worker may be one of the following: a data receiving node (Consumer), a data processing node (Processor), a data storage node (Local cache), or a data summarizing node (Aggregator). The Local cache may store data in memory, on a hard disk, or in a mixture of the two, and the number of each type of Worker node may be set according to actual requirements, which the invention does not limit.
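A minimal sketch of the four Worker roles; the role names follow the text, while the hash-based routing and the in-memory dict storage are illustrative assumptions the patent leaves open:

```python
class LocalCache:
    """Data storage node: may back onto memory, disk, or a mix; a dict here."""
    def __init__(self):
        self.data = {}
    def put(self, key, value):
        self.data[key] = value

class Processor:
    """Data processing node: applies business logic, writes to a LocalCache."""
    def __init__(self, cache, logic):
        self.cache, self.logic = cache, logic
    def handle(self, record):
        self.cache.put(record["key"], self.logic(record))

class Consumer:
    """Data receiving node: receives stream records and routes them to Processors."""
    def __init__(self, processors):
        self.processors = processors
    def receive(self, record):
        # All Consumers use the same distribution logic (here: hash routing).
        target = self.processors[hash(record["key"]) % len(self.processors)]
        target.handle(record)

class Aggregator:
    """Data summarizing node: pulls from every LocalCache and merges the results."""
    def __init__(self, caches):
        self.caches = caches
    def collect(self):
        merged = {}
        for cache in self.caches:
            merged.update(cache.data)
        return merged
```

The actual counts of each role would be set per deployment, as the text notes.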
Fig. 2 shows a flow diagram of streaming data processing based on containerized microservices according to a preferred embodiment of the present invention. As can be seen from fig. 2, four Consumer nodes, four Processor nodes, four Local cache nodes and one Aggregator node are deployed in the preferred embodiment. A Consumer is responsible for receiving stream data information and distributing the received information to one or more Processors according to business needs, with each Consumer preferably using the same distribution logic. A Processor is responsible for processing the stream data, with processing logic determined by the specific service, and stores the result data in the Local caches; finally, the Aggregator pulls the results from the Local caches as required, summarizes them, and outputs a result set. For example, when a user enters a search keyword into a search engine, advertisement placement can be performed in real time based on the entered keyword.
When the containers are packed from the working nodes shown in fig. 2, one Consumer, one Processor and one Local cache can each be selected according to business requirements and packed into four containers, with the Aggregator packed as a container on its own. Two of the containers handle information exchange with the search engine, two handle data exchange with the advertisement placement platforms, and one archives all data to disk. When the stream data information is sent to each container, the data can be distributed based on the microservice type the container handles together with load-balancing techniques.
In the conventional scheme, after streaming data is sent to a computing cluster, the cluster turns it into tasks and distributes the tasks to the computing nodes in the cluster. A computing node is generally a physical machine, and after each node finishes, its results are distributed to the next batch of nodes, generating a large amount of network overhead along the way. In the preferred embodiment, each container processes only the streaming data relevant to it, and connections between containers can be memory-based, which is more efficient than the traditional network connections between physical machines, so real-time streaming data can be processed more quickly.
When the Local cache stores the processing result data, only the data in a certain time window may be stored, that is, after the step S104, the processing result data whose storage time exceeds the preset time may be marked as the expired data, and the expired data stored in each container is recovered based on the preset rule. For streaming data, the value of the data is gradually reduced with the passage of time, when a certain time is exceeded, the stored data is equivalent to losing value, at the moment, the data exceeding the preset time can be marked as expired data, and the expired data is recovered at a proper time to release the memory. When data recovery is performed, the outdated data can be recovered uniformly at regular intervals, or the outdated data can be recovered according to the expiration time of the outdated data, or the outdated data can be recovered when a certain amount of data is reached, which is not limited by the invention.
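The time-window storage and expiry marking described above could look like the following sketch; the class name, TTL parameter and sweep-everything recovery policy are illustrative assumptions (the text also allows per-entry or threshold-triggered recovery):

```python
import time

class ExpiringStore:
    """Marks entries stored longer than ttl_seconds as expired and reclaims them."""
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.entries = {}          # key -> (value, stored_at)

    def put(self, key, value, now=None):
        self.entries[key] = (value, time.time() if now is None else now)

    def expired_keys(self, now):
        # Data older than the preset time has lost its value for stream processing.
        return [k for k, (_, t) in self.entries.items() if now - t > self.ttl]

    def reclaim(self, now):
        # Sweep all expired entries to release memory.
        for k in self.expired_keys(now):
            del self.entries[k]
```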
As mentioned above, each container may provide microservice services, which may be equivalent to splitting a large single application and service into multiple (e.g., tens of) supporting microservices, each of which may correspond to a separate business function. A microservice policy may facilitate work by extending individual components rather than the entire application stack to meet service level agreements. In the embodiment of the invention, one class of containers can correspond to one type of micro service business, and because the containers are isolated from each other, the micro service businesses provided by the containers are not influenced mutually, so that the data processing with higher real-time performance is realized, and the related codes of the same micro service business are placed in the same container, which is also beneficial to the later maintenance.
In the embodiment of the invention, when a container is packed, anywhere from zero to several Consumers, Processors, Local caches and Aggregators can be selected from the deployed working nodes according to the service requirements of each microservice. Packing can be assisted by Gradle or another packaging tool, and a container may even contain all of the code while starting only the number of instances required by the microservice it handles. For each packed container, the functions it supports and the number of started instances can be stored inside the container, and the container's state information can be displayed through an interface.
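A hypothetical sketch of recording each container's supported functions and started instance counts; the role names and report format are assumptions for illustration:

```python
def pack_container(name, instance_counts):
    """Record which worker functions a container supports and how many instances
    of each it starts; a count of zero means the code ships but stays idle."""
    supported = [role for role, n in instance_counts.items() if n > 0]
    return {"name": name, "supported": supported, "instances": dict(instance_counts)}

def status(container):
    """State information that could be shown through an interface."""
    parts = ", ".join(f"{role} x{n}"
                      for role, n in container["instances"].items() if n > 0)
    return f'{container["name"]}: {parts}'
```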
After the containers are packaged, the containers can be uniformly managed through a preset container management platform, in addition, unique identification information such as ID can be distributed to the containers, so that the containers can be reasonably managed subsequently, such as monitoring of the state of each container, starting of the containers, closing of the containers and the like. The preset container management platform may be kubernets or other container management software, which is not limited in the present invention.
Each container managed by the preset container management platform actively pulls or passively receives data and monitors its own state. If any container observes that the data information it has screened out for processing exceeds a preset amount, and/or that the amount stored in its designated storage location exceeds a preset storage amount, it sends an expansion request to the preset container management platform, which then starts a new container based on that container's configuration information. When a new container is started from an existing container's configuration, the types, numbers and related code of the working nodes in that container can be copied in full.
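The expansion trigger can be sketched as below; the threshold checks and configuration-copy behavior follow the text, while the function names and parameters are illustrative assumptions:

```python
def needs_expansion(pending_count, storage_used, max_pending, max_storage):
    """A container requests expansion when its screened pending data and/or
    its designated-storage usage exceed their preset thresholds."""
    return pending_count > max_pending or storage_used > max_storage

def handle_expansion(container_config, requested):
    """The management platform starts a new container by cloning the requesting
    container's configuration: worker node types, counts, and related code."""
    if requested:
        return dict(container_config)   # exact copy for the new container
    return None
```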
In addition, in the preferred embodiment of the invention, when the preset management platform manages the containers, the connection mode of at least one container can be configured in advance; when a new first container is started, it sends its own connection mode to that pre-configured container, so the containers can better discover one another. Because the existing container's connection mode is pre-configured, a newly started first container can actively connect to it. If a new second container is subsequently started, it likewise sends its connection mode to the pre-configured container; the pre-configured container then sends the second container's connection mode to the first container and the first container's connection mode to the second container. In this way, after the preset container management platform starts a new container, the already-running containers become aware of it as described above; a started container can also be told to start new Worker instances and can notify the other containers of its own state. Since the containers all acquire each other's connection modes, even if the container acting as the root node can no longer provide service, the remaining containers can still exchange data with one another.
For example, assuming that a connection (e.g., API port) is configured for container a in advance, when container B is started, container B may actively connect container a and send its connection to container a. If the container C is started again subsequently, the container C can be connected with the container A and send the connection mode of the container C to the container A, and the container A can send the connection mode of the container B to the container C and simultaneously send the connection mode of the container C to the container B. When the container D is started, the container A sends the connection mode of the containers B and C to the container D, and simultaneously sends the connection mode of the container D to the containers B and C.
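The container A/B/C/D example can be sketched as a small discovery protocol; the `Container` class, `join` function and address strings are illustrative assumptions standing in for real API-port connections:

```python
class Container:
    """Each container accumulates the connection modes of all other containers,
    so data exchange survives even if the seed (root) container goes down."""
    def __init__(self, name, addr):
        self.name, self.addr = name, addr
        self.known = {}            # peer name -> connection mode (e.g. API port)

def join(seed, members, newcomer):
    """Newcomer contacts the preconfigured seed; the seed tells the newcomer
    about all existing members and tells each member about the newcomer."""
    for m in members:
        newcomer.known[m.name] = m.addr
        m.known[newcomer.name] = newcomer.addr
    newcomer.known[seed.name] = seed.addr
    seed.known[newcomer.name] = newcomer.addr
    members.append(newcomer)
```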
In the embodiment of the invention, the total number of containers on the same physical machine and the number of Worker instances started in a single container can each be limited to a fixed range, so that an abnormal program cannot start containers or Worker instances without bound and thereby occupy excessive resources.
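As a minimal sketch of such a guard (the limit values and function names are assumptions; the patent only specifies that both totals stay within a fixed range):

```python
# Hypothetical fixed-range limits; the actual values would be deployment choices.
MAX_CONTAINERS_PER_HOST = 16
MAX_WORKERS_PER_CONTAINER = 8

def can_start_container(running_on_host: int) -> bool:
    """Allow a new container only while the per-host total stays in range."""
    return running_on_host < MAX_CONTAINERS_PER_HOST

def can_start_worker(running_in_container: int) -> bool:
    """Allow a new Worker instance only while the per-container total stays in range."""
    return running_in_container < MAX_WORKERS_PER_CONTAINER
```

A supervisor loop would consult these checks before every start request, so a crash-restart loop saturates at the cap instead of exhausting the machine.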
Based on the same inventive concept, an embodiment of the present invention further provides a streaming data processing apparatus based on a containerized micro service, and as shown in fig. 3, the streaming data processing apparatus based on the containerized micro service according to an embodiment of the present invention may include:
the receiving module 310 is configured to receive stream data information through a plurality of containers providing micro service services, and screen out to-be-processed data information from the received stream data information according to the type of the micro service of each container;
the processing module 320 is configured to process the data information to be processed according to a preset service logic, and store the processing result data into the designated storage position of each container;
and the summarizing module 330 is configured to obtain the processing result data from the designated storage positions of the containers for summarizing, and output a processing result set of the stream data information.
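The cooperation of the three modules can be sketched as a small pipeline. Everything here is illustrative: the `type` field used for screening, the doubling "business logic", and the list-based storage are placeholders standing in for whatever a real deployment would use.

```python
# Sketch of the receive -> process -> summarize flow of modules 310/320/330.
def receive(stream, service_type):
    """Screen out the to-be-processed records for this container's microservice type."""
    return [record for record in stream if record.get("type") == service_type]

def process(pending, store):
    """Apply placeholder business logic and write results to the container's
    designated storage location (modeled here as a list)."""
    for record in pending:
        store.append({"type": record["type"], "result": record["value"] * 2})

def summarize(stores):
    """Collect processing result data from every container's storage."""
    merged = []
    for store in stores:
        merged.extend(store)
    return merged

stream = [{"type": "click", "value": 1}, {"type": "pay", "value": 3},
          {"type": "click", "value": 2}]
store_a, store_b = [], []                      # designated storage per container
process(receive(stream, "click"), store_a)     # container A handles "click"
process(receive(stream, "pay"), store_b)       # container B handles "pay"
results = summarize([store_a, store_b])        # processing result set
```

Each container loads only the records matching its own microservice type, which is what lets the scheme minimize non-computing resource consumption.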
In a preferred embodiment of the present invention, as shown in fig. 4, the apparatus may further include:
the deployment module 340 is configured to deploy a plurality of working nodes executing the same or different working tasks, and package a plurality of containers providing the microservice service based on the working nodes; wherein each container comprises at least one work node; the above-mentioned work node includes: the data processing system comprises a data receiving node, a data processing node, a data storage node and/or a data summarizing node;
and distributing unique identification information for each container, and uniformly managing each container through a preset container management platform.
In a preferred embodiment of the present invention, as shown in fig. 4, the apparatus may further include:
the starting module 350 is configured to send a capacity expansion request to the preset container management platform when any container monitors that the to-be-processed data information screened by the container exceeds a preset information amount and/or the storage amount of the designated storage location exceeds a preset storage amount, and the preset container management platform starts a new container based on the configuration information of the container.
In a preferred embodiment of the present invention, the deployment module 340 may be further configured to:
after all containers are uniformly managed through a preset container management platform, a connection mode of at least one container is configured in advance;
if a new first container is started, the new first container sends its connection mode to the pre-configured container.
In a preferred embodiment of the present invention, the deployment module 340 may be further configured to:
if a new second container is subsequently started, the new second container sends the connection mode to the container;
and the pre-configured container sends the connection mode of the new second container to the new first container, and simultaneously sends the connection mode of the new first container to the new second container.
In a preferred embodiment of the present invention, the deployment module 340 may be further configured to:
after a plurality of containers providing the microservice service are packaged based on the working nodes, the functions supported by each container and the number of started instances are stored in the container, and the state information of the container is displayed through an interface.
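The per-container status record might look like the following sketch; the field names are illustrative, and a real container would serve this through its interface (e.g., an HTTP endpoint) rather than return a dict.

```python
# Illustrative status record stored in each container and exposed via an interface.
def container_status(name, functions, instances):
    return {
        "container": name,
        "supported_functions": sorted(functions),  # functions this container supports
        "running_instances": instances,            # number of started Worker instances
    }

status = container_status("worker-1", {"filter", "store"}, 3)
```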
In a preferred embodiment of the present invention, the processing module 320 may be further configured to:
and after the processing result data are stored in the designated storage positions of the containers, marking the processing result data with the storage time exceeding the preset time as overdue data, and recovering the overdue data stored in the containers based on preset rules.
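One possible reading of the expiry-and-recovery step is sketched below; the retention period and the "preset rule" of simply dropping expired records are assumptions, since the patent leaves both unspecified.

```python
# Hypothetical preset retention time (seconds).
RETENTION_SECONDS = 3600

def mark_expired(store, now):
    """Mark processing result data stored longer than the preset time as overdue."""
    for record in store:
        if now - record["stored_at"] > RETENTION_SECONDS:
            record["expired"] = True
    return store

def reclaim(store):
    """Recover storage per an assumed preset rule: drop records marked expired."""
    return [record for record in store if not record.get("expired")]

store = [{"stored_at": 0, "result": 1}, {"stored_at": 5000, "result": 2}]
live = reclaim(mark_expired(store, now=5100))   # only the fresh record survives
```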
Based on the same inventive concept, embodiments of the present invention further provide a computer storage medium storing computer program code, which, when run on a computing device, causes the computing device to execute the streaming data processing method based on the containerized microservice according to any of the above embodiments.
Based on the same inventive concept, an embodiment of the present invention further provides a computing device, including:
a processor;
a memory storing computer program code;
the computer program code, when executed by the processor, causes the computing device to perform the streaming data processing method based on containerized microservice of any of the embodiments described above.
The embodiment of the invention provides an efficient streaming data processing method and device based on containerized microservices. Containers are used to receive and process stream data information, which removes the dependence on particular physical machines and greatly increases the scalability of stream computation on top of a fixed set of existing resources. In addition, after receiving the stream data information, each container loads only the data it needs to process, so that processing is differentiated across containers, non-computing resource consumption is minimized, computing efficiency is improved, and storage space is saved. Furthermore, each container can monitor its own state: when it detects high computing or storage pressure, it can apply to the container management software to start a new container, and the new container is sensed by all started containers, thereby realizing state and information exchange among the containers.
It is clear to those skilled in the art that the specific working processes of the above-described systems, apparatuses and units may refer to the corresponding processes in the foregoing method embodiments, and for the sake of brevity, further description is omitted here.
In addition, the functional units in the embodiments of the present invention may be physically independent of each other, two or more functional units may be integrated together, or all the functional units may be integrated in one processing unit. The integrated functional units may be implemented in the form of hardware, or in the form of software or firmware.
Those of ordinary skill in the art will understand that the integrated functional units, if implemented in software and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computing device (e.g., a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention when the instructions are executed. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic or optical disk, and other media capable of storing program code.
Alternatively, all or part of the steps of implementing the foregoing method embodiments may be implemented by hardware (such as a computing device, e.g., a personal computer, a server, or a network device) associated with program instructions, which may be stored in a computer-readable storage medium, and when the program instructions are executed by a processor of the computing device, the computing device executes all or part of the steps of the method according to the embodiments of the present invention.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments can be modified or some or all of the technical features can be equivalently replaced within the spirit and principle of the present invention; such modifications or substitutions do not depart from the scope of the present invention.
According to an aspect of an embodiment of the present invention, a1. a streaming data processing method based on containerized microservice is provided, including:
respectively receiving stream data information by a plurality of containers providing micro service services, and screening data information to be processed from the received stream data information according to the type of the micro service services of each container;
processing the data information to be processed according to a preset service logic, and storing the processing result data into the appointed storage position of each container;
and acquiring processing result data from the designated storage positions of the containers, summarizing the processing result data, and outputting a processing result set of the stream data information.
A2. The method according to A1, wherein before receiving the streaming data information via a plurality of containers respectively providing different microservice services, the method further comprises:
deploying a plurality of working nodes executing the same or different working tasks, and packaging a plurality of containers providing the micro service business based on the working nodes; wherein each container comprises at least one work node; the work node includes: the data processing system comprises a data receiving node, a data processing node, a data storage node and/or a data summarizing node;
and distributing unique identification information for each container, and uniformly managing each container through a preset container management platform.
A3. The method according to A2, wherein, after receiving the streaming data information respectively via a plurality of containers providing the microservice services and screening the to-be-processed data information from the received streaming data information according to the microservice type of each container, the method further comprises:
if any container monitors that the information of the data to be processed screened by the container exceeds a preset information quantity and/or the storage quantity of the designated storage position exceeds a preset storage quantity, sending a capacity expansion request to the preset container management platform, and starting a new container by the preset container management platform based on the configuration information of the container.
A4. The method according to A2, wherein, after allocating unique identification information to each container and performing unified management on each container through a preset container management platform, the method further comprises:
pre-configuring a connection mode of at least one container;
if a new first container is started, the new first container sends its connection mode to the pre-configured container.
A5. The method according to A4, wherein, if a new first container is started, after the new first container sends its connection mode to the pre-configured container, the method further comprises:
if a new second container is subsequently started, the new second container sends the connection mode to the container;
and the pre-configured container sends the connection mode of the new second container to the new first container, and simultaneously sends the connection mode of the new first container to the new second container.
A6. The method according to any one of A2-A5, wherein the deploying a plurality of work nodes performing the same or different work tasks further comprises, after packaging a plurality of containers providing microservice services based on the work nodes:
and storing the functions supported by the containers and the number of the starting instances in each container, and displaying the state information of the container through an interface.
A7. The method according to any one of A1-A5, wherein after the processing the to-be-processed data information according to the preset service logic and storing the processing result data in the designated storage location of each container, the method further comprises:
and marking the processing result data with the storage time exceeding the preset time as overdue data, and recovering the overdue data stored in each container based on preset rules.
According to another aspect of embodiments of the present invention, there is further provided B8. a streaming data processing apparatus based on containerized microservices, including:
the receiving module is configured to respectively receive the stream data information through a plurality of containers providing the micro service services, and screen out the data information to be processed from the received stream data information according to the type of the micro service services of each container;
the processing module is configured to process the data information to be processed according to preset service logic and store the processing result data into the designated storage positions of the containers;
and the summarizing module is configured to acquire processing result data from the designated storage positions of the containers, summarize the processing result data and output a processing result set of the flow data information.
B9. The apparatus of B8, further comprising:
the deployment module is configured to deploy a plurality of working nodes executing the same or different working tasks, and pack a plurality of containers providing the microservice service based on the working nodes; wherein each container comprises at least one work node; the work node includes: the data processing system comprises a data receiving node, a data processing node, a data storage node and/or a data summarizing node;
and distributing unique identification information for each container, and uniformly managing each container through a preset container management platform.
B10. The apparatus of B9, further comprising:
the starting module is configured to send a capacity expansion request to the preset container management platform when any container monitors that the information of the data to be processed screened by the container exceeds a preset information amount and/or the storage amount of the designated storage position exceeds a preset storage amount, and the preset container management platform starts a new container based on the configuration information of the container.
B11. The apparatus of B9, wherein the deployment module is further configured to:
after the containers are uniformly managed through a preset container management platform, a connection mode of at least one container is configured in advance;
if a new first container is started, the new first container sends its connection mode to the pre-configured container.
B12. The apparatus of B11, wherein the deployment module is further configured to:
if a new second container is subsequently started, the new second container sends the connection mode to the container;
and the pre-configured container sends the connection mode of the new second container to the new first container, and simultaneously sends the connection mode of the new first container to the new second container.
B13. The apparatus of any one of B9-B12, wherein the deployment module is further configured to:
after a plurality of containers providing the microservice service are packaged based on the working nodes, the functions supported by each container and the number of started instances are stored in the container, and the state information of the container is displayed through an interface.
B14. The apparatus of any one of B8-B12, wherein the processing module is further configured to:
and after the processing result data are stored in the designated storage positions of the containers, marking the processing result data with the storage time exceeding the preset time as overdue data, and recovering the overdue data stored in the containers based on preset rules.
According to another aspect of the embodiments of the present invention, there is also provided C15. a computer storage medium storing computer program code which, when run on a computing device, causes the computing device to execute the streaming data processing method based on containerized microservices of any one of A1-A7.
According to another aspect of an embodiment of the present invention, there is also provided C16. a computing device, including:
a processor;
a memory storing computer program code;
the computer program code, when executed by the processor, causes the computing device to perform the streaming data processing method based on containerized microservices of any one of A1-A7.

Claims (10)

1. A streaming data processing method based on containerized micro-services comprises the following steps:
respectively receiving stream data information by a plurality of containers providing micro service services, and screening data information to be processed from the received stream data information according to the type of the micro service services of each container;
processing the data information to be processed according to a preset service logic, and storing the processing result data into the appointed storage position of each container;
and acquiring processing result data from the designated storage positions of the containers, summarizing the processing result data, and outputting a processing result set of the stream data information.
2. The method of claim 1, wherein before receiving the streaming data information via the plurality of containers respectively providing different microservice services, the method further comprises:
deploying a plurality of working nodes executing the same or different working tasks, and packaging a plurality of containers providing the micro service business based on the working nodes; wherein each container comprises at least one work node; the work node includes: the data processing system comprises a data receiving node, a data processing node, a data storage node and/or a data summarizing node;
and distributing unique identification information for each container, and uniformly managing each container through a preset container management platform.
3. The method of claim 2, wherein after receiving the streaming data information via a plurality of containers respectively providing the micro service services and screening the received streaming data information for pending data information according to the type of the micro service of each container, the method further comprises:
if any container monitors that the information of the data to be processed screened by the container exceeds a preset information quantity and/or the storage quantity of the designated storage position exceeds a preset storage quantity, sending a capacity expansion request to the preset container management platform, and starting a new container by the preset container management platform based on the configuration information of the container.
4. The method according to claim 2, wherein after assigning unique identification information to each container and performing unified management on each container through a preset container management platform, the method further comprises:
pre-configuring a connection mode of at least one container;
if a new first container is started, the new first container sends its connection mode to the pre-configured container.
5. The method of claim 4, wherein if the new first container is started, after the new first container sends its connection mode to the container, further comprising:
if a new second container is subsequently started, the new second container sends the connection mode to the container;
and the pre-configured container sends the connection mode of the new second container to the new first container, and simultaneously sends the connection mode of the new first container to the new second container.
6. The method of any of claims 2-5, wherein the deploying a plurality of worker nodes performing the same or different work tasks further comprises, after packaging a plurality of containers providing microservice services based on the worker nodes:
and storing the functions supported by the containers and the number of the starting instances in each container, and displaying the state information of the container through an interface.
7. The method according to any one of claims 1 to 5, wherein after processing the data information to be processed according to a preset service logic and storing the processing result data in the designated storage location of each container, the method further comprises:
and marking the processing result data with the storage time exceeding the preset time as overdue data, and recovering the overdue data stored in each container based on preset rules.
8. A streaming data processing apparatus based on containerized microservices, comprising:
the receiving module is configured to respectively receive the stream data information through a plurality of containers providing the micro service services, and screen out the data information to be processed from the received stream data information according to the type of the micro service services of each container;
the processing module is configured to process the data information to be processed according to preset service logic and store the processing result data into the designated storage positions of the containers;
and the summarizing module is configured to acquire processing result data from the designated storage positions of the containers, summarize the processing result data and output a processing result set of the flow data information.
9. A computer storage medium storing computer program code which, when run on a computing device, causes the computing device to perform the containerized microservice-based streaming data processing method of any of claims 1-7.
10. A computing device, comprising:
a processor;
a memory storing computer program code;
the computer program code, when executed by the processor, causes the computing device to perform the containerized microservice-based streaming data processing method of any of claims 1-7.
CN201811585463.3A 2018-12-24 2018-12-24 Stream data processing method and device based on containerized micro-service Active CN111352726B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811585463.3A CN111352726B (en) 2018-12-24 2018-12-24 Stream data processing method and device based on containerized micro-service


Publications (2)

Publication Number Publication Date
CN111352726A true CN111352726A (en) 2020-06-30
CN111352726B CN111352726B (en) 2024-04-05

Family

ID=71196216

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811585463.3A Active CN111352726B (en) 2018-12-24 2018-12-24 Stream data processing method and device based on containerized micro-service

Country Status (1)

Country Link
CN (1) CN111352726B (en)


Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103986608A (en) * 2014-05-29 2014-08-13 浪潮电子信息产业股份有限公司 J2EE application virtualization management method based on Itanium Linux application containers
CN106126338A (en) * 2016-06-21 2016-11-16 浪潮(北京)电子信息产业有限公司 A kind of method and device of cluster virtual machine telescopic arrangement
CN106227579A (en) * 2016-07-12 2016-12-14 深圳市中润四方信息技术有限公司 A kind of Docker container construction method and Docker manage control station
CN108228347A (en) * 2017-12-21 2018-06-29 上海电机学院 The Docker self-adapting dispatching systems that a kind of task perceives
CN108494574A (en) * 2018-01-18 2018-09-04 清华大学 Network function parallel processing architecture in a kind of NFV
US20180285165A1 (en) * 2017-03-31 2018-10-04 Ca, Inc. Container-based system analytics appliance
CN108958881A (en) * 2018-05-31 2018-12-07 平安科技(深圳)有限公司 Data processing method, device and computer readable storage medium


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ION-DORINEL FILIP: "Microservices Scheduling Model Over Heterogeneous Cloud-Edge Environments As Support for IoT Applications", IEEE Internet of Things Journal, vol. 5, no. 4, 31 August 2018, pages 2672-2681 *
TECHEEK: "How to Deploy Microservices with Docker" ("如何使用Docker部署微服务"), retrieved from the Internet: <URL: https://cloud.tencent.com/developer/article/1339967> *
DU Shengdong (杜圣东): "Traffic Big Data: An Agile Processing Architecture Design Based on Microservices" ("交通大数据：一种基于微服务的敏捷处理架构设计"), Big Data (《大数据》), vol. 3, no. 03, 20 May 2017, pages 53-67 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112286935A (en) * 2020-10-30 2021-01-29 上海淇玥信息技术有限公司 Scheduling method and device based on scheduling platform and electronic equipment
CN114924877A (en) * 2022-05-17 2022-08-19 江苏泰坦智慧科技有限公司 Dynamic allocation calculation method, device and equipment based on data stream
CN114924877B (en) * 2022-05-17 2023-10-17 江苏泰坦智慧科技有限公司 Dynamic allocation calculation method, device and equipment based on data stream
CN115499421A (en) * 2022-09-19 2022-12-20 北京三维天地科技股份有限公司 Micro-service architecture mode based on three-layer architecture

Also Published As

Publication number Publication date
CN111352726B (en) 2024-04-05

Similar Documents

Publication Publication Date Title
US10171377B2 (en) Orchestrating computing resources between different computing environments
US10733029B2 (en) Movement of services across clusters
US9971823B2 (en) Dynamic replica failure detection and healing
US9319281B2 (en) Resource management method, resource management device, and program product
EP3282356A1 (en) Container monitoring configuration deployment
US10338958B1 (en) Stream adapter for batch-oriented processing frameworks
US7490265B2 (en) Recovery segment identification in a computing infrastructure
US9405589B2 (en) System and method of optimization of in-memory data grid placement
US8843632B2 (en) Allocation of resources between web services in a composite service
US10764165B1 (en) Event-driven framework for filtering and processing network flows
CN109995842B (en) Grouping method and device for distributed server cluster
CN111352726A (en) Streaming data processing method and device based on containerized micro-service
Zhang et al. Improving Hadoop service provisioning in a geographically distributed cloud
CN111459641B (en) Method and device for task scheduling and task processing across machine room
US9184982B2 (en) Balancing the allocation of virtual machines in cloud systems
US20220329651A1 (en) Apparatus for container orchestration in geographically distributed multi-cloud environment and method using the same
CN109873714B (en) Cloud computing node configuration updating method and terminal equipment
US20190253488A1 (en) Transaction process management by dynamic transaction aggregation
CN113608838A (en) Deployment method and device of application image file, computer equipment and storage medium
KR102231358B1 (en) Single virtualization method and system for HPC cloud service
CN112583740B (en) Network communication method and device
CN115826845A (en) Storage resource allocation method and device, storage medium and electronic device
CN113037812A (en) Data packet scheduling method and device, electronic equipment, medium and intelligent network card
US10819777B1 (en) Failure isolation in a distributed system
US11768704B2 (en) Increase assignment effectiveness of kubernetes pods by reducing repetitive pod mis-scheduling

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20240308

Address after: Room 03, 2nd Floor, Building A, No. 20 Haitai Avenue, Huayuan Industrial Zone (Huanwai), Binhai New Area, Tianjin, 300450

Applicant after: 3600 Technology Group Co.,Ltd.

Country or region after: China

Address before: 100088 room 112, block D, 28 new street, new street, Xicheng District, Beijing (Desheng Park)

Applicant before: BEIJING QIHOO TECHNOLOGY Co.,Ltd.

Country or region before: China

GR01 Patent grant