KR101867220B1 - Device and method for realtime stream processing to enable supporting both streaming model and automatic selection depending on stream data - Google Patents
- Publication number
- KR101867220B1
- Authority
- KR
- South Korea
- Prior art keywords
- stream data
- real
- data
- processing method
- time
- Prior art date
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/466—Transaction processing
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Measuring And Recording Apparatus For Diagnosis (AREA)
Abstract
A real-time stream processing method, and an apparatus and a recording medium to which the method is applied, are provided. According to the present real-time stream processing method, a real-time stream data processing method is selected according to the type of the stream data, and the input stream data is processed according to the selected real-time stream data processing method. Because the streaming model is selected automatically according to the type of the input stream data, the throughput of the real-time processing service can be increased.
Description
The present invention relates to a real-time stream processing technique, and more particularly, to a real-time stream processing method and apparatus for processing stream data in real time.
A real-time processing system must guarantee a service level with a delay time of less than one second and must provide a constant response speed and predictable performance.
The event stream processing method guarantees low latency, can express almost any processing logic, and makes state management easy to implement. However, a bottleneck may occur if data are concentrated on a specific key, and every event must be handled individually, which increases the cost of failure handling.
The micro-batching method lowers the failure handling cost and improves throughput because it processes data in batches. However, the micro-batching method limits the logic that can be expressed and has a large delay time.
Existing real-time processing frameworks include Storm, Samza, and Flink, which use the event stream processing method, and Spark, which uses the micro-batching method. However, none of them supports the event stream processing method and the micro-batching method simultaneously. In addition, when a stream is processed by the micro-batching method in an existing framework, the delay time increases rapidly when data are concentrated at a specific time.
As a result, the batch interval must be reset for the time periods in which data are concentrated in order to keep the delay time low and constant. In addition, depending on the service characteristics or the format classification of the stream data (structured, semi-structured, unstructured), the processing framework must be changed or integrated with another platform to resolve performance issues.
SUMMARY OF THE INVENTION The present invention has been made to solve the above problems, and it is an object of the present invention to provide a real-time stream processing method that selects a real-time stream data processing method according to the type of stream data and processes the input stream data according to the selected method, and an apparatus and a recording medium to which the method is applied.
Another object of the present invention is to provide a real-time stream processing method that selects a real-time stream data processing method from a micro-batching method or an event stream processing method according to the type of stream data and processes the stream data accordingly, wherein a buffer having a fixed size is used for the micro-batching method, and an apparatus and a recording medium to which the method is applied.
According to an aspect of the present invention, there is provided a real-time stream processing method including: receiving stream data; selecting a real-time stream data processing method according to a type of the stream data; and processing the input stream data in accordance with the selected real-time stream data processing method.
In the selection step, either the first real-time stream data processing method or the second real-time stream data processing method can be selected according to the type of the stream data.
In addition, the type of stream data may include structured data, semi-structured data, and unstructured data.
In the selection step, when the stream data is unstructured data or semi-structured data, the first real-time stream data processing method can be selected as the real-time stream data processing method.
Also, the first real-time stream data processing method may be a micro-batching method.
A buffer of a fixed size may be used for the micro-batching method.
In the selection step, when the stream data is structured data, the second real-time stream data processing method may be selected as the real-time stream data processing method.
The second real-time stream data processing method may be an event stream processing method.
In addition, the method may further include setting an automatic mode or a manual mode, and the selecting step may select a real-time stream data processing method set by a user when the manual mode is set.
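The selection step described above can be sketched as follows. This is a minimal illustration only, not the patented implementation: the names `DataType`, `Model`, and `select_model` are hypothetical, and passing a non-empty `manual_choice` stands in for the manual mode.

```python
from enum import Enum
from typing import Optional


class DataType(Enum):
    STRUCTURED = "structured"
    SEMI_STRUCTURED = "semi-structured"
    UNSTRUCTURED = "unstructured"


class Model(Enum):
    MICRO_BATCH = "micro-batching"   # first real-time stream data processing method
    EVENT_STREAM = "event stream"    # second real-time stream data processing method


def select_model(data_type: DataType,
                 manual_choice: Optional[Model] = None) -> Model:
    """Return the streaming model for one stream (hypothetical helper)."""
    # Manual mode: the model set by the user takes precedence.
    if manual_choice is not None:
        return manual_choice
    # Automatic mode: unstructured and semi-structured data use micro-batching.
    if data_type in (DataType.UNSTRUCTURED, DataType.SEMI_STRUCTURED):
        return Model.MICRO_BATCH
    # Structured data uses event stream processing.
    return Model.EVENT_STREAM
```

For example, `select_model(DataType.STRUCTURED)` yields `Model.EVENT_STREAM`, while `select_model(DataType.STRUCTURED, Model.MICRO_BATCH)` returns the user's choice regardless of the data type.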
According to another aspect of the present invention, there is provided a real-time stream processing apparatus including: an input unit for receiving stream data; a selecting unit for selecting a real-time stream data processing method according to the type of the stream data; and a processing unit for processing the input stream data according to the selected real-time stream data processing method.
According to another aspect of the present invention, there is provided a computer-readable recording medium having recorded thereon a program for causing a computer to execute the steps of: receiving stream data; selecting a real-time stream data processing method according to a type of the stream data; and processing the input stream data in accordance with the selected real-time stream data processing method.
According to another aspect of the present invention, there is provided a real-time stream processing method comprising: receiving stream data; selecting a real-time stream data processing method from a micro-batching method or an event stream processing method according to a type of the stream data; and processing the input stream data according to the selected real-time stream data processing method, wherein a fixed-size buffer is used in the micro-batching method.
According to various embodiments of the present invention, a real-time stream processing method that selects a real-time stream data processing method according to the type of stream data and processes the input stream data according to the selected method, and an apparatus and a recording medium to which the method is applied, are provided, so that the throughput of the real-time processing service can be increased by automatically selecting the streaming model according to the type of the input stream data.
In addition, two streaming models (event stream and micro-batching) are supported at the same time, so that an existing real-time processing system can be used without modification. When a bottleneck occurs in stream data processing based on the event stream method, the micro-batching method, which is the alternative streaming model, can be selected automatically. In addition, the load balancing performance problem that occurs when the micro-batching method is used can be improved, and a lower delay time can be guaranteed.
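The automatic fallback from the event stream method to the micro-batching method could look like the sketch below. The description does not say how a bottleneck is detected, so everything here is an assumption made for illustration: the bottleneck is approximated by the share of recent events carried by the most frequent key, and `hottest_key_share`, `choose_model`, and the default `skew_threshold` of 0.8 are hypothetical.

```python
from collections import Counter
from typing import Hashable, Sequence


def hottest_key_share(keys: Sequence[Hashable]) -> float:
    """Fraction of recent events that carry the most frequent key."""
    if not keys:
        return 0.0
    most_common_count = Counter(keys).most_common(1)[0][1]
    return most_common_count / len(keys)


def choose_model(current_model: str,
                 recent_keys: Sequence[Hashable],
                 skew_threshold: float = 0.8) -> str:
    """Switch to micro-batching when keys are heavily concentrated.

    The skew test is an assumed stand-in for bottleneck detection.
    """
    if (current_model == "event stream"
            and hottest_key_share(recent_keys) >= skew_threshold):
        return "micro-batching"
    return current_model
```

With nine of ten recent events on the same key, `choose_model("event stream", ["a"] * 9 + ["b"])` switches to micro-batching. An evenly spread key set leaves the event stream model in place.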
FIG. 1 is a diagram schematically showing a configuration of a real-time stream processing apparatus according to an embodiment of the present invention,
FIG. 2 is a diagram provided in the description of a real-time stream processing method according to another embodiment of the present invention,
FIG. 3 is a diagram provided in the description of a real-time stream processing method according to another embodiment of the present invention,
FIG. 4 is a diagram schematically illustrating a real-time stream processing method according to another embodiment of the present invention,
FIG. 5 is a diagram schematically illustrating a conventional micro-batching method, and
FIG. 6 is a diagram schematically illustrating a micro-batching method according to another embodiment of the present invention.
Hereinafter, the present invention will be described in detail with reference to the drawings.
1. Real-time stream processing device
1 is a diagram schematically showing a configuration of a real-time
As shown in FIG. 1, the real-time
The
Specifically, the stream data may include structured data, semi-structured data, and unstructured data.
Structured data is data stored in fixed fields, as in a database. Examples of structured data are relational databases and spreadsheets.
Semi-structured data is data that does not conform to the formal data model associated with relational databases or other kinds of data tables. Although it has no fixed structure, it contains tags, schemas, or other markers, so that semantic elements can be distinguished and the hierarchy of records and fields in the data can be represented. Examples of semi-structured data include Extensible Markup Language (XML), JavaScript Object Notation (JSON), NoSQL databases, and the like.
Unstructured data is data, such as pictures, images, and documents, whose form and structure do not follow a fixed standard or format. Examples of unstructured data include traditional data such as books, magazines, medical records, voice information, and video information, as well as data generated online or from mobile devices, such as email, Twitter, and blogs.
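The three data types above can be told apart with a simple heuristic such as the one sketched below. The patent does not specify how the type is detected, so these rules are assumptions for illustration only: a record whose fields match a known fixed schema counts as structured, text that parses as JSON or XML counts as semi-structured, and everything else counts as unstructured. The function name `classify` and the `schema` parameter are hypothetical.

```python
import json
import xml.etree.ElementTree as ET
from typing import Any, Iterable, Optional


def classify(payload: Any, schema: Optional[Iterable[str]] = None) -> str:
    """Classify one payload as structured, semi-structured, or unstructured."""
    # Structured: a record whose field names match a fixed, known schema.
    if isinstance(payload, dict) and schema is not None:
        if set(payload) == set(schema):
            return "structured"
    if isinstance(payload, str):
        text = payload.strip()
        # Semi-structured: self-describing markup such as JSON or XML.
        if text[:1] in ("{", "["):
            try:
                json.loads(text)
                return "semi-structured"
            except json.JSONDecodeError:
                pass
        if text.startswith("<"):
            try:
                ET.fromstring(text)
                return "semi-structured"
            except ET.ParseError:
                pass
    # Unstructured: free text, images, audio, and other raw content.
    return "unstructured"
```

A database row such as `{"id": 1, "name": "a"}` checked against the schema `("id", "name")` is classified as structured, the string `'{"a": 1}'` as semi-structured, and raw bytes or free text as unstructured.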
In this way, the
The
At this time, the
If the stream data is unstructured data or semi-structured data, the selecting
At this time, a fixed-size buffer is used for the micro-batching method, and a detailed description thereof will be given later with reference to FIG. 5 and FIG. 6.
The
In addition, the real-time
The
The real-time
The real-time
2. Real-time stream processing method
FIG. 2 is a diagram provided in the description of a real-time stream processing method according to another embodiment of the present invention.
First, the real-time
The real-time
Specifically, if the stream data is unstructured data or semi-structured data (S220-Y), the real-time
On the other hand, if the stream data is structured data (S220-N, S240), the real-time
Thereafter, the real-time
On the other hand, the real-time
First, the real-time
Then, the real-time
Through this process, the real-time
3. Simultaneous streaming model support and automatic selection of streaming model
FIG. 4 is a diagram schematically illustrating a real-time stream processing method according to another embodiment of the present invention. As shown in FIG. 4, according to the embodiment of the present invention, both streaming models, the micro-batching method and the event stream processing method, are supported simultaneously and selected automatically.
According to FIG. 4, when the stream data source is a text file, it corresponds to unstructured data, so it can be confirmed that the
According to FIG. 4, when the stream data source is JSON, it corresponds to semi-structured data, and therefore it can be confirmed that the
In addition, according to FIG. 4, when the stream data source is an RDBMS, it corresponds to structured data, and therefore it can be confirmed that the
As described above, the real-time
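The source-to-model mapping discussed for FIG. 4 can be written as a small lookup table. This is a sketch only: the model column follows the selection rule stated earlier in the description (unstructured and semi-structured data use micro-batching, structured data uses event stream processing), and the identifiers `SOURCE_TABLE` and `model_for_source` are hypothetical.

```python
# Source kind -> (data type, streaming model), following the FIG. 4 examples.
SOURCE_TABLE = {
    "text_file": ("unstructured", "micro-batching"),
    "json": ("semi-structured", "micro-batching"),
    "rdbms": ("structured", "event stream"),
}


def model_for_source(source_kind: str) -> str:
    """Look up the streaming model for a source kind (case-insensitive)."""
    try:
        _data_type, model = SOURCE_TABLE[source_kind.lower()]
    except KeyError:
        raise ValueError(f"unknown stream data source: {source_kind!r}")
    return model
```

A table keeps the rule easy to inspect. A source kind that is not listed raises `ValueError` rather than falling back silently to a default model.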
4. New micro-batching method
FIG. 5 is a diagram schematically illustrating a conventional micro-batching method.
As shown in FIG. 5, in the conventional micro-batching method, all stream data input during an interval N (Interval N) are bundled and transmitted as one batch unit. However, since all the data input during the interval N is processed as one batch unit, the delay time increases rapidly when stream data are concentrated at a specific time.
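The conventional interval-based behavior can be illustrated with the following sketch, in which every event that arrives during the same interval ends up in one batch. The function name `interval_batches` and the `(timestamp, payload)` event format are assumptions made for the example.

```python
from collections import defaultdict
from typing import Any, Iterable, List, Tuple


def interval_batches(events: Iterable[Tuple[float, Any]],
                     interval: float) -> List[List[Any]]:
    """Group (timestamp, payload) events into one batch per time interval."""
    batches = defaultdict(list)
    for timestamp, payload in events:
        # All events in the same interval share one batch, however many arrive.
        batches[int(timestamp // interval)].append(payload)
    return [batches[index] for index in sorted(batches)]


# A burst in the second interval produces one oversized batch.
events = [(0.1, "a"), (1.2, "b"), (1.3, "c"), (1.4, "d"), (1.9, "e"), (2.5, "f")]
sizes = [len(batch) for batch in interval_batches(events, interval=1.0)]
# sizes == [1, 4, 1]
```

The batch size tracks the arrival rate, so a burst of input directly inflates the work done for that interval. This is the delay problem described above.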
A solution to this problem is the new micro-batching method shown in FIG. 6. FIG. 6 is a diagram schematically illustrating a micro-batching method according to another embodiment of the present invention.
As shown in FIG. 6, a fixed-size buffer is used in the micro-batching method according to the embodiment of the present invention. That is, the real-time
Specifically, when the stream data input during the interval N is larger than the buffer unit, the real-time
As such, the real-time
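One possible reading of the fixed-size buffer scheme is sketched below: when the data input during an interval exceeds the buffer size, it is emitted as several batches of at most the buffer size instead of one large batch. The exact splitting rule is not recoverable from the text here, so this behavior, the function name `fixed_buffer_batches`, and the `buffer_size` parameter are assumptions.

```python
from typing import Any, List, Sequence


def fixed_buffer_batches(interval_data: Sequence[Any],
                         buffer_size: int) -> List[List[Any]]:
    """Split one interval's data into batches no larger than the buffer."""
    if buffer_size <= 0:
        raise ValueError("buffer_size must be positive")
    return [
        list(interval_data[start:start + buffer_size])
        for start in range(0, len(interval_data), buffer_size)
    ]
```

With `buffer_size=2`, an interval that received five records is emitted as batches of 2, 2, and 1 records, so the size of any single batch stays bounded even when input is concentrated in one interval.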
Needless to say, the technical idea of the present invention can also be applied to a computer-readable recording medium having a computer program for performing the functions of the real-time
While the present invention has been particularly shown and described with reference to exemplary embodiments thereof, it is to be understood that the invention is not limited to the disclosed exemplary embodiments. On the contrary, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the present invention.
100: Real-time stream processing device
110: input unit
120:
130:
Claims (12)
Selecting, by a real-time stream processing apparatus, a real-time stream data processing method according to a type of stream data; and
Processing the input stream data in accordance with the selected real-time stream data processing method,
Wherein, in the selecting step, one of a first real-time stream data processing method and a second real-time stream data processing method is selected depending on the type of the stream data, and
Wherein the type of the stream data includes structured data, semi-structured data, and unstructured data.
Wherein, in the selecting step, when the stream data is unstructured data or semi-structured data, the first real-time stream data processing method is selected as the real-time stream data processing method.
Wherein the first real-time stream data processing method is a micro-batching method.
Wherein the micro-batching method uses a fixed-size buffer.
Wherein, in the selecting step, when the stream data is structured data, the second real-time stream data processing method is selected as the real-time stream data processing method.
Wherein the second real-time stream data processing method is an event stream processing method.
Wherein, in the selecting step, when the real-time stream processing apparatus is set to the manual mode, the real-time stream data processing method set by the user is selected.
A selecting unit for selecting a real-time stream data processing method according to the type of the stream data; and
A processing unit for processing the input stream data according to the selected real-time stream data processing method,
Wherein the selecting unit selects one of a first real-time stream data processing method and a second real-time stream data processing method depending on the type of the stream data, and
Wherein the type of the stream data includes structured data, semi-structured data, and unstructured data.
Selecting a real-time stream data processing method according to a type of the stream data; and
Processing the input stream data according to the selected real-time stream data processing method,
Wherein, in the selecting step, one of a first real-time stream data processing method and a second real-time stream data processing method is selected depending on the type of the stream data, and
Wherein the type of the stream data includes structured data, semi-structured data, and unstructured data.
Selecting a real-time stream data processing method from a micro-batching method or an event stream processing method according to the type of the stream data; and
Processing the input stream data in accordance with the selected real-time stream data processing method,
Wherein the micro-batching method uses a fixed-size buffer, and
Wherein the type of the stream data includes structured data, semi-structured data, and unstructured data.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020170023893A KR101867220B1 (en) | 2017-02-23 | 2017-02-23 | Device and method for realtime stream processing to enable supporting both streaming model and automatic selection depending on stream data |
US15/464,798 US10671636B2 (en) | 2016-05-18 | 2017-03-21 | In-memory DB connection support type scheduling method and system for real-time big data analysis in distributed computing environment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020170023893A KR101867220B1 (en) | 2017-02-23 | 2017-02-23 | Device and method for realtime stream processing to enable supporting both streaming model and automatic selection depending on stream data |
Publications (1)
Publication Number | Publication Date |
---|---|
KR101867220B1 (en) | 2018-06-12 |
Family
ID=62622482
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020170023893A KR101867220B1 (en) | 2016-05-18 | 2017-02-23 | Device and method for realtime stream processing to enable supporting both streaming model and automatic selection depending on stream data |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR101867220B1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11640402B2 (en) | 2020-07-22 | 2023-05-02 | International Business Machines Corporation | Load balancing in streams parallel regions |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103761309A (en) * | 2014-01-23 | 2014-04-30 | 中国移动(深圳)有限公司 | Operation data processing method and system |
US8978034B1 (en) * | 2013-03-15 | 2015-03-10 | Natero, Inc. | System for dynamic batching at varying granularities using micro-batching to achieve both near real-time and batch processing characteristics |
KR20150084098A (en) * | 2014-01-13 | 2015-07-22 | 한국전자통신연구원 | System for distributed processing of stream data and method thereof |
JP2016539427A (en) * | 2013-12-05 | 2016-12-15 | オラクル・インターナショナル・コーポレイション | Pattern matching across multiple input data streams |
- 2017-02-23: KR application KR1020170023893A, patent KR101867220B1 (en), status: active, IP Right Grant
Non-Patent Citations (1)
Title |
---|
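Jae-Gi Son and one other. 'Squall: A TMO model-based real-time big data processing framework supporting simultaneous processing of real-time events and micro-batches'. Journal of KIISE, Vol. 44, No. 1, Jan. 2017, pp. 84-94. * |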
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20170169061A1 (en) | NoSQL RELATIONAL DATABASE (RDB) DATA MOVEMENT | |
US10007656B2 (en) | DOM snapshot capture | |
US10922288B2 (en) | Method for storing data elements in a database | |
US11798208B2 (en) | Computerized systems and methods for graph data modeling | |
US20170220647A1 (en) | Pluggable architecture for embedding analytics in clustered in-memory databases | |
US20110022643A1 (en) | Dynamic media content previews | |
US10353874B2 (en) | Method and apparatus for associating information | |
US20190361607A1 (en) | Providing combined data from a cache and a storage device | |
US10755091B2 (en) | Method and apparatus for retrieving image-text block from web page | |
US20190266024A1 (en) | Selective and piecemeal data loading for computing efficiency | |
US20170032052A1 (en) | Graph data processing system that supports automatic data model conversion from resource description framework to property graph | |
US10671636B2 (en) | In-memory DB connection support type scheduling method and system for real-time big data analysis in distributed computing environment | |
KR101867220B1 (en) | Device and method for realtime stream processing to enable supporting both streaming model and automatic selection depending on stream data | |
US11622164B2 (en) | System and method for streaming video/s | |
US20170163555A1 (en) | Video file buffering method and system | |
CN111078697B (en) | Data storage method and device, storage medium and electronic equipment | |
KR101830504B1 (en) | In-Memory DB Connection Support Type Scheduling Method and System for Real-Time Big Data Analysis in Distributed Computing Environment | |
US20140067766A1 (en) | Propagating per-custodian preservation and collection requests between ediscovery management applications and content archives | |
WO2017071210A1 (en) | Contact creation method and device | |
US9767191B2 (en) | Group based document retrieval | |
US9679015B2 (en) | Script converter | |
US9705833B2 (en) | Event driven dynamic multi-purpose internet mail extensions (MIME) parser | |
US10482077B2 (en) | System and method for asynchronous update of a search index | |
US10037155B2 (en) | Preventing write amplification during frequent data updates | |
US8787972B2 (en) | Electronic device and method for managing commands |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
E701 | Decision to grant or registration of patent right | ||
GRNT | Written decision to grant |