KR20160096313A - Apparatus and method for monitoring analysis application for analyzing big data - Google Patents
Apparatus and method for monitoring analysis application for analyzing big data
- Publication number
- KR20160096313A
- Authority
- KR
- South Korea
- Prior art keywords
- analysis
- information
- application
- processing
- analysis application
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/3065—Monitoring arrangements determined by the means or processing involved in reporting the monitored data
- G06F11/3086—Monitoring arrangements determined by the means or processing involved in reporting the monitored data where the reporting involves the use of self describing data formats, i.e. metadata, markup languages, human readable formats
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Library & Information Science (AREA)
- Quality & Reliability (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Debugging And Monitoring (AREA)
Abstract
Description
BACKGROUND OF THE INVENTION
Recently, as mobile communication terminals such as smartphones and tablet PCs have become widespread and the use of social network services (SNS), machine-to-machine (M2M) communication, and sensor networks has increased, the amount of data, its rate of generation, and its variety are increasing exponentially.
Analysis of these large and diverse data sets can be applied to various technologies such as intelligent robots, next-generation PCs, telematics, home networks, customer relationship management, artificial intelligence, and search engines, and research on it is actively under way.
Big data analysis technology refers to technology that derives valuable knowledge information, or builds a knowledge base, not only from data that can be managed by an existing relational database but also from structured, unstructured, and semi-structured data.
However, since it is practically impossible to analyze vast amounts of big data with a single information processing apparatus, a parallel distributed processing system in which big data is distributed to a plurality of information processing apparatuses and processed in parallel is used.
In a parallel distributed processing system, big data is distributed to a plurality of information processing apparatuses and analyzed through an analysis application executed in each of a plurality of information processing apparatuses.
However, the conventional big data analysis technology described above has a limitation: it cannot monitor on which of the plurality of information processing apparatuses the big data analysis is executed by the analysis application, what resources and data are used for the analysis, or how the analysis of the data proceeds.
Thus, when a problem occurs in the system, the developer or system operator must individually check the analysis work of the analysis application executed in each of the plurality of information processing apparatuses in order to solve it, so the system cannot be operated and managed efficiently.
As a result, the performance of big data analysis is poor, and the reliability of the knowledge derived from it may also be lowered.
SUMMARY OF THE INVENTION
An object of the present invention is to provide an apparatus for monitoring an analysis application in order to efficiently integrate and manage big data analysis results distributed across a plurality of information processing apparatuses.
It is another object of the present invention to provide a method of monitoring an analysis application so as to improve the analysis performance for big data subjected to distributed processing and to infer highly reliable information from the big data.
According to an aspect of the present invention, there is provided an apparatus for monitoring an analysis application, implemented in a parallel distributed processing system that distributes big data across a plurality of information processing apparatuses, the apparatus comprising: a metadata extracting unit for extracting metadata for an analysis application to be executed in each of the plurality of information processing apparatuses; an information collecting unit for collecting, from each of the plurality of information processing apparatuses, processing information on at least one analysis task operating in the analysis application; and a monitoring providing unit for sorting and displaying the processing information on the at least one analysis task.
Here, the analysis application may refer to a MapReduce program composed of a Map function for distributing big data and a Reduce function for integrating distributed data analysis results.
Here, the information collecting unit may receive processing information for at least one analysis task operating in the analysis application from each of the plurality of information processing apparatuses, and may map the processing information for the at least one analysis task to the metadata for the analysis application and store them.
Here, the metadata for the analysis application may include identification data previously assigned to identify the analysis application to be executed in each of the plurality of information processing apparatuses, identification information of the information processing apparatus in which the analysis application is to be executed, and the location, execution path, content, rights condition, and usage condition of the analysis application in that information processing apparatus.
Here, the processing information for at least one analysis task may include at least one of: resource information or data information accessed by each analysis task to process the big data, identification information of each analysis task, type information of the function that operates each analysis task, information representing the success or failure of processing for each analysis task, information representing the progress status, and the start time and end time of the operation of each analysis task.
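The two kinds of records described above, and the mapping between them, can be sketched as a small data model. This is a minimal illustration only: the class names, field names, and the in-memory store are assumptions made for the example and are not taken from the specification, and only a subset of the listed metadata fields is modelled.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional


@dataclass(frozen=True)
class AppMetadata:
    """Metadata for one analysis application (hypothetical field names)."""
    app_id: str          # identification data assigned to the application
    node_id: str         # information processing apparatus that runs it
    location: str        # location of the application on that apparatus
    execution_path: str  # execution path of the application


@dataclass
class TaskProcessingInfo:
    """Processing information reported for one analysis task."""
    task_id: str                        # identification information of the task
    function_type: str                  # "map" or "reduce"
    state: str                          # e.g. "SUCCEEDED" or "FAILED"
    progress: float                     # progress status, 0.0 to 100.0
    start_time: datetime
    end_time: Optional[datetime] = None
    accessed_resources: List[str] = field(default_factory=list)


class ProcessingInfoStore:
    """Stores task records mapped to the metadata of their application."""

    def __init__(self) -> None:
        self._metadata: Dict[str, AppMetadata] = {}
        self._tasks: Dict[str, List[TaskProcessingInfo]] = {}

    def register(self, meta: AppMetadata) -> None:
        # Metadata is registered first, before any task reports arrive.
        self._metadata[meta.app_id] = meta
        self._tasks.setdefault(meta.app_id, [])

    def add_task(self, app_id: str, info: TaskProcessingInfo) -> None:
        # Reject reports for applications whose metadata was never extracted.
        if app_id not in self._metadata:
            raise KeyError(f"unknown analysis application: {app_id}")
        self._tasks[app_id].append(info)

    def metadata_for(self, app_id: str) -> AppMetadata:
        return self._metadata[app_id]

    def tasks_for(self, app_id: str) -> List[TaskProcessingInfo]:
        return list(self._tasks[app_id])
```

Keying both dictionaries by the application identifier is one simple way to answer the question of which apparatus and which application a given task record belongs to.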
Here, the apparatus may further include a search condition setting unit that provides a user interface so that a system operator or developer who operates the parallel distributed processing system can input search conditions for the processing information on at least one analysis task.
Here, the monitoring and providing unit may sort the processing information for at least one analysis task corresponding to the input search condition on the basis of the operation order and display the processing information on the screen.
According to another aspect of the present invention, there is provided a method for monitoring an analysis application, the method comprising: extracting metadata for an analysis application to be executed in each of a plurality of information processing apparatuses; collecting, as the analysis application is executed in each of the plurality of information processing apparatuses, processing information for at least one analysis task operating in the analysis application from each of the plurality of information processing apparatuses based on the metadata for the analysis application; and sorting and displaying the processing information for the at least one analysis task.
Here, the method may further include providing a user interface so that a system operator or developer who operates the parallel distributed processing system can input search conditions of processing information for at least one analysis task.
According to the apparatus and method for monitoring an analysis application according to an embodiment of the present invention as described above, it is possible to efficiently integrate and manage the big data analysis results distributed across a plurality of information processing apparatuses.
In addition, it is possible to improve the analysis performance for big data subjected to distributed processing and, at the same time, to infer highly reliable knowledge information from the big data.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a diagram illustrating an example of a parallel distributed processing system for distributing big data according to an embodiment of the present invention.
FIG. 2 is a block diagram illustrating an apparatus for monitoring an analysis application in accordance with an embodiment of the present invention.
FIG. 3 is an exemplary view illustrating a screen on which processing information for at least one analysis task operating in an analysis application according to an embodiment of the present invention is provided.
FIG. 4 is an exemplary view illustrating a screen on which a result of analysis of big data processed through an analysis application according to an embodiment of the present invention is provided.
FIG. 5 is an exemplary view illustrating a screen provided for monitoring an analysis application according to a search condition according to an embodiment of the present invention.
FIG. 6 is a flow diagram illustrating a method for monitoring an analysis application in accordance with an embodiment of the present invention.
While the invention is susceptible to various modifications and alternative forms, specific embodiments thereof are shown by way of example in the drawings and will herein be described in detail. It should be understood, however, that the invention is not intended to be limited to the particular embodiments, but includes all modifications, equivalents, and alternatives falling within the spirit and scope of the invention. Like reference numerals are used for like elements in describing each drawing.
The terms first, second, A, B, etc. may be used to describe various elements, but the elements should not be limited by the terms. The terms are used only for the purpose of distinguishing one component from another. For example, without departing from the scope of the present invention, the first component may be referred to as a second component, and similarly, the second component may also be referred to as a first component. The term "and/or" includes any combination of a plurality of related listed items or any one of a plurality of related listed items.
It is to be understood that when an element is referred to as being "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or intervening elements may be present. On the other hand, when an element is referred to as being "directly connected" or "directly coupled" to another element, it should be understood that there are no other elements in between.
The terminology used in this application is used only to describe a specific embodiment and is not intended to limit the invention. The singular expressions include plural expressions unless the context clearly dictates otherwise. In the present application, the terms "comprises" or "having" and the like are used to specify the presence of a feature, number, step, operation, element, component, or combination thereof described in the specification, but do not preclude the presence or addition of one or more other features, numbers, steps, operations, elements, components, or combinations thereof.
Unless defined otherwise, all terms used herein, including technical or scientific terms, have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Terms such as those defined in commonly used dictionaries are to be interpreted as having a meaning consistent with their meaning in the context of the related art, and are not to be interpreted in an idealized or overly formal sense unless explicitly so defined in the present application.
Hereinafter, preferred embodiments according to the present invention will be described in detail with reference to the accompanying drawings.
FIG. 1 is an exemplary diagram illustrating a parallel distributed processing system for distributing large data according to an embodiment of the present invention, and FIG. 2 is a block diagram illustrating an apparatus for monitoring an analysis application according to an embodiment of the present invention.
FIG. 3 is an exemplary view illustrating a screen on which processing information for at least one analysis task operating in an analysis application according to an embodiment of the present invention is provided, FIG. 4 is an exemplary view illustrating a screen on which a result of analysis of big data processed through an analysis application according to an embodiment of the present invention is provided, and FIG. 5 is an exemplary view illustrating a screen provided for monitoring an analysis application according to a search condition according to an embodiment of the present invention.
Referring to Figs. 1 to 5, a technique for monitoring an analysis application for analyzing big data will be described.
In recent years, it is common to use a parallel distributed processing system configured with a plurality of information processing apparatuses for analyzing big data.
However, the conventional big data analysis technique using the parallel distributed processing system does not support monitoring of the process of analyzing big data. Thus, the big data analysis results distributed across a plurality of information processing apparatuses cannot be managed efficiently. As a consequence, the performance of big data analysis can deteriorate, and the reliability of the knowledge inferred through big data analysis can deteriorate as well.
In order to solve the problems of the prior art described above, the present invention proposes a technique for monitoring an analysis application for analyzing big data, so that a system operator or a developer can efficiently integrate and manage big data analysis results distributed across a plurality of information processing apparatuses.
An apparatus for monitoring an analysis application (hereinafter referred to as 'analysis application monitoring apparatus') 100 according to an embodiment of the present invention may be implemented in a parallel
More specifically, the parallel
The distributed
The
Here, the distributed
The big data distributed in the distributed
Thus, the plurality of
Accordingly, the plurality of
In each of the
The analysis
2, the analysis
The
The metadata for the analysis application is identification data of the
Here, the analysis application may be, for example, a MapReduce program composed of a Map function for distributing big data and a Reduce function for integrating the distributed data analysis results, and may refer to an analysis algorithm designed with a 1:N structure such that one analysis application is processed through N analysis tasks.
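The 1:N relationship between one analysis application and its analysis tasks can be illustrated with a word count written in plain Python. This is only a sketch of the Map and Reduce idea under simplifying assumptions: it runs in a single process rather than on a distributed framework, and the function names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_words(chunk: str) -> List[Tuple[str, int]]:
    """Map function: emit (word, 1) for every word in one chunk of input."""
    return [(word.lower(), 1) for word in chunk.split()]


def reduce_counts(pairs: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Reduce function: integrate the partial results emitted by all map tasks."""
    totals: Dict[str, int] = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)


def run_application(chunks: List[str]) -> Dict[str, int]:
    """One application processed through N map tasks and one reduce step."""
    emitted: List[Tuple[str, int]] = []
    for chunk in chunks:  # each iteration stands in for one analysis task
        emitted.extend(map_words(chunk))
    return reduce_counts(emitted)


result = run_application(["big data", "data analysis", "Big Data"])
print(result)  # {'big': 2, 'data': 3, 'analysis': 1}
```

In a real parallel distributed processing system each chunk would be handled on a different information processing apparatus, which is why per-task monitoring information is needed to see what each of the N tasks did.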
The
More specifically, the
At this time, the reason for mapping and storing the processing information for at least one analysis task and the metadata for the analysis application is to identify which of the plurality of
Accordingly, the processing information for the analysis task can be stored in the
The
Referring to FIG. 3, the processing information on the analysis task can be displayed in the order of the at least one analysis task in operation in the analysis application.
For example, a system operator or a developer can confirm that Task A, one of the at least one analysis task running in the analysis application, has the identification information 'task_1404868050785_0002_m_000003' and was operated by a map function, can check its progress status (Progress), and can confirm that its big data analysis succeeded (Succeeded). Also, the time at which Task A started operating, '2014-07-09 11:23:18', the time at which its operation ended, '2014-07-09 11:24:24', and the time (Elapsed) used for the operation can be checked. In particular, it is easy to confirm not only the resource information or data information that Task A accessed to process big data, but also what action it performed.
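The values in this example can be derived from the raw strings with a few lines of Python. The sketch below assumes that the task identifier follows the layout task_<timestamp>_<job>_<m|r>_<number>, where the single-letter segment marks a map or reduce task, and that timestamps use the format shown in the example; the helper names are hypothetical.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M:%S"
FUNCTION_TYPES = {"m": "map", "r": "reduce"}


def task_function_type(task_id: str) -> str:
    """Return 'map' or 'reduce' from the type segment of a task identifier."""
    parts = task_id.split("_")
    # Assumed layout: task_<timestamp>_<job>_<m|r>_<number>
    if len(parts) != 5 or parts[0] != "task" or parts[3] not in FUNCTION_TYPES:
        raise ValueError(f"unexpected task id: {task_id!r}")
    return FUNCTION_TYPES[parts[3]]


def elapsed_seconds(start: str, end: str) -> int:
    """Return the whole number of seconds between two timestamps."""
    started = datetime.strptime(start, TIME_FORMAT)
    ended = datetime.strptime(end, TIME_FORMAT)
    return int((ended - started).total_seconds())


kind = task_function_type("task_1404868050785_0002_m_000003")
elapsed = elapsed_seconds("2014-07-09 11:23:18", "2014-07-09 11:24:24")
print(kind, elapsed)  # map 66
```

For the start and end times quoted above, the computed duration is 66 seconds, that is, 1 minute 6 seconds.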
This allows the system operator or developer to monitor the execution of at least one analysis task running in the analysis application.
Further, the
For example, referring to FIG. 4, it can be confirmed that the average time (argMapTime) of the analysis tasks operated by the Map function in the analysis application is 50s, and that the number of analysis tasks completed by the Map function (mapsCompleted) and the total number of such tasks (mapsTotal) are both 11. Likewise, the average time (argReduceTime) of the analysis tasks operated by the Reduce function is 1m 13s, and the number of analysis tasks completed by the Reduce function and the total number of such tasks (reducedTotal) are both 3. In addition, information such as the finish time (finishTime) of the execution of the analysis application, its ID, and the user ID can be confirmed.
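A summary of this kind can be computed by integrating the per-task records. The following is a minimal sketch under the assumption that each record carries the function type, a final state, and an elapsed time in seconds; the record type and the keys of the returned dictionary are hypothetical and do not reproduce the field names shown in FIG. 4.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class TaskRecord:
    function_type: str      # "map" or "reduce"
    state: str              # "SUCCEEDED", "FAILED", ...
    elapsed_seconds: float  # time used by the task


def summarize(tasks: List[TaskRecord]) -> Dict[str, float]:
    """Integrate per-task records into totals and averages per function type."""
    summary: Dict[str, float] = {}
    for kind in ("map", "reduce"):
        group = [t for t in tasks if t.function_type == kind]
        done = [t for t in group if t.state == "SUCCEEDED"]
        summary[f"{kind}_total"] = len(group)
        summary[f"{kind}_completed"] = len(done)
        # Average only over completed tasks; 0.0 when none have completed.
        summary[f"{kind}_avg_seconds"] = (
            sum(t.elapsed_seconds for t in done) / len(done) if done else 0.0
        )
    return summary


records = [
    TaskRecord("map", "SUCCEEDED", 40.0),
    TaskRecord("map", "SUCCEEDED", 60.0),
    TaskRecord("reduce", "SUCCEEDED", 73.0),
    TaskRecord("reduce", "FAILED", 10.0),
]
summary = summarize(records)
```

Averaging only over succeeded tasks is a design choice of this sketch; a monitoring screen could equally report failed or running tasks separately.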
The analysis
Accordingly, the
For example, as shown in (4) of FIG. 5, a user interface can be provided through which a system operator or a developer inputs search conditions. When the system operator or developer inputs the user identification information 'flamingo', the information corresponding to flamingo among the processing information for at least one analysis task operated in a plurality of analysis applications can be sorted based on the operation order and displayed.
In addition, summary information about the analysis applications executed by the user flamingo can be displayed on the screen so that the system operator or developer can quickly recognize the operation of the analysis applications.
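The search described here amounts to filtering the collected records by the entered condition and then ordering the matches. The sketch below assumes that "operation order" can be approximated by each task's start time and that every record carries a user identifier; the `TaskRow` type and `search_tasks` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class TaskRow:
    task_id: str
    user: str
    app_id: str
    start_time: datetime


def search_tasks(
    rows: List[TaskRow],
    user: Optional[str] = None,
    app_id: Optional[str] = None,
) -> List[TaskRow]:
    """Return rows matching every given condition, sorted by start time."""
    matched = [
        row
        for row in rows
        if (user is None or row.user == user)
        and (app_id is None or row.app_id == app_id)
    ]
    # Assumption: operation order is the order in which tasks started.
    return sorted(matched, key=lambda row: row.start_time)


rows = [
    TaskRow("t3", "flamingo", "app-2", datetime(2014, 7, 9, 11, 30, 0)),
    TaskRow("t1", "flamingo", "app-1", datetime(2014, 7, 9, 11, 23, 18)),
    TaskRow("t2", "other", "app-1", datetime(2014, 7, 9, 11, 24, 0)),
]
found = search_tasks(rows, user="flamingo")
print([row.task_id for row in found])  # ['t1', 't3']
```

Leaving a condition as `None` means it is not applied, so the same function serves searches by user, by application, or by both.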
Specifically, the metadata for the retrieved analytical application can be displayed according to the retrieval condition as shown in (1) or (2) of FIG. Also, information on analysis tasks operating in the analysis application searched through? Can be displayed. In addition, a user interface such as downloading the retrieved contents or updating by refreshing can be provided as in (5).
This allows the system operator or developer to easily understand the operation of a particular application or an analysis task running in a particular application through the user interface.
Here, the
Particularly, the apparatus for monitoring an
FIG. 6 is a flow diagram illustrating a method for monitoring an analysis application in accordance with an embodiment of the present invention.
Referring to FIG. 6, a method for monitoring an analysis application includes extracting metadata for an analysis application (S100), collecting processing information for at least one analysis task operating in the analysis application based on the metadata for the analysis application (S200), and sorting and displaying the processing information for the at least one analysis task (S300).
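The three steps of the method can be read as a small pipeline. The following sketch uses plain dictionaries and assumes that each report sent by an information processing apparatus carries the application identifier and an integer operation order; the function names and dictionary keys are hypothetical and only illustrate how S100, S200, and S300 connect.

```python
from typing import Dict, Iterable, List


def extract_metadata(application: Dict[str, str]) -> Dict[str, str]:
    """S100: keep the fields that identify the analysis application."""
    return {
        "app_id": application["app_id"],
        "execution_path": application["execution_path"],
    }


def collect_processing_info(
    metadata: Dict[str, str], reports: Iterable[Dict[str, object]]
) -> List[Dict[str, object]]:
    """S200: keep reports for this application and attach its metadata."""
    return [
        {**report, "metadata": metadata}
        for report in reports
        if report["app_id"] == metadata["app_id"]
    ]


def sort_for_display(records: List[Dict[str, object]]) -> List[Dict[str, object]]:
    """S300: order the collected records by their operation order."""
    return sorted(records, key=lambda record: record["order"])


def monitor(
    application: Dict[str, str], reports: Iterable[Dict[str, object]]
) -> List[Dict[str, object]]:
    """Run S100, S200, and S300 in sequence."""
    metadata = extract_metadata(application)
    records = collect_processing_info(metadata, reports)
    return sort_for_display(records)


application = {"app_id": "app-1", "execution_path": "/jobs/a.jar", "owner": "x"}
reports = [
    {"app_id": "app-1", "node_id": "n2", "task_id": "t2", "order": 2},
    {"app_id": "app-2", "node_id": "n1", "task_id": "x1", "order": 1},
    {"app_id": "app-1", "node_id": "n1", "task_id": "t1", "order": 1},
]
displayed = monitor(application, reports)
```

Here the metadata extracted in S100 is what lets S200 decide which incoming reports belong to the monitored application.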
The metadata for the analysis application can be extracted before the analysis application is executed in a plurality of information processing apparatuses (S100).
The metadata for the analysis application is identification data of the
Here, the analysis application may be, for example, a MapReduce program composed of a Map function for distributing big data and a Reduce function for integrating the distributed data analysis results, and may refer to an analysis algorithm designed with a 1:N structure such that one analysis application is processed through N analysis tasks.
After the metadata for the analysis application is extracted, processing information for at least one analysis task operating in the analysis application can be collected from each of the plurality of information processing apparatuses as the analysis application is executed in each of the plurality of information processing apparatuses (S200).
More specifically, the processing information for at least one analysis task operating in the analysis application can be received from each of the plurality of information processing apparatuses in order of processing, mapped to the metadata for the analysis application extracted by the metadata extraction unit, and stored.
At this time, the reason for mapping and storing the processing information for at least one analysis task and the metadata for the analysis application is to identify which of the plurality of
Accordingly, the processing information for the analysis task can be stored in the
The processing information for the collected at least one analysis task may be arranged and displayed so that the system operator or developer can monitor execution of at least one analysis task operating in the analysis application (S300).
In addition, processing information for at least one analysis task can be integrated to provide a big data analysis result processed through an analysis application.
In this case, the step of displaying the processing information for the analysis task on the screen may further include the step of providing a user interface so that the system operator or the developer can input the search condition of the processing information for the analysis task.
Thus, the processing information for the analysis task corresponding to the search condition input by the system operator or the developer can be displayed on the screen by sorting based on the operation order. This allows the system operator or developer to easily understand the operation of a particular application or an analysis task running in a particular application through the user interface.
It will be apparent to those skilled in the art that various modifications and variations can be made in the present invention without departing from the spirit or scope of the present invention as defined by the following claims.
10: parallel distributed processing system 20: distributed file management module
30: resource management module 40: information processing device
100: Analysis application monitoring apparatus 110: Metadata extracting unit
120: Information collecting unit 130:
140: Search condition setting section
Claims (12)
A metadata extraction unit for extracting metadata for an analysis application to be executed in each of the plurality of information processing apparatuses;
Processing for at least one analysis task operating in the analysis application from each of the plurality of information processing apparatuses based on metadata for the analysis application as the analysis application is executed in each of the plurality of information processing apparatuses An information collecting unit for collecting information; And
And a monitoring and providing unit for sorting and displaying processing information for the at least one analysis task.
The analysis application,
And a MapReduce program configured by a Map function for distributing the big data and a Reduce function for integrating the distributed big data analysis results.
The information collecting unit,
Receives processing information for the at least one analysis task operating in the analysis application from each of the plurality of information processing apparatuses, and maps and stores the processing information for the at least one analysis task and the metadata for the analysis application.
The metadata for the analytical application may include:
Identification information of the information processing apparatus in which the analysis application is to be executed, and the location, execution path, contents, rights condition, and use condition of the analysis application in that information processing apparatus.
Wherein the processing information for the at least one analysis task comprises:
Wherein each of the analysis tasks includes resource information or data information accessed to process the big data, identification information of each of the analysis tasks, type information of a function that operates each of the analysis tasks, Information representing a progress state, and at least one of a time at which the operation of each of the analysis tasks is started and an end time.
Further comprising a search condition setting unit for providing a user interface so that a system operator or a developer who operates the parallel distributed processing system can input search conditions of processing information for the at least one analysis task.
The monitoring and providing unit,
And arranges the processing information for the at least one analysis task corresponding to the input search condition on the basis of the operation order and displays the sorting information on the screen.
Extracting metadata for an analysis application to be executed in each of the plurality of information processing apparatuses;
Processing for at least one analysis task operating in the analysis application from each of the plurality of information processing apparatuses based on metadata for the analysis application as the analysis application is executed in each of the plurality of information processing apparatuses Collecting information; And
And arranging and displaying processing information for the at least one analysis task.
The analysis application,
And a MapReduce program configured by a Map function for distributing the big data and a Reduce function for integrating the distributed big data analysis results.
The step of collecting processing information for the at least one analysis task comprises:
Receiving processing information for the at least one analysis task operating in the analysis application from each of the plurality of information processing apparatuses, and mapping and storing the processing information for the at least one analysis task and the metadata for the analysis application.
Further comprising the step of providing a user interface so that a system operator or a developer operating the parallel distributed processing system can input search conditions of processing information for the at least one analysis task.
The step of sorting and displaying the processing information for the at least one analysis task comprises:
And arranging the processing information for the at least one analysis task corresponding to the input search condition on the basis of the operation order and displaying the sorting information on a screen.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020150017775A KR20160096313A (en) | 2015-02-05 | 2015-02-05 | Apparatus and method for monitoring analysis application for analyzing big data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020150017775A KR20160096313A (en) | 2015-02-05 | 2015-02-05 | Apparatus and method for monitoring analysis application for analyzing big data |
Publications (1)
Publication Number | Publication Date |
---|---|
KR20160096313A true KR20160096313A (en) | 2016-08-16 |
Family
ID=56854325
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020150017775A KR20160096313A (en) | 2015-02-05 | 2015-02-05 | Apparatus and method for monitoring analysis application for analyzing big data |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR20160096313A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20180059269A (en) * | 2016-11-25 | 2018-06-04 | 진데이타 주식회사 | Bigdata platform interlock apparatus and method thereof |
KR20180080924A (en) * | 2017-01-05 | 2018-07-13 | 주식회사 엑셈 | Apparatus and method for monitoring the processing result of big data processing server |
CN109214704A (en) * | 2018-09-26 | 2019-01-15 | 广东电网有限责任公司 | A kind of distributed intelligence operation platform, method, apparatus and readable storage medium storing program for executing |
KR20220085365A (en) * | 2020-12-15 | 2022-06-22 | 현대오토에버 주식회사 | Apparatus for monitoring task execution time and operating method of node |
KR102511977B1 (en) | 2022-12-19 | 2023-03-22 | 주식회사 비브라이트 | method and system for providing food product curating service using Ministry of Food and Drug Safety public data |
KR102589677B1 (en) | 2023-02-17 | 2023-10-17 | 주식회사 비브라이트 | device that recommends food products and composition ratios based on data from the Ministry of Food and Drug Safety |
-
2015
- 2015-02-05 KR KR1020150017775A patent/KR20160096313A/en not_active Application Discontinuation
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10534773B2 (en) | Intelligent query parameterization of database workloads | |
KR20160096313A (en) | Apparatus and method for monitoring analysis application for analyzing big data | |
CN109672741B (en) | Micro-service monitoring method and device, computer equipment and storage medium | |
US9501562B2 (en) | Identification of complementary data objects | |
US10346292B2 (en) | Software component recommendation based on multiple trace runs | |
CN110738389A (en) | Workflow processing method and device, computer equipment and storage medium | |
CN112491602B (en) | Behavior data monitoring method and device, computer equipment and medium | |
CN113157947A (en) | Knowledge graph construction method, tool, device and server | |
US20150370616A1 (en) | Method and system for recommending computer products on the basis of observed usage patterns of a computational device of known configuration | |
CN112394908A (en) | Method and device for automatically generating embedded point page, computer equipment and storage medium | |
CN110717647A (en) | Decision flow construction method and device, computer equipment and storage medium | |
CN103108033B (en) | File uploading method and system | |
CN107277019A (en) | Data clear text acquisition methods, device, electric terminal and readable storage medium storing program for executing | |
CN113485999A (en) | Data cleaning method and device and server | |
US10331484B2 (en) | Distributed data platform resource allocator | |
EP3151124A1 (en) | On-board information system and information processing method therefor | |
CN111813517A (en) | Task queue allocation method and device, computer equipment and medium | |
KR101686919B1 (en) | Method and apparatus for managing inference engine based on big data | |
CN112491650B (en) | Method for dynamically analyzing call loop condition between services and related equipment | |
KR20150110063A (en) | Apparatus and method of integrating mapreduce for big data processing | |
CN110059096A (en) | Data version management method, apparatus, equipment and storage medium | |
CN109656894A (en) | Log standardization storage method, device, equipment and readable storage medium storing program for executing | |
CN109684156B (en) | Monitoring method, device, terminal and storage medium based on mixed mode application | |
CN111159213A (en) | Data query method, device, system and storage medium | |
US11645187B2 (en) | Application curation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
A201 | Request for examination | ||
E902 | Notification of reason for refusal | ||
E601 | Decision to refuse application |