CN104915793A - Public information intelligent analysis platform based on big data analysis and mining - Google Patents

Public information intelligent analysis platform based on big data analysis and mining

Info

Publication number
CN104915793A
Authority
CN
China
Prior art keywords
analysis
module
data
information
large data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510368814.5A
Other languages
Chinese (zh)
Inventor
祝守宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
BEIJING THETA NETWORKS CO LTD
Original Assignee
BEIJING THETA NETWORKS CO LTD
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by BEIJING THETA NETWORKS CO LTD
Priority to CN201510368814.5A
Publication of CN104915793A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a public information intelligent analysis platform based on big data analysis and mining. The platform comprises a big data preprocessing module, an intelligent analysis module and an information presentation module. The big data preprocessing module stores and preprocesses enterprise information in a cloud-based manner. The intelligent analysis module, according to the data analysis tasks passed down by the upper layer, makes comprehensive use of the preprocessed data to perform event cause-and-effect analysis, assessment of the company's overall situation and future trend prediction, and returns the analysis results to the upper layer. The information presentation module generates the data analysis tasks, passes them to the intelligent business analysis module and receives the returned results, presents the analysis results of the intelligent analysis module, and provides an operation interface to the user. By flexibly creating and configuring analysis models and adopting a cluster computing method, the platform performs intelligent analysis on the mass data generated in enterprise operation. This addresses the problem that the real-time performance, efficiency and interactivity of conventional data processing methods cannot meet the needs of enterprise big data analysis, helps the user perceive the enterprise situation in real time, and thereby improves enterprise management efficiency and business processing level.

Description

Public information intelligent analysis platform based on big data analysis and mining
Technical field
The present invention relates to the field of big data information processing, and in particular to a public information intelligent analysis platform based on big data analysis and mining.
Background art
Big data refers to data sets that are very large in scale and complex, characterized by the "4V" properties: first, the data volume is huge (Volume) and grows rapidly and continuously; second, data input and output streams are high-speed (Velocity); third, data types and sources are diverse (Variety); fourth, value (Value) density is low.
Most enterprises, especially large enterprises and listed companies, generate massive amounts of business and service data every day, and therefore need to process this information data promptly and efficiently. Such information data is of many kinds (Variety) and large in scale (Volume), and places high demands on input and processing speed (Velocity). The information and knowledge it contains are very rich, but because of data sparsity its value (Value) density is low. In summary, such information data fully exhibits the 4V characteristics of big data and is a typical example of big data.
In enterprises (especially large state-owned enterprises and large marketing enterprises), data volumes are huge and growing sharply, and day-to-day operational analysis needs are becoming increasingly complex. Against this background, the data sorting and information mining involved in analyzing and evaluating the enterprise situation are very complicated, and place high demands on the speed, dimensionality and granularity of data presentation. The real-time performance, efficiency and interactivity of conventional data processing methods cannot meet the needs of enterprise big data analysis. New big data processing techniques are therefore needed for the acquisition, storage, retrieval, sharing, analysis and visualization of massive enterprise data.
Summary of the invention
The object of the present invention is to provide a public information intelligent analysis platform based on big data analysis and mining, which processes enterprise information data by means of big data techniques so as to increase the processing speed of the information data, offer high flexibility and provide a better user experience.
In a first aspect, an embodiment of the present invention provides a public information intelligent analysis platform based on big data analysis and mining, comprising:
a big data preprocessing module, configured to store and preprocess company information in a cloud-based manner, the preprocessing being used to implement load balancing, resource virtualization and distributed data storage management;
an intelligent analysis module, configured to perform, according to instructions sent by the upper layer, big data analysis on the task data sent by the upper layer, to make comprehensive use of the data provided by the big data preprocessing module to carry out event cause-and-effect analysis, assessment of the company's overall situation and future trend prediction, and to return the analysis results to the upper layer;
an information display module, configured to generate data analysis tasks, pass them to the intelligent business analysis module and receive the returned results; to present, by means of interactive visualization techniques, the analysis and evaluation results of the company information and the various kinds of detailed data, level by level, comprehensively and in real time; and to provide an operation interface for the user, the operation interface comprising one or more of the following functions: visual charts, analysis reports, content retrieval and message push/subscription.
With reference to the first aspect, an embodiment of the present invention provides a first possible implementation of the first aspect, wherein, in the public information intelligent analysis platform based on big data analysis and mining, the big data preprocessing module comprises: a big data storage module and a cloud platform management module;
the big data storage module stores the management data of the company information by means of the distributed system architecture Hadoop;
the cloud platform management module and the intelligent business analysis module are used to implement load balancing, resource virtualization, distributed data storage management and application programming interface (API) functions.
With reference to the first possible implementation of the first aspect, an embodiment of the present invention provides a second possible implementation of the first aspect, wherein, in the public information intelligent analysis platform based on big data analysis and mining, the big data storage module is based on a big data technology stack and builds a big data warehouse using Hadoop HDFS + Hive; cubes are built on top of the big data warehouse to provide data support for upper-layer modules.
With reference to the second possible implementation of the first aspect, an embodiment of the present invention provides a third possible implementation of the first aspect, wherein, in the public information intelligent analysis platform based on big data analysis and mining, the intelligent analysis module comprises an analysis module and an index evaluation module;
the analysis module is configured to implement analysis functions such as ad hoc query/combined-condition query, multidimensional OLAP, KPI indicators and MDX query, and also to implement data mining functions such as classification, clustering and association rules, together with flexible parameter configuration;
the index evaluation module is configured to assess the company situation in real time according to preset data information, the preset data information comprising: personnel, finance, materials and business.
With reference to the third possible implementation of the first aspect, an embodiment of the present invention provides a fourth possible implementation of the first aspect, wherein the public information intelligent analysis platform based on big data analysis and mining further comprises an operation monitoring module;
the operation monitoring module is configured to monitor the overall running state of the public information intelligent analysis platform in real time, and to send the monitoring information of all components of the big data platform to the information display module for centralized display.
With reference to the fourth possible implementation of the first aspect, an embodiment of the present invention provides a fifth possible implementation of the first aspect, wherein the public information intelligent analysis platform based on big data analysis and mining further comprises an operation statistics module; the operation statistics module comprises: a timing module and a statistics module;
the timing module is configured to count according to a predetermined period, and to generate a prompt each time the count reaches a preset number;
the statistics module is configured to, according to the prompt, periodically compile statistics on the business monitoring data of jobs completed on the platform, and to extract and store the periodic business monitoring data.
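The interaction between the timing module and the statistics module can be sketched as follows. This is a minimal illustration only: the class names, the `period` parameter and the record format are assumptions made for the example and are not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class TimingModule:
    """Counts ticks and signals when a preset count is reached (hypothetical)."""
    period: int      # assumed: number of ticks per statistics cycle
    _count: int = 0

    def tick(self) -> bool:
        """Advance the counter; return True when the preset count is reached."""
        self._count += 1
        if self._count >= self.period:
            self._count = 0
            return True
        return False


@dataclass
class StatisticsModule:
    """Summarises monitoring records each time the timer issues a prompt."""
    pending: list = field(default_factory=list)
    stored: list = field(default_factory=list)

    def record(self, job_name: str, rows_processed: int) -> None:
        """Buffer one monitoring record for the current cycle."""
        self.pending.append((job_name, rows_processed))

    def on_prompt(self) -> None:
        """Total the records of the finished cycle per job and store the summary."""
        totals = {}
        for job_name, rows in self.pending:
            totals[job_name] = totals.get(job_name, 0) + rows
        self.stored.append(totals)
        self.pending.clear()


timer = TimingModule(period=3)
stats = StatisticsModule()
for rows in [10, 20, 30, 40, 50, 60]:
    stats.record("daily_sales_etl", rows)   # hypothetical job name
    if timer.tick():
        stats.on_prompt()
```

With a period of three ticks, the six records above are stored as two cycle summaries, one per prompt.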
With reference to the fifth possible implementation of the first aspect, an embodiment of the present invention provides a sixth possible implementation of the first aspect, wherein, in the public information intelligent analysis platform based on big data analysis and mining, the operation statistics module further comprises: a statistical analysis module and a comparison module;
the statistical analysis module is configured to compile statistics on and analyze the running state of jobs on the public information intelligent analysis platform; it analyzes job run-time information to obtain resource usage statistics, data input/output statistics, and execution information statistics and trends for the running job;
the comparison module is configured to analyze periodic job information and compare the individual runs of the same job within a given time period, so as to find the running trend of the job and any anomalies.
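One way to picture the comparison of repeated runs of the same job is the sketch below. The patent does not specify how trends or anomalies are computed; the average-change trend and the median-based threshold used here are assumptions chosen for illustration, as are the function names.

```python
from statistics import median


def run_trend(durations):
    """Average change in duration between consecutive runs (assumed trend metric)."""
    if len(durations) < 2:
        return 0.0
    return (durations[-1] - durations[0]) / (len(durations) - 1)


def find_anomalies(durations, tolerance=0.5):
    """Indices of runs whose duration differs from the median by more than
    `tolerance` times the median (assumed anomaly rule)."""
    mid = median(durations)
    return [i for i, d in enumerate(durations) if abs(d - mid) > tolerance * mid]


# Durations in seconds of five runs of the same periodic job (made-up data).
runs = [100, 104, 98, 250, 102]
anomalous_runs = find_anomalies(runs)
trend = run_trend(runs)
```

Here the fourth run stands out against the others, while the overall trend across the period is almost flat.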
With reference to the sixth possible implementation of the first aspect, an embodiment of the present invention provides a seventh possible implementation of the first aspect, wherein, in the public information intelligent analysis platform based on big data analysis and mining, the statistical analysis module comprises: a first analysis module and a first statistics module;
the first analysis module is configured to analyze the network traffic, I/O reads and writes, resource usage, and information on the running Map and Reduce tasks during job execution;
the first statistics module is configured to compute, for the job execution process, the computation skew rate, the local-data operation optimization rate and the data processing rate trend.
With reference to the seventh possible implementation of the first aspect, an embodiment of the present invention provides an eighth possible implementation of the first aspect, wherein, in the public information intelligent analysis platform based on big data analysis and mining, the statistical analysis module comprises: a second analysis module and a second statistics module;
the second analysis module is configured to analyze the statistics collected after each run of the same job within a given period;
the second statistics module is configured to obtain the running trends of the job over that period, the running trends including: changes in the volume of data the job operates on, changes in job execution time, and changes in the job's resource usage.
With reference to the eighth possible implementation of the first aspect, an embodiment of the present invention provides a ninth possible implementation of the first aspect, wherein, in the public information intelligent analysis platform based on big data analysis and mining, the cloud data support module uses the cloud platform to perform ETL scheduling, thereby cleaning and integrating the data;
the analysis report in the information display module comprises the business type, business title, an introduction to the basic situation, cause analysis, analysis charts, related reports and recommendations, and can be edited online and downloaded.
The public information intelligent analysis platform based on big data analysis and mining provided by the embodiments of the present invention flexibly creates and configures analysis models and adopts a cluster computing method to perform intelligent analysis on the mass data generated in enterprise operation. It addresses the problem that the real-time performance, efficiency and interactivity of conventional data processing methods cannot meet the needs of enterprise big data analysis, helps the user perceive the enterprise situation in real time, and thereby improves enterprise management efficiency and business processing level.
Compared with the prior art, the present invention has the following advantages:
1. Fast processing: the system architecture uses big data techniques to schedule computing and storage tasks sensibly, making full use of the computing capacity of every cluster node in the system; when business demand grows, the system can be scaled out and its performance improved simply by adding cluster nodes.
2. Better user experience: the system runs on multiple kinds of terminals, supports real-time visualization of enterprise situation indicators at every level, and provides a simple and intuitive mode of interaction.
3. High flexibility: analysis models can be created and configured flexibly according to the actual situation of the enterprise; the system has a layered design and is easy to deploy, upgrade and maintain.
To make the above objects, features and advantages of the present invention more apparent, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly introduced below. It should be understood that the following drawings show only some embodiments of the present invention and therefore should not be regarded as limiting its scope; those of ordinary skill in the art can obtain other related drawings from these drawings without creative effort.
Fig. 1 shows a schematic structural diagram of a public information intelligent analysis platform based on big data analysis and mining provided by an embodiment of the present invention;
Fig. 2 shows a schematic structural diagram of the big data preprocessing module in a public information intelligent analysis platform based on big data analysis and mining provided by an embodiment of the present invention;
Fig. 3 shows a schematic structural diagram of the intelligent analysis module in a public information intelligent analysis platform based on big data analysis and mining provided by an embodiment of the present invention;
Fig. 4 shows a schematic structural diagram of another public information intelligent analysis platform based on big data analysis and mining provided by an embodiment of the present invention;
Fig. 5 shows a schematic structural diagram of the operation statistics module in a public information intelligent analysis platform based on big data analysis and mining provided by an embodiment of the present invention;
Fig. 6 shows a schematic structural diagram of the operation statistics module in another public information intelligent analysis platform based on big data analysis and mining provided by an embodiment of the present invention;
Fig. 7 shows a schematic structural diagram of the statistical analysis module in a public information intelligent analysis platform based on big data analysis and mining provided by an embodiment of the present invention;
Fig. 8 shows a schematic structural diagram of the statistical analysis module in another public information intelligent analysis platform based on big data analysis and mining provided by an embodiment of the present invention.
Description of main reference numerals:
11, big data preprocessing module; 21, intelligent analysis module; 31, information display module; 41, operation monitoring module; 51, operation statistics module; 111, big data storage module; 112, cloud platform management module; 211, analysis module; 212, index evaluation module; 511, timing module; 512, statistics module; 513, statistical analysis module; 514, comparison module; 5131, first analysis module; 5132, first statistics module; 5133, second analysis module; 5134, second statistics module.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the drawings herein may be arranged and designed in a variety of different configurations. Therefore, the following detailed description of the embodiments of the present invention provided in the drawings is not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the present invention. All other embodiments obtained by those skilled in the art on the basis of the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware. It has much in common with existing distributed file systems, but its differences from them are also significant. HDFS is highly fault-tolerant and is suitable for deployment on low-cost machines. HDFS provides high-throughput data access and is well suited to applications with large data sets. HDFS relaxes some POSIX requirements in order to enable streaming access to file system data. HDFS was originally built as infrastructure for the Apache Nutch search engine project.
The present invention is directed at big data processing. To solve the problems of the prior art, the invention provides a public information intelligent analysis platform based on big data analysis and mining. It adopts a cluster computing method, integrates the enterprise's mass data into a big data warehouse built on a cloud platform, and creates enterprise situation analysis models based on cubes. This makes it possible to analyze the enterprise's massive multidimensional data comprehensively, which greatly improves the comprehensiveness and accuracy of the analysis results. The system is scalable and can provide data processing more than a hundred times faster than a conventional server-based system. The system also offers visualization, helping company managers monitor key enterprise situation evaluation indicators and perceive the enterprise situation in real time.
The public information intelligent analysis platform based on big data analysis and mining provided by the present invention is described in detail below with reference to Fig. 1 to Fig. 8:
Referring to Fig. 1, the invention provides a public information intelligent analysis platform based on big data analysis and mining, comprising:
a big data preprocessing module 11, configured to store and preprocess company information in a cloud-based manner, the preprocessing being used to implement load balancing, resource virtualization and distributed data storage management;
an intelligent analysis module 21, configured to perform, according to instructions sent by the upper layer, big data analysis on the task data sent by the upper layer, to make comprehensive use of the data provided by the big data preprocessing module 11 to carry out event cause-and-effect analysis, assessment of the company's overall situation and future trend prediction, and to return the analysis results to the upper layer;
an information display module 31, configured to generate data analysis tasks, pass them to the intelligent business analysis module and receive the returned results; to present, by means of interactive visualization techniques, the analysis and evaluation results of the company information and the various kinds of detailed data, level by level, comprehensively and in real time; and to provide an operation interface for the user, the operation interface comprising one or more of the following functions: visual charts, analysis reports, content retrieval and message push/subscription.
The public information intelligent analysis platform based on big data analysis and mining provided by the embodiments of the present invention flexibly creates and configures analysis models and adopts a cluster computing method to perform intelligent analysis on the mass data generated in enterprise operation. It addresses the problem that the real-time performance, efficiency and interactivity of conventional data processing methods cannot meet the needs of enterprise big data analysis, helps the user perceive the enterprise situation in real time, and thereby improves enterprise management efficiency and business processing level.
Compared with the prior art, the present invention has the following advantages:
1. Fast processing: the system architecture uses big data techniques to schedule computing and storage tasks sensibly, making full use of the computing capacity of every cluster node in the system; when business demand grows, the system can be scaled out and its performance improved simply by adding cluster nodes.
2. Better user experience: the system runs on multiple kinds of terminals, supports real-time visualization of enterprise situation indicators at every level, and provides a simple and intuitive mode of interaction.
3. High flexibility: analysis models can be created and configured flexibly according to the actual situation of the enterprise; the system has a layered design and is easy to deploy, upgrade and maintain.
In this embodiment, the whole analysis platform implements its cloud platform on the open-source Hadoop framework. The cluster hardware configuration is as follows: a CPU with 16 cores and 32 threads, 64 GB or 128 GB of memory, and multiple hard disks of a preset rotational speed (total storage up to 24 TB) connected directly through the motherboard controller; the cluster is built on gigabit Ethernet. The number and rotational speed of the hard disks can be set as required, for example 20 disks at 3600 r/s.
A Hadoop cluster has four basic roles: the NameNode (including the Secondary NameNode), the JobTracker, the TaskTracker and the DataNode. The NameNode coordinates data storage on the cluster; the JobTracker coordinates data processing tasks; the TaskTracker carries out tasks such as data acquisition and data processing; the DataNode stores data. Most nodes in the cluster need to act as both DataNode and TaskTracker at the same time.
On top of the Hadoop cluster, distributed parallel task processing is supported through Map/Reduce. Map/Reduce is a programming model for parallel computation over large volumes of data and is also an efficient task scheduling model: it divides a large task into many finer-grained subtasks and schedules these subtasks onto idle processing nodes, which prevents slow nodes from prolonging the completion time of the whole task.
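The Map/Reduce model described above can be illustrated with a single-process sketch. It does not use Hadoop; the function names, the `(department, amount)` record format and the sample data are assumptions made for the example, and a real cluster would run the map and reduce calls on separate nodes.

```python
from collections import defaultdict


def map_phase(record):
    """Map: emit a (key, value) pair for one input record."""
    department, amount = record
    yield department, amount


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def run_job(splits):
    """Run map over every record of every input split, then shuffle and reduce."""
    pairs = [pair for split in splits for record in split for pair in map_phase(record)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Two input splits, standing in for blocks that different nodes would process.
splits = [
    [("finance", 120), ("hr", 30)],
    [("finance", 80), ("research", 50)],
]
totals = run_job(splits)
```

Because each split is mapped independently, the splits are the natural unit of the fine-grained subtasks that the scheduler hands to idle nodes.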
In the present invention, the whole intelligent analysis platform comprises three parts: the big data preprocessing module 11, the intelligent analysis module 21 and the information display module 31. A big data warehouse is built in the big data preprocessing module 11 to store the key raw data extracted from the data sources by the ETL process; cubes are built on top of the big data warehouse to provide data support for system analysis and display.
The big data preprocessing module 11 is also used, through the cloud platform management module 112, to implement load balancing across the underlying node devices, resource virtualization, distributed data storage management and fault-tolerance policy management, and to provide API functions, thereby achieving big data processing and management.
The above data sources are all the independent business systems and databases of the enterprise's individual business departments, including human resources department data, finance department data, research and development department data, marketing department data, assessment office data, Internet department data and integrated management data. Each of these departments may be subdivided into many smaller units; for example, the marketing department may include a secretarial group, a market group and so on, in which case their data is included under the marketing department.
The above big data warehouse is implemented on HDFS and Hive and uses distributed storage to bring together the mass data from the enterprise's separate business systems, providing data for the cubes. The data in the big data warehouse is stored in the form of dimension tables and fact tables. A dimension is an attribute of the data and represents an angle from which the data is analyzed; the types are regular dimensions, time dimensions and slowly changing dimensions. A fact table is the main table that stores the data to be analyzed and contains only a primary key, foreign keys and measures.
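The dimension/fact layout can be sketched with a small star schema. The patent builds its warehouse on HDFS and Hive; the example below substitutes Python's built-in `sqlite3` module purely so that it runs locally, and the table names, columns and rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimension tables: attributes that give the angles of analysis.
CREATE TABLE dim_department (dept_id INTEGER PRIMARY KEY, dept_name TEXT);
CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, year INTEGER, quarter INTEGER);
-- Fact table: primary key, foreign keys to the dimensions, and one measure.
CREATE TABLE fact_expense (
    expense_id INTEGER PRIMARY KEY,
    dept_id INTEGER REFERENCES dim_department(dept_id),
    time_id INTEGER REFERENCES dim_time(time_id),
    amount REAL
);
INSERT INTO dim_department VALUES (1, 'Finance'), (2, 'Research');
INSERT INTO dim_time VALUES (1, 2015, 1), (2, 2015, 2);
INSERT INTO fact_expense VALUES (1, 1, 1, 100.0), (2, 1, 2, 150.0), (3, 2, 1, 300.0);
""")

# Analyse the measure along two dimensions by joining the fact table to them.
rows = conn.execute("""
SELECT d.dept_name, t.quarter, SUM(f.amount)
FROM fact_expense AS f
JOIN dim_department AS d ON d.dept_id = f.dept_id
JOIN dim_time AS t ON t.time_id = f.time_id
GROUP BY d.dept_name, t.quarter
ORDER BY d.dept_name, t.quarter
""").fetchall()
```

A query of the same shape could be expressed in HiveQL against the warehouse; only the storage engine differs in this sketch.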
In this embodiment of the present invention, the information display module 31 uses Java web technology to build a system client in browser/server (B/S) mode, implements single sign-on control, and provides the display and operation interface for the user. It uses the open-source ExtJS framework to create rich graphics and charts, presenting the business information analysis and evaluation results and the various kinds of detailed data, level by level, comprehensively and in real time. The system client offers functions such as visual charts, analysis reports, content retrieval and message push/subscription, and can run in a browser on mobile terminals and PCs.
In this embodiment of the present invention, the visual charts include line charts, bar charts, pie charts, scatter plots, area charts, radar charts, dual-axis charts, gauge charts, maps and other charts. Operations such as filtered analysis, series dragging, chart linkage and hotspot links are supported. The analysis report comprises a title, an introduction to the basic situation, cause analysis, analysis charts, related reports, recommendations and other content, and supports operations such as online editing and downloading.
Further, referring to Fig. 2, in the public information intelligent analysis platform based on big data analysis and mining, the big data preprocessing module 11 comprises: a big data storage module 111 and a cloud platform management module 112;
the big data storage module 111 stores the management data of the company information by means of the distributed system architecture Hadoop;
the cloud platform management module 112 and the intelligent business analysis module 211 are used to implement load balancing, resource virtualization, distributed data storage management and application programming interface (API) functions.
The ETL process mentioned above is the process of data extraction, transformation and loading. Through load balancing, the ETL work is distributed evenly across the cluster and run in parallel, which speeds up data import. The ETL process is implemented on Hive. Hive is a data warehouse infrastructure built on Hadoop; it provides a set of tools for extracting, transforming and loading data and converts SQL statements into Map/Reduce tasks for execution, forming a mechanism for storing, querying and analyzing large-scale data in HDFS. It offers interfaces such as a shell, JDBC/ODBC, Thrift and the Web. Real-time data is added to the big data warehouse at preset intervals (for example, every half hour).
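The three ETL stages can be sketched as a chain of plain Python functions. This is not Hive; the field names, the cleaning rules and the in-memory list standing in for the warehouse are all assumptions made for illustration.

```python
def extract(source_rows):
    """Extract: pull only the fields the warehouse needs from each source row."""
    for row in source_rows:
        yield {"dept": row.get("dept"), "amount": row.get("amount")}


def transform(rows):
    """Transform: drop incomplete rows, normalise names and convert types."""
    for row in rows:
        if row["dept"] is None or row["amount"] is None:
            continue
        yield {"dept": row["dept"].strip().lower(), "amount": float(row["amount"])}


def load(rows, warehouse):
    """Load: append the cleaned rows to the target store (a list in this sketch)."""
    for row in rows:
        warehouse.append(row)
    return warehouse


# Made-up rows from a hypothetical departmental business system.
source = [
    {"dept": " Finance ", "amount": "120.5", "note": "q1"},
    {"dept": None, "amount": "10"},
    {"dept": "HR", "amount": 30},
]
warehouse = load(transform(extract(source)), [])
```

Because the stages are generators over independent rows, the same pipeline could be applied to separate partitions of the source in parallel, which is the property load balancing exploits.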
In addition, the above cube (Cube) is a multidimensional data model made up of multiple dimensions and measures. It is implemented on a single data node in the cluster and is formed by combining different business dimensions from the big data warehouse; it maps to the business information analysis models and provides multidimensional angles of analysis. Each business module corresponds to at least one cube, and the business data involved in a cube is not limited to the data of a single business system; for example, the scientific research cube combines data from several business dimensions such as personnel, research and development, and finance.
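A cube in this sense can be pictured as the measure pre-aggregated over every combination of dimensions. The sketch below builds such a structure in memory; the function name, the dimension names and the fact rows are hypothetical, and a production cube would be built by an OLAP engine rather than by hand.

```python
from collections import defaultdict
from itertools import combinations


def build_cube(facts, dimensions, measure):
    """Sum `measure` for every subset of `dimensions`; the empty subset
    holds the grand total."""
    cube = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            cells = defaultdict(float)
            for fact in facts:
                key = tuple(fact[d] for d in dims)
                cells[key] += fact[measure]
            cube[dims] = dict(cells)
    return cube


# Made-up fact rows combining a department dimension and a time dimension.
facts = [
    {"dept": "research", "year": 2014, "cost": 10.0},
    {"dept": "research", "year": 2015, "cost": 20.0},
    {"dept": "finance", "year": 2015, "cost": 5.0},
]
cube = build_cube(facts, ("dept", "year"), "cost")
research_total = cube[("dept",)][("research",)]
```

Looking up `cube[("year",)]` instead gives the same measure from the time angle, which is what "multidimensional angles of analysis" amounts to in practice.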
Further, in the public information intelligent analysis platform based on big data analysis and mining,
the big data storage module 111 is based on a big data technology stack and builds a big data warehouse using Hadoop HDFS + Hive; cubes are built on top of the big data warehouse to provide data support for upper-layer modules.
Further, referring to Fig. 3, in the public information intelligent analysis platform based on big data analysis and mining, the intelligent analysis module 21 comprises an analysis module 211 and an index evaluation module 212;
the analysis module 211 is configured to implement analysis functions such as ad hoc query/combined-condition query, multidimensional OLAP, KPI indicators and MDX query, and also to implement data mining functions such as classification, clustering and association rules, together with flexible parameter configuration;
the index evaluation module 212 is configured to assess the company situation in real time according to preset data information, the preset data information comprising: personnel, finance, materials and business.
In the present embodiment, the extemporaneous inquiry/combination condition query function of analysis module 211 realizes based on Hive instrument.
Multidimensional OLAP, the MDX query function of analysis module 211 realize based on cube.Multidimensional OLAP (Multi-dimension on-line analytical process) is the on-line analytical processing of directly enrolling cube, user can the institute that combines of the different aspect of observed data collection and different aspect likely; Multidimensional OLAP process provides data storage management by multidimensional OLAP server.MDX (Multi-Dimensionalexpressions) is Multidimensional Expressions, supports definition and the operation of multidimensional data and multi dimensional object, additionally provides the expanded function such as collection of functions and user-defined function.MDX inquiry comprises the contents such as request of data (SELECT clause), starting point (FROM clause) and screening (WHERE clause), can concentrate data specific part of extracting from multidimensional data.Preferably, SQLServerAnalysisServices is adopted to realize multidimensional OLAP, MDX inquiry.
The data mining methods of the analysis module 211, such as classification, clustering and association rules, are provided by Mahout. Mahout is an open source project of the Apache Software Foundation (ASF) built on Hadoop. It supports HDFS access and provides scalable implementations of classic machine learning algorithms, including clustering, classification, distributed collaborative filtering and frequent itemset mining. These algorithms are expressed in the Map/Reduce model so that they suit a cloud environment, which greatly increases the input data volume the algorithms can handle and their processing performance.
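To make the Map/Reduce formulation concrete, here is a minimal single-process Python sketch of frequent item pair counting, a simple building block of association rule mining. It is not Mahout's API; the function names and the `min_support` parameter are assumptions made for illustration, and a real deployment would distribute the map and reduce steps across the cluster.

```python
from collections import Counter
from itertools import combinations


def map_pairs(transaction):
    """Map step: emit each unordered item pair in one transaction once."""
    items = sorted(set(transaction))
    return [(pair, 1) for pair in combinations(items, 2)]


def reduce_counts(emitted):
    """Reduce step: sum the counts emitted for each pair."""
    totals = Counter()
    for pair, count in emitted:
        totals[pair] += count
    return totals


def frequent_pairs(transactions, min_support):
    """Return the pairs that occur in at least `min_support` transactions."""
    emitted = [kv for t in transactions for kv in map_pairs(t)]
    totals = reduce_counts(emitted)
    return {pair: n for pair, n in totals.items() if n >= min_support}


# Hypothetical transactions; duplicates inside one transaction count once.
transactions = [["a", "b", "c"], ["a", "b"], ["b", "c"], ["a", "b", "a"]]
result = frequent_pairs(transactions, min_support=2)
```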
The parameter configuration of the analysis module 211 is implemented by configuring the cube.
Further, referring to Fig. 4, the public information intelligent analysis platform based on big data analysis and mining also comprises an operation monitoring module 41;
The operation monitoring module 41 monitors the overall running state of the public information intelligent analysis platform in real time and sends the monitoring information of all components in the big data platform to the information display module 31 for centralized display.
In the present embodiment, the operation monitoring module 41 is also arranged to monitor all data running on the platform and to display the monitoring data in real time.
Specifically, regarding the job monitoring subsystem of the big data platform: offline computation and data analysis are the platform's key jobs;
The job type on existing big data platforms is the MapReduce job. The MapReduce job monitoring function collects the data information, run information and statistical information of MapReduce jobs on Hadoop. Because of the way Hadoop manages jobs, running jobs and completed jobs have to be monitored by different methods. In the present embodiment, the run information of a running job can be obtained through the RESTful API in Hadoop. After a job finishes, Hadoop stores the final state information and statistical information of the completed job in a directory on HDFS, so the history of completed jobs can be obtained by accessing the job history files on HDFS.
The present embodiment can also implement real-time job monitoring; specifically, the RESTful interface provided by YARN is used to obtain the run information of running jobs;
To monitor the job execution process, this module uses a daemon process to collect job monitoring data. A one-minute timer is set and triggers a collection task every minute; the task generates the URL of the RESTful interface for the job monitoring information, sends the RESTful request, and stores the result in a database. In this way the trends of a job during execution can be obtained; these trends reflect the network and IO behaviour during the run and provide a basis for task analysis.
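The collection loop described above might be sketched in Python as follows. This is an illustration under stated assumptions, not the embodiment's implementation: the URL follows the MapReduce Application Master REST path for job counters, the host name, table name `job_samples` and function names are hypothetical, and the HTTP call is passed in as a `fetch` function so that the sketch runs without a cluster.

```python
import json
import sqlite3
import time


def counters_url(proxy_base, app_id, job_id):
    """Build the REST URL for the counters of one running MapReduce job."""
    return f"{proxy_base}/proxy/{app_id}/ws/v1/mapreduce/jobs/{job_id}/counters"


def open_store(path=":memory:"):
    """Open the sample database and create the (hypothetical) table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS job_samples (ts REAL, url TEXT, payload TEXT)"
    )
    return conn


def collect_once(conn, url, fetch):
    """Send one request through `fetch(url)` and store the result."""
    payload = fetch(url)
    conn.execute(
        "INSERT INTO job_samples (ts, url, payload) VALUES (?, ?, ?)",
        (time.time(), url, json.dumps(payload)),
    )
    conn.commit()


def run_collector(conn, url, fetch, interval_s=60, max_samples=None, sleep=time.sleep):
    """Trigger one collection task every `interval_s` seconds."""
    taken = 0
    while max_samples is None or taken < max_samples:
        collect_once(conn, url, fetch)
        taken += 1
        if max_samples is None or taken < max_samples:
            sleep(interval_s)


# Demonstration with a stub fetcher and a no-op sleep, so no cluster is needed.
conn = open_store()
url = counters_url("http://rm.example:8088", "application_1_0001", "job_1_0001")
run_collector(conn, url, fetch=lambda u: {"jobCounters": {}},
              max_samples=3, sleep=lambda s: None)
```

In a real daemon, `fetch` would issue the HTTP GET and parse the JSON response, and `max_samples` would be left as `None` so the loop keeps sampling for as long as the job runs.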
Further, referring to Fig. 4 and Fig. 5, the public information intelligent analysis platform based on big data analysis and mining also comprises an operation statistics module 51; the operation statistics module 51 comprises a timing module 511 and a statistics module 512;
The timing module 511 counts according to a predetermined period and generates prompt information each time the count reaches the preset period;
The statistics module 512, on receiving the prompt information, periodically compiles statistics on the business monitoring data of jobs completed on the platform, and extracts and stores the periodic business monitoring data.
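The interplay of the timing module and the statistics module can be sketched as below. The class names, the tick-based counting and the summary fields (`jobs`, `total_s`) are assumptions made for this example; the source only states that a prompt is generated each time the preset period is reached and that statistics are then compiled and stored.

```python
class TimingModule:
    """Counts ticks and signals each time the preset period is reached."""

    def __init__(self, period):
        if period <= 0:
            raise ValueError("period must be positive")
        self.period = period
        self.count = 0

    def tick(self):
        """Advance the counter; return True when a prompt should be issued."""
        self.count += 1
        if self.count == self.period:
            self.count = 0
            return True
        return False


class StatisticsModule:
    """Summarizes monitoring data each time it receives a prompt."""

    def __init__(self):
        self.pending = []  # monitoring data gathered in the current period
        self.stored = []   # one summary per completed period

    def record(self, job_name, duration_s):
        self.pending.append((job_name, duration_s))

    def on_prompt(self):
        """Extract and store a summary of the period that just ended."""
        summary = {
            "jobs": len(self.pending),
            "total_s": sum(d for _, d in self.pending),
        }
        self.stored.append(summary)
        self.pending = []
        return summary


# Hypothetical run: three completed jobs, with a period of three ticks.
timer = TimingModule(period=3)
stats = StatisticsModule()
for name, seconds in [("etl", 12.0), ("report", 18.0), ("etl", 9.0)]:
    stats.record(name, seconds)
    if timer.tick():
        stats.on_prompt()
```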
In the present embodiment, the operation statistics module 51 is also used to compile statistics on the monitoring data within a preset time period in real time;
Specifically, the operation statistics module 51 monitors the overall running state of the big data platform in real time and displays the monitoring information of all components of the platform in one place, chiefly integrating the running status displays of the distributed file system HDFS, the resource management framework YARN, the distributed coordination service ZooKeeper and the NoSQL database HBase.
Further, referring to Fig. 6, in the public information intelligent analysis platform based on big data analysis and mining, the operation statistics module 51 also comprises a statistical analysis module 513 and a comparison module 514;
The statistical analysis module 513 performs statistics and analysis on how businesses run on the public information intelligent analysis platform; it analyzes the business run information to obtain resource usage statistics, data input/output statistics, execution information statistics and trends for the run;
The comparison module 514 analyzes periodic business information, compares each run of the same business within a given time period, and identifies the running trend and anomalies of that business.
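One simple way to compare repeated runs of the same business and flag anomalies is a z-score test on a per-run metric such as execution time. The source does not say which method the comparison module uses, so the technique, the function name and the default threshold below are all assumptions.

```python
from statistics import mean, pstdev


def find_abnormal_runs(durations, threshold=2.0):
    """Return indices of runs that deviate strongly from the mean duration."""
    if len(durations) < 2:
        return []
    mu = mean(durations)
    sigma = pstdev(durations)
    if sigma == 0:
        return []  # every run took the same time, so nothing stands out
    return [
        i for i, d in enumerate(durations)
        if abs(d - mu) / sigma > threshold
    ]


# Hypothetical execution times in seconds for seven daily runs of one job.
durations = [100, 102, 98, 101, 99, 100, 300]
abnormal = find_abnormal_runs(durations)
```

A z-score is easy to compute but is itself distorted by large outliers in short series; a median-based measure may be a better fit when only a handful of runs are available.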
Further, referring to Fig. 7, in the public information intelligent analysis platform based on big data analysis and mining, the statistical analysis module 513 comprises a first analysis module 5131 and a first statistics module 5132;
The first analysis module 5131 analyzes the network traffic, IO reads and writes, resource usage, and the information of running Map and Reduce tasks during a business run;
The first statistics module 5132 compiles the computation bias rate, the local data job optimization rate and the data processing rate trend during business execution.
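The source names these three statistics without defining them, so the formulas below are assumptions offered only as one plausible reading: computation bias as the share of elapsed time spent on CPU, local data optimization as the fraction of map tasks that ran on a node holding their input (using the Hadoop job counters `DATA_LOCAL_MAPS` and `TOTAL_LAUNCHED_MAPS`), and processing rate as bytes per second between consecutive samples.

```python
def computation_bias_rate(cpu_ms, elapsed_ms):
    """Assumed definition: CPU time as a share of elapsed time."""
    return cpu_ms / elapsed_ms if elapsed_ms > 0 else 0.0


def data_local_rate(counters):
    """Assumed definition: data-local map tasks over all launched map tasks."""
    launched = counters.get("TOTAL_LAUNCHED_MAPS", 0)
    return counters.get("DATA_LOCAL_MAPS", 0) / launched if launched else 0.0


def processing_rates(samples):
    """Bytes per second between consecutive (timestamp_s, bytes_done) samples."""
    rates = []
    for (t0, b0), (t1, b1) in zip(samples, samples[1:]):
        if t1 > t0:
            rates.append((b1 - b0) / (t1 - t0))
    return rates


# Hypothetical counter values and one-minute samples.
bias = computation_bias_rate(cpu_ms=3000, elapsed_ms=4000)
local = data_local_rate({"TOTAL_LAUNCHED_MAPS": 8, "DATA_LOCAL_MAPS": 6})
rates = processing_rates([(0, 0), (60, 600), (120, 1800)])
```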
Further, referring to Fig. 8, in the public information intelligent analysis platform based on big data analysis and mining, the statistical analysis module 513 comprises a second analysis module 5133 and a second statistics module 5134;
The second analysis module 5133 analyzes the statistical information recorded after each run of the same business over a period of time;
The second statistics module 5134 obtains the running trends of the business over that period, namely the change in the volume of data the business operates on, the change in its execution time, and the change in its resource usage.
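The three trends listed above could, for instance, be summarized as least-squares slopes over the run index. The source does not prescribe a method, so the use of a linear fit and the field names `input_bytes`, `elapsed_s` and `memory_mb` are assumptions for illustration.

```python
def trend_slope(values):
    """Least-squares slope of `values` against run index 0..n-1."""
    n = len(values)
    if n < 2:
        return 0.0
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    num = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    den = sum((i - x_mean) ** 2 for i in range(n))
    return num / den


def summarize_trends(runs, keys=("input_bytes", "elapsed_s", "memory_mb")):
    """Per-metric slope across successive runs of the same business."""
    return {key: trend_slope([run[key] for run in runs]) for key in keys}


# Hypothetical statistics from four successive runs.
runs = [
    {"input_bytes": 100, "elapsed_s": 10, "memory_mb": 512},
    {"input_bytes": 200, "elapsed_s": 12, "memory_mb": 512},
    {"input_bytes": 300, "elapsed_s": 14, "memory_mb": 512},
    {"input_bytes": 400, "elapsed_s": 16, "memory_mb": 512},
]
trends = summarize_trends(runs)
```

A positive slope for `elapsed_s` alongside a flat `input_bytes` would suggest the job is slowing down for reasons other than data growth.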
Specifically, in a big data platform some jobs produce a large amount of intermediate data during execution. When platform storage is insufficient, this intermediate data can severely reduce the platform's computing capacity, slow down the whole cluster, and cause large-scale task failures. It is therefore necessary to compile statistics on job run information: after a task starts, its run information is collected at regular intervals, and the resulting series of data is analyzed and the task state displayed, so that trends during job execution can be analyzed and smooth execution ensured. New data is imported into the big data platform every day, and specific programs must be run daily to process and analyze it; for these daily jobs, regularities can be found from historical job statistics, so that job behaviour can be judged and predicted. The system compiles the following statistical analysis information for jobs on the big data platform:
Network traffic: the traffic a Hadoop job generates when pulling data during execution. Network traffic is produced in three phases: the Map side reading input data from HDFS, the shuffle phase fetching the Map-side output, and the Reduce side writing its output to HDFS after it finishes. The traffic between a job and HDFS can be obtained from two IO-related counters, HDFS_BYTES_READ and HDFS_BYTES_WRITTEN, in the FileSystem Counters group. The Reduce shuffle bytes counter in the MapReduce Framework group gives the traffic generated by pulling data during shuffle, which is also the total amount of data transferred from the Map side to the Reduce side. The network traffic of a Hadoop job is the sum of these three counters.
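The sum described above is straightforward to express in Python. The sketch assumes the counters have already been flattened into a plain dictionary keyed by counter name (with `REDUCE_SHUFFLE_BYTES` as the name of the Reduce shuffle bytes counter); that dictionary layout and the function name are assumptions, not part of the source.

```python
def job_network_bytes(counters):
    """Total network traffic of a job: HDFS reads + HDFS writes + shuffle."""
    return (
        counters.get("HDFS_BYTES_READ", 0)
        + counters.get("HDFS_BYTES_WRITTEN", 0)
        + counters.get("REDUCE_SHUFFLE_BYTES", 0)
    )


# Hypothetical counter snapshot for one job.
snapshot = {
    "HDFS_BYTES_READ": 100,
    "HDFS_BYTES_WRITTEN": 50,
    "REDUCE_SHUFFLE_BYTES": 30,
}
total = job_network_bytes(snapshot)
```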
IO reads and writes: analyzing a job's IO read and write data shows which file system the job's operations favour during execution. By default Hadoop records all of a job's I/O operations on file systems; the statistics are in the FileSystemCounters group, which is analyzed as follows:
1. HDFS_BYTES_READ is the number of bytes the job reads from HDFS during execution. Because only the Map phase of a MapReduce job reads data from HDFS, it is also the total amount of data Map obtains from HDFS, including split metadata.
2. HDFS_BYTES_WRITTEN is the total number of bytes the job writes to HDFS during execution. The Reduce phase of a MapReduce job writes its results to HDFS when it finishes; if the job has no Reduce phase, the output of the Map phase is stored in HDFS after the Map phase finishes.
3. HDFS_READ_OPS is the total number of read operations the job performs on HDFS during execution.
4. HDFS_WRITE_OPS is the total number of write operations the job performs on HDFS during execution.
5. FILE_BYTES_READ is the number of bytes the job reads from the local disk during execution; both the Map and Reduce sides of a MapReduce job sort data, which requires reading intermediate data from the local disk.
6. FILE_BYTES_WRITTEN is the number of bytes the job writes to the local disk during execution; both the Map and Reduce sides of a MapReduce job sort data, which requires writing temporary intermediate results to the local disk.
7. FILE_READ_OPS is the number of read operations on the local disk.
8. FILE_WRITE_OPS is the number of write operations on the local disk.
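The file system bias mentioned above can be estimated by comparing the bytes moved through HDFS with the bytes moved through the local disk. The following sketch assumes a flat dictionary of counter values; the function name and the `hdfs_share` field are hypothetical.

```python
def io_profile(counters):
    """Split a job's IO volume between HDFS and the local file system."""
    hdfs = counters.get("HDFS_BYTES_READ", 0) + counters.get("HDFS_BYTES_WRITTEN", 0)
    local = counters.get("FILE_BYTES_READ", 0) + counters.get("FILE_BYTES_WRITTEN", 0)
    total = hdfs + local
    return {
        "hdfs_bytes": hdfs,
        "local_bytes": local,
        # Share of all IO bytes that went through HDFS (0.0 if no IO at all).
        "hdfs_share": hdfs / total if total else 0.0,
    }


# Hypothetical counter snapshot for one job.
profile = io_profile({
    "HDFS_BYTES_READ": 600,
    "HDFS_BYTES_WRITTEN": 200,
    "FILE_BYTES_READ": 100,
    "FILE_BYTES_WRITTEN": 100,
})
```

A low `hdfs_share` would indicate a job dominated by local sort and spill traffic, which ties in with the intermediate data concern raised earlier in the description.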
Further, in the public information intelligent analysis platform based on big data analysis and mining, the cloud data support module uses the cloud platform to schedule ETL, cleaning and integrating the data;
The analysis report in the information display module 31 comprises the business type, business name, a basic overview, cause analysis, analysis charts, related reports and suggestions, and can be edited online and downloaded.
The public information intelligent analysis platform based on big data analysis and mining provided by the embodiment of the present invention flexibly creates and configures analysis models and uses cluster computing to analyze the mass data of enterprise operations intelligently. This solves the problem that the real-time performance, efficiency and interactivity of traditional data processing methods cannot meet enterprise big data analysis requirements, helps users perceive the enterprise situation in real time, and thereby improves enterprise management efficiency and business processing level.
Compared with prior art, the present invention has the following advantages:
1. Fast processing: the system architecture uses big data technology to schedule computing and storage tasks sensibly, so the computing capacity of every cluster node is fully used; when business demand grows, the system can be scaled out and its performance raised simply by adding cluster nodes.
2. Better user experience: the system runs on multiple kinds of terminal, supports real-time visualization of situation indicators at every level, and provides simple, intuitive interaction;
3. High flexibility: analysis models can be created and configured flexibly according to the enterprise's actual circumstances; the system has a layered design and is easy to deploy, upgrade and maintain.
In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus and method may be implemented in other ways. The apparatus embodiments described above are merely illustrative. For example, the division into units is only a division by logical function, and other divisions are possible in an actual implementation; as another example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual coupling, direct coupling or communication connection shown or discussed may be through some communication interfaces, and the indirect coupling or communication connection of apparatuses or units may be electrical, mechanical or of another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network elements. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the present embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit.
If the functions are implemented as software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. On this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied as a software product. The computer software product is stored in a storage medium and includes instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk or an optical disc.
The above is only a specific embodiment of the present invention, but the protection scope of the present invention is not limited to it. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be determined by the protection scope of the claims.

Claims (10)

1. A public information intelligent analysis platform based on big data analysis and mining, characterized by comprising:
a big data preprocessing module, for storing and preprocessing enterprise information by means of the cloud; the preprocessing implements load balancing, resource virtualization and distributed storage management;
an intelligent analysis module, for performing big data analysis on the task data sent by the upper layer according to the instruction sent by the upper layer, making comprehensive use of the data provided by the big data preprocessing module to perform event cause-and-effect analysis, assessment of the company's overall situation and future trend prediction, and returning the analysis results to the upper layer;
an information display module, for generating data analysis tasks, passing them to the intelligent business analysis module and receiving the returned results; presenting the analysis and evaluation results of the enterprise information and all kinds of detailed data, level by level, comprehensively and in real time through interactive visualization; and providing an operation interface for the user; the operation interface comprises at least one or more of the following functions: visual charts, analysis reports, content retrieval and message push/subscription.
2. The public information intelligent analysis platform based on big data analysis and mining according to claim 1, characterized in that the big data preprocessing module comprises a big data storage module and a cloud platform management module;
the big data storage module stores the management data of the enterprise information using the distributed system architecture Hadoop;
the cloud platform management module implements load balancing, resource virtualization, distributed storage management and application programming interface (API) functions.
3. The public information intelligent analysis platform based on big data analysis and mining according to claim 2, characterized in that
the big data storage module is based on a big data technology stack, builds a big data warehouse with Hadoop HDFS + Hive, and builds cubes on top of that warehouse to provide data support for the upper-layer modules.
4. The public information intelligent analysis platform based on big data analysis and mining according to claim 3, characterized in that the intelligent analysis module comprises an analysis module and an index evaluation module;
the analysis module implements analysis functions such as ad hoc query/combined-condition query, multidimensional OLAP, KPI indicators and MDX query, implements data mining functions such as classification, clustering and association rules, and provides flexible parameter configuration;
the index evaluation module assesses the company's situation in real time according to preset data information; the preset data information covers manpower, finance, materials and business.
5. The public information intelligent analysis platform based on big data analysis and mining according to claim 4, characterized by further comprising an operation monitoring module;
the operation monitoring module monitors the overall running state of the public information intelligent analysis platform in real time and sends the monitoring information of all components in the big data platform to the information display module for centralized display.
6. The public information intelligent analysis platform based on big data analysis and mining according to claim 5, characterized by further comprising an operation statistics module; the operation statistics module comprises a timing module and a statistics module;
the timing module counts according to a predetermined period and generates prompt information each time the count reaches the preset period;
the statistics module, on receiving the prompt information, periodically compiles statistics on the business monitoring data of jobs completed on the platform, and extracts and stores the periodic business monitoring data.
7. The public information intelligent analysis platform based on big data analysis and mining according to claim 6, characterized in that the operation statistics module further comprises a statistical analysis module and a comparison module;
the statistical analysis module performs statistics and analysis on how businesses run on the public information intelligent analysis platform; it analyzes the business run information to obtain resource usage statistics, data input/output statistics, execution information statistics and trends for the run;
the comparison module analyzes periodic business information, compares each run of the same business within a given time period, and identifies the running trend and anomalies of that business.
8. The public information intelligent analysis platform based on big data analysis and mining according to claim 7, characterized in that the statistical analysis module comprises a first analysis module and a first statistics module;
the first analysis module analyzes the network traffic, IO reads and writes, resource usage, and the information of running Map and Reduce tasks during a business run;
the first statistics module compiles the computation bias rate, the local data job optimization rate and the data processing rate trend during business execution.
9. The public information intelligent analysis platform based on big data analysis and mining according to claim 8, characterized in that the statistical analysis module comprises a second analysis module and a second statistics module;
the second analysis module analyzes the statistical information recorded after each run of the same business over a period of time;
the second statistics module obtains the running trends of the business over that period, namely the change in the volume of data the business operates on, the change in its execution time, and the change in its resource usage.
10. The public information intelligent analysis platform based on big data analysis and mining according to claim 9, characterized in that the cloud data support module uses the cloud platform to schedule ETL, cleaning and integrating the data;
the analysis report in the information display module comprises the business type, business name, a basic overview, cause analysis, analysis charts, related reports and suggestions, and can be edited online and downloaded.
CN201510368814.5A 2015-06-30 2015-06-30 Public information intelligent analysis platform based on big data analysis and mining Pending CN104915793A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510368814.5A CN104915793A (en) 2015-06-30 2015-06-30 Public information intelligent analysis platform based on big data analysis and mining

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510368814.5A CN104915793A (en) 2015-06-30 2015-06-30 Public information intelligent analysis platform based on big data analysis and mining

Publications (1)

Publication Number Publication Date
CN104915793A true CN104915793A (en) 2015-09-16

Family

ID=54084840

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510368814.5A Pending CN104915793A (en) 2015-06-30 2015-06-30 Public information intelligent analysis platform based on big data analysis and mining

Country Status (1)

Country Link
CN (1) CN104915793A (en)

Cited By (47)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105260448A (en) * 2015-10-10 2016-01-20 成都博元时代软件有限公司 Big data information analysis method
CN105389402A (en) * 2015-12-29 2016-03-09 曙光信息产业(北京)有限公司 Big-data-oriented ETL (Extraction-Transformation-Loading) method and device
CN105608187A (en) * 2015-12-23 2016-05-25 中国石油天然气股份有限公司 Hadoop-based oil-gas production Internet of Things big data processing method and system
CN105631012A (en) * 2015-12-29 2016-06-01 湖北睛彩视讯科技有限公司 Intelligent new-media big-data analysis system
CN105677539A (en) * 2016-01-12 2016-06-15 北京中交兴路车联网科技有限公司 Method and device for big data system information summarizing and graph reporting
CN105677820A (en) * 2015-12-31 2016-06-15 天津英福科技有限公司 Instrument layer of business intelligence system
CN105930370A (en) * 2016-04-13 2016-09-07 曙光信息产业(北京)有限公司 Data monitoring method and device
CN106227764A (en) * 2016-07-17 2016-12-14 合肥赑歌数据科技有限公司 A kind of intelligence system of big data cognitive Decision
CN106326437A (en) * 2016-08-25 2017-01-11 李晓龙 Finance and economic data analysis method and device
CN107122898A (en) * 2017-04-18 2017-09-01 格罗斯产业链服务(深圳)有限公司 A kind of end-to-end SaaS air control methods of trade based on data statistics
CN107241752A (en) * 2017-05-26 2017-10-10 华中科技大学 The YARN dispatching methods and system of a kind of sensing network flow
CN107273867A (en) * 2017-06-27 2017-10-20 航天星图科技(北京)有限公司 Empty day Remote Sensing Data Processing all-in-one
CN107506602A (en) * 2017-09-07 2017-12-22 北京海融兴通信息安全技术有限公司 A kind of big data health forecast system
CN107590181A (en) * 2017-08-01 2018-01-16 佛山市深研信息技术有限公司 A kind of intelligent analysis system of big data
CN107908794A (en) * 2017-12-15 2018-04-13 广东工业大学 A kind of method of data mining, system, equipment and computer-readable recording medium
CN108280644A (en) * 2018-01-10 2018-07-13 清华大学 Group member relation data method for visualizing and system
CN108363756A (en) * 2018-01-31 2018-08-03 佛山市聚成知识产权服务有限公司 A kind of intelligent transportation big data processing system
CN108518315A (en) * 2018-03-20 2018-09-11 深圳众厉电力科技有限公司 A kind of Wind turbines intelligent monitor system based on cloud storage technology
CN108628964A (en) * 2018-04-18 2018-10-09 江苏运时数据软件股份有限公司 A kind of intelligent scene enterprise big data system
CN108765016A (en) * 2018-05-31 2018-11-06 临泽旷易科技有限公司 A kind of network marketing system based on big data analysis
CN108984718A (en) * 2018-07-10 2018-12-11 四川汇源吉迅数码科技有限公司 A kind of digital content interactive system and exchange method based on big data technology
CN109067690A (en) * 2018-08-07 2018-12-21 腾讯科技(深圳)有限公司 The method for pushing and device of off-line calculation result data
CN109542859A (en) * 2018-10-18 2019-03-29 天津大学 A kind of Information Maritime processing model based on cloud computing
CN109684322A (en) * 2018-12-26 2019-04-26 交通运输部水运科学研究所 A kind of data processing system and method checked for automatic maritime affairs
CN109711658A (en) * 2018-11-09 2019-05-03 成都数之联科技有限公司 A kind of industrial production optimizing detection system and method
CN109976867A (en) * 2019-04-09 2019-07-05 美林数据技术股份有限公司 System and method is seen clearly in a kind of analysis of data digging flow
WO2019137444A1 (en) * 2018-01-12 2019-07-18 第四范式(北京)技术有限公司 Method and system for executing feature engineering for use in machine learning
CN110069508A (en) * 2017-10-11 2019-07-30 北京奇虎科技有限公司 Data analysing method, device and terminal device based on big data
CN110291520A (en) * 2017-03-30 2019-09-27 国际商业机器公司 Interactive text excavation processing is supported with natural language dialogue
CN110704402A (en) * 2019-10-18 2020-01-17 广州趣丸网络科技有限公司 Data analysis system, method and equipment for multiple data sources
CN110795600A (en) * 2019-11-05 2020-02-14 成都深思科技有限公司 Aggregation dimension reduction statistical method for distributed network flow
CN111126013A (en) * 2019-12-27 2020-05-08 浙江艮威水利建设有限公司 Hydraulic and hydroelectric engineering construction safety management system
CN111435344A (en) * 2019-01-15 2020-07-21 中国石油集团川庆钻探工程有限公司长庆钻井总公司 Big data-based drilling acceleration influence factor analysis model
CN111461576A (en) * 2020-04-27 2020-07-28 宁波市食品检验检测研究院 Fuzzy comprehensive evaluation method for safety risk of chemical hazards in food
CN111751788A (en) * 2020-06-29 2020-10-09 成都数之联科技有限公司 Auxiliary enhancement system for big data intelligent detection equipment
CN111914014A (en) * 2020-08-17 2020-11-10 深圳市联恒星科技有限公司 Big data platform and application thereof
CN112084148A (en) * 2020-09-18 2020-12-15 陕西千山航空电子有限责任公司 Comprehensive application platform for aviation objective information
CN112131302A (en) * 2020-09-08 2020-12-25 银盛支付服务股份有限公司 Business data analysis method and platform
CN112527602A (en) * 2020-12-16 2021-03-19 平安养老保险股份有限公司 Business data statistical method and device, computer equipment and storage medium
CN113189942A (en) * 2021-03-29 2021-07-30 武汉卓尔信息科技有限公司 Intelligent industrial data analysis system and method
CN113641750A (en) * 2021-08-20 2021-11-12 广东云药科技有限公司 Enterprise big data analysis platform
CN113742413A (en) * 2021-09-10 2021-12-03 湖南强智科技发展有限公司 High-accuracy big data analysis system based on multi-form processing
CN114780074A (en) * 2022-06-20 2022-07-22 北京风锐科林医疗科技有限公司 Information computing system for realizing big data analysis and construction method
CN115827944A (en) * 2022-12-23 2023-03-21 宿州市翰辞网络科技有限公司 Big data analysis method and server based on Internet platform system optimization
WO2023130775A1 (en) * 2022-01-07 2023-07-13 华中科技大学同济医学院附属协和医院 Visual analysis system based on discipline evaluation report
CN116502868A (en) * 2023-06-25 2023-07-28 北京云行在线软件开发有限责任公司 Distributed scheduling engine system and distributed scheduling method
CN116578598A (en) * 2023-07-11 2023-08-11 荣耀终端有限公司 Data query method, system and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102521246A (en) * 2011-11-11 2012-06-27 国网信息通信有限公司 Cloud data warehouse system
CN104573071A (en) * 2015-01-26 2015-04-29 湖南大学 Intelligent school situation analysis system and method based on megadata technology
CN104615526A (en) * 2014-12-05 2015-05-13 北京航空航天大学 Monitoring system of large data platform

Cited By (56)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105260448A (en) * 2015-10-10 2016-01-20 成都博元时代软件有限公司 Big data information analysis method
CN105608187A (en) * 2015-12-23 2016-05-25 中国石油天然气股份有限公司 Hadoop-based oil-gas production Internet of Things big data processing method and system
CN105389402A (en) * 2015-12-29 2016-03-09 曙光信息产业(北京)有限公司 Big-data-oriented ETL (Extraction-Transformation-Loading) method and device
CN105631012A (en) * 2015-12-29 2016-06-01 湖北睛彩视讯科技有限公司 Intelligent new-media big-data analysis system
CN105389402B (en) * 2015-12-29 2019-04-12 曙光信息产业(北京)有限公司 A kind of ETL method and apparatus towards big data
CN105677820A (en) * 2015-12-31 2016-06-15 天津英福科技有限公司 Instrument layer of business intelligence system
CN105677539A (en) * 2016-01-12 2016-06-15 北京中交兴路车联网科技有限公司 Method and device for big data system information summarizing and graph reporting
CN105930370A (en) * 2016-04-13 2016-09-07 曙光信息产业(北京)有限公司 Data monitoring method and device
CN106227764A (en) * 2016-07-17 2016-12-14 合肥赑歌数据科技有限公司 A kind of intelligence system of big data cognitive Decision
CN106326437A (en) * 2016-08-25 2017-01-11 李晓龙 Finance and economic data analysis method and device
CN110291520B (en) * 2017-03-30 2023-05-23 国际商业机器公司 Supporting interactive text mining processing with natural language dialog
CN110291520A (en) * 2017-03-30 2019-09-27 国际商业机器公司 Interactive text excavation processing is supported with natural language dialogue
CN107122898A (en) * 2017-04-18 2017-09-01 格罗斯产业链服务(深圳)有限公司 A kind of end-to-end SaaS air control methods of trade based on data statistics
CN107241752A (en) * 2017-05-26 2017-10-10 华中科技大学 The YARN dispatching methods and system of a kind of sensing network flow
CN107241752B (en) * 2017-05-26 2019-10-25 华中科技大学 A kind of the YARN dispatching method and system of sensing network flow
CN107273867A (en) * 2017-06-27 2017-10-20 航天星图科技(北京)有限公司 Empty day Remote Sensing Data Processing all-in-one
CN107590181A (en) * 2017-08-01 2018-01-16 佛山市深研信息技术有限公司 A kind of intelligent analysis system of big data
CN107506602A (en) * 2017-09-07 2017-12-22 北京海融兴通信息安全技术有限公司 A kind of big data health forecast system
CN110069508A (en) * 2017-10-11 2019-07-30 北京奇虎科技有限公司 Data analysing method, device and terminal device based on big data
CN107908794A (en) * 2017-12-15 2018-04-13 广东工业大学 Data mining method, system, device and computer-readable storage medium
CN108280644A (en) * 2018-01-10 2018-07-13 清华大学 Group member relationship data visualization method and system
WO2019137444A1 (en) * 2018-01-12 2019-07-18 第四范式(北京)技术有限公司 Method and system for executing feature engineering for use in machine learning
CN108363756A (en) * 2018-01-31 2018-08-03 佛山市聚成知识产权服务有限公司 Intelligent transportation big data processing system
CN108518315A (en) * 2018-03-20 2018-09-11 深圳众厉电力科技有限公司 Intelligent wind turbine monitoring system based on cloud storage technology
CN108628964B (en) * 2018-04-18 2021-08-06 江苏运时数据软件股份有限公司 Intelligent scene-based enterprise big data system
CN108628964A (en) * 2018-04-18 2018-10-09 江苏运时数据软件股份有限公司 Intelligent scene-based enterprise big data system
CN108765016A (en) * 2018-05-31 2018-11-06 临泽旷易科技有限公司 Network marketing system based on big data analysis
CN108984718A (en) * 2018-07-10 2018-12-11 四川汇源吉迅数码科技有限公司 Digital content interaction system and interaction method based on big data technology
CN109067690A (en) * 2018-08-07 2018-12-21 腾讯科技(深圳)有限公司 Method and device for pushing offline calculation result data
CN109542859A (en) * 2018-10-18 2019-03-29 天津大学 Maritime information processing model based on cloud computing
CN109711658A (en) * 2018-11-09 2019-05-03 成都数之联科技有限公司 Industrial production optimization detection system and method
CN109684322B (en) * 2018-12-26 2021-01-22 交通运输部水运科学研究所 Data processing system and method for automatic maritime affair auditing
CN109684322A (en) * 2018-12-26 2019-04-26 交通运输部水运科学研究所 Data processing system and method for automatic maritime affair auditing
CN111435344A (en) * 2019-01-15 2020-07-21 中国石油集团川庆钻探工程有限公司长庆钻井总公司 Big data-based drilling acceleration influence factor analysis model
CN109976867A (en) * 2019-04-09 2019-07-05 美林数据技术股份有限公司 Analysis and insight system and method for data mining workflows
CN110704402A (en) * 2019-10-18 2020-01-17 广州趣丸网络科技有限公司 Data analysis system, method and equipment for multiple data sources
CN110704402B (en) * 2019-10-18 2022-11-29 广州趣丸网络科技有限公司 Data analysis system, method and equipment for multiple data sources
CN110795600A (en) * 2019-11-05 2020-02-14 成都深思科技有限公司 Aggregation dimension reduction statistical method for distributed network flow
CN111126013A (en) * 2019-12-27 2020-05-08 浙江艮威水利建设有限公司 Hydraulic and hydroelectric engineering construction safety management system
CN111461576A (en) * 2020-04-27 2020-07-28 宁波市食品检验检测研究院 Fuzzy comprehensive evaluation method for safety risk of chemical hazards in food
CN111751788A (en) * 2020-06-29 2020-10-09 成都数之联科技有限公司 Auxiliary enhancement system for big data intelligent detection equipment
CN111914014A (en) * 2020-08-17 2020-11-10 深圳市联恒星科技有限公司 Big data platform and application thereof
CN112131302A (en) * 2020-09-08 2020-12-25 银盛支付服务股份有限公司 Business data analysis method and platform
CN112084148A (en) * 2020-09-18 2020-12-15 陕西千山航空电子有限责任公司 Comprehensive application platform for aviation objective information
CN112527602A (en) * 2020-12-16 2021-03-19 平安养老保险股份有限公司 Business data statistical method and device, computer equipment and storage medium
CN113189942A (en) * 2021-03-29 2021-07-30 武汉卓尔信息科技有限公司 Intelligent industrial data analysis system and method
CN113641750A (en) * 2021-08-20 2021-11-12 广东云药科技有限公司 Enterprise big data analysis platform
CN113742413A (en) * 2021-09-10 2021-12-03 湖南强智科技发展有限公司 High-accuracy big data analysis system based on multi-form processing
WO2023130775A1 (en) * 2022-01-07 2023-07-13 华中科技大学同济医学院附属协和医院 Visual analysis system based on discipline evaluation report
CN114780074A (en) * 2022-06-20 2022-07-22 北京风锐科林医疗科技有限公司 Information computing system for realizing big data analysis and construction method
CN114780074B (en) * 2022-06-20 2022-09-16 北京风锐科林医疗科技有限公司 Information computing system for realizing big data analysis and construction method
CN115827944A (en) * 2022-12-23 2023-03-21 宿州市翰辞网络科技有限公司 Big data analysis method and server based on Internet platform system optimization
CN115827944B (en) * 2022-12-23 2024-03-01 山东新明辉安全科技有限公司 Big data analysis method and server based on Internet platform system optimization
CN116502868A (en) * 2023-06-25 2023-07-28 北京云行在线软件开发有限责任公司 Distributed scheduling engine system and distributed scheduling method
CN116578598A (en) * 2023-07-11 2023-08-11 荣耀终端有限公司 Data query method, system and storage medium
CN116578598B (en) * 2023-07-11 2023-11-17 荣耀终端有限公司 Data query method, system and storage medium

Similar Documents

Publication Publication Date Title
CN104915793A (en) Public information intelligent analysis platform based on big data analysis and mining
US11733829B2 (en) Monitoring tree with performance states
US10515469B2 (en) Proactive monitoring tree providing pinned performance information associated with a selected node
US10523538B2 (en) User interface that provides a proactive monitoring tree with severity state sorting
US10243818B2 (en) User interface that provides a proactive monitoring tree with state distribution ring
US9336288B2 (en) Workflow controller compatibility
CN104573071A (en) Intelligent school situation analysis system and method based on big data technology
CN107766402A (en) Building dictionary cloud housing-listing big data platform
CN103838617A (en) Method for constructing data mining platform in big data environment
CN107590181A (en) Intelligent big data analysis system
CN114416855A (en) Visualization platform and method based on electric power big data
Tannir Optimizing Hadoop for MapReduce
CN115640300A (en) Big data management method, system, electronic equipment and storage medium
CN108399208A (en) Big data information display system
Benlachmi et al. A comparative analysis of hadoop and spark frameworks using word count algorithm
CN111914014A (en) Big data platform and application thereof
Herodotou Automatic tuning of data-intensive analytical workloads
CN108304549A (en) Intelligent big data processing system
CN108363756A (en) Intelligent transportation big data processing system
Hassan et al. Real-Time Big Data Analytics for Data Stream Challenges: An Overview
Huang et al. A web interface for XALT log data analysis
Sanaboyina Performance evaluation of time series databases based on energy consumption
CN117689083A (en) Method, device, system and medium for managing solar power
CN116257512A (en) Data monitoring management and control tool
Bhimrao Implementation Map Reduce Paradigm in Data Cube Mining

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20150916