CN109272155A - Corporate behavior analysis system based on big data - Google Patents

Corporate behavior analysis system based on big data

Info

Publication number
CN109272155A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811058169.7A
Other languages
Chinese (zh)
Other versions
CN109272155B (en)
Inventor
石国鹏
张国增
杨景伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou Centripetal Force Communication Technology Inc Co
Original Assignee
Zhengzhou Centripetal Force Communication Technology Inc Co
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou Centripetal Force Communication Technology Inc Co
Priority to CN201811058169.7A
Publication of CN109272155A
Application granted
Publication of CN109272155B
Expired - Fee Related

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/04 Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management

Landscapes

  • Business, Economics & Management (AREA)
  • Engineering & Computer Science (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Economics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Operations Research (AREA)
  • Game Theory and Decision Science (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Quality & Reliability (AREA)
  • Tourism & Hospitality (AREA)
  • Physics & Mathematics (AREA)
  • General Business, Economics & Management (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Educational Administration (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a corporate behavior analysis system based on big data, comprising a data acquisition and processing platform, a data warehouse, a data control platform, a data analysis and mining platform, a performance analysis platform, and a data visualization display platform. The various source data of the enterprise are mined and analyzed by these platforms and finally presented visually. The effect is as follows: the system processes the various source data of the enterprise, correlates the big data generated by each business system within the enterprise, and, through data mining and analysis, produces an effective visual presentation. This provides the enterprise with data support services and with operational and decision support, thereby improving the enterprise's operating efficiency, raising its level of management, and strengthening its core competitiveness and creativity.

Description

Corporate behavior analysis system based on big data
Technical field
The present invention relates to the technical field of big data processing, and in particular to a corporate behavior analysis system based on big data.
Background
Good enterprise management not only improves an enterprise's operating efficiency but also gives the enterprise a clear direction for development. In the prior art, however, the data sources of the departments within an enterprise are often independent of one another, and data cannot be shared across different systems, which leads to problems such as information silos and duplicated data. In other words, existing enterprise data management cannot integrate the various kinds of data or support correlated applications of big data. As a result, it cannot provide data support services for the enterprise or carry out predictive analysis for it, and therefore cannot improve the enterprise's operating efficiency.
Summary of the invention
The object of the present invention is to provide a corporate behavior analysis system based on big data, so as to solve the problem that the prior art cannot support correlated applications of big data and therefore cannot provide data support services or predictive analysis for an enterprise.
To this end, the technical solution adopted by the present invention is as follows: a corporate behavior analysis system based on big data is provided, comprising a data acquisition and processing platform, a data warehouse, a data control platform, a data analysis and mining platform, a performance analysis platform, and a data visualization display platform;
The data acquisition and processing platform is used to acquire the various source data of the enterprise and to process the source data to obtain target data;
The data warehouse is used to store the target data;
The data control platform is used to provide metadata management, master data management, data quality management, data standard management, and data security management services;
The data analysis and mining platform is used to provide an algorithm model library and data analysis and mining tools;
The performance analysis platform is used to process the target data with the algorithm model library and the data analysis and mining tools to obtain a processing result, which can provide performance analysis and decision support for the enterprise;
The data visualization display platform is used to present the processing result in various visual forms.
Preferably, the data warehouse includes a distributed column-store database and a distributed file system.
Preferably, the corporate behavior analysis system further includes an SQL engine component, a stream processing engine component, a joint query engine component, a parallelized R algorithm execution engine component, a full-text search engine component, a distributed computing engine component, and a task scheduling and monitoring component.
Preferably, the various source data of the enterprise include one or more of structured data, semi-structured/unstructured data, and real-time data.
Preferably, the sources of the various source data of the enterprise specifically include data from the enterprise's existing business systems, real-time data acquired through a distributed message queue, and internet data acquired through web crawler technology.
Preferably, the sources of the various source data of the enterprise further include data entered through online forms and data uploaded as report files.
Preferably, the source data are processed by data cleansing, data deduplication, and data processing;
The data cleansing refers to deleting irrelevant data from the source data, smoothing noisy data, filtering out data unrelated to the subject, and handling missing values and outliers, where missing values are handled by the deletion method, the replacement method, and the imputation method;
The data deduplication refers to removing duplicate data from the source data;
The data processing refers to splitting and merging columns of the data in the source data.
Preferably, the algorithm model library includes a naive Bayes model, which is used to classify the target data; the expression of the naive Bayes model is as follows:
P(B|A) = P(B) × P(A|B) / P(A)
Where P(B|A) denotes the probability that hypothesis B holds given the observed data A, i.e., the posterior probability; P(A) denotes the prior probability of the training data A to be observed; P(B) denotes the prior probability of hypothesis B, i.e., the probability that B holds before any training data are available; and P(A|B) denotes the probability of observing data A given that hypothesis B holds.
Preferably, processing the target data with the data analysis and mining tools specifically includes concept description, association analysis, classification, cluster analysis, predictive analysis, and deviation detection analysis;
Where the concept description includes characteristic description and discriminative description; the characteristic description expresses the common features of a class of target data, and the discriminative description expresses the differences between different classes of target data;
The association analysis is used to find, in a large amount of target data, the association rules, correlations, and causal structures between itemsets, as well as the frequent patterns of itemsets; an association rule describes the degree to which attributes influence one another and is measured by confidence and support, where confidence measures the accuracy of the rule and support measures its importance;
The classification is used to find a reasonable model for each class of target data, given training data whose features and classes are known, and then to classify new data with that model;
The cluster analysis is used to group information according to the principle of information similarity when the classes are not known in advance;
The predictive analysis is used to predict continuous or ordered values;
The deviation detection analysis is used to analyze significant changes and deviations between the current state of the data and historical records or standards, to find existing abnormal records, and to take corrective measures.
Preferably, the data visualization display platform includes a J2EE platform and visualization display components, and the visualization display components include an ad hoc query component, a report and dashboard component, an OLAP multidimensional analysis component, and a map display component.
With the above technical solution, the corporate behavior analysis system based on big data proposed by the present invention has the following advantages: the system processes the various source data of the enterprise, correlates the big data generated by each business system within the enterprise, and, through data mining and analysis, produces an effective visual presentation. This provides the enterprise with data support services and with operational and decision support, thereby improving the enterprise's operating efficiency, raising its level of management, and strengthening its core competitiveness and creativity.
Brief description of the drawings
Fig. 1 is a schematic diagram of the system structure of an embodiment of the present invention;
Fig. 2 is a technical framework diagram of an embodiment of the present invention;
Fig. 3 is a schematic diagram of the logical structure of an embodiment of the present invention.
Detailed description of the embodiments
To make the technical problem to be solved, the technical solution, and the advantages of the present invention clearer, a detailed description is given below with reference to the drawings and specific embodiments. The following embodiments illustrate the present invention and are not intended to limit its scope. It should be noted that some English abbreviations in this document are terms of art in the industry and will be understood by those skilled in the art; some of them are explained below.
J2EE: Java 2 Platform, Enterprise Edition; the J2EE platform is essentially a distributed server application programming environment, i.e., a Java environment.
Spring: an open-source framework created to reduce the complexity of enterprise application development. One of its main advantages is its layered architecture, which lets developers choose which components to use while providing an integrated framework for J2EE application development.
ESB: Enterprise Service Bus.
ETL: Extract-Transform-Load; describes the process of extracting data from a source, transforming it, and loading it into a destination.
SQL: Structured Query Language.
Hyperbase: a distributed column-store database.
Inceptor: an interactive analysis engine; in essence, an SQL translator.
HDFS: Hadoop Distributed File System.
JDBC: Java Database Connectivity.
ODBC: Open Database Connectivity.
FUSE: Filesystem in Userspace.
Spark: a fast, general-purpose computing engine designed for large-scale data processing.
PL/SQL: Procedural Language/SQL, a procedural extension of SQL.
Kafka: a distributed messaging system.
JMS: Java Message Service.
Cube: data cube; in this document it denotes distributed memory.
OLAP: Online Analytical Processing.
JSON: JavaScript Object Notation, a lightweight data interchange format.
SparkR: an R language package.
CRM: Customer Relationship Management.
ERP: Enterprise Resource Planning.
SNS: Social Networking Services.
HTTP: HyperText Transfer Protocol.
CA: Certificate Authority.
SSO: Single Sign-On.
Agent: an agent.
RESTful: conforming to Representational State Transfer (REST).
SOA: Service-Oriented Architecture.
Referring to Figs. 1 to 3, an embodiment of the present invention provides a corporate behavior analysis system based on big data, comprising a data acquisition and processing platform, a data warehouse, a data control platform, a data analysis and mining platform, a performance analysis platform, and a data visualization display platform.
The data acquisition and processing platform is used to acquire the various source data of the enterprise and to process the source data to obtain target data.
The data warehouse is used to store the target data.
The data control platform is used to provide metadata management, master data management, data quality management, data standard management, and data security management services.
The data analysis and mining platform is used to provide an algorithm model library and data analysis and mining tools.
The performance analysis platform is used to process the target data with the algorithm model library and the data analysis and mining tools to obtain a processing result, which can provide performance analysis and decision support for the enterprise.
The data visualization display platform is used to present the processing result in various visual forms, including chart display, mobile display, map display, and large-screen display.
It should be noted that the data warehouse is based on Hadoop, a distributed system infrastructure developed by the Apache Software Foundation. The system processes the various source data of the enterprise, correlates the big data generated by each business system within the enterprise, and, through data mining and analysis, produces an effective visual presentation. This provides the enterprise with data support services and with operational and decision support, thereby improving the enterprise's operating efficiency, raising its level of management, and strengthening its core competitiveness and creativity. Specifically, the various source data include one or more of structured data, semi-structured/unstructured data, and real-time data. The source data mainly come from the following channels: (1) data from the enterprise's existing business systems, as the first source data; (2) real-time data acquired through a distributed message queue, as the second source data, including but not limited to website clickstream data and real-time event stream data; (3) internet data acquired through web crawler technology, as the third source data; (4) data entered through online forms and data uploaded as report files. Obtaining source data through these four channels can effectively reduce information silos between the departments of the enterprise.
Further, in channel (1) above, the data acquisition and processing platform uses the data integration and ETL platform to acquire source data from the enterprise's existing business systems (including CRM, ERP, other platforms, a financial big data platform, etc.) as the first source data, processes the first source data, and loads it into the data warehouse in batches. In channel (3) above, internet data (websites, SNS, etc.) can be acquired by internet data acquisition software and imported into the data warehouse after processing.
Preferably, in this embodiment, the data warehouse further includes a distributed column-store database and a distributed file system. The distributed column-store database (Hyperbase) is used to store structured data, including source data acquired from the databases of existing business systems, multi-subject associated data sets produced by integration processing, application-oriented data marts, and so on. The system can access Hyperbase through the SQL engine component over standard JDBC/ODBC interfaces. The distributed file system (HDFS) is used to store semi-structured/unstructured data, including Office files, XML data, email data, scanned copies of vouchers and documents, video and images, Web pages, and other data. Data describing the attributes of these files are mainly stored in the distributed column-store database Hyperbase; index data generated from text data are mainly stored in the full-text index library. The system can access HDFS through the Java API, or mount HDFS through FUSE so that it is mapped as a remote disk.
Further, the various source data are processed mainly by data cleansing, data deduplication, and data processing.
(1) Data cleansing
Data cleansing mainly consists of deleting irrelevant data from the source data, smoothing noisy data, filtering out data unrelated to the subject, and handling missing values and outliers. Missing values are handled by the deletion method, the replacement method, and the imputation method.
The deletion method is the simplest way to handle missing values. Depending on the angle from which the data are processed, it either deletes observations or deletes variables. All rows containing missing data can be removed with the na.omit() function; this trades sample size for information completeness and is suitable when the proportion of missing values is small. Deleting a variable means removing the entire variable; it is suitable when the variable has many missing values and has little influence on the research objective.
The replacement method divides variables by attribute into numeric and non-numeric types and treats the two differently. If the variable containing the missing value is numeric, the missing value is generally replaced by the mean of that variable over all other objects. If the variable is non-numeric, the missing value is replaced by the median or mode of that variable over all other valid observations.
The imputation method was proposed because the deletion and replacement methods waste information and can alter the structure of the data, so that the final statistical results are biased. Common imputation methods for missing values include regression imputation and multiple imputation. Regression imputation uses a regression model: the variable to be imputed is taken as the dependent variable and other related variables as independent variables, and the value of the dependent variable is predicted with the regression function lm() to fill in the missing values. Multiple imputation generates a complete data set from a data set containing missing values and repeats this several times, thereby producing a random sample of the missing values; in the R language, multiple imputation can be performed with the mice() function of the mice package.
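The three ways of handling missing values can be sketched as follows. This is a minimal illustration in plain Python rather than the R functions named in the text; the function names (delete_rows, replace_with_mean, replace_with_mode, regression_impute) are hypothetical, missing values are assumed to be represented as None, and the regression step assumes a single predictor variable.

```python
from statistics import mean, mode

def delete_rows(rows):
    """Deletion method: drop every row that contains a missing value."""
    return [r for r in rows if all(v is not None for v in r)]

def replace_with_mean(column):
    """Replacement method for a numeric variable: fill with the mean of observed values."""
    observed = [v for v in column if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in column]

def replace_with_mode(column):
    """Replacement method for a non-numeric variable: fill with the most common value."""
    observed = [v for v in column if v is not None]
    fill = mode(observed)
    return [fill if v is None else v for v in column]

def regression_impute(x, y):
    """Regression imputation: fit y = a + b*x on complete pairs, predict missing y."""
    pairs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    mx = mean(p[0] for p in pairs)
    my = mean(p[1] for p in pairs)
    sxx = sum((xi - mx) ** 2 for xi, _ in pairs)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in pairs)
    b = sxy / sxx          # slope of the least-squares line
    a = my - b * mx        # intercept
    return [a + b * xi if yi is None else yi for xi, yi in zip(x, y)]
```

Deletion is the cheapest option but shrinks the sample; the regression variant keeps every row at the cost of assuming a linear relationship between the two variables.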
(2) Data deduplication
Data deduplication is used to remove duplicate data from the source data.
(3) Data processing
Data processing is used to split and merge columns of the data in the source data.
It should be noted that cleansing, deduplicating, and processing the source data improves the accuracy of data mining and analysis, so that better decision services can be provided for the enterprise.
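As a rough illustration of deduplication and of splitting and merging columns, the sketch below works on records represented as Python lists of strings. The helper names and the record layout are assumptions made for the example, not part of the described system.

```python
def deduplicate(records):
    """Remove repeated records while keeping the first occurrence and the order."""
    seen = set()
    unique = []
    for rec in records:
        key = tuple(rec)
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique

def split_column(records, index, sep):
    """Split the column at `index` on `sep` into several columns."""
    return [rec[:index] + rec[index].split(sep) + rec[index + 1:] for rec in records]

def merge_columns(records, index, sep=" "):
    """Merge the column at `index` with the one after it, joined by `sep`."""
    return [rec[:index] + [sep.join(rec[index:index + 2])] + rec[index + 2:]
            for rec in records]
```

For example, split_column(rows, 1, "-") turns a date column such as "2018-09" into separate year and month columns, and merge_columns reverses that step.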
Further, the data control platform can use the metadata acquisition engine of the ETL platform to collect, in a unified way, the metadata of the distributed file system HDFS, the distributed column-store database Hyperbase, the ETL process flows and rules, the databases of existing business systems, and Teradata and Oracle databases. The metadata are stored uniformly in the database of the data control platform, and the metadata association source table -> interface table -> ETL process -> target table is established, laying a solid foundation for subsequent data standard management, master data management, data quality management, and data security management. Where data must be exchanged with the enterprise's existing metadata management and master data management systems, the ESB platform and message-oriented middleware can be used to exchange metadata and master data change records with the existing systems in real time over JMS interfaces. (It should be noted that the data acquisition and processing platform and the data integration and ETL platform described earlier refer to the same platform, as those skilled in the art will understand; such equivalences are not explained one by one here. For example, the ESB platform and the ESB service bus platform are the same platform, and the distributed column database and the distributed column-store database in the text are the same database.)
The ESB platform provides functions such as message queuing, message subscription and publication, Web Service orchestration and composite invocation, and service monitoring;
Based on the ESB platform and JMS message interfaces, real-time data exchange with existing business systems can be realized (including operation management data, metadata/master data, etc.), and the result data sets of the system's analysis and mining can be pushed in real time to application service systems such as CRM, ERP, the enterprise portal, and apps;
The ESB platform supports JDBC/ODBC and HTTP/JSON interfaces and can connect to the SQL engine and the joint query engine in the system, so that database queries and joint queries over unstructured and structured data can be packaged as Web Services for related application systems to call.
Further, the algorithm model library includes a naive Bayes model, which is used to classify the target data; the expression of the naive Bayes model is as follows:
P(B|A) = P(B) × P(A|B) / P(A)
Where P(B|A) denotes the probability that hypothesis B holds given the observed data A, i.e., the posterior probability; P(A) denotes the prior probability of the training data A to be observed; P(B) denotes the prior probability of hypothesis B, i.e., the probability that B holds before any training data are available; and P(A|B) denotes the probability of observing data A given that hypothesis B holds.
That is, a classifier is constructed from the acquired big data and used to classify the target data. In use, the posterior probability is calculated for each class; the larger the probability value, the more likely the data belong to that class. In this way the acquired big data can be used to carry out predictive analysis of the enterprise's behavior.
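A minimal sketch of such a classifier for categorical features is shown below. It applies the expression above with the usual naive assumption that features are independent given the class. The add-one (Laplace) smoothing, the function names, and the sample data in the usage note are assumptions for illustration and do not come from the source.

```python
from collections import Counter, defaultdict

def train(samples, labels):
    """Count classes and per-class feature values from labelled training data."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)   # (class, feature index) -> value counts
    values = defaultdict(set)               # feature index -> distinct values seen
    for x, y in zip(samples, labels):
        for i, v in enumerate(x):
            feature_counts[(y, i)][v] += 1
            values[i].add(v)
    return class_counts, feature_counts, values

def posterior(model, x):
    """Return P(B|A) for every class B given the observation A = x."""
    class_counts, feature_counts, values = model
    total = sum(class_counts.values())
    scores = {}
    for c, n_c in class_counts.items():
        p = n_c / total                      # prior P(B)
        for i, v in enumerate(x):
            # likelihood P(A_i | B), with add-one smoothing (an assumption)
            p *= (feature_counts[(c, i)][v] + 1) / (n_c + len(values[i]))
        scores[c] = p                        # P(B) * P(A|B)
    evidence = sum(scores.values())          # P(A), summed over all classes
    return {c: s / evidence for c, s in scores.items()}

def classify(model, x):
    """Pick the class with the largest posterior probability."""
    post = posterior(model, x)
    return max(post, key=post.get)
```

With four hypothetical training records such as ("high", "late") labelled "risk" and ("low", "ontime") labelled "ok", classify returns the class whose posterior is largest for a new record.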
Further, processing the target data with the data analysis and mining tools specifically includes:
(1) Concept description
The concept description includes characteristic description and discriminative description. The characteristic description expresses the common features of a class of target data; the discriminative description expresses the differences between different classes of target data.
(2) Association analysis
The association analysis is used to find, in a large amount of target data, the association rules, correlations, or causal structures between itemsets, as well as the frequent patterns of itemsets. An association rule describes the degree to which attributes influence one another and is measured by confidence and support: confidence measures the accuracy of the rule, and support measures its importance.
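Support and confidence can be computed directly from a list of transactions, as in the sketch below. It uses the standard definitions (support as the fraction of transactions containing an itemset, confidence as the support of the whole rule divided by the support of its antecedent); the function names and the transactions in the usage note are hypothetical.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Of the transactions containing the antecedent, the fraction that also contain the consequent."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)
```

For the rule {a} -> {b} over the transactions {a, b}, {a, b, c}, {a}, {b}, the support of {a, b} is 2/4 and the confidence is (2/4) / (3/4) = 2/3.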
(3) Classification
The classification is used to find a reasonable model for each class of target data, given training data whose features and classes are known, and then to classify new data with that model. Classification therefore includes the steps of building the model and classifying with it. It should be noted that the classification here is performed with reference to the naive Bayes model described above.
(4) Cluster analysis
The cluster analysis is a method of grouping information according to the principle of information similarity when the classes are not known in advance.
Specifically, the clustering can use the FCM (fuzzy c-means) clustering algorithm, which is an improvement on traditional hard clustering algorithms. Its steps include:
Standardize the data matrix;
Establish the fuzzy similarity matrix and initialize the membership matrix;
Iterate until the objective function converges to a minimum;
According to the iteration result, determine the class to which each data item belongs from the final membership matrix, and display the final clustering result.
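The steps above can be sketched with the standard fuzzy c-means update rules. This is a minimal plain-Python illustration: the z-score standardization, the fuzzifier m = 2, the random initialization, and the stopping rule based on the change in memberships are all assumptions, and the explicit fuzzy similarity matrix mentioned in the steps is not built here.

```python
import random
from statistics import mean, pstdev

def standardize(points):
    """Z-score each column of the data matrix (columns with zero spread are left centred)."""
    cols = list(zip(*points))
    mus = [mean(col) for col in cols]
    sds = [pstdev(col) or 1.0 for col in cols]
    return [[(v - mu) / sd for v, mu, sd in zip(p, mus, sds)] for p in points]

def fcm(points, c, m=2.0, max_iter=100, tol=1e-6, seed=0):
    """Fuzzy c-means: returns (centers, membership matrix, hard labels)."""
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])
    # Initialize the membership matrix; each row sums to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(c)]
        s = sum(row)
        u.append([v / s for v in row])
    for _ in range(max_iter):
        # Update cluster centers as membership-weighted means.
        centers = []
        for j in range(c):
            w = [u[i][j] ** m for i in range(n)]
            total = sum(w)
            centers.append([sum(w[i] * points[i][d] for i in range(n)) / total
                            for d in range(dim)])
        # Update memberships from distances to the centers.
        new_u = []
        for i in range(n):
            dist = [max(sum((points[i][d] - centers[j][d]) ** 2
                            for d in range(dim)) ** 0.5, 1e-12) for j in range(c)]
            new_u.append([1.0 / sum((dist[j] / dist[k]) ** (2.0 / (m - 1.0))
                                    for k in range(c)) for j in range(c)])
        shift = max(abs(new_u[i][j] - u[i][j]) for i in range(n) for j in range(c))
        u = new_u
        if shift < tol:   # memberships have stopped changing
            break
    # Assign each point to the cluster with the largest final membership.
    labels = [max(range(c), key=lambda j: u[i][j]) for i in range(n)]
    return centers, u, labels
```

Unlike hard clustering, each point keeps a degree of membership in every cluster; the hard label is only derived at the end from the final membership matrix.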
(5) Predictive analysis
The predictive analysis is used to predict continuous or ordered values.
(6) Deviation detection analysis
The deviation detection analysis is used to analyze significant changes and deviations between the current state of the data and historical records or standards, and to find existing abnormal records so that corrective measures can be taken in time.
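One simple way to flag such deviations is to compare current values against the mean and standard deviation of the historical record. The z-score rule, the threshold of 3, and the function name below are assumptions chosen for illustration; the source does not specify a detection method.

```python
from statistics import mean, pstdev

def find_deviations(history, current, threshold=3.0):
    """Return (index, value) pairs in `current` that deviate strongly from `history`."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # A constant history: any different value counts as a deviation.
        return [(i, v) for i, v in enumerate(current) if v != mu]
    return [(i, v) for i, v in enumerate(current)
            if abs(v - mu) / sigma > threshold]
```

For example, with a history of [100, 102, 98, 101, 99] and current values [100, 130, 97], only the value 130 exceeds the threshold and is reported for follow-up.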
Further, the data visualization display platform includes a J2EE platform and visualization display components. The visualization display components include an ad hoc query component, a report and dashboard component, an OLAP multidimensional analysis component, and a map display component, which access the distributed column-store database and distributed memory through the SQL engine component and JDBC/ODBC interfaces.
Specifically, the system can use the joint query engine and HTTP/JSON interfaces to run joint queries over unstructured data (such as text data and XML data stored in HDFS) and structured data (including data in databases such as Oracle, MySQL, Teradata, and Hyperbase). The system can also connect to the full-text search engine through HTTP/JSON interfaces to run full-text search queries.
Further, the corporate behavior analysis system also includes an SQL engine component, a stream processing engine component, a joint query engine component, a parallelized R algorithm execution engine component, a full-text search engine component, a distributed computing engine component, and a task scheduling and monitoring component.
Wherein, SQL engine module (Inceptor SQL) is the high-performance realized based on Spark, highly compatible The SQL engine of (SQL2016 standard) provides JDBC/ODBC standard interface for system and accesses Hyperbase database.SQL engine It supports PL/SQL, developer is facilitated to realize the application such as multilist association, aggregation process.
Stream process engine module is the stream process engine module realized based on Spark Streaming, can be disappeared with distribution System docking is ceased, real-time reception handles flow data;It can be docked by JMS api interface with the ESB platform of enterprise, real-time reception is simultaneously Real-time detection can be gone out abnormal events information and sent to ESB platform by processing business data flow.Stream process engine module can pass through SQL engine imports real-time streaming data in real time in distributed column storing data library and distributed memory.Stream process is engine-operated In used business reference data, regular data etc. can be placed in distributed memory, to greatly reduce access database Time loss.
Conjunctive query engine is used to provide unstructured data and structural data conjunctive query service for system.System with Pass through HTTP/JSON interactive interfacing inquiry request and response message between conjunctive query engine.The support of conjunctive query engine passes through JDBC/ODBC interface access data library (Oracle, Teradata, MySQL etc.);It supports to access by Inceptor SQL engine Distributed data base Hyperbase, distributed memory;It supports to access distributed file system HDFS by Java api interface; It supports to access JSON, XML data by HTTP interface.
Parallelization R algorithm enforcement engine component is the parallelization R algorithm engine realized based on SparkR, has been supported at present close 60 kinds of parallelization R algorithms.Developer can will be loaded into algorithm engine using packet by programming conditions in Windows and execute.Parallelization R algorithm engine, data needed for being extracted by JDBC interface and SQL engine to Hyperbase, and will analysis result deposit distribution Formula column storing data library.Parallelization R algorithm engine can also directly read the file data on HDFS.
The full-text search engine component (Elasticsearch) extracts text data from Hyperbase and HDFS and builds a full-text index library. The index data can be stored in the distributed file system HDFS. Elasticsearch provides an HTTP/JSON access interface for full-text search and query applications.
The distributed computing engine component provides a Java API framework for distributed batch computation. Its Spark engine makes full use of in-memory computing to achieve fast distributed processing and supports languages such as Java, Scala, and Python.
The task scheduling and monitoring component dynamically loads tasks, manages tasks, and monitors task execution; users can check the execution status of tasks according to user status information.
Further, to support login by multiple users across multiple systems, the system also includes an identity authentication and access control component. This component provides a unified authentication and authorization service for users of applications such as the enterprise portal and performance analysis. User certificates and authorization information can be stored in a relational database (Oracle or MySQL) or in a lightweight directory. User certificate information can be exchanged with a certificate authority (CA) through a proprietary interface or through the JMS interface of the ESB platform. The component also provides an SSOAgent plug-in, which enables integration with, and single sign-on for, multiple application systems and management systems.
As can be seen from the above, the corporate behavior analysis system based on big data provided by the embodiment of the present invention has the following advantages: by processing the various source data of the enterprise, the system associates the big data generated by each business system in the enterprise and, through data mining and analysis, ultimately produces effective visual presentations. This provides the enterprise with data support services and with operational and decision support, thereby improving the enterprise's operating efficiency and management level and enhancing its core competitiveness and creativity.
To better understand the corporate behavior analysis system based on big data provided by the embodiment of the present invention, the system can also be described from another angle. As shown in Fig. 2, the system is divided into the following layers:
Hardware device layer
The hardware device layer includes hardware such as server equipment, network equipment, storage equipment, load balancers, and VPN/firewall devices.
Virtualization resource layer
The virtualization resource layer is a server virtualization resource pool built on a distributed container cluster management system. For the various applications and for the distributed computing and storage service components, it provides resource management services such as multi-tenant container resource allocation and scheduling, application packaging, deployment and operation, service registration and discovery, dynamic scaling, and load balancing with disaster tolerance.
Application platform layer
The application platform layer provides a platform for developing, testing, and running big data analysis applications. It mainly includes: the J2EE application service platform and Spring framework, the reporting and analysis tool presentation platform, the parallel algorithm model library, relational databases, the ESB service bus and ETL data integration platform, the identity authentication and authorization control component, the full-text search component, and the big data distributed computing and storage platform components.
The big data distributed computing and storage platform components mainly include: the distributed columnar database (i.e. the distributed column-store database), the distributed file system, the SQL engine component, the stream processing engine component, the joint query engine, the parallelized R algorithm execution engine, the full-text search engine, the distributed computing engine, and the task scheduling and monitoring component.
Application service layer
This layer hosts the various custom-developed application services, mainly including applications for management, data management, application management, content management, data analysis, metadata management, decision support, risk management and control, process optimization, support, cross-marketing, and product innovation.
Communication network layer
Through the communication network layer, external users can access the application services they are authorized to use over the Internet (including the mobile Internet); internal staff can access application services over the intranet (integrated network or Wi-Fi WLAN).
Terminal access layer
System users can access the relevant application services through a PC web browser or through mobile terminals (smartphones, tablets, etc.). The platform supports interaction via email, mobile apps, WeChat, and SMS; supported access terminals also include large touch screens and professional terminals. The overall technical architecture of the system also includes a big data management standards and specifications system and a unified system security and operations management system.
To give the system good compatibility, the system also provides a variety of data interfaces. It is fully compatible with the API interfaces of the open-source components of the Hadoop ecosystem, and its REST access interfaces include WebHDFS and the StarGate/Hyperbase REST interface. By supporting the SQL2016 standard and PL/SQL and providing JDBC/ODBC interfaces, it allows traditional business scenarios to migrate smoothly to the big data platform. In addition, the big data platform provides Java API and R language interfaces for data mining. Through these interfaces, users can perform interactive data mining and exploration directly with R and SQL, can carry out secondary development through the platform's open APIs, and can let upper-layer applications run SQL queries through the JDBC/ODBC interface. Inceptor also includes a Java API for a basic parallel statistical mining algorithm library, through which users can carry out secondary development for data mining.
The system is designed with a service-oriented architecture (SOA) using the J2EE/Spring and Apache CXF frameworks. Its built-in service registration function can register and call existing external web services, and services defined in the system can be called by other applications. The ESB platform can connect to the SQL engine through the JDBC/ODBC interface and encapsulate distributed database query access as web services for related application systems to call. The ESB platform can also connect to the joint query engine through the HTTP/JSON interface and encapsulate joint query access to unstructured and structured data as web services for related application systems to call. Analysis and mining results generated by the reporting/analysis platform can be encapsulated, via the ESB platform, as RESTful services for related application systems to call.
Using the system has the following advantages:
1. The system can improve the capacity of traditional industries to transform and upgrade. Under current conditions, fully releasing the transformative effect of big data on industrial development accelerates changes in how traditional industries operate and manage, innovation in service and business models, and the restructuring of the industrial value chain system. It promotes the networked sharing, intensive integration, collaborative development, and efficient use of the factors of social production. This can change traditional modes of production, promote the coordinated development of traditional industries with new business forms and models, quicken economic transformation, and raise the efficiency of economic operation. Taking industry as an example: in the design stage, the big data corporate behavior analysis system can be used to raise the level of personalization in industrial design; in the production stage, big data can be used to monitor and optimize assembly-line operations, strengthen failure prediction and health management, optimize product quality, and reduce energy consumption.
2. The system can optimize the supply of goods and improve enterprises' economic returns. In the past, manufacturers organized production according to their own expectations and judgment of the market. Those expectations did not necessarily match real market demand, which led to product backlog and inventory and raised costs. In the Internet era, big data can be used to link production with sales: production is set according to consumers' real demand, which improves the precision of industrial product sales, avoids unnecessary raw material and labor costs, minimizes inventory, lowers costs, and raises economic returns. For example, a manufacturer can use the big data collected by e-commerce platforms to obtain consumers' real demand intuitively and conveniently. JD.com, for instance, mines precise data on its users' entire purchase process (browsing, comparing, selecting, buying, and reviewing) and feeds it back to manufacturers, so that enterprises have data support from the initial manufacturing cost accounting onward and can design and produce products based on big data with higher efficiency and lower cost.
3. The system can help cultivate new growth engines. Using big data to transform traditional growth drivers and cultivate new ones meets the needs of China's economic development and is significant for achieving innovation-driven transformation of the real economy, with broad prospects. In the new era, the potential value of the big data economy will inevitably drive the development of the associated information industry, including big data resource construction, big data technology, and big data applications.
Finally, it should be noted that the foregoing describes only specific embodiments of the invention, and the scope of protection of the invention is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within the scope of protection of the invention.

Claims (9)

1. A corporate behavior analysis system based on big data, characterized in that it comprises a data acquisition and processing platform, a data warehouse, a data control platform, a data analysis and mining platform, a performance analysis platform, and a data visualization presentation platform;
the data acquisition and processing platform is used to acquire the various source data of the enterprise and to process the source data to obtain target data;
the data warehouse is used to store the target data;
the data control platform is used to provide metadata management, master data management, data quality management, data standard management, and data security management services;
the data analysis and mining platform is used to provide an algorithm model library and data analysis and mining tools;
the performance analysis platform is used to process the target data with the algorithm model library and the data analysis and mining tools to obtain a processing result, the processing result being able to provide performance analysis and decision support for the enterprise;
the data visualization presentation platform is used to present the processing result in various visual forms.
2. The corporate behavior analysis system based on big data according to claim 1, characterized in that the data warehouse includes a distributed columnar database and a distributed file system.
3. The corporate behavior analysis system based on big data according to claim 1, characterized in that the source data include one or more of structured data, semi-structured/unstructured data, and real-time data.
4. The corporate behavior analysis system based on big data according to claim 3, characterized in that the sources of the various source data of the enterprise specifically include data from the enterprise's existing business systems, real-time data acquired through a distributed message queue, and Internet data acquired through web crawler technology.
5. The corporate behavior analysis system based on big data according to claim 4, characterized in that the sources of the various source data of the enterprise further include data filled in online and data uploaded as report files.
6. The corporate behavior analysis system based on big data according to claim 1, characterized in that the ways of processing the source data include data cleaning, data deduplication, and data processing;
the data cleaning refers to deleting irrelevant data from the source data, smoothing noisy data, screening out data unrelated to the subject, and handling missing values and outliers, wherein missing values are handled by deletion, substitution, or interpolation;
the data deduplication refers to removing duplicate data from the source data;
the data processing refers to splitting and merging columns of the data in the source data.
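As a rough illustration of the cleaning steps named in this claim, the sketch below shows the three missing-value strategies (deletion, substitution, and interpolation) and a simple deduplication in plain Python. The claim does not fix any particular method, so the choice of mean substitution and linear interpolation, as well as all function names, are assumptions made for the example.

```python
from typing import Hashable, List, Optional, Sequence


def drop_missing(values: Sequence[Optional[float]]) -> List[float]:
    """Deletion: discard entries whose value is missing (None)."""
    return [v for v in values if v is not None]


def fill_with_mean(values: Sequence[Optional[float]]) -> List[float]:
    """Substitution: replace missing entries with the mean of the known ones.

    Assumes at least one value is present.
    """
    present = drop_missing(values)
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]


def interpolate_linear(values: Sequence[Optional[float]]) -> List[Optional[float]]:
    """Interpolation: fill interior gaps on a straight line between neighbours.

    Leading and trailing gaps have no neighbour on one side and stay None.
    """
    result = list(values)
    known = [i for i, v in enumerate(values) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (values[right] - values[left]) / (right - left)
        for i in range(left + 1, right):
            result[i] = values[left] + step * (i - left)
    return result


def deduplicate(rows: Sequence[Hashable]) -> list:
    """Remove repeated rows while keeping the first occurrence of each."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique
```

Which strategy is appropriate depends on the data: deletion loses records, mean substitution flattens variance, and interpolation only makes sense for ordered series.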
7. The corporate behavior analysis system based on big data according to claim 1, characterized in that the algorithm model library includes a naive Bayes model, and the target data are classified using the naive Bayes model; wherein the expression of the naive Bayes model is as follows:
P(B | A) = P(B) × P(A | B) / P(A)
where P(B | A) denotes the probability that hypothesis B holds given data A, i.e. the posterior probability; P(A) denotes the prior probability of the observed training data A; P(B) denotes the prior probability of hypothesis B, i.e. the probability of B before any training data are available; and P(A | B) denotes the probability of data A given that hypothesis B holds.
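A minimal sketch of how a classifier could apply this formula to categorical features is given below. It is not taken from the patent: the feature independence assumption is what makes the model "naive", add-one (Laplace) smoothing is added here so that unseen values do not produce a zero probability, and the sample data and function names are hypothetical. P(A) is omitted because it is the same for every candidate class and does not change which class scores highest.

```python
from collections import Counter, defaultdict
from math import log


def train(samples):
    """Count class and per-feature value frequencies.

    samples: list of (features, label) pairs, where features is a tuple
    of categorical values.
    """
    label_counts = Counter(label for _, label in samples)
    feature_counts = defaultdict(Counter)  # (label, position) -> value counts
    vocab = defaultdict(set)               # position -> distinct values seen
    for features, label in samples:
        for pos, value in enumerate(features):
            feature_counts[(label, pos)][value] += 1
            vocab[pos].add(value)
    return label_counts, feature_counts, vocab


def classify(model, features):
    """Return the label B that maximizes log P(B) + sum_i log P(A_i | B)."""
    label_counts, feature_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in label_counts.items():
        score = log(count / total)  # log prior, log P(B)
        for pos, value in enumerate(features):
            seen = feature_counts[(label, pos)][value]
            # Smoothed likelihood estimate for P(A_i | B).
            score += log((seen + 1) / (count + len(vocab[pos])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label
```

Working in log space avoids numerical underflow when many small probabilities are multiplied together.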
8. The corporate behavior analysis system based on big data according to claim 1, characterized in that the processing of the target data by the data analysis and mining tools specifically includes probability description, association analysis, classification, cluster analysis, forecast analysis, and deviation detection analysis;
wherein the probability description includes characteristic description and discriminative description; the characteristic description represents the common features of a class of target data, and the discriminative description represents the differences between different classes of target data;
the association analysis is used to discover, from a large amount of target data, association rules, correlations, and causal structures between itemsets, as well as frequent patterns of itemsets; an association rule describes the degree of mutual influence between attributes and is measured by confidence and support, where confidence measures the accuracy of the association rule and support measures its importance;
the classification is used to find, given the features and classes of known training data, a reasonable model for each class of target data, and then to classify new data with that model;
the cluster analysis is used to group information according to the principle of information similarity when the classes are not known in advance;
the forecast analysis is used to predict continuous or ordered values;
the deviation detection analysis is used to analyze significant changes and deviations between the current state of the data and historical records or standards, to identify abnormal records, and to take corrective action.
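The two measures named for association rules in this claim can be made concrete with a short sketch. The definitions below are the standard ones (support as the fraction of transactions containing an itemset, confidence as the support of the whole rule divided by the support of its antecedent); the claim itself does not spell them out, and the transaction data and function names are hypothetical.

```python
from typing import Iterable, Sequence, Set


def support(transactions: Sequence[Set[str]], itemset: Iterable[str]) -> float:
    """Fraction of transactions that contain every item in the itemset."""
    wanted = frozenset(itemset)
    hits = sum(1 for t in transactions if wanted <= set(t))
    return hits / len(transactions)


def confidence(
    transactions: Sequence[Set[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> float:
    """Estimate P(consequent | antecedent) for the rule antecedent -> consequent."""
    antecedent = set(antecedent)
    joint = support(transactions, antecedent | set(consequent))
    base = support(transactions, antecedent)
    return joint / base if base else 0.0
```

A rule is usually kept only when both values exceed chosen thresholds, so that it is both frequent enough to matter (support) and reliable enough to act on (confidence).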
9. The corporate behavior analysis system based on big data according to claim 1, characterized in that the data visualization presentation platform includes a J2EE platform and visualization presentation components, the visualization presentation components including an ad hoc query component, a report and dashboard component, an OLAP multidimensional analysis component, and a map presentation component.
CN201811058169.7A 2018-09-11 2018-09-11 Enterprise behavior analysis system based on big data Expired - Fee Related CN109272155B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811058169.7A CN109272155B (en) 2018-09-11 2018-09-11 Enterprise behavior analysis system based on big data

Publications (2)

Publication Number Publication Date
CN109272155A true CN109272155A (en) 2019-01-25
CN109272155B CN109272155B (en) 2021-07-06

Family

ID=65188505

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811058169.7A Expired - Fee Related CN109272155B (en) 2018-09-11 2018-09-11 Enterprise behavior analysis system based on big data

Country Status (1)

Country Link
CN (1) CN109272155B (en)

Cited By (37)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110083641A (en) * 2019-04-26 2019-08-02 广州大学 Intelligence analysis method and device based on goal behavior
CN110119395A (en) * 2019-05-27 2019-08-13 普元信息技术股份有限公司 The method that data standard and quality of data association process are realized based on metadata in big data improvement
CN110175151A (en) * 2019-05-22 2019-08-27 中国农业科学院农业信息研究所 A kind of processing method, device, equipment and the storage medium of agricultural big data
CN110288248A (en) * 2019-06-28 2019-09-27 重庆回形针信息技术有限公司 A kind of business administration resource sharing system, method and business model
CN110362605A (en) * 2019-06-04 2019-10-22 苏州神州数码捷通科技有限公司 A kind of E book data verification method based on big data
CN110489459A (en) * 2019-08-07 2019-11-22 国网安徽省电力有限公司 A kind of enterprise-level industry number fused data analysis system based on big data platform
CN110716774A (en) * 2019-08-22 2020-01-21 华信永道(北京)科技股份有限公司 Data driving method, system and storage medium for brain of financial business data
CN110716966A (en) * 2019-10-16 2020-01-21 京东方科技集团股份有限公司 Data visualization processing method and system, electronic device and storage medium
CN110990469A (en) * 2019-11-18 2020-04-10 北京禧云信息科技有限公司 Data authorization and data self-service extraction method and device based on data warehouse
CN111064790A (en) * 2019-12-18 2020-04-24 广州森立公共服务有限公司 Human resource management system
CN111126852A (en) * 2019-12-25 2020-05-08 江苏三六五网络股份有限公司 BI application system based on big data modeling
CN111258968A (en) * 2019-12-30 2020-06-09 广州博士信息技术研究院有限公司 Enterprise redundant data cleaning method and device and big data platform
CN111640040A (en) * 2020-04-07 2020-09-08 国网新疆电力有限公司 Power supply customer value evaluation method based on customer portrait technology and big data platform
CN111708774A (en) * 2020-04-16 2020-09-25 上海华东电信研究院 Industry analytic system based on big data
CN112231116A (en) * 2020-10-12 2021-01-15 航天科工广信智能技术有限公司 Object fusion method of microwave radar and application system thereof
CN112270613A (en) * 2020-09-29 2021-01-26 广东工业大学 Manufacturing process big data modeling method for whole-process manufacturing management and control of manufacturing enterprise
CN112419027A (en) * 2020-11-26 2021-02-26 天翼征信有限公司 Financial platform system based on operator big data
CN112445853A (en) * 2020-11-18 2021-03-05 广东赛意信息科技有限公司 Digital transparent visual analysis platform
CN112632173A (en) * 2020-12-30 2021-04-09 民生科技有限责任公司 ETL-based due diligence data analysis system and method under mass data
CN112631794A (en) * 2020-12-02 2021-04-09 红云红河烟草(集团)有限责任公司 Information integration method for cigarette manufacturing
CN112685514A (en) * 2021-01-08 2021-04-20 北京云桥智联科技有限公司 AI intelligent customer value management platform
CN112817706A (en) * 2019-11-15 2021-05-18 杭州海康威视数字技术股份有限公司 Distributed task scheduling system and method
CN112819297A (en) * 2021-01-18 2021-05-18 树根互联股份有限公司 Production task completion efficiency analysis method and device and terminal equipment
CN112837199A (en) * 2021-02-25 2021-05-25 重庆数联铭信科技有限公司 Method for establishing big data service platform of small and medium-sized micro-enterprises
CN112965975A (en) * 2021-02-22 2021-06-15 上海明略人工智能(集团)有限公司 Data processing method and system
CN112988865A (en) * 2021-03-02 2021-06-18 中国联合网络通信集团有限公司 Industrial Internet service management system
CN113159547A (en) * 2021-04-12 2021-07-23 上海财经大学浙江学院 Enterprise data monitoring method and system based on big data architecture
CN113177698A (en) * 2021-04-12 2021-07-27 北京科技大学 Industrial big data analysis aid decision platform system
CN113254013A (en) * 2021-07-16 2021-08-13 电子科技大学 Reusable component mining method for complex business process
CN113627865A (en) * 2020-05-07 2021-11-09 景德镇陶瓷大学 Enterprise management analysis system for business administration
CN113742315A (en) * 2021-08-17 2021-12-03 广州工业智能研究院 Manufacturing big data processing platform and method
CN114859744A (en) * 2022-05-07 2022-08-05 内蒙古云科数据服务股份有限公司 Intelligent application visualization control method and system based on big data
CN115190026A (en) * 2022-05-09 2022-10-14 广州中南网络技术有限公司 Internet digital circulation method
CN115630839A (en) * 2022-11-01 2023-01-20 苏州泽达兴邦医药科技有限公司 Production intelligent feedback regulation and control system based on data mining
CN116594987A (en) * 2023-06-18 2023-08-15 广东南华工商职业学院 Database analysis system and method based on big data
CN117688319A (en) * 2023-11-10 2024-03-12 山东恒云信息科技有限公司 Method for analyzing database structure by using AI
CN117725086A (en) * 2024-02-06 2024-03-19 中科云谷科技有限公司 Big data service system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103678665A (en) * 2013-12-24 2014-03-26 焦点科技股份有限公司 Heterogeneous large data integration method and system based on data warehouses
CN105787064A (en) * 2016-03-01 2016-07-20 广州铭诚计算机科技有限公司 Mining platform establishment method based on big data
CN107193994A (en) * 2017-06-07 2017-09-22 前海梧桐(深圳)数据有限公司 Business decision point method for digging and its system based on mass data
CN107491553A (en) * 2017-08-31 2017-12-19 武汉光谷信息技术股份有限公司 A kind of data digging method and system

Cited By (46)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110083641A (en) * 2019-04-26 2019-08-02 广州大学 Intelligence analysis method and device based on goal behavior
CN110175151A (en) * 2019-05-22 2019-08-27 中国农业科学院农业信息研究所 A kind of processing method, device, equipment and the storage medium of agricultural big data
CN110119395A (en) * 2019-05-27 2019-08-13 普元信息技术股份有限公司 The method that data standard and quality of data association process are realized based on metadata in big data improvement
CN110119395B (en) * 2019-05-27 2023-09-15 普元信息技术股份有限公司 Method for realizing association processing of data standard and data quality based on metadata in big data management
CN110362605A (en) * 2019-06-04 2019-10-22 苏州神州数码捷通科技有限公司 A kind of E book data verification method based on big data
CN110288248A (en) * 2019-06-28 2019-09-27 重庆回形针信息技术有限公司 A kind of business administration resource sharing system, method and business model
CN110489459A (en) * 2019-08-07 2019-11-22 国网安徽省电力有限公司 A kind of enterprise-level industry number fused data analysis system based on big data platform
CN110716774A (en) * 2019-08-22 2020-01-21 华信永道(北京)科技股份有限公司 Data driving method, system and storage medium for brain of financial business data
CN110716966A (en) * 2019-10-16 2020-01-21 京东方科技集团股份有限公司 Data visualization processing method and system, electronic device and storage medium
CN112817706A (en) * 2019-11-15 2021-05-18 杭州海康威视数字技术股份有限公司 Distributed task scheduling system and method
CN112817706B (en) * 2019-11-15 2023-06-02 杭州海康威视数字技术股份有限公司 Distributed task scheduling system and method
CN110990469A (en) * 2019-11-18 2020-04-10 北京禧云信息科技有限公司 Data authorization and data self-service extraction method and device based on data warehouse
CN110990469B (en) * 2019-11-18 2024-02-20 北京禧云信息科技有限公司 Method and device for data authorization and data self-help extraction based on data warehouse
CN111064790A (en) * 2019-12-18 2020-04-24 广州森立公共服务有限公司 Human resource management system
CN111126852A (en) * 2019-12-25 2020-05-08 江苏三六五网络股份有限公司 BI application system based on big data modeling
CN111258968A (en) * 2019-12-30 2020-06-09 广州博士信息技术研究院有限公司 Enterprise redundant data cleaning method and device and big data platform
CN111640040A (en) * 2020-04-07 2020-09-08 国网新疆电力有限公司 Power supply customer value evaluation method based on customer portrait technology and big data platform
CN111708774A (en) * 2020-04-16 2020-09-25 上海华东电信研究院 Industry analytic system based on big data
CN111708774B (en) * 2020-04-16 2023-03-10 上海华东电信研究院 Industry analytic system based on big data
CN113627865A (en) * 2020-05-07 2021-11-09 景德镇陶瓷大学 Enterprise management analysis system for business administration
CN112270613A (en) * 2020-09-29 2021-01-26 广东工业大学 Manufacturing process big data modeling method for whole-process manufacturing management and control of manufacturing enterprise
CN112270613B (en) * 2020-09-29 2024-04-26 广东工业大学 Manufacturing process big data modeling method for manufacturing enterprise full-flow manufacturing control
CN112231116A (en) * 2020-10-12 2021-01-15 航天科工广信智能技术有限公司 Object fusion method of microwave radar and application system thereof
CN112445853A (en) * 2020-11-18 2021-03-05 广东赛意信息科技有限公司 Digital transparent visual analysis platform
CN112419027A (en) * 2020-11-26 2021-02-26 天翼征信有限公司 Financial platform system based on operator big data
CN112631794A (en) * 2020-12-02 2021-04-09 红云红河烟草(集团)有限责任公司 Information integration method for cigarette manufacturing
CN112632173A (en) * 2020-12-30 2021-04-09 民生科技有限责任公司 ETL-based due diligence data analysis system and method under mass data
CN112685514A (en) * 2021-01-08 2021-04-20 北京云桥智联科技有限公司 AI intelligent customer value management platform
CN112819297A (en) * 2021-01-18 2021-05-18 树根互联股份有限公司 Production task completion efficiency analysis method and device and terminal equipment
CN112965975A (en) * 2021-02-22 2021-06-15 上海明略人工智能(集团)有限公司 Data processing method and system
CN112837199A (en) * 2021-02-25 2021-05-25 重庆数联铭信科技有限公司 Method for establishing big data service platform of small and medium-sized micro-enterprises
CN112988865B (en) * 2021-03-02 2023-06-16 中国联合网络通信集团有限公司 Industrial Internet service management system
CN112988865A (en) * 2021-03-02 2021-06-18 中国联合网络通信集团有限公司 Industrial Internet service management system
CN113177698A (en) * 2021-04-12 2021-07-27 北京科技大学 Industrial big data analysis aid decision platform system
CN113159547A (en) * 2021-04-12 2021-07-23 上海财经大学浙江学院 Enterprise data monitoring method and system based on big data architecture
CN113254013A (en) * 2021-07-16 2021-08-13 电子科技大学 Reusable component mining method for complex business process
CN113742315A (en) * 2021-08-17 2021-12-03 广州工业智能研究院 Manufacturing big data processing platform and method
CN114859744A (en) * 2022-05-07 2022-08-05 内蒙古云科数据服务股份有限公司 Intelligent application visualization control method and system based on big data
CN115190026A (en) * 2022-05-09 2022-10-14 广州中南网络技术有限公司 Internet digital circulation method
CN115630839A (en) * 2022-11-01 2023-01-20 苏州泽达兴邦医药科技有限公司 Production intelligent feedback regulation and control system based on data mining
CN115630839B (en) * 2022-11-01 2023-11-10 苍南县求是中医药创新研究院 Intelligent feedback production regulation and control system based on data mining
CN116594987A (en) * 2023-06-18 2023-08-15 广东南华工商职业学院 Database analysis system and method based on big data
CN117688319A (en) * 2023-11-10 2024-03-12 山东恒云信息科技有限公司 Method for analyzing database structure by using AI
CN117688319B (en) * 2023-11-10 2024-05-07 山东恒云信息科技有限公司 Method for analyzing database structure by using AI
CN117725086A (en) * 2024-02-06 2024-03-19 中科云谷科技有限公司 Big data service system
CN117725086B (en) * 2024-02-06 2024-05-07 中科云谷科技有限公司 Big data service system

Also Published As

Publication number Publication date
CN109272155B (en) 2021-07-06

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee (granted publication date: 20210706)