CN108304551A - An enterprise big data analysis system and method - Google Patents

An enterprise big data analysis system and method

Info

Publication number
CN108304551A
Authority
CN
China
Prior art keywords
data
enterprise
subsystem
big
acquisition
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810101175.XA
Other languages
Chinese (zh)
Inventor
崔恩泉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong ICity Information Technology Co., Ltd.
Original Assignee
Shandong Hui Trade Electronic Port Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Hui Trade Electronic Port Co Ltd
Priority to CN201810101175.XA
Publication of CN108304551A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 - Design, administration or maintenance of databases
    • G06F16/215 - Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 - Querying
    • G06F16/245 - Query processing
    • G06F16/2458 - Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 - Approximate or statistical queries
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 - Querying
    • G06F16/245 - Query processing
    • G06F16/2458 - Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465 - Query processing support for facilitating data mining operations in structured databases
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 - Integrating or interfacing systems involving database management systems
    • G06F16/254 - Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 - Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 - Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 - Details of database functions independent of the retrieved data types
    • G06F16/95 - Retrieval from the web
    • G06F16/951 - Indexing; Web crawling techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Probability & Statistics with Applications (AREA)
  • Fuzzy Systems (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Quality & Reliability (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses an enterprise big data analysis system and method, belonging to the field of cloud computing technology, which collect, store, analyze, process and present data in order to integrate it. The system comprises: a data acquisition subsystem for collecting data, including a log acquisition module, a network data acquisition module and other data acquisition modules; a data storage subsystem for importing the collected data into a database; a data processing subsystem for cleaning, transforming, extracting and computing on the data in the data storage subsystem; a data analysis subsystem for performing statistical analysis and in-depth mining on the data processed by the data processing subsystem; and a data presentation subsystem for presenting the data. By integrating data and presenting it quickly and intuitively, the invention lets enterprises analyze big data more easily, makes big data more convenient for enterprises or individuals to use, and benefits enterprise development.

Description

An enterprise big data analysis system and method
Technical field
The present invention relates to the field of cloud computing technology, and specifically to an enterprise big data analysis system and method.
Background
Data analysis is the process of applying appropriate statistical methods to large amounts of collected data in order to extract useful information, draw conclusions, and study and summarize the data in detail. It is also a supporting process of a quality management system. In practice, data analysis helps people make judgments so that they can take appropriate action. The mathematical foundations of data analysis were established in the early 20th century, but only the advent of the computer made it practical and allowed it to spread. Data analysis is a product of the combination of mathematics and computer science.
The present era is the era of big data and cloud computing, and daily life is inseparable from huge volumes of data. According to the statistics cited, every second people send 290 emails and Amazon handles 72.9 orders; every minute people upload 20 hours of video to YouTube; and every month people spend a total of 700 billion minutes browsing Facebook. Alongside this volume there is the problem of data diversity, which has two main causes. First, there are many data sources: search engines, social networks, call records, sensors and so on. Second, there are many data formats: structured, semi-structured and unstructured data.
The problems data analysis currently faces are large data volumes, diverse structural forms and real-time requirements. These make data acquisition and integration harder, and traditional storage architectures based on blocks and files cannot meet the needs of data analysis. Faced with so much data, how to make the data generate value and bring benefit is the problem that currently needs to be considered.
Summary of the invention
The technical task of the present invention is to address the above shortcomings by providing an enterprise big data analysis system and method that collect, store, analyze, process and present data, so that enterprises can analyze big data more easily and intuitively, and enterprises or individuals can use big data more conveniently.
An enterprise big data analysis system collects, stores, analyzes, processes and presents data, thereby integrating the data. It comprises:
A data acquisition subsystem, for collecting data, including a log acquisition module, a network data acquisition module and other data acquisition modules;
A data storage subsystem, for importing the collected data into a database;
A data processing subsystem, for cleaning, transforming, extracting and computing on the data in the data storage subsystem;
A data analysis subsystem, for performing statistical analysis and in-depth mining on the data processed by the data processing subsystem;
A data presentation subsystem, for presenting the data.
Preferably, the log acquisition module has a distributed architecture and can meet log data collection and transmission demands of hundreds of MB per second.
Preferably, the log acquisition module is plug-in based: acquisition plug-ins are built to suit each business scenario, and the system calls a different acquisition service for each log source, processes the data into a unified format, and persists it to the log store.
Preferably, the network data acquisition module obtains data from websites through web crawlers or the websites' public APIs, extracts unstructured data from web pages, and stores it in structured form as unified local data files.
Further, the network data acquisition module supports collecting pictures, audio and video files, and attachments, and attachments are automatically associated with the body text.
Preferably, the other data acquisition modules work through cooperation with enterprises or research institutions: to meet the confidentiality requirements of enterprise production and operation data or academic research data, they collect data through dedicated system interfaces, which improves data security.
Preferably, the data storage subsystem is a centralized large-scale distributed database.
Preferably, the cleaning performed by the data processing subsystem includes double-entry comparison of data, data merging, finding duplicate values, finding missing values and finding outliers.
Preferably, the extraction performed by the data processing subsystem includes data extraction, data transformation, data processing and data loading.
An enterprise big data analysis method uses the above enterprise big data analysis system to integrate data. The steps of the method are as follows:
S1: Collect data with the data acquisition subsystem, including log collection, network data collection, and other data collection through the dedicated system interfaces used to meet the confidentiality requirements of enterprise production and operation data or academic research data;
S2: Import the collected data into the data storage subsystem, that is, store it in a large-scale distributed database;
S3: Clean, transform, extract and compute on the data in the data storage subsystem with the data processing subsystem; here, cleaning includes double-entry comparison of data, data merging, finding duplicate values, finding missing values and finding outliers, and extraction includes data extraction, data transformation, data processing and data loading;
S4: Perform statistical analysis and in-depth mining on the data processed in step S3 with the data analysis subsystem;
S5: Present the data processed in step S4 in the form of tables, pictures or text with the data presentation subsystem.
This achieves data integration and quick, intuitive presentation.
The enterprise big data analysis system and method of the present invention have the following advantages:
The system and method collect, store, process and analyze data, and then present the analyzed data to the user as text, pictures and tables, achieving data integration and quick, intuitive presentation.
In the data acquisition subsystem, the log acquisition module uses a distributed architecture and can meet log data collection and transmission demands of hundreds of MB per second; the network data acquisition module supports collecting pictures, audio and video files, and attachments, and attachments can be automatically associated with the body text; the other data acquisition modules handle data with higher confidentiality requirements, such as enterprise production and operation data or academic research data, and improve data security by using dedicated system interfaces.
The present invention suits the current era of cloud computing and big data: massive raw data is collected at scale, passed through data processing and data analysis, and the resulting analysis is finally presented in a form that is easy to understand. This workflow makes it easier for enterprises to use big data and cloud computing, lets enterprises analyze big data more easily and intuitively, makes big data more convenient for enterprises or individuals to use, and benefits enterprise development.
Description of the drawings
To explain the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed to describe them are briefly introduced below. The drawings described below are merely embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Figure 1 is a structural diagram of the enterprise big data analysis system;
Figure 2 is a flowchart of the enterprise big data analysis method in the embodiment.
Detailed description
To help those skilled in the art better understand the solution of the present invention, the invention is described in further detail below with reference to specific embodiments. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort shall fall within the scope of protection of the present invention.
An enterprise big data analysis system collects, stores, analyzes, processes and presents data, thereby integrating the data. It comprises a data acquisition subsystem, a data storage subsystem, a data processing subsystem, a data analysis subsystem and a data presentation subsystem.
The data acquisition subsystem collects data and includes a log acquisition module, a network data acquisition module and other data acquisition modules.
The log acquisition module has a distributed architecture and can meet log data collection and transmission demands of hundreds of MB per second. It is plug-in based: acquisition plug-ins are built to suit each business scenario, and the system calls a different acquisition service for each log source, processes the data into a unified format, and persists it to the log store.
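The plug-in pattern described for the log acquisition module can be sketched as follows. This is only an illustration under stated assumptions, not the implementation disclosed in the application: the registry `PLUGINS`, the `plugin` decorator, the two log formats and the in-memory `log_store` list are all hypothetical, and the distributed transport is left out.

```python
import json
import re
from typing import Callable, Dict, List

# Hypothetical registry: log-source name -> acquisition plug-in (a parser function).
PLUGINS: Dict[str, Callable[[str], dict]] = {}


def plugin(source: str):
    """Register a parser as the acquisition plug-in for one log source."""
    def register(func):
        PLUGINS[source] = func
        return func
    return register


@plugin("nginx")
def parse_nginx(line: str) -> dict:
    # Assumed line format: <ip> [<time>] "<request>" <status>
    match = re.match(r'(\S+) \[(.+?)\] "(.+?)" (\d{3})', line)
    if match is None:
        raise ValueError(f"unrecognized nginx line: {line!r}")
    ip, ts, request, status = match.groups()
    level = "ERROR" if int(status) >= 500 else "INFO"
    return {"time": ts, "level": level, "message": f"{ip} {request} -> {status}"}


@plugin("app_json")
def parse_app_json(line: str) -> dict:
    # Assumed line format: one JSON object with ts, lvl and msg keys.
    rec = json.loads(line)
    return {"time": rec["ts"], "level": rec["lvl"].upper(), "message": rec["msg"]}


def collect(source: str, lines: List[str], log_store: List[dict]) -> None:
    """Pick the plug-in for the source, unify the format, persist to the store."""
    parse = PLUGINS[source]
    for line in lines:
        record = parse(line)
        record["source"] = source
        log_store.append(record)  # stand-in for persisting to the log store
```

Every record reaches the store with the same four keys (`time`, `level`, `message`, `source`), whichever plug-in produced it, so a new log source only needs a new registered parser.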
The network data acquisition module obtains data from websites through web crawlers or the websites' public APIs, extracts unstructured data from web pages, and stores it in structured form as unified local data files. The module supports collecting pictures, audio and video files, and attachments, and attachments are automatically associated with the body text.
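As a rough illustration of turning a web page into a structured record whose attachments stay associated with the body text, the sketch below uses Python's standard `html.parser`. The class `ArticleExtractor`, the record layout and the list of attachment extensions are assumptions made for the example; fetching pages and downloading files are omitted.

```python
import json
from html.parser import HTMLParser


class ArticleExtractor(HTMLParser):
    """Collect the page title, paragraph text, and media/attachment URLs."""

    MEDIA_TAGS = {"img", "audio", "video", "source"}
    ATTACHMENT_EXTENSIONS = (".pdf", ".doc", ".zip")  # assumed list

    def __init__(self):
        super().__init__()
        self.title = ""
        self.paragraphs = []
        self.attachments = []
        self._current = None  # tag whose text is currently being captured

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("title", "p"):
            self._current = tag
            if tag == "p":
                self.paragraphs.append("")
        elif tag in self.MEDIA_TAGS and attrs.get("src"):
            self.attachments.append(attrs["src"])
        elif tag == "a":
            href = attrs.get("href") or ""
            if href.lower().endswith(self.ATTACHMENT_EXTENSIONS):
                self.attachments.append(href)

    def handle_endtag(self, tag):
        if tag == self._current:
            self._current = None

    def handle_data(self, data):
        if self._current == "title":
            self.title += data
        elif self._current == "p":
            self.paragraphs[-1] += data


def extract(url: str, html: str) -> dict:
    """Return one structured record; attachments are kept next to the text."""
    parser = ArticleExtractor()
    parser.feed(html)
    parser.close()
    text = "\n".join(p.strip() for p in parser.paragraphs if p.strip())
    return {"url": url, "title": parser.title.strip(),
            "text": text, "attachments": parser.attachments}


def to_local_file_content(record: dict) -> str:
    """Serialize the record in one uniform format (JSON is an assumption)."""
    return json.dumps(record, ensure_ascii=False, indent=2)
```

Keeping the attachment URLs inside the same record as the text is one simple way to realize the automatic association; a real crawler would also need to resolve relative URLs.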
The other data acquisition modules work through cooperation with enterprises or research institutions: to meet the confidentiality requirements of enterprise production and operation data or academic research data, they collect data through dedicated system interfaces, which improves data security.
The data storage subsystem imports the collected data into a database. The data storage subsystem is a centralized large-scale distributed database.
The data processing subsystem cleans, transforms, extracts and computes on the data in the data storage subsystem.
The cleaning performed by the data processing subsystem includes double-entry comparison of data, data merging, finding duplicate values, finding missing values and finding outliers.
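The five cleaning operations named above can each be sketched in a few lines of standard-library Python. The record layout (dictionaries with an `id` key) and the interquartile-range rule used for outliers are assumptions for illustration; the application does not specify how each check is carried out.

```python
from statistics import quantiles


def double_entry_mismatches(first, second, key="id"):
    """Compare two independent entries of the same records.

    Returns (key value, field) pairs that disagree; field is None when the
    record is missing from the second entry altogether.
    """
    second_by_key = {rec[key]: rec for rec in second}
    diffs = []
    for rec in first:
        other = second_by_key.get(rec[key])
        if other is None:
            diffs.append((rec[key], None))
            continue
        for field in rec:
            if rec[field] != other.get(field):
                diffs.append((rec[key], field))
    return diffs


def merge_records(*sources, key="id"):
    """Merge records that share a key; later non-empty values win."""
    merged = {}
    for source in sources:
        for rec in source:
            filled = {k: v for k, v in rec.items() if v not in (None, "")}
            merged.setdefault(rec[key], {}).update(filled)
    return list(merged.values())


def find_duplicates(records, key="id"):
    """Return key values that occur more than once."""
    seen, dups = set(), []
    for rec in records:
        if rec[key] in seen:
            dups.append(rec[key])
        seen.add(rec[key])
    return dups


def find_missing(records, fields, key="id"):
    """Return (key value, field) pairs whose value is absent or empty."""
    return [(rec.get(key), f) for rec in records for f in fields
            if rec.get(f) in (None, "")]


def find_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (an assumed rule)."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]
```

Each function only reports problems or returns a merged copy, so the decision about what to do with a flagged value stays with the caller.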
The extraction performed by the data processing subsystem includes data extraction, data transformation, data processing and data loading.
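The four stages listed here correspond to a conventional extract, transform, process and load flow. The sketch below shows one possible shape of such a flow, with assumptions throughout: the CSV sample, the `daily_sales` table and the yuan/fen unit conversion are invented for illustration, and an in-memory SQLite database stands in for the target store.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; amounts arrive in mixed units (1 yuan = 100 fen).
RAW_CSV = """order_id,amount,unit,date
1,100.50,yuan,2018-02-01
2,2000,fen,2018-02-01
3,80.00,yuan,2018-02-02
"""


def extract(text):
    """Extraction: read raw rows as dictionaries of strings."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transformation: cast types and convert every amount to yuan."""
    out = []
    for row in rows:
        amount = float(row["amount"])
        if row["unit"] == "fen":
            amount /= 100
        out.append({"order_id": int(row["order_id"]),
                    "amount_yuan": round(amount, 2),
                    "date": row["date"]})
    return out


def process(rows):
    """Processing: derive daily totals from the transformed rows."""
    totals = {}
    for row in rows:
        totals[row["date"]] = round(totals.get(row["date"], 0.0) + row["amount_yuan"], 2)
    return sorted(totals.items())


def load(daily_totals, conn):
    """Loading: write the derived rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS daily_sales "
                 "(date TEXT PRIMARY KEY, total_yuan REAL)")
    conn.executemany("INSERT OR REPLACE INTO daily_sales VALUES (?, ?)", daily_totals)
    conn.commit()
```

A typical call chains the stages: `load(process(transform(extract(RAW_CSV))), sqlite3.connect(":memory:"))`.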
The data analysis subsystem performs statistical analysis and in-depth mining on the data processed by the data processing subsystem.
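The application does not name the statistical or mining techniques used. Purely as an example of the two kinds of work, the sketch below computes descriptive statistics and counts item pairs that frequently occur together; both function names and the choice of pair counting as the mining step are assumptions.

```python
from collections import Counter
from itertools import combinations
from statistics import mean, pstdev


def describe(values):
    """Statistical analysis: basic descriptive statistics of a numeric column."""
    return {"count": len(values), "mean": mean(values),
            "stdev": pstdev(values), "min": min(values), "max": max(values)}


def frequent_pairs(baskets, min_support):
    """Mining: item pairs that appear together in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}
```

Pair counting is the first step of frequent-itemset methods; larger datasets would call for a dedicated algorithm rather than enumerating every pair.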
The data presentation subsystem presents the data. The forms of presentation include tables, pictures and text.
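A minimal sketch of the table and picture forms of presentation, rendered as plain text so that it needs no plotting library. The function names are hypothetical, and the bar rendering is only a stand-in for a real chart; it assumes at least one positive value.

```python
def render_table(headers, rows):
    """Render rows as an aligned plain-text table."""
    cells = [list(map(str, headers))] + [list(map(str, row)) for row in rows]
    widths = [max(len(row[i]) for row in cells) for i in range(len(headers))]
    lines = ["  ".join(c.ljust(w) for c, w in zip(row, widths)).rstrip()
             for row in cells]
    lines.insert(1, "  ".join("-" * w for w in widths))
    return "\n".join(lines)


def render_bars(pairs, width=20):
    """Render (label, value) pairs as horizontal bars scaled to the largest value."""
    top = max(value for _, value in pairs)
    return "\n".join(f"{label} {'#' * round(width * value / top)} {value}"
                     for label, value in pairs)
```

For instance, `render_bars([("a", 10), ("b", 5)], width=10)` draws ten marks for `a` and five for `b`.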
An enterprise big data analysis method uses the above enterprise big data analysis system to integrate data. The steps of the method are as follows:
S1: Collect data with the data acquisition subsystem, including log collection, network data collection, and other data collection through the dedicated system interfaces used to meet the confidentiality requirements of enterprise production and operation data or academic research data;
S2: Import the collected data into the data storage subsystem, that is, store it in a large-scale distributed database;
S3: Clean, transform, extract and compute on the data in the data storage subsystem with the data processing subsystem; here, cleaning includes double-entry comparison of data, data merging, finding duplicate values, finding missing values and finding outliers, and extraction includes data extraction, data transformation, data processing and data loading;
S4: Perform statistical analysis and in-depth mining on the data processed in step S3 with the data analysis subsystem;
S5: Present the data processed in step S4 in the form of tables, pictures or text with the data presentation subsystem. This achieves data integration and quick, intuitive presentation.
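To show how steps S1 to S5 hand data to one another, the sketch below chains five small functions. It is a toy under stated assumptions rather than the claimed system: the sample records are invented, an in-memory SQLite database stands in for the large-scale distributed database, and each step is reduced to a single operation.

```python
import sqlite3
from statistics import mean


def s1_acquire():
    """S1: collected records as (source, metric, value); hypothetical sample data."""
    return [("web", "orders", 120.0), ("log", "orders", 80.0),
            ("log", "orders", 80.0), ("api", "orders", None)]


def s2_store(conn, records):
    """S2: import the collected records into the database."""
    conn.execute("CREATE TABLE raw (source TEXT, metric TEXT, value REAL)")
    conn.executemany("INSERT INTO raw VALUES (?, ?, ?)", records)


def s3_process(conn):
    """S3: clean by dropping missing values and exact duplicate rows."""
    query = ("SELECT DISTINCT source, metric, value FROM raw "
             "WHERE value IS NOT NULL")
    return conn.execute(query).fetchall()


def s4_analyze(rows):
    """S4: a single summary statistic stands in for the analysis step."""
    values = [value for _, _, value in rows]
    return {"count": len(values), "mean": mean(values)}


def s5_present(stats):
    """S5: present the result as text."""
    return f"{stats['count']} records, mean value {stats['mean']:.1f}"


def run():
    conn = sqlite3.connect(":memory:")
    try:
        s2_store(conn, s1_acquire())
        return s5_present(s4_analyze(s3_process(conn)))
    finally:
        conn.close()
```

Calling `run()` on the sample data drops the duplicate and the missing value in S3 and reports on the two remaining records.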
The specific embodiments above are only particular examples of the present invention. The scope of patent protection of the invention includes but is not limited to these embodiments; any appropriate change or replacement that is consistent with the claims of the enterprise big data analysis system and method of the present invention and is made by a person of ordinary skill in the art shall fall within the scope of patent protection of the present invention.

Claims (10)

1. An enterprise big data analysis system, characterized in that it collects, stores, analyzes, processes and presents data, thereby integrating the data; comprising:
A data acquisition subsystem, for collecting data, including a log acquisition module, a network data acquisition module and other data acquisition modules;
A data storage subsystem, for importing the collected data into a database;
A data processing subsystem, for cleaning, transforming, extracting and computing on the data in the data storage subsystem;
A data analysis subsystem, for performing statistical analysis and in-depth mining on the data processed by the data processing subsystem;
A data presentation subsystem, for presenting the data.
2. The enterprise big data analysis system according to claim 1, characterized in that the log acquisition module has a distributed architecture.
3. The enterprise big data analysis system according to claim 1 or 2, characterized in that the log acquisition module is plug-in based: acquisition plug-ins are built to suit each business scenario, and the system calls a different acquisition service for each log source, processes the data into a unified format, and persists it to the log store.
4. The enterprise big data analysis system according to claim 1 or 2, characterized in that the network data acquisition module obtains data from websites, extracts unstructured data from web pages, and stores it in structured form as unified local data files.
5. The enterprise big data analysis system according to claim 4, characterized in that the network data acquisition module supports collecting pictures, audio and video files, and attachments, and attachments are automatically associated with the body text.
6. The enterprise big data analysis system according to claim 1 or 2, characterized in that the other data acquisition modules collect data through dedicated system interfaces to meet the confidentiality requirements of enterprise production and operation data or academic research data.
7. The enterprise big data analysis system according to claim 1 or 2, characterized in that the data storage subsystem is a centralized large-scale distributed database.
8. The enterprise big data analysis system according to claim 1, characterized in that the cleaning performed by the data processing subsystem includes double-entry comparison of data, data merging, finding duplicate values, finding missing values and finding outliers.
9. The enterprise big data analysis system according to claim 1 or 8, characterized in that the extraction performed by the data processing subsystem includes data extraction, data transformation, data processing and data loading.
10. An enterprise big data analysis method, characterized in that it uses the enterprise big data analysis system according to any one of claims 1-9 to integrate data, the steps of the method being as follows:
S1: Collect data with the data acquisition subsystem;
S2: Import the collected data into the data storage subsystem;
S3: Clean, transform, extract and compute on the data in the data storage subsystem with the data processing subsystem;
S4: Perform statistical analysis and in-depth mining on the data processed in step S3 with the data analysis subsystem;
S5: Present the data processed in step S4 in the form of tables, pictures or text with the data presentation subsystem.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810101175.XA CN108304551A (en) 2018-02-01 2018-02-01 A kind of enterprise's big data analysis system and method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810101175.XA CN108304551A (en) 2018-02-01 2018-02-01 A kind of enterprise's big data analysis system and method

Publications (1)

Publication Number Publication Date
CN108304551A true CN108304551A (en) 2018-07-20

Family

ID=62850874

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810101175.XA Pending CN108304551A (en) 2018-02-01 2018-02-01 A kind of enterprise's big data analysis system and method

Country Status (1)

Country Link
CN (1) CN108304551A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109308578A (en) * 2018-09-13 2019-02-05 江苏站企动网络科技有限公司 A kind of enterprise's big data analysis system and method
CN109359628A (en) * 2018-11-28 2019-02-19 上海风语筑展示股份有限公司 A kind of exhibition big data collection analysis platform
CN110287195A (en) * 2019-06-28 2019-09-27 重庆回形针信息技术有限公司 Distributed data analyzing system and method
CN112732688A (en) * 2020-12-30 2021-04-30 四川博道维新企业管理有限公司 Enterprise data acquisition and analysis system
CN112800118A (en) * 2021-04-01 2021-05-14 南泽(广东)科技股份有限公司 Service data integration system based on multi-dimensional analysis and data analysis method thereof
CN117453721A (en) * 2023-10-29 2024-01-26 江苏信而泰智能装备有限公司 Production management data acquisition system based on big data

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106339439A (en) * 2016-08-22 2017-01-18 成都众易通科技有限公司 Big data analysis method

Similar Documents

Publication Publication Date Title
CN108304551A (en) A kind of enterprise's big data analysis system and method
CN106339439A (en) Big data analysis method
CN110309264B (en) Method and device for acquiring geographic product data based on knowledge graph
CN103678647B (en) A kind of method and system for realizing information recommendation
CN106933724B (en) Distributed information tracking system, information processing method and device
CN104685495A (en) A system and method for automatic generation of information-rich content from multiple microblogs, each microblog containing only sparse information
CN109063196A (en) Data processing method, device, electronic equipment and computer readable storage medium
CN105069087A (en) Web log data mining based website optimization method
CN110533477A (en) A kind of intelligent analysis method and system based on big data
CN104216889B (en) Data dissemination analyzing and predicting method and system based on cloud service
CN106445894A (en) New media intelligent online editing method and apparatus, and network information release platform
Al-Taie et al. Online data preprocessing: A case study approach
Gomes et al. Towards an infrastructure to support big data for a smart city project
CN112948492A (en) Data processing system, method and device, electronic equipment and storage medium
CN109710767A (en) Multilingual big data service platform
CN106599190A (en) Dynamic Skyline query method based on cloud computing
Hongqian et al. Cloud-based data management system for automatic real-time data acquisition from large-scale laying-hen farms
CN109189842A (en) big data analysis method
Junaidi et al. Analysis of Community Response to Disasters through Twitter Social Media
CN114637903A (en) Public opinion data acquisition system for directional target data expansion
CN104063456B (en) Based on vector query from broadcasting media atlas analysis method and apparatus
CN106354770A (en) Data analysis system
JP2013003880A (en) Consent making support device, consent making support program, and consent making support method
CN107679240B (en) Virtual identity mining method
CN115080636A (en) Big data analysis system based on network service

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200811

Address after: 250100 Room 3110, S01 Building, Tidal Building, 1036 Tidal Road, Jinan High-tech Zone, Shandong Province

Applicant after: Shandong Aicheng Network Information Technology Co.,Ltd.

Address before: 250100 S06 Floor, No. 1036 Tidal Road, Jinan High-tech Zone, Shandong Province

Applicant before: SHANDONG HUIMAO ELECTRONIC PORT Co.,Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20180720
