CN108829704A - A big data distributed library analysis service technology - Google Patents

Info

Publication number
CN108829704A
Authority
CN
China
Prior art keywords
data
module
output end
input terminal
database
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810403493.1A
Other languages
Chinese (zh)
Inventor
晋亮亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Luckyfield Mdt Infotech Ltd
Original Assignee
Anhui Luckyfield Mdt Infotech Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui Luckyfield Mdt Infotech Ltd filed Critical Anhui Luckyfield Mdt Infotech Ltd
Priority to CN201810403493.1A priority Critical patent/CN108829704A/en
Publication of CN108829704A publication Critical patent/CN108829704A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2216/00 Indexing scheme relating to additional aspects of information retrieval not explicitly covered by G06F16/00 and subgroups
    • G06F2216/03 Data mining

Landscapes

  • Telephonic Communication Services (AREA)

Abstract

The invention discloses a big data distributed library analysis service technology comprising a collection terminal, a modeling unit, a database, a server, a display terminal and a power supply. The collection terminal includes a data collection module and a data transmission module; the modeling unit includes a data aggregation module, a data preprocessing module and a data classification module; the database includes a first data storage module, a second data storage module and an Nth data storage module; and the server includes a data mining module and a data analysis module. The invention collects user behavior data, preprocesses and aggregates it using a parallel computing model, builds a user behavior data ontology model from the aggregated data, and stores the model in the database. By combining the processing power and mass data storage capacity of cloud computing with ontologies, ontology reasoning and knowledge discovery methods, the invention achieves effective and accurate pushing to users.

Description

A big data distributed library analysis service technology
Technical field
The present invention relates to the technical field of communication services, and in particular to a big data distributed library analysis service technology.
Background art
With the arrival of the information age, the volume of data has grown geometrically. Many different data analysis algorithms have emerged to extract useful information from existing mass data. Big data technology is an information processing technology that takes all the data resources of a system as its object and discovers the correlations among the data. It is already widely used for internet process optimization, targeted message and advertisement pushing, and personalized user services and their improvement, and has become a powerful back-end support for network services. Analyzing and using all user behavior information on a big data platform suits the large scale, the complex and diverse data formats, and the high computing-speed requirements of user behavior information, and can meet the practical needs of all types of network services. Much research on user behavior analysis has been done at home and abroad, but some problems remain. First, most work concentrates on mining Web logs, but these logs are not sufficient to describe, in a timely manner, the context in which a user visits a website. Second, large websites generally have a huge number of online users who generate a huge amount of real-time behavior and context information, so the system needs strong storage capacity and computing speed to feed analysis results back to users in time. At present, however, most user behavior analysis systems use relational database technology and traditional data processing methods, which cannot satisfy the efficient analysis of mass data well. It is therefore necessary to invent a big data distributed library analysis service technology to solve the above problems.
Summary of the invention
The purpose of the present invention is to provide a big data distributed library analysis service technology that solves the problems raised in the background art above.
To achieve this purpose, the present invention provides the following technical solution: a big data distributed library analysis service technology comprising a collection terminal, a modeling unit, a database, a server, a display terminal and a power supply. The output end of the collection terminal is electrically connected to the input end of the modeling unit, the output end of the modeling unit is electrically connected to the input end of the database, the output end of the database is electrically connected to the input end of the server, and the output end of the server is electrically connected to the input end of the display terminal.
The collection terminal includes a data collection module and a data transmission module; the modeling unit includes a data aggregation module, a data preprocessing module and a data classification module; the database includes a first data storage module, a second data storage module and an Nth data storage module; and the server includes a data mining module and a data analysis module.
Preferably, the output end of the data collection module is electrically connected to the input end of the data transmission module, and the output end of the data transmission module is electrically connected to the input end of the data aggregation module.
Preferably, the output end of the data aggregation module is electrically connected to the input end of the data preprocessing module, the output end of the data preprocessing module is electrically connected to the input end of the data classification module, and the output end of the data classification module is electrically connected to the first data storage module, the second data storage module and the Nth data storage module respectively.
Preferably, the output end of the data mining module is electrically connected to the input end of the data analysis module.
Preferably, the output end of the power supply is electrically connected to the collection terminal, the modeling unit, the database, the server and the display terminal respectively.
Preferably, the collection terminal may be a personal computer, a mobile phone or a tablet computer.
Preferably, a processor is provided in the modeling unit, and the processor is a CC2530 microprocessor chip.
Technical effects and advantages of the invention: the invention collects user behavior data and preprocesses and aggregates it using a parallel computing model; from the aggregated user behavior data it builds a user behavior data ontology model and stores it in the database. The invention thus combines the processing power and mass data storage capacity of cloud computing with ontologies, ontology reasoning and knowledge discovery methods, analyzes mass user behavior data in real time, and obtains user interests in time, so as to push to users effectively and accurately.
Brief description of the drawings
Fig. 1 is a schematic diagram of the structural connections of the present invention.
Fig. 2 is a schematic diagram of the module connections of the present invention.
Fig. 3 is a schematic flow diagram of the present invention.
Specific embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Meanwhile, terms such as "upper", "lower", "left", "right", "middle" and "a" cited in this specification are used only for clarity of description and are not intended to limit the implementable scope of the present invention; changes or adjustments of their relative relationships, without substantive change to the technical content, shall also be regarded as within the implementable scope of the present invention.
As shown in Figs. 1-3, the present invention provides a big data distributed library analysis service technology comprising a collection terminal, a modeling unit, a database, a server, a display terminal and a power supply. The output end of the power supply is electrically connected to the collection terminal, the modeling unit, the database, the server and the display terminal respectively. The collection terminal may be a personal computer, a mobile phone or a tablet computer. A processor is provided in the modeling unit, and the processor is a CC2530 microprocessor chip. The output end of the collection terminal is electrically connected to the input end of the modeling unit, the output end of the modeling unit is electrically connected to the input end of the database, the output end of the database is electrically connected to the input end of the server, and the output end of the server is electrically connected to the input end of the display terminal.
The collection terminal includes a data collection module and a data transmission module, and the output end of the data collection module is electrically connected to the input end of the data transmission module. The modeling unit includes a data aggregation module, a data preprocessing module and a data classification module: the output end of the data transmission module is electrically connected to the input end of the data aggregation module, the output end of the data aggregation module is electrically connected to the input end of the data preprocessing module, and the output end of the data preprocessing module is electrically connected to the input end of the data classification module. The database includes a first data storage module, a second data storage module and an Nth data storage module, and the output end of the data classification module is electrically connected to each of them respectively. The server includes a data mining module and a data analysis module, and the output end of the data mining module is electrically connected to the input end of the data analysis module.
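The module chain described above can be read as a data-flow pipeline. The following is only a minimal sketch under stated assumptions: the function names, the record fields (`user`, `action`), the byte-sum sharding rule and the activity ranking are all hypothetical stand-ins chosen for illustration, since the patent does not specify what each module computes.

```python
from collections import defaultdict
from typing import Dict, Iterable, List

Record = Dict[str, str]  # hypothetical record shape: {"user": ..., "action": ...}


def collect(raw_events: Iterable[Record]) -> List[Record]:
    """Data collection module: gather raw events on the collection terminal."""
    return list(raw_events)


def transmit(records: List[Record]) -> List[Record]:
    """Data transmission module: stand-in for the hop to the modeling unit."""
    return [dict(r) for r in records]  # copy, to mimic serialization


def aggregate(records: List[Record]) -> Dict[str, List[Record]]:
    """Data aggregation module: group records by user (assumed grouping key)."""
    grouped: Dict[str, List[Record]] = defaultdict(list)
    for record in records:
        grouped[record["user"]].append(record)
    return dict(grouped)


def preprocess(grouped: Dict[str, List[Record]]) -> Dict[str, List[Record]]:
    """Data preprocessing module: drop records that carry no action."""
    return {user: [r for r in rs if r.get("action")] for user, rs in grouped.items()}


def classify(grouped: Dict[str, List[Record]], n_modules: int) -> List[Dict[str, List[Record]]]:
    """Data classification module: assign each user to one of N storage modules."""
    modules: List[Dict[str, List[Record]]] = [{} for _ in range(n_modules)]
    for user, rs in grouped.items():
        index = sum(user.encode("utf-8")) % n_modules  # assumed deterministic rule
        modules[index][user] = rs
    return modules


def mine(modules: List[Dict[str, List[Record]]]) -> Dict[str, int]:
    """Data mining module: merge all storage modules into per-user action counts."""
    counts: Dict[str, int] = {}
    for module in modules:
        for user, rs in module.items():
            counts[user] = counts.get(user, 0) + len(rs)
    return counts


def analyze(counts: Dict[str, int]) -> List[str]:
    """Data analysis module: rank users by activity, ties broken by name."""
    return sorted(counts, key=lambda user: (-counts[user], user))


def run_pipeline(raw_events: Iterable[Record], n_modules: int = 3) -> List[str]:
    """Chain the modules in the order given in the description."""
    records = transmit(collect(raw_events))
    modules = classify(preprocess(aggregate(records)), n_modules)
    return analyze(mine(modules))
```

Each function has a single input and a single output, mirroring the "output end to input end" connections of the description; the display terminal would simply render the list returned by `run_pipeline`.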
The modeling unit, the database and the server constitute a cloud computing platform, which is the core cluster of the system. The platform can be built from a large number of low-cost servers and can be based on a Linux system. Because the platform is scalable, its size can be set according to the volume of log data the system has to process. The platform hosts a distributed file storage system and a distributed parallel computing system. Specifically, each node of the platform runs threads of the distributed file storage system and of the distributed parallel computing system. The threads differ by node role and include the master and slave threads of the file storage system's directory and storage programs and of the parallel computing system's computation programs. Most nodes in the cluster serve simultaneously as a directory storage unit of the distributed file storage system and as a parallel computation unit of the distributed parallel computing system, but the computation threads execute and output results only when there is a computing task.
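The division of work between a master and several computation units can be sketched with a map-and-reduce pattern. This is an illustrative assumption, not the patent's implementation: the log line format `"<user> <page>"` and the function names are hypothetical, and a local thread pool stands in for cluster nodes (in CPython, threads will not speed up CPU-bound work; they only show the structure).

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Iterable, List


def map_partition(lines: List[str]) -> Counter:
    """Worker task: count page visits in one partition of the log."""
    counts: Counter = Counter()
    for line in lines:
        parts = line.split()
        if len(parts) >= 2:  # assumed line format: "<user> <page>"
            counts[parts[1]] += 1
    return counts


def reduce_counts(partials: Iterable[Counter]) -> Counter:
    """Master task: merge the partial counts returned by the workers."""
    total: Counter = Counter()
    for partial in partials:
        total.update(partial)
    return total


def parallel_page_counts(log_lines: List[str], n_workers: int = 4) -> Counter:
    """Split the log into partitions, count each in a worker, then merge."""
    partitions = [log_lines[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(map_partition, partitions))
    return reduce_counts(partials)
```

Scaling the platform then amounts to changing `n_workers`, which loosely corresponds to sizing the cluster to the volume of log data.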
The website server is the source of the log data to be mined and can be an ordinary server. The web page server is the core of the system's interaction mechanism and can also be an ordinary server. It handles the execution and query requests of system users: it forwards commands sent from the user terminal either to the cloud computing platform for data mining or to the database server for data export and display, and returns the results to the user terminal. The web page server can run any server software, such as Tomcat, WebLogic or JBoss. System users control the system's mining tasks through the web page server, for example by sending a mining task instruction to start a mining task or by modifying the start time of an automatic mining task.
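The routing role of the web page server can be sketched as a small command dispatcher. Everything here is a hypothetical illustration: the class names, the command strings (`start_mining`, `set_schedule`, `export`) and the returned messages are assumptions, and no real Tomcat, WebLogic or JBoss API is involved.

```python
class CloudPlatform:
    """Stand-in for the cloud computing platform that runs mining tasks."""

    def __init__(self) -> None:
        self.schedule_start = "02:00"  # assumed default start of automatic mining
        self.runs = 0

    def start_mining(self) -> str:
        self.runs += 1
        return f"mining run {self.runs} started"

    def set_schedule(self, start: str) -> str:
        self.schedule_start = start
        return f"automatic mining scheduled at {start}"


class DatabaseServer:
    """Stand-in for the database server that exports data for display."""

    def export(self, what: str) -> str:
        return f"exported {what}"


class WebServer:
    """Routes each user command to the cloud platform or the database server."""

    def __init__(self, cloud: CloudPlatform, database: DatabaseServer) -> None:
        self.cloud = cloud
        self.database = database

    def handle(self, command: str, argument: str = "") -> str:
        if command == "start_mining":
            return self.cloud.start_mining()
        if command == "set_schedule":
            return self.cloud.set_schedule(argument)
        if command == "export":
            return self.database.export(argument)
        raise ValueError(f"unknown command: {command}")
```

Keeping the dispatcher separate from both back ends reflects the description's point that the web page server only relays commands and results.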
The data collection module in the collection terminal accumulates data using a cumulative odds model, that is, a proportional odds model, and collects the information data of user objects in bulk. The collection terminal connects to a wireless network, that is, a 2G, 3G or 4G wireless communication network, or accesses Wi-Fi over a local area network, and communicates with the cloud; the information data gathered by the data collection module is sent through the data transmission module to the modeling unit for storage.
The modeling unit aggregates and preprocesses the user behavior data using a parallel computing model. Preprocessing includes removing incomplete data and deleting duplicate data, pictures and page animations; print, favorite, save and download operations performed on a page are converted into the corresponding data format after collection. From the preprocessed user behavior data, a user behavior data ontology model is built and stored in the database, specifically in data storage unit 21 of database 3.
The data mining methods of the data mining module can be divided into direct data mining and indirect data mining. The goal of direct data mining is to build, from the available data, a model that describes one specific variable of the remaining data (which can be understood as an attribute, that is, a column, of the storage unit in data storage unit 21); classification, estimation and prediction are examples of direct data mining. Indirect data mining does not select a specific variable as the target to be described by a model; instead it establishes relationships among all the variables. Affinity grouping or association rules, clustering, and description are examples of indirect data mining.
The information data obtained by the data collection module is distributed by the data classification module among the first data storage module, the second data storage module or the Nth data storage module. The data mining module therefore converts the large amount of raw information data into a form suitable for analysis and merges the data from the multiple data storage modules. The data analysis module then cleans the data to eliminate noise and duplicate information, selects the records and features relevant to the current data mining task, analyzes the mass user behavior data in real time, and obtains user interests in time, so as to push to users effectively and accurately.
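The preprocessing and classification steps described above can be sketched as follows. This is a minimal illustration under stated assumptions: the field names (`user`, `time`, `page`, `action`), the list of media suffixes, the numeric action codes and the CRC32-based routing rule are all hypothetical, because the patent does not define the record layout or how records are assigned to the N storage modules.

```python
import zlib
from typing import Dict, List

REQUIRED_FIELDS = ("user", "time", "page")          # assumed minimum for a complete record
MEDIA_SUFFIXES = (".png", ".jpg", ".gif", ".swf")   # assumed picture/animation resources
ACTION_CODES = {"print": 1, "favorite": 2, "save": 3, "download": 4}  # assumed encoding


def preprocess(records: List[Dict[str, str]]) -> List[Dict[str, object]]:
    """Drop incomplete, media and duplicate records; encode page operations."""
    seen = set()
    cleaned: List[Dict[str, object]] = []
    for record in records:
        if any(not record.get(name) for name in REQUIRED_FIELDS):
            continue  # incomplete data
        if record["page"].lower().endswith(MEDIA_SUFFIXES):
            continue  # pictures and page animations
        action = record.get("action", "")
        key = (record["user"], record["time"], record["page"], action)
        if key in seen:
            continue  # duplicate data
        seen.add(key)
        # Convert print/favorite/save/download into a numeric code; 0 means "other".
        cleaned.append({**record, "action_code": ACTION_CODES.get(action, 0)})
    return cleaned


def route(records: List[Dict[str, object]], n_modules: int) -> List[List[Dict[str, object]]]:
    """Distribute records over N storage modules, keeping each user in one module."""
    modules: List[List[Dict[str, object]]] = [[] for _ in range(n_modules)]
    for record in records:
        index = zlib.crc32(str(record["user"]).encode("utf-8")) % n_modules
        modules[index].append(record)
    return modules
```

Routing by a stable hash of the user keeps all of one user's records in the same storage module, which is one simple way to make the later merge in the data mining module cheap; other classification rules (by page category or action type, for example) would fit the description equally well.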
User behavior data includes the subject of the behavior (the user), the time of occurrence, the page on which it occurred, scrolling the page up or down, moving or clicking the mouse, page dwell time, adding to favorites, printing, saving, the number of visits to the same page, copy-and-paste text operations, the user's current search conditions, and the titles corresponding to the search keywords.
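The fields listed above map naturally onto a single record type. The sketch below is one possible layout; every field name, type and default is an assumption made for illustration, and `engagement_signals` is a hypothetical helper rather than anything the patent defines.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UserBehaviorRecord:
    """One observed interaction; field names are illustrative assumptions."""

    user_id: str                      # subject of the behavior
    occurred_at: str                  # time of occurrence, e.g. an ISO 8601 string
    page_url: str                     # page on which the behavior occurred
    scroll_count: int = 0             # scrolling the page up or down
    mouse_moves: int = 0              # mouse movements
    mouse_clicks: int = 0             # mouse clicks
    dwell_seconds: float = 0.0        # page dwell time
    favorited: bool = False           # added to favorites
    printed: bool = False
    saved: bool = False
    same_page_visits: int = 1         # number of visits to the same page
    copy_paste_count: int = 0         # copy-and-paste text operations
    search_conditions: List[str] = field(default_factory=list)
    keyword_title: Optional[str] = None  # title matching the search keywords

    def engagement_signals(self) -> int:
        """Count explicit actions that suggest interest in the page."""
        explicit = int(self.favorited) + int(self.printed) + int(self.saved)
        return explicit + self.copy_paste_count
```

A record such as `UserBehaviorRecord("u1", "2018-04-28T10:00:00", "/item/42", printed=True)` could then be serialized with `dataclasses.asdict` before transmission to the modeling unit.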
In the several embodiments provided by the present invention, it should be understood that the disclosed system may be implemented in other ways. For example, the system embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling, direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices or units, and may be electrical or take other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically alone, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
It is worth noting that the units included in the above embodiments are divided only according to functional logic; the division is not limited to the one described, as long as the corresponding functions can be realized. In addition, the specific names of the functional units are only for ease of distinguishing them from one another and do not limit the protection scope of the present invention.
Finally, it should be noted that the above are only preferred embodiments of the present invention and are not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions recorded in those embodiments or make equivalent replacements of some of their technical features. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims (7)

1. A big data distributed library analysis service technology, comprising a collection terminal, a modeling unit, a database, a server, a display terminal and a power supply, characterized in that: the output end of the collection terminal is electrically connected to the input end of the modeling unit, the output end of the modeling unit is electrically connected to the input end of the database, the output end of the database is electrically connected to the input end of the server, and the output end of the server is electrically connected to the input end of the display terminal;
the collection terminal includes a data collection module and a data transmission module, the modeling unit includes a data aggregation module, a data preprocessing module and a data classification module, the database includes a first data storage module, a second data storage module and an Nth data storage module, and the server includes a data mining module and a data analysis module.
2. The big data distributed library analysis service technology according to claim 1, characterized in that: the output end of the data collection module is electrically connected to the input end of the data transmission module, and the output end of the data transmission module is electrically connected to the input end of the data aggregation module.
3. The big data distributed library analysis service technology according to claim 1, characterized in that: the output end of the data aggregation module is electrically connected to the input end of the data preprocessing module, the output end of the data preprocessing module is electrically connected to the input end of the data classification module, and the output end of the data classification module is electrically connected to the first data storage module, the second data storage module and the Nth data storage module respectively.
4. The big data distributed library analysis service technology according to claim 1, characterized in that: the output end of the data mining module is electrically connected to the input end of the data analysis module.
5. The big data distributed library analysis service technology according to claim 1, characterized in that: the output end of the power supply is electrically connected to the collection terminal, the modeling unit, the database, the server and the display terminal respectively.
6. The big data distributed library analysis service technology according to claim 1, characterized in that: the collection terminal may be a personal computer, a mobile phone or a tablet computer.
7. The big data distributed library analysis service technology according to claim 1, characterized in that: a processor is provided in the modeling unit, and the processor is a CC2530 microprocessor chip.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810403493.1A CN108829704A (en) 2018-04-28 2018-04-28 A kind of big data distributed libray Analysis Service technology

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810403493.1A CN108829704A (en) 2018-04-28 2018-04-28 A kind of big data distributed libray Analysis Service technology

Publications (1)

Publication Number Publication Date
CN108829704A true CN108829704A (en) 2018-11-16

Family

ID=64147544

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810403493.1A Pending CN108829704A (en) 2018-04-28 2018-04-28 A kind of big data distributed libray Analysis Service technology

Country Status (1)

Country Link
CN (1) CN108829704A (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1975720A (en) * 2006-12-27 2007-06-06 章毅 Data tapping system based on Web and control method thereof
US20160085823A1 (en) * 2012-08-02 2016-03-24 Rule 14 Real-time and adaptive data mining
CN102968466A (en) * 2012-11-09 2013-03-13 同济大学 Indexing network construction method and indexing network constructor based on webpage classification
CN104951814A (en) * 2014-03-27 2015-09-30 上海万达全程健康服务有限公司 RFID-based food safety information tracing terminal
CN106250705A (en) * 2016-08-10 2016-12-21 深圳市衣信互联网科技有限公司 A kind of big data collection analysis system and method based on cloud service
CN106446085A (en) * 2016-09-09 2017-02-22 北京高地信息技术有限公司 Big data management system
CN106777367A (en) * 2017-01-24 2017-05-31 深圳企管加企业服务有限公司 A kind of user behavior analysis method and system excavated based on big data
CN107507109A (en) * 2017-08-18 2017-12-22 北京百代华夏旅游开发有限公司 A kind of wisdom scenic spot visitor's monitoring location system

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113139822A (en) * 2020-01-19 2021-07-20 苏州金龟子网络科技有限公司 Promotion system and method based on user behavior analysis
CN111581301A (en) * 2020-05-11 2020-08-25 创智汇(苏州)电子商务有限公司 Big data classification system based on distributed data stream and algorithm thereof
CN111813834A (en) * 2020-07-14 2020-10-23 滁州职业技术学院 Data mining system and data mining method
CN112487262A (en) * 2020-11-25 2021-03-12 建信金融科技有限责任公司 Data processing method and device
CN113641726A (en) * 2021-08-06 2021-11-12 国网北京市电力公司 Unsupervised sheath current data mining system based on generation countermeasure network
CN113641726B (en) * 2021-08-06 2024-01-30 国网北京市电力公司 Unsupervised sheath current data mining system based on generation of countermeasure network

Similar Documents

Publication Publication Date Title
CN108829704A (en) A kind of big data distributed libray Analysis Service technology
CN106339509A (en) Power grid operation data sharing system based on large data technology
Jayaraman et al. Scalable energy-efficient distributed data analytics for crowdsensing applications in mobile environments
CN108932588B (en) Hydropower station group optimal scheduling system with separated front end and rear end and method
CN103793465A (en) Cloud computing based real-time mass user behavior analyzing method and system
CN102087577B (en) Location independent execution of user interface operations
CN101141370A (en) Gridding service based electric power enterprise real-time data processing method
CN104182506A (en) Log management method
CN107357873A (en) A kind of big data storage management system
Du Energy analysis of Internet of things data mining algorithm for smart green communication networks
CN109977125A (en) A kind of big data safety analysis plateform system based on network security
CN109710767A (en) Multilingual big data service platform
CN110297990A (en) The associated detecting method and system of crowdsourcing marketing microblogging and waterborne troops
CN104298669A (en) Person geographic information mining model based on social network
CN110020273A (en) For generating the method, apparatus and system of thermodynamic chart
Theeten et al. Chive: Bandwidth optimized continuous querying in distributed clouds
Khan et al. A review of big data resource management: Using smart grid systems as a case study
CN201726426U (en) Internet information monitoring system based on cloud computing
CN103942240A (en) Method for building intelligent substation comprehensive data information application platform
Man et al. The study of cross networks alarm correlation based on big data technology
CN109857934A (en) Software module cache prefetching method, apparatus and medium based on user behavior analysis
CN108280790A (en) Policy information service system based on big data analysis
CN105653523A (en) Energy consumption supervise network of things basis platform system building method
CN106777092A (en) The intelligent medical calling querying method of dynamic Skyline inquiries under mobile cloud computing environment
Shi et al. Design and implementation of a scalable distributed web crawler based on Hadoop

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20181116