CN108984718A - Digital content interaction system and interaction method based on big data technology - Google Patents
- Publication number: CN108984718A
- Application number: CN201810748907.4A
- Authority: CN (China)
- Prior art keywords: data, business, analysis, management, user
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abstract
The present invention provides a digital content interaction system and interaction method based on big data technology. Existing business analysis models are built in and used to analyze management data, covering statistical analysis, customer analysis, content analysis, business monitoring, settlement management, and website data crawling. Data analysis results are output, can be sorted, and can be presented graphically. User association data is stored, that is, the association relationships between users and services, including user associations within a single service and user associations across multiple services. A data warehouse model and a data analysis model are constructed from business information, and the models are created and managed in layers. Distributed file storage and parallel computation are implemented with Hadoop. Compared with the prior art, the invention better achieves comprehensive analysis of business data and thereby marketing support.
Description
Technical field
The present invention relates to a digital content interaction system and interaction method based on big data technology, and belongs to the field of big-data-based data analysis.
Background art
In the prior art, given big data technology and the needs of a digital mobile dissemination operation service system, how to build on a unified repository, integrate the BI capabilities of the existing marketing analysis system, business audit system, and operation analysis system, provide data mining and analytical decision-making capabilities for marketing, and, through in-depth analysis, meet the support needs of operational decision-making and innovation has become a technical problem in urgent need of a solution.
Summary of the invention
The present invention provides a digital content interaction system and interaction method based on big data technology that better achieves comprehensive analysis of business data and thereby marketing support.
The technical solution adopted by the invention is as follows:
A digital content interaction system based on big data technology, characterized by comprising:
a data analysis module, which has existing business analysis models built in and analyzes management data, and which includes a statistical analysis unit, a customer analysis unit, a content analysis unit, a business monitoring unit, a settlement management unit, and a website data crawling unit;
a data analysis output module, which outputs data analysis results that can be sorted and presented graphically;
a unified customer information repository, which stores user association data, that is, the association relationships between users and services, including user associations within a single service and user associations across multiple services;
a data model creation module, which constructs a data warehouse model and a data analysis model from business information and creates and manages the models in layers;
a distributed computing module, which uses Hadoop to implement distributed file storage and parallel computation.
The system further includes a business process control module, which provides the various business logic and controls and schedules the various business processes.
The system further includes a rights management module, which provides the basic personnel rights management functions needed for the system to run securely.
The system further includes a log management module, which provides the basic management functions needed for normal operation.
A digital content interaction method based on big data technology, implemented on the basis of the above digital content interaction system, the method comprising:
analyzing management data with built-in existing business analysis models, including statistical analysis, customer analysis, content analysis, business monitoring, settlement management, and website data crawling;
outputting data analysis results, which can be sorted and presented graphically;
storing user association data, the association data being the association relationships between users and services, including user associations within a single service and user associations across multiple services;
constructing a data warehouse model and a data analysis model from business information, and creating and managing the models in layers;
implementing distributed file storage and parallel computation with Hadoop.
The specific method of data mining, with distributed file storage and parallel computation implemented on Hadoop, uses a MapReduce-based SPRINT algorithm in parallel mode and includes:
splitting the input data records completely into attribute lists according to the configured attribute classification, sorting once to generate ordered attribute lists, and generating the split of the root node;
placing the classification of the root node into the HDFS file system and splitting repeatedly until the attribute lists are completely partitioned, so that every attribute list is assigned to its corresponding leaf node and a complete decision tree is generated;
writing the MapReduce output directly to files in the distributed file system, and obtaining the tree structure from the reduce output files.
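The first step above, building sorted attribute lists and choosing the root split, can be sketched in plain Python. This is a serial illustration only and not the patent's MapReduce implementation. The function names, the (value, label, record id) entry layout, the restriction to numeric attributes, and the use of the Gini index as the split criterion are assumptions taken from the general SPRINT algorithm rather than from this text.

```python
from collections import Counter


def build_attribute_lists(records, attributes, label_key):
    """Split records into one (value, label, rid) list per attribute, sorted once by value."""
    lists = {}
    for attr in attributes:
        entries = [(rec[attr], rec[label_key], rid) for rid, rec in enumerate(records)]
        lists[attr] = sorted(entries, key=lambda entry: entry[0])
    return lists


def gini(counts, total):
    """Gini impurity of a class histogram holding `total` records."""
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


def best_split(attr_list):
    """Scan one sorted attribute list and return (weighted_gini, threshold)."""
    total = len(attr_list)
    below = Counter()
    above = Counter(label for _, label, _ in attr_list)
    best = (float("inf"), None)
    for i in range(total - 1):
        value, label, _ = attr_list[i]
        below[label] += 1
        above[label] -= 1
        next_value = attr_list[i + 1][0]
        if value == next_value:
            continue  # a threshold can only fall between distinct values
        n_below = i + 1
        n_above = total - n_below
        score = (n_below * gini(below, n_below) + n_above * gini(above, n_above)) / total
        if score < best[0]:
            best = (score, (value + next_value) / 2)
    return best


def root_split(records, attributes, label_key):
    """Return (attribute, threshold, weighted_gini) for the best root split."""
    lists = build_attribute_lists(records, attributes, label_key)
    scored = {attr: best_split(attr_list) for attr, attr_list in lists.items()}
    attr = min(scored, key=lambda name: scored[name][0])
    return attr, scored[attr][1], scored[attr][0]
```

Because each attribute list is sorted only once, every candidate threshold for that attribute is evaluated in a single pass while two class histograms are updated incrementally.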
The method further includes using Hadoop and Lucene as tools and carrying out secondary development to achieve unified interaction with unstructured data, specifically:
integrating the heterogeneous environment: Hadoop is used to integrate the heterogeneous environment, shielding the heterogeneity of machine hardware and of running speed within the heterogeneous computer cluster;
constructing a unified view of the unstructured data: the unstructured data is integrated by indexing, so that the heterogeneous unstructured data is extracted into a single unified view;
applying unified interaction: Lucene is used to query the unstructured data in the index repository.
The method further includes controlling and scheduling the various business processes according to the various business logic.
The method further includes performing the basic personnel rights management needed for the system to run securely.
The method further includes recording management logs and performing the basic management needed for normal operation.
Compared with the prior art, the beneficial effects of the present invention are as follows: comprehensive analysis of business data, and thereby marketing support, is better achieved; the system is easy to extend; distributed system management and parallel computation are realized; and computational accuracy is higher.
Brief description of the drawings
Fig. 1 is a schematic diagram of unstructured data interaction in one embodiment of the invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the accompanying drawing and embodiments. It should be understood that the specific embodiments described here serve only to explain the invention and are not intended to limit it.
Unless specifically stated otherwise, any feature disclosed in this specification (including the abstract and the drawing) may be replaced by another equivalent feature or by an alternative feature serving a similar purpose. That is, unless specifically stated otherwise, each feature is only one example of a series of equivalent or similar features.
A digital content interaction system based on big data technology comprises:
a data analysis module, which has existing business analysis models built in and analyzes management data, and which includes a statistical analysis unit, a customer analysis unit, a content analysis unit, a business monitoring unit, a settlement management unit, and a website data crawling unit;
a data analysis output module, which outputs data analysis results that can be sorted and presented graphically;
a unified customer information repository, which stores user association data, that is, the association relationships between users and services, including user associations within a single service and user associations across multiple services;
a data model creation module, which constructs a data warehouse model and a data analysis model from business information and creates and manages the models in layers;
a distributed computing module, which uses Hadoop to implement distributed file storage and parallel computation.
As one embodiment of the present invention, the system's functions include the portal, data processing, reports, KPIs, statistical analysis, customer analysis, content analysis, business monitoring, marketing support, marketing tracking, the customer resource repository, settlement management, and website data crawling. The management portal is the unified interface for users across all channels and supports interface-based management operations for each function.
The data analysis module makes it easier to extract information of value to future business activities and so better guides subsequent operations. The unified customer information repository supports comprehensive customer analysis and thereby marketing support. Marketing support is built on customer segmentation: using the user association data in the unified customer information repository, users can be segmented within a single service or across multiple services, and one user can be given multiple tags.
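Tagging one user from association data can be sketched as follows. This is a minimal illustration under assumed data shapes: the (user id, service, segment) row layout, the tag format, and the `multi-service` tag are hypothetical and do not come from the patent.

```python
from collections import defaultdict


def build_tags(associations):
    """Derive per-user tags from (user_id, service, segment) association rows.

    Each service contributes one single-service tag; a user associated with
    more than one service also receives a cross-service tag.
    """
    per_user = defaultdict(dict)
    for user_id, service, segment in associations:
        per_user[user_id][service] = segment  # the latest row for a service wins

    tags = {}
    for user_id, services in per_user.items():
        user_tags = {f"{service}:{segment}" for service, segment in services.items()}
        if len(services) > 1:
            user_tags.add("multi-service")  # hypothetical cross-service tag
        tags[user_id] = sorted(user_tags)
    return tags
```

Keeping the association rows keyed by user is what allows the same repository to answer both single-service and cross-service segmentation queries.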
Combining the basic design philosophy with a typical data mining system model, the system adopts a layered design in which, from top to bottom, each layer transparently calls the interfaces of the layer beneath it. The top layer is the interaction layer, which handles interaction between users and the system. The bottom layer is the distributed computing layer, which uses Hadoop to implement distributed file storage and parallel computation. Layering makes the layers independent of one another and makes the system easy to extend.
The technical solution of the present invention provides the interface between the system and its users. Through a graphical interface with a well-designed presentation, users can log in to the system, define various fine-grained business tasks, and view or save the various output results.
As one embodiment of the present invention, the system further includes a business process control module, which provides the various business logic and controls and schedules the various business processes.
The business process control module provides the business logic and implements control and scheduling of the business processes. Tasks submitted by users are processed, controlled, and scheduled in this module. For example, a task submitted by a user to perform a specific data classification mining job is processed in this module: the module completes the submitted task by calling multiple modules of the data mining algorithms and returns the result to the user. The business process control module also controls and schedules the execution of each module of the data mining platform.
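The control flow described above, in which a submitted task is completed by calling several modules in turn and the result is returned to the user, might look like the following sketch. The class name, method names, and the idea of registering a workflow as an ordered list of module names are all assumptions made for illustration.

```python
class ProcessController:
    """Hypothetical business process controller that chains registered modules."""

    def __init__(self):
        self._modules = {}    # module name -> callable taking and returning data
        self._workflows = {}  # task type -> ordered list of module names

    def register_module(self, name, func):
        self._modules[name] = func

    def register_workflow(self, task_type, module_names):
        self._workflows[task_type] = list(module_names)

    def submit(self, task_type, payload):
        """Run the payload through each module of the workflow and return the result."""
        if task_type not in self._workflows:
            raise KeyError(f"unknown task type: {task_type}")
        result = payload
        for name in self._workflows[task_type]:
            result = self._modules[name](result)
        return result
```

A caller would register its mining steps once, for example a cleaning step followed by a classification step, and then submit tasks by type without knowing which modules run underneath.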
As one embodiment of the present invention, the system further includes a rights management module, which provides the basic personnel rights management functions needed for the system to run securely.
Other systems are integrated through the portal so that permissions are consistent across systems; after logging in once, a user can view the content of every integrated system.
As one embodiment of the present invention, the system further includes a log management module, which provides the basic management functions needed for normal operation.
A digital content interaction method based on big data technology, implemented on the basis of the above digital content interaction system, comprises:
analyzing management data with built-in existing business analysis models, including statistical analysis, customer analysis, content analysis, business monitoring, settlement management, and website data crawling;
outputting data analysis results, which can be sorted and presented graphically;
storing user association data, the association data being the association relationships between users and services, including user associations within a single service and user associations across multiple services;
constructing a data warehouse model and a data analysis model from business information, and creating and managing the models in layers;
implementing distributed file storage and parallel computation with Hadoop.
As one embodiment of the present invention, the specific method of data mining, with distributed file storage and parallel computation implemented on Hadoop, uses a MapReduce-based SPRINT algorithm in parallel mode and includes:
splitting the input data records completely into attribute lists according to the configured attribute classification, sorting once to generate ordered attribute lists, and generating the split of the root node;
placing the classification of the root node into the HDFS file system and splitting repeatedly until the attribute lists are completely partitioned, so that every attribute list is assigned to its corresponding leaf node and a complete decision tree is generated;
writing the MapReduce output directly to files in the distributed file system, and obtaining the tree structure from the reduce output files.
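The repeated splitting described above can be illustrated with a small recursive tree builder. It is a simplified, single-machine sketch: it re-sorts rows at every node instead of partitioning pre-sorted attribute lists as SPRINT does, it handles numeric attributes only, and it does not touch HDFS or MapReduce. All function names and the dictionary-based tree layout are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((n / total) ** 2 for n in Counter(labels).values())


def find_split(rows, attributes, label_key):
    """Return (score, attribute, threshold) with the lowest weighted Gini, or None."""
    best = None
    for attr in attributes:
        ordered = sorted(rows, key=lambda r: r[attr])
        for i in range(len(ordered) - 1):
            low, high = ordered[i][attr], ordered[i + 1][attr]
            if low == high:
                continue  # no threshold separates equal values
            left = [r[label_key] for r in ordered[: i + 1]]
            right = [r[label_key] for r in ordered[i + 1:]]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(ordered)
            if best is None or score < best[0]:
                best = (score, attr, (low + high) / 2)
    return best


def build_tree(rows, attributes, label_key):
    """Split recursively until every leaf is pure or cannot be split further."""
    labels = [r[label_key] for r in rows]
    split = find_split(rows, attributes, label_key) if len(set(labels)) > 1 else None
    if split is None:
        return {"leaf": Counter(labels).most_common(1)[0][0]}
    _, attr, threshold = split
    left = [r for r in rows if r[attr] < threshold]
    right = [r for r in rows if r[attr] >= threshold]
    return {
        "attr": attr,
        "threshold": threshold,
        "left": build_tree(left, attributes, label_key),
        "right": build_tree(right, attributes, label_key),
    }


def predict(tree, row):
    """Walk from the root to a leaf and return that leaf's label."""
    while "leaf" not in tree:
        tree = tree["left"] if row[tree["attr"]] < tree["threshold"] else tree["right"]
    return tree["leaf"]
```

Recursion stops when a node holds a single class or when no attribute has two distinct values left, which corresponds to the point at which every record has been assigned to a leaf.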
Data mining provides business applications with the modules needed for the data mining stage of the business flow, at a finer granularity. The present invention uses the Hadoop framework for cluster storage and computation, using Hadoop's distributed file system and parallel execution model while also managing the distributed system.
The algorithms used in the various tasks are parallelized, and the tasks are submitted to the Hadoop distributed computing layer for execution, which returns the results.
As a specific embodiment of the present invention, for machine data mining, the project uses the MapReduce-based SPRINT algorithm and improves the algorithm's characteristics.
By applying a correct parallelization strategy, the project successfully ports the serial data mining algorithm to the Hadoop platform. The design and implementation of the Hadoop-based parallel SPRINT algorithm flow satisfies the following properties:
in the algorithm, the Map phase is highly parallel;
the algorithm reads data from the file system in parallel with multiple Map tasks and redistributes it to Reduce tasks;
porting the algorithm neither reduces nor increases the accuracy of the traditional SPRINT algorithm.
Because parallelization was achieved and the implementation builds on the Hadoop platform, drawing on the framework's strengths and its computing power, the approach shows stronger scalability and higher efficiency when processing large volumes of data.
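How attribute-list construction might map onto Map and Reduce phases can be simulated without a cluster. The sketch below is an assumption about one plausible decomposition, not the project's actual job: each map call handles one record independently, a shuffle step groups the emitted pairs by attribute name, and each reduce call sorts one attribute's entries. It uses no Hadoop API, and all function names are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_record(rid, record, label_key):
    """Map: emit one (attribute, (value, label, rid)) pair per non-label field."""
    label = record[label_key]
    return [
        (attr, (value, label, rid))
        for attr, value in record.items()
        if attr != label_key
    ]


def shuffle(pairs):
    """Shuffle: group emitted values by key, as a framework would between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_attribute(attr, entries):
    """Reduce: sort one attribute's entries by value to form its attribute list."""
    return attr, sorted(entries)


def run_job(records, label_key):
    """Run map, shuffle, and reduce in sequence and return {attribute: sorted list}."""
    mapped = chain.from_iterable(
        map_record(rid, rec, label_key) for rid, rec in enumerate(records)
    )
    grouped = shuffle(mapped)
    return dict(reduce_attribute(attr, entries) for attr, entries in grouped.items())
```

Since `map_record` depends only on its own record, the map calls could run on separate workers, which is the sense in which the Map phase is highly parallel.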
As one embodiment of the present invention, the method further includes using Hadoop and Lucene as tools and carrying out secondary development to achieve unified interaction with unstructured data, specifically:
integrating the heterogeneous environment: Hadoop is used to integrate the heterogeneous environment, shielding the heterogeneity of machine hardware and of running speed within the heterogeneous computer cluster;
constructing a unified view of the unstructured data: the unstructured data is integrated by indexing, so that the heterogeneous unstructured data is extracted into a single unified view;
applying unified interaction: Lucene is used to query the unstructured data in the index repository.
As a specific embodiment of the invention, as shown in Fig. 1, the heterogeneous environment is integrated, providing a stable platform for upper-layer data integration applications; the unstructured data is integrated, preparing it for later processing; and Lucene is used to query the unstructured data in the index repository. By supporting assisted querying of information, this gives users a scheme for semi-automated, side-by-side analysis of background information and, on the basis of unified data access, better supports users' comparative analysis of information.
The related-information repository stores document-related information, the parsing repository stores parsed data, and the classified index repository stores unstructured data of the same category; these repositories reside in the distributed file system of the Hadoop distributed computing platform. The content extraction module extracts related information about the unstructured data, such as its path and modification time; the parser extracts the content of the unstructured data; the classifier classifies documents; and the Lucene engine drives indexing to build classified indexes over the classified documents, realizing the unified view. The application interface module decomposes the input; the query interface converts the decomposed input into a format the index recognizes in order to query the corresponding index content, and converts the data returned by the index; and the scoring and ranking module sorts the results returned by the query interface before returning them to the application interface.
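The pipeline above (classify, index, decompose the query, look up the index, score and rank) can be mimicked with a tiny in-memory inverted index. This is a pure-Python stand-in and does not use the Lucene API. The class name, the term-frequency scoring, and the tokenizer are assumptions chosen for brevity, and Lucene's real scoring is more elaborate.

```python
import re
from collections import Counter


def tokenize(text):
    """Decompose input text into lowercase alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


class ClassifiedIndex:
    """Hypothetical per-category inverted index with term-frequency ranking."""

    def __init__(self):
        self._postings = {}  # category -> term -> {doc_id: term frequency}
        self._meta = {}      # doc_id -> {"path": ..., "category": ...}

    def add(self, doc_id, path, category, content):
        """Store document metadata and index its content under its category."""
        self._meta[doc_id] = {"path": path, "category": category}
        terms = self._postings.setdefault(category, {})
        for term, tf in Counter(tokenize(content)).items():
            terms.setdefault(term, {})[doc_id] = tf

    def search(self, query, category=None):
        """Return (doc_id, path, score) tuples, highest score first."""
        categories = [category] if category is not None else list(self._postings)
        scores = Counter()
        for cat in categories:
            terms = self._postings.get(cat, {})
            for term in tokenize(query):
                for doc_id, tf in terms.get(term, {}).items():
                    scores[doc_id] += tf
        ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
        return [(doc_id, self._meta[doc_id]["path"], score) for doc_id, score in ranked]
```

Searching without a category queries every classified index at once, which is one way to read the "unified view" over otherwise separate document collections.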
As one embodiment of the present invention, the method further includes controlling and scheduling the various business processes according to the various business logic.
As one embodiment of the present invention, the method further includes performing the basic personnel rights management needed for the system to run securely.
As one embodiment of the present invention, the method further includes recording management logs and performing the basic management needed for normal operation.
Claims (10)
1. A digital content interaction system based on big data technology, characterized by comprising:
a data analysis module, which has existing business analysis models built in and analyzes management data, and which includes a statistical analysis unit, a customer analysis unit, a content analysis unit, a business monitoring unit, a settlement management unit, and a website data crawling unit;
a data analysis output module, which outputs data analysis results that can be sorted and presented graphically;
a unified customer information repository, which stores user association data, that is, the association relationships between users and services, including user associations within a single service and user associations across multiple services;
a data model creation module, which constructs a data warehouse model and a data analysis model from business information and creates and manages the models in layers;
a distributed computing module, which uses Hadoop to implement distributed file storage and parallel computation.
2. The digital content interaction system according to claim 1, characterized by further comprising a business process control module, which provides the various business logic and controls and schedules the various business processes.
3. The digital content interaction system according to claim 1, characterized by further comprising a rights management module, which provides the basic personnel rights management functions needed for the system to run securely.
4. The digital content interaction system according to claim 1, characterized by further comprising a log management module, which provides the basic management functions needed for normal operation.
5. A digital content interaction method based on big data technology, implemented on the basis of the digital content interaction system according to any one of claims 1 to 4, the method comprising:
analyzing management data with built-in existing business analysis models, including statistical analysis, customer analysis, content analysis, business monitoring, settlement management, and website data crawling;
outputting data analysis results, which can be sorted and presented graphically;
storing user association data, the association data being the association relationships between users and services, including user associations within a single service and user associations across multiple services;
constructing a data warehouse model and a data analysis model from business information, and creating and managing the models in layers;
implementing distributed file storage and parallel computation with Hadoop.
6. The digital content interaction method according to claim 5, wherein the specific method of data mining, with distributed file storage and parallel computation implemented on Hadoop, uses a MapReduce-based SPRINT algorithm in parallel mode and includes:
splitting the input data records completely into attribute lists according to the configured attribute classification, sorting once to generate ordered attribute lists, and generating the split of the root node;
placing the classification of the root node into the HDFS file system and splitting repeatedly until the attribute lists are completely partitioned, so that every attribute list is assigned to its corresponding leaf node and a complete decision tree is generated;
writing the MapReduce output directly to files in the distributed file system, and obtaining the tree structure from the reduce output files.
7. The digital content interaction method according to claim 5 or 6, further comprising using Hadoop and Lucene as tools and carrying out secondary development to achieve unified interaction with unstructured data, specifically:
integrating the heterogeneous environment: Hadoop is used to integrate the heterogeneous environment, shielding the heterogeneity of machine hardware and of running speed within the heterogeneous computer cluster;
constructing a unified view of the unstructured data: the unstructured data is integrated by indexing, so that the heterogeneous unstructured data is extracted into a single unified view;
applying unified interaction: Lucene is used to query the unstructured data in the index repository.
8. The digital content interaction method according to claim 5 or 6, further comprising controlling and scheduling the various business processes according to the various business logic.
9. The digital content interaction method according to claim 5 or 6, further comprising performing the basic personnel rights management needed for the system to run securely.
10. The digital content interaction method according to claim 5 or 6, further comprising recording management logs and performing the basic management needed for normal operation.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810748907.4A CN108984718A (en) | 2018-07-10 | 2018-07-10 | A kind of digital content interactive system and exchange method based on big data technology |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810748907.4A CN108984718A (en) | 2018-07-10 | 2018-07-10 | A kind of digital content interactive system and exchange method based on big data technology |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108984718A true CN108984718A (en) | 2018-12-11 |
Family
ID=64536613
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810748907.4A Pending CN108984718A (en) | 2018-07-10 | 2018-07-10 | A kind of digital content interactive system and exchange method based on big data technology |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108984718A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110245208A (en) * | 2019-04-30 | 2019-09-17 | 广东省智能制造研究所 | A kind of retrieval analysis method, apparatus and medium based on big data storage |
CN112365282A (en) * | 2020-10-29 | 2021-02-12 | 苏州实盎网络科技有限公司 | Marketing big data modeling method and platform |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103838617A (en) * | 2014-02-18 | 2014-06-04 | 河海大学 | Method for constructing data mining platform in big data environment |
CN104915793A (en) * | 2015-06-30 | 2015-09-16 | 北京西塔网络科技股份有限公司 | Public information intelligent analysis platform based on big data analysis and mining |
CN105007171A (en) * | 2015-05-25 | 2015-10-28 | 上海欣方软件有限公司 | User data analysis system and method based on big data in communication field |
CN105205104A (en) * | 2015-08-26 | 2015-12-30 | 成都布林特信息技术有限公司 | Cloud platform data acquisition method |
CN106709012A (en) * | 2016-12-26 | 2017-05-24 | 北京锐安科技有限公司 | Method and device for analyzing big data |
CN107590181A (en) * | 2017-08-01 | 2018-01-16 | 佛山市深研信息技术有限公司 | A kind of intelligent analysis system of big data |
CN107945092A (en) * | 2017-12-13 | 2018-04-20 | 成都市审计局 | Big data integrated management approach and system for audit field |
CN107958005A (en) * | 2016-10-17 | 2018-04-24 | 哈尔滨光凯科技开发有限公司 | A kind of medical search engine service system Construction method based on Lucene |
CN108073625A (en) * | 2016-11-14 | 2018-05-25 | 北京京东尚科信息技术有限公司 | For the system and method for metadata information management |
Non-Patent Citations (2)
Title |
---|
寒阳墨: "Hadoop与Lucene和Nutch的关系", 《HTTPS://WWW.CNBLOGS.COM/HANYANGMO/P/3903401.HTML》 * |
瞿卓: "基于hadoop2.0的数据挖掘算法并行化研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20181211 |