CN104182389B - Semantics-based big data analysis business intelligence service system

Info

Publication number
CN104182389B
Authority
CN
China
Prior art keywords
data
module
subsystem
analysis
semantic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201410348407.3A
Other languages
Chinese (zh)
Other versions
CN104182389A (en)
Inventor
贾岩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
ANHUI HUAZHEN INFORMATION SCIENCE & TECHNOLOGY Co Ltd
Original Assignee
ANHUI HUAZHEN INFORMATION SCIENCE & TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by ANHUI HUAZHEN INFORMATION SCIENCE & TECHNOLOGY Co Ltd
Priority to CN201410348407.3A
Publication of CN104182389A
Application granted
Publication of CN104182389B
Expired - Fee Related
Anticipated expiration

Links

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a semantics-based big data analysis business intelligence service system that accurately analyzes the business information abundant on the Internet and can conveniently and quickly provide business intelligence services to small and medium-sized enterprises. It comprises a data acquisition and storage subsystem, a real-time data stream processing subsystem, a storage subsystem, a bottom layer support subsystem and a service output subsystem. The data acquisition and storage subsystem comprises a distributed crawler module and a data source adapter that are independent of each other and are each connected to the real-time data stream processing subsystem; the distributed crawler module is responsible for data source detection, Internet data acquisition and HTML preprocessing, and the data source adapter is used to access third-party data resources. The real-time data stream processing subsystem is connected to the storage subsystem and comprises a temporary storage module and a data stream hook that are connected to each other; the temporary storage module temporarily stores the data acquired in real time.

Description

Semantics-based big data analysis business intelligence service system
Technical Field
The invention relates to the technical field of business intelligence, and in particular to a semantics-based big data analysis business intelligence service system.
Background
In the current period of China's social development, small and medium-sized enterprises have risen rapidly as a new force, and the Chinese market has become increasingly vigorous as a result. These enterprises are eager to grow and need information services, but they lack the capital and manpower that large, well-funded group companies have used to build their own information departments. Information is one of an enterprise's most important resources; developing it is both the starting point and the ultimate goal of enterprise informatization.
As informatization deepens, enterprises increasingly want big data analysis services. The steadily growing information resources of the Internet contain a huge amount of commercially valuable information and have become an important information source for business intelligence services. However, industry has not yet fully exploited this value because the data volume is huge, the data is difficult to acquire, its value per unit is relatively low, and almost all of it is unstructured data such as text.
For an enterprise, "efficiency is life and time is money". Only by actively providing information services, using modern technical equipment to share resources, and collecting and processing information in an organized, planned and purposeful way can the Internet offer small and medium-sized enterprises more convenient, rapid and comprehensive reference and consulting services, thereby speeding up decision-making by enterprise leadership and helping enterprises seize opportunities in the market economy.
Disclosure of Invention
Based on the problems described in the background, the invention provides a semantics-based big data analysis business intelligence service system that accurately analyzes the business information abundant on the Internet and can conveniently and quickly provide business intelligence services to small and medium-sized enterprises.
The semantics-based big data analysis business intelligence service system provided by the invention comprises: a data acquisition and storage subsystem, a real-time data stream processing subsystem, a storage subsystem, a bottom layer support subsystem and a service output subsystem; wherein,
the data acquisition and storage subsystem comprises a distributed crawler module and a data source adapter that are independent of each other and are each connected to the real-time data stream processing subsystem; the distributed crawler module is responsible for data source detection, Internet data acquisition and HTML preprocessing, and the data source adapter is used to access third-party data resources;
the real-time data stream processing subsystem is connected to the storage subsystem and comprises a temporary storage module and a data stream hook that are connected to each other; the temporary storage module uses the cluster's memory as a cache and temporarily stores the data acquired in real time so that modules with real-time requirements can read it; the data stream hook provides hooks on which modules can be mounted, and when data arrives, the hook attaches a basic description of the data for the mounted modules to read; a cache threshold is set in the real-time data stream processing subsystem, and data is emptied when the cache threshold is exceeded;
the storage subsystem is connected to the service output subsystem and comprises a Hadoop cluster and a MySQL cluster that are connected to each other; the Hadoop cluster stores large volumes of web page data and analysis results that do not require random reads and writes; the MySQL cluster stores smaller volumes of data that require random reads and writes;
the bottom layer support subsystem comprises a semantic information extraction module and a semantic search engine that are connected to each other; the semantic information extraction module extracts semantic information from text and supports other modules that need semantic extraction and semantic analysis, and is connected to both the real-time data stream processing subsystem and the service output subsystem; the semantic search engine integrates all tools and API modules related to semantic search and text processing, and is connected to both the Hadoop cluster and the service output subsystem;
the service output subsystem executes, schedules and presents specific services and comprises a precision marketing module, a data service module, a report generation module, a business information analysis module and a public opinion analysis module, which are connected in parallel; the precision marketing module provides technical support for precision marketing in the form of data collection, analysis and marketing tools; the data service module performs data collection and semantic analysis to meet customers' specific data requirements; the report generation module produces short, summarized information briefs that combine text and images for customers, and supports both automatic generation and periodic report compilation and writing; the business information analysis module performs business opportunity analysis, competitor analysis, and industry trend and data analysis; the public opinion analysis module performs topic tracking analysis, tracking analysis of events and related persons, and the collection and integrated analysis of online public opinion data.
In the distributed crawler module, credibility weights are set for different information sources.
The distributed crawler module adopts one or more of a fixed-point stakeout strategy, a heuristic strategy and a broad collection strategy.
The cache threshold of the real-time data stream processing subsystem is 0.1 to 100 minutes.
The Hadoop cluster provides persistent storage.
Operational data, data mining results and semantic analysis results are stored in the MySQL cluster.
The semantic information extraction module uses a natural-language-like semantic information extraction technique that describes and annotates the semantic information in natural language text in a form very close to natural language.
The semantic information extraction module uses semantic clustering to record the amount of information on each topic and to remind the user to pay attention to important events.
The invention effectively addresses web-based big data analysis. It offers high precision, rich semantic information and high practicality, is suited to industrial deployment, and, when its output is used as input to techniques such as data mining, can fully release the value of text information. By analyzing the commercial behavior of Internet users, it enables precision marketing of enterprise products. It also provides business intelligence services that help enterprises gain insight into trends in their own industry and in upstream and downstream industries, seize business opportunities, avoid risks and make sound decisions quickly. The invention has broad prospects for industrial application.
Drawings
Fig. 1 is a structural diagram of a semantic-based big data analysis business intelligence service system according to the present invention.
Detailed Description
Referring to Fig. 1, the semantics-based big data analysis business intelligence service system provided by the invention comprises a data acquisition and storage subsystem, a real-time data stream processing subsystem, a storage subsystem, a bottom layer support subsystem and a service output subsystem.
The data acquisition and storage subsystem comprises a distributed crawler module and a data source adapter that are independent of each other and are each connected to the real-time data stream processing subsystem. The distributed crawler module is responsible for data source detection, Internet data acquisition and HTML (hypertext markup language) preprocessing. The data source adapter is used to access third-party data resources, such as data that a customer has specified for analysis; the system's processing flow can also be intervened in through the data source adapter.
In the distributed crawler module, credibility weights are set for different information sources so that users can judge the value of information and extraction time is saved. In this embodiment, for example, the data mining toolkit is built by abstracting a toolkit of common data mining algorithms and combining it with tools and algorithm packages from open source communities, yielding a relatively mature set of data mining algorithms and tools. Data is collected in real time from websites, forums, blogs and similar online sources. Ranking data from a Chinese website ranking site is used to set a credibility weight for each website's information, and different source types such as news, blogs and forums also carry corresponding weights. The distributed crawler module collects data by topic. In this embodiment, the main data blocks of a web page are identified by analyzing the page structure of similar pages, and an executable template is generated automatically to perform web page extraction. Network data is acquired using several strategies, including fixed-point stakeout, heuristic and broad collection. As a result, data acquisition has wide coverage, is well targeted and efficient, and misses little.
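As a rough illustration of the credibility weighting described above, the following sketch combines a per-site score derived from a ranking position with a per-source-type weight. The function names, the weight values and the logarithmic rank formula are assumptions made for illustration only; the patent does not specify how the weights are calculated.

```python
import math

# Hypothetical weights per source type. The patent only states that news,
# blogs and forums carry different weights, not what the values are.
SOURCE_TYPE_WEIGHT = {"news": 1.0, "blog": 0.6, "forum": 0.4}
DEFAULT_TYPE_WEIGHT = 0.2  # assumed fallback for unknown source types


def site_weight(rank: int, max_rank: int = 10_000) -> float:
    """Map a site ranking position (1 = best) to a weight in [0, 1].

    Assumed formula: logarithmic decay, so rank 1 gives 1.0 and any
    rank at or beyond max_rank gives 0.0.
    """
    if rank < 1:
        raise ValueError("rank must be >= 1")
    rank = min(rank, max_rank)
    return 1.0 - math.log(rank) / math.log(max_rank)


def credibility(rank: int, source_type: str) -> float:
    """Combine the site weight and the source-type weight into one score."""
    type_weight = SOURCE_TYPE_WEIGHT.get(source_type, DEFAULT_TYPE_WEIGHT)
    return site_weight(rank) * type_weight
```

A multiplicative combination is only one possible choice; it keeps the score in [0, 1] and lets either a poorly ranked site or a low-trust source type pull the result down.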
The real-time data stream processing subsystem is connected to the storage subsystem and comprises a temporary storage module and a data stream hook that are connected to each other. The temporary storage module uses the cluster's memory as a cache, temporarily stores the data acquired in real time by the data acquisition and storage subsystem, and makes it available to modules with real-time requirements. The data stream hook provides hooks on which modules can be mounted; its basic mechanism is a subscription-consumption model: when data arrives, the hook attaches a basic description of the data for the mounted modules to read. Through the hook mechanism, the real-time data stream processing subsystem plugs various analysis requirements in between the data acquisition and storage subsystem and the storage subsystem, which ensures real-time processing, allows data to be stored in a distributed manner, and avoids processing congestion through a scalable architecture. A cache threshold is set in the real-time data stream processing subsystem, and data is emptied when the threshold is exceeded. The cache threshold in this embodiment is 5 minutes; in a specific implementation it may be set as required, for example to any value from 0.1 to 100 minutes.
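The subscription-consumption mechanism and the cache threshold can be sketched in a single process as follows. This is a minimal sketch under stated assumptions: the class name `StreamHub`, its methods, and the reading of the threshold as a per-entry age limit are all hypothetical, and a real deployment would use the memory of a cluster rather than one Python dictionary.

```python
import time
from collections import defaultdict
from typing import Any, Callable, Dict, List, Optional, Tuple


class StreamHub:
    """In-memory temporary store with subscriber hooks (illustrative only)."""

    def __init__(self, threshold_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.threshold = threshold_seconds  # 5 minutes, as in the embodiment
        self.clock = clock                  # injectable clock (assumption)
        self._cache: Dict[str, Tuple[float, Any]] = {}
        self._hooks: Dict[str, List[Callable[[dict], None]]] = defaultdict(list)

    def mount(self, topic: str, callback: Callable[[dict], None]) -> None:
        """Mount a module (here, a callback) on the hook for a topic."""
        self._hooks[topic].append(callback)

    def publish(self, key: str, topic: str, payload: Any) -> None:
        """Cache the payload and pass a basic description to mounted modules."""
        now = self.clock()
        self._evict(now)
        self._cache[key] = (now, payload)
        description = {"key": key, "topic": topic, "arrived_at": now}
        for callback in self._hooks[topic]:
            callback(description)

    def read(self, key: str) -> Optional[Any]:
        """Return cached data, or None if it is absent or has expired."""
        self._evict(self.clock())
        entry = self._cache.get(key)
        return entry[1] if entry else None

    def _evict(self, now: float) -> None:
        """Drop entries cached for longer than the threshold."""
        expired = [k for k, (t, _) in self._cache.items()
                   if now - t > self.threshold]
        for k in expired:
            del self._cache[k]
```

Mounted modules receive only the description and fetch the payload with `read` if they need it, which mirrors the text's distinction between the hook's "basic description" and the cached data itself.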
The storage subsystem is connected to the service output subsystem and comprises a Hadoop cluster (a distributed system infrastructure) and a MySQL cluster (a relational database) that are connected to each other. The Hadoop cluster stores large volumes of web page data and analysis results that do not require random reads and writes; its storage is persistent and high-capacity, which provides the foundation for the data stream hook technique of the real-time data stream processing subsystem. The MySQL cluster stores smaller volumes of data that require random reads and writes, such as operational data, data mining results and semantic analysis results. Together, the Hadoop cluster and the MySQL cluster improve the efficiency of data retrieval.
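The division of labor between the two clusters amounts to a routing decision per record. The sketch below shows one way to express it; the `Record` type, the `choose_backend` function and the size cut-off are hypothetical, since the patent gives no figure for what counts as a small storage volume.

```python
from dataclasses import dataclass
from enum import Enum


class Backend(Enum):
    HADOOP = "hadoop"
    MYSQL = "mysql"


@dataclass
class Record:
    kind: str                  # e.g. "web_page", "operation_data"
    size_bytes: int
    needs_random_access: bool  # does the record need random reads/writes?


# Assumed cut-off for a "small" record; not taken from the patent.
SMALL_RECORD_LIMIT = 1_000_000


def choose_backend(record: Record) -> Backend:
    """Route small, randomly accessed records to MySQL, the rest to Hadoop."""
    if record.needs_random_access and record.size_bytes <= SMALL_RECORD_LIMIT:
        return Backend.MYSQL
    return Backend.HADOOP
```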
The bottom layer support subsystem comprises a semantic information extraction module and a semantic search engine that are connected to each other. The semantic information extraction module extracts semantic information from text and supports other modules that need semantic extraction and semantic analysis; it is connected to both the real-time data stream processing subsystem and the service output subsystem in order to transmit semantic analysis results. The semantic search engine integrates all tools and Application Programming Interface (API) modules related to semantic search and text processing, and is connected to both the Hadoop cluster and the service output subsystem, so that it can search data in the Hadoop cluster and transmit the results to the service output subsystem.
In this embodiment, the semantic information extraction module uses a semantic analysis technique that takes paragraphs as the unit of analysis and targets the attributes of people, events and things, extracting all the common aspects and attributes related to them. It also uses semantic clustering to record the amount of information on each topic and to remind the user to pay attention to important events. In this embodiment, semantic information in natural language text is described and annotated in a form very close to natural language, and no attempt is made to construct strict rules. Instead, starting from individual sentences that express similar meanings or contain similar semantic information, the semantic elements of interest are annotated manually; the unannotated parts of each sentence are analyzed with a built-in semantic dictionary to generate induced rules; the rules are ranked by how well they conform to natural language expression habits (also called "intuitive conformity"); and sentences not covered by the rules go through a new iteration. The result is a set of human-readable rules that can be used for semantic matching and text information extraction. This text semantic processing method effectively addresses web-based big data analysis, offers high precision, rich semantic information and high practicality, is suited to industrial deployment, and can fully release the value of text information when its output is used as input to techniques such as data mining. In addition, this semantic representation deduplicates well: the same piece of information is not stored multiple times, which saves storage space.
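The step of recording the amount of information per topic and flagging important events can be illustrated with a simple count over topic labels. The sketch assumes the semantic clustering has already assigned a topic label to each item; the function name `important_topics` and the `min_items` threshold are hypothetical, and the patent does not say how importance is decided.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def important_topics(topic_labels: Iterable[str],
                     min_items: int = 3) -> List[Tuple[str, int]]:
    """Count items per topic and return the topics worth flagging.

    Topics with at least min_items items are returned, largest first;
    ties are broken alphabetically so the order is deterministic.
    """
    counts = Counter(topic_labels)
    flagged = [(topic, n) for topic, n in counts.items() if n >= min_items]
    return sorted(flagged, key=lambda pair: (-pair[1], pair[0]))
```

For example, `important_topics(["a", "b", "a", "a", "b"], min_items=2)` returns `[("a", 3), ("b", 2)]`.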
The service output subsystem executes, schedules and presents specific services and comprises a precision marketing module, a data service module, a report generation module, a business information analysis module and a public opinion analysis module, which are connected in parallel. The precision marketing module provides technical support for precision marketing in the form of data collection, analysis and marketing tools; the data service module performs data collection and semantic analysis to meet customers' specific data requirements; the report generation module produces short, summarized information briefs that combine text and images for customers, and supports both automatic generation and periodic report compilation and writing; the business information analysis module performs business opportunity analysis, competitor analysis, and industry trend and data analysis; the public opinion analysis module performs topic tracking analysis, tracking analysis of events and related persons, and the collection and integrated analysis of online public opinion data. By analyzing the commercial behavior of Internet users, the service output subsystem enables precision marketing of enterprise products. It also provides business intelligence services that help enterprises gain insight into trends in their own industry and in upstream and downstream industries, seize business opportunities, avoid risks and make sound decisions quickly. It therefore has broad prospects for industrial application.
The system analyzes the commercial behavior of Internet users by monitoring the Internet and semantically analyzing text information, and recommends to those users products that match the identified business opportunities, thereby achieving precision marketing. In addition, by monitoring an enterprise's external business environment, it provides business intelligence services covering the market environment, industry trends, product and brand monitoring, and monitoring of the enterprise's upstream and downstream environment.
The above describes only preferred embodiments of the present invention, but the scope of protection of the invention is not limited thereto. Any equivalent substitution or modification made by a person skilled in the art, within the technical scope disclosed by the invention and according to its technical solutions and inventive concept, shall fall within the scope of protection of the invention.

Claims (6)

1. A semantics-based big data analysis business intelligence service system, comprising: a data acquisition and storage subsystem, a real-time data stream processing subsystem, a storage subsystem, a bottom layer support subsystem and a service output subsystem; wherein,
the data acquisition and storage subsystem comprises a distributed crawler module and a data source adapter that are independent of each other and are each connected to the real-time data stream processing subsystem; the distributed crawler module is responsible for data source detection, Internet data acquisition and HTML preprocessing, and the data source adapter is used to access third-party data resources; in the distributed crawler module, credibility weights are set for different information sources, and the distributed crawler module adopts one or more of a fixed-point stakeout strategy, a heuristic strategy and a broad collection strategy;
the real-time data stream processing subsystem is connected to the storage subsystem and comprises a temporary storage module and a data stream hook that are connected to each other; the temporary storage module uses the cluster's memory as a cache and temporarily stores the data acquired in real time so that modules with real-time requirements can read it; the data stream hook provides hooks on which modules can be mounted, and when data arrives, the hook attaches a basic description of the data for the mounted modules to read; a cache threshold is set in the real-time data stream processing subsystem, and data is emptied when the cache threshold is exceeded;
the storage subsystem is connected to the service output subsystem and comprises a Hadoop cluster and a MySQL cluster that are connected to each other; the Hadoop cluster stores large volumes of web page data and analysis results that do not require random reads and writes; the MySQL cluster stores smaller volumes of data that require random reads and writes;
the bottom layer support subsystem comprises a semantic information extraction module and a semantic search engine that are connected to each other; the semantic information extraction module extracts semantic information from text and supports other modules that need semantic extraction and semantic analysis, and is connected to both the real-time data stream processing subsystem and the service output subsystem; the semantic search engine integrates all tools and API modules related to semantic search and text processing, and is connected to both the Hadoop cluster and the service output subsystem;
the service output subsystem executes, schedules and presents specific services and comprises a precision marketing module, a data service module, a report generation module, a business information analysis module and a public opinion analysis module, which are connected in parallel; the precision marketing module provides technical support for precision marketing in the form of data collection, analysis and marketing tools; the data service module performs data collection and semantic analysis to meet customers' specific data requirements; the report generation module produces short, summarized information briefs that combine text and images for customers, and supports both automatic generation and periodic report compilation and writing; the business information analysis module performs business opportunity analysis, competitor analysis, and industry trend and data analysis; the public opinion analysis module performs topic tracking analysis, tracking analysis of events and related persons, and the collection and integrated analysis of online public opinion data.
2. The semantics-based big data analysis business intelligence service system of claim 1, wherein the cache threshold of the real-time data stream processing subsystem is 0.1 to 100 minutes.
3. The semantics-based big data analysis business intelligence service system of claim 1, wherein the Hadoop cluster provides persistent storage.
4. The semantics-based big data analysis business intelligence service system of claim 1, wherein operational data, data mining results and semantic analysis results are stored in the MySQL cluster.
5. The semantics-based big data analysis business intelligence service system of claim 1, wherein the semantic information extraction module uses a natural-language-like semantic information extraction technique to describe and annotate semantic information in natural language text in a natural-language-like form.
6. The semantics-based big data analysis business intelligence service system of claim 1, wherein the semantic information extraction module uses semantic clustering to record the amount of information on each topic and to remind the user to pay attention to important events.
CN201410348407.3A 2014-07-21 2014-07-21 A kind of big data analyzing business intelligence service system based on semanteme Expired - Fee Related CN104182389B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410348407.3A CN104182389B (en) 2014-07-21 2014-07-21 A kind of big data analyzing business intelligence service system based on semanteme

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410348407.3A CN104182389B (en) 2014-07-21 2014-07-21 A kind of big data analyzing business intelligence service system based on semanteme

Publications (2)

Publication Number Publication Date
CN104182389A CN104182389A (en) 2014-12-03
CN104182389B CN104182389B (en) 2018-01-19

Family

ID=51963449

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410348407.3A Expired - Fee Related CN104182389B (en) 2014-07-21 2014-07-21 A kind of big data analyzing business intelligence service system based on semanteme

Country Status (1)

Country Link
CN (1) CN104182389B (en)

Families Citing this family (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104281695B (en) * 2014-10-13 2017-12-15 安徽华贞信息科技有限公司 The semantic information abstracting method and its system of natural language based on combinatorial theory
CN104281697A (en) * 2014-10-15 2015-01-14 安徽华贞信息科技有限公司 Semantic-based hadoop system
CN106487562A (en) * 2015-09-01 2017-03-08 天脉聚源(北京)科技有限公司 A kind of method and system of wechat Users'Data Analysis
CN105320757A (en) * 2015-10-19 2016-02-10 杭州华量软件有限公司 Business intelligent analysis method for quickly processing data
CN105389348A (en) * 2015-10-27 2016-03-09 成都贝发信息技术有限公司 Open information interactive system
CN105389347A (en) * 2015-10-27 2016-03-09 成都贝发信息技术有限公司 Web based knowledge sharing system
CN105243515B (en) * 2015-11-09 2022-01-18 浙江中之杰软件技术有限公司 Enterprise condition management system
CN105447202A (en) * 2015-12-31 2016-03-30 宁波公众信息产业有限公司 Internet information collecting system
CN105787064A (en) * 2016-03-01 2016-07-20 广州铭诚计算机科技有限公司 Mining platform establishment method based on big data
EP3440569A4 (en) * 2016-04-05 2019-12-11 Fractal Industries, Inc. System for fully integrated capture, and analysis of business information resulting in predictive decision making and simulation
CN106776575A (en) * 2016-12-29 2017-05-31 深圳爱拼信息科技有限公司 A kind of system and method for real-time semantic search working opportunity
CN107563715A (en) * 2017-07-19 2018-01-09 天津云脉三六五科技有限公司 Foreign trade set-off marketing system and method
CN107632974B (en) * 2017-08-08 2021-04-13 北京微瑞思创信息科技股份有限公司 Chinese analysis platform suitable for multiple fields
CN107451292A (en) * 2017-08-16 2017-12-08 北京京东尚科信息技术有限公司 Scene feature data storage method, system and data extraction system on line
CN107704622A (en) * 2017-10-27 2018-02-16 成都艾薇尼尔信息技术有限公司 A kind of Intelligent Business service system based on big data analysis
CN107908778A (en) * 2017-12-04 2018-04-13 杭州华量软件有限公司 A kind of wisdom market big data management system
CN108764823A (en) * 2018-05-11 2018-11-06 甘肃祥龙科技服务有限责任公司 A kind of data analysis statistical system based on science service
TWI668649B (en) * 2018-05-18 2019-08-11 大陸商北京牡丹電子集團有限責任公司 System for expanding information of decision advice based on development direction of competitors and method thereof
CN109213983A (en) * 2018-07-13 2019-01-15 北京圣康汇金科技有限公司 A kind of generate online grinds reporting system and method
CN110717676A (en) * 2019-10-10 2020-01-21 广西电网有限责任公司 Method and system for managing and controlling performance risk
CN111026804A (en) * 2019-12-04 2020-04-17 深圳瑞力网科技有限公司 Big data analysis intelligent service system based on semantics
CN111708774B (en) * 2020-04-16 2023-03-10 上海华东电信研究院 Industry analytic system based on big data
CN112685385B (en) * 2020-12-31 2021-11-16 广西中科曙光云计算有限公司 Big data platform for smart city construction
CN112860971A (en) * 2021-02-05 2021-05-28 浙江华坤道威数据科技有限公司 Distributed multi-task based social negative public opinion real-time analysis method
CN114862284A (en) * 2022-07-06 2022-08-05 南通思普信息科技有限公司 Business intelligent module system based on cloud real-time semantic analysis

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101158963A (en) * 2007-10-31 2008-04-09 中兴通讯股份有限公司 Information acquisition processing and retrieval system
CN103389998A (en) * 2012-05-11 2013-11-13 安徽华贞信息科技有限公司 Novel Internet commercial intelligence information semantic analysis technology based on cloud service

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7260571B2 (en) * 2003-05-19 2007-08-21 International Business Machines Corporation Disambiguation of term occurrences

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
《互联网舆情监管与应对技术探究》;郝文江等;《专题研究》;20120331(第3期);全文 *
《分布式多主题网络爬虫系统的研究与实现》;白鹤等;《计算机工程》;20091031;第35卷(第19期);全文 *
《基于数据挖掘的竞争情报智能获取模型研究》;殷之明等;《情报探索》;20091231(第12期);全文 *

Also Published As

Publication number Publication date
CN104182389A (en) 2014-12-03

Similar Documents

Publication Publication Date Title
CN104182389B (en) A kind of big data analyzing business intelligence service system based on semanteme
CN105677844B (en) A kind of orientation of moving advertising big data pushes and user is across screen recognition methodss
CN103745000B (en) Hot topic detection method of Chinese micro-blogs
Jansen et al. Classifying web queries by topic and user intent
Chianese et al. Cultural heritage and social pulse: a semantic approach for CH sensitivity discovery in social media data
CN105068991A (en) Big data based public sentiment discovery method
Psomakelis et al. Big IoT and social networking data for smart cities: Algorithmic improvements on Big Data Analysis in the context of RADICAL city applications
Xu et al. Wikipedia‐based topic clustering for microblogs
Hossny et al. Feature selection methods for event detection in Twitter: a text mining approach
CN104281608A (en) Emergency analyzing method based on microblogs
CN104965823A (en) Big data based opinion extraction method
CN106227885A (en) Processing method, device and the terminal of a kind of big data
CN107704622A (en) A kind of Intelligent Business service system based on big data analysis
Das et al. A CV parser model using entity extraction process and big data tools
CN105183765A (en) Big data-based topic extraction method
Ouyang et al. Sentistory: multi-grained sentiment analysis and event summarization with crowdsourced social media data
Guo et al. A survey of Internet public opinion mining
TW201640383A (en) Internet events automatic collection and analysis method and system thereof
Xu et al. The mobile media based emergency management of web events influence in cyber-physical space
Subramani et al. Text mining and real-time analytics of twitter data: A case study of australian hay fever prediction
Wu et al. Sub-event discovery and retrieval during natural hazards on social media data
Plummer et al. Analysing the Sentiment Expressed by Political Audiences on Twitter: The case of the 2017 UK general election
CN109902230A (en) A kind of processing method and processing device of news data
Voronov et al. Forecasting popularity of news article by title analyzing with BN-LSTM network
Kaufhold et al. Big data and multi-platform social media services in disaster management

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by sipo to initiate substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20180119

Termination date: 20210721