CN115269771A - Big data analysis system based on semantics - Google Patents

Publication number
CN115269771A
Authority
CN
China
Prior art keywords
unit
data
analysis
information
service
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210708476.5A
Other languages
Chinese (zh)
Inventor
杨懿
孟庆森
马祥帅
杨政伟
孟庆麟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xuzhou Guoyun Information Technology Co ltd
Original Assignee
Xuzhou Guoyun Information Technology Co ltd
Application filed by Xuzhou Guoyun Information Technology Co ltd
Priority to CN202210708476.5A
Publication of CN115269771A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G06F16/34 Browsing; Visualisation therefor
    • G06F16/345 Summarisation for human users
    • G06F16/35 Clustering; Classification
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/02 Marketing; Price estimation or determination; Fundraising
    • G06Q30/0201 Market modelling; Market analysis; Collecting market data

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Databases & Information Systems (AREA)
  • Finance (AREA)
  • Development Economics (AREA)
  • Computational Linguistics (AREA)
  • Strategic Management (AREA)
  • Artificial Intelligence (AREA)
  • Accounting & Taxation (AREA)
  • Mathematical Physics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Game Theory and Decision Science (AREA)
  • Economics (AREA)
  • Marketing (AREA)
  • General Business, Economics & Management (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to the technical field of big data networks and discloses a semantic-based big data analysis system comprising a data collection unit, a data identification unit, a data analysis unit and a data visualization unit. The data collection unit stores and updates big data in real time; the data identification unit identifies and preliminarily filters the information required by a user; the data analysis unit integrates, classifies and associates the big data information, analyzes it to form an analysis result, and provides the real-time data required for analysis; and the data visualization unit presents the data analysis result in a graphical language that the user can recognize. The system also comprises an information processing unit, a service output unit, a service matching unit and a service processing unit. The invention offers distributed data acquisition, storage and monitoring, and further enables enterprises to analyze data, focus on business channels and gain opportunities for multi-party cooperation.

Description

Big data analysis system based on semantics
Technical Field
The invention relates to the technical field of big data networks, in particular to a big data analysis system based on semantics.
Background
As the internet becomes more widespread, the e-commerce industry and small and medium-sized businesses provide online services and use language processing technology to meet users' service requirements. Information resources are among the most important resources of an enterprise; developing them is both the starting point and the ultimate goal of enterprise informatization.
Big data has an important influence on value chain links of the small-sized cultural industry such as content creativity, product creation, marketing communication, extended services and terminal manufacturing. It can bring direct profit to cultural enterprises and, through positive feedback, create competitive advantages that are difficult to imitate. Big data makes the production of cultural products and the creation of experience value increasingly socialized and popularized, makes the innovation of cultural formats and business models increasingly routine and diversified, makes enterprises' understanding of market demand increasingly real-time and accurate, and makes the overall operation of cultural enterprises increasingly collaborative and ecosystem-oriented.
Most existing small and medium-sized enterprises are managed in a closed manner, and every link of the industrial chain keeps itself in a state of privacy protection. Although these enterprises have gradually adopted a business model that is mainly online and combines the virtual and the real, they remain closed in order to protect business privacy and cannot focus on big data analysis of online business channels, so they lose opportunities for cross-border integration and external cooperation. A semantic-based big data analysis system is therefore needed.
Disclosure of Invention
The invention provides a semantic-based big data analysis system that offers distributed acquisition, storage and monitoring of data, and further allows enterprises to analyze data, focus on business channels and gain opportunities for multi-party cooperation, thereby solving the problem described in the background that cooperation opportunities are lost because online channels cannot be focused on.
The invention provides the following technical scheme: a semantic-based big data analysis system comprises a data collection unit, a data identification unit, a data analysis unit and a data visualization unit, wherein the data collection unit is used for storing and updating big data in real time, the data identification unit is used for identifying and preliminarily filtering the information required by a user, the data analysis unit is used for integrating, classifying and associating the big data information, analyzing it to form an analysis result and providing the real-time data required for analysis, and the data visualization unit is used for presenting the data analysis result in a graphical language that the user can recognize;
the system also comprises an information processing unit, a service output unit, a service matching unit and a service processing unit; the information processing unit is used for passing on, in sequence, the data processing results of the data collection unit, the data identification unit, the data analysis unit and the data visualization unit; the service output unit is used by the user to input, search for and retrieve the required service information; the service matching unit is used for collecting user requirement information and big data processing information for matching and scheduling, and the service processing unit is used for executing and displaying the specific services of the user.
As an optional scheme of the semantic-based big data analysis system of the present invention: the data collection unit comprises a distributed acquisition unit, a data storage unit and a data monitoring unit;
the distributed acquisition unit is used for acquiring data among different acquisition stations;
the data storage unit is used for classifying and storing mass data;
the data monitoring unit is used for monitoring flowing data: it first completes effective interception according to a preset interception principle, then restores the intercepted data, and finally analyzes the restored data and makes a control decision.
As an optional scheme of the semantic-based big data analysis system of the present invention: the data identification unit comprises a text error correction unit, an emotional tendency analysis unit and a comment opinion extraction unit;
the text error correction unit identifies wrong segments in the text, carries out error prompt and gives out correct suggested text content;
the emotional tendency analysis unit is used for judging the emotional tendency type of the text comprising the subjective information;
the comment opinion extraction unit is used for automatically analyzing comment attention points and comment opinions and outputting comment opinion labels and comment opinion polarities.
As an optional scheme of the semantic-based big data analysis system of the present invention: the data analysis unit comprises a conversation emotion recognition unit, an article label unit, an article classification unit and a news summarization unit;
the conversation emotion recognition unit is used for recognizing, in a conversation scene, the user emotion behind the texts of the two parties and for giving a targeted reference reply script in combination with the context;
the article label unit is used for mining the core keywords of an article and providing technical support for personalized news recommendation, similar-article aggregation, text content analysis and the like;
the article classification unit is used for automatically classifying articles by content type, supporting 26 mainstream content types such as entertainment, sports, and science and technology, and providing basic technical support for applications such as article clustering and text content analysis;
and the news abstract unit automatically extracts key information in the news text and generates a news abstract with a specified length based on the deep semantic analysis model.
As an optional scheme of the semantic-based big data analysis system of the present invention: the data visualization unit comprises a structured data unit and an unstructured data unit;
the structured data unit is used for presenting data of a fixed format and limited length; the unstructured data unit is used for presenting data without a fixed format.
As an optional scheme of the semantic-based big data analysis system of the present invention: the service processing unit comprises an enterprise information recording unit, a marketing data collecting unit, a marketing report generating unit and a business information analyzing unit;
the enterprise information recording unit is used for accurately extracting name, telephone and address information from text, and for automatically supplementing and correcting it through natural language processing to assist address identification, so as to generate standardized structured information;
the marketing data collecting unit is used for precision marketing and provides technical support for data collection, analysis and marketing methods;
the marketing report generating unit is used for generating a summary marketing report for the user and supports periodic automatic generation, reporting and writing;
the business information analyzing unit is used for analyzing business information, competitor analyses and business marketing plan data.
As an optional scheme of the semantic-based big data analysis system of the present invention: the service processing unit also comprises a talent resource recommendation unit and a supplier resource recommendation unit;
the talent resource recommendation unit is used for recommending human resources to enterprise users and providing related services;
and the supplier resource recommendation unit is used for matching enterprise users with suitable suppliers and linking them.
As an optional scheme of the semantic-based big data analysis system of the present invention: the unstructured data unit comprises a visual graphic unit and a main body diagram unit;
the visual graphic unit is used for managing graphics and images; the main body diagram unit visually explains the data through realistic rendering, modeling, surfaces, attributes and animation.
As an optional scheme of the semantic-based big data analysis system of the present invention: when the service output unit, the service matching unit and the service processing unit operate, social network data are first collected and analyzed; a service request is then input to the service output unit based on the document classification and its results, and is distributed to service personnel in the service matching unit for interaction; after further processing the request enters the service processing unit, which triggers its service personnel to meet the user's requirements for recommending and searching information.
As an optional scheme of the semantic-based big data analysis system of the present invention: the service output unit comprises an automatic matching unit and a manual corresponding unit; the automatic matching unit automatically replies to the user after the information required by the user has been matched and processed, and the manual corresponding unit lets a staff member reply to the client after the information has been manually retrieved and matched.
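The split between automatic replies and manual handling described above might be sketched as follows. This is only an illustration under stated assumptions: the FAQ table, the `route` function and the similarity cutoff are hypothetical, and fuzzy string matching from Python's standard `difflib` module stands in for whatever matching the service output unit actually performs.

```python
import difflib

# Hypothetical knowledge base: known questions mapped to prepared answers.
FAQ = {
    "how do i reset my password": "Use the 'Forgot password' link on the login page.",
    "what are your opening hours": "We are open 9:00 to 18:00 on weekdays.",
}

def route(query, faq=FAQ, cutoff=0.75):
    """Reply automatically when a close match exists, otherwise hand off to a person."""
    match = difflib.get_close_matches(query.lower(), list(faq), n=1, cutoff=cutoff)
    if match:
        return ("automatic", faq[match[0]])
    return ("manual", None)  # no confident match: queue for manual retrieval

auto = route("How do I reset my pasword")
manual = route("talk about supplier contracts in Xuzhou")
```

The cutoff controls how readily the automatic path fires; a stricter value sends more requests to the manual path.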
The invention has the following beneficial effects:
1. In the semantic-based big data analysis system, the information processing unit, the service matching unit and the service processing unit connect and deliver the information and multi-party cooperation that an enterprise needs and feed the information back to enterprise clients, so that hierarchical, multi-dimensional business model innovation is achieved in terms of core capability, service portfolio, product lines, profit methods and the like, supporting the sustainable development of the enterprise.
2. In the semantic-based big data analysis system, distributed acquisition, storage and monitoring of data are achieved through the distributed acquisition unit, the data storage unit and the data monitoring unit, and the data are screened, so that the most useful, most intuitive and most similar data that the enterprise searches for are obtained for reference and support.
3. In the semantic-based big data analysis system, the data identification unit improves accuracy and reading experience when enterprise clients use search engines, helps merchants analyze products, and effectively improves search accuracy.
4. In the semantic-based big data analysis system, data analysis is carried out by the conversation emotion recognition unit, the article label unit, the article classification unit and the news summarization unit in the data analysis unit; through human-oriented emotion analysis and a news summarization model, summary results are formed quickly and conveniently, providing basic technical support for enterprise users.
5. In the semantic-based big data analysis system, the enterprise information recording unit, the marketing data collecting unit, the marketing report generating unit and the business information analyzing unit in the service processing unit analyze enterprise information and marketing data with reference to big data information and automatically generate a summary marketing report for enterprise users, further providing marketing plans and references for the next quarter. The talent resource recommendation unit and the supplier resource recommendation unit match enterprise users with top talent and supplier resources in their industries and link them to the enterprise, mitigating the problems that enterprises cannot obtain multi-channel information and cannot focus on online business cooperation.
Drawings
FIG. 1 is a schematic structural diagram of a semantic-based big data analysis system according to the present invention.
Fig. 2 is a schematic diagram of a service processing unit structure according to the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
Most existing small and medium-sized enterprises are managed in a closed manner, and every link of the industrial chain keeps itself in a state of privacy protection. Although these enterprises have gradually adopted a business model that is mainly online and combines the virtual and the real, in practice they remain closed in order to protect business privacy and cannot focus on big data analysis of online business channels, so they lose opportunities for cross-border integration and external cooperation.
The invention provides the following scheme. Referring to fig. 1-2, a semantic-based big data analysis system comprises a data collection unit, a data identification unit, a data analysis unit and a data visualization unit, wherein the data collection unit is used for storing and updating big data in real time, the data identification unit is used for identifying and preliminarily filtering the information required by a user, the data analysis unit is used for integrating, classifying and associating the big data information, analyzing it to form an analysis result and providing the real-time data required for analysis, and the data visualization unit is used for presenting the data analysis result in a graphical language that the user can recognize;
the system also comprises an information processing unit, a service output unit, a service matching unit and a service processing unit; the information processing unit is used for passing on, in sequence, the data processing results of the data collection unit, the data identification unit, the data analysis unit and the data visualization unit; the service output unit is used by the user to input, search for and retrieve the required service information; the service matching unit is used for collecting user requirement information and big data processing information for matching and scheduling, and the service processing unit is used for executing and displaying the specific services of the user.
At present, enterprises, governments and universities alike face data that can be neither stored nor computed. Big data is massive and cannot be processed quickly on a single machine; it has to be handled by vertical scaling and horizontal scaling, that is, large memory, high efficiency, large disks, large clusters and the like;
the continuously growing information resources of the internet contain a huge amount of commercially valuable information and have become an important source of business intelligence, but their value has not been fully developed and exploited by industry because of the huge data volume, the difficulty of acquisition, the relatively low value per unit, and the fact that almost all of the data are unstructured, such as text;
in the big data era, the business model of cultural enterprises is gradually taking the form of a mainly online model that combines the virtual and the real, with a new mechanism of industrial chain operation and cross-border integration supported by channel big data and customer big data; it is oriented toward creating unique and rich experience value and then mines the data, thereby providing online data support;
the big data are stored and updated in real time through the data collection unit, and the information required by the enterprise user is identified and preliminarily filtered through the data identification unit; the big data information is integrated, classified and associated through the data analysis unit, and once the association analysis result for the enterprise is obtained, it is provided to the enterprise as supporting data and displayed through the data visualization unit; the information processing unit, the service matching unit and the service processing unit connect and deliver the information and multi-party cooperation that the enterprise needs and feed the information back to enterprise clients, so that hierarchical, multi-dimensional business model innovation is achieved in terms of core capability, service portfolio, product lines, profit methods and the like, supporting the sustainable development of the enterprise;
in this way the enterprise can protect its privacy while still focusing on precise aggregated analysis, and can conveniently pursue cross-border integration with a multi-channel customer industrial chain to achieve multi-party cooperation.
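The flow described in this example, in which each unit hands its result to the next, can be sketched roughly as follows. This is a minimal illustration only: the function names, the record fields (`text`, `category`) and the text bar chart are assumptions made for the sketch and do not appear in the patent.

```python
from dataclasses import dataclass, field
from typing import Callable, List

@dataclass
class Pipeline:
    """Hypothetical stand-in for the information processing unit:
    it passes each stage's result to the next stage in order."""
    stages: List[Callable[[list], list]] = field(default_factory=list)

    def run(self, records: list) -> list:
        for stage in self.stages:
            records = stage(records)
        return records

def collect(records):
    # Data collection unit: take a fresh copy of the incoming records.
    return list(records)

def identify(records):
    # Data identification unit: preliminary filtering of empty records.
    return [r for r in records if r.get("text")]

def analyze(records):
    # Data analysis unit: group records by category and count them.
    groups = {}
    for r in records:
        groups.setdefault(r.get("category", "unknown"), []).append(r["text"])
    return [{"category": c, "count": len(t)} for c, t in sorted(groups.items())]

def visualize(results):
    # Data visualization unit: render a plain-text bar per category.
    return [f"{r['category']:<10} {'#' * r['count']}" for r in results]

pipeline = Pipeline([collect, identify, analyze, visualize])
sample = [
    {"text": "match report", "category": "sports"},
    {"text": "", "category": "sports"},
    {"text": "chip launch", "category": "tech"},
    {"text": "league table", "category": "sports"},
]
chart = pipeline.run(sample)
```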
Example 2
The present embodiment is an explanation based on embodiment 1, and specifically, referring to fig. 1-2, the data collection unit includes a distributed acquisition unit, a data storage unit, and a data monitoring unit;
the distributed acquisition unit is used for acquiring data among different acquisition stations;
the distributed data acquisition system adapts to both large-scale and small-to-medium-scale deployments, since a system of the required scale can be built by choosing an appropriate number of acquisition stations.
Because several data acquisition stations are used, each built around a single-chip microcomputer, a failure in one station affects only that station's data and has no effect on the rest of the system, which makes faults easy to locate and the station easy to replace. Because the distributed data acquisition system works with multiple machines in parallel, and each single-chip microcomputer handles only a limited share of the acquisition and processing tasks, hardware requirements are modest: a high-performance system can be built from low-end hardware, an advantage that a data acquisition system based on a single microcomputer cannot match.
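The fault-isolation property described above can be illustrated in software. In this sketch, threads stand in for the hardware acquisition stations; the `acquire` and `read_station` helpers and the sensor callables are hypothetical names introduced only for the illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def read_station(station_id, sensor):
    """Hypothetical station: call its sensor function and tag the reading."""
    return {"station": station_id, "value": sensor()}

def acquire(stations):
    """Poll all stations in parallel; a failing station loses only its own data."""
    readings, failed = [], []
    with ThreadPoolExecutor(max_workers=max(1, len(stations))) as pool:
        futures = {sid: pool.submit(read_station, sid, fn) for sid, fn in stations.items()}
        for sid, fut in futures.items():
            try:
                readings.append(fut.result(timeout=5))
            except Exception:
                failed.append(sid)  # record the fault so the station can be replaced
    return readings, failed

def broken_sensor():
    raise IOError("station offline")

stations = {"A": lambda: 21.5, "B": broken_sensor, "C": lambda: 19.0}
readings, failed = acquire(stations)
```

Station B fails here, yet the readings from A and C are still collected, and the `failed` list points directly at the unit that needs attention.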
The data storage unit is used for classifying and storing mass data;
the data monitoring unit is used for monitoring flowing data: it first completes effective interception according to a preset interception principle, then restores the intercepted data, and finally analyzes the restored data and makes a control decision.
Network data monitoring means that, for data flowing on the network, effective interception is first completed according to a preset interception principle, the intercepted data are then restored, and finally the restored data are analyzed and a control decision is made. Network monitoring is thus divided into three stages: data interception, data restoration, and control. The difficulty lies in the first and second stages. Data frames captured from the network are stored in a buffer. The capture part can be divided into a hardware part and a software part: the hardware part is the network interface device of the computer, and for the software part there are many open-source implementations, such as libpcap and WinPcap, its port to the Windows platform.
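The three stages (interception, restoration, control) might be sketched as below. The sketch works on raw frames already held in memory rather than calling libpcap or WinPcap, and the function names, the IPv4-only rule and the blocked-source policy are assumptions chosen for illustration. It also assumes untagged Ethernet frames and IPv4 headers without options.

```python
import socket
import struct

def is_ipv4(frame):
    """Hypothetical interception rule: untagged Ethernet frame carrying IPv4."""
    return len(frame) >= 34 and frame[12:14] == b"\x08\x00"

def intercept(frames, rule):
    """Stage 1: keep only the frames that match the preset interception rule."""
    return [f for f in frames if rule(f)]

def restore(frame):
    """Stage 2: recover IPv4 header fields from the raw bytes."""
    ip = frame[14:34]  # 20-byte IPv4 header follows the 14-byte Ethernet header
    return {
        "proto": ip[9],
        "src": socket.inet_ntoa(ip[12:16]),
        "dst": socket.inet_ntoa(ip[16:20]),
        "rest": frame[34:],  # everything after the IP header
    }

def decide(packet, blocked_sources):
    """Stage 3: a toy control decision based on the source address."""
    return "block" if packet["src"] in blocked_sources else "allow"

def make_frame(src, dst, proto=6, payload=b""):
    """Build a synthetic frame for the demo (checksums left as zero)."""
    eth = b"\x00" * 6 + b"\x11" * 6 + struct.pack("!H", 0x0800)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + len(payload), 0, 0, 64,
                     proto, 0, socket.inet_aton(src), socket.inet_aton(dst))
    return eth + ip + payload

frames = [make_frame("10.0.0.5", "10.0.0.1"),
          make_frame("192.168.1.9", "10.0.0.1"),
          b"\x00" * 10]  # truncated frame, dropped by the rule
packets = [restore(f) for f in intercept(frames, is_ipv4)]
decisions = [decide(p, {"192.168.1.9"}) for p in packets]
```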
Distributed acquisition, storage and monitoring of data are achieved through the distributed acquisition unit, the data storage unit and the data monitoring unit, and the data are screened, so that the most useful, most intuitive and most similar data that the enterprise searches for are obtained for reference and support.
Example 3
The present embodiment is an explanation based on embodiment 1, and specifically, please refer to fig. 1, where the data identification unit includes a text error correction unit, an emotional tendency analysis unit, and a comment viewpoint extraction unit;
the data analysis unit comprises a conversation emotion recognition unit, an article label unit, an article classification unit and a news abstract unit;
the text error correction unit identifies erroneous segments in the text, flags the errors and suggests corrected text; it supports various kinds of text such as short texts, long texts and speech recognition results, is widely used in search engines, speech recognition and content review, and can markedly improve semantic accuracy and the user reading experience in these scenarios;
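The behaviour of flagging a wrong segment and suggesting a replacement can be illustrated with a small dictionary-based sketch. The vocabulary, the `correct` function and the similarity cutoff are hypothetical, and fuzzy matching from Python's `difflib` is a simple stand-in, not the correction model the patent has in mind.

```python
import difflib

VOCAB = ["semantic", "analysis", "data", "system", "big"]  # hypothetical vocabulary

def correct(text, vocab=VOCAB, cutoff=0.8):
    """Return (suggested_text, errors); errors lists (position, wrong, suggestion)."""
    out, errors = [], []
    for i, word in enumerate(text.split()):
        if word.lower() in vocab:
            out.append(word)
            continue
        match = difflib.get_close_matches(word.lower(), vocab, n=1, cutoff=cutoff)
        if match:
            errors.append((i, word, match[0]))  # error prompt for this segment
            out.append(match[0])                # suggested correction
        else:
            out.append(word)                    # unknown word, left untouched
    return " ".join(out), errors

suggested, errors = correct("big dta analysys system")
```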
the emotional tendency analysis unit is used for judging the emotional tendency type of text containing subjective information; for a passage with subjective descriptions in a specific scene, it automatically identifies the core entity words in the text and judges the emotion and the confidence corresponding to each entity word; users may supply emotion-polarity-labeled corpora suited to their own application scene for optimization training on top of the general model, meeting the higher accuracy requirements of dedicated scenes;
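Per-entity emotion with a confidence value might look like the following lexicon-based sketch. The polarity word lists, the context window and the confidence formula are all assumptions chosen to keep the example small; a trained model, as the text describes, would replace them.

```python
POSITIVE = {"great", "fast", "friendly"}   # hypothetical polarity lexicon
NEGATIVE = {"slow", "rude", "broken"}

def entity_sentiment(text, entities, window=2):
    """For each entity word, judge polarity from nearby words and attach a crude confidence."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    results = {}
    for i, w in enumerate(words):
        if w not in entities:
            continue
        nearby = words[max(0, i - window): i + window + 1]
        pos = sum(t in POSITIVE for t in nearby)
        neg = sum(t in NEGATIVE for t in nearby)
        if pos == neg:
            results[w] = ("neutral", 0.0)
        else:
            label = "positive" if pos > neg else "negative"
            # Heuristic confidence: share of polarity evidence that agrees with the label.
            results[w] = (label, abs(pos - neg) / (pos + neg))
    return results

scores = entity_sentiment("The delivery was fast but the staff were rude.",
                          {"delivery", "staff"})
```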
the comment opinion extraction unit is used for automatically analyzing comment focus points and comment opinions and outputting comment opinion labels and comment opinion polarities. It supports opinion extraction for user comments on 13 categories of products, including food, hotels, automobiles and scenic spots, helping merchants analyze products and assisting users in consumption decisions; on top of the general version, it supports uploading custom comment vocabularies for the 13 vertical industries, which effectively improves the precision and recall of comment extraction, and it also supports user-defined "normalization tags" for comments;
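A custom comment vocabulary with normalization tags could be represented as a mapping from surface words to a normalized tag and a polarity. The vocabulary entries and the `extract_opinions` function below are hypothetical and only illustrate the idea for a single vertical (food).

```python
# Hypothetical custom vocabulary: surface word -> (normalized tag, polarity).
CUSTOM_VOCAB = {
    "delicious": ("taste", "positive"),
    "tasty": ("taste", "positive"),
    "pricey": ("price", "negative"),
    "expensive": ("price", "negative"),
}

def extract_opinions(comment, vocab=CUSTOM_VOCAB):
    """Return the distinct (tag, polarity) pairs found in a comment, in order of appearance."""
    tags, seen = [], set()
    for word in comment.lower().replace(",", " ").replace(".", " ").split():
        opinion = vocab.get(word)
        if opinion and opinion not in seen:
            seen.add(opinion)
            tags.append(opinion)
    return tags

opinions = extract_opinions("Delicious noodles, tasty soup, but pricey.")
```

Both "delicious" and "tasty" collapse to the single normalized tag `taste`, which is the effect a normalization tag is meant to have.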
through the data identification unit, accuracy and reading experience are improved when enterprise clients use search engines, merchants are helped to analyze products, and search accuracy is effectively improved.
Example 4
The present embodiment is an explanation based on embodiment 1, and specifically, referring to fig. 1, the data analysis unit further includes a conversation emotion recognition unit, an article label unit, an article classification unit, and a news summarization unit;
the conversation emotion recognition unit is used for recognizing, in a conversation scene, the user emotion behind the texts of the two parties and for giving a targeted reference reply script in combination with the context;
in a conversation scene, the user emotion behind the texts of the two parties is identified. First-level emotions are divided into 3 types: positive, neutral and negative. Positive emotions are subdivided into 3 types: love, pleasure and thanks; negative emotions are subdivided into 5 types: complaint, anger, disgust, fear and sadness. Based on the negative emotion recognized by the machine and the context, a targeted reference reply script is given to help the application party calm the customer's negative emotion at the first opportunity;
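The two-level emotion taxonomy above maps naturally onto a lookup table. In the sketch below the taxonomy follows the text, while the reply hints and the `reference_reply` function are hypothetical placeholders; recognizing the emotion from the conversation itself is outside the scope of the sketch.

```python
# Second-level emotion -> first-level emotion, following the taxonomy in the text.
SUB_TO_PRIMARY = {
    "love": "positive", "pleasure": "positive", "thanks": "positive",
    "complaint": "negative", "anger": "negative", "disgust": "negative",
    "fear": "negative", "sadness": "negative",
}

# Hypothetical reference reply scripts for negative emotions.
REPLY_HINTS = {
    "complaint": "Acknowledge the issue and state the next step.",
    "anger": "Apologize first, then offer a concrete remedy.",
}
DEFAULT_HINT = "Respond with empathy and ask how you can help."

def reference_reply(sub_emotion):
    """Return (first-level emotion, reply hint); hints are given only for negative emotions."""
    primary = SUB_TO_PRIMARY.get(sub_emotion, "neutral")
    if primary != "negative":
        return primary, None
    return primary, REPLY_HINTS.get(sub_emotion, DEFAULT_HINT)
```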
the article label unit is used for mining the core keywords of an article and providing technical support for personalized news recommendation, similar-article aggregation, text content analysis and the like;
the technology is advanced and its recognition is accurate: the article label service deeply analyzes the title and content of an article and outputs multi-dimensional labels that reflect the article's key information, such as themes, topics and entities, together with the corresponding label confidences; the labels are rich in dimensions, comprehensively cover the key information of the article, and can be widely applied in scenarios such as article aggregation, personalized recommendation and content retrieval;
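Labels with confidences can be approximated by simple keyword counting in which title words weigh more than body words. This frequency heuristic, the stopword list and the `article_tags` function are assumptions for illustration and are much cruder than the deep analysis the text describes.

```python
import re
from collections import Counter

STOPWORDS = {"a", "and", "are", "from", "of", "the"}  # hypothetical stopword list

def article_tags(title, body, top_k=3, title_weight=3):
    """Rank keywords by weighted frequency; confidence is relative to the top keyword."""
    counts = Counter()
    for w in re.findall(r"[a-z]+", body.lower()):
        if w not in STOPWORDS:
            counts[w] += 1
    for w in re.findall(r"[a-z]+", title.lower()):
        if w not in STOPWORDS:
            counts[w] += title_weight  # words in the title count for more
    if not counts:
        return []
    best = max(counts.values())
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
    return [(word, round(score / best, 2)) for word, score in ranked]

tags = article_tags("Solar power grows",
                    "Solar panels are cheaper. Power output from solar farms grows.")
```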
the article classification unit is used for automatically classifying articles by content type, supporting 26 mainstream content types such as entertainment, sports, and science and technology, and providing basic technical support for applications such as article clustering and text content analysis;
the article classification service deeply analyzes the article content and outputs the article's classification results, such as entertainment, society, music, humanities, science, history, military affairs, sports, science and technology, and education, together with the associated confidences; a confidence rating, such as highly relevant, moderately relevant or of low relevance, can be given for each classification result. The service has wide application value in scenarios such as personalized recommendation, article aggregation and text content analysis;
and the news summarization unit automatically extracts key information from the news text and generates a news summary of a specified length based on a deep semantic analysis model.
The unit can be used for scenes such as hot news aggregation, news recommendation, voice broadcast and APP message push. News semantics are analyzed comprehensively: traditional semantic features are combined with a deep learning model, phrase distribution and chapter structure are fully considered, and the importance of each news sentence is accurately calculated. Key information is extracted automatically to form the summary text, and the summary length can be flexibly controlled as required. The unit can serve applications such as content understanding, content distribution and intelligent writing.
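The length-controlled extraction described above can be illustrated with a minimal sketch. It is not the deep semantic analysis model of this embodiment: sentence importance is approximated here by average word frequency with a position bonus, and the function name summarize and the parameter max_chars are hypothetical.

```python
import re
from collections import Counter


def summarize(text: str, max_chars: int) -> str:
    """Pick high-scoring sentences, kept in original order, within max_chars."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"[a-z']+", text.lower()))

    def score(index: int) -> float:
        tokens = re.findall(r"[a-z']+", sentences[index].lower())
        if not tokens:
            return 0.0
        # Stand-in for "importance": average corpus frequency of the words.
        base = sum(freq[t] for t in tokens) / len(tokens)
        # Assumption: earlier sentences of a news text weigh more.
        return base * (1.0 + 1.0 / (index + 1))

    ranked = sorted(range(len(sentences)), key=score, reverse=True)
    chosen, used = [], 0
    for i in ranked:
        # One extra character for the joining space after the first sentence.
        extra = len(sentences[i]) + (1 if chosen else 0)
        if used + extra <= max_chars:
            chosen.append(i)
            used += extra
    return " ".join(sentences[i] for i in sorted(chosen))
```

Calling summarize(news_text, 60) returns at most 60 characters; a limit shorter than every sentence yields an empty string.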
Through the data analysis performed by the conversation emotion recognition unit, the article label unit, the article classification unit and the news summarization unit of the data analysis unit, and through humanized emotion analysis and the news summarization analysis model, summary results are formed conveniently and quickly, and basic technical support is provided for enterprise users.
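The two-level emotion taxonomy of this embodiment (3 first-level types, 3 positive subtypes, 5 negative subtypes) can be written down as a small lookup. This is a sketch only: the recognition model is assumed to run upstream and to supply the emotion label, and the reply texts, REPLY_SCRIPTS, first_level and reference_reply are hypothetical names that do not come from the source.

```python
from typing import Optional

# First-level emotions and their subtypes, as listed in this embodiment.
EMOTION_TAXONOMY = {
    "positive": ("love", "pleasure", "thanks"),
    "neutral": (),
    "negative": ("complaint", "anger", "disgust", "fear", "sadness"),
}

# Hypothetical reference reply scripts; the source does not give any wording.
REPLY_SCRIPTS = {
    "complaint": "We are sorry for the trouble and will look into it right away.",
    "anger": "We understand your frustration and are escalating this now.",
}
DEFAULT_NEGATIVE_REPLY = "We are sorry to hear that. How can we help?"


def first_level(label: str) -> str:
    """Map a first-level or second-level label to its first-level emotion."""
    if label in EMOTION_TAXONOMY:
        return label
    for top, subtypes in EMOTION_TAXONOMY.items():
        if label in subtypes:
            return top
    raise ValueError(f"unknown emotion label: {label}")


def reference_reply(label: str) -> Optional[str]:
    """Return a reference reply script for negative emotions only."""
    if first_level(label) != "negative":
        return None
    return REPLY_SCRIPTS.get(label, DEFAULT_NEGATIVE_REPLY)
```

Only negative emotions produce a reply script here, which mirrors the stated goal of calming the customer; a real system would also take the conversation context into account.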
Example 5
The present embodiment is an explanation based on embodiment 1, and specifically, please refer to fig. 1-2, where the data visualization unit includes a structured data unit and an unstructured data unit;
the structured data unit is used for presenting data with a fixed format and a limited length; the unstructured data unit is used for presenting data without a fixed format.
Unstructured data, which is increasingly common, is data of indefinite length and without a fixed format, such as video, voice and web pages; some data is in XML or HTML format. After big data is acquired, it passes through the following stages: data acquisition, data storage, data cleaning, data analysis and data visualization;
the core function of big data is to realize the value of data, that is, to generate various kinds of value from big data; realizing this value is the main task of big data work, through which things can be recorded, described and predicted. The strategic significance of big data technology lies not in possessing huge amounts of data information but in the specialized processing of the meaningful data.
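The five stages named in this embodiment (acquisition, storage, cleaning, analysis, visualization) can be chained as in the following toy sketch. The word-count analysis and the text bar chart are assumptions chosen only to keep the example small, and all function names are hypothetical.

```python
from collections import Counter
from typing import Iterable, List


def clean(records: Iterable[str]) -> List[str]:
    """Cleaning: normalize case and whitespace, drop empties and duplicates."""
    seen, cleaned = set(), []
    for record in records:
        normalized = " ".join(record.lower().split())
        if normalized and normalized not in seen:
            seen.add(normalized)
            cleaned.append(normalized)
    return cleaned


def analyze(records: Iterable[str]) -> Counter:
    """Analysis: count how often each word occurs."""
    return Counter(word for record in records for word in record.split())


def visualize(counts: Counter) -> str:
    """Visualization: one text bar per word, most frequent first."""
    rows = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return "\n".join(f"{word} {'#' * n}" for word, n in rows)


def run_pipeline(source: Iterable[str], storage: List[str]) -> str:
    raw = list(source)      # acquisition
    storage.extend(raw)     # storage
    return visualize(analyze(clean(storage)))
```

Passing the storage list in explicitly keeps the raw records available for later runs, which loosely corresponds to storing before cleaning.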
Example 6
The present embodiment is an explanation based on embodiment 1, and specifically, referring to fig. 2, the service processing unit includes an enterprise information recording unit, a marketing data collecting unit, a marketing report generating unit, and a business information analyzing unit;
the enterprise information recording unit is used for accurately extracting name, telephone and address information from text, and for automatically supplementing and correcting it with address recognition assisted by natural language processing, so as to generate standardized structured information;
the marketing data collection unit is used for precision marketing, providing technical support for data collection, analysis and marketing means;
the marketing report generation unit is used for generating a marketing summary for the user, and supports automatic generation, reporting and writing at regular intervals;
the business information analysis unit is used for analyzing enterprise information, competitor analyses and business marketing plan data.
By means of the enterprise information recording unit, the marketing data collecting unit, the marketing report generating unit and the business information analyzing unit in the service processing unit, enterprise information and marketing data are analyzed with reference to big data information, a marketing summary is automatically generated for enterprise users, and a marketing plan and reference for the next quarter are provided, which helps the enterprise users to make progress.
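The extraction of name, telephone and address into structured information can be sketched with a regular expression and a simple positional rule. The assumptions are loud ones: the telephone is taken to be an 11-digit number starting with 1, the name is taken to be the first comma-separated field, and everything else is treated as the address. The natural-language-processing-assisted address recognition of this embodiment is not reproduced, and Contact, PHONE_RE and extract_contact are hypothetical names.

```python
import re
from dataclasses import dataclass

# Assumption: an 11-digit mobile number starting with 1, not embedded in a longer digit run.
PHONE_RE = re.compile(r"(?<!\d)1\d{10}(?!\d)")


@dataclass
class Contact:
    name: str
    phone: str
    address: str


def extract_contact(text: str) -> Contact:
    """Split free text into name, phone and address fields."""
    match = PHONE_RE.search(text)
    phone = match.group(0) if match else ""
    remainder = text.replace(phone, " ") if phone else text
    # Accept both ASCII and full-width separators.
    parts = [p.strip() for p in re.split(r"[,;，；]", remainder) if p.strip()]
    name = parts[0] if parts else ""        # assumption: the name comes first
    address = ", ".join(parts[1:])          # the rest is treated as the address
    return Contact(name=name, phone=phone, address=address)
```

For example, extract_contact("Zhang San, 13800138000, No. 8 Jiefang Road, Xuzhou") fills all three fields; the sample values are placeholders.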
Example 7
The present embodiment is an explanation based on embodiment 1, and specifically, referring to fig. 2, the service processing unit further includes a talent resource recommending unit and a supplier resource recommending unit;
the talent resource recommendation unit is used for recommending human resources and providing related services to enterprise users;
and the supplier resource recommending unit is used for matching suitable suppliers for enterprise users and putting them in contact.
When an enterprise needs to improve its marketing strategy, the talent resource recommending unit and the supplier resource recommending unit match elite talent resources and supplier resources in the industry that suit the enterprise user and put the enterprise in contact with them, thereby alleviating the problems that the enterprise cannot obtain multi-channel information, cannot focus on online business cooperation, and may lack relevant talent resources and actual multi-party supplier resources.
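One simple way to picture the matching step is to score each candidate by how much its capability tags overlap with the enterprise's needs. The sketch below uses Jaccard similarity, which is an assumption made for illustration; the source does not state how matching is computed, and recommend, needs and candidates are hypothetical names.

```python
from __future__ import annotations


def jaccard(a: set[str], b: set[str]) -> float:
    """Overlap of two tag sets: size of intersection divided by size of union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(
    needs: set[str],
    candidates: dict[str, set[str]],
    top_n: int = 3,
) -> list[tuple[str, float]]:
    """Rank candidates (suppliers or talents) by tag overlap with the needs."""
    scored = [(name, jaccard(needs, tags)) for name, tags in candidates.items()]
    # Drop candidates with no overlap, then sort by score and name for stable output.
    matched = [item for item in scored if item[1] > 0]
    matched.sort(key=lambda item: (-item[1], item[0]))
    return matched[:top_n]
```

The same function can serve both recommending units if talents and suppliers are described by comparable tags.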
Example 8
The present embodiment is an improvement made on the basis of embodiment 1, and specifically, please refer to fig. 1, where the unstructured data unit includes a visual graphics unit and a main body diagram unit;
the visual graphic unit is used for managing graphics and images; the main body diagram unit visually explains the data through expression, modeling, and the display of surfaces, attributes and animation.
Example 9
The present embodiment is an explanation based on embodiment 1, and specifically, when the service output unit, the service matching unit and the service processing unit operate, after social network data has been collected and analyzed, a service request is input to the service output unit based on the classification and results of the documents; the service request is distributed to service personnel in the service matching unit for interaction and, after further processing, enters the service processing unit, which triggers the service personnel in the service processing unit to meet the user's needs for recommendation and information search.
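The flow of a service request from classification result to assigned service personnel can be sketched as follows. The round-robin assignment and the fallback pool named "general" are assumptions, since the source does not say how requests are distributed; ServiceRequest, ServiceMatchingUnit and process are hypothetical names.

```python
from dataclasses import dataclass
from itertools import cycle
from typing import Dict, List


@dataclass
class ServiceRequest:
    user: str
    category: str  # classification result of the analyzed document
    query: str


class ServiceMatchingUnit:
    """Distributes requests to service personnel, round-robin per category."""

    def __init__(self, staff_by_category: Dict[str, List[str]], fallback: str = "general"):
        if fallback not in staff_by_category:
            raise ValueError("a fallback pool is required")
        if any(not names for names in staff_by_category.values()):
            raise ValueError("every pool needs at least one person")
        self._fallback = fallback
        self._pools = {cat: cycle(names) for cat, names in staff_by_category.items()}

    def assign(self, request: ServiceRequest) -> str:
        # Unknown categories go to the fallback pool.
        pool = self._pools.get(request.category, self._pools[self._fallback])
        return next(pool)


def process(request: ServiceRequest, matcher: ServiceMatchingUnit) -> dict:
    """Service processing step: record who handles the request."""
    return {
        "user": request.user,
        "query": request.query,
        "handler": matcher.assign(request),
        "status": "in_progress",
    }
```

Keeping the matcher separate from process mirrors the split between the service matching unit and the service processing unit.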
Example 10
The present embodiment is an explanation based on embodiment 1, and specifically, referring to fig. 1, the service output unit includes an automatic matching unit and a manual corresponding unit, the automatic matching unit is configured to automatically reply to the user after information required by the user is matched, and the manual corresponding unit is configured to manually reply to the client after information required by the user is manually retrieved and matched.
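The split between automatic and manual replies can be sketched as a matcher with a fallback queue. The token-overlap score and the 0.5 threshold are assumptions for illustration only, and ServiceOutputUnit, knowledge_base and manual_queue are hypothetical names.

```python
from __future__ import annotations

from collections import deque
from typing import Optional


def _tokens(text: str) -> set[str]:
    return set(text.lower().split())


class ServiceOutputUnit:
    """Replies automatically when a stored question matches, else queues for staff."""

    def __init__(self, knowledge_base: dict[str, str], threshold: float = 0.5):
        self.knowledge_base = knowledge_base
        self.threshold = threshold
        self.manual_queue: deque[str] = deque()

    def reply(self, question: str) -> tuple[str, Optional[str]]:
        asked = _tokens(question)
        best_score, best_answer = 0.0, None
        for known, answer in self.knowledge_base.items():
            known_tokens = _tokens(known)
            union = asked | known_tokens
            score = len(asked & known_tokens) / len(union) if union else 0.0
            if score > best_score:
                best_score, best_answer = score, answer
        if best_answer is not None and best_score >= self.threshold:
            return ("automatic", best_answer)
        # No confident match: hand the question to the manual unit.
        self.manual_queue.append(question)
        return ("manual", None)
```

Questions that fall below the threshold accumulate in manual_queue, from which service personnel would retrieve, match and answer them by hand.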
It is noted that, herein, relational terms such as first and second, and the like may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus.
The foregoing is only a preferred embodiment of the present invention, and it should be noted that those skilled in the art can make various modifications and refinements without departing from the technical principle of the present invention, and these modifications and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (10)

1. A big data analysis system based on semantics, comprising a data collection unit, a data identification unit, a data analysis unit and a data visualization unit, characterized in that: the data collection unit is used for storing and updating big data in real time; the data identification unit is used for identifying and preliminarily filtering information required by a user; the data analysis unit is used for integrating, classifying and correlating the big data information to form an analysis result and provide the real-time data required for analysis; and the data visualization unit is used for presenting the data analysis result in a graphical language that the user can recognize;
the system further comprises an information processing unit, a service output unit, a service matching unit and a service processing unit; the information processing unit is used for sequentially transmitting the data processing results of the data collection unit, the data identification unit, the data analysis unit and the data visualization unit; the service output unit is used for the user to input and search for required service information; the service matching unit is used for collecting user requirement information and big data processing information for matching and scheduling; and the service processing unit is used for executing and displaying the specific services of the user.
2. The big data analyzing system based on semantics as claimed in claim 1, wherein: the data collection unit comprises a distributed acquisition unit, a data storage unit and a data monitoring unit;
the distributed acquisition unit is used for acquiring data among different acquisition stations;
the data storage unit is used for classifying and storing mass data;
the data monitoring unit is used for monitoring flowing data, finishing effective interception according to a set confusion principle, then carrying out data restoration on the intercepted data, and finally analyzing the restored data and making a certain control decision.
3. The big data analyzing system based on semantics as claimed in claim 1, wherein: the data identification unit comprises a text error correction unit, an emotional tendency analysis unit and a comment viewpoint extraction unit;
the text error correction unit identifies wrong segments in the text, carries out error prompt and gives correct suggested text content;
the emotional tendency analysis unit is used for judging the emotional tendency type of the text comprising the subjective information;
the comment viewpoint extraction unit is used for automatically analyzing comment attention points and comment viewpoints and outputting comment viewpoint labels and comment viewpoint polarities.
4. The big data analyzing system based on semantics as claimed in claim 1, wherein: the data analysis unit comprises a conversation emotion recognition unit, an article label unit, an article classification unit and a news summarization unit;
the conversation emotion recognition unit is used for recognizing, in a conversation scene, the user emotion underlying the texts of both conversation parties, and for giving a targeted reference reply script in combination with the context;
the article label unit is used for performing core keyword analysis on an article and providing technical support for personalized news recommendation, similar article aggregation, text content analysis and the like;
the article classification unit is used for automatically classifying articles by content type, supporting 26 mainstream content types such as entertainment, sports and science and technology, and providing basic technical support for applications such as article clustering and text content analysis;
and the news summarization unit automatically extracts key information from the news text and generates a news summary of a specified length based on a deep semantic analysis model.
5. The big data analyzing system based on semantics as claimed in claim 1, wherein: the data visualization unit comprises a structured data unit and an unstructured data unit;
the structured data unit is used for presenting data with a fixed format and a limited length; the unstructured data unit is used for presenting data without a fixed format.
6. A semantic-based big data analysis system according to claim 1, wherein: the service processing unit comprises an enterprise information recording unit, a marketing data collecting unit, a marketing report generating unit and a business information analyzing unit;
the enterprise information recording unit is used for accurately extracting name, telephone and address information from text, and for automatically supplementing and correcting it with address recognition assisted by natural language processing, so as to generate standardized structured information;
the marketing data collection unit is used for precision marketing, providing technical support for data collection, analysis and marketing means;
the marketing report generation unit is used for generating a marketing summary for the user, and supports automatic generation, reporting and writing at regular intervals;
the business information analysis unit is used for analyzing enterprise information, competitor analyses and business marketing plan data.
7. The big data analyzing system based on semantics as claimed in claim 1, wherein: the service processing unit also comprises a talent resource recommending unit and a supplier resource recommending unit;
the talent resource recommendation unit is used for recommending human resources and providing related services to enterprise users;
and the supplier resource recommending unit is used for matching suitable suppliers for enterprise users and putting them in contact.
8. A big data analysis system based on semantics as claimed in claim 5, wherein: the unstructured data unit comprises a visual graphic unit and a main body diagram unit;
the visual graphic unit is used for managing graphics and images; the main body diagram unit visually explains the data through expression, modeling, and the display of surfaces, attributes and animation.
9. The big data analyzing system based on semantics as claimed in claim 1, wherein: when the service output unit, the service matching unit and the service processing unit operate, after social network data are collected and analyzed, a service request is input to the service output unit based on the classification and the result of the document, the service request is distributed to service personnel in the service matching unit for interaction, the service request is further processed and enters the service processing unit, and then the service personnel in the service processing unit is triggered to meet the user requirements for recommending and searching information.
10. A semantic-based big data analysis system according to claim 1, wherein: the service output unit comprises an automatic matching unit and a manual corresponding unit, the automatic matching unit is used for automatically replying to the user after matching the information required by the user, and the manual corresponding unit is used for manually replying to the client after manual retrieval and matching.
CN202210708476.5A 2022-06-22 2022-06-22 Big data analysis system based on semantics Pending CN115269771A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210708476.5A CN115269771A (en) 2022-06-22 2022-06-22 Big data analysis system based on semantics

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210708476.5A CN115269771A (en) 2022-06-22 2022-06-22 Big data analysis system based on semantics

Publications (1)

Publication Number Publication Date
CN115269771A true CN115269771A (en) 2022-11-01

Family

ID=83761689

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210708476.5A Pending CN115269771A (en) 2022-06-22 2022-06-22 Big data analysis system based on semantics

Country Status (1)

Country Link
CN (1) CN115269771A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117150271A (en) * 2023-09-08 2023-12-01 南京栖西科技有限公司 Communication path matching method and system



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination