CN112559731B - Market emotion monitoring method and system - Google Patents

Market emotion monitoring method and system

Info

Publication number
CN112559731B
Authority
CN
China
Prior art keywords
text
emotion
topic
analysis result
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011499398.XA
Other languages
Chinese (zh)
Other versions
CN112559731A (en)
Inventor
杨次光
于洋
郜卓琪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Glabal Tone Communication Technology Co ltd
Original Assignee
Glabal Tone Communication Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Glabal Tone Communication Technology Co ltd
Priority to CN202011499398.XA
Priority to PCT/CN2020/139953 (WO2022126718A1)
Publication of CN112559731A
Application granted
Publication of CN112559731B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34 Browsing; Visualisation therefor
    • G06F16/345 Summarisation for human users
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/06 Asset management; Financial planning or analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Development Economics (AREA)
  • Finance (AREA)
  • Accounting & Taxation (AREA)
  • Economics (AREA)
  • Game Theory and Decision Science (AREA)
  • Human Resources & Organizations (AREA)
  • Operations Research (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the invention provides a market emotion monitoring method and system. The method comprises: acquiring a text set containing a plurality of texts; inputting any text into a text topic analysis model and obtaining the topic label of the text output by the model; inputting any text into a text emotion analysis model and obtaining the text emotion analysis result output by the model, the text emotion analysis result comprising a score corresponding to each preset emotion type; selecting, from the text set, the texts whose topic labels contain a preset topic as topic texts; obtaining a topic emotion analysis result of the preset topic based on the text emotion analysis result of each topic text; and obtaining a market emotion analysis result based on the topic emotion analysis result of each preset topic. The method and system provided by the embodiment of the invention realize automatic market emotion monitoring, effectively improve emotion monitoring efficiency, offer strong real-time performance and help improve investment accuracy.

Description

Market emotion monitoring method and system
Technical Field
The invention relates to the technical field of text analysis, in particular to a market emotion monitoring method and system.
Background
Accurate judgment of the market situation is an important premise for effective investment. Market emotion is an important factor influencing the market situation and strongly affects how accurately that situation can be judged. How to measure and judge market emotion accurately has always been a concern.
Previously, investors mainly analyzed national policy and researched industries by browsing news and consulting research reports, judged the market situation through comprehensive technical analysis and the like, and invested accordingly. Because information was scarce, mining market emotion from such limited information was extremely difficult.
With the rapid development of the Internet, market-related news and information emerge endlessly, cover every aspect of the market and grow explosively in volume. The timeliness and diversity of market information provide investors with rich material for market emotion monitoring, but they also make it hard for investors to judge market emotion accurately from massive information in a short time. Moreover, manual statistics and judgment of market emotion are inefficient, and the resulting market emotion is highly subjective, inaccurate and untimely, so the requirements of investors cannot be met.
How to realize efficient, accurate and objective market emotion monitoring has become a problem to be solved.
Disclosure of Invention
The embodiment of the invention provides a market emotion monitoring method and system, which are used for solving the problems of poor objectivity, low accuracy and poor timeliness of existing manual market emotion analysis.
In a first aspect, an embodiment of the present invention provides a market emotion monitoring method, including:
acquiring a text set containing a plurality of texts;
inputting any text into a text topic analysis model, and acquiring a topic label of any text output by the text topic analysis model; the text topic analysis model is obtained based on sample text and sample topic label training;
inputting any text into a text emotion analysis model, and obtaining a text emotion analysis result of any text output by the text emotion analysis model; the text emotion analysis result comprises a score corresponding to each preset emotion type; the text emotion analysis model is trained based on the sample text and the sample text emotion analysis result;
selecting, from the text set, the texts whose topic labels contain a preset topic as topic texts;
based on the text emotion analysis result of each topic text, obtaining a topic emotion analysis result of the preset topic;
and obtaining a market emotion analysis result based on the topic emotion analysis result of each preset topic.
In a second aspect, an embodiment of the present invention provides a market emotion monitoring system, including:
a text acquisition unit configured to acquire a text set including a plurality of texts;
the topic analysis unit is used for inputting any text into the text topic analysis model and acquiring a topic label of any text output by the text topic analysis model; the text topic analysis model is obtained based on sample text and sample topic label training;
the emotion analysis unit is used for inputting any text into the text emotion analysis model and obtaining a text emotion analysis result of any text output by the text emotion analysis model; the text emotion analysis result comprises a score corresponding to each preset emotion type; the text emotion analysis model is trained based on the sample text and the sample text emotion analysis result; the minimum confusion degree (perplexity) matching principle is adopted to mine the true semantic emotion of the text, so that both the emotion value of a single text and the total emotion value of all texts under a given label can be calculated;
the topic screening unit is used for selecting, from the text set, the texts whose topic labels contain a preset topic as topic texts;
the topic monitoring unit is used for acquiring topic emotion analysis results of the preset topics based on the text emotion analysis results of each topic text;
and the market monitoring unit is used for acquiring a market emotion analysis result based on the topic emotion analysis result of each preset topic.
In a third aspect, an embodiment of the present invention provides an electronic device, including a processor, a communication interface, a memory, and a bus, where the processor, the communication interface, and the memory are in communication with each other through the bus, and the processor may invoke logic instructions in the memory to perform the steps of the method as provided in the first aspect.
In a fourth aspect, embodiments of the present invention provide a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the method as provided by the first aspect.
According to the market emotion monitoring method and system provided by the embodiment of the invention, the text emotion analysis result of each text is obtained, and from it the topic emotion analysis result of each preset topic and the market emotion analysis result. Automatic market emotion monitoring based on artificial intelligence is thereby realized, emotion monitoring efficiency is effectively improved, objective and accurate emotion analysis results can be obtained with strong real-time performance, and investors are helped to reduce misjudgment and improve investment accuracy.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a schematic flow chart of a market emotion monitoring method according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a market emotion monitoring system according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of an emotion calculation flow;
Fig. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The timeliness and diversity of market information provide investors with rich material for market emotion monitoring, but they also make it hard for investors to judge market emotion accurately from massive information in a short time. Manual statistics and judgment of market emotion are inefficient, lack objectivity and accuracy, and are untimely, so the requirements of investors cannot be met. To address this, the embodiment of the invention provides a market emotion monitoring method. Fig. 1 is a flow chart of a market emotion monitoring method according to an embodiment of the present invention. As shown in Fig. 1, the method includes:
step 110, a text set is obtained that includes a plurality of texts.
Specifically, the text set contains a large amount of text. Here, the text may be a news report, or may be a market report or a comment of a social platform, which is not limited in particular in the embodiment of the present invention.
Step 120, inputting any text into the text topic analysis model to obtain the topic label of the text output by the text topic analysis model; the text topic analysis model is trained based on sample text and sample topic labels.
Specifically, the text topic analysis model is used for topic analysis of an input text, and further outputting topic labels corresponding to the text. Here, the topic labels output by the text topic analysis model are one or more topic labels in a preset topic label set, and the topic labels are used for characterizing topics to which the text belongs.
In addition, before step 120 is executed, the text topic analysis model may be trained in advance, for example as follows. First, a large number of sample texts and their sample topic labels are collected; a sample text is a text acquired in advance, and its sample topic label is the topic label, selected from the preset topic label set after topic analysis of the sample text, that corresponds to the topic to which the sample text belongs. An initial model is then trained on the sample texts and their sample topic labels to obtain the text topic analysis model. The initial model may be a single neural network model or a combination of multiple neural network models; the embodiment of the invention does not specifically limit the type and structure of the initial model.
Step 130, inputting any text into the text emotion analysis model, and obtaining a text emotion analysis result of the text output by the text emotion analysis model; the text emotion analysis result comprises a score corresponding to each preset emotion type; the text emotion analysis model is trained based on the sample text and the sample text emotion analysis result.
Specifically, the text emotion analysis model performs emotion analysis on the input text and outputs the text emotion analysis result of the text. Here, the text emotion analysis result includes a score of the text for each preset emotion type. Dictionary-based emotion value calculation conventionally finds emotion words directly with a sliding window. The invention observes that extracting emotion words with a sliding window should take the confusion degree (perplexity) of the words into account, and therefore adopts a variant of the minimum confusion degree matching principle: during rule matching, entries are matched preferentially in the order rule-conforming expression > phrase > single word. The confusion degree indicates how accurately the model captures the semantics of an emotion word. The smaller the confusion degree, the more genuine the emotion and the greater the emotion weight assigned. In the emotion calculation, longer expressions have a lower confusion degree and express emotion more genuinely; length information contributed by additional elements such as adverbs is also taken into account, so that emotion words receive different weights. The invention innovatively applies the minimum confusion degree matching principle to the ten-level emotion system, which improves disambiguation, matches finer-grained emotion categories more precisely and expresses the emotion information of the text more faithfully.
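The matching priority described above can be illustrated with a minimal sketch. It approximates the order "rule-conforming expression > phrase > single word" by trying the longest dictionary entry first at each window position, and it assumes that the weight grows linearly with the match length. The dictionary `EMOTION_DICT`, its scores and the function `match_emotions` are hypothetical and are not taken from the patent.

```python
# Hypothetical toy dictionary: token tuple -> (emotion type, base score).
EMOTION_DICT = {
    ("panic",): ("panic", 1.0),
    ("panic", "selling"): ("panic", 2.0),
    ("hope",): ("hope", 1.0),
    ("no", "hope"): ("despair", 2.0),
}
MAX_LEN = max(len(key) for key in EMOTION_DICT)


def match_emotions(tokens):
    """Slide over tokens, preferring the longest dictionary entry at each position."""
    hits, i = [], 0
    while i < len(tokens):
        for n in range(min(MAX_LEN, len(tokens) - i), 0, -1):
            key = tuple(tokens[i:i + n])
            if key in EMOTION_DICT:
                emotion, base = EMOTION_DICT[key]
                # Assumption: a longer match is less ambiguous, so its weight
                # is the base score multiplied by the match length.
                hits.append((emotion, base * n))
                i += n
                break
        else:
            i += 1  # no entry starts here, move the window by one token
    return hits


hits = match_emotions("there is no hope after panic selling".split())
```

Because the two-word entry "no hope" is tried before the single word "hope", the sentence is labeled as despair rather than hope, which is the kind of disambiguation the principle is meant to provide.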
In addition, before executing step 130, a text emotion analysis model may be trained in advance, specifically, may be trained as follows: firstly, collecting a large number of sample texts and emotion analysis results of the sample texts; the sample text is a pre-acquired text, and the emotion analysis result of the sample text is a score of each preset emotion type obtained by scoring each preset emotion type after emotion analysis of the sample text. And training the initial model based on the sample text and the emotion analysis result of the sample text, thereby obtaining a text emotion analysis model. The initial model may be a single neural network model or a combination of multiple neural network models, and the embodiment of the invention does not specifically limit the type and structure of the initial model.
Step 140, selecting, from the text set, the texts whose topic labels contain a preset topic as topic texts.
Here, a preset topic is a topic label selected in advance from the preset topic label set. After the topic label of each text has been obtained in step 120, the texts in the text set whose topic labels contain the preset topic are selected as topic texts; the topic texts are the texts corresponding to the preset topic.
Step 150, obtaining a topic emotion analysis result of the preset topic based on the text emotion analysis result of each topic text.
Step 160, obtaining a market emotion analysis result based on the topic emotion analysis result of each preset topic.
According to the method provided by the embodiment of the invention, the text emotion analysis result of each text is obtained, and from it the topic emotion analysis result of each preset topic and the market emotion analysis result. Automatic market emotion monitoring based on artificial intelligence is thereby realized, emotion monitoring efficiency is effectively improved, objective and accurate emotion analysis results can be obtained with strong real-time performance, and investors are helped to reduce misjudgment and improve investment accuracy.
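Steps 140 to 160 can be sketched as follows. The patent does not state how text results are combined into a topic result or how topic results are combined into a market result, so this sketch assumes a plain average at both levels. The record layout (`topics`, `emotions`) and the function names are hypothetical.

```python
from collections import defaultdict


def topic_emotion(texts, preset_topic):
    """Average per-emotion scores over texts whose topic labels contain preset_topic."""
    topic_texts = [t for t in texts if preset_topic in t["topics"]]
    totals = defaultdict(float)
    for t in topic_texts:
        for emotion, score in t["emotions"].items():
            totals[emotion] += score
    n = len(topic_texts)
    return {e: s / n for e, s in totals.items()} if n else {}


def market_emotion(texts, preset_topics):
    """Average the topic results of all preset topics that have at least one text."""
    per_topic = [topic_emotion(texts, p) for p in preset_topics]
    per_topic = [r for r in per_topic if r]
    totals = defaultdict(float)
    for result in per_topic:
        for emotion, score in result.items():
            totals[emotion] += score
    return {e: s / len(per_topic) for e, s in totals.items()} if per_topic else {}


# Invented example data: topic labels and emotion scores per text.
texts = [
    {"topics": {"rates"}, "emotions": {"worry": 4, "hope": 0}},
    {"topics": {"rates", "energy"}, "emotions": {"worry": 2, "hope": 2}},
    {"topics": {"sports"}, "emotions": {"worry": 0, "hope": 5}},
]
rates = topic_emotion(texts, "rates")
market = market_emotion(texts, ["rates", "energy"])
```

A text with several topic labels contributes to every preset topic it contains, as the second record does here.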
For any text, the text is first preprocessed, which includes removing garbled characters, removing stop words, splitting the text into sentences and the like. An emotion value is then calculated for each sentence of the text in a loop. Finally, a mapping function combines the sentence emotion values into an overall emotion value of the text. The specific text emotion calculation and analysis flow is shown in Fig. 3.
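The flow above (preprocess, score each sentence in a loop, map to one value) might look like the sketch below. The stop-word list, the signed word scores and the choice of the arithmetic mean as the mapping function are all assumptions made for illustration; the patent does not specify the mapping function.

```python
import re

STOP_WORDS = {"the", "a", "is"}            # hypothetical stop-word list
WORD_SCORES = {"fear": -2.0, "hope": 1.5}  # hypothetical signed word scores


def split_sentences(text):
    """Split on sentence-ending punctuation and drop empty pieces."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def sentence_value(sentence):
    """Sum the scores of the non-stop-word tokens in one sentence."""
    tokens = [t for t in re.findall(r"[a-z]+", sentence.lower())
              if t not in STOP_WORDS]
    return sum(WORD_SCORES.get(t, 0.0) for t in tokens)


def text_value(text):
    """Loop over sentences, then map the sentence values to one text value."""
    values = [sentence_value(s) for s in split_sentences(text)]
    # Assumed mapping function: arithmetic mean of the sentence values.
    return sum(values) / len(values) if values else 0.0


value = text_value("Fear is rising. There is hope!")
```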
Based on the above embodiment, in the method, step 110 specifically includes: determining a data acquisition mode, the data acquisition mode being determined according to preset acquisition objects and their acquisition frequency, acquisition content and object grade; and, based on the data acquisition mode, acquiring texts from the acquisition objects through web crawlers and constructing the text set.
Specifically, an acquisition object is a source from which text is collected. It may be the website of a market-related institution, a financial news website, social media, a market report published by a research institution and so on; the embodiment of the present invention does not particularly limit this. The acquisition frequency is the frequency of text collection. The acquisition content includes the title, the body text (including pictures and videos), the release time, the release platform and the like. The object grade represents the importance of an acquisition object and can be determined according to the popularity or credibility of the acquisition object. Each acquisition object has a corresponding acquisition frequency, acquisition content and object grade.
A web crawler (also called a web spider or a web robot) is a program or script that automatically captures web information according to preset rules. After the data acquisition mode is determined, texts can be captured from each acquisition object by the web crawler according to the data acquisition mode. Further, the weight that the text emotion analysis result of a text captured from a given acquisition object carries in the topic emotion analysis result and the market emotion analysis result can be set according to the object grade of that acquisition object.
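Weighting by object grade could be realized as a weighted average, as in this minimal sketch. The grade names and weight values in `GRADE_WEIGHT` are invented for illustration; the patent only states that the weight can be set according to the object grade.

```python
# Hypothetical mapping from object grade to weight.
GRADE_WEIGHT = {"A": 3.0, "B": 2.0, "C": 1.0}


def weighted_score(items):
    """items: (object grade, emotion score) pairs for one emotion type."""
    total_weight = sum(GRADE_WEIGHT[grade] for grade, _ in items)
    if total_weight == 0:
        return 0.0
    return sum(GRADE_WEIGHT[grade] * score for grade, score in items) / total_weight


# One text from a grade-A source scoring 4.0 and one from a grade-C source scoring 0.0.
score = weighted_score([("A", 4.0), ("C", 0.0)])
```

With these assumed weights, the grade-A text dominates the result, so the combined score stays close to the score of the more important source.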
Based on any of the above embodiments, the method further includes, before step 120: preprocessing a text set; the preprocessing includes data cleaning and data governance.
In particular, text captured from the acquisition objects, especially from social media, may contain many misspelled words, grammatical errors and stop words, as well as invalid data, outdated data or data only weakly related to the market, all of which may interfere with the accuracy of market emotion statistics and monitoring. Data cleaning discovers and corrects identifiable errors in the text, for example by checking data consistency, removing duplicates and handling invalid and missing values. Cleaning the text ensures its accuracy and validity.
In addition, texts captured from different acquisition objects differ in style of expression and need to be unified through standardized processing, so that market emotion statistics and monitoring can be based on all of the texts. Data governance refers to the process of moving from scattered data to unified master data. Text can be governed in six respects, namely metadata, master data, data standards, data models, data quality and data security, so that the text data is standardized.
After each text has been preprocessed, the preprocessed text is used for market emotion monitoring.
Based on any of the above embodiments, the method further includes, before step 120: performing market relevance screening on the text set, and deleting the texts whose market relevance is lower than a preset relevance threshold.
In particular, the text set covers information of many kinds, and some of the texts may be irrelevant to the market. Such texts need to be filtered out before the text set is applied to market emotion monitoring, so that only market-related texts are retained. Here, the market relevance measures how relevant a text is to the market, and the preset relevance threshold is a threshold set in advance for deciding whether a text is relevant to the market.
Any text may be input into the relevance model, and the market relevance of the text output by the relevance model is obtained. The relevance model herein may be trained based on the sample text and the market relevance to which the sample text corresponds.
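The screening step can be sketched as a threshold filter around a relevance function. In the patent the relevance comes from a trained relevance model; here `toy_relevance` is only a stand-in that counts market terms, and `MARKET_TERMS` and the threshold value are invented for illustration.

```python
def filter_by_relevance(texts, relevance_fn, threshold):
    """Keep texts whose market relevance is not lower than the preset threshold."""
    return [t for t in texts if relevance_fn(t) >= threshold]


# Stand-in for the trained relevance model: fraction of tokens that are market terms.
MARKET_TERMS = {"stock", "bond", "market", "rates"}


def toy_relevance(text):
    tokens = text.lower().split()
    return sum(t in MARKET_TERMS for t in tokens) / len(tokens) if tokens else 0.0


kept = filter_by_relevance(
    ["stock market rallies", "local team wins final"],
    toy_relevance,
    threshold=0.3,
)
```

Passing the relevance function as a parameter keeps the filter unchanged when the stand-in is replaced by an actual model.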
Based on any of the above embodiments, in the method, step 130 specifically includes: inputting any text into an emotion labeling sub-model in the text emotion analysis model, and obtaining a labeling result output by the emotion labeling sub-model, wherein the emotion labeling sub-model is established based on an emotion dictionary; and inputting the labeling result into a labeling analysis sub-model in the text emotion analysis model, and obtaining the text emotion analysis result output by the labeling analysis sub-model.
Specifically, the text emotion analysis model comprises an emotion labeling sub-model and a labeling analysis sub-model. The emotion labeling sub-model analyzes and sorts out, based on the emotion dictionary, each word in the input text and the collocations among the words, labels each preset emotion type in the text and outputs the labeling result. The labeling analysis sub-model performs statistical analysis on the labeling result output by the emotion labeling sub-model and obtains the text emotion analysis result.
The invention adopts a series of techniques to support the construction and expansion of the emotion dictionary. Starting from existing emotion dictionaries, it combines automatic construction with manual review to build a domain dictionary suited to the characteristics of financial news, namely a financial emotion dictionary. Compared with traditional manual dictionary construction, the automatic construction technique in the "seed word + Word2vec + dictionary" mode reduces time consumption and improves the quality of the dictionary.
Meanwhile, to overcome the semantic gap between words in a bag-of-words representation, and given that similarity in vector space can represent similarity in text semantics, the invention uses the Word2vec deep learning approach to reduce, through training, the processing of text content to vector operations in a k-dimensional vector space. The vectors are computed mainly with the CBOW model, which predicts the center word from its context words. A large raw corpus is first built, Word2vec is used to pre-train word vectors on it, and the result of CBOW training captures the relations among words well, yielding high-quality word vectors.
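Seed-word expansion over word vectors can be illustrated with cosine similarity, as in the sketch below. The two-dimensional vectors are invented for illustration; in practice they would come from Word2vec training, which is not shown here. The function names and the similarity cutoff `min_sim` are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def expand_seeds(seeds, vectors, min_sim):
    """Return non-seed words whose vector is close to the vector of any seed word."""
    return sorted(
        word for word, vec in vectors.items()
        if word not in seeds
        and any(cosine(vec, vectors[s]) >= min_sim for s in seeds)
    )


# Toy vectors, invented for illustration only.
vectors = {"panic": (1.0, 0.1), "selloff": (0.9, 0.2), "rally": (-0.8, 0.6)}
candidates = expand_seeds({"panic"}, vectors, min_sim=0.9)
```

The candidates found this way would still go through the manual review that the text describes before entering the dictionary.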
When the dictionary is maintained, a new word discovery technique based on part of speech, word frequency and PMI (pointwise mutual information) is adopted, so that a dictionary of candidate emotion words can be built in time and the dictionary is easy to maintain and expand.
PMI measures the association between two variables. If x and y are independent, p(x, y) = p(x) p(y); the more strongly x and y are associated, the larger the ratio of p(x, y) to p(x) p(y), so this ratio reflects how strongly the occurrences of x and y are associated. New words with strong association are aggregated by the new word discovery technique into a candidate emotion dictionary, which then undergoes manual review, screening and the like, so that the emotion dictionary is maintained and expanded quickly and efficiently. The invention thus combines the new word discovery technique with the ten-level emotion dictionary, which helps complete and supplement the ten-level emotion dictionary and is both innovative and widely applicable.
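In its standard form, PMI(x, y) = log( p(x, y) / (p(x) p(y)) ), which is zero when x and y are independent and positive when they co-occur more often than chance. The sketch below estimates PMI for adjacent word pairs from raw counts. The corpus is invented, and restricting candidates to adjacent pairs is an assumption; the patent additionally uses part of speech and word frequency, which are not modeled here.

```python
import math
from collections import Counter


def pmi_table(sentences):
    """PMI for adjacent word pairs, with probabilities estimated from raw counts."""
    words, pairs = Counter(), Counter()
    for sentence in sentences:
        tokens = sentence.split()
        words.update(tokens)
        pairs.update(zip(tokens, tokens[1:]))
    n_words, n_pairs = sum(words.values()), sum(pairs.values())
    return {
        (x, y): math.log(
            (count / n_pairs) / ((words[x] / n_words) * (words[y] / n_words))
        )
        for (x, y), count in pairs.items()
    }


corpus = ["bear market fear", "bear market panic", "market fear"]
pmi = pmi_table(corpus)
```

Pairs with a high PMI, such as "bear market" in this toy corpus, would be proposed as candidate new words for manual review.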
Based on any of the above embodiments, the method further includes, after step 150: if the topic emotion analysis result of any preset topic exceeds the emotion threshold corresponding to the preset topic, sending out alarm information.
Specifically, after the topic emotion analysis result of any preset topic is obtained, it is compared with the emotion threshold corresponding to that preset topic. Here, the emotion threshold is a preset upper limit on the emotion value. If the topic emotion analysis result exceeds the emotion threshold, an early warning mechanism is triggered to alert the relevant personnel. The alarm information may be delivered to investors, researchers and other relevant personnel in various ways, such as an automated phone call, a short message or an e-mail.
Based on any of the above embodiments, the method further includes, after step 160: if the market emotion analysis result exceeds a preset market emotion threshold, sending out alarm information.
Specifically, after the market emotion analysis result is obtained, it is compared with the preset market emotion threshold. Here, the preset market emotion threshold is a preset upper limit on the market emotion value. If the market emotion analysis result exceeds the preset market emotion threshold, the early warning mechanism is triggered to alert the relevant personnel.
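The threshold comparison for topics can be sketched as below; the market-level check is the same comparison applied to a single value. The function `check_alerts`, the topic names, the values and the `notify` callback are hypothetical; a real callback would place the phone call or send the message.

```python
def check_alerts(topic_results, topic_thresholds, notify):
    """Call notify(topic, value) for each topic whose value exceeds its threshold."""
    alerted = []
    for topic, value in topic_results.items():
        # Topics without a configured threshold never trigger an alert.
        if value > topic_thresholds.get(topic, float("inf")):
            notify(topic, value)
            alerted.append(topic)
    return alerted


sent = []  # records the notifications instead of actually sending them
alerted = check_alerts(
    {"rates": 4.2, "energy": 1.0},
    {"rates": 4.0, "energy": 3.0},
    notify=lambda topic, value: sent.append((topic, value)),
)
```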
Based on any of the above embodiments, in the method, the predetermined emotion types include worry, fear, disappointment, panic, despair, hope, optimism, ease, excitement and excitement.
Specifically, the paraphrasing of the ten preset emotion types and the corresponding emotion sensitivities are shown in the following table:
In the table, the emotional sensitivity of any preset emotion type is divided into 6 grades from 0 to 5; the higher the grade, the stronger the preset emotion type.
Based on any of the above embodiments, a market emotion monitoring method includes the steps of:
first, a market emotion classification system is established. Market emotion classification systems mainly include ten preset emotion types, such as worry, fear, disappointment, panic, despair, hope, optimism, mind, excitement, and contrasts, and corresponding six emotion acuity indicators. Here, the emotion acuity index is a score of a preset emotion type.
And secondly, determining the acquisition object and the corresponding acquisition frequency, acquisition content and object grade thereof, and determining a data acquisition mode. Here, the collection objects include, but are not limited to, domestic and foreign institution websites, financial news websites, research institution public reports, social media, and the like. The object level is divided according to the platform influence of the acquisition object. The collected content can cover information such as title, text (including pictures, videos, and the like involved in the text), release time, release platform, and the like.
Subsequently, a sample text is collected. Based on the data acquisition method, sample texts are acquired by the web crawlers going to each acquisition object. Here, the sample text involves 65 languages covering more than 200 countries and regions worldwide, and the final sample text size reaches over 5000 tens of thousands.
Then, the sample text is subjected to data processing. The data processing here includes data cleansing, data governance, text screening and corpus labeling. The text screening is to check and screen the sample texts after cleaning and treating one by professionals (including professional background personnel of finance profession, economics profession, psychology profession, linguistics and the like), and screen out the sample texts related to the market. The corpus labeling is to carry out theme labeling and emotion labeling on each sample text by a professional to obtain a sample theme label of the sample text, calculate emotion labeling results of the sample text types in a statistics mode, and obtain sample text emotion analysis results of the sample text.
A text topic analysis model is trained based on the sample texts and the sample topic labels. Based on the sample emotion labeling results, the related words, fixed collocations and clauses corresponding to each preset emotion type are sorted out, and the existing emotion dictionary and corpus are expanded using techniques such as vocabulary similarity, new word discovery and text similarity. A text emotion analysis model is then constructed on the basis of the emotion dictionary. The minimum confusion degree matching principle is adopted to mine the true semantic emotion of a text, so that the emotion of a single text can be calculated, and the total emotion value of all texts under a given label can also be calculated.
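The two calculations named above (the emotion of a single text, and the total over all texts under a label) can be sketched with a plain dictionary lookup. This does not implement the minimum confusion degree matching principle, which the text does not define; the substring matching, the dictionary layout `term -> (emotion, weight)` and both function names are assumptions.

```python
def score_text(text, emotion_dict, emotion_types):
    """Score one text: sum the weights of every dictionary term it contains.

    emotion_dict maps a lowercase term to (emotion, weight); every emotion
    used in the dictionary must also appear in emotion_types.
    """
    scores = {e: 0.0 for e in emotion_types}
    lowered = text.lower()
    for term, (emotion, weight) in emotion_dict.items():
        scores[emotion] += weight * lowered.count(term)
    return scores


def total_by_label(scored_texts, label, emotion_types):
    """Sum the scores of all texts whose label set contains `label`.

    scored_texts is an iterable of (labels, scores) pairs.
    """
    totals = {e: 0.0 for e in emotion_types}
    for labels, scores in scored_texts:
        if label in labels:
            for e in emotion_types:
                totals[e] += scores[e]
    return totals
```

Plain substring counting is the simplest possible matcher; a real system would need tokenization per language, since the corpus is described as covering 65 languages.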
After the text topic analysis model and the text emotion analysis model are obtained, texts are acquired. Any text is input into the text topic analysis model to obtain the topic label of that text output by the text topic analysis model, and the same text is input into the text emotion analysis model to obtain the text emotion analysis result of that text output by the text emotion analysis model.
Then, the texts whose topic labels contain a preset topic are selected as topic texts, and the topic emotion analysis result of the preset topic is obtained based on the text emotion analysis result of each topic text. If the topic emotion analysis result of any preset topic exceeds the emotion threshold corresponding to that preset topic, alarm information is sent out.
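The selection, aggregation and threshold check described in this step might look like the following. The patent does not state how per-text results are combined into a topic result, so the arithmetic mean used here is an assumption, as are the record layout (`"topics"` and `"scores"` keys) and the function names.

```python
def topic_texts(texts, topic):
    """Select the texts whose topic labels contain the preset topic."""
    return [t for t in texts if topic in t["topics"]]


def topic_emotion(texts, topic, emotion_types):
    """Average each emotion score over the topic texts (assumed rule)."""
    selected = topic_texts(texts, topic)
    if not selected:
        return {e: 0.0 for e in emotion_types}
    n = len(selected)
    return {e: sum(t["scores"][e] for t in selected) / n for e in emotion_types}


def topic_alarms(result, thresholds):
    """Return the emotion types whose topic score exceeds its threshold."""
    return [e for e, limit in thresholds.items() if result.get(e, 0.0) > limit]
```

A non-empty return value from `topic_alarms` corresponds to the condition under which alarm information would be sent out.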
Based on the topic emotion analysis results of each preset topic, market emotion analysis results are obtained. And if the market emotion analysis result exceeds a preset market emotion threshold value, sending out alarm information.
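How the topic results are combined into a market result is not specified. One plausible reading is a weighted average across preset topics, sketched below; the weighting scheme, the per-emotion threshold check and the function names are assumptions for illustration.

```python
def market_emotion(topic_results, topic_weights=None):
    """Combine per-topic emotion scores into one market-level result.

    topic_results maps topic -> {emotion: score}. Weights default to 1.0
    per topic, which reduces to a simple average (assumed rule).
    """
    topics = list(topic_results)
    if not topics:
        return {}
    weights = topic_weights or {t: 1.0 for t in topics}
    total_weight = sum(weights[t] for t in topics)
    if total_weight == 0:
        raise ValueError("topic weights must not sum to zero")
    emotions = list(topic_results[topics[0]])
    return {
        e: sum(weights[t] * topic_results[t][e] for t in topics) / total_weight
        for e in emotions
    }


def market_alarm(result, emotion, threshold):
    """True when the market-level score for `emotion` exceeds the threshold."""
    return result.get(emotion, 0.0) > threshold
```

Weighting lets a heavily watched topic move the market-level score more than a minor one, while the default keeps all topics equal.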
According to the method provided by the embodiment of the invention, the text emotion analysis result of each text is obtained, and from it the topic emotion analysis result of each preset topic and the market emotion analysis result are obtained. Automatic market emotion monitoring based on artificial intelligence is thereby realized, emotion monitoring efficiency is effectively improved, objective and accurate emotion analysis results can be obtained with high real-time performance, and investors can be helped to reduce misjudgment and improve investment accuracy.
Based on any of the above embodiments, fig. 2 is a schematic structural diagram of a market emotion monitoring system according to an embodiment of the present invention, and as shown in fig. 2, the system includes a text obtaining unit 210, a topic analyzing unit 220, an emotion analyzing unit 230, a topic screening unit 240, a topic monitoring unit 250, and a market monitoring unit 260;
the text acquisition unit 210 is configured to acquire a text set including a plurality of texts;
the topic analysis unit 220 is configured to input any text to a text topic analysis model, and obtain a topic label of the any text output by the text topic analysis model; the text topic analysis model is obtained based on sample text and sample topic label training;
the emotion analysis unit 230 is configured to input any one of the texts into a text emotion analysis model, and obtain a text emotion analysis result of that text output by the text emotion analysis model; the text emotion analysis result comprises a score corresponding to each preset emotion type; the text emotion analysis model is trained based on the sample text and the sample text emotion analysis result; the minimum confusion degree matching principle is adopted to mine the true semantic emotion of the text, so that both the emotion of a single text and the total emotion value of all texts under a given label can be calculated;
the topic screening unit 240 is configured to select, from the text set, the texts whose topic labels include a preset topic as topic texts;
the topic monitoring unit 250 is configured to obtain a topic emotion analysis result of the preset topic based on a text emotion analysis result of each topic text;
the market monitoring unit 260 is configured to obtain a market emotion analysis result based on the topic emotion analysis result of each of the preset topics.
According to the system provided by the embodiment of the invention, the text emotion analysis result of each text is obtained, and from it the topic emotion analysis result of each preset topic and the market emotion analysis result are obtained. Automatic market emotion monitoring based on artificial intelligence is thereby realized, emotion monitoring efficiency is effectively improved, objective and accurate emotion analysis results can be obtained with high real-time performance, and investors can be helped to reduce misjudgment and improve investment accuracy.
Based on any of the above embodiments, the text obtaining unit 210 is specifically configured to:
determining a data acquisition mode; the data acquisition mode is determined according to a preset acquisition object, acquisition frequency, acquisition content and object grade;
based on the data acquisition mode, acquiring the text from the acquisition object through a web crawler, and constructing the text set.
Based on any of the above embodiments, the system further comprises a preprocessing unit; the preprocessing unit is used for:
preprocessing the text set; the preprocessing includes data cleaning and data governance.
Based on any of the above embodiments, the system further comprises a text screening unit; the text screening unit is used for:
screening the text set for market relevance, and deleting the texts whose market relevance is lower than a preset relevance threshold.
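The screening step can be sketched as a filter that takes a relevance function and a threshold. The patent does not define how market relevance is measured, so `keyword_relevance` below (the share of tokens found in a market vocabulary) is a hypothetical stand-in, and both function names are assumptions.

```python
def keyword_relevance(text, market_terms):
    """Hypothetical relevance score: share of tokens in a market vocabulary."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(1 for tok in tokens if tok in market_terms) / len(tokens)


def screen_texts(texts, relevance_fn, threshold):
    """Keep texts at or above the threshold; lower-scoring texts are dropped."""
    return [t for t in texts if relevance_fn(t) >= threshold]
```

Passing the relevance function as a parameter keeps the filter unchanged if the keyword score is later replaced by a trained classifier.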
Based on any of the above embodiments, the emotion analysis unit 230 is specifically configured for:
inputting any text into an emotion labeling sub-model in the text emotion analysis model, and obtaining a labeling result output by the emotion labeling sub-model; wherein the emotion labeling sub-model is established based on an emotion dictionary;
and inputting the labeling result into a labeling analysis sub-model in the text emotion analysis model, and obtaining the text emotion analysis result output by the labeling analysis sub-model.
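The two-stage structure (a dictionary-based labeling sub-model followed by a labeling analysis sub-model) can be illustrated as two functions chained together. This is only a structural sketch: the whitespace tokenization, the tag format and the normalization to shares are assumptions, and neither function reflects the patent's actual sub-models.

```python
def annotate(text, emotion_dict):
    """Stage 1: tag each dictionary hit as (token_index, word, emotion)."""
    tags = []
    for i, token in enumerate(text.lower().split()):
        word = token.strip(".,;:!?")
        if word in emotion_dict:
            tags.append((i, word, emotion_dict[word]))
    return tags


def analyze(tags, emotion_types):
    """Stage 2: turn tags into one score per emotion type (share of hits)."""
    scores = {e: 0.0 for e in emotion_types}
    for _, _, emotion in tags:
        scores[emotion] += 1.0
    total = sum(scores.values())
    if total:
        scores = {e: v / total for e, v in scores.items()}
    return scores
```

Separating the stages means the dictionary can be maintained and expanded without touching the code that converts labels into scores.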
Based on any of the above embodiments, the system further comprises an alarm unit; the alarm unit is specifically used for:
if the topic emotion analysis result of any preset topic exceeds the emotion threshold corresponding to that preset topic, sending out alarm information;
or, after the market emotion analysis result is obtained based on the topic emotion analysis result of each preset topic, if the market emotion analysis result exceeds a preset market emotion threshold, sending out alarm information.
Based on any of the above embodiments, in the system, the preset emotion types include worry, fear, disappointment, panic, despair, hope, optimism, mind, excitement and excitement.
Fig. 4 is a schematic physical structure diagram of an electronic device according to an embodiment of the present invention, as shown in fig. 4, where the electronic device may include: processor 301, communication interface (Communications Interface) 302, memory (memory) 303 and communication bus 304, wherein processor 301, communication interface 302, memory 303 accomplish the communication between each other through communication bus 304. The processor 301 may invoke a computer program stored in the memory 303 and executable on the processor 301 to perform the market emotion monitoring method provided by the above embodiments, for example, including: acquiring a text set containing a plurality of texts; inputting any text into a text topic analysis model, and acquiring a topic label of any text output by the text topic analysis model; the text topic analysis model is obtained based on sample text and sample topic label training; inputting any text into a text emotion analysis model, and obtaining a text emotion analysis result of any text output by the text emotion analysis model; the text emotion analysis result comprises a score corresponding to each preset emotion type; the text emotion analysis model is trained based on the sample text and the sample text emotion analysis result; selecting the text containing a preset theme from the text set as a theme text; based on the text emotion analysis result of each topic text, obtaining a topic emotion analysis result of the preset topic; and obtaining a market emotion analysis result based on the topic emotion analysis result of each preset topic.
Further, the logic instructions in the memory 303 may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the embodiments of the present invention, in essence, or the part contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the method described in the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or other various media capable of storing program code.
Embodiments of the present invention also provide a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, is implemented to perform the market emotion monitoring method provided by the above embodiments, for example, including: acquiring a text set containing a plurality of texts; inputting any text into a text topic analysis model, and acquiring a topic label of any text output by the text topic analysis model; the text topic analysis model is obtained based on sample text and sample topic label training; inputting any text into a text emotion analysis model, and obtaining a text emotion analysis result of any text output by the text emotion analysis model; the text emotion analysis result comprises a score corresponding to each preset emotion type; the text emotion analysis model is trained based on the sample text and the sample text emotion analysis result; selecting the text containing a preset theme from the text set as a theme text; based on the text emotion analysis result of each topic text, obtaining a topic emotion analysis result of the preset topic; and obtaining a market emotion analysis result based on the topic emotion analysis result of each preset topic.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the present invention without undue burden.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (7)

1. A method of market emotion monitoring, comprising:
acquiring a text set containing a plurality of texts;
inputting any text into a text topic analysis model, and acquiring a topic label of any text output by the text topic analysis model; the text topic analysis model is obtained based on sample text and sample topic label training;
inputting any text into a text emotion analysis model, and obtaining a text emotion analysis result of any text output by the text emotion analysis model; the text emotion analysis result comprises a score corresponding to each preset emotion type; the text emotion analysis model is trained based on the sample text and the sample text emotion analysis result;
selecting the text containing a preset theme from the text set as a theme text;
based on the text emotion analysis result of each topic text, obtaining a topic emotion analysis result of the preset topic;
based on the topic emotion analysis results of each preset topic, obtaining market emotion analysis results;
the method for acquiring the text set containing a plurality of texts specifically comprises the following steps:
determining a data acquisition mode; the data acquisition method is determined according to a preset acquisition object, acquisition frequency, acquisition content and object grade;
based on the data acquisition method, acquiring the text from the acquisition object through a web crawler to construct the text set;
inputting any text into a text emotion analysis model, and acquiring a text emotion analysis result of any text output by the text emotion analysis model, wherein the text emotion analysis result specifically comprises:
inputting any text into an emotion labeling sub-model in the text emotion analysis model, and obtaining a labeling result output by the emotion labeling sub-model; the emotion labeling sub-model is established based on an emotion dictionary, and the emotion dictionary is built by combining automatic construction technology with manual review, so that the dictionary conforms to the characteristics of the financial field; meanwhile, a new word discovery technology is adopted to maintain and expand the dictionary;
inputting the labeling result into a labeling analysis sub-model in the text emotion analysis model, and obtaining the text emotion analysis result output by the labeling analysis sub-model;
in the obtaining of the topic emotion analysis result of the preset topic based on the text emotion analysis result of each topic text, the minimum confusion degree matching principle is adopted to mine the true semantic emotion of the text, so that both the emotion of a single text and the total emotion value of all texts under a given label can be calculated; and thereafter further comprising:
if the topic emotion analysis result of any preset topic exceeds the emotion threshold value corresponding to any preset topic, sending out alarm information;
or, the obtaining a market emotion analysis result based on the topic emotion analysis result of each preset topic further includes:
and if the market emotion analysis result exceeds a preset market emotion threshold value, sending out alarm information.
2. The market emotion monitoring method of claim 1, wherein before the inputting of any text into the text topic analysis model and the obtaining of the topic label of that text output by the text topic analysis model, the method further comprises:
preprocessing the text set; the preprocessing includes data cleaning and data governance.
3. The market emotion monitoring method of claim 1, wherein before the inputting of any text into the text topic analysis model and the obtaining of the topic label of that text output by the text topic analysis model, the method further comprises:
screening the text set for market relevance, and deleting the texts whose market relevance is lower than a preset relevance threshold.
4. A market emotion monitoring method according to any of claims 1 to 3, characterized in that the preset emotion types include worry, fear, disappointment, panic, despair, hope, optimism, mind, excitement and excitement.
5. A market emotion monitoring system, comprising:
a text acquisition unit configured to acquire a text set including a plurality of texts;
the topic analysis unit is used for inputting any text into the text topic analysis model and acquiring a topic label of any text output by the text topic analysis model; the text topic analysis model is obtained based on sample text and sample topic label training;
the emotion analysis unit is used for inputting any text into the text emotion analysis model and obtaining a text emotion analysis result of any text output by the text emotion analysis model; the text emotion analysis result comprises a score corresponding to each preset emotion type; the text emotion analysis model is trained based on the sample text and the sample text emotion analysis result; the minimum confusion degree matching principle is adopted to mine the true semantic emotion of the text, so that both the emotion of a single text and the total emotion value of all texts under a given label can be calculated;
the topic screening unit is used for selecting, from the text set, the texts whose topic labels contain a preset topic as topic texts;
the topic monitoring unit is used for acquiring topic emotion analysis results of the preset topics based on the text emotion analysis results of each topic text;
the market monitoring unit is used for acquiring a market emotion analysis result based on the topic emotion analysis result of each preset topic;
the method for acquiring the text set containing a plurality of texts specifically comprises the following steps:
determining a data acquisition mode; the data acquisition method is determined according to a preset acquisition object, acquisition frequency, acquisition content and object grade;
based on the data acquisition method, acquiring the text from the acquisition object through a web crawler to construct the text set;
inputting any text into a text emotion analysis model, and acquiring a text emotion analysis result of any text output by the text emotion analysis model, wherein the text emotion analysis result specifically comprises:
inputting any text into an emotion labeling sub-model in the text emotion analysis model, and obtaining a labeling result output by the emotion labeling sub-model; the emotion labeling sub-model is established based on an emotion dictionary, and the emotion dictionary is built by combining automatic construction technology with manual review, so that the dictionary conforms to the characteristics of the financial field; meanwhile, a new word discovery technology is adopted to maintain and expand the dictionary;
inputting the labeling result into a labeling analysis sub-model in the text emotion analysis model, and obtaining the text emotion analysis result output by the labeling analysis sub-model;
in the obtaining of the topic emotion analysis result of the preset topic based on the text emotion analysis result of each topic text, the minimum confusion degree matching principle is adopted to mine the true semantic emotion of the text, so that both the emotion of a single text and the total emotion value of all texts under a given label can be calculated; and thereafter further comprising:
if the topic emotion analysis result of any preset topic exceeds the emotion threshold value corresponding to any preset topic, sending out alarm information;
or, the obtaining a market emotion analysis result based on the topic emotion analysis result of each preset topic further includes:
and if the market emotion analysis result exceeds a preset market emotion threshold value, sending out alarm information.
6. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the steps of the market emotion monitoring method of any of claims 1 to 4 when the program is executed by the processor.
7. A non-transitory computer readable storage medium having stored thereon a computer program, which when executed by a processor implements the steps of the market emotion monitoring method of any of claims 1 to 4.
CN202011499398.XA 2020-12-17 2020-12-17 Market emotion monitoring method and system Active CN112559731B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011499398.XA CN112559731B (en) 2020-12-17 2020-12-17 Market emotion monitoring method and system
PCT/CN2020/139953 WO2022126718A1 (en) 2020-12-17 2020-12-28 Method and system for monitoring market emotion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011499398.XA CN112559731B (en) 2020-12-17 2020-12-17 Market emotion monitoring method and system

Publications (2)

Publication Number Publication Date
CN112559731A CN112559731A (en) 2021-03-26
CN112559731B true CN112559731B (en) 2024-01-02

Family

ID=75063318

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011499398.XA Active CN112559731B (en) 2020-12-17 2020-12-17 Market emotion monitoring method and system

Country Status (2)

Country Link
CN (1) CN112559731B (en)
WO (1) WO2022126718A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109299271A (en) * 2018-10-30 2019-02-01 腾讯科技(深圳)有限公司 Training sample generation, text data, public sentiment event category method and relevant device
CN110189170A (en) * 2019-05-27 2019-08-30 中译语通科技股份有限公司 Market sentiment analysis method and system
CN110400173A (en) * 2019-07-23 2019-11-01 中译语通科技股份有限公司 Market sentiment monitoring system method for building up and system
KR20200022144A (en) * 2018-08-22 2020-03-03 울산과학기술원 System and Method for Analyzing Housing Market using Development of Emotion Dictionary

Also Published As

Publication number Publication date
WO2022126718A1 (en) 2022-06-23
CN112559731A (en) 2021-03-26

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant