CN110543547A - automobile public praise semantic emotion analysis system - Google Patents

Info

Publication number
CN110543547A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910745662.4A
Other languages
Chinese (zh)
Other versions
CN110543547B (en)
Inventor
陈延伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Ding Ding Technology Co Ltd
Original Assignee
Guangdong Ding Ding Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Ding Ding Technology Co Ltd filed Critical Guangdong Ding Ding Technology Co Ltd
Priority to CN201910745662.4A priority Critical patent/CN110543547B/en
Publication of CN110543547A publication Critical patent/CN110543547A/en
Application granted granted Critical
Publication of CN110543547B publication Critical patent/CN110543547B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/3332 Query translation
    • G06F16/3335 Syntactic pre-processing, e.g. stopword elimination, stemming
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses an automobile public praise semantic emotion analysis system comprising a real-time monitoring module, a semantic mining module and a system display module. The real-time monitoring module acquires automobile user comment data from across the whole network in real time using crawler technology. The semantic mining module mines automobile-related topic keywords from the comment data and analyzes the emotional tendency of the topic keywords to obtain emotion analysis data. The system display module performs statistical analysis of the overall vehicle model, key attributes and product competitiveness details according to the emotion analysis data and presents the results as visual charts. By analyzing users' comment corpora with artificial intelligence algorithms, the invention addresses the technical problem that the prior art has no system dedicated to analyzing users' evaluations of automobiles, so that those evaluations cannot be understood quickly. Understanding them quickly gives a favorable position in competition, and the system can be widely applied in the automobile industry.

Description

Automobile public praise semantic emotion analysis system
Technical Field
The invention relates to the field of big data analysis, in particular to a method and a system for analyzing automobile comment emotion by semantic mining.
Background
After years of development, the automobile market has gradually become mature and stable. At the same time, under the influence of the economic environment in recent years, competition among automobile manufacturers and dealers has intensified. Consumers also have increasingly individual needs, so their buying habits and preferences deserve closer attention, and whether an advantage is gained in competition depends on consumer preference. Against this background, listening widely and objectively to the voice of the consumer is important. The purpose of this project is to establish a method and system for extracting content of guiding significance from text, such as the large number of consumer comments on the Internet.
In the prior art, no system is dedicated to analyzing users' evaluations of automobiles, so those evaluations cannot be understood quickly. The invention therefore analyzes users' comment corpora with artificial intelligence algorithms to realize semantic emotion analysis of automobile public praise.
Disclosure of Invention
The invention provides an automobile public praise semantic emotion analysis system. It aims to solve the technical problem that the prior art has no system dedicated to analyzing users' evaluations of automobiles, so that those evaluations cannot be understood quickly. The system analyzes users' comment corpora with artificial intelligence algorithms so that users' evaluations of automobiles can be understood quickly, which gives a favorable position in competition.
Consumers comment on general qualities of an automobile, such as safety, sportiness and a simple, stately look, or make more specific remarks such as "dynamic lines" or "beautiful headlights". Such comments reflect the impression that the whole automobile, or a certain part of it, makes on people, and together they form the image characteristics of the automobile. This helps brand building: brand positioning analysis can be carried out on this basis to judge whether the perceived image is consistent with the brand positioning. In addition, the emotional tendency can be determined.
In order to solve the above technical problem, an embodiment of the present invention provides an automobile public praise semantic emotion analysis system, including: a real-time monitoring module, a semantic mining module and a system display module;
The real-time monitoring module is used for acquiring automobile user comment data in real time from across the whole network using crawler technology;
The semantic mining module is used for mining automobile-related topic keywords from the comment data and analyzing the emotional tendency of the topic keywords to obtain emotion analysis data;
And the system display module performs statistical analysis of the overall vehicle model, key attributes and product competitiveness details according to the emotion analysis data to form visual charts and assist management decisions.
As a preferred scheme, the semantic mining module comprises a word bank unit, a semantic topic mining unit and an emotional tendency analysis unit;
The word bank unit is used for storing the comment data and performing word segmentation on the comment data;
The semantic topic mining unit is used for identifying the topics of the comment data after word segmentation, wherein the topics comprise primary topics and secondary topics;
The emotional tendency analysis unit is used for effectively recognizing positive (commendatory) and negative (derogatory) emotion toward each topic.
Preferably, the primary topics comprise appearance, interior, power, control and fuel consumption; the secondary topics comprise the car head, the paint surface and the seats.
As a preferred scheme, the word bank unit comprises an automobile word bank, an adjective word bank, a primary topic keyword mapping bank, a secondary topic keyword mapping bank and a word segmentation sub-module;
The automobile word bank is used for storing the comment data; the word segmentation sub-module is used for segmenting the comment data in the automobile word bank with a word segmentation technique to obtain the related adjectives and storing them in the adjective word bank; the primary topic keyword mapping bank is used for mapping the data in the adjective word bank to primary topic keywords; and the secondary topic keyword mapping bank is used for mapping the data in the adjective word bank to secondary topic keywords.
Preferably, the identifying of the primary topic includes:
Combining the word segmentation result and the primary topic keyword mapping library, extracting keywords which can point to the primary topic in the sentence, and determining a context for the keywords;
For the input keywords and their contexts, judging the primary topic pointed to by each keyword according to the calculation result of the language model;
And judging the primary topic of the sentence according to all the keywords in the sentence.
Preferably, the identifying of the secondary topic includes:
Setting a decision-tree-style secondary topic identification rule according to the secondary topic keyword mapping bank, and pointing single-topic keywords directly to their secondary topic;
For multi-topic keywords, searching the context window of the keyword for other secondary topic keywords, increasing the probability that the keyword points to the corresponding secondary topic, and thereby determining the secondary topic.
Preferably, the effective recognition of positive and negative emotion toward the topic by the emotional tendency analysis unit includes:
Establishing separate regression models for different topics;
Training and determining the weight of each adjective by gradient descent based on the adjective word bank to obtain an optimized regression model;
And substituting new comment sentences into the optimized regression model and judging from the function value whether the test sample is positive or negative.
As a preferred scheme, the statistical analysis of the vehicle type entirety, key attributes and product competitiveness details includes:
Displaying objective analysis indexes aiming at different vehicle types;
Analyzing the advantages and disadvantages of the public praise of the consumers aiming at each vehicle type, and distinguishing an advantage public praise point, a disadvantage public praise point, a dispute public praise point and a secondary disadvantage public praise point by combining the evaluation breadth;
Deeply mining the word-of-mouth details under each topic, as well as the evaluation results for particular configurations.
Preferably, the objective analysis index includes a good evaluation rate, a bad evaluation rate and an emotion index.
Preferably, the word-of-mouth details comprise the good rating and poor rating of the overall style, front face, side body and tail under the appearance theme.
Compared with the prior art, the embodiment of the invention has the following beneficial effects:
The invention analyzes users' comment corpora with artificial intelligence algorithms and solves the technical problem that the prior art has no system dedicated to analyzing users' evaluations of automobiles, so that those evaluations cannot be understood quickly. Users' evaluations of automobiles can therefore be understood quickly, which gives a favorable position in competition.
Drawings
FIG. 1: a system structure diagram in the embodiment of the invention;
FIG. 2: a flow chart of the overall system framework in the embodiment of the invention;
FIG. 3: a logic sequence diagram of the secondary topic operation rule in the embodiment of the invention;
FIG. 4: a linear regression schematic diagram of the emotion analysis model in the embodiment of the invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1 and 2, a preferred embodiment of the present invention provides a semantic emotion analyzing system for a public praise of an automobile, including: the system comprises a real-time monitoring module, a semantic mining module and a system display module;
The real-time monitoring module acquires automobile user comment data from across the whole network in real time using crawler technology, including content such as owner public praise, forum comments and social media, and provides the basic data for semantic mining analysis.
The semantic mining module mines automobile-related topic keywords from the comment data and analyzes the emotional tendency of the topic keywords to obtain emotion analysis data. It consists of a primary topic recognition module, a secondary topic recognition module and an emotion analysis module. The primary and secondary topics used to describe an automobile are defined, and automobile-related word banks are established. On the basis of more than 200 million pieces of word-of-mouth data on the Internet, models are trained with algorithms such as logistic regression, a Markov chain language model and decision trees. The semantic topic mining sub-model can identify 9 primary topics and 73 secondary topics with an identification accuracy of 81%; the emotional tendency analysis sub-model can effectively identify positive and negative emotion under different topics with an identification accuracy above 83%.
Corpus labeling defines 9 primary topics describing an automobile and 73 secondary topics under them. Each topic is given a strict definition, and manual judgment methods are formulated for the topics to form a standard. The captured comment corpus is then annotated according to this standard: each sentence is manually given primary and secondary topic labels, and the emotional tendency toward each topic is judged, to serve as training samples.
In this embodiment, the semantic mining module includes a word bank unit, a semantic topic mining unit and an emotional tendency analysis unit; the word bank unit is used for storing the comment data and performing word segmentation on the comment data; the semantic topic mining unit is used for identifying the topics of the comment data after word segmentation, wherein the topics comprise primary topics and secondary topics; the emotional tendency analysis unit is used for effectively recognizing positive (commendatory) and negative (derogatory) emotion toward each topic.
The word bank unit analyzes a large comment corpus, draws closely on automobile industry experience, refers to the TF-IDF values of the corpus, and establishes key databases such as an automobile feature noun word bank, an adjective word bank, a primary topic keyword mapping bank and a secondary topic keyword mapping bank.
In this embodiment, the word bank unit includes an automobile word bank, an adjective word bank, a primary topic keyword mapping bank, a secondary topic keyword mapping bank and a word segmentation sub-module; the automobile word bank is used for storing the comment data; the word segmentation sub-module is used for segmenting the comment data in the automobile word bank with a word segmentation technique to obtain the related adjectives and storing them in the adjective word bank; the primary topic keyword mapping bank is used for mapping the data in the adjective word bank to primary topic keywords; and the secondary topic keyword mapping bank is used for mapping the data in the adjective word bank to secondary topic keywords.
In this embodiment, the primary theme includes appearance, trim, power, handling, and fuel consumption; the secondary theme comprises a vehicle head, a paint surface and a seat.
In this embodiment, the identifying the primary topic includes: combining the word segmentation result and the primary topic keyword mapping library, extracting keywords which can point to the primary topic in the sentence, and determining a context for the keywords; for the input keywords and the contexts thereof, judging the primary theme pointed by the keywords according to the calculation result of the language model; and judging the primary topic of the sentence according to all the keywords in the sentence.
The primary topics describe the main aspects of an automobile that people care about. There are nine: appearance, interior trim, space, power, control, comfort, fuel consumption, cost performance and configuration. The primary topic identification method is a Bayesian method based on keyword position coefficients and adopts a generative model. The main steps are:
(1) Dividing sentences into words and removing stop words;
(2) Finding keywords pointing to a primary topic in a sentence;
(3) Judging the primary topic pointed to by each keyword according to the context;
(4) Combining all keywords in the sentence to judge the primary topic of the sentence.
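The four steps above can be sketched as a small pipeline. This is a minimal illustration only: the lexicon `KEYWORD_TOPICS`, the stop-word set, the `CONTEXT_HINTS` table and the whitespace tokenizer are all hypothetical stand-ins (the source segments Chinese text with jieba and resolves ambiguous keywords with a trained language model, neither of which is reproduced here).

```python
# Hypothetical mini-lexicon: keyword -> primary topics it can point to.
KEYWORD_TOPICS = {
    "headlight": ["appearance"],
    "seat": ["interior", "comfort"],
    "engine": ["power"],
}
STOP_WORDS = {"the", "is", "and", "a", "very"}
# Placeholder for the language-model score: context words that hint at a topic.
CONTEXT_HINTS = {"comfort": {"soft", "comfortable"}, "interior": {"leather", "stitching"}}


def tokenize(sentence):
    """Step 1: split into words and drop stop words (whitespace split stands in for jieba)."""
    return [w for w in sentence.lower().split() if w not in STOP_WORDS]


def find_keywords(tokens):
    """Step 2: positions of tokens that appear in the keyword lexicon."""
    return [(i, w) for i, w in enumerate(tokens) if w in KEYWORD_TOPICS]


def topic_for_keyword(tokens, index, word, window=3):
    """Step 3: choose a topic for one keyword, using nearby words when it is ambiguous."""
    candidates = KEYWORD_TOPICS[word]
    if len(candidates) == 1:
        return candidates[0]
    context = tokens[max(0, index - window):index] + tokens[index + 1:index + 1 + window]
    return max(candidates, key=lambda t: sum(w in CONTEXT_HINTS.get(t, set()) for w in context))


def sentence_topics(sentence):
    """Step 4: combine the per-keyword decisions into sentence-level topics."""
    tokens = tokenize(sentence)
    return sorted({topic_for_keyword(tokens, i, w) for i, w in find_keywords(tokens)})
```

For example, `sentence_topics("The seat is very soft and the engine is strong")` resolves the ambiguous keyword "seat" through the nearby word "soft" and returns both a comfort and a power topic.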
Four word banks related to the primary topics are established: word_1ec, word_ecs, word_adj and word_super. Keywords that point to only one primary topic are stored in word_1ec; keywords that point to more than one primary topic are stored in word_ecs; keywords that are not nouns, such as adjectives with a certain directivity, are stored in word_adj; and, to avoid removing them by mistake, words with very strong directivity are exempt from filtering and stored in word_super.
Keywords pointing to multiple primary topics need to compute their corresponding primary topics through a probabilistic model.
1. Using the Unigram model
The Unigram model is the first-order case of the N-gram model. It assumes that the words in a text follow a multinomial distribution, that the occurrence probability of each word in a context is independent, and that sentences are independent of one another. Whether the context belongs to the primary topic ecp or ecq is determined from the context formed by the individual words near the keyword.
2. Introducing position coefficients for the context words to improve the Markov chain language model:
p(Wc, ecp) = p(ecp) p(W|ecp) p(W|ecp) p(W|ecp)
3. Comparing and judging the primary topic, which must satisfy:
p(Wc, ecp) > p(Wc, ecq)
where ecp is the correct category and ecq is the wrong category.
4. Training process
A loss function is defined from the difference between the correct-class probability and the wrong-class probability, and parameter optimization minimizes the loss function to obtain α, β and γ, which obey the following constraints:
0≤α≤1
0≤β≤1
0≤γ≤1
The resulting α, β and γ are substituted back into p(Wc, ecp) for topic determination.
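The comparison of joint probabilities for two candidate topics can be illustrated with a smoothed unigram model. This sketch makes several assumptions that are not stated in the source: the position coefficients are applied as exponents (multipliers in log space), `alpha` weights the words before the keyword and `beta` the words after it, add-one smoothing is used, and γ is left out because its role cannot be determined from the text. All function names are hypothetical.

```python
import math
from collections import Counter


def train_unigram(samples):
    """samples: list of (topic, tokens). Returns topic priors, per-topic word counts, vocabulary."""
    topic_counts = Counter(topic for topic, _ in samples)
    word_counts = {topic: Counter() for topic in topic_counts}
    for topic, tokens in samples:
        word_counts[topic].update(tokens)
    vocab = {w for counts in word_counts.values() for w in counts}
    total = sum(topic_counts.values())
    priors = {topic: n / total for topic, n in topic_counts.items()}
    return priors, word_counts, vocab


def log_joint(before, after, topic, model, alpha=1.0, beta=1.0):
    """log p(ec) + alpha * sum(log p(w|ec)) over words before the keyword
    + beta * sum(log p(w|ec)) over words after it (assumed weighting)."""
    priors, word_counts, vocab = model
    counts = word_counts[topic]
    denom = sum(counts.values()) + len(vocab)  # add-one smoothing

    def log_p(word):
        return math.log((counts[word] + 1) / denom)

    return (math.log(priors[topic])
            + alpha * sum(log_p(w) for w in before)
            + beta * sum(log_p(w) for w in after))


def pick_topic(before, after, candidates, model, alpha=1.0, beta=1.0):
    """Return the candidate topic with the largest joint score."""
    return max(candidates, key=lambda t: log_joint(before, after, t, model, alpha, beta))
```

Working in log space avoids numerical underflow when many small probabilities are multiplied; the ordering of the candidates is the same as for the raw probabilities.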
5. Implementation of primary topic identification
The implementation consists of two parts: the first, a training program, determines the parameter values of the function; the second uses those parameters to evaluate the function for an input sentence and output its primary topic.
The training program mainly uses the word segmentation module jieba, the mathematical computation modules numpy and scipy, the general-purpose natural language processing module nltk, and Python's built-in pickle and json modules.
1) Read the initial comment data file, generate sample files classified by topic, and then divide them into a training set and a test set at a ratio of 7:3.
2) Segment the sample data into words and store the result.
3) Count the occurrence rate of every word in the segmented sample data.
4) Based on the word frequency data of the samples and the established primary topic keyword library, optimize the parameters for all keywords so that the loss function is minimized, output the [α vector, γ value, β vector] of each keyword, and store the training results.
In the application, the primary theme of a sentence is judged, and the primary themes corresponding to all keywords in the sentence are integrated to be used as the primary theme of the whole sentence.
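The data preparation steps of the training program (the 7:3 split and the word occurrence rates) might look like the following sketch. The function names, the fixed random seed and the use of pre-tokenized input are assumptions made for illustration; the source performs segmentation with jieba and stores its results with pickle and json.

```python
import random
from collections import Counter


def split_samples(samples, train_ratio=0.7, seed=0):
    """Shuffle the samples reproducibly and split them into (training set, test set)."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_ratio)
    return shuffled[:cut], shuffled[cut:]


def word_rates(tokenized_sentences):
    """Occurrence rate of every word: its count divided by the total number of tokens."""
    counts = Counter(w for sentence in tokenized_sentences for w in sentence)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}
```

Fixing the seed keeps the split stable between runs, which makes results from different training runs comparable.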
The secondary topics are a finer-grained division than the primary topics, and each secondary topic depends on a primary topic. For example, appearance is further refined into categories such as overall appearance, car head, car body, car tail, body size and paint finish. Under the 9 primary topics, a total of 73 secondary topics are defined.
The secondary topic is judged mainly by designed logic rules similar to a decision tree, relying mainly on the mapping from keywords to secondary topics. As with the primary topics, secondary topic keywords are divided into single-topic keywords and multi-topic keywords.
The operation rule logic for judging the secondary topic is shown in FIG. 3.
The secondary topic keywords are obtained with reference to TF-IDF values. TF-IDF is a basic method for evaluating how important a word is to a certain topic. The TF-IDF value is calculated as:
TF-IDF=TF*IDF
After the TF-IDF values of all words are calculated, all words with a value greater than 0.04 are extracted and stored, followed by a round of manual verification; repeating the verification several times according to the test results gives better results.
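As an illustration of the keyword extraction just described, the sketch below scores words per topic and keeps those above the 0.04 threshold. The source gives only TF-IDF = TF * IDF; the specific definitions used here (TF as a word's share of a topic's tokens, IDF as the natural log of the number of topics over the number of topics containing the word) are common choices assumed for the example, and the function names are hypothetical.

```python
import math
from collections import Counter


def tf_idf_by_topic(topic_docs):
    """topic_docs: {topic: list of tokens}. Returns {topic: {word: tf * idf}}."""
    n_topics = len(topic_docs)
    doc_freq = Counter()
    for tokens in topic_docs.values():
        doc_freq.update(set(tokens))  # count each word once per topic
    scores = {}
    for topic, tokens in topic_docs.items():
        counts = Counter(tokens)
        total = len(tokens)
        scores[topic] = {
            w: (c / total) * math.log(n_topics / doc_freq[w])
            for w, c in counts.items()
        }
    return scores


def candidate_keywords(scores, threshold=0.04):
    """Words whose TF-IDF value exceeds the threshold, per topic, for manual verification."""
    return {t: sorted(w for w, s in ws.items() if s > threshold) for t, ws in scores.items()}
```

A word that occurs under every topic receives an IDF of zero and is dropped, which is the behaviour wanted for generic words that do not distinguish one secondary topic from another.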
The secondary topic lexicon is generated from the extracted secondary topic keywords. Establishing a word bank for a secondary theme, and dividing the word bank into a single theme word bank and a multi-theme word bank:
sec_words (single-topic words, in the form {word: {primary topic number: secondary topic}});
sec_words_2 (multi-topic words, in the form {word: {primary topic number: [secondary topic 1, secondary topic 2, …]}}).
For each keyword, a context range is defined: the context is the 8 words before and after the keyword, and the range must not extend past the preceding or following punctuation mark. The keywords in the sentence are checked one by one against the secondary topic keyword library; a keyword that does not belong to it is discarded and the next keyword is checked directly.
In this embodiment, the identifying of the secondary topic includes: setting a decision-tree-style secondary topic identification rule according to the secondary topic keyword mapping bank, and pointing single-topic keywords directly to their secondary topic; for multi-topic keywords, searching the context window of the keyword for other secondary topic keywords, increasing the probability that the keyword points to the corresponding secondary topic, and thereby determining the secondary topic.
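The rule for secondary topics can be sketched as follows. The dictionary contents are invented examples in the shapes given for sec_words and sec_words_2; counting votes from single-topic keywords in the window is one simple reading of "increasing the probability", and returning no topic when the window holds no evidence is an assumption of this sketch rather than something the source specifies.

```python
PUNCTUATION = {",", ".", "!", "?", ";"}

# Hypothetical entries in the shapes of sec_words and sec_words_2.
SEC_WORDS = {"grille": {1: "car head"}, "taillight": {1: "car tail"}}
SEC_WORDS_2 = {"lines": {1: ["car head", "car body"]}}


def context_window(tokens, index, size=8):
    """Up to `size` tokens on each side of tokens[index], never crossing punctuation."""
    left = []
    for tok in reversed(tokens[max(0, index - size):index]):
        if tok in PUNCTUATION:
            break
        left.append(tok)
    right = []
    for tok in tokens[index + 1:index + 1 + size]:
        if tok in PUNCTUATION:
            break
        right.append(tok)
    return left[::-1] + right


def secondary_topic(tokens, index, primary):
    """Secondary topic for the keyword at tokens[index] under a given primary topic number."""
    word = tokens[index]
    if word in SEC_WORDS:            # single-topic keyword: direct mapping
        return SEC_WORDS[word].get(primary)
    if word not in SEC_WORDS_2:      # not a secondary topic keyword: discard
        return None
    candidates = SEC_WORDS_2[word].get(primary, [])
    votes = {c: 0 for c in candidates}
    for tok in context_window(tokens, index):
        hinted = SEC_WORDS.get(tok, {}).get(primary)
        if hinted in votes:
            votes[hinted] += 1       # other keywords in the window raise a candidate's score
    if not candidates or max(votes.values()) == 0:
        return None                  # no supporting evidence in the window
    return max(candidates, key=lambda c: votes[c])
```

With the tokens `["the", "lines", "near", "the", "grille", "look", "sharp"]`, the ambiguous keyword "lines" is resolved to the car head because "grille" appears inside its window.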
Emotion analysis judges the emotional tendency of a comment sentence, with the aim of determining how satisfied the user is with each part of the automobile. In the embodiment, a method for judging the judgment of.
Because the same word may carry different emotional tendencies under different topics (for example, the adjective "high" expresses opposite emotions in "high oil consumption" and "high cost performance"), the emotion analysis model is trained separately for each topic.
The emotion analysis model takes an emotion word vector as input and outputs an emotional tendency such as positive, negative or neutral. It is essentially a multi-class classification problem, for which logistic regression is an effective solution. As shown in FIG. 4, logistic regression is a generalized linear regression model that relies on the sigmoid function for classification:
g(z) = 1 / (1 + e^(-z))
where z = ω0x0 + ω1x1 + ω2x2 + … + ωnxn.
Let the vector ω = (ω0, ω1, ω2, …, ωn) and x = (x0, x1, x2, …, xn).
Then z = ω · x.
Here ω holds the coefficients of the linear equation, corresponding to the weight of each emotion word, and x is the adjective existence vector of the sentence. Logistic regression trains ω and then uses it to compute the category of an input word vector.
In the vector x, the order of the elements matches the indices in the adjective table. If an adjective appears in the sentence, the corresponding x value is 1; if it does not appear, the value is 0; and if it appears in negated form, such as "not XX", the value is -1. For the sentence "atmospheric but not delicate", the vector is shown in Table 1:
Table 1: List of vector examples
The vector is then substituted into the sigmoid function to calculate the category.
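The encoding rule and the substitution into the sigmoid function can be shown with the example sentence. The three-word adjective table, the negation set and the rule that a negation word must directly precede the adjective are assumptions chosen to keep the sketch short; the source matches negations with regular expressions over segmented Chinese text.

```python
import math

ADJECTIVES = ["atmospheric", "delicate", "spacious"]  # hypothetical adjective table
NEGATIONS = {"not", "no"}                             # hypothetical negation words


def existence_vector(tokens, adjectives=ADJECTIVES):
    """1 if the adjective occurs, -1 if it occurs right after a negation word, 0 if absent."""
    vec = [0] * len(adjectives)
    for i, tok in enumerate(tokens):
        if tok in adjectives:
            negated = i > 0 and tokens[i - 1] in NEGATIONS
            vec[adjectives.index(tok)] = -1 if negated else 1
    return vec


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(weights, x):
    """Sigmoid of the dot product z = w . x."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))
```

For "atmospheric but not delicate" the vector is `[1, -1, 0]`; with all weights equal to 1 the two entries cancel, z is 0 and the score is exactly 0.5, which shows why the weights have to be learned per topic before the score becomes informative.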
1. Read the user comment data as samples and divide them into a test set and a training set at a ratio of 3:7;
2. Training procedure
Preparing the adjective table. Generating the adjective vector depends on all adjectives in the corpus; in this embodiment, all primary topics and secondary topics share one unified adjective table.
Generating the adjective vector. Adjectives that appear in variant forms, such as "good looking" and "true good looking", are unified into the single form "good looking" so that they can be fully extracted after word segmentation. For each adjective, a full set of its variant expressions is generated and compiled into a replacement word list.
Read the adjective table and combine all adjectives and their negated forms into a complete word list adj. For an input sentence: first, segment it with a word segmentation tool; next, use the replacement word list with regular expressions to convert every variant adjective in the sentence to its unified form; then match negation words with a regular expression; finally, take the unified adjectives and the matched negated forms together and check, in the order of the adjective table adj, whether each adjective occurs in the sentence. Absence gives 0, presence gives 1, and a negated expression in the sentence such as "no/less/no + adjective" gives -1. This produces the adjective existence vector; one such vector is generated for each sentence.
Topic-based training is then performed with the samples. First, the emotion word existence vector of each training sample sentence is obtained, and together these form an emotion word existence matrix. A weight vector for the emotion words is established, equal in length to the emotion word list, with all initial values set to 1, namely:
ω=(1,1,1,…,1)
The ω vector is updated by gradient descent with a step size of 0.001 for 500 iterations.
The final training results are stored in the form of an [ adjective table, weight table ], where the order of the words in the adjective table and weight table correspond.
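A training loop with the stated settings (weights initialised to 1, step size 0.001, 500 iterations) could be written as below. The source does not give the loss function or the update formula, so this sketch assumes plain batch gradient descent on the logistic log-loss with binary labels (1 for positive, 0 for negative) and leaves the neutral class aside; the function names are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_weights(X, y, step=0.001, iterations=500):
    """X: list of adjective existence vectors; y: labels (1 positive, 0 negative).
    Returns the weight vector after batch gradient descent on the log-loss."""
    w = [1.0] * len(X[0])  # all initial weights are 1, as in the text
    for _ in range(iterations):
        grad = [0.0] * len(w)
        for x, label in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x))) - label
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wi - step * gj for wi, gj in zip(w, grad)]
    return w
```

With such a small step size and only 500 iterations the weights move only modestly from their initial value of 1, so the adjective table and weight table stored afterwards mainly record the direction in which each adjective pushes the score.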
In this embodiment, the effective recognition of positive and negative emotion toward the topic by the emotional tendency analysis unit includes: establishing separate regression models for different topics; training and determining the weight of each adjective by gradient descent based on the adjective word bank to obtain an optimized regression model; and substituting new comment sentences into the optimized regression model and judging from the function value whether the test sample is positive or negative.
A logistic regression model is established for each topic, and the weight of each adjective is trained and determined by gradient descent based on the adjective word bank. A new comment sentence is then substituted into the sigmoid function of the logistic regression, and the function value determines whether the test sample is positive or negative.
Logistic regression model: g(ω · x) = 1 / (1 + e^(-ω · x))
The system display module performs statistical analysis of the overall vehicle model, key attributes and product competitiveness details according to the emotion analysis data to form visual charts and assist management decisions. This includes but is not limited to: displaying objective analysis indexes such as good evaluation rate, bad evaluation rate and emotion index for different vehicle models; analyzing the advantages and disadvantages of consumer public praise for each vehicle model and, taking evaluation breadth into account, distinguishing advantage public praise points, disadvantage public praise points, dispute public praise points and secondary disadvantage public praise points; deeply mining the word-of-mouth details under each primary topic, such as the good rating and poor rating of the overall style, front face, side body and tail under the appearance theme; and the evaluation results for particular configurations.
In this embodiment, the statistical analysis of the vehicle model entirety, key attributes and product competitiveness details includes: displaying objective analysis indexes aiming at different vehicle types; analyzing the advantages and disadvantages of the public praise of the consumers aiming at each vehicle type, and distinguishing an advantage public praise point, a disadvantage public praise point, a dispute public praise point and a secondary disadvantage public praise point by combining the evaluation breadth; word-of-mouth details under each topic are deeply mined, as well as the results of evaluations on a particular configuration.
In this embodiment, the objective analysis index includes a good rating, a bad rating, and an emotion index.
In this embodiment, the word-of-mouth details include overall style under the appearance theme, poor rating and good rating of the front face, the side body and the tail.
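The objective analysis indexes can be computed from the per-sentence emotion judgments. The source names the three indexes but does not define them, so the definitions here are assumptions: the good and bad evaluation rates are taken as shares of all judged sentences, and the emotion index is taken as their difference. The label strings and the function name are likewise hypothetical.

```python
from collections import Counter


def topic_indicators(labels):
    """labels: 'positive' / 'negative' / 'neutral' judgments for one vehicle model and topic."""
    total = len(labels)
    if total == 0:
        return {"good_rate": 0.0, "bad_rate": 0.0, "emotion_index": 0.0}
    counts = Counter(labels)
    good = counts["positive"] / total
    bad = counts["negative"] / total
    # Assumed definition: emotion index = good rate minus bad rate.
    return {"good_rate": good, "bad_rate": bad, "emotion_index": good - bad}
```

Computed per vehicle model and per topic, values of this kind are what a display layer would chart, for example comparing the appearance topic across competing models.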
The semantic emotion analysis method for the automobile public praise is realized by utilizing an artificial intelligence algorithm, and is a semantic emotion analysis method for the automobile relevant comments, which is realized by combining a large number of automobile relevant comment corpora, automobile industry experience and the artificial intelligence algorithm. Based on automobile related comment linguistic data and industry experience, 9 primary subjects for describing an automobile and 73 secondary subjects under the primary subjects are defined, a plurality of word banks are established, and the recognition degree and emotional tendency of a commented automobile in various aspects of the primary subjects (such as appearance, interior decoration, power, control, oil consumption and the like) and the secondary subjects (such as a head, a painted surface, seats and the like) are analyzed by a consumer by aiming at public praise comment texts related to the automobile by the consumer in combination with a natural language processing technology and the word banks.
Real-time monitoring through automobile public praise semantic emotion analysis is of great help to product development, marketing, product pricing and the like. (1) It provides product planning suggestions before a new vehicle is launched: adverse factors of the product, such as design defects and unclear positioning, can be adjusted promptly according to vehicle owners' comments. (2) It enables targeted improvement of the product line for existing vehicle models and optimizes which configurations are added or removed, for example configuration adjustments and interior design at annual model changes. (3) It uncovers hidden quality problems in time, so that serious problems such as engine oil leakage or an easily damaged gearbox can be prevented in advance. (4) It supports marketing promotion: it captures the image of a vehicle model in consumers' minds, plays to strengths while avoiding weaknesses, steers attention toward product advantages and core selling points, and monitors the marketing effect in real time. (5) It supports public opinion analysis of the product's pre-sale price.
The above-mentioned embodiments are provided to further explain the objects, technical solutions and advantages of the present invention in detail, and it should be understood that the above-mentioned embodiments are only examples of the present invention and are not intended to limit the scope of the present invention. It should be understood that any modifications, equivalents, improvements and the like, which come within the spirit and principle of the invention, may occur to those skilled in the art and are intended to be included within the scope of the invention.

Claims (10)

1. An automobile public praise semantic emotion analysis system, characterized by comprising: a real-time monitoring module, a semantic mining module and a system display module;
the real-time monitoring module is used for acquiring automobile user comment data from across the whole network in real time by means of crawler technology;
the semantic mining module is used for mining automobile-related topic keywords from the comment data and analyzing the emotional tendency of the topic keywords to obtain emotion analysis data;
and the system display module performs statistical analysis on the overall vehicle model, key attributes and product competitiveness details according to the emotion analysis data, so as to form visual charts and assist management decision-making.
2. The automobile public praise semantic emotion analysis system of claim 1, wherein the semantic mining module comprises a lexicon unit, a semantic topic mining unit and an emotional tendency analysis unit;
the lexicon unit is used for storing the comment data and performing word segmentation on the comment data;
the semantic topic mining unit is used for identifying the topics of the comment data after word segmentation, wherein the topics comprise primary topics and secondary topics;
the emotional tendency analysis unit is used for effectively recognizing commendatory and derogatory emotions toward the topics.
3. The automobile public praise semantic emotion analysis system of claim 2, wherein the primary topics include appearance, interior trim, power, handling and fuel consumption; and the secondary topics include the front end, paintwork and seats.
4. The automobile public praise semantic emotion analysis system of claim 3, wherein the lexicon unit comprises an automobile lexicon, an adjective lexicon, a primary topic keyword mapping library, a secondary topic keyword mapping library and a word segmentation sub-module;
the automobile lexicon is used for storing the comment data; the word segmentation sub-module is used for segmenting the comment data in the automobile lexicon by a word segmentation technique to obtain the relevant adjectives and storing them in the adjective lexicon; the primary topic keyword mapping library is used for performing primary topic keyword mapping on the data in the adjective lexicon; and the secondary topic keyword mapping library is used for performing secondary topic keyword mapping on the data in the adjective lexicon.
5. The automobile public praise semantic emotion analysis system of claim 4, wherein the identification of primary topics comprises:
combining the word segmentation result with the primary topic keyword mapping library, extracting the keywords in a sentence that can point to a primary topic, and determining a context for each keyword;
for each input keyword and its context, judging the primary topic to which the keyword points according to the calculation result of a language model;
and judging the primary topic of the sentence according to all the keywords in the sentence.
6. The automobile public praise semantic emotion analysis system of claim 4, wherein the identification of secondary topics comprises:
setting decision-tree-style secondary topic identification rules according to the secondary topic keyword mapping library, so that a keyword with a single topic points directly to its secondary topic;
for a keyword with multiple topics, searching the context window of the keyword for other secondary topic keywords, which increases the probability that the keyword points to the corresponding secondary topic, and thereby determining the secondary topic.
7. The automobile public praise semantic emotion analysis system of claim 4, wherein the effective recognition of commendatory and derogatory emotions toward the topics by the emotional tendency analysis unit comprises:
establishing a separate regression model for each topic;
training and determining the weight of each adjective by a gradient descent method based on the adjective lexicon, to obtain an optimized regression model;
and substituting new comment sentences into the optimized regression model, and judging whether each test sample is commendatory or derogatory according to the function value.
8. The automobile public praise semantic emotion analysis system of claim 1, wherein the statistical analysis of the overall vehicle model, key attributes and product competitiveness details comprises:
displaying objective analysis indices for different vehicle models;
analyzing the advantages and disadvantages in consumer word-of-mouth for each vehicle model and, in combination with the evaluation breadth, distinguishing advantage word-of-mouth points, disadvantage word-of-mouth points, disputed word-of-mouth points and secondary disadvantage word-of-mouth points;
and mining in depth the word-of-mouth details under each topic, as well as the evaluation results for specific configurations.
9. The automobile public praise semantic emotion analysis system of claim 8, wherein the objective analysis indices include the good rating (positive-review rate), the bad rating (negative-review rate) and the emotion index.
10. The automobile public praise semantic emotion analysis system of claim 8, wherein the word-of-mouth details include the good rating and bad rating of the overall styling, the front face, the side profile and the rear under the appearance topic.
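As an illustration only, and not part of the claims, the emotional tendency analysis of claim 7 might be sketched as below. The claims name a regression model whose adjective weights are trained by gradient descent but do not fix its form; logistic regression over the adjectives present in a sentence is assumed here. The function names, the learning rate, the epoch count and the sample adjectives are all hypothetical.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_adjective_weights(samples, adjectives, lr=0.5, epochs=200):
    """Fit one weight per adjective by batch gradient descent.

    samples: list of (adjectives_in_sentence, label) pairs for ONE topic,
    with label 1 for commendatory and 0 for derogatory.
    """
    weights = {adj: 0.0 for adj in adjectives}
    bias = 0.0
    n = len(samples)
    for _ in range(epochs):
        grad_w = dict.fromkeys(weights, 0.0)
        grad_b = 0.0
        for words, label in samples:
            present = [w for w in words if w in weights]
            z = bias + sum(weights[w] for w in present)
            error = sigmoid(z) - label  # gradient of the log loss w.r.t. z
            for w in present:
                grad_w[w] += error
            grad_b += error
        for adj in weights:
            weights[adj] -= lr * grad_w[adj] / n
        bias -= lr * grad_b / n
    return weights, bias


def score(words, weights, bias) -> float:
    """Function value for a new sentence; above 0.5 is read as commendatory."""
    return sigmoid(bias + sum(weights.get(w, 0.0) for w in words))


# Hypothetical training data for the "appearance" topic.
appearance_samples = [
    (["stylish", "sleek"], 1), (["stylish"], 1), (["sleek"], 1),
    (["dated"], 0), (["ugly", "dated"], 0), (["ugly"], 0),
]
weights, bias = train_adjective_weights(
    appearance_samples, ["stylish", "sleek", "ugly", "dated"])
is_commendatory = score(["sleek"], weights, bias) > 0.5
```

One such model would be trained per topic, for example in a dictionary keyed by topic name, so that the same adjective can carry different weights under different topics.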
CN201910745662.4A 2019-08-13 2019-08-13 Automobile public praise semantic emotion analysis system Active CN110543547B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910745662.4A CN110543547B (en) 2019-08-13 2019-08-13 Automobile public praise semantic emotion analysis system

Publications (2)

Publication Number Publication Date
CN110543547A true CN110543547A (en) 2019-12-06
CN110543547B CN110543547B (en) 2021-12-28

Family

ID=68711499

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910745662.4A Active CN110543547B (en) 2019-08-13 2019-08-13 Automobile public praise semantic emotion analysis system

Country Status (1)

Country Link
CN (1) CN110543547B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170024459A1 (en) * 2015-07-24 2017-01-26 International Business Machines Corporation Processing speech to text queries by optimizing conversion of speech queries to text
CN107944060A (en) * 2018-01-02 2018-04-20 天津大学 A kind of product information search method towards automotive vertical website
CN109408809A (en) * 2018-09-25 2019-03-01 天津大学 A kind of sentiment analysis method for automobile product comment based on term vector

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111428510A (en) * 2020-03-10 2020-07-17 蚌埠学院 Public praise-based P2P platform risk analysis method
CN111428510B (en) * 2020-03-10 2023-04-07 蚌埠学院 Public praise-based P2P platform risk analysis method
CN111612015A (en) * 2020-05-26 2020-09-01 创新奇智(西安)科技有限公司 Vehicle identification method and device and electronic equipment
CN111612015B (en) * 2020-05-26 2023-10-31 创新奇智(西安)科技有限公司 Vehicle identification method and device and electronic equipment
CN111931497A (en) * 2020-07-16 2020-11-13 中国汽车技术研究中心有限公司 Optimization method for language of questionnaire for automobile consumer
CN114066117A (en) * 2020-08-05 2022-02-18 四川大学 Park multi-scale evaluation method based on comment text
CN114066117B (en) * 2020-08-05 2023-04-07 四川大学 Park multi-scale evaluation method based on comment text
CN112101033A (en) * 2020-09-01 2020-12-18 广州威尔森信息科技有限公司 Emotion analysis method and device for automobile public praise
CN112101033B (en) * 2020-09-01 2021-06-15 广州威尔森信息科技有限公司 Emotion analysis method and device for automobile public praise
CN112905740A (en) * 2021-02-04 2021-06-04 合肥工业大学 Topic preference mining method for competitive product hierarchy
CN112905740B (en) * 2021-02-04 2022-08-30 合肥工业大学 Topic preference mining method for competitive product hierarchy

Also Published As

Publication number Publication date
CN110543547B (en) 2021-12-28

Similar Documents

Publication Publication Date Title
CN110543547B (en) Automobile public praise semantic emotion analysis system
CN112001185B (en) Emotion classification method combining Chinese syntax and graph convolution neural network
CN108733653B (en) Sentiment analysis method of Skip-gram model based on fusion of part-of-speech and semantic information
CN111914096B (en) Public opinion knowledge graph-based public transportation passenger satisfaction evaluation method and system
CN112001187B (en) Emotion classification system based on Chinese syntax and graph convolution neural network
CN107862087B (en) Emotion analysis method and device based on big data and deep learning and storage medium
CN112001186A (en) Emotion classification method using graph convolution neural network and Chinese syntax
CN108388660B (en) Improved E-commerce product pain point analysis method
CN108038725A (en) A kind of electric business Customer Satisfaction for Product analysis method based on machine learning
CN102929860B (en) Chinese clause emotion polarity distinguishing method based on context
Zhou et al. Sentiment analysis of text based on CNN and bi-directional LSTM model
CN110765769A (en) Entity attribute dependency emotion analysis method based on clause characteristics
CN108363691A (en) A kind of field term identifying system and method for 95598 work order of electric power
CN107818173B (en) Vector space model-based Chinese false comment filtering method
Biba et al. Sentiment analysis through machine learning: an experimental evaluation for Albanian
CN114255096A (en) Data requirement matching method and device, electronic equipment and storage medium
CN112632982A (en) Dialogue text emotion analysis method capable of being used for supplier evaluation
CN113360647B (en) 5G mobile service complaint source-tracing analysis method based on clustering
CN111241290A (en) Comment tag generation method and device and computing equipment
CN112200674B (en) Stock market emotion index intelligent calculation information system
KR20110044112A (en) Semi-automatic building of pattern database for mining review of product attributes
Rong et al. Sentiment analysis of ecommerce product review data based on deep learning
CN107729509B (en) Discourse similarity determination method based on recessive high-dimensional distributed feature representation
CN116070620A (en) Information processing method and system based on big data
CN115906824A (en) Text fine-grained emotion analysis method, system, medium and computing equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant