CN113204710A - Public opinion analysis method and device, terminal equipment and storage medium - Google Patents

Public opinion analysis method and device, terminal equipment and storage medium

Info

Publication number
CN113204710A
Authority
CN
China
Prior art keywords
index
standard
similarity
competitiveness
indexes
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110605199.0A
Other languages
Chinese (zh)
Inventor
陈凯
徐冰
汪伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202110605199.0A priority Critical patent/CN113204710A/en
Publication of CN113204710A publication Critical patent/CN113204710A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The application is applicable to the technical field of data analysis and provides a public opinion analysis method and device, terminal equipment, and a storage medium. The method comprises the following steps: acquiring a competitiveness index of public opinion information; if a plurality of pre-stored standard indexes do not include the competitiveness index, calculating the semantic similarity between the competitiveness index and each standard index; when the semantic similarity is smaller than a first preset threshold, screening candidate indexes from the standard indexes based on the semantic similarity; calculating the literal similarity between each candidate index and the competitiveness index; and if any of the literal similarities is greater than a second preset threshold, taking the candidate index corresponding to the maximum of the literal similarities greater than the second preset threshold as the standard index matched with the competitiveness index. By standardizing the competitiveness index through semantic similarity and literal similarity, the application reduces the amount of information to be handled during public opinion analysis, facilitates the analysis of public opinion information, and improves the efficiency of public opinion analysis.

Description

Public opinion analysis method and device, terminal equipment and storage medium
Technical Field
The application belongs to the technical field of data analysis, and particularly relates to a public opinion analysis method, a public opinion analysis device, terminal equipment and a storage medium.
Background
Public opinion is the opinion or attitude that the public holds regarding the occurrence, development, and change of an event over a certain period of time. It is the sum of the beliefs, attitudes, opinions, and emotions expressed by most people about various phenomena and problems in society. Public opinion analysis plays an important role in the development of enterprises and organizations.
At present, public opinion is analyzed by manually extracting keywords and manually analyzing them. A large amount of public opinion information may exist at the same time, and the keywords extracted from it are highly varied, so analyzing this many different keywords consumes considerable manpower and time, and the efficiency of public opinion analysis is low.
Disclosure of Invention
The embodiment of the application provides a public opinion analysis method, a public opinion analysis device, terminal equipment and a storage medium, which can solve the problem of low public opinion analysis efficiency.
In a first aspect, an embodiment of the present application provides a public opinion analysis method, including:
acquiring a competitiveness index of public opinion information;
if the pre-stored multiple standard indexes do not comprise the competitiveness index, calculating the semantic similarity between the competitiveness index and each standard index;
when the semantic similarity is smaller than a first preset threshold, screening candidate indexes from the standard indexes based on the semantic similarity;
calculating literal similarity of the candidate index and the competitiveness index;
and if the literal similarity greater than a second preset threshold exists in the literal similarities, taking the candidate index corresponding to the maximum value in the literal similarities greater than the second preset threshold as the standard index matched with the competitiveness index.
In a second aspect, an embodiment of the present application provides a public opinion analysis device, including:
the index acquisition module is used for acquiring a competitiveness index of public opinion information;
the semantic similarity calculation module is used for calculating the semantic similarity between the competitiveness index and each standard index if the pre-stored standard indexes do not comprise the competitiveness index;
the screening module is used for screening candidate indexes from the standard indexes based on the semantic similarity when the semantic similarity is smaller than a first preset threshold;
the literal similarity calculation module is used for calculating the literal similarity of the candidate indexes and the competitiveness indexes;
and the judging module is used for taking the candidate index corresponding to the maximum value in the literal similarity greater than the second preset threshold value as the standard index matched with the competitiveness index if the literal similarity greater than the second preset threshold value exists in the literal similarity.
In a third aspect, an embodiment of the present application provides a terminal device, including a memory, a processor, and a computer program that is stored in the memory and executable on the processor, wherein the processor implements the public opinion analysis method according to any one of the first aspect when executing the computer program.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, where a computer program is stored, and when the computer program is executed by a processor, the public opinion analysis method in any one of the above first aspects is implemented.
In a fifth aspect, the present application provides a computer program product, which when run on a terminal device, causes the terminal device to execute the public opinion analysis method according to any one of the above first aspects.
It is understood that the beneficial effects of the second aspect to the fifth aspect can be referred to the related description of the first aspect, and are not described herein again.
Compared with the prior art, the embodiments of the application have the following advantages. A competitiveness index of public opinion information is first obtained, and if the pre-stored standard indexes do not include the competitiveness index, the semantic similarity between the competitiveness index and each standard index is calculated. When the semantic similarity is smaller than a first preset threshold, candidate indexes are screened from the standard indexes based on the semantic similarity, and the literal similarity between each candidate index and the competitiveness index is calculated. If any of the literal similarities is greater than a second preset threshold, the candidate index corresponding to the maximum of the literal similarities greater than the second preset threshold is taken as the standard index matched with the competitiveness index. By standardizing the competitiveness index through semantic similarity and literal similarity, the application reduces the amount of information to be handled during public opinion analysis, facilitates the analysis of public opinion information, and improves the efficiency of public opinion analysis.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from these drawings without inventive effort.
Fig. 1 is a schematic view of an application scenario of a public opinion analysis method according to an embodiment of the present application;
Fig. 2 is a schematic flowchart of a public opinion analysis method according to an embodiment of the present application;
Fig. 3 is a schematic flowchart of a semantic similarity calculation method according to an embodiment of the present application;
Fig. 4 is a schematic flowchart of a method for screening candidate indexes according to an embodiment of the present application;
Fig. 5 is a schematic flowchart of a method for processing opinion words and standard indexes according to an embodiment of the present application;
Fig. 6 is a schematic structural diagram of a display page of a public opinion analysis model according to an embodiment of the present application;
Fig. 7 is a schematic structural diagram of a public opinion analysis device according to an embodiment of the present application;
Fig. 8 is a schematic structural diagram of a terminal device according to an embodiment of the present application.
Detailed Description
In the following description, for purposes of explanation and not limitation, specific details are set forth, such as particular system structures, techniques, etc. in order to provide a thorough understanding of the embodiments of the present application. It will be apparent, however, to one skilled in the art that the present application may be practiced in other embodiments that depart from these specific details. In other instances, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so as not to obscure the description of the present application with unnecessary detail.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It should also be understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in the specification of this application and the appended claims, the term "if" may be interpreted contextually as "when … …" or "upon" or "in response to a determination" or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [ described condition or event ] is detected" may be interpreted contextually to mean "upon determining" or "in response to determining" or "upon detecting [ described condition or event ]" or "in response to detecting [ described condition or event ]".
Furthermore, in the description of the present application and the appended claims, the terms "first," "second," "third," and the like are used for distinguishing between descriptions and not necessarily for describing or implying relative importance.
Reference throughout this specification to "one embodiment" or "some embodiments," or the like, means that a particular feature, structure, or characteristic described in connection with the embodiment is included in one or more embodiments of the present application. Thus, appearances of the phrases "in one embodiment," "in some embodiments," "in other embodiments," or the like, in various places throughout this specification are not necessarily all referring to the same embodiment, but rather "one or more but not all embodiments" unless specifically stated otherwise. The terms "comprising," "including," "having," and variations thereof mean "including, but not limited to," unless expressly specified otherwise.
Fig. 1 is a schematic view of an application scenario of a public opinion analysis method according to an embodiment of the present application, where the public opinion analysis method can be used for analyzing public opinion information. The storage device 10 stores public opinion information and the competitiveness indexes corresponding to it. The processor 20 acquires the public opinion information and the corresponding competitiveness indexes from the storage device 10 and standardizes the competitiveness indexes by analyzing them against a plurality of preset standard indexes, thereby achieving the purpose of analyzing the public opinion information.
The public opinion analysis method according to the embodiment of the present application is described in detail below with reference to fig. 1.
Fig. 2 shows a schematic flow chart of a public opinion analysis method provided by the present application, and with reference to fig. 2, the method is described in detail as follows:
S101, obtaining a competitiveness index of public opinion information.
In this embodiment, the competitiveness index of public opinion information can be obtained from the storage device.
Optionally, public opinion information may be acquired first, and then a competitiveness index is acquired based on the public opinion information.
Specifically, the public opinion information can be downloaded from the internet. Public opinion information may also be obtained from a storage device or database. After the public opinion information is obtained, the public opinion information can be input into the index extraction model to obtain the competitive index of the public opinion information. The index extraction model may be a neural network model or the like.
In this embodiment, the competitiveness index is an index that reflects competitiveness, that is, the ability an object exhibits in competition.
In this embodiment, the competitiveness index may be a keyword extracted from public opinion information and capable of reflecting competitiveness.
As an example, if the public opinion information is "The brand value of brand A has grown strongly, achieving three jumps in three years in the global top-500 brand value ranking", the keyword "brand value" can be used as a competitiveness index.
In this embodiment, a piece of public opinion information may include one or more competitiveness indexes.
S102, if the pre-stored standard indexes do not comprise the competitiveness index, calculating the semantic similarity between the competitiveness index and each standard index.
In the present embodiment, the standard index may be an index stored in advance in a public opinion analysis model or a database. For example, the standard indexes may be the number of customers, business performance, valuation premium, research and development capability, market risk, sales profit margin, and the like.
In this embodiment, after the competitiveness index is obtained, the standard indexes may first be searched for it. If the competitiveness index is already one of the standard indexes, it needs no further processing. If the competitiveness index is not among the standard indexes, it needs to be standardized: the standard index matched with the competitiveness index is found, and the competitiveness index is mapped to that standard index.
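The lookup described above can be sketched as follows. This is a minimal illustration rather than the method claimed in the application: the set `STANDARD_INDEXES` and the function name `needs_standardization` are hypothetical, and exact string equality is assumed to be the membership test.

```python
# Hypothetical pre-stored standard indexes (a few names from the examples above).
STANDARD_INDEXES = {
    "number of customers",
    "business performance",
    "market risk",
    "sales profit margin",
}


def needs_standardization(competitiveness_index: str) -> bool:
    """Return True when the index is not already one of the standard indexes."""
    return competitiveness_index not in STANDARD_INDEXES


print(needs_standardization("brand value"))  # not stored, so it must be standardized
print(needs_standardization("market risk"))  # already a standard index
```

Only an index for which this check returns True proceeds to the similarity calculations.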
Specifically, if the competitive index is not included in the plurality of standard indexes, the semantic similarity between the competitive index and each standard index can be calculated. The semantic similarity represents the similarity degree of the competitiveness index and the standard index in the meaning of language, and if the two indexes are similar semantically, the standard index and the competitiveness index can be determined to be matched with each other.
Specifically, the cosine similarity between the competitiveness index and the standard index can be calculated, and the calculated cosine similarity is used as the semantic similarity between the competitiveness index and the standard index.
S103, when the semantic similarity is smaller than a first preset threshold value, screening candidate indexes from the standard indexes based on the semantic similarity.
In this embodiment, after obtaining the semantic similarity between the competitiveness index and the standard index, the semantic similarity may be compared with a first preset threshold, so as to determine whether the standard index may be used as the standard index matched with the competitiveness index.
In the present embodiment, the first preset threshold may be set as needed, for example, 0.8, 0.7, 0.75, or the like.
Specifically, if there is a semantic similarity greater than or equal to the first preset threshold in the semantic similarities, a maximum value of the semantic similarities greater than or equal to the first preset threshold may be searched for; and taking the standard index corresponding to the maximum value in the semantic similarity greater than or equal to the first preset threshold as the standard index matched with the competitiveness index, namely taking the standard index with the highest similarity as the standard index matched with the competitiveness index, and mapping the competitiveness index to the standard index with the highest similarity.
For example, suppose 3 standard indexes have semantic similarities with the competitiveness index that are greater than the first preset threshold: the semantic similarity between standard index A and the competitiveness index is 0.81; that of standard index B is 0.88; and that of standard index C is 0.91. Since the semantic similarity between standard index C and the competitiveness index is the highest, standard index C is taken as the standard index matched with the competitiveness index.
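The selection in this example can be sketched in a few lines. The function name `best_semantic_match` is hypothetical, and the default threshold of 0.8 is simply one of the example values mentioned above.

```python
from typing import Dict, Optional


def best_semantic_match(
    similarities: Dict[str, float], first_threshold: float = 0.8
) -> Optional[str]:
    """Return the standard index with the highest semantic similarity that
    reaches the first preset threshold, or None if no similarity reaches it."""
    qualified = {name: s for name, s in similarities.items() if s >= first_threshold}
    if not qualified:
        return None
    return max(qualified, key=qualified.get)


scores = {"A": 0.81, "B": 0.88, "C": 0.91}
print(best_semantic_match(scores))  # prints C
```

A return value of None corresponds to the case where every similarity is below the threshold and further judgment is needed.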
In this embodiment, if every semantic similarity is smaller than the first preset threshold, semantic similarity alone cannot determine whether a standard index matching the competitiveness index exists, because no standard index is sufficiently close to the competitiveness index in meaning. Further judgment is therefore required.
Specifically, according to the semantic similarity, a standard index which may be matched with the competitiveness index is first screened from the standard indexes, and the screened standard index is marked as a candidate index in the application.
Optionally, the semantic similarities may be arranged from large to small, a preset number of semantic similarities are selected from the first one, and the standard index corresponding to the selected semantic similarities is used as the candidate index. The preset number can be set according to needs, and can also be the product of the total number of the standard indexes and the preset percentage.
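A minimal sketch of this optional ranking step follows. The function name `top_candidates` and the default percentage are hypothetical, and rounding the product up to a whole number is an assumption, since the text does not say how a fractional count is handled.

```python
import math
from typing import Dict, List, Optional


def top_candidates(
    similarities: Dict[str, float],
    preset_number: Optional[int] = None,
    preset_percentage: float = 0.5,
) -> List[str]:
    """Sort standard indexes by semantic similarity, largest first, and keep
    the first `preset_number` of them as candidate indexes."""
    if preset_number is None:
        # Alternative from the text: total number of standard indexes times
        # a preset percentage (rounded up here, which is an assumption).
        preset_number = math.ceil(len(similarities) * preset_percentage)
    ranked = sorted(similarities, key=similarities.get, reverse=True)
    return ranked[:preset_number]


print(top_candidates({"A": 0.3, "B": 0.7, "C": 0.5, "D": 0.1}, preset_number=2))
```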
S104, calculating the literal similarity of the candidate index and the competitiveness index.
In this embodiment, the literal similarity characterizes how similar the candidate index and the competitiveness index are in their written form, that is, as character strings.
Optionally, the candidate index and the competitiveness index are input into the deep learning model to obtain the literal similarity of the candidate index and the competitiveness index.
S105, if any of the literal similarities is greater than a second preset threshold, taking the candidate index corresponding to the maximum of the literal similarities greater than the second preset threshold as the standard index matched with the competitiveness index.
In this embodiment, after the literal similarity between the candidate index and the competitiveness index is obtained, the candidate index matching the competitiveness index, that is, the standard index matching the competitiveness index, may be determined according to the literal similarity.
Specifically, a second preset threshold may be set as required, whether the literal similarity greater than the second preset threshold exists in the literal similarity is determined, and if the literal similarity greater than the second preset threshold exists in the literal similarity, it is determined that a candidate index matching the competitiveness index exists in the candidate index. And if only one literal similarity greater than a second preset threshold exists in the literal similarities, taking the candidate index corresponding to the literal similarities as a candidate index matched with the competitiveness index, wherein the candidate index is the standard index.
If a plurality of the literal similarities are greater than the second preset threshold, the largest literal similarity can be found, and the candidate index corresponding to the largest literal similarity is used as the standard index matched with the competitiveness index. After the standard index matched with the competitiveness index is obtained, the competitiveness index can be mapped to that standard index.
In this embodiment, if there is no literal similarity greater than the second preset threshold in the literal similarities, it is determined that there is no candidate index matching the competitiveness index in the candidate indexes.
In this embodiment, if all of the literal similarities are less than or equal to the second preset threshold, no standard index matching the competitiveness index can be found and the competitiveness index cannot be mapped to a standard index. In that case, the competitiveness index and the public opinion information are displayed to the user, or the competitiveness index is stored as a new standard index.
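The decision in step S105 can be sketched as below. The function name `match_by_literal_similarity` is hypothetical, and the default second threshold of 0.6 is an arbitrary placeholder, since the text gives no example value for it.

```python
from typing import Dict, Optional


def match_by_literal_similarity(
    literal_similarities: Dict[str, float], second_threshold: float = 0.6
) -> Optional[str]:
    """Return the candidate index with the largest literal similarity that is
    strictly greater than the second preset threshold, or None if none is."""
    above = {c: s for c, s in literal_similarities.items() if s > second_threshold}
    if not above:
        # No match: the caller may show the index to the user or store it
        # as a new standard index, as described above.
        return None
    return max(above, key=above.get)


print(match_by_literal_similarity({"X": 0.65, "Y": 0.72, "Z": 0.40}))  # prints Y
```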
In the embodiment of the application, a competitiveness index of public opinion information is first obtained, and if the competitiveness index is not included in a plurality of pre-stored standard indexes, the semantic similarity between the competitiveness index and each standard index is calculated. When the semantic similarity is smaller than a first preset threshold, candidate indexes are screened from the standard indexes based on the semantic similarity, and the literal similarity between each candidate index and the competitiveness index is calculated. If any of the literal similarities is greater than a second preset threshold, the candidate index corresponding to the maximum of the literal similarities greater than the second preset threshold is taken as the standard index matched with the competitiveness index. By standardizing the competitiveness index through semantic similarity and literal similarity, the application reduces the amount of information to be handled during public opinion analysis, facilitates the analysis of public opinion information, and improves the efficiency of public opinion analysis. If only the literal similarity were used, it could only be judged whether the competitiveness index resembles the standard index in written form; if only the semantic similarity were used, it could only be judged whether the two are similar in meaning. Either approach alone is neither comprehensive nor accurate enough and easily leads to misjudgment. Therefore, whether the competitiveness index matches a standard index is judged by combining the semantic similarity and the literal similarity, and the judgment result is more comprehensive and more accurate.
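Putting steps S102 to S105 together, the overall flow might look like the sketch below. It is only an outline under stated assumptions: `standardize` is a hypothetical name, the two similarity functions are passed in by the caller, candidates are screened with a preset range whose bounds are treated as inclusive at the low end and exclusive at the high end, and all default thresholds are placeholders.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple


def standardize(
    index: str,
    standard_indexes: Iterable[str],
    semantic_sim: Callable[[str, str], float],
    literal_sim: Callable[[str, str], float],
    first_threshold: float = 0.8,
    candidate_range: Tuple[float, float] = (0.4, 0.8),
    second_threshold: float = 0.6,
) -> Optional[str]:
    """Map a competitiveness index to a standard index, or return None."""
    standards = list(standard_indexes)
    # S102: an index that is already standard needs no processing.
    if index in standards:
        return index
    sem: Dict[str, float] = {s: semantic_sim(index, s) for s in standards}
    # S103, first branch: a semantic match at or above the first threshold.
    strong = [s for s in standards if sem[s] >= first_threshold]
    if strong:
        return max(strong, key=sem.get)
    # S103, second branch: keep standard indexes inside the preset range.
    low, high = candidate_range
    candidates = [s for s in standards if low <= sem[s] < high]
    # S104: literal similarity of each candidate.
    lit = {c: literal_sim(index, c) for c in candidates}
    # S105: best candidate above the second threshold, if any.
    above = [c for c in candidates if lit[c] > second_threshold]
    return max(above, key=lit.get) if above else None
```

Passing the similarity functions as parameters keeps the control flow separate from the choice of word-vector model and string metric.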
As shown in fig. 3, in a possible implementation manner, the implementation process of step S102 may include:
S1021, obtaining the first word vector of the competitiveness index.
In this embodiment, a word vector (word embedding) is a mathematical representation of a word in a language: each word is represented by a vector, typically of high dimension.
In this embodiment, the word vector of the competitiveness index, referred to as the first word vector in this application, may be obtained from a storage device.
Alternatively, the word vector of the competitiveness index may also be obtained from a word vector calculation model. Specifically, the competitiveness index is input into a word vector calculation model to obtain a word vector of the competitiveness index. The word vector computation model may be a neural network model, a probabilistic model, etc.
S1022, acquiring the second word vector of each standard index.
In this embodiment, the method for obtaining the word vector of the standard indicator is similar to the method for obtaining the word vector of the competitiveness indicator, and reference may be made to the description of the method for obtaining the word vector of the competitiveness indicator, which is not repeated herein.
S1023, calculating the cosine distance between the first word vector and the second word vector, and taking the cosine distance as the semantic similarity.
In this embodiment, the cosine distance is derived from the cosine similarity, which is the cosine of the included angle between the two word vectors.
In this embodiment, the cosine distance may be obtained according to a cosine distance calculation model.
Specifically, the cosine distance calculation model is:

$$y = 1 - \cos(A, B), \qquad \cos(A, B) = \frac{A \cdot B}{\|A\|_2 \, \|B\|_2}$$

where $y$ is the cosine distance, $\cos(A, B)$ is the cosine of the angle between $A$ and $B$, $A$ is the first word vector, $B$ is the second word vector, $\|A\|_2$ is the two-norm of $A$ (the Euclidean length of the vector in space), and $\|B\|_2$ is the two-norm of $B$.
In the embodiment of the application, the cosine distance between the word vector of the standard index and the word vector of the competitiveness index is used as the semantic similarity, rather than the Euclidean distance between the two word vectors used in the prior art. The reason is that when a pair of texts differ greatly in length but are close in content, the Euclidean distance between their word vectors in the feature space is usually large, whereas the angle between the vectors may be small, so the cosine similarity is high. Using the cosine distance as the semantic similarity therefore makes the calculated semantic similarity more accurate.
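The contrast between the two measures can be illustrated with a short sketch. The function names and the toy vectors are hypothetical. Note also that the text uses "cosine distance" and "cosine similarity" somewhat interchangeably; the sketch provides both, on the assumption that the thresholds in step S103 are compared against the similarity (larger means closer) rather than against y = 1 - cos(A, B).

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """cos(A, B) = (A . B) / (||A||_2 * ||B||_2)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """y = 1 - cos(A, B), following the model above."""
    return 1.0 - cosine_similarity(a, b)


def euclidean_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two vectors of equal dimension."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


short = [1.0, 2.0, 3.0]
long_ = [10.0, 20.0, 30.0]  # same direction, ten times the length
print(euclidean_distance(short, long_))  # large, despite identical direction
print(cosine_similarity(short, long_))   # close to 1
```

The two toy vectors point in the same direction, so the cosine measure treats them as near-identical while the Euclidean distance does not.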
As shown in fig. 4, in a possible implementation manner, the implementation process of step S103 may include:
S1031, determining whether any of the semantic similarities falls within a preset range.
In the present embodiment, the preset range may be set as needed, for example, the preset range may be set to 0.4-0.8.
S1032, if the semantic similarity in the preset range exists in the semantic similarities, taking the standard index corresponding to the semantic similarity in the preset range as the candidate index.
In this embodiment, if the semantic similarity exists in the preset range, it indicates that there may be a standard index that matches the competitiveness index in the standard index, and therefore, the standard index that may match the competitiveness index may be screened out as a candidate index.
In this embodiment, if there is no semantic similarity within a preset range in the semantic similarities, it is determined that there is no standard index matching the competitiveness index in the standard indexes.
In the embodiment of the application, the standard indexes are screened through the preset range, so that the standard indexes possibly matched with the competitiveness index can be preliminarily screened out, the data volume of subsequent calculation is reduced, and the judgment efficiency is improved.
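Range-based screening reduces to a simple filter, sketched below. The function name `screen_by_range` is hypothetical, the bounds 0.4 and 0.8 are the example values from the text, and treating the range as closed at the lower bound and open at the upper bound is an assumption.

```python
from typing import Dict, List


def screen_by_range(
    similarities: Dict[str, float], low: float = 0.4, high: float = 0.8
) -> List[str]:
    """Return the standard indexes whose semantic similarity lies in [low, high)."""
    return [name for name, s in similarities.items() if low <= s < high]


print(screen_by_range({"A": 0.35, "B": 0.55, "C": 0.79}))
```

An empty result corresponds to the case where no standard index can match the competitiveness index.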
In one possible implementation manner, the implementation process of step S104 may include:
Calculating the edit distance between the candidate index and the competitiveness index, and taking the edit distance as the literal similarity of the candidate index and the competitiveness index.
In this embodiment, the edit distance is a quantitative measure of the difference between two character strings (e.g., English text): the minimum number of edit operations required to change one string into the other. Edit distance is used in natural language processing; for example, a spell checker can determine which correct word or words are most likely intended based on the edit distance between a misspelled word and candidate correct words.
In this embodiment, the edit distance between the candidate index and the competitiveness index may be obtained from a storage device. In addition, the edit distance between the candidate index and the competitiveness index may be obtained from an edit distance algorithm model. Specifically, the edit distance can be obtained by inputting the candidate index and the competitiveness index into the edit distance algorithm model.
In the embodiment of the application, the editing distance is taken as the literal similarity, so that the literal similarity between the candidate index and the competitiveness index can be better reflected, and the matching degree of the judged standard index and the competitiveness index is more accurate.
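One common way to compute an edit distance is the Levenshtein algorithm, sketched below with dynamic programming. The text does not say which edit operations are counted or how a distance is turned into a similarity that can be compared with "greater than the second preset threshold", so both the choice of Levenshtein distance and the normalization in `literal_similarity` are assumptions.

```python
def edit_distance(source: str, target: str) -> int:
    """Levenshtein distance: minimum number of single-character insertions,
    deletions, and substitutions that turn `source` into `target`."""
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        current = [i]
        for j, t_char in enumerate(target, start=1):
            cost = 0 if s_char == t_char else 1
            current.append(
                min(
                    previous[j] + 1,         # delete s_char
                    current[j - 1] + 1,      # insert t_char
                    previous[j - 1] + cost,  # substitute or keep
                )
            )
        previous = current
    return previous[-1]


def literal_similarity(a: str, b: str) -> float:
    """Assumed normalization: 1 - distance / length of the longer string,
    so identical strings score 1.0 and larger values mean closer strings."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


print(edit_distance("kitten", "sitting"))  # prints 3
```

With this normalization, a higher score means the two strings are closer in written form, which fits the threshold comparison in step S105.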
In one possible implementation, after the competitiveness index is standardized, the public opinion information may be analyzed by using the standardized competitiveness index.
As shown in fig. 5, specifically, after step S105, the method may further include:
S201, obtaining the viewpoint words corresponding to the competitiveness index in the public opinion information.
In this embodiment, the public opinion information often includes the public's evaluation of the competitiveness index, which is referred to as a viewpoint word in this application.
In this embodiment, the viewpoint words of the competitiveness index in the public opinion information can be obtained from the storage device. The viewpoint words may also be obtained from an extraction model. Specifically, the public opinion information and the competitiveness index are input into the extraction model to obtain the viewpoint words corresponding to the competitiveness index.
As an example, if the public opinion information is "the brand value of brand A shows strong growth, achieving three jumps in three years in the top-500 brand ranking", the viewpoint word corresponding to the competitiveness index "brand value" is "strong growth".
S202, the viewpoint words and the standard indexes matched with the competitiveness indexes are stored in an associated mode.
In this embodiment, after the viewpoint words are obtained, the viewpoint words and the determined standard index matching the competitiveness index may be stored in association and displayed to the user.
Specifically, the viewpoint words and the standard index matching the competitiveness index may be stored in association in a public opinion analysis model.
For example, suppose the competitiveness index is "brand value", the standard index matching the competitiveness index is "valuation premium", and the viewpoint word corresponding to "brand value" is "strong growth". Then "strong growth" and "valuation premium" can be stored in association.
In this embodiment, the public opinion analysis model can classify the standard index, and specifically, the public opinion analysis model can classify the corresponding standard index according to the viewpoint words, and different labels are marked on the standard index, so that the public evaluation trend included in the public opinion information can be clearly obtained through the classified labels of the standard index, and the purpose of public opinion information analysis is achieved. The labels of the standard index may include positive, neutral, and negative, among others. And finally, the public opinion analysis model displays the information stored in the model so as to be convenient for visually obtaining the result of the public opinion information analysis.
As shown in fig. 6, for example, if the competitiveness index is "brand value", the standard index matching the competitiveness index is "valuation premium", and the viewpoint word corresponding to "brand value" is "strong growth", then "valuation premium" can be labeled as positive.
In the embodiment of the application, the evaluation of the public on the competitiveness index can be reflected more visually by storing the viewpoint words and the standard index matched with the competitiveness index in a correlation manner, so that the follow-up query is simpler and more convenient.
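The associated storage and labeling of step S202 might look like the sketch below. All names here (`OpinionStore`, `VIEWPOINT_LABELS`, `associate`) are hypothetical, and the small lexicon that maps viewpoint words to the positive, neutral, and negative labels is an illustrative assumption; the source does not specify how the public opinion analysis model assigns labels.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

# Hypothetical lexicon (an assumption): viewpoint word -> label.
VIEWPOINT_LABELS: Dict[str, str] = {
    "strong growth": "positive",
    "stable": "neutral",
    "sharp decline": "negative",
}


@dataclass
class OpinionStore:
    """Stores viewpoint words in association with the matched standard index."""
    records: Dict[str, List[Tuple[str, str]]] = field(default_factory=dict)

    def associate(self, standard_index: str, viewpoint_word: str) -> str:
        # Unknown viewpoint words default to "neutral" in this sketch.
        label = VIEWPOINT_LABELS.get(viewpoint_word, "neutral")
        self.records.setdefault(standard_index, []).append((viewpoint_word, label))
        return label


store = OpinionStore()
label = store.associate("valuation premium", "strong growth")
```

Keying the records by standard index rather than by the raw competitiveness index means that differently worded mentions of the same index accumulate under one entry, which is what makes later queries simpler.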
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Fig. 7 is a block diagram illustrating a configuration of a public opinion analyzing apparatus according to an embodiment of the present application, which corresponds to the public opinion analyzing method according to the above embodiment.
Referring to fig. 7, the apparatus 300 may include: the index obtaining module 310, the semantic similarity calculating module 320, the screening module 330, the literal similarity calculating module 340, and the judging module 350.
The index obtaining module 310 is configured to obtain a competitiveness index of public opinion information;
the semantic similarity calculating module 320 is configured to calculate a semantic similarity between the competitiveness index and each standard index if the pre-stored multiple standard indexes do not include the competitiveness index;
the screening module 330 is configured to screen candidate indexes from the standard indexes based on the semantic similarity when the semantic similarity is smaller than a first preset threshold;
a literal similarity calculation module 340, configured to calculate literal similarities of the candidate index and the competitiveness index;
the determining module 350 is configured to, if there is a literal similarity greater than a second preset threshold in the literal similarities, take a candidate index corresponding to a maximum value in the literal similarities greater than the second preset threshold as a standard index matched with the competitiveness index.
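How the five modules cooperate can be sketched as a single function. This is an illustration under stated assumptions, not the applicant's implementation: the function name, the parameter names, the inclusive bounds of the preset range, and the choice to return the index itself when it is already a standard index are all hypothetical. The two similarity measures are passed in as callables so that the sketch stays independent of any particular word vector model or string metric.

```python
from typing import Callable, Optional, Sequence, Tuple


def match_standard_index(
    competitiveness_index: str,
    standard_indexes: Sequence[str],
    semantic_similarity: Callable[[str, str], float],
    literal_similarity: Callable[[str, str], float],
    first_threshold: float,
    preset_range: Tuple[float, float],
    second_threshold: float,
) -> Optional[str]:
    """Return the standard index matched to the competitiveness index, or None."""
    # Assumption: an index that is already standard needs no matching.
    if competitiveness_index in standard_indexes:
        return competitiveness_index

    # Semantic similarity against every standard index.
    semantic = {s: semantic_similarity(competitiveness_index, s)
                for s in standard_indexes}

    # If any similarity reaches the first threshold, take the largest one.
    above = {s: v for s, v in semantic.items() if v >= first_threshold}
    if above:
        return max(above, key=above.get)

    # Otherwise screen candidates whose similarity lies in the preset range.
    low, high = preset_range
    candidates = [s for s, v in semantic.items() if low <= v <= high]

    # Literal similarity decides among the candidates.
    literal = {c: literal_similarity(competitiveness_index, c) for c in candidates}
    passing = {c: v for c, v in literal.items() if v > second_threshold}
    if not passing:
        return None  # no matching standard index
    return max(passing, key=passing.get)
```

Returning `None` covers both failure cases: no semantic similarity falls within the preset range, or no literal similarity exceeds the second preset threshold.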
In one possible implementation, the semantic similarity calculation module 320 may be specifically configured to:
acquiring a first word vector of the competitiveness index;
acquiring a second word vector of each standard index;
and calculating the cosine distance between the first word vector and the second word vector, and taking the cosine distance as the semantic similarity.
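The word vector comparison above can be illustrated with a short sketch. It assumes that "cosine distance" here denotes the cosine of the angle between the two vectors (commonly called cosine similarity), so that a larger value means closer meaning; the function name and the handling of zero vectors are assumptions, and how the word vectors themselves are produced is outside the sketch.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: a zero vector is treated as dissimilar
    return dot / (norm_u * norm_v)
```

Vectors pointing in the same direction score close to 1, orthogonal vectors score 0, and the result does not depend on vector length, which suits word vectors whose magnitudes vary.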
In a possible implementation manner, the apparatus further includes, connected to the semantic similarity calculation module 320:
the threshold comparison module is used for determining the maximum value of the semantic similarities which are greater than or equal to the first preset threshold if the semantic similarities which are greater than or equal to the first preset threshold exist in the semantic similarities;
and the index output module is used for taking the standard index corresponding to the maximum value in the semantic similarity greater than or equal to the first preset threshold as the standard index matched with the competitiveness index.
In a possible implementation manner, the screening module 330 may specifically be configured to:
determining whether semantic similarity within a preset range exists in the semantic similarity;
and if the semantic similarity in a preset range exists in the semantic similarities, taking a standard index corresponding to the semantic similarity in the preset range as the candidate index.
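The screening step alone can be written in a few lines. In this sketch the function name is hypothetical and the range bounds are treated as inclusive, which is an assumption; the source only says that the similarity lies "within a preset range".

```python
from typing import Dict, List


def screen_candidates(
    semantic_similarities: Dict[str, float], low: float, high: float
) -> List[str]:
    """Keep the standard indexes whose semantic similarity lies in [low, high]."""
    return [
        name
        for name, score in semantic_similarities.items()
        if low <= score <= high
    ]
```

An empty result corresponds to the case in which no standard index matches the competitiveness index, so the literal similarity calculation can be skipped entirely.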
In one possible implementation, the literal similarity calculation module 340 may be specifically configured to:
calculating an edit distance between the candidate index and the competitiveness index;
and taking the edit distance as the literal similarity of the candidate index and the competitiveness index.
In a possible implementation manner, the apparatus further includes, connected to the literal similarity calculation module 340:
the information output module is used for determining that no candidate index matching the competitiveness index exists among the candidate indexes if no literal similarity greater than the second preset threshold exists in the literal similarities.
In a possible implementation manner, the apparatus further includes, connected to the determining module 350:
the viewpoint acquisition module is used for acquiring viewpoint words corresponding to the competitiveness index in public opinion information;
and the association module is used for associating and storing the viewpoint words and the standard indexes matched with the competitiveness indexes.
It should be noted that, for the information interaction, execution process, and other contents between the above-mentioned devices/units, the specific functions and technical effects thereof are based on the same concept as those of the embodiment of the method of the present application, and specific reference may be made to the part of the embodiment of the method, which is not described herein again.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
An embodiment of the present application further provides a terminal device, and referring to fig. 8, the terminal device 400 may include: at least one processor 410, a memory 420, and a computer program stored in the memory 420 and executable on the at least one processor 410, wherein the processor 410 when executing the computer program implements the steps of any of the method embodiments described above, such as the steps S101 to S105 in the embodiment shown in fig. 2. Alternatively, the processor 410, when executing the computer program, implements the functions of the modules/units in the above-described device embodiments, such as the functions of the modules 310 to 350 shown in fig. 7.
Illustratively, a computer program may be partitioned into one or more modules/units, which are stored in the memory 420 and executed by the processor 410 to accomplish the present application. The one or more modules/units may be a series of computer program segments capable of performing specific functions, which are used to describe the execution of the computer program in the terminal device 400.
Those skilled in the art will appreciate that fig. 8 is merely an example of a terminal device and does not limit the terminal device, which may include more or fewer components than shown, may combine some components, or may include different components such as input/output devices, network access devices, buses, etc.
The Processor 410 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic, discrete hardware components, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 420 may be an internal storage unit of the terminal device, or may be an external storage device of the terminal device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like. The memory 420 is used for storing the computer programs and other programs and data required by the terminal device. The memory 420 may also be used to temporarily store data that has been output or is to be output.
The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended ISA (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, the buses in the figures of the present application are not limited to only one bus or one type of bus.
The public opinion analysis method provided by the embodiment of the application can be applied to terminal equipment such as a computer, a tablet computer, a notebook computer, a netbook, a Personal Digital Assistant (PDA) and the like, and the embodiment of the application does not limit the specific type of the terminal equipment at all.
The embodiment of the application also provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program can implement the steps in the embodiments of the public opinion analysis method.
The embodiment of the application provides a computer program product which, when run on a mobile terminal, enables the mobile terminal to implement the steps in the embodiments of the public opinion analysis method.
The integrated unit, if implemented in the form of a software functional unit and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the processes in the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium and can implement the steps of the embodiments of the methods described above when the computer program is executed by a processor. The computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include at least: any entity or device capable of carrying computer program code to a photographing apparatus/terminal apparatus, a recording medium, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium, such as a USB flash drive, a removable hard disk, a magnetic disk, or an optical disk. In certain jurisdictions, in accordance with legislation and patent practice, computer-readable media may not include electrical carrier signals or telecommunications signals.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/network device and method may be implemented in other ways. For example, the above-described apparatus/network device embodiments are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implementing, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not implemented. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. A public opinion analysis method is characterized by comprising the following steps:
acquiring a competitiveness index of public opinion information;
if the pre-stored multiple standard indexes do not comprise the competitiveness index, calculating the semantic similarity between the competitiveness index and each standard index;
when the semantic similarity is smaller than a first preset threshold, screening candidate indexes from the standard indexes based on the semantic similarity;
calculating literal similarity of the candidate index and the competitiveness index;
and if the literal similarity greater than a second preset threshold exists in the literal similarities, taking the candidate index corresponding to the maximum value in the literal similarities greater than the second preset threshold as the standard index matched with the competitiveness index.
2. The public opinion analysis method according to claim 1, wherein the calculating the semantic similarity between the competitiveness index and each standard index comprises:
acquiring a first word vector of the competitiveness index;
acquiring a second word vector of each standard index;
and calculating the cosine distance between the first word vector and the second word vector, and taking the cosine distance as the semantic similarity.
3. The public opinion analysis method according to claim 1, wherein after the calculating the semantic similarity between the competitiveness index and each standard index, the method further comprises:
if the semantic similarity greater than or equal to the first preset threshold exists in the semantic similarities, determining the maximum value of the semantic similarities greater than or equal to the first preset threshold;
and taking the standard index corresponding to the maximum value in the semantic similarity greater than or equal to the first preset threshold value as the standard index matched with the competitiveness index.
4. The public opinion analysis method according to claim 1, wherein the screening of candidate indexes from the standard indexes based on the semantic similarity includes:
determining whether semantic similarity within a preset range exists in the semantic similarity;
and if the semantic similarity in a preset range exists in the semantic similarities, taking a standard index corresponding to the semantic similarity in the preset range as the candidate index.
5. The public opinion analysis method according to claim 1, wherein the calculating the literal similarity of the candidate index and the competitiveness index comprises:
calculating an edit distance between the candidate index and the competitiveness index;
and taking the edit distance as the literal similarity of the candidate index and the competitiveness index.
6. The public opinion analysis method according to any one of claims 1 to 5, wherein after the calculating the literal similarity of the candidate index and the competitiveness index, the method comprises:
and if no literal similarity greater than the second preset threshold exists in the literal similarities, determining that no candidate index matching the competitiveness index exists among the candidate indexes.
7. The public opinion analysis method according to claim 1, wherein after taking the candidate index corresponding to the maximum value of the literal similarity greater than the second preset threshold as the standard index matching the competitiveness index, the method comprises:
obtaining viewpoint words corresponding to the competitive indexes in the public opinion information;
and storing the viewpoint words and the standard indexes matched with the competitiveness indexes in an associated mode.
8. A public opinion analysis device, characterized by comprising:
the index acquisition module is used for acquiring a competitiveness index of public opinion information;
the semantic similarity calculation module is used for calculating the semantic similarity between the competitiveness index and each standard index if the pre-stored standard indexes do not comprise the competitiveness index;
the screening module is used for screening candidate indexes from the standard indexes based on the semantic similarity when the semantic similarity is smaller than a first preset threshold;
the literal similarity calculation module is used for calculating the literal similarity of the candidate indexes and the competitiveness indexes;
and the judging module is used for taking the candidate index corresponding to the maximum value in the literal similarity greater than the second preset threshold value as the standard index matched with the competitiveness index if the literal similarity greater than the second preset threshold value exists in the literal similarity.
9. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor implements the public opinion analysis method according to any one of claims 1 to 7 when executing the computer program.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the public opinion analysis method according to any one of claims 1 to 7.
CN202110605199.0A 2021-05-31 2021-05-31 Public opinion analysis method and device, terminal equipment and storage medium Pending CN113204710A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110605199.0A CN113204710A (en) 2021-05-31 2021-05-31 Public opinion analysis method and device, terminal equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110605199.0A CN113204710A (en) 2021-05-31 2021-05-31 Public opinion analysis method and device, terminal equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113204710A true CN113204710A (en) 2021-08-03

Family

ID=77023840

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110605199.0A Pending CN113204710A (en) 2021-05-31 2021-05-31 Public opinion analysis method and device, terminal equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113204710A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination