CN111427999A - Theme real-time influence evaluation method and system for comprehensive integration discussion environment - Google Patents

Theme real-time influence evaluation method and system for comprehensive integration discussion environment

Publication number
CN111427999A
Authority
CN
China
Prior art keywords
theme
time
influence
message
speech
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010195669.6A
Other languages
Chinese (zh)
Other versions
CN111427999B (en)
Inventor
郑楠
王丹力
戴汝为
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science
Priority to CN202010195669.6A
Publication of CN111427999A
Application granted
Publication of CN111427999B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34 Browsing; Visualisation therefor
    • G06F16/345 Summarisation for human users
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations

Abstract

The invention relates to a real-time topic influence evaluation method and system for a comprehensive integrated discussion environment. The evaluation method comprises the following steps: generating the topics of the current discussion from the current speech content of the experts in the discussion hall; constructing a speech message network from the reply relations between expert speeches; calculating the influence of each speech message based on the speech message network; calculating the time-dependent influence of each topic from the influence of the speech messages and their release times; and determining the real-time evolution of the topics from the time-dependent influence of the topics. The invention can thus generate the topics of the current discussion from the current speech content, determine the influence of the current speech messages by constructing the speech message network, and determine the time-dependent influence and the real-time evolution of each topic from the message release times. The results are fed back to the participants to help them follow the changes and trends of the discussion topics and to improve the efficiency of the conference discussion.

Description

Theme real-time influence evaluation method and system for comprehensive integration discussion environment
Technical Field
The invention relates to the technical field of information processing, in particular to a theme real-time influence evaluation method and system for an integrated discussion environment.
Background
The comprehensive integrated research hall, which combines humans and machines and proceeds from the qualitative to the quantitative, is a methodology for handling open complex giant systems and related problems. It was proposed by the Chinese scientist Qian Xuesen and others. The method has advantages in handling major national decision problems and complex scientific research problems, and is a leading direction in the study of complexity science. The idea of the comprehensive integrated research hall system is to integrate people into the system and to follow a human-machine combined, human-led technical route, so that the strengths of both people and computers in information processing are fully exploited. People's experience-based processing capability is combined with the computer's fast and accurate processing capability, so that the key information needed to handle complex problems is obtained step by step and problems that neither people nor computers can solve alone can be solved.
The essence of the comprehensive integrated research hall system is to guide people, through human-machine combination and group discussion, to integrate the relevant experience, theory, knowledge, information and data as fully as possible when dealing with open complex giant systems. Through mutual stimulation among group members and collective processing of these resources, group intelligence emerges and a better understanding of the complex system is obtained. This raises individual wisdom to group wisdom through comprehensive integration and significantly enhances the operability of the methodology for open complex giant systems.
In the comprehensive integrated discussion environment, participants can discuss a problem qualitatively online through text, audio and video, and can also evaluate it quantitatively through questionnaires and votes. Relying on a single discussion mode leads to low discussion efficiency and scattered opinions, which makes it hard to follow how the discussion topics change, so the speech content needs to be organized and evaluated effectively.
Existing approaches to speech evaluation in the comprehensive integrated discussion environment mainly include the following:
Method one: calculate the authority of each speech opinion from the number of times each viewpoint is responded to. The more responses, the higher the authority of the speech; the fewer responses, the lower its authority (sunless, dugan, li dazong, the emergence of group wisdom in the integrated seminar system, system simulation bulletin, 2003,15(1), 146-.
Method two: reduce the dimensionality of the multidimensional data set formed by the experts' evaluation opinions on the research schemes and display the result in a low-dimensional data space, so that the classification of the expert group's opinions can be observed visually (Liuchunmi, Dynasty, namely, the visualization of the evaluation results of the expert groups of the research hall, mode identification and artificial intelligence, 2005,18(1), 6-11).
Method three: calculate the overall authority of an expert from the evaluation content of the expert's speeches (Li Ming flower, Li Guangdong, Zhao Mingchang, Wangchun, Dynasty, expert authority evaluation method and system based on network integrated discussion environment, invention patent No. CN 101312423A).
Method four: calculate the authority of an expert from the interaction structure among experts (Wang ai, Li Yandong, Li Virgi, CWME expert authority calculation method based on SemRank, computer application research 2010,27(7), 2441-.
Method one calculates the authority of each speech to obtain the authority of each expert's speeches; method two reduces the dimensionality of the speeches of multiple experts to classify their opinions; method three calculates the authority of an expert from the speech content; and method four calculates the authority of an expert from the interaction structure of the speeches. When there are many speeches and the experts' attention is scattered across problems, none of these methods can capture the shift in the experts' focus in real time.
In addition, during a discussion the users concentrate mainly on the discussion itself, which proceeds under considerable time pressure and places a heavy workload on the participants. They either have no time to search for related speeches, or search so frequently that the discussion is disrupted, which hinders the generation and advancement of knowledge during the discussion.
Disclosure of Invention
To solve the above problems in the prior art, that is, to help participants clearly follow the changes and trends of the discussion topics and to improve conference discussion efficiency, the present invention provides a real-time topic influence evaluation method and system for a comprehensive integrated discussion environment.
In order to solve the technical problems, the invention provides the following scheme:
a theme real-time influence evaluation method oriented to a comprehensive integration discussion environment comprises the following steps:
generating a theme of the current discussion according to the current speech content of experts in the discussion hall;
constructing a speech message network according to the reply relation between expert speeches;
calculating the influence of the speech message based on the speech message network;
calculating the time effect influence of the theme according to the influence of the speaking message and the release time of the message;
and determining the real-time evolution condition of the theme according to the timeliness of the influence of the theme.
Optionally, the generating a topic of the current discussion according to the current speech content of the expert in the discussion hall specifically includes:
preprocessing the current speech content to obtain a preprocessed text;
dividing the preprocessed text into N sections according to speaking time to obtain N sub-texts;
and processing each sub-text by adopting a theme generation model to generate a theme corresponding to the sub-text.
Optionally, the preprocessing the current speech content to obtain a preprocessed text specifically includes:
and segmenting words, removing stop words and useless symbols from the current speech content to obtain a preprocessed text.
Optionally, processing each sub-text by using a topic generation model to generate a topic corresponding to the sub-text, which specifically includes:
mapping the speaking message in the sub-text to a corresponding theme through a vocabulary entry;
calculating the word frequency of each theme by adopting a three-layer Bayes probability model;
and determining the theme of the sub-text according to the word frequency of each theme.
Optionally, the word frequency includes: the entry probability $P(z_j \mid w_i)$ of topic $z_j$ given entry $w_i$, and the message probability $P(z_j \mid d_m)$ of topic $z_j$ given speech message $d_m$;
Calculating the word frequency of each topic according to the following formula:
Figure BDA0002417502780000041
Figure BDA0002417502780000042
Figure BDA0002417502780000043
Figure BDA0002417502780000044
Figure BDA0002417502780000045
Figure BDA0002417502780000046
Figure BDA0002417502780000051
Figure BDA0002417502780000052
wherein $|z_j|$ is the number of speeches on topic $z_j$ in the speech set, and $d_m$ is a speech message; $P(z_i = j)$ is the probability that the $j$-th topic belongs to the current speech, and $P(w_i \mid z_i = j)$ is the probability that entry $w_i$ belongs to topic $j$;
let $\phi^{(j)} = P(w_i \mid z_i = j)$ denote the multinomial distribution of topic $j$ over the entries $w_i$, and let $\theta^{(i)} = P(z)$ denote the multinomial distribution of speech $d$ over the topics; the parameters $\phi$ and $\theta$ represent the association between entries and topics and between topics and speeches, respectively; $T$ denotes the number of topics, and $C^{WT}$ and $C^{DT}$ are count matrices of dimensions $W \times T$ and $D \times T$, respectively,
Figure BDA0002417502780000053
denotes the count of entries assigned to topic $j$, excluding the current entry $w_i$,
Figure BDA0002417502780000054
denotes the count of entries in speech $d$ assigned to topic $j$, excluding the current entry $w_i$,
Figure BDA0002417502780000055
denotes the count of entries assigned to topic $j$, excluding the current entry $w$,
Figure BDA0002417502780000056
denotes the count of entries in speech $d$ assigned to topic $t$, excluding the current entry $w_i$,
Figure BDA0002417502780000057
denotes the count of entries assigned to topic $j$, excluding the current entry $i$,
Figure BDA0002417502780000058
denotes the count of entries assigned to topic $j$, excluding the current entry $k$; $W$ is the number of entries, and $D$ is the number of speeches.
Optionally, the determining the topic of the sub-text according to the word frequency of each topic specifically includes:
comparing the entry probability $P(z_j \mid w_i)$ with a set threshold $TH_j$, and the message probability $P(z_j \mid d_m)$ with the set threshold $TH_j$, respectively;
selecting the entries whose entry probability $P(z_j \mid w_i)$ exceeds the set threshold $TH_j$ and the speech messages whose message probability $P(z_j \mid d_m)$ exceeds the set threshold $TH_j$;
and determining the topic of the sub-text from the selected entries and speech messages.
Optionally, the calculating, based on the utterance message network, an influence of the utterance message specifically includes:
calculating the quantity element of the speech through in-degree feature analysis of the social network:
Figure BDA0002417502780000061
wherein the speech message network $G_n$ is a directed weighted graph, $G_n = (V_n, E_n, W_n)$; the node set $V_n$ represents the set of messages; the edge set $E_n$ represents the reply relations between the experts; the weight set $W_n$ gives the reply frequency within time $t_n$; $d_u$ and $d_v$ denote speech messages, and $W_n(d_v, d_u)$ is the reply frequency between speech message $d_v$ and speech message $d_u$ within time $t_n$;
calculating the range element of the speech through out-degree feature analysis of the social network:
Figure BDA0002417502780000062
wherein $I(d_v, d_u)$ is an indicator function that indicates whether speech message $d_v$ and speech message $d_u$ are associated;
determining the influence $S_n(d_u)$ of the current speech message $d_u$ from the quantity element and the range element of the speech:
Figure BDA0002417502780000063
wherein $N_w$ and $N_I$ are normalization constants, and the smoothing factor $\alpha$ determines the weights of the quantity element and the range element in the influence $S_n(d_u)$ of the current speech message.
Optionally, the calculating the time-dependent influence of the topic according to the influence of the speech message and the issue time of the message specifically includes:
determining, from the selected speech messages, the message list of topic $z_j$ within time period $t_n$:
Figure BDA0002417502780000071
wherein
Figure BDA0002417502780000072
denotes a selected speech message, and $n$ denotes the index of the time period, $n = 1, 2, \ldots, N$;
calculating the time-dependent influence $S_n(z_j)$ of topic $z_j$ from the selected speech messages:
Figure BDA0002417502780000073
Figure BDA0002417502780000074
wherein $a_n$ is the timeliness weight, which depends on the publication time period $t_n$; $\lambda$ is a preset parameter with $0 \le \lambda \le 1$; and
Figure BDA0002417502780000075
is the influence of the speech message
Figure BDA0002417502780000076.
Optionally, the determining the real-time evolution direction of the theme according to the timeliness of the influence of the theme specifically includes:
calculating the similarity $\mathrm{sim}(z_j, z_{j+1})$ of the two time-series topics of any adjacent time periods:
Figure BDA0002417502780000077
wherein $z_j$ and $z_{j+1}$ are the time-series topics of two adjacent time periods, with start times satisfying $s(z_j) < s(z_{j+1})$; $P(w_i \mid z_j)$ is the probability that entry $w_i$ belongs to topic $z_j$;
comparing the similarity $\mathrm{sim}(z_j, z_{j+1})$ with the preset similarity threshold:
when $\mathrm{sim}(z_j, z_{j+1})$ exceeds the preset similarity threshold, it is determined that a time-series topic transition occurs between $z_j$ and $z_{j+1}$ and that $z_{j+1}$ evolved from $z_j$, denoted
Figure BDA0002417502780000081
when $\mathrm{sim}(z_j, z_{j+1})$ is less than or equal to the preset similarity threshold, it is determined that $z_j$ and $z_{j+1}$ are the same time-series topic;
and combining the detected time-series topic transitions along the time direction to obtain the topic evolution network.
In order to solve the technical problems, the invention also provides the following scheme:
a theme real-time influence appraisal system oriented to an integrated research environment, the appraisal system comprising:
the generating unit is used for generating a current discussion theme according to the current speech content of experts in the discussion hall;
the construction unit is used for constructing a speech message network according to the reply relation between expert speeches;
a first calculation unit, configured to calculate an influence of the utterance message based on the utterance message network;
the second calculation unit is used for calculating the time effect influence of the theme according to the influence of the speaking message and the release time of the message;
and the determining unit is used for determining the real-time evolution condition of the theme according to the timeliness of the influence of the theme.
According to the embodiment of the invention, the invention discloses the following technical effects:
the invention can generate the topics of the current discussion from the current speech content, determine the influence of the current speech messages by constructing the speech message network, and then determine the time-dependent influence and the real-time evolution of each topic from the message release times; the results are fed back to the participants to help them clearly follow the changes and trends of the discussion topics and to improve conference discussion efficiency.
Drawings
FIG. 1 is a flow chart of the subject real-time influence evaluation method oriented to the comprehensive integration discussion environment of the present invention;
FIG. 2 is a schematic block diagram of a real-time influence evaluation system for a subject oriented to an integrated research environment according to the present invention.
Description of the symbols:
the system comprises a generating unit-1, a constructing unit-2, a first calculating unit-3, a second calculating unit-4 and a determining unit-5.
Detailed Description
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood by those skilled in the art that these embodiments are only for explaining the technical principle of the present invention, and are not intended to limit the scope of the present invention.
The invention aims to provide a real-time topic influence evaluation method for a comprehensive integrated discussion environment. The method can generate the topics of the current discussion from the current speech content, determine the influence of the current speech messages by constructing a speech message network, and then determine the time-dependent influence and the real-time evolution of each topic from the message release times. The results are fed back to the participants to help them clearly follow the changes and trends of the discussion topics and to improve conference discussion efficiency.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
As shown in fig. 1, the method for evaluating the real-time influence of the theme oriented to the comprehensive integrated research environment of the present invention comprises:
step 100: generating a theme of the current discussion according to the current speech content of experts in the discussion hall;
step 200: constructing a speech message network according to the reply relation between expert speeches;
step 300: calculating the influence of the speech message based on the speech message network;
step 400: calculating the time effect influence of the theme according to the influence of the speaking message and the release time of the message;
step 500: and determining the real-time evolution condition of the theme according to the timeliness of the influence of the theme.
In step 100, the generating a topic of the current discussion according to the current speech content of the expert in the discussion hall specifically includes:
Step 101: preprocessing the current speech content to obtain a preprocessed text.
In this embodiment, word segmentation, stop-word removal, removal of useless symbols and similar operations may be applied to the current speech content to obtain the preprocessed text.
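The preprocessing in step 101 can be illustrated with a minimal sketch. The stop-word list and the function name `preprocess` are hypothetical, and a regular-expression split stands in for a real word segmenter (Chinese text would need a dedicated segmentation tool, which the source does not name).

```python
import re

# Hypothetical stop-word list; a real system would load a full list for its language.
STOP_WORDS = {"the", "a", "an", "of", "is", "to", "and", "in"}


def preprocess(utterance: str) -> list[str]:
    """Segment an utterance into words, then drop stop words and useless symbols."""
    # Keeping only alphanumeric runs discards punctuation and other symbols.
    # This stands in for a real word segmenter (an assumption of this sketch).
    tokens = re.findall(r"[A-Za-z0-9]+", utterance.lower())
    return [t for t in tokens if t not in STOP_WORDS]


print(preprocess("The cost of the plan is high!!"))
```

The returned token list is what the later steps treat as the entries of a speech message.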
Step 102: dividing the preprocessed text into N segments according to speech time to obtain N sub-texts. That is, the speech content is divided by speech time into $N$ time periods $(t_1, \ldots, t_n, \ldots, t_N)$, where $n$ denotes the index of the time period, $n = 1, 2, \ldots, N$.
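A possible way to carry out the division in step 102 is sketched below. The source does not say how the period boundaries are chosen, so equal-width periods are an assumption here, as are the function name `split_by_time` and the `(timestamp, text)` input format.

```python
def split_by_time(messages, n_periods):
    """Partition (timestamp, text) pairs into n_periods equal-width time periods."""
    if not messages:
        return [[] for _ in range(n_periods)]
    start = min(t for t, _ in messages)
    end = max(t for t, _ in messages)
    # Fall back to width 1.0 when all timestamps coincide (avoids division by zero).
    width = (end - start) / n_periods or 1.0
    periods = [[] for _ in range(n_periods)]
    for t, text in sorted(messages):
        # The latest message would fall just past the last bucket, so clamp it.
        idx = min(int((t - start) / width), n_periods - 1)
        periods[idx].append(text)
    return periods


sub_texts = split_by_time([(0, "a"), (5, "b"), (10, "c"), (3, "d")], 2)
print(sub_texts)
```

Each inner list plays the role of one sub-text that is later handed to the topic generation model.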
Step 103: processing each sub-text with a topic generation model to generate the topic corresponding to the sub-text.
If an entry is a term specific to a certain topic, the topic can easily be inferred from that entry. Such terms typically occur often in speeches on one topic and rarely in speeches on other topics. For the preprocessed sub-texts, the terms belonging to a topic can therefore be obtained by calculating the distribution of the entries over the different topics.
Learning topics from text can be viewed as a predictive process, and a probabilistic topic model is a generative model for text. A topic generation model is used to extract the topic of each speech; here the topic generation model is the latent Dirichlet allocation (LDA) model. Its basic idea is that an expert speech is a mixture of different topics, and each topic is represented as a probability distribution over topic words.
Then, in step 103, processing each sub-text by using a topic generation model to generate a topic corresponding to the sub-text, which specifically includes:
Step 1031: mapping the speech messages in the sub-text to the corresponding topics through the entries.
Step 1032: calculating the word frequency of each topic with a three-layer Bayesian probability model.
Wherein the word frequency comprises: the entry probability $P(z_j \mid w_i)$ of topic $z_j$ given entry $w_i$, and the message probability $P(z_j \mid d_m)$ of topic $z_j$ given speech message $d_m$.
Calculating the word frequency of each topic according to the following formula:
Figure BDA0002417502780000111
Figure BDA0002417502780000112
Figure BDA0002417502780000113
Figure BDA0002417502780000114
Figure BDA0002417502780000115
Figure BDA0002417502780000121
Figure BDA0002417502780000122
Figure BDA0002417502780000123
wherein $|z_j|$ is the number of speeches on topic $z_j$ in the speech set, and $d_m$ is a speech message; $P(z_i = j)$ is the probability that the $j$-th topic belongs to the current speech, and $P(w_i \mid z_i = j)$ is the probability that entry $w_i$ belongs to topic $j$.
Let $\phi^{(j)} = P(w_i \mid z_i = j)$ denote the multinomial distribution of topic $j$ over the entries $w_i$, and let $\theta^{(i)} = P(z)$ denote the multinomial distribution of speech $d$ over the topics; the parameters $\phi$ and $\theta$ represent the association between entries and topics and between topics and speeches, respectively; $T$ denotes the number of topics, and $C^{WT}$ and $C^{DT}$ are count matrices of dimensions $W \times T$ and $D \times T$, respectively,
Figure BDA0002417502780000124
denotes the count of entries assigned to topic $j$, excluding the current entry $w_i$,
Figure BDA0002417502780000125
denotes the count of entries in speech $d$ assigned to topic $j$, excluding the current entry $w_i$,
Figure BDA0002417502780000126
denotes the count of entries assigned to topic $j$, excluding the current entry $w$,
Figure BDA0002417502780000127
denotes the count of entries in speech $d$ assigned to topic $t$, excluding the current entry $w_i$,
Figure BDA0002417502780000128
denotes the count of entries assigned to topic $j$, excluding the current entry $i$,
Figure BDA0002417502780000129
denotes the count of entries assigned to topic $j$, excluding the current entry $k$; $W$ is the number of entries, and $D$ is the number of speeches.
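The count matrices described above, with the current entry excluded from the counts, match the bookkeeping of collapsed Gibbs sampling for LDA. The equation images are not available here, so the sketch below uses the textbook form of that sampler rather than the patent's exact formulas. The function name `lda_gibbs`, the hyperparameters `alpha` and `beta`, and the integer-id encoding of entries are all assumptions.

```python
import random


def lda_gibbs(docs, n_topics, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA over docs given as lists of entry ids."""
    rng = random.Random(seed)
    W = 1 + max(w for doc in docs for w in doc)  # number of distinct entries
    D, T = len(docs), n_topics
    cwt = [[0] * T for _ in range(W)]  # C^WT: entry-by-topic counts
    cdt = [[0] * T for _ in range(D)]  # C^DT: speech-by-topic counts
    ct = [0] * T                       # total entries assigned to each topic
    z = []                             # topic assignment of every token
    for d, doc in enumerate(docs):
        z.append([])
        for w in doc:
            j = rng.randrange(T)       # random initial assignment
            z[d].append(j)
            cwt[w][j] += 1
            cdt[d][j] += 1
            ct[j] += 1
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                j = z[d][i]
                # Remove the current token so that all counts exclude it.
                cwt[w][j] -= 1
                cdt[d][j] -= 1
                ct[j] -= 1
                weights = [
                    (cwt[w][t] + beta) / (ct[t] + W * beta) * (cdt[d][t] + alpha)
                    for t in range(T)
                ]
                j = rng.choices(range(T), weights=weights)[0]
                z[d][i] = j
                cwt[w][j] += 1
                cdt[d][j] += 1
                ct[j] += 1
    # phi[t][w] estimates P(w | z = t); theta[d][t] estimates P(z = t | d).
    phi = [[(cwt[w][t] + beta) / (ct[t] + W * beta) for w in range(W)]
           for t in range(T)]
    theta = [[(cdt[d][t] + alpha) / (len(docs[d]) + T * alpha) for t in range(T)]
             for d in range(D)]
    return phi, theta


phi, theta = lda_gibbs([[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 1]], n_topics=2, n_iters=50)
```

Here `theta[d]` corresponds to the message probabilities $P(z_j \mid d_m)$; the entry probabilities $P(z_j \mid w_i)$ would follow from `phi` and the topic proportions by Bayes' rule.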
Step 1033: and determining the theme of the sub-text according to the word frequency of each theme. The method specifically comprises the following steps:
Step 1033a: comparing the entry probability $P(z_j \mid w_i)$ with a set threshold $TH_j$, and the message probability $P(z_j \mid d_m)$ with the set threshold $TH_j$, respectively.
Step 1033b: selecting the entries whose entry probability $P(z_j \mid w_i)$ exceeds the set threshold $TH_j$ and the speech messages whose message probability $P(z_j \mid d_m)$ exceeds the set threshold $TH_j$, i.e. $P(z_j \mid w_i) > TH_j$ and $P(z_j \mid d_m) > TH_j$.
Step 1033c: determining the topic of the sub-text from the selected entries and speech messages.
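The threshold selection in steps 1033a and 1033b amounts to a simple filter. In the sketch below the function name `select_above_threshold` and the example probabilities are illustrative assumptions, not values from the source.

```python
def select_above_threshold(probabilities, threshold):
    """Return the keys whose probability for the topic exceeds the threshold."""
    return [key for key, p in probabilities.items() if p > threshold]


# Hypothetical probabilities P(z_j | w_i) and P(z_j | d_m) for one topic z_j.
entry_probs = {"budget": 0.62, "schedule": 0.08, "risk": 0.41}
message_probs = {"m1": 0.75, "m2": 0.20}
th_j = 0.3

selected_entries = select_above_threshold(entry_probs, th_j)
selected_messages = select_above_threshold(message_probs, th_j)
print(selected_entries, selected_messages)
```

The selected entries describe the topic, and the selected messages form the topic's message list used in the later influence calculation.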
In step 200, the speech message network is constructed within time $t_n$ according to the reply relations between expert speeches. The speech message network is a directed weighted graph $G_n = (V_n, E_n, W_n)$. The node set $V_n$ represents the set of messages; the edge set $E_n$ represents the reply relations between the experts; the weight set $W_n$ records the reply frequency within time $t_n$.
To record how the reply relation network evolves over time, the discussion is divided, as in the preprocessing, into $N$ time periods $(t_1, \ldots, t_n, \ldots, t_N)$, and the evolution of the speech message network is represented by the network sequence $G = \{G_1, G_2, \ldots, G_T\}$. It is assumed here that the message set $V$ in $G$ is unchanged, while the edge set $E$ and the weight set $W$ evolve over time.
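One period's network $G_n = (V_n, E_n, W_n)$ could be assembled as in the following sketch. The input format, a list of `(replying_id, replied_to_id)` pairs observed in one time period, and the function name `build_reply_network` are assumptions made for illustration.

```python
from collections import defaultdict


def build_reply_network(replies):
    """Build (V_n, E_n, W_n) from (replying_id, replied_to_id) pairs of one period."""
    nodes = set()
    weights = defaultdict(int)  # W_n: (d_v, d_u) -> reply frequency
    for d_v, d_u in replies:
        nodes.update((d_v, d_u))
        weights[(d_v, d_u)] += 1
    edges = set(weights)        # E_n: one directed edge per distinct reply pair
    return nodes, edges, dict(weights)


v_n, e_n, w_n = build_reply_network([("m2", "m1"), ("m3", "m1"), ("m2", "m1")])
print(w_n)
```

Calling the function once per time period yields the network sequence whose edge and weight sets change over time.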
In addition, the invention uses social network analysis to calculate the influence of the speech messages. Social network analysis draws on sociological theory to study the characteristics of the interactive behavior of groups in a network. It offers many indices for mining the structural characteristics of a network, of which the two most basic and important are the in-degree and the out-degree of a node.
In a network built on the reply relations between expert speeches, the in-degree of a node is the number of messages that reply to the speech, which reflects to some extent how much attention the speech receives; the out-degree of a node is the number of messages to which the speech replies, which reflects how active the speech is. Analyzing the structural characteristics of the network through in-degree and out-degree therefore makes it possible to examine the interaction pattern of the speech relations in depth.
Further, in step 300, the calculating, based on the talk message network, the influence of the talk message specifically includes:
Step 301: calculating the quantity element of the speech (that is, measuring how much attention the speech message receives) through in-degree feature analysis of the social network:
Figure BDA0002417502780000141
wherein the weight set $W_n$ gives the reply frequency within time $t_n$; $d_u$ and $d_v$ denote speech messages, and $W_n(d_v, d_u)$ is the reply frequency between speech message $d_v$ and speech message $d_u$ within time $t_n$.
Step 302: calculating the range element of the speech (that is, measuring the quality of the current message) through out-degree feature analysis of the social network:
Figure BDA0002417502780000142
wherein $I(d_v, d_u)$ is an indicator function that indicates whether speech message $d_v$ and speech message $d_u$ are associated.
Step 303: determining the influence $S_n(d_u)$ of the current speech message $d_u$ from the quantity element and the range element of the speech:
Figure BDA0002417502780000143
wherein $N_w$ and $N_I$ are normalization constants, and the smoothing factor $\alpha$ determines the weights of the quantity element and the range element in the influence $S_n(d_u)$ of the current speech message.
Taking both the attention a message receives and its quality into account, the smoothing factor $\alpha$ is introduced to combine the quantity element and the range element linearly, so that the influence of the current speech message is determined accurately.
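Since the equation images for steps 301 to 303 are not available, the sketch below only illustrates the linear combination described in the text. It assumes the quantity element is the weighted in-degree, the range element is the out-degree counted through the indicator function, and the normalization constants are the network totals. The function name `message_influence` and these specific forms are assumptions, not the patent's confirmed formulas.

```python
def message_influence(weights, alpha=0.5):
    """Combine weighted in-degree and out-degree into one score per message.

    weights maps (d_v, d_u) -> reply frequency, meaning d_v replied to d_u.
    """
    nodes = {d for pair in weights for d in pair}
    quantity = {d: 0.0 for d in nodes}  # weighted in-degree (quantity element)
    scope = {d: 0.0 for d in nodes}     # out-degree via indicator (range element)
    for (d_v, d_u), w in weights.items():
        quantity[d_u] += w
        scope[d_v] += 1.0 if w > 0 else 0.0
    # Normalization constants: assumed here to be the totals, guarded against zero.
    n_w = sum(quantity.values()) or 1.0
    n_i = sum(scope.values()) or 1.0
    return {
        d: alpha * quantity[d] / n_w + (1 - alpha) * scope[d] / n_i
        for d in nodes
    }


scores = message_influence({("m2", "m1"): 2, ("m3", "m1"): 1}, alpha=0.5)
print(scores)
```

With `alpha` close to 1 the score is driven by how often a message is replied to; with `alpha` close to 0 it is driven by how many messages it replies to.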
Optionally, in step 400, the calculating the time-dependent influence of the topic according to the influence of the speech message and the issue time of the message specifically includes:
Step 401: determining, from the selected speech messages, the message list of topic $z_j$ within time period $t_n$:
Figure BDA0002417502780000151
wherein
Figure BDA0002417502780000152
denotes a selected speech message;
Step 402: calculating the time-dependent influence $S_n(z_j)$ of topic $z_j$ from the selected speech messages:
Figure BDA0002417502780000153
Figure BDA0002417502780000154
wherein $a_n$ is the timeliness weight, which depends on the publication time period $t_n$; $\lambda$ is a preset parameter with $0 \le \lambda \le 1$; and
Figure BDA0002417502780000155
is the influence of the speech message
Figure BDA0002417502780000156.
Considering the timeliness of discussion behavior, exponential smoothing is adopted: using exponential decay with a base less than 1, speech messages that are further back in time in the discussion sequence are given a lower timeliness weight, and more recent speech messages are given a higher timeliness weight.
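The exponential decay described above can be sketched as follows. The exact expressions for $S_n(z_j)$ and $a_n$ are in the missing equation images, so the weight form `lam ** age`, the function name `timeliness_influence` and the input layout are assumptions chosen only to show the behavior the text describes.

```python
def timeliness_influence(period_scores, lam=0.8):
    """Decay-weighted sum of per-period topic scores, most recent period last.

    period_scores[n] lists the influence values of the selected messages of
    one topic in period n. The weight lam ** age is an assumed form.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    latest = len(period_scores) - 1
    total = 0.0
    for n, scores in enumerate(period_scores):
        age = latest - n  # 0 for the current period, larger for older periods
        total += (lam ** age) * sum(scores)
    return total


print(timeliness_influence([[1.0], [0.5, 0.5], [2.0]], lam=0.5))
```

A smaller `lam` makes old messages fade faster, while `lam = 1` weights all periods equally.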
Further, define: a time-series topic $\gamma$ is a topic with a certain time span, written $\gamma = \langle z, s(\gamma), t(\gamma) \rangle$, where $s(\gamma)$ and $t(\gamma)$ are the start time and the end time of $\gamma$, respectively.
Let $\gamma_1 = \langle z_1, s(\gamma_1), t(\gamma_1) \rangle$ and $\gamma_2 = \langle z_2, s(\gamma_2), t(\gamma_2) \rangle$ be two time-series topics, where $t(\gamma_1) < s(\gamma_2)$. Let $\mathrm{sim}(\gamma_1, \gamma_2)$ be the similarity of $\gamma_1$ and $\gamma_2$, and let a similarity threshold be preset. When $\mathrm{sim}(\gamma_1, \gamma_2)$ exceeds the threshold, a time-series topic transition is said to occur between $\gamma_1$ and $\gamma_2$, and $\gamma_2$ is said to have evolved from $\gamma_1$, denoted
Figure BDA0002417502780000157
To detect the evolution relations between different time-series topics in adjacent time periods, cosine similarity is used to measure how similar two time-series topics are.
Specifically, in step 500, the determining the real-time evolution direction of the topic according to the timeliness of the influence of the topic includes:
Step 501: calculating the similarity sim(z_j, z_{j+1}) of two time-sequence themes in any adjacent time periods:
Figure BDA0002417502780000161
Wherein z_j and z_{j+1} are two time-sequence themes of adjacent time periods whose start times satisfy s(z_j) < s(z_{j+1}); P(w_i|z_j) is the probability that entry w_i belongs to topic z_j.
Step 502: comparing the similarity sim(z_j, z_{j+1}) with the preset similarity threshold:
when sim(z_j, z_{j+1}) is greater than the preset similarity threshold, determining that a time-sequence theme transition occurs between z_j and z_{j+1}, and that z_{j+1} has evolved from z_j, denoted as
Figure BDA0002417502780000162
When sim(z_j, z_{j+1}) is less than or equal to the preset similarity threshold, determining that z_j and z_{j+1} are the same time-sequence theme.
Step 503: combining the detected time-sequence theme transitions along the time direction to obtain the theme evolution network.
Let p = <γ_1, γ_2, ..., γ_N> be a sequence of time-sequence themes in which a theme transition relation exists between every pair of adjacent themes, i.e.
Figure BDA0002417502780000163
Then p is defined as a topic evolution model. Combining the theme transition relations of all time periods along the time axis yields the complete topic evolution network.
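Steps 501 to 503 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the patented formula: each topic is represented by a dictionary from entries w_i to P(w_i|z), the standard cosine similarity is applied to these vectors, and the names `cosine_similarity` and `evolution_edges` as well as the tuple layout of the returned edges are hypothetical.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity of two topics given as {entry: P(entry | topic)}."""
    dot = sum(p[w] * q[w] for w in p.keys() & q.keys())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    if norm_p == 0.0 or norm_q == 0.0:
        return 0.0
    return dot / (norm_p * norm_q)


def evolution_edges(periods, threshold):
    """Collect theme transitions between adjacent time periods.

    periods[n] maps topic identifiers in period t_n to their entry
    distributions. An edge is recorded when the similarity exceeds the
    preset threshold, following the rule stated in step 502.
    """
    edges = []
    for n in range(len(periods) - 1):
        for z_a, p in periods[n].items():
            for z_b, q in periods[n + 1].items():
                sim = cosine_similarity(p, q)
                if sim > threshold:
                    edges.append(((n, z_a), (n + 1, z_b), sim))
    return edges
```

The returned edge list, ordered along the time axis, is one possible in-memory form of the evolution network; chaining edges that share an endpoint gives a sequence p = <γ_1, ..., γ_N>.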
When a specific problem is discussed in the comprehensive integrated discussion hall environment, the invention lets participants observe, in real time, the themes under discussion and their trends, provides good conditions for interaction among participants, and greatly improves the discussion efficiency and decision-making efficiency of the meeting. The comprehensive integrated discussion environment can be extended to any online conference room or electronic conference room system.
In addition, the invention also provides a theme real-time influence evaluation system oriented to the comprehensive integrated discussion environment, which helps participants clearly follow the changes and trends of the discussion themes and improves the efficiency of conference discussion.
As shown in fig. 2, the theme real-time influence evaluation system oriented to the comprehensive integrated discussion environment of the present invention includes a generating unit 1, a constructing unit 2, a first calculating unit 3, a second calculating unit 4, and a determining unit 5.
The generating unit 1 is used for generating a current discussion theme according to the current speaking content of experts in a discussion hall;
the construction unit 2 is used for constructing a speech message network according to the reply relation between expert speeches;
the first calculating unit 3 is used for calculating the influence of the speaking message based on the speaking message network;
the second calculating unit 4 is configured to calculate the time-dependent influence of the topic according to the influence of the speech message and the issue time of the message;
the determining unit 5 is configured to determine a real-time evolution situation of the theme according to timeliness of the influence of the theme.
Compared with the prior art, the subject real-time influence evaluating system facing the comprehensive integrated discussion environment has the same beneficial effects as the subject real-time influence evaluating method facing the comprehensive integrated discussion environment, and is not repeated herein.
So far, the technical solutions of the present invention have been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of the present invention is obviously not limited to these specific embodiments. Equivalent changes or substitutions of related technical features can be made by those skilled in the art without departing from the principle of the invention, and the technical scheme after the changes or substitutions can fall into the protection scope of the invention.

Claims (10)

1. A theme real-time influence evaluation method oriented to a comprehensive integrated discussion environment is characterized by comprising the following steps:
generating a theme of the current discussion according to the current speech content of experts in the discussion hall;
constructing a speech message network according to the reply relation between expert speeches;
calculating the influence of the speech message based on the speech message network;
calculating the time effect influence of the theme according to the influence of the speaking message and the release time of the message;
and determining the real-time evolution condition of the theme according to the timeliness of the influence of the theme.
2. The theme real-time influence evaluation method oriented to the comprehensive integrated discussion environment according to claim 1, wherein the generating of the theme of the current discussion according to the current speech content of the experts in the discussion hall specifically comprises:
preprocessing the current speech content to obtain a preprocessed text;
dividing the preprocessed text into N sections according to speaking time to obtain N sub-texts;
and processing each sub-text by adopting a theme generation model to generate a theme corresponding to the sub-text.
3. The theme real-time influence evaluation method oriented to the comprehensive integrated discussion environment according to claim 2, wherein the preprocessing of the current speech content to obtain a preprocessed text specifically comprises:
and segmenting words, removing stop words and useless symbols from the current speech content to obtain a preprocessed text.
4. The method for evaluating the real-time influence of the topics facing the comprehensive integrated discussion environment according to claim 2 or 3, wherein the generating of the topics includes:
mapping the speaking message in the sub-text to a corresponding theme through a vocabulary entry;
calculating the word frequency of each theme by adopting a three-layer Bayes probability model;
and determining the theme of the sub-text according to the word frequency of each theme.
5. The theme real-time influence evaluation method oriented to the comprehensive integrated discussion environment according to claim 4, wherein the word frequency comprises: the entry probability P(z_j|w_i) of entry w_i for topic z_j, and the message probability P(z_j|d_m) of topic z_j for speech message d_m;
Calculating the word frequency of each topic according to the following formula:
Figure FDA0002417502770000021
Figure FDA0002417502770000022
Figure FDA0002417502770000023
Figure FDA0002417502770000024
Figure FDA0002417502770000025
Figure FDA0002417502770000026
Figure FDA0002417502770000027
Figure FDA0002417502770000028
wherein |z_j| is the number of utterances belonging to topic z_j in the set of utterances, and d_m is a speech message; P(z_i = j) is the probability that the j-th topic belongs to the current utterance, and P(w_i | z_i = j) is the probability that entry w_i belongs to topic j;
φ(j) = P(w_i | z_i = j) denotes the multinomial distribution of topic j over the entries w_i, and θ(i) = P(z) denotes the multinomial distribution of utterance d over the topics; the parameters φ and θ represent the association between entries and topics and between topics and utterances, respectively; T denotes the number of topics, and C^WT and C^DT are count matrices of dimensions W × T and D × T, respectively,
Figure FDA0002417502770000031
denotes the count of entries assigned to topic j, excluding the current entry w_i,
Figure FDA0002417502770000032
denotes the count of entries in utterance d assigned to topic j, excluding the current entry w_i,
Figure FDA0002417502770000033
denotes the count of entries assigned to topic j, excluding the current entry w,
Figure FDA0002417502770000034
denotes the count of entries in utterance d assigned to topic t, excluding the current entry w_i,
Figure FDA0002417502770000035
denotes the count of entries assigned to topic j, excluding the current entry i,
Figure FDA0002417502770000036
denotes the count of entries assigned to topic j, excluding the current entry k; W is the number of entries, and D is the number of utterances.
6. The method for evaluating the real-time influence of the topics facing the comprehensive integrated discussion environment as claimed in claim 5, wherein the determining the topics of the sub-text according to the word frequency of each topic specifically comprises:
comparing the entry probability P(z_j|w_i) with a set threshold TH_j, and the message probability P(z_j|d_m) with the set threshold TH_j, respectively;
selecting the entries whose entry probability P(z_j|w_i) is greater than the set threshold TH_j and the speech messages whose message probability P(z_j|d_m) is greater than the set threshold TH_j;
and determining the theme of the sub-text according to the selected entry and the speaking message.
7. The theme real-time influence evaluation method oriented to the comprehensive integrated discussion environment according to claim 1, wherein the calculating of the influence of the speech message based on the speech message network specifically comprises:
calculating the quantity element of the utterance through the in-degree feature analysis method of social networks:
Figure FDA0002417502770000041
wherein the speech message network G_n is a directed weighted graph, G_n = (V_n, E_n, W_n); the node set V_n represents the set of messages; the edge set E_n represents the reply relations between experts; the weight set W_n represents the reply frequency within time t_n; d_u and d_v represent speech messages; and W_n(d_v, d_u) represents the reply frequency between speech message d_v and speech message d_u within time t_n;
calculating the range element of the utterance through the out-degree feature analysis method of social networks:
Figure FDA0002417502770000042
wherein I(d_v, d_u) is an indicator function representing whether an association exists between speech message d_v and speech message d_u;
determining the influence S_n(d_u) of the current speech message d_u according to the quantity element and the range element of the utterance:
Figure FDA0002417502770000043
Wherein N_w and N_I are normalization constants, and the smoothing factor α determines the weights that the quantity element and the range element occupy in the influence S_n(d_u) of the current speech message.
8. The theme real-time influence evaluation method oriented to the comprehensive integrated discussion environment according to claim 6, wherein the calculating of the timeliness influence of the theme according to the influence of the speech message and the publication time of the message specifically comprises:
determining, from the selected speech messages, the message list of topic z_j within time period t_n:
Figure FDA0002417502770000044
where
Figure FDA0002417502770000045
represents a selected speech message, n denotes the index of the time period, and n = 1, 2, ..., N;
calculating the timeliness influence S_n(z_j) of topic z_j according to the selected speech messages:
Figure FDA0002417502770000051
Figure FDA0002417502770000052
Wherein a_n represents the timeliness weight, which is related to the publication time period t_n; λ is a preset reference value with 0 ≤ λ ≤ 1; and
Figure FDA0002417502770000053
represents the influence of the speech message
Figure FDA0002417502770000054
9. The theme real-time influence evaluation method oriented to the comprehensive integrated discussion environment according to claim 1, wherein the determining of the real-time evolution of the theme according to the timeliness of the influence of the theme specifically comprises:
calculating the similarity sim(z_j, z_{j+1}) of two time-sequence themes in any adjacent time periods:
Figure FDA0002417502770000055
Wherein z_j and z_{j+1} are two time-sequence themes of adjacent time periods whose start times satisfy s(z_j) < s(z_{j+1}); P(w_i|z_j) is the probability that entry w_i belongs to topic z_j;
comparing the similarity sim(z_j, z_{j+1}) with the preset similarity threshold:
when sim(z_j, z_{j+1}) is greater than the preset similarity threshold, determining that a time-sequence theme transition occurs between z_j and z_{j+1}, and that z_{j+1} has evolved from z_j, denoted as
Figure FDA0002417502770000056
When sim(z_j, z_{j+1}) is less than or equal to the preset similarity threshold, determining that z_j and z_{j+1} are the same time-sequence theme;
and combining the detected time-sequence theme transitions along the time direction to obtain the theme evolution network.
10. A theme real-time influence evaluation system oriented to a comprehensive integrated discussion environment, the theme real-time influence evaluation system comprising:
the generating unit is used for generating a current discussion theme according to the current speech content of experts in the discussion hall;
the construction unit is used for constructing a speech message network according to the reply relation between expert speeches;
a first calculation unit, configured to calculate an influence of the utterance message based on the utterance message network;
the second calculation unit is used for calculating the time effect influence of the theme according to the influence of the speaking message and the release time of the message;
and the determining unit is used for determining the real-time evolution condition of the theme according to the timeliness of the influence of the theme.
CN202010195669.6A 2020-03-19 2020-03-19 Method and system for evaluating real-time influence of theme facing comprehensive integrated discussion environment Active CN111427999B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010195669.6A CN111427999B (en) 2020-03-19 2020-03-19 Method and system for evaluating real-time influence of theme facing comprehensive integrated discussion environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010195669.6A CN111427999B (en) 2020-03-19 2020-03-19 Method and system for evaluating real-time influence of theme facing comprehensive integrated discussion environment

Publications (2)

Publication Number Publication Date
CN111427999A true CN111427999A (en) 2020-07-17
CN111427999B CN111427999B (en) 2023-05-12

Family

ID=71548144

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010195669.6A Active CN111427999B (en) 2020-03-19 2020-03-19 Method and system for evaluating real-time influence of theme facing comprehensive integrated discussion environment

Country Status (1)

Country Link
CN (1) CN111427999B (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101312423A (en) * 2007-05-24 2008-11-26 中国科学院自动化研究所 Expert authority evaluation method and system based on network integrated discussion environment
CN101782920A (en) * 2009-12-23 2010-07-21 中国科学院自动化研究所 Integrated session environment-oriented information recommendation method
CN102360366A (en) * 2011-09-30 2012-02-22 河海大学 Interactive visual HWME (Hall for Workshop of Metasynthetic Engineering) system
CN102929942A (en) * 2012-09-27 2013-02-13 福建师范大学 Social network overlapping community finding method based on ensemble learning
CN103425774A (en) * 2013-08-13 2013-12-04 北京航空航天大学 Tacit knowledge acquisition method based on HWME (Hall for Workshop of Metasynthetic Engineering)
US20190325029A1 (en) * 2018-04-18 2019-10-24 HelpShift, Inc. System and methods for processing and interpreting text messages

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LANCASTER: "an integrated art workshop curriculum innovation: art in a professional course of teacher education" *
佟瑞;: "基于综合集成方法论的产业技术路线图研究" *

Also Published As

Publication number Publication date
CN111427999B (en) 2023-05-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant