CN111427999B - Method and system for evaluating the real-time influence of topics in a comprehensive integrated discussion environment

Info

Publication number
CN111427999B
Authority
CN
China
Prior art keywords
theme
speaking
message
influence
current
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010195669.6A
Other languages
Chinese (zh)
Other versions
CN111427999A (en
Inventor
郑楠
王丹力
戴汝为
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Institute of Automation of Chinese Academy of Science
Original Assignee
Institute of Automation of Chinese Academy of Science
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Institute of Automation of Chinese Academy of Science filed Critical Institute of Automation of Chinese Academy of Science
Priority to CN202010195669.6A priority Critical patent/CN111427999B/en
Publication of CN111427999A publication Critical patent/CN111427999A/en
Application granted granted Critical
Publication of CN111427999B publication Critical patent/CN111427999B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34 Browsing; Visualisation therefor
    • G06F16/345 Summarisation for human users
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00 Administration; Management
    • G06Q10/06 Resources, workflows, human or project management; Enterprise or organisation planning; Enterprise or organisation modelling
    • G06Q10/063 Operations research, analysis or management
    • G06Q10/0639 Performance analysis of employees; Performance analysis of enterprise or organisation operations

Abstract

The invention relates to a method and a system for evaluating the real-time influence of topics in a comprehensive integrated discussion environment. The evaluation method comprises the following steps: generating the topics of the current discussion from the current speech content of the experts in the seminar hall; constructing a speech message network from the reply relations between the experts' speeches; calculating the influence of each speech message based on the speech message network; calculating the time-sensitive influence of each topic from the influence of the speech messages and their release times; and determining the real-time evolution of the topics from their time-sensitive influence. The invention generates the currently discussed topics from the current speech content, determines the influence of the current speech messages by constructing the speech message network, and then uses the release times of the messages to accurately determine the time-sensitive influence and the real-time evolution of the topics. The results are fed back to the participants to help them see the changes and trends of the discussed topics clearly and to improve discussion efficiency.

Description

Method and system for evaluating the real-time influence of topics in a comprehensive integrated discussion environment
Technical Field
The invention relates to the technical field of information processing, and in particular to a method and a system for evaluating the real-time influence of topics in a comprehensive integrated discussion environment.
Background
The human-machine combined, qualitative-to-quantitative comprehensive integrated seminar hall is a methodology for handling open complex giant systems and related problems, proposed by the well-known Chinese scientist Qian Xuesen. The method has advantages in handling major national decision problems, complex scientific research problems, and the like, and is a leading direction of complexity science research. The comprehensive integrated seminar hall system is designed to bring people into the system and to follow a human-machine combined, human-centered technical route. It exploits the respective strengths of people and computers in information processing, combining people's perceptual, experience-based processing ability with the fast, accurate processing ability of computers, so as to gradually obtain the key information for handling complex problems and to solve problems that neither people nor computers can solve alone.
The essence of the comprehensive integrated seminar hall system is to guide people, when dealing with an open complex giant system, to integrate the relevant experience, theories, knowledge, information, and data to the greatest possible extent through human-machine combination and group discussion. Through mutual stimulation among group members and collective processing of these resources, group wisdom emerges and a better understanding of the complex system is obtained. This elevates the individual wisdom in the comprehensive integration method to group wisdom and significantly enhances the operability of the methodology for open complex giant systems.
In the comprehensive integrated discussion environment, participants can discuss a problem qualitatively in the form of text, audio, and video, and can evaluate it quantitatively through questionnaires and votes. However, a single discussion mode makes the discussion inefficient and the opinions scattered, which makes it hard to grasp how the discussion topics are trending, so the speech content needs to be organized and evaluated effectively.
Existing approaches to speech evaluation in the comprehensive integrated discussion environment mainly include the following:
Method one: the authority of each speech opinion is calculated from the number of responses to it. The more responses an opinion receives, the higher its authority, and vice versa (Cui Xia, Dai Ruwei, Li Yaodong, The emergence of group wisdom in the comprehensive integrated seminar hall system, Journal of System Simulation, 2003, 15(1), 146-452).
Method two: the multidimensional data set of the experts' evaluation opinions on the discussed schemes is reduced in dimension, and the result is represented visually in a low-dimensional data space, so that the classification of the expert group's opinions can be observed intuitively (Liu Chunmei, Dai Ruwei, Visualization of expert group evaluation results in the comprehensive integrated seminar hall, Pattern Recognition and Artificial Intelligence, 2005, 18(1), 6-11).
Method three: the overall authority of an expert is calculated from the evaluation content of the expert's speeches (Li Minhua, Li Yaodong, Zhao Mingchang, Wang Chunheng, Dai Ruwei, Expert authority evaluation method and system based on a network integrated discussion environment, patent publication No. CN101312423A).
Method four: the authority of an expert is calculated from the interaction structure between experts (Wang Ai, Li Yaodong, Li Weijie, SemRank-based CWME expert authority calculation method, Application Research of Computers, 2010, 27(7), 2441-2444).
Method one calculates the authority of each expert's speech; method two reduces the dimensionality of multiple experts' speeches to classify opinions; method three calculates expert authority from speech content; and method four calculates expert authority from the interaction structure of the speeches. When there are many speeches and the experts' points of attention are scattered, none of these methods can capture changes in the experts' points of attention in real time.
In addition, during a discussion the users mainly devote their effort to the discussion itself. The discussion is under considerable time pressure and the participants carry a heavy workload, so they either have no time to search for related speeches or disrupt the discussion by searching frequently, which hinders the generation and refinement of knowledge during the discussion.
Disclosure of Invention
To solve the above problems in the prior art, namely to help participants see the changes and trends of the discussed topics clearly and to improve discussion efficiency, the invention provides a method and a system for evaluating the real-time influence of topics in a comprehensive integrated discussion environment.
To solve the technical problems, the invention provides the following scheme:
a method for evaluating the real-time influence of topics in a comprehensive integrated discussion environment, comprising the following steps:
generating the topics of the current discussion from the current speech content of the experts in the seminar hall;
constructing a speech message network from the reply relations between the experts' speeches;
calculating the influence of the speech messages based on the speech message network;
calculating the time-sensitive influence of the topics from the influence of the speech messages and the release times of the messages;
and determining the real-time evolution of the topics from their time-sensitive influence.
Optionally, generating the topics of the current discussion from the current speech content of the experts in the seminar hall specifically includes:
preprocessing the current speech content to obtain a preprocessed text;
dividing the preprocessed text into N segments according to speaking time to obtain N sub-texts;
and processing each sub-text with a topic generation model to generate the topics of that sub-text.
Optionally, preprocessing the current speech content to obtain a preprocessed text specifically includes:
performing word segmentation, stop-word removal, and useless-symbol removal on the current speech content to obtain the preprocessed text.
Optionally, processing each sub-text with a topic generation model to generate the topics of that sub-text specifically includes:
mapping the speech messages in the sub-text to the corresponding topics through terms;
calculating the word frequency of each topic using a three-layer Bayesian probability model;
and determining the topics of the sub-text from the word frequency of each topic.
Optionally, the word frequency includes: the term probability P(z_j | w_i) of term w_i for topic z_j, and the message probability P(z_j | d_m) of topic z_j for speech message d_m;
The word frequency of each topic is calculated according to the following formulas:
[Eight formula images, BDA0002417502780000041 through BDA0002417502780000052, are not reproduced in this text.]
where |z_j| is the number of speeches on topic z_j in the speech set and d_m is a speech message; P(z_i = j) is the probability that topic j belongs to the current speech, and P(w_i | z_i = j) is the probability that term w_i belongs to topic j;
let φ^(j) = P(w_i | z_i = j) denote the multinomial distribution of topic j over the terms w_i, and θ^(d) = P(z) denote the multinomial distribution of speech d over the topics; the parameters φ and θ represent the associations between terms and topics and between topics and speeches, respectively; T is the number of topics, W is the number of terms, and D is the number of speeches; C^WT and C^DT are count matrices of dimensions W × T and D × T, respectively. The six count symbols appear only as images in this text (BDA0002417502780000053 through BDA0002417502780000058) and are defined, in order, as:
the count of terms assigned to topic j, excluding the current term w_i;
the count of terms in speech d assigned to topic j, excluding the current term w_i;
the count of terms assigned to topic j, excluding the current term w;
the count of terms in speech d assigned to topic t, excluding the current term w_i;
the count of terms assigned to topic j, excluding the current term i;
and the count of terms assigned to topic j, excluding the current term k.
Optionally, determining the topics of the sub-text from the word frequency of each topic specifically includes:
comparing the term probability P(z_j | w_i) with a set threshold TH_j, and comparing the message probability P(z_j | d_m) with the set threshold TH_j;
selecting the terms whose term probability P(z_j | w_i) is greater than the set threshold TH_j, and the speech messages whose message probability P(z_j | d_m) is greater than the set threshold TH_j;
and determining the topics of the sub-text from the selected terms and speech messages.
Optionally, calculating the influence of the speech messages based on the speech message network specifically includes:
calculating the quantity metric of a speech by in-degree feature analysis of the social network:
[Formula image BDA0002417502780000061 is not reproduced in this text.]
where the speech message network G_n is a directed weighted graph G_n = (V_n, E_n, W_n); the node set V_n represents the set of messages; the edge set E_n represents the reply relations between the experts; the weight set W_n records the reply frequencies within time t_n; d_u and d_v denote speech messages; and W_n(d_v, d_u) is the reply frequency between speech messages d_v and d_u within time t_n;
calculating the range metric of a speech by out-degree feature analysis of the social network:
[Formula image BDA0002417502780000062 is not reproduced in this text.]
where I(d_v, d_u) is an indicator function that indicates whether speech messages d_v and d_u are associated;
determining the influence S_n(d_u) of the current speech message d_u from the quantity metric and the range metric of the speech:
[Formula image BDA0002417502780000063 is not reproduced in this text.]
where N_w and N_I are normalizing constants, and the smoothing factor α determines the weights of the quantity metric and the range metric in the influence S_n(d_u) of the current speech message.
Optionally, calculating the time-sensitive influence of the topics from the influence of the speech messages and the release times of the messages specifically includes:
determining, from the selected speech messages, the message list of topic z_j in time period t_n:
[Formula image BDA0002417502780000071 is not reproduced in this text.]
where the symbol shown in image BDA0002417502780000072 denotes a selected speech message, and n is the index of the time period, n = 1, 2, …, N;
calculating the time-sensitive influence S_n(z_j) of topic z_j from the selected speech messages:
[Formula images BDA0002417502780000073 and BDA0002417502780000074 are not reproduced in this text.]
where a_n is the age weight, which depends on the release time period t_n; λ is a preset parameter with 0 ≤ λ ≤ 1; and the symbol shown in image BDA0002417502780000075 denotes the influence of the speech message shown in image BDA0002417502780000076.
Optionally, determining the real-time evolution of the topics from their time-sensitive influence specifically includes:
calculating the similarity sim(z_j, z_j+1) of the two temporal topics of any adjacent time periods:
[Formula image BDA0002417502780000077 is not reproduced in this text.]
where z_j and z_j+1 are the two temporal topics of adjacent time periods, with start times s(z_j) < s(z_j+1); P(w_i | z_j) is the probability that term w_i belongs to topic z_j;
comparing the similarity sim(z_j, z_j+1) with a preset similarity threshold ε:
when sim(z_j, z_j+1) > ε, determining that a temporal topic transition occurs between z_j and z_j+1 and that z_j+1 has evolved from z_j, which is expressed by the notation shown in image BDA0002417502780000081 (not reproduced in this text);
when sim(z_j, z_j+1) ≤ ε, determining that z_j and z_j+1 are the same temporal topic;
and combining the detected temporal topic transitions along the time direction to obtain a topic evolution network.
To solve the technical problems, the invention also provides the following scheme:
a system for evaluating the real-time influence of topics in a comprehensive integrated discussion environment, the system comprising:
a generating unit for generating the topics of the current discussion from the current speech content of the experts in the seminar hall;
a construction unit for constructing a speech message network from the reply relations between the experts' speeches;
a first calculation unit for calculating the influence of the speech messages based on the speech message network;
a second calculation unit for calculating the time-sensitive influence of the topics from the influence of the speech messages and the release times of the messages;
and a determining unit for determining the real-time evolution of the topics from their time-sensitive influence.
The embodiments of the invention achieve the following technical effects:
the invention generates the currently discussed topics from the current speech content, determines the influence of the current speech messages by constructing the speech message network, and then uses the release times of the messages to accurately determine the time-sensitive influence and the real-time evolution of the topics. The results are fed back to the participants to help them see the changes and trends of the discussed topics clearly and to improve discussion efficiency.
Drawings
FIG. 1 is a flow chart of the method of the present invention for evaluating the real-time influence of topics in a comprehensive integrated discussion environment;
FIG. 2 is a schematic block diagram of the system of the present invention for evaluating the real-time influence of topics in a comprehensive integrated discussion environment.
Symbol description:
generating unit 1, construction unit 2, first calculation unit 3, second calculation unit 4, and determining unit 5.
Detailed Description
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. Those skilled in the art should understand that these embodiments merely explain the technical principles of the present invention and are not intended to limit its scope.
The invention aims to provide a method for evaluating the real-time influence of topics in a comprehensive integrated discussion environment. The method generates the currently discussed topics from the current speech content, determines the influence of the current speech messages by constructing a speech message network, and then uses the release times of the messages to accurately determine the time-sensitive influence and the real-time evolution of the topics. The results are fed back to the participants to help them see the changes and trends of the discussed topics clearly and to improve discussion efficiency.
To make the above objects, features, and advantages of the present invention easier to understand, the invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
As shown in FIG. 1, the method of the present invention for evaluating the real-time influence of topics in a comprehensive integrated discussion environment comprises the following steps:
step 100: generating the topics of the current discussion from the current speech content of the experts in the seminar hall;
step 200: constructing a speech message network from the reply relations between the experts' speeches;
step 300: calculating the influence of the speech messages based on the speech message network;
step 400: calculating the time-sensitive influence of the topics from the influence of the speech messages and the release times of the messages;
step 500: determining the real-time evolution of the topics from their time-sensitive influence.
In step 100, generating the topics of the current discussion from the current speech content of the experts in the seminar hall specifically includes:
step 101: preprocessing the current speech content to obtain a preprocessed text.
In this embodiment, the preprocessed text may be obtained by applying word segmentation, stop-word removal, and useless-symbol removal to the current speech content.
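The preprocessing of step 101 can be sketched as follows. This is a minimal illustration and not the patented implementation: the function name `preprocess` and the stop-word list are hypothetical, and a plain whitespace split stands in for a real word segmenter, which Chinese text would require.

```python
import re

# Hypothetical stop-word list; a real system would load a domain-specific one.
STOP_WORDS = {"the", "a", "an", "of", "to", "is", "and", "in"}


def preprocess(utterance: str) -> list:
    """Segment an utterance into terms, dropping symbols and stop words."""
    # Replace every character that is neither a word character nor whitespace.
    cleaned = re.sub(r"[^\w\s]", " ", utterance.lower())
    # Assumption: a whitespace split stands in for a real word segmenter.
    tokens = cleaned.split()
    return [t for t in tokens if t not in STOP_WORDS]


print(preprocess("The model, in short, is robust!"))
```

The three operations named in the text map onto the regular-expression substitution (useless-symbol removal), the split (word segmentation), and the final filter (stop-word removal).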
Step 102: dividing the preprocessed text into N segments according to speaking time to obtain N sub-texts. That is, the speech content is divided into N time periods (t_1, …, t_n, …, t_N), where n denotes the index of the time period, n = 1, 2, …, N.
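One way to realize the segmentation of step 102 is sketched below. The source does not say how the period boundaries are chosen, so equal-width periods are an assumption here, and the `Utterance` record and `split_by_time` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    msg_id: str
    timestamp: float   # seconds since the start of the session
    terms: list        # preprocessed terms of the utterance


def split_by_time(utterances, start, end, n_segments):
    """Return n_segments lists; list n-1 holds the utterances of period t_n."""
    if n_segments < 1 or end <= start:
        raise ValueError("need n_segments >= 1 and end > start")
    width = (end - start) / n_segments   # assumption: equal-width periods
    segments = [[] for _ in range(n_segments)]
    for u in utterances:
        if not (start <= u.timestamp <= end):
            continue  # outside the session window
        # Clamp so that timestamp == end falls into the last segment.
        index = min(int((u.timestamp - start) / width), n_segments - 1)
        segments[index].append(u)
    return segments
```

Each returned list corresponds to one sub-text, which is then passed to the topic generation model of step 103.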
Step 103: processing each sub-text with a topic generation model to generate the topics of that sub-text.
If a term is characteristic of a certain topic, the topic can easily be inferred from the term. Such terms typically appear more often in speeches on one topic and less often in speeches on other topics. For a preprocessed sub-text, the terms belonging to each topic can therefore be obtained by calculating the distribution of the terms over the different topics.
Learning topics from text can be regarded as an inference process, and a probabilistic topic model is a generative model for the text domain. Here a topic generation model is used to extract the topics of each speech, specifically the Latent Dirichlet Allocation (LDA) model. Its basic idea is that an expert's speech is a mixture of different topics, each of which is represented as a probability distribution over terms. Under this assumption, a new speech is generated as follows: first, a probability distribution over topics is chosen for the speech; then, for each term position, a topic is chosen according to that distribution; finally, a term is generated for that position from the chosen topic.
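The three-step generative process described above can be illustrated with a short sketch. The function names, the symmetric Dirichlet prior `alpha`, and the toy vocabulary are assumptions made for illustration only.

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw from a Dirichlet distribution via normalised Gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def generate_utterance(topic_term_probs, vocab, length, alpha, rng):
    """Generate one utterance following the three steps described above."""
    n_topics = len(topic_term_probs)
    # 1. Choose the utterance's probability distribution over topics.
    theta = sample_dirichlet([alpha] * n_topics, rng)
    terms, topics = [], []
    for _ in range(length):
        # 2. Choose a topic for this position from that distribution.
        z = rng.choices(range(n_topics), weights=theta)[0]
        # 3. Generate a term for the position from the chosen topic.
        w = rng.choices(vocab, weights=topic_term_probs[z])[0]
        topics.append(z)
        terms.append(w)
    return terms, topics


rng = random.Random(0)
vocab = ["risk", "cost", "policy", "budget"]
# Toy topics: each row is a distribution over vocab.
topics_over_terms = [[0.6, 0.1, 0.2, 0.1], [0.1, 0.5, 0.1, 0.3]]
print(generate_utterance(topics_over_terms, vocab, 6, 0.5, rng))
```

Topic inference runs this process in reverse: given only the observed terms, it estimates the topic distributions that most plausibly generated them.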
In step 103, processing each sub-text with a topic generation model to generate the topics of that sub-text specifically includes:
step 1031: mapping the speech messages in the sub-text to the corresponding topics through terms.
Step 1032: calculating the word frequency of each topic using a three-layer Bayesian probability model.
Here the word frequency includes: the term probability P(z_j | w_i) of term w_i for topic z_j, and the message probability P(z_j | d_m) of topic z_j for speech message d_m.
The word frequency of each topic is calculated according to the following formulas:
[Eight formula images, BDA0002417502780000111 through BDA0002417502780000123, are not reproduced in this text.]
where |z_j| is the number of speeches on topic z_j in the speech set and d_m is a speech message; P(z_i = j) is the probability that topic j belongs to the current speech, and P(w_i | z_i = j) is the probability that term w_i belongs to topic j.
Let φ^(j) = P(w_i | z_i = j) denote the multinomial distribution of topic j over the terms w_i, and θ^(d) = P(z) denote the multinomial distribution of speech d over the topics; the parameters φ and θ represent the associations between terms and topics and between topics and speeches, respectively; T is the number of topics, W is the number of terms, and D is the number of speeches; C^WT and C^DT are count matrices of dimensions W × T and D × T, respectively. The six count symbols appear only as images in this text (BDA0002417502780000124 through BDA0002417502780000129) and are defined, in order, as:
the count of terms assigned to topic j, excluding the current term w_i;
the count of terms in speech d assigned to topic j, excluding the current term w_i;
the count of terms assigned to topic j, excluding the current term w;
the count of terms in speech d assigned to topic t, excluding the current term w_i;
the count of terms assigned to topic j, excluding the current term i;
and the count of terms assigned to topic j, excluding the current term k.
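The count matrices that exclude the current term are characteristic of collapsed Gibbs sampling for LDA. Because the formula images are not available in this text, the sketch below uses the standard collapsed Gibbs update as a stand-in; it should not be read as the patent's exact formulas. The hyperparameters `alpha` and `beta`, the function name, and the integer encoding of terms are assumptions.

```python
import random


def gibbs_lda(docs, n_topics, vocab_size, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Collapsed Gibbs sampling for LDA over integer-encoded utterances.

    docs: list of utterances, each a list of term ids in range(vocab_size).
    Returns (phi, theta): phi[j][w] estimates P(w | z=j) and
    theta[d][j] estimates P(z=j | d).
    """
    rng = random.Random(seed)
    c_wt = [[0] * n_topics for _ in range(vocab_size)]   # W x T count matrix
    c_dt = [[0] * n_topics for _ in range(len(docs))]    # D x T count matrix
    topic_totals = [0] * n_topics                        # terms per topic
    assign = []
    for d, doc in enumerate(docs):                       # random initialisation
        zs = [rng.randrange(n_topics) for _ in doc]
        assign.append(zs)
        for w, z in zip(doc, zs):
            c_wt[w][z] += 1
            c_dt[d][z] += 1
            topic_totals[z] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                z = assign[d][i]
                # Remove the current term so that all counts exclude it.
                c_wt[w][z] -= 1
                c_dt[d][z] -= 1
                topic_totals[z] -= 1
                # The per-utterance denominator is constant in j and omitted.
                weights = [
                    (c_wt[w][j] + beta) / (topic_totals[j] + vocab_size * beta)
                    * (c_dt[d][j] + alpha)
                    for j in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][i] = z
                c_wt[w][z] += 1
                c_dt[d][z] += 1
                topic_totals[z] += 1
    phi = [[(c_wt[w][j] + beta) / (topic_totals[j] + vocab_size * beta)
            for w in range(vocab_size)] for j in range(n_topics)]
    theta = [[(c_dt[d][j] + alpha) / (len(doc) + n_topics * alpha)
              for j in range(n_topics)] for d, doc in enumerate(docs)]
    return phi, theta
```

In this sketch `theta[m][j]` plays the role of the message probability P(z_j | d_m), while `phi[j][i]` is P(w_i | z_j); the term probability P(z_j | w_i) named in the text would need a further normalisation over topics that is not shown here.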
Step 1033: determining the topics of the sub-text from the word frequency of each topic. This specifically includes:
step 1033a: comparing the term probability P(z_j | w_i) with a set threshold TH_j, and comparing the message probability P(z_j | d_m) with the set threshold TH_j.
Step 1033b: selecting the terms whose term probability P(z_j | w_i) is greater than the set threshold TH_j, and the speech messages whose message probability P(z_j | d_m) is greater than the set threshold TH_j, that is, P(z_j | w_i) > TH_j and P(z_j | d_m) > TH_j.
Step 1033c: determining the topics of the sub-text from the selected terms and speech messages.
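The threshold selection of steps 1033a and 1033b can be sketched as follows. The function name and the dictionary layout of the probabilities are hypothetical; the same helper is applied once to term probabilities and once to message probabilities.

```python
def select_above_threshold(probs, thresholds):
    """Return, per topic j, the keys whose probability exceeds TH_j.

    probs: {key: [P(z_0 | key), ..., P(z_{T-1} | key)]}, where key is a term
           w_i or a speech message id d_m.
    thresholds: [TH_0, ..., TH_{T-1}], one preset threshold per topic.
    """
    selected = {j: [] for j in range(len(thresholds))}
    for key, row in probs.items():
        for j, p in enumerate(row):
            if p > thresholds[j]:   # strict comparison, as in step 1033b
                selected[j].append(key)
    return selected


term_probs = {"risk": [0.9, 0.1], "cost": [0.3, 0.7], "plan": [0.5, 0.5]}
print(select_above_threshold(term_probs, [0.6, 0.6]))
```

The selected terms and messages per topic are the inputs to step 1033c and, later, to the message lists used in step 401.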
In step 200, within time t_n, a speech message network is constructed from the reply relations between the experts' speeches. The speech message network is a directed weighted graph G_n = (V_n, E_n, W_n): the node set V_n represents the set of messages; the edge set E_n represents the reply relations between the experts; and the weight set W_n records the reply frequencies within time t_n.
To record how the reply network evolves over time, the N time intervals (t_1, …, t_n, …, t_N) are used, and the evolution of the speech message network is represented by the network sequence G = {G_1, G_2, ..., G_T}. It is assumed here that the message set V in G is unchanged, while the edge set E and the weight set W evolve over time.
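A minimal sketch of building one network G_n = (V_n, E_n, W_n) from reply records is given below. The function name and the input format are assumptions, as is the edge direction, which here points from the replying message d_v to the message d_u that it replies to.

```python
from collections import defaultdict


def build_message_network(message_ids, replies):
    """Build G_n = (V_n, E_n, W_n) for one time period t_n.

    message_ids: ids of all speech messages (the node set V_n).
    replies: iterable of (d_v, d_u) pairs, one per reply of d_v to d_u
             observed within t_n.
    """
    nodes = set(message_ids)
    weights = defaultdict(int)       # W_n: (d_v, d_u) -> reply frequency
    for d_v, d_u in replies:
        if d_v in nodes and d_u in nodes:
            weights[(d_v, d_u)] += 1
    edges = set(weights)             # E_n: directed reply relations
    return nodes, edges, dict(weights)


v, e, w = build_message_network(["m1", "m2", "m3"],
                                [("m2", "m1"), ("m3", "m1"), ("m2", "m1")])
print(sorted(e), w)
```

Calling the function once per time period, with the same `message_ids` each time, yields the network sequence described above, in which the node set stays fixed and only the edges and weights change.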
In addition, the invention calculates the influence of the speech messages using social network analysis. Social network analysis is grounded in sociological theory and studies the characteristics of the interaction behavior of groups in a network. It offers many indices for mining the structural characteristics of a network, of which the two most basic and important are the in-degree and the out-degree of a node.
In a network built from the reply relations between experts' speeches, the in-degree of a node is the number of messages that reply to the speech, which reflects to some extent how much attention the speech receives; the out-degree of a node is the number of messages that the speech sends in reply to other speeches, which reflects how active the speech is. Analyzing the structural characteristics of the network based on in-degree and out-degree therefore allows the interaction patterns of the reply relations to be analyzed in depth.
Further, in step 300, calculating the influence of the speech messages based on the speech message network specifically includes:
step 301: calculating the quantity metric of a speech (that is, measuring how much attention the speech receives) by in-degree feature analysis of the social network:
[Formula image BDA0002417502780000141 is not reproduced in this text.]
where the weight set W_n records the reply frequencies within time t_n; d_u and d_v denote speech messages; and W_n(d_v, d_u) is the reply frequency between speech messages d_v and d_u within time t_n.
Step 302: calculating the range metric of a speech (that is, measuring the quality of the current message) by out-degree feature analysis of the social network:
[Formula image BDA0002417502780000142 is not reproduced in this text.]
where I(d_v, d_u) is an indicator function that indicates whether speech messages d_v and d_u are associated.
Step 303: determining the influence S_n(d_u) of the current speech message d_u from the quantity metric and the range metric of the speech:
[Formula image BDA0002417502780000143 is not reproduced in this text.]
where N_w and N_I are normalizing constants, and the smoothing factor α determines the weights of the quantity metric and the range metric in the influence S_n(d_u) of the current speech message.
Taking into account both the attention a message receives and the quality of the message, the smoothing factor α is introduced to combine the quantity metric and the range metric linearly, so that the influence of the current speech message is determined accurately.
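Steps 301 to 303 can be sketched as below. Since the formula images are missing from this text, the exact expressions are assumptions: the quantity metric is taken as the weighted in-degree, the range metric as the out-degree counted with the indicator I(d_v, d_u), and the normalizing constants N_w and N_I as the largest observed values. The function name is hypothetical.

```python
def message_influence(nodes, weights, alpha=0.5):
    """Return {d: S_n(d)} as a linear blend of two degree-based metrics.

    nodes:   iterable of message ids; every id used in weights must be in it.
    weights: {(d_v, d_u): reply frequency of d_v to d_u within t_n}.
    alpha:   smoothing factor in [0, 1] weighting the quantity metric.
    """
    nodes = list(nodes)
    quantity = {d: 0 for d in nodes}   # assumed: weighted in-degree
    reach = {d: 0 for d in nodes}      # assumed: out-degree, one per edge
    for (d_v, d_u), freq in weights.items():
        quantity[d_u] += freq
        reach[d_v] += 1                # indicator I(d_v, d_u) = 1 for each edge
    # Assumed normalising constants N_w and N_I: the largest observed values.
    n_w = max(quantity.values(), default=0) or 1
    n_i = max(reach.values(), default=0) or 1
    return {
        d: alpha * quantity[d] / n_w + (1 - alpha) * reach[d] / n_i
        for d in nodes
    }


scores = message_influence(["a", "b", "c"],
                           {("b", "a"): 2, ("c", "a"): 1, ("c", "b"): 1})
print(scores)
```

With α close to 1 the score is dominated by how often a message is replied to; with α close to 0 it is dominated by how many other messages it is linked to.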
Optionally, in step 400, the calculating the aging influence of the theme according to the influence of the speaking message and the release time of the message specifically includes:
step 401: determining a time period t according to the selected speaking message n Is subject z j Is a list of messages of (2)
Figure BDA0002417502780000151
Figure BDA0002417502780000152
Representing the selected talk message;
step 402: calculating the subject z according to the selected speaking message j Age effect S of (2) n (z j ):
Figure BDA0002417502780000153
Figure BDA0002417502780000154
where a_n denotes the timeliness weight, which depends on the time period t_n of publication; λ is a preset reference quantity with 0 ≤ λ ≤ 1; and Figure BDA0002417502780000155 denotes the influence of speaking message Figure BDA0002417502780000156.
To account for the timeliness of discussion behavior, an exponential smoothing method is adopted. Using the decay property of an exponential with a base less than 1, speaking messages that lie further back in the discussion sequence are given smaller timeliness weights, and more recent speaking messages are given larger timeliness weights.
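The exponential weighting can be sketched as follows. The exact expressions for a_n and S_n(z_j) are given only as images in the source, so the form a_n = λ^(N − n) and the weighted sum over periods are assumptions chosen to match the description (base less than 1, older periods weighted less); the function names are hypothetical.

```python
def timeliness_weights(num_periods, lam=0.8):
    """Assumed weights a_n = lam ** (N - n) for n = 1..N; the newest period gets 1."""
    return [lam ** (num_periods - n) for n in range(1, num_periods + 1)]


def theme_timeliness_influence(influence_by_period, lam=0.8):
    """Combine per-period message influences of one theme z_j (illustrative only).

    influence_by_period[n - 1] holds the influence scores of the selected
    messages of the theme in period t_n, oldest period first.
    """
    weights = timeliness_weights(len(influence_by_period), lam)
    return sum(a_n * sum(scores) for a_n, scores in zip(weights, influence_by_period))


weights = timeliness_weights(3, lam=0.5)
total = theme_timeliness_influence([[1.0], [2.0], [4.0]], lam=0.5)
```

With λ = 0.5 and three periods, the oldest period contributes a quarter of its raw influence and the newest contributes all of it.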
Further, define a time-series theme γ as a theme with a certain time span, written γ = <z, s(γ), t(γ)>, where s(γ) and t(γ) are the start time and the end time of γ, respectively.
Let γ_1 = <z_1, s(γ_1), t(γ_1)> and γ_2 = <z_2, s(γ_2), t(γ_2)> be two time-series themes with t(γ_1) < s(γ_2). Let sim(γ_1, γ_2) be the similarity between γ_1 and γ_2, and let ε be a preset similarity threshold. When sim(γ_1, γ_2) > ε, a time-series theme transition is said to occur between γ_1 and γ_2, and γ_2 is said to have evolved from γ_1, expressed as
Figure BDA0002417502780000157
In order to detect the evolution relation between different time sequence topics in adjacent time periods, the similarity degree between the two time sequence topics is calculated by using cosine similarity.
Specifically, in step 500, the determining, according to the timeliness of the impact of the theme, the real-time evolution direction of the theme includes:
Step 501: calculating the similarity sim(z_j, z_{j+1}) of the two time-series themes of any adjacent time periods:
Figure BDA0002417502780000161
where z_j and z_{j+1} are the two time-series themes of adjacent time periods, with corresponding start times s(z_j) < s(z_{j+1}); P(w_i | z_j) is the probability that term w_i belongs to theme z_j.
Step 502: comparing the similarity sim(z_j, z_{j+1}) with the preset similarity threshold ε:
when sim(z_j, z_{j+1}) > ε, determining that a time-series theme transition occurs between z_j and z_{j+1}, and that z_{j+1} evolved from z_j, expressed as
Figure BDA0002417502780000162
when sim(z_j, z_{j+1}) ≤ ε, determining that z_j and z_{j+1} are the same time-series theme.
Step 503: combining the detected time-series theme transitions along the time direction to obtain a theme evolution network.
Let P = <γ_1, γ_2, ..., γ_N> be a sequence of time-series themes in which a theme transition relation exists between each pair of adjacent themes, namely
Figure BDA0002417502780000163
Then P is defined as a theme evolution pattern. Combining the theme transition relations across all time periods along the time axis yields the complete theme evolution network.
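Steps 501 to 503 can be sketched in Python as follows. The similarity formula is given only as an image in the source, so the standard cosine similarity over the term distributions P(w_i | z) is assumed here; the dictionary layout and the function names are hypothetical. The rule applied is the one stated in step 502: an edge is recorded when the similarity exceeds ε.

```python
import math


def cosine_similarity(p, q):
    """Cosine similarity of two term distributions, each mapping term -> P(w_i | z)."""
    dot = sum(p.get(w, 0.0) * q.get(w, 0.0) for w in set(p) | set(q))
    norm = math.sqrt(sum(v * v for v in p.values())) * math.sqrt(
        sum(v * v for v in q.values())
    )
    return dot / norm if norm else 0.0


def theme_transitions(themes_by_period, epsilon):
    """Collect transition edges between themes of adjacent periods (illustrative only).

    themes_by_period[n] maps a theme id to its term distribution in period n.
    Each edge is ((n, earlier_theme), (n + 1, later_theme)).
    """
    edges = []
    for n in range(len(themes_by_period) - 1):
        for earlier, p in themes_by_period[n].items():
            for later, q in themes_by_period[n + 1].items():
                if cosine_similarity(p, q) > epsilon:
                    edges.append(((n, earlier), (n + 1, later)))
    return edges


periods = [
    {"z1": {"x": 0.5, "y": 0.5}},
    {"z2": {"x": 0.5, "y": 0.5}, "z3": {"k": 1.0}},
]
edges = theme_transitions(periods, epsilon=0.5)
```

The resulting edge list, ordered by period, is one simple way to represent the theme evolution network; chains of consecutive edges correspond to theme evolution patterns.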
When a specific problem is discussed in the comprehensive integrated discussion environment, the invention lets participants observe the discussion themes and their trends in real time, provides good conditions for interaction among participants, and greatly improves the efficiency of discussion and decision-making. The comprehensive integrated discussion environment may be extended to any web conference room or electronic conference room system.
In addition, the invention also provides a system for evaluating the real-time influence of a theme facing a comprehensive integrated discussion environment, which helps participants clearly follow the changes and trends of the discussion themes and improves discussion efficiency.
As shown in fig. 2, the system for evaluating the real-time impact of a subject in an integrated discussion environment according to the present invention includes a generating unit 1, a constructing unit 2, a first calculating unit 3, a second calculating unit 4, and a determining unit 5.
The generating unit 1 is used for generating a theme of current discussion according to the current speaking content of an expert in the seminar;
the construction unit 2 is used for constructing a speaking message network according to the reply relations among the experts' speaking messages;
the first calculating unit 3 is configured to calculate the influence of a speaking message based on the speaking message network;
the second calculating unit 4 is configured to calculate an aging influence of the theme according to the influence of the speaking message and the release time of the message;
the determining unit 5 is configured to determine a real-time evolution condition of the theme according to timeliness of influence of the theme.
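The way the five units hand data to one another can be pictured with a small Python sketch. This is only an illustration of the data flow: the class name `EvaluationSystem`, the callable signatures, and the assumption that the determining unit receives both the themes and their timeliness influence are not taken from the source.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class EvaluationSystem:
    """Chains five pluggable stages, one per unit of the system (illustrative only)."""

    generate_themes: Callable[[Any], Any]            # generating unit 1
    build_network: Callable[[Any], Any]              # construction unit 2
    message_influence: Callable[[Any], Any]          # first calculating unit 3
    theme_influence: Callable[[Any, Any], Any]       # second calculating unit 4
    determine_evolution: Callable[[Any, Any], Any]   # determining unit 5

    def run(self, utterances):
        themes = self.generate_themes(utterances)
        network = self.build_network(utterances)
        message_scores = self.message_influence(network)
        theme_scores = self.theme_influence(themes, message_scores)
        return self.determine_evolution(themes, theme_scores)


# Stub stages that only demonstrate the wiring, not the actual methods.
system = EvaluationSystem(
    generate_themes=lambda utterances: ["t"],
    build_network=lambda utterances: {"n": len(utterances)},
    message_influence=lambda network: network["n"],
    theme_influence=lambda themes, scores: {themes[0]: scores},
    determine_evolution=lambda themes, theme_scores: (themes, theme_scores),
)
result = system.run(["a", "b"])
```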
Compared with the prior art, the system for evaluating the real-time influence of the theme facing the integrated discussion environment has the same beneficial effects as the method for evaluating the real-time influence of the theme facing the integrated discussion environment, and is not repeated herein.
Thus far, the technical solution of the present invention has been described in connection with the preferred embodiments shown in the drawings, but it is easily understood by those skilled in the art that the scope of protection of the present invention is not limited to these specific embodiments. Equivalent modifications and substitutions for related technical features may be made by those skilled in the art without departing from the principles of the present invention, and such modifications and substitutions will fall within the scope of the present invention.

Claims (3)

1. A method for evaluating the real-time influence of a theme facing a comprehensive integrated discussion environment is characterized by comprising the following steps:
generating a theme of the current discussion according to the current speaking content of the expert in the seminar;
constructing a speaking message network according to the reply relations among the experts' speaking messages;
calculating influence of the speaking message based on the speaking message network;
according to the influence of the speaking message and the release time of the message, calculating the timeliness influence of the theme;
determining the real-time evolution condition of the theme according to the timeliness of the theme influence;
the generating the theme of the current discussion according to the current speaking content of the expert in the discussion hall specifically comprises the following steps:
preprocessing the current speaking content to obtain a preprocessed text;
dividing the preprocessed text into N segments according to speaking time to obtain N sub-texts;
and processing each sub-text by adopting a theme generation model to generate a theme of the corresponding sub-text, wherein the method specifically comprises the following steps of: mapping the speaking message in the sub-text to a corresponding theme through an entry; calculating word frequency of each theme by adopting a three-layer Bayesian probability model; determining the topic of the sub-text according to the word frequency of each topic;
the word frequency includes: the term probability P(z_j | w_i) of term w_i for theme z_j, and the message probability P(z_j | d_m) of theme z_j for speaking message d_m;
the word frequency of each theme is calculated according to the following formulas:
Figure FDA0004172142760000011
Figure FDA0004172142760000021
Figure FDA0004172142760000022
Figure FDA0004172142760000023
Figure FDA0004172142760000024
Figure FDA0004172142760000025
Figure FDA0004172142760000026
Figure FDA0004172142760000027
where z_j is a theme; i is the index of the utterance on theme z_j in the utterance set; d_m is a speaking message; P(z_i = j) is the probability that the j-th theme belongs to the current utterance; and P(w_i | z_i = j) is the probability that term w_i belongs to theme j;
let φ(j) = P(w_i | z_i = j) denote the multinomial distribution of theme j over terms w_i, and θ(i) = P(z) denote the multinomial distribution of utterance d over themes; the parameters φ and θ represent the association between terms and themes and between themes and utterances, respectively; T denotes the number of themes; C^WT and C^DT denote count matrices of dimensions W × T and D × T, respectively;
Figure FDA0004172142760000028
represents the count of terms assigned to theme j, excluding the current term w_i;
Figure FDA0004172142760000029
represents the count of terms in utterance d assigned to theme j, excluding the current term w_i;
Figure FDA00041721427600000210
represents the count of terms assigned to theme j, excluding the current term w;
Figure FDA0004172142760000031
represents the count of terms in utterance d assigned to theme t, excluding the current term w_i;
Figure FDA0004172142760000032
represents the count of terms assigned to theme j, excluding the current term i;
Figure FDA0004172142760000033
represents the count of terms assigned to theme j, excluding the current term k; W is the number of terms and D is the number of utterances;
the determining the theme of the sub-text according to the word frequency of each theme specifically comprises the following steps:
comparing the term probability P(z_j | w_i) with the set threshold TH_j, and the message probability P(z_j | d_m) with the set threshold TH_j, respectively;
selecting the terms whose term probability P(z_j | w_i) is greater than the set threshold TH_j, and the speaking messages whose message probability P(z_j | d_m) is greater than the set threshold TH_j;
determining the theme of the sub-text according to the selected entry and the speaking message;
the calculating of the influence of the speaking message based on the speaking message network specifically comprises:
calculating the quantity element of the speaking message using the in-degree feature analysis method of social networks:
Figure FDA0004172142760000034
where the speaking message network G_n is a directed weighted graph G_n = (V_n, E_n, W_n); the node set V_n represents the set of messages; the edge set E_n represents the reply relations between experts; the weight set W_n denotes the reply frequencies within time period t_n; d_u and d_v denote speaking messages; and W_n(d_v, d_u) denotes the reply frequency between speaking message d_v and speaking message d_u within time period t_n;
calculating the range element of the speaking message using the out-degree feature analysis method of social networks:
Figure FDA0004172142760000035
where I(d_v, d_u) is an indicator function denoting whether speaking message d_v and speaking message d_u are associated;
determining the influence S_n(d_u) of the current speaking message d_u according to the quantity element and the range element:
Figure FDA0004172142760000041
where N_w and N_I are normalization constants, and the smoothing factor α_1 determines the weights of the quantity element and the range element in the influence S_n(d_u) of the current speaking message;
the calculating of the timeliness influence of the theme according to the influence of the speaking message and the release time of the message specifically comprises:
determining, according to the selected speaking messages, the message list of theme z_j in time period t_n:
Figure FDA0004172142760000042
representing the selected speaking messages, where N denotes the number of time periods, n = 1, 2, …, N;
calculating the timeliness influence S_n(z_j) of theme z_j according to the selected speaking messages:
Figure FDA0004172142760000043
Figure FDA0004172142760000044
where a_n denotes the timeliness weight, which depends on the time period t_n of publication; λ is a preset reference quantity with 0 ≤ λ ≤ 1; and Figure FDA0004172142760000045 denotes the influence of speaking message Figure FDA0004172142760000046;
determining a real-time evolution direction of a theme according to timeliness of influence of the theme, which specifically comprises the following steps:
calculating the similarity sim(z_j, z_{j+1}) of the two time-series themes of any adjacent time periods:
Figure FDA0004172142760000051
where z_j and z_{j+1} are the two time-series themes of adjacent time periods, with corresponding start times s(z_j) < s(z_{j+1}); P(w_i | z_j) is the probability that term w_i belongs to theme z_j;
comparing the similarity sim(z_j, z_{j+1}) with the preset similarity threshold ε:
when sim(z_j, z_{j+1}) > ε, determining that a time-series theme transition occurs between z_j and z_{j+1}, and that z_{j+1} evolved from z_j, expressed as
Figure FDA0004172142760000052
when sim(z_j, z_{j+1}) ≤ ε, determining that z_j and z_{j+1} are the same time-series theme;
and combining the detected time-series theme transitions along the time direction to obtain a theme evolution network.
2. The method for evaluating the real-time influence of a theme facing a comprehensive integrated discussion environment according to claim 1, wherein the preprocessing of the current speaking content to obtain a preprocessed text specifically comprises:
performing word segmentation on the current speaking content, removing stop words, and removing useless symbols to obtain the preprocessed text.
3. A system for evaluating the real-time influence of a theme facing a comprehensive integrated discussion environment, characterized in that the evaluation system comprises:
the generating unit is used for generating a theme of the current discussion according to the current speaking content of the expert in the discussion hall;
the construction unit is used for constructing a speaking message network according to the reply relations among the experts' speaking messages;
a first calculation unit for calculating an influence of the speaking message based on the speaking message network;
the second calculation unit is used for calculating the aging influence of the theme according to the influence of the speaking message and the release time of the message;
the determining unit is used for determining the real-time evolution condition of the theme according to the timeliness of the influence of the theme;
the generating the theme of the current discussion according to the current speaking content of the expert in the discussion hall specifically comprises the following steps:
preprocessing the current speaking content to obtain a preprocessed text;
dividing the preprocessed text into N segments according to speaking time to obtain N sub-texts;
and processing each sub-text by adopting a theme generation model to generate a theme of the corresponding sub-text, wherein the method specifically comprises the following steps of: mapping the speaking message in the sub-text to a corresponding theme through an entry; calculating word frequency of each theme by adopting a three-layer Bayesian probability model; determining the topic of the sub-text according to the word frequency of each topic;
the word frequency includes: the term probability P(z_j | w_i) of term w_i for theme z_j, and the message probability P(z_j | d_m) of theme z_j for speaking message d_m;
the word frequency of each theme is calculated according to the following formulas:
Figure FDA0004172142760000061
Figure FDA0004172142760000062
Figure FDA0004172142760000063
Figure FDA0004172142760000064
Figure FDA0004172142760000065
Figure FDA0004172142760000071
Figure FDA0004172142760000072
Figure FDA0004172142760000073
where z_j is a theme; i is the index of the utterance on theme z_j in the utterance set; d_m is a speaking message; P(z_i = j) is the probability that the j-th theme belongs to the current utterance; and P(w_i | z_i = j) is the probability that term w_i belongs to theme j;
let φ(j) = P(w_i | z_i = j) denote the multinomial distribution of theme j over terms w_i, and θ(i) = P(z) denote the multinomial distribution of utterance d over themes; the parameters φ and θ represent the association between terms and themes and between themes and utterances, respectively; T denotes the number of themes; C^WT and C^DT denote count matrices of dimensions W × T and D × T, respectively;
Figure FDA0004172142760000074
represents the count of terms assigned to theme j, excluding the current term w_i;
Figure FDA0004172142760000075
represents the count of terms in utterance d assigned to theme j, excluding the current term w_i;
Figure FDA0004172142760000076
represents the count of terms assigned to theme j, excluding the current term w;
Figure FDA0004172142760000077
represents the count of terms in utterance d assigned to theme t, excluding the current term w_i;
Figure FDA0004172142760000078
represents the count of terms assigned to theme j, excluding the current term i;
Figure FDA0004172142760000079
represents the count of terms assigned to theme j, excluding the current term k; W is the number of terms and D is the number of utterances;
the determining the theme of the sub-text according to the word frequency of each theme specifically comprises the following steps:
comparing the term probability P(z_j | w_i) with the set threshold TH_j, and the message probability P(z_j | d_m) with the set threshold TH_j, respectively;
selecting the terms whose term probability P(z_j | w_i) is greater than the set threshold TH_j, and the speaking messages whose message probability P(z_j | d_m) is greater than the set threshold TH_j;
determining the theme of the sub-text according to the selected entry and the speaking message;
the calculating of the influence of the speaking message based on the speaking message network specifically comprises:
calculating the quantity element of the speaking message using the in-degree feature analysis method of social networks:
Figure FDA0004172142760000081
where the speaking message network G_n is a directed weighted graph G_n = (V_n, E_n, W_n); the node set V_n represents the set of messages; the edge set E_n represents the reply relations between experts; the weight set W_n denotes the reply frequencies within time period t_n; d_u and d_v denote speaking messages; and W_n(d_v, d_u) denotes the reply frequency between speaking message d_v and speaking message d_u within time period t_n;
calculating the range element of the speaking message using the out-degree feature analysis method of social networks:
Figure FDA0004172142760000082
where I(d_v, d_u) is an indicator function denoting whether speaking message d_v and speaking message d_u are associated;
determining the influence S_n(d_u) of the current speaking message d_u according to the quantity element and the range element:
Figure FDA0004172142760000083
where N_w and N_I are normalization constants, and the smoothing factor α_1 determines the weights of the quantity element and the range element in the influence S_n(d_u) of the current speaking message;
the calculating of the timeliness influence of the theme according to the influence of the speaking message and the release time of the message specifically comprises:
determining, according to the selected speaking messages, the message list of theme z_j in time period t_n:
Figure FDA0004172142760000091
representing the selected speaking messages, where N denotes the number of time periods, n = 1, 2, …, N;
calculating the timeliness influence S_n(z_j) of theme z_j according to the selected speaking messages:
Figure FDA0004172142760000092
Figure FDA0004172142760000093
where a_n denotes the timeliness weight, which depends on the time period t_n of publication; λ is a preset reference quantity with 0 ≤ λ ≤ 1; and Figure FDA0004172142760000094 denotes the influence of speaking message Figure FDA0004172142760000095;
determining a real-time evolution direction of a theme according to timeliness of influence of the theme, which specifically comprises the following steps:
calculating the similarity sim(z_j, z_{j+1}) of the two time-series themes of any adjacent time periods:
Figure FDA0004172142760000096
where z_j and z_{j+1} are the two time-series themes of adjacent time periods, with corresponding start times s(z_j) < s(z_{j+1}); P(w_i | z_j) is the probability that term w_i belongs to theme z_j;
comparing the similarity sim(z_j, z_{j+1}) with the preset similarity threshold ε:
when sim(z_j, z_{j+1}) > ε, determining that a time-series theme transition occurs between z_j and z_{j+1}, and that z_{j+1} evolved from z_j, expressed as
Figure FDA0004172142760000097
when sim(z_j, z_{j+1}) ≤ ε, determining that z_j and z_{j+1} are the same time-series theme;
and combining the detected time-series theme transitions along the time direction to obtain a theme evolution network.
CN202010195669.6A 2020-03-19 2020-03-19 Method and system for evaluating real-time influence of theme facing comprehensive integrated discussion environment Active CN111427999B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010195669.6A CN111427999B (en) 2020-03-19 2020-03-19 Method and system for evaluating real-time influence of theme facing comprehensive integrated discussion environment

Publications (2)

Publication Number Publication Date
CN111427999A CN111427999A (en) 2020-07-17
CN111427999B true CN111427999B (en) 2023-05-12

Family

ID=71548144

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010195669.6A Active CN111427999B (en) 2020-03-19 2020-03-19 Method and system for evaluating real-time influence of theme facing comprehensive integrated discussion environment

Country Status (1)

Country Link
CN (1) CN111427999B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant