CN113111658A - Method, device, equipment and storage medium for checking information - Google Patents
Method, device, equipment and storage medium for checking information
- Publication number
- CN113111658A (application number CN202110380128.5A)
- Authority
- CN
- China
- Prior art keywords
- text information
- information
- text
- stored
- audio data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
- G06F40/35—Discourse or dialogue representation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
Abstract
The present disclosure provides a method, an apparatus, a device and a storage medium for checking information, applied to the field of computer technology and, in particular, to the fields of speech recognition and natural language processing. The specific implementation scheme of the information checking method is as follows: acquiring audio data from any party in a multi-party conversation; identifying the audio data and obtaining first text information for the audio data; acquiring stored text information from a predetermined storage space in the case where that party is the host and the audio data is determined, based on the first text information, to be target data; and verifying the first text information based on the stored text information. The stored text information includes text information for the identified audio data in the multi-party conversation.
Description
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to the field of speech recognition and the field of natural language processing, and more particularly, to a method, an apparatus, a device, and a storage medium for verifying information.
Background
With the development of computer and network technologies, online conferences, online education and other activities conducted in the form of multi-party conversations have grown rapidly. The multi-party conversation form provides users with a convenient way to communicate.
Due to environmental and other factors, it is often difficult for a conference initiator or conference recorder to accurately record the complete content of a conference in a multi-party conversation.
Disclosure of Invention
A method, apparatus, device and storage medium for verifying information are provided that improve verification efficiency and verification accuracy.
According to an aspect of the present disclosure, there is provided a method of verifying information, the method including: acquiring audio data from any party in a multi-party conversation; identifying the audio data and obtaining first text information for the audio data; acquiring stored text information from a predetermined storage space in the case where that party is the host and the audio data is determined, based on the first text information, to be target data; and verifying the first text information based on the stored text information, wherein the stored text information includes text information for the identified audio data in the multi-party conversation.
According to another aspect of the present disclosure, there is provided an apparatus for verifying information, the apparatus including: a data acquisition module for acquiring audio data from any party in a multi-party conversation; a data identification module for identifying the audio data and obtaining first text information for the audio data; an information acquisition module for acquiring stored text information from a predetermined storage space in the case where that party is the host and the audio data is determined, based on the first text information, to be target data; and a verification module for verifying the first text information based on the stored text information, wherein the stored text information includes text information for the identified audio data in the multi-party conversation.
According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of verifying information provided by the present disclosure.
According to another aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform a method of verifying information provided by the present disclosure.
According to another aspect of the present disclosure, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the method of verifying information provided by the present disclosure.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic view of an application scenario of a method, an apparatus, a device and a storage medium for verifying information according to an embodiment of the present disclosure;
FIG. 2 is a schematic flow chart diagram of a method of verifying information in accordance with an embodiment of the present disclosure;
FIG. 3 is a schematic flow chart diagram of a method of verifying information according to another embodiment of the present disclosure;
FIG. 4 is a schematic diagram illustrating a principle of retrieving stored text information from a predetermined storage space according to an embodiment of the present disclosure;
FIG. 5 is a schematic flow chart illustrating verification of the first text information based on the stored text information according to an embodiment of the present disclosure;
FIG. 6 is a block diagram of an apparatus for verifying information according to an embodiment of the present disclosure; and
FIG. 7 is a block diagram of an electronic device for implementing a method of verifying information of an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The present disclosure provides a method of verifying information, which includes an audio data acquisition stage, an audio data identification stage, a text information acquisition stage and a text information verification stage. In the audio data acquisition stage, audio data from any party in the multi-party conversation is acquired. In the audio data identification stage, the audio data is identified to obtain first text information for the audio data. In the text information acquisition stage, in the case where that party is the host and the audio data is determined, based on the first text information, to be target data, stored text information is acquired from a predetermined storage space. In the text information verification stage, the first text information is verified based on the stored text information, where the stored text information includes text information for the identified audio data in the multi-party conversation.
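The control flow of these four stages can be summarized in the following minimal sketch. The helper callables (recognize, is_host, is_target, load_stored_text, verify) are hypothetical stand-ins for the components detailed in the remainder of this disclosure, passed in explicitly so that the sketch stays self-contained:

```python
from typing import Callable, List, Optional

def check_utterance(
    audio_data: bytes,
    speaker: str,
    storage: List[dict],
    recognize: Callable[[bytes], str],
    is_host: Callable[[str], bool],
    is_target: Callable[[str], bool],
    load_stored_text: Callable[[List[dict], str], List[dict]],
    verify: Callable[[str, List[dict]], bool],
) -> Optional[bool]:
    """One pass through the four stages: acquire, identify, retrieve, verify."""
    first_text = recognize(audio_data)               # audio data identification stage
    if is_host(speaker) and is_target(first_text):   # only the host's conclusion speech
        stored = load_stored_text(storage, first_text)  # text information acquisition stage
        return verify(first_text, stored)            # text information verification stage
    # Speech that does not trigger verification is kept for checking later target data.
    storage.append({"speaker": speaker, "text": first_text})
    return None
```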
An application scenario of the method and apparatus provided by the present disclosure will be described below with reference to fig. 1.
Fig. 1 is a schematic view of an application scenario of a method, an apparatus, a device, and a storage medium for checking information according to an embodiment of the present disclosure.
As shown in FIG. 1, the application scenario 100 includes terminal devices 111-114 and a server 120. The terminal devices 111-114 can be communicatively coupled to the server 120 via a network, which can include wired or wireless communication links.
According to the embodiment of the present disclosure, the terminal devices 111 to 114 may be terminal devices having a display screen and capable of audio and/or video calls, including but not limited to smart phones, tablet computers, laptop computers, desktop computers, and the like. The terminal devices 111 to 114 may be installed with various communication client applications, such as an instant messaging tool, social platform software, a web browser application, a search application, and the like (for example only).
In one embodiment, users may use the terminal devices 111 to 114 to interact with the server 120 over a network in order to establish a multi-party conversation. A user may be, for example, an employee of an enterprise, or a teacher, a student, or the like. Each user may use a personal terminal device to establish a remote session with the other users through social platform software or the like, so as to share information, teach, or study.
The server 120 may serve as an intermediary, for example, receiving the audio information and/or video information of a user collected by each terminal device and sending the audio information and/or video information collected by the other terminal devices to each terminal device, so as to implement a remote session among multiple users. The server 120 may be a server that provides various services, such as a background management server that provides support for the social platform software. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.
In an embodiment, after receiving the audio information and/or the video information collected by each terminal device, the server 120 may further perform recognition conversion on the audio information to obtain text information, and store the text information obtained by the conversion. According to actual requirements, the server 120 may perform semantic understanding on the converted text information to check the text information, for example.
It should be noted that the method for verifying information provided by the present disclosure may be executed by the server 120. Accordingly, the apparatus for verifying information provided by the present disclosure may be disposed in the server 120. Alternatively, the method of verifying information provided by the present disclosure may also be performed by a server or cluster of servers that is different from the server 120 and is capable of communicating with the server 120. Accordingly, the apparatus for verifying information provided by the present disclosure may also be disposed in a server or a server cluster different from the server 120 and capable of communicating with the server 120.
It should be understood that the number and type of terminal devices and servers in fig. 1 are merely illustrative. There may be any number and type of terminal devices, as desired for implementation.
Fig. 2 is a flow chart diagram of a method of verifying information according to an embodiment of the present disclosure.
As shown in fig. 2, the method 200 of verifying information of this embodiment may include operation S210, operation S230, operation S250, and operation S270.
In operation S210, audio data from any party in the multi-party conversation is acquired.
According to embodiments of the present disclosure, a multi-party conversation may be initiated, for example, by a plurality of users, including a host and participants, using social platform software on their respective terminal devices. The host may be, for example, a team leader or a project manager in an enterprise, and a participant may be a team member or a project engineer. Alternatively, the host may be a teacher and the participants may be students.
According to an embodiment of the present disclosure, the terminal device of each of the plurality of users may collect that user's speech to obtain audio data and send the audio data to the server supporting the operation of the social platform software, so that the server acquires audio data from any party in the multi-party conversation.
In operation S230, the audio data is recognized, and first text information for the audio data is obtained.
According to an embodiment of the present disclosure, the audio data may be converted into the first text information using automatic speech recognition (ASR) technology. Specifically, the audio data may be converted into the first text information using a dynamic time warping (DTW) method, hidden Markov model (HMM) theory, vector quantization (VQ) technology, or artificial neural network (ANN) technology.
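As an illustration of one of the techniques listed above, the following is a minimal dynamic time warping distance between two feature sequences (for example, MFCC frames). It only sketches the alignment idea behind template-based recognition under assumed inputs and is not the specific ASR implementation of this disclosure:

```python
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """DTW distance between two feature sequences of shape (frames, dims).

    A template-based recognizer would compare the input utterance against
    per-word templates with this distance and pick the closest template.
    """
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])   # local frame distance
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return float(cost[n, m])
```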
In operation S250, in the case where that party is the host and it is determined, based on the first text information, that the audio data is target data, stored text information is acquired from a predetermined storage space.
According to an embodiment of the present disclosure, the acquired audio data may carry, for example, account information that uniquely indicates the party in the multi-party conversation from which the audio data originates. When the multi-party conversation is initiated, the server may maintain an account information list for the host and an account information list for the participants based on the operations performed by the plurality of users on the social platform software in their respective terminal devices. By comparing the account information carried by the acquired audio data with the maintained account information lists, it can be determined whether that party is the host or a participant.
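A minimal sketch of this role lookup, assuming the server keeps the two account lists as in-memory sets; the account identifiers below are placeholders, not data from this disclosure:

```python
# Hypothetical account lists maintained by the server when the session is initiated.
HOST_ACCOUNTS = {"account_A"}
PARTICIPANT_ACCOUNTS = {"account_B", "account_D", "account_E", "account_F", "account_G"}

def role_of(account_id: str) -> str:
    """Decide whether the sender of the audio data is the host or a participant."""
    if account_id in HOST_ACCOUNTS:
        return "host"
    if account_id in PARTICIPANT_ACCOUNTS:
        return "participant"
    return "unknown"
```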
According to an embodiment of the present disclosure, the target data is data characterizing a conclusion. For example, the target data may be speech collected from the host expressing a conclusion during the multi-party conference. In this embodiment, in the case where it is determined that the party is the host, whether the audio data is the target data may be determined by performing semantic recognition on the first text information. Alternatively, key information may be extracted from the first text information, and whether the audio data is the target data may be determined based on the extracted key information. It is to be understood that the above-described target data is merely an example to facilitate understanding of the present disclosure, and the present disclosure is not limited thereto.
According to an embodiment of the present disclosure, the predetermined storage space may store, for example, text information obtained by identifying audio data previously acquired in the multi-party conversation; that is, the text information stored in the predetermined storage space includes text information for the identified audio data in the multi-party conversation. The text information stored in the predetermined storage space may be obtained by operations similar to the aforementioned operations S210 and S230. The identified audio data is speech uttered by the plurality of users and collected before the audio data is acquired in operation S210.
In operation S270, the first text information is checked based on the stored text information.
According to an embodiment of the present disclosure, a semantic understanding model may first be employed to determine conclusion information for the stored text information based on the stored text information. The semantic understanding model may be used, for example, to perform entity word recognition on the text information and to embed the recognized entity words into a predetermined template to obtain the conclusion information. The input of the semantic understanding model may be the stored text information and the output may be a conclusion text. The semantic understanding model may be constructed based on, for example, Bidirectional Encoder Representations from Transformers (BERT) or the knowledge-enhanced language representation model ERNIE. It is to be understood that these types of semantic understanding model are merely examples to facilitate understanding of the present disclosure, and the present disclosure is not limited thereto. For example, the semantic understanding model may also include a long short-term memory network model or the like.
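As a toy stand-in for the "recognize entity words, then embed them into a predetermined template" idea, the sketch below uses a regular-expression topic pattern, a fixed participant roster and a fixed template. All three are assumptions made purely for illustration; the disclosed approach would rely on a trained model such as BERT or ERNIE:

```python
import re

TOPIC = re.compile(r"(?:question|item)\s+\w+", re.IGNORECASE)  # toy entity pattern
ROSTER = ["B", "D", "E", "F", "G"]                             # assumed participant names

def conclusion_from(stored_texts: list) -> str:
    """Recognize entity words in the stored texts and fill a predetermined template."""
    topics, owners = [], []
    for text in stored_texts:
        topics += TOPIC.findall(text)
        owners += [name for name in ROSTER
                   if re.search(rf"\b{name}\b", text) and name not in owners]
    topic = topics[0] if topics else "the topic"
    # Predetermined template into which the recognized entity words are embedded.
    return f"{topic} is followed up by {', '.join(owners) or 'nobody'}"
```

Applied to the example utterances discussed later in this description, such a toy would recover the follow-up owners but not time expressions such as "this week", which is one reason a trained semantic model is preferable.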
Illustratively, the semantic understanding model may perform entity word recognition based on the semantics of the stored text information. For example, for stored text information that mentions both "Zhang San" and "Li Si" in connection with item X, the recognized entity words may include "item X" and "Zhang San" but not "Li Si", because according to the semantics of the stored text information, "Li Si" is no longer associated with "item X".
According to an embodiment of the present disclosure, after the conclusion information is obtained using the semantic understanding model, the conclusion information may be compared with the first text information to determine whether they match. Specifically, the similarity between the conclusion information and the first text information may be determined using at least one of the following measures: cosine similarity, the Pearson correlation coefficient, the Jaccard similarity coefficient, and the like. When the similarity between the conclusion information and the first text information is greater than a predetermined similarity, it is determined that the conclusion information matches the first text information.
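A minimal sketch of such a match test, using cosine similarity over simple bag-of-words counts; the 0.8 threshold is an assumed stand-in for the predetermined similarity, not a value taken from this disclosure:

```python
import math
import re
from collections import Counter

def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between two texts over bag-of-words term counts."""
    a = Counter(re.findall(r"\w+", text_a.lower()))
    b = Counter(re.findall(r"\w+", text_b.lower()))
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def conclusion_matches(conclusion: str, first_text: str, threshold: float = 0.8) -> bool:
    """True if the conclusion information matches the first text information."""
    return cosine_similarity(conclusion, first_text) > threshold
```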
For example, when it is determined that the conclusion information does not match the first text information, it may be determined that the first text information fails the verification. When it is determined that the conclusion information matches the first text information, it is determined that the first text information passes the verification.
By using the semantic understanding model to obtain the conclusion information, the semantics of the plurality of pieces of stored text information can be fused, and the semantic relevance among them is fully taken into account. Verifying the first text information by comparing it with the conclusion information obtained through this fusion can therefore improve the accuracy of the verification.
According to an embodiment of the present disclosure, when verifying the first text information, a text processing model may, for example, be used only to extract key information from the stored text information. It is then determined whether the first text information includes the key information; if so, the first text information passes the verification, otherwise it fails the verification. The text processing model may be constructed based on a bidirectional long short-term memory (BiLSTM) network and a conditional random field (CRF) model. It is to be understood that this type of text processing model is merely an example to facilitate understanding of the present disclosure, and the present disclosure is not limited thereto.
Illustratively, the text processing model may also be constructed based on a keyword extraction algorithm, which may, for example, perform keyword extraction based on a predetermined lexicon. The predetermined lexicon may be maintained in advance according to actual requirements, and the words it contains are not limited here.
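A minimal sketch of this lexicon-based variant under a strict reading of the check: extract key information from the stored text information and require the first text information to include all of it. The word bank below is a tiny illustrative subset, not the disclosed lexicon:

```python
LEXICON = {"question y", "follow up", "this week", "completed", "online"}  # illustrative only

def extract_keys(text: str, lexicon=LEXICON) -> set:
    """Keep the lexicon entries that occur in the text."""
    lowered = text.lower()
    return {entry for entry in lexicon if entry in lowered}

def passes_check(first_text: str, stored_texts: list) -> bool:
    """The first text information passes only if it covers the extracted key information."""
    required = set().union(*(extract_keys(text) for text in stored_texts))
    return required <= extract_keys(first_text)
```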
According to an embodiment of the present disclosure, in the case where the audio data is the target data and is obtained by collecting the host's speech, the text information of the audio data already identified in the multi-party conversation is acquired. In this way, the target data can be automatically verified against the context of the associated conversation, which improves the verification accuracy and extends the application coverage and commercial value of speech recognition technology.
Fig. 3 is a flow chart illustrating a method of verifying information according to another embodiment of the present disclosure.
According to an embodiment of the present disclosure, as shown in fig. 3, the method 300 of verifying information in this embodiment may include operation S310, operation S330, operation S350, and operation S370, and operation S391 and operation S393 performed between operation S330 and operation S350.
Operations S310 to S330 are performed in a loop during the multi-party conversation: audio data from any party in the multi-party conversation is acquired, and the audio data is identified to obtain text information for the audio data.
After the first text information for the audio data is obtained, operation S391 may be performed to determine whether that party is the host. If so, operation S393 is performed; otherwise, the flow returns to continue acquiring audio data.
In operation S393, it is determined whether the first text information is a conclusion type text.
For example, the first text information may be classified using a text classification model. The classification result may indicate the probability that the first text information expresses a conclusion, and when the probability is greater than a predetermined value, the audio data is determined to be the target data. Alternatively, the classification result may indicate whether the first text information belongs to the conclusion category, and when it does, the audio data is determined to be the target data. The text classification model may be, for example, a support vector machine model, a K-nearest neighbor (KNN) classification algorithm, or the like. The predetermined value may be any value, such as 0.8; the predetermined value and the text classification model may be set according to actual requirements, which is not limited in this disclosure.
For example, the text classification model may first extract key information from the first text information and then determine the category of the first text information according to the key information. In an embodiment, the text classification model may be constructed, for example, based on the text processing model described above, so as to extract the key information from the first text information.
Illustratively, the aforementioned predetermined lexicon may maintain entity words, verbs of the associated scenario, words indicating conclusions, words indicating supplementary content, question words, and the like. When the associated scenario is an enterprise scenario, the verbs may include, for example: follow up, resolve, process, confirm, evaluate, push, etc.; the words indicating a conclusion may include: conclusion, briefly, summarize, etc.; and the words indicating supplementary content may include: supplement, remind, reduce, also, etc. The keyword extraction algorithm may be, for example, the term frequency-inverse document frequency (TF-IDF) algorithm, the PageRank algorithm, or the like. It is to be understood that the predetermined lexicon and the text processing model described above are merely examples to facilitate understanding of the present disclosure, and the present disclosure is not limited thereto. In the case where the key information includes a word indicating a conclusion or a word indicating supplementary content and does not include a question word, the first text information may be determined to be of the conclusion type.
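A rule-of-thumb sketch of this lexicon-based decision; the word lists mirror the examples in the paragraph above and are not an exhaustive word bank:

```python
CONCLUSION_WORDS = {"conclusion", "briefly", "summarize"}
SUPPLEMENT_WORDS = {"supplement", "remind", "also"}
QUESTION_WORDS = {"who", "what", "when", "where", "why", "how"}

def is_conclusion_type(key_info: set) -> bool:
    """Conclusion type: a conclusion or supplementary-content word, and no question word."""
    has_signal = bool(key_info & (CONCLUSION_WORDS | SUPPLEMENT_WORDS))
    has_question = bool(key_info & QUESTION_WORDS)
    return has_signal and not has_question
```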
If the first text information is a conclusion-type text, operations S350 and S370 are performed to acquire the stored text information from the predetermined storage space and verify the first text information based on the stored text information. If the first text information is not a conclusion-type text, the flow returns to acquiring audio data.
In this embodiment, the first text information is classified by the text classification model to determine whether the audio data is the target data. This improves the accuracy of the determined target data, avoids verifying non-target data, and thus improves the verification efficiency.
Fig. 4 is a schematic diagram of a principle of obtaining stored text information from a predetermined storage space according to an embodiment of the present disclosure.
According to the embodiment of the disclosure, when the stored text information is acquired from the predetermined storage space, for example, only the text information related to the topic expressed by the current first text information may be acquired, so as to improve the efficiency and accuracy of checking the first text information.
As shown in FIG. 4, in the embodiment 400, when the stored text information is acquired, the topic information 420 included in the first text information 410 may be determined; the topic information 420 may be determined from the recognized key information. For example, for the first text information "I will state the conclusion of question Y, which is followed up by E", the topic information may be determined to be "question Y" based on the key information. After the topic information of the first text information is obtained, text information including the topic information may be searched for in the predetermined storage space as second text information. Then, the second text information, together with the text information stored in the predetermined storage space after the second text information was stored, is acquired as the stored text information used to verify the first text information.
Illustratively, as shown in FIG. 4, text information 431-434 is stored in the predetermined storage space in chronological order. If the text information including the topic information of the first text information is determined to be text information 432, the acquired stored text information is text information 432-434.
According to an embodiment of the present disclosure, when a plurality of pieces of text information in the predetermined storage space include the topic information, the earliest stored one of them may be determined to be the second text information. If the topic expressed by the topic information is mentioned several times while it is being discussed, this ensures that the acquired stored text information covers the whole process of discussing the topic. The predetermined storage space corresponds only to the multi-party conversation, so that when text information from multiple discussions of the same topic is stored, the current verification is not affected by text information obtained before the current discussion.
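A sketch of this topic-based retrieval, modeling the predetermined storage space as a chronological list of entries; the entry layout is an assumption made for illustration only:

```python
def stored_text_for_topic(storage: list, topic: str) -> list:
    """Return the earliest entry containing the topic plus every entry stored after it."""
    for index, entry in enumerate(storage):
        if topic in entry["text"] or topic in entry.get("topics", ()):
            return storage[index:]   # second text information and everything after it
    return []                        # the topic has not been discussed before
```

In the example of FIG. 4, if text information 432 is the earliest entry containing the topic, entries 432-434 are returned.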
According to an embodiment of the present disclosure, in the case where the acquired audio data comes from a participant other than the host, or the audio data is non-target data, the text information for the audio data may be stored in the predetermined storage space for use in the subsequent verification of text information for target data.
According to an embodiment of the present disclosure, in the case where the acquired audio data comes from the host and is the target data, the text information for the target data may be stored in the predetermined storage space once its verification has been completed, so that the predetermined storage space stores the text information corresponding to the complete audio data of the multi-party conversation.
For example, when storing the first text information for the audio data, the text processing model described above may be used to extract the key information from the first text information. In the case where the key information includes topic information, the topic information is added to the first text information, and the first text information finally stored in the predetermined storage space is the text information to which the topic information has been added. The topic information may be, for example, the name of a project, the name of a course, or a chapter of a course mentioned in the text information. Adding the topic information to the first text information facilitates the determination of the second text information in the predetermined storage space.
For example, after the first text information is obtained, key information may be extracted from it and used as a tag of the first text information, so that text classification and conclusion information generation can be performed on the first text information in subsequent processing. The key information may also be obtained in combination with the source of the audio data. For example, if the first text information obtained for audio data from party B is "I will follow up", the resulting key information may be "B follow up".
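An illustrative sketch of how a recognized utterance might be stored together with its topic and key-information tags. The entry layout, the injected helper callables and the "follow up" special case are assumptions for illustration, not the disclosed storage format:

```python
def store_text(storage: list, speaker: str, text: str,
               extract_keys, extract_topic) -> None:
    """Append the identified text to the predetermined storage space with its tags."""
    keys = extract_keys(text)
    # Combine key information with the source of the audio data, e.g. the utterance
    # "I will follow up" from party B is tagged as "B follow up".
    keys = {f"{speaker} {key}" if key == "follow up" else key for key in keys}
    entry = {"speaker": speaker, "text": text, "keys": keys}
    topic = extract_topic(text)
    if topic:  # add the topic information to the stored first text information
        entry["topics"] = {topic}
    storage.append(entry)
```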
Illustratively, when a multi-party conversation is initiated to discuss the progress of a project, the method of verifying information of this embodiment may obtain, for each piece of audio data, text information and key information such as those shown in the following table, where speaker A is the host and speakers B, D, E, F and G are the participants.
Illustratively, to verify the first text information "I will state the conclusion of question Y, which is followed up by E", the acquired stored text information may include the host's utterance raising question Y ("question Y went online and also needs to be looked at; who will follow up?") and participants' replies such as "I will also follow up on question Y". Using the semantic understanding model, the conclusion information "question Y is followed up by E, D and G, to be completed this week" can be obtained. Comparing this conclusion information with the first text information shows that the first text information is incomplete: it does not mention the key information "D", "G" and "this week". It is therefore determined that the verification fails.
Fig. 5 is a schematic flowchart of verifying the first text information based on the stored text information according to an embodiment of the present disclosure.
According to an embodiment of the present disclosure, in the case where it is determined that the first text information fails the verification, the server may further push predetermined prompt information to that party to indicate that an inaccurate statement has been made. The predetermined prompt information may be, for example, text information or sound information, and may be pushed to the terminal device used by that party so that the terminal device displays the text information, plays the sound information, or the like.
Illustratively, the prompt information may be, for example, a prompt indicating that the first text information is inaccurate, or it may further include the difference between the conclusion information and the first text information, so as to prompt the host to supplement the content of the statement.
According to an embodiment of the present disclosure, in a multi-party conference there may be more than one host, or the host may make several statements when giving a summary. Consequently, for the topic expressed by the topic information, the acquired audio data may include a plurality of pieces of audio data that come from the host and belong to the target data. In this case, the text information for these pieces of audio data may be selected from the stored text information as third text information. After the third text information is obtained, it is fused with the current first text information to obtain text information to be verified. The text information to be verified is then verified based on the text information in the stored text information other than the third text information. In this way, the accuracy of verifying the first text information can be ensured.
As shown in FIG. 5, in this embodiment, after the stored text information is acquired in operation S550 from the predetermined storage space uniquely corresponding to the multi-party conversation, the process of verifying the first text information based on the stored text information may include operations S571 to S577.
In operation S571, it is determined whether the stored text information includes third text information. If so, operation S573 is performed; otherwise, operation S575 is performed. The third text information is text information that contains the topic included in the first text information and that is obtained by identifying target data of the host, i.e., audio data that comes from the host and belongs to the target data.
In operation S573, text information to be verified is generated based on the third text information and the first text information. For example, keywords may be extracted from the third text information and the first text information and then nested into a predetermined template to obtain the text information to be verified. For example, if the third text information is "question Y is followed up by E" and the first text information is "question Y is followed up by D", the extracted keywords include "question Y", "E" and "D", and the generated text information to be verified may be "question Y is followed up by E and D", where the predetermined template is "XX is followed up by XXX". The text information to be verified may also be generated by the aforementioned semantic understanding model, for example, which is not limited in this disclosure.
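A sketch of this fusion step using the "XX is followed up by XXX" template from the example above; the topic pattern and the name roster are illustrative assumptions:

```python
import re

TOPIC = re.compile(r"(?:question|item)\s+\w+", re.IGNORECASE)
ROSTER = ["B", "D", "E", "F", "G"]

def text_to_verify(third_text: str, first_text: str) -> str:
    """Merge the keywords of both texts into the predetermined template."""
    combined = f"{third_text} {first_text}"
    match = TOPIC.search(combined)
    topic = match[0] if match else "the topic"
    owners = [name for name in ROSTER if re.search(rf"\b{name}\b", combined)]
    return f"{topic} is followed up by {', '.join(owners)}"

# text_to_verify("question Y is followed up by E", "question Y is followed up by D")
# returns "question Y is followed up by D, E"
```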
In operation S575, the first text information is used as the text information to be verified.
After the text information to be verified is obtained, operation S577 is performed to verify the text information to be verified based on the text information in the stored text information other than the third text information. It can be understood that the implementation of operation S577 is similar to the above-described method of verifying the first text information based on the stored text information, and is not repeated here.
Based on the method for verifying information provided by the present disclosure, the present disclosure also provides an apparatus for verifying information, which will be described in detail below with reference to fig. 6.
Fig. 6 is a block diagram of a structure of an apparatus for verifying information according to an embodiment of the present disclosure.
As shown in fig. 6, the apparatus 600 for verifying information of this embodiment includes a data acquisition module 610, a data identification module 630, an information acquisition module 650, and a verification module 670.
The data acquisition module 610 is used to acquire audio data from any party in the multi-party conversation. In an embodiment, the data acquisition module 610 may be configured to perform the operation S210 described above, for example, and is not described herein again.
The data identification module 630 is configured to identify the audio data and obtain first text information for the audio data. In an embodiment, the data identification module 630 may be configured to perform the operation S230 described above, for example, and is not described herein again.
The information acquisition module 650 is configured to acquire the stored text information from a predetermined storage space in the case where that party is the host and the audio data is determined, based on the first text information, to be target data. The stored text information includes text information for the identified audio data in the multi-party conversation. In an embodiment, the information acquisition module 650 may be configured to perform the operation S250 described above, for example, and is not described herein again.
The verification module 670 is configured to verify the first text information based on the stored text information. In an embodiment, the checking module 670 may be configured to perform the operation S270 described above, for example, and is not described herein again.
According to an embodiment of the present disclosure, the apparatus 600 for verifying information may further include a target data determination module configured to determine, based on the first text information, whether the audio data is the target data. The target data determination module may include, for example, a text type determination sub-module and a data determination sub-module. The text type determination sub-module is configured to determine, using a text classification model, whether the first text information is a conclusion-type text in the case where that party is the host. The data determination sub-module is configured to determine that the audio data is the target data in the case where the first text information is a conclusion-type text.
According to an embodiment of the present disclosure, the first text information includes topic information, and the predetermined storage space uniquely corresponds to the multi-party conversation. The information acquisition module 650 may include a text determination sub-module and an acquisition sub-module. The text determination sub-module is configured to determine second text information in the predetermined storage space, the second text information containing the topic information. The acquisition sub-module is configured to acquire, from the predetermined storage space, the second text information and the text information stored in the predetermined storage space after the second text information was stored.
According to an embodiment of the present disclosure, the verification module 670 may include an information determination sub-module and a verification sub-module. The information determination sub-module is configured to determine, using a semantic understanding model, conclusion information for the stored text information based on the stored text information. The verification sub-module is configured to determine that the first text information passes the verification in the case where the conclusion information matches the first text information.
According to an embodiment of the present disclosure, the apparatus 600 for verifying information may further include an information pushing module, for pushing predetermined prompt information to any party if it is determined that the first text information is not verified.
According to an embodiment of the present disclosure, the first text information includes topic information, and the predetermined storage space uniquely corresponds to the multi-party conversation. The verification module includes an information generation sub-module and a verification sub-module. The information generation sub-module is configured to generate text information to be verified based on third text information and the first text information in the case where the stored text information includes the third text information. The verification sub-module is configured to verify the text information to be verified based on the text information in the stored text information other than the third text information. The third text information is text information containing the topic information and is obtained by identifying target data of the host.
According to an embodiment of the present disclosure, the apparatus 600 for verifying information may further include a storage module configured, for example, to store the first text information in the predetermined storage space in the case where that party is a participant or the audio data is determined, based on the first text information, to be non-target data, or to store the first text information in the predetermined storage space after the first text information has been verified based on the stored text information.
According to an embodiment of the present disclosure, the storage module includes an information extraction sub-module, a topic information adding sub-module and a storage sub-module. The information extraction sub-module is configured to extract key information from the first text information using a text processing model in the case where the audio data is determined, based on the first text information, to be non-target data. The topic information adding sub-module is configured to add the topic information to the first text information in the case where the key information includes topic information. The storage sub-module is configured to store the first text information to which the topic information has been added in the predetermined storage space.
It should be noted that, in the technical solution of the present disclosure, the acquisition, storage, and application of the personal information of the users involved all comply with the relevant laws and regulations and do not violate public order and good customs.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 7 is a block diagram of an electronic device for implementing a method of verifying information of an embodiment of the present disclosure.
FIG. 7 illustrates a schematic block diagram of an example electronic device 700 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 7, the device 700 comprises a computing unit 701, which may perform various suitable actions and processes according to a computer program stored in a Read Only Memory (ROM)702 or a computer program loaded from a storage unit 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the device 700 can also be stored. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to bus 704.
Various components in the device 700 are connected to the I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, or the like; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, optical disk, or the like; and a communication unit 709 such as a network card, modem, wireless communication transceiver, etc. The communication unit 709 allows the device 700 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
Various implementations of the systems and techniques described here above may be implemented in digital electronic circuitry, integrated circuitry, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.
Claims (19)
1. A method of verifying information, comprising:
acquiring audio data from any one of the parties in the multi-party conversation;
identifying the audio data and obtaining first text information for the audio data;
acquiring stored text information from a predetermined storage space in the case where the any party is a host party and it is determined, based on the first text information, that the audio data is target data; and
verifying the first text information based on the stored text information,
wherein the stored text information comprises text information for the identified audio data in the multi-party conversation.
2. The method of claim 1, further comprising determining, based on the first text information, whether the audio data is target data, which comprises:
determining, using a text classification model, whether the first text information is a conclusion-type text in the case where the any party is a host party; and
determining the audio data to be the target data in the case where the first text information is a conclusion-type text.
3. The method of claim 1, wherein the first text information comprises topic information, and the predetermined storage space uniquely corresponds to the multi-party conversation; the acquiring of the stored text information from the predetermined storage space comprises:
determining second text information in the predetermined storage space, wherein the second text information contains the topic information; and
acquiring, from the predetermined storage space, the second text information and the text information stored in the predetermined storage space after the second text information is stored.
4. The method of claim 1, wherein the verifying the first text information based on the stored text information comprises:
determining, using a semantic understanding model, conclusion information for the stored text information based on the stored text information; and
determining that the first text information passes the verification in the case where the conclusion information matches the first text information.
5. The method of claim 4, further comprising:
pushing predetermined prompt information to the any party in the case where it is determined that the first text information fails the verification.
6. The method of claim 1, wherein the first text information comprises topic information, and the predetermined storage space uniquely corresponds to the multi-party conversation; the verifying the first text information based on the stored text information comprises:
generating text information to be verified based on third text information and the first text information in the case where the stored text information comprises the third text information; and
verifying the text information to be verified based on the text information in the stored text information other than the third text information,
wherein the third text information is text information including the topic information, and the third text information is obtained by identifying target data of the host party.
7. The method of claim 1, further comprising:
storing the first text information into the predetermined storage space when the any one party is a participant or when the audio data is determined, based on the first text information, to be non-target data; or
storing the first text information into the predetermined storage space after the first text information is verified based on the stored text information.
8. The method of claim 7, wherein the storing of the first text information into the predetermined storage space comprises:
extracting key information from the first text information by using a text processing model;
in a case where the key information includes subject information, adding the subject information to the first text information; and
storing the first text information to which the subject information has been added into the predetermined storage space.
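A minimal sketch of the storing path in claims 7 and 8, assuming for illustration that the "text processing model" can be replaced by a naive regular-expression extractor; the cue-word pattern and the bracket-tag storage format are invented.

```python
import re
from typing import List, Optional


def extract_subject(first_text: str) -> Optional[str]:
    """Stand-in for key-information extraction: pull a subject phrase after a cue word."""
    match = re.search(r"(?:about|regarding)\s+([\w ]+)", first_text, flags=re.IGNORECASE)
    return match.group(1).strip() if match else None


def store_first_text(first_text: str, storage: List[str]) -> None:
    """Tag the text with its subject (when one is found) before writing it to storage."""
    subject = extract_subject(first_text)
    storage.append(f"[{subject}] {first_text}" if subject else first_text)


meeting_store: List[str] = []
store_first_text("Quick update regarding the release schedule: QA starts Monday.", meeting_store)
print(meeting_store)
```

Tagging entries with their subject is what later lets the retrieval in claim 3 and the merge in claim 6 locate earlier text on the same subject.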
9. An apparatus for verifying information, comprising:
a data acquisition module configured to acquire audio data from any one party in a multi-party conversation;
a data identification module configured to identify the audio data and obtain first text information for the audio data;
an information acquisition module configured to acquire stored text information from a predetermined storage space in a case where the any one party is a host party and the audio data is determined to be target data based on the first text information; and
a verification module configured to verify the first text information based on the stored text information,
wherein the stored text information includes text information for audio data that has been identified in the multi-party conversation.
10. The apparatus of claim 9, further comprising a target data determination module configured to determine whether the audio data is target data based on the first text information, the target data determination module comprising:
a text type determination submodule configured to determine, by using a text classification model, whether the first text information is a conclusion-type text in a case where the any one party is a host party; and
a data determination submodule configured to determine the audio data to be target data when the first text information is a conclusion-type text.
11. The apparatus of claim 9, wherein the first text information comprises subject information, and the predetermined storage space uniquely corresponds to the multi-party conversation; the information acquisition module comprises:
a text determination submodule configured to determine second text information in the predetermined storage space, the second text information containing the subject information; and
an acquisition submodule configured to acquire, from the predetermined storage space, the second text information and text information stored in the predetermined storage space after the second text information.
12. The apparatus of claim 9, wherein the verification module comprises:
an information determination submodule configured to determine, by using a semantic understanding model, conclusion information for the stored text information; and
a verification submodule configured to determine that the first text information passes the verification in a case where the conclusion information matches the first text information.
13. The apparatus of claim 12, further comprising:
an information pushing module configured to push preset prompt information to the any one party in a case where it is determined that the first text information does not pass the verification.
14. The apparatus of claim 9, wherein the first text information comprises subject information, and the predetermined storage space uniquely corresponds to the multi-party conversation; the verification module comprises:
an information generation submodule configured to generate, in a case where the stored text information includes third text information, text information to be verified based on the third text information and the first text information; and
a checking submodule configured to check the text information to be verified based on text information other than the third text information in the stored text information,
wherein the third text information is text information that includes the subject information and is obtained by identifying target data of the host party.
15. The apparatus of claim 9, further comprising a storage module configured to:
store the first text information into the predetermined storage space when the any one party is a participant or when the audio data is determined, based on the first text information, to be non-target data; or
store the first text information into the predetermined storage space after the first text information is verified based on the stored text information.
16. The apparatus of claim 15, wherein the storage module comprises:
an information extraction submodule configured to extract key information from the first text information by using a text processing model;
a subject information adding submodule configured to add, in a case where the key information includes subject information, the subject information to the first text information; and
a storage submodule configured to store the first text information to which the subject information has been added into the predetermined storage space.
17. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-8.
18. A non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of any one of claims 1-8.
19. A computer program product comprising a computer program which, when executed by a processor, implements a method according to any one of claims 1 to 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110380128.5A CN113111658B (en) | 2021-04-08 | 2021-04-08 | Method, device, equipment and storage medium for checking information |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113111658A true CN113111658A (en) | 2021-07-13 |
CN113111658B CN113111658B (en) | 2023-08-18 |
Family
ID=76714932
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110380128.5A Active CN113111658B (en) | 2021-04-08 | 2021-04-08 | Method, device, equipment and storage medium for checking information |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113111658B (en) |
Patent Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140082100A1 (en) * | 2012-09-20 | 2014-03-20 | Avaya Inc. | Virtual agenda participant |
US20200152200A1 (en) * | 2017-07-19 | 2020-05-14 | Alibaba Group Holding Limited | Information processing method, system, electronic device, and computer storage medium |
CN109887508A (en) * | 2019-01-25 | 2019-06-14 | 广州富港万嘉智能科技有限公司 | A kind of meeting automatic record method, electronic equipment and storage medium based on vocal print |
CN110139062A (en) * | 2019-05-09 | 2019-08-16 | 平安科技(深圳)有限公司 | A kind of creation method, device and the terminal device of video conference record |
CN110379429A (en) * | 2019-07-16 | 2019-10-25 | 招联消费金融有限公司 | Method of speech processing, device, computer equipment and storage medium |
CN111277589A (en) * | 2020-01-19 | 2020-06-12 | 腾讯云计算(北京)有限责任公司 | Conference document generation method and device |
CN112528660A (en) * | 2020-12-04 | 2021-03-19 | 北京百度网讯科技有限公司 | Method, apparatus, device, storage medium and program product for processing text |
Non-Patent Citations (1)
Title |
---|
123代码搬运工123: "Hands-on development: using real-time speech transcription technology to produce meeting minutes", Retrieved from the Internet <URL:https://blog.csdn.net/weixin_49343044/article/details/109113444> *
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114897104A (en) * | 2022-06-14 | 2022-08-12 | 北京金堤科技有限公司 | Information acquisition method and device, electronic equipment and storage medium |
CN114970762A (en) * | 2022-06-22 | 2022-08-30 | 阿维塔科技(重庆)有限公司 | Data processing method, device, equipment and computer storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN113111658B (en) | 2023-08-18 |
Similar Documents
Publication | Title
---|---
US11250839B2 (en) | Natural language processing models for conversational computing
US10963505B2 (en) | Device, system, and method for automatic generation of presentations
US11063890B2 (en) | Technology for multi-recipient electronic message modification based on recipient subset
JP6604836B2 (en) | Dialog text summarization apparatus and method
US9483582B2 (en) | Identification and verification of factual assertions in natural language
US9064006B2 (en) | Translating natural language utterances to keyword search queries
US20190066696A1 (en) | Method and apparatus for verifying information
CN111428010B (en) | Man-machine intelligent question-answering method and device
US10956480B2 (en) | System and method for generating dialogue graphs
CN107430616A (en) | The interactive mode of speech polling re-forms
CN113111658B (en) | Method, device, equipment and storage medium for checking information
KR102030551B1 (en) | Instant messenger driving apparatus and operating method thereof
WO2023142451A1 (en) | Workflow generation methods and apparatuses, and electronic device
CN110738056B (en) | Method and device for generating information
CN115099239A (en) | Resource identification method, device, equipment and storage medium
CN115098729A (en) | Video processing method, sample generation method, model training method and device
US9747891B1 (en) | Name pronunciation recommendation
CN113470625A (en) | Voice conversation processing method, device, equipment and storage medium
US20200320134A1 (en) | Systems and methods for generating responses for an intelligent virtual
CN116049370A (en) | Information query method and training method and device of information generation model
CN116010571A (en) | Knowledge base construction method, information query method, device and equipment
KR102222637B1 (en) | Apparatus for analysis of emotion between users, interactive agent system using the same, terminal apparatus for analysis of emotion between users and method of the same
CN112632241A (en) | Method, device, equipment and computer readable medium for intelligent conversation
JP6885217B2 (en) | User dialogue support system, user dialogue support method and program
CN113241061B (en) | Method and device for processing voice recognition result, electronic equipment and storage medium
Legal Events
Code | Title
---|---
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant