CN113111658B - Method, device, equipment and storage medium for checking information - Google Patents

Info

Publication number
CN113111658B
Authority
CN
China
Prior art keywords
text information
information
text
stored
audio data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110380128.5A
Other languages
Chinese (zh)
Other versions
CN113111658A (en
Inventor
刘俊启
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202110380128.5A
Publication of CN113111658A
Application granted
Publication of CN113111658B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G06F40/35 Discourse or dialogue representation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)

Abstract

The disclosure provides a method, apparatus, device and storage medium for checking information, applied to the technical field of computers, in particular to the fields of speech recognition and natural language processing. A specific implementation of the information checking method is as follows: acquiring audio data from any one party in a multi-party session; recognizing the audio data to obtain first text information for the audio data; in the case where that party is a host and the audio data is determined to be target data based on the first text information, acquiring stored text information from a predetermined storage space; and verifying the first text information based on the stored text information, wherein the stored text information includes text information for audio data already recognized in the multi-party session.

Description

Method, device, equipment and storage medium for checking information
Technical Field
The present disclosure relates to the field of computer technology, in particular to the field of speech recognition and natural language processing, and more particularly to a method, apparatus, device and storage medium for verifying information.
Background
With the development of computer technology and network technology, online conferences, online education and other activities carried out in the form of multi-party sessions have grown rapidly. The multi-party session provides users with a convenient way to communicate.
Due to environmental factors, in multiparty sessions, it is often difficult for a conference initiator or conference recorder to accurately record the complete content of a conference.
Disclosure of Invention
Provided are a method, apparatus, device and storage medium for checking information, which improve checking efficiency and checking accuracy.
According to one aspect of the present disclosure, there is provided a method of checking information, the method comprising: acquiring audio data from any one party in a multi-party session; recognizing the audio data to obtain first text information for the audio data; in the case where that party is a host and the audio data is determined to be target data based on the first text information, acquiring stored text information from a predetermined storage space; and verifying the first text information based on the stored text information, wherein the stored text information includes text information for audio data already recognized in the multi-party session.
According to another aspect of the present disclosure, there is provided an apparatus for checking information, the apparatus including: a data acquisition module for acquiring audio data from any one party in a multi-party session; a data recognition module for recognizing the audio data to obtain first text information for the audio data; an information acquisition module for acquiring stored text information from a predetermined storage space in the case where that party is a host and the audio data is determined to be target data based on the first text information; and a verification module for verifying the first text information based on the stored text information, wherein the stored text information includes text information for audio data already recognized in the multi-party session.
According to another aspect of the present disclosure, there is provided an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of verifying information provided by the present disclosure.
According to another aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform a method of verifying information provided by the present disclosure.
According to another aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method of verifying information provided by the present disclosure.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic view of an application scenario of a method, apparatus, device and storage medium for verifying information according to an embodiment of the present disclosure;
FIG. 2 is a flow diagram of a method of verifying information according to an embodiment of the present disclosure;
FIG. 3 is a flow diagram of a method of verifying information according to another embodiment of the present disclosure;
FIG. 4 is a schematic diagram of retrieving stored text information from a predetermined storage space according to an embodiment of the present disclosure;
FIG. 5 is a flow diagram of verifying first text information based on stored text information according to an embodiment of the present disclosure;
FIG. 6 is a block diagram of an apparatus for verifying information according to an embodiment of the present disclosure; and
FIG. 7 is a block diagram of an electronic device for implementing a method of verifying information in accordance with an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
The present disclosure provides a method of verifying information, the method comprising an audio data acquisition stage, an audio data recognition stage, a text information acquisition stage, and a text information verification stage. In the audio data acquisition stage, audio data from any one party in a multi-party session is acquired. In the audio data recognition stage, the audio data is recognized to obtain first text information for the audio data. In the text information acquisition stage, in the case where that party is a host and the audio data is determined to be target data based on the first text information, stored text information is acquired from a predetermined storage space. In the text information verification stage, the first text information is verified based on the stored text information. The stored text information includes text information for audio data already recognized in the multi-party session.
An application scenario of the method and apparatus provided by the present disclosure will be described below with reference to fig. 1.
Fig. 1 is an application scenario schematic diagram of a method, an apparatus, a device and a storage medium for checking information according to an embodiment of the present disclosure.
As shown in fig. 1, the application scenario 100 includes terminal devices 111 to 114 and a server 120. Terminal devices 111-114 may be communicatively coupled to server 120 via a network, which may include wired or wireless communication links.
According to embodiments of the present disclosure, terminal devices 111-114 may be terminal devices that have a display and are capable of audio and/or video calls, including but not limited to smartphones, tablets, laptop computers, desktop computers, and the like. Various communication client applications, such as an instant messaging tool, social platform software, a web browser application, a search application, etc., may be installed on the terminal devices 111-114, merely as examples.
In one embodiment, a user may interact with server 120 over a network using terminal devices 111-114 to construct a multiparty session. The user may be, for example, a staff member in an enterprise, or may be a teacher, a student, or the like. The users can respectively use the personal terminal devices to establish remote sessions with other users through social platform software and the like so as to share information, teach knowledge, study knowledge and the like.
The server 120 may, for example, act as an intermediary, receive audio information and/or video information of the user collected by each terminal device, and send audio information and/or video information collected by other terminal devices to each terminal device, so as to implement a remote session between multiple users. The server 120 may be a server that provides various services, such as a background management server that provides support for social platform software. The server may be a cloud server, a server of a distributed system, or a server incorporating a blockchain.
In an embodiment, after receiving the audio information and/or the video information collected by each terminal device, the server 120 may, for example, further perform recognition conversion on the audio information to obtain text information, and store the converted text information. According to practical requirements, the server 120 may also perform semantic understanding on the converted text information, for example, so as to verify the text information.
It should be noted that the method for checking information provided by the present disclosure may be performed by the server 120. Accordingly, the means for verifying information provided by the present disclosure may be provided in the server 120. Alternatively, the method of verifying information provided by the present disclosure may also be performed by a server or a server cluster that is different from the server 120 and is capable of communicating with the server 120. Accordingly, the means for verifying information provided by the present disclosure may also be provided in a server or a server cluster that is different from the server 120 and is capable of communicating with the server 120.
It should be understood that the number and type of terminal devices and servers in fig. 1 are merely illustrative. There may be any number and type of terminal devices, as desired for implementation.
Fig. 2 is a flow diagram of a method of verifying information according to an embodiment of the present disclosure.
As shown in fig. 2, the method 200 of checking information of this embodiment may include operation S210, operation S230, operation S250, and operation S270.
In operation S210, audio data from any one party in the multi-party session is acquired.
According to embodiments of the present disclosure, a multi-party session may be initiated, for example, by a plurality of users using social platform software on their respective terminal devices, and the users may include a host and participants. The host may be a team leader or project manager within an enterprise, and the participants may be team members or project engineers. Alternatively, the host may be a teacher and the participants may be students.
According to the embodiment of the disclosure, the terminal equipment of each user in the plurality of users can collect voice of each user to obtain audio data, and the audio data is sent to the server supporting the operation of the social platform software, so that the server obtains the audio data from any one party in the multi-party session.
In operation S230, the audio data is recognized, and first text information for the audio data is obtained.
According to embodiments of the present disclosure, automatic speech recognition (ASR) techniques may be employed to convert the audio data into the first text information. In particular, the audio data may be converted into the first text information using a dynamic time warping (DTW) method, a hidden Markov model (HMM), a vector quantization (VQ) technique, a technique based on an artificial neural network (ANN), or the like.
In operation S250, in the case where that party is a host and the audio data is determined to be target data based on the first text information, stored text information is acquired from the predetermined storage space.
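To illustrate only the first of the techniques named above, the following is a minimal sketch of the classic dynamic time warping distance between two one-dimensional feature sequences. The function name `dtw_distance` and the use of the absolute difference as the local cost are assumptions made for illustration; the disclosure does not specify how DTW is applied, and a practical recognizer would compare multi-dimensional acoustic features against reference templates.

```python
from math import inf


def dtw_distance(a, b):
    """Classic O(len(a) * len(b)) dynamic time warping distance."""
    n, m = len(a), len(b)
    # cost[i][j] is the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])  # assumed local cost
            cost[i][j] = local + min(
                cost[i - 1][j],      # a advances, b repeats
                cost[i][j - 1],      # b advances, a repeats
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]
```

Because the alignment may repeat elements, a sequence and a time-stretched copy of it have distance zero under this cost.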
According to embodiments of the present disclosure, the aforementioned acquired audio data may carry account information, for example, that uniquely indicates any one of the parties to the multiparty conversation. When a multiparty session is initiated, the server may, for example, maintain a host's account information list and a participant's account information list based on the operation of the social platform software in the respective terminal device by multiple users. By comparing the account information carried by the acquired audio data with the maintained account information list, it is possible to determine whether the party is a host or a participant.
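The comparison of account information against the maintained lists can be sketched as follows. This is only an illustrative sketch: the function name `role_of` and the representation of the two account information lists as Python sets are assumptions, not part of the disclosure.

```python
def role_of(account_id, host_accounts, participant_accounts):
    """Return 'host', 'participant', or None for an unknown account.

    host_accounts and participant_accounts stand in for the account
    information lists maintained by the server (assumed to be sets).
    """
    if account_id in host_accounts:
        return "host"
    if account_id in participant_accounts:
        return "participant"
    return None
```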
According to an embodiment of the present disclosure, the target data is data characterizing a conclusion. For example, the target data may be collected speech in which the host expresses a conclusion during the multi-party session. In the case where the party is determined to be the host, this embodiment may determine whether the audio data is the target data by performing semantic recognition on the first text information. Alternatively, key information may be extracted from the first text information, and whether the audio data is the target data may be determined based on the extracted key information. It will be appreciated that the above-described target data is merely an example to facilitate understanding of the present disclosure, which is not limited thereto.
According to an embodiment of the present disclosure, the predetermined storage space may, for example, store text information obtained by identifying audio data that has been acquired in the multiparty conversation, i.e., the text information stored in the predetermined storage space includes text information for the audio data that has been identified in the multiparty conversation. The text information stored in the predetermined storage space may be obtained using operations similar to the aforementioned operation S210 and operation S230. The recognized audio data is speech uttered by a plurality of users collected before the audio data is acquired in operation S210.
In operation S270, the first text information is verified based on the stored text information.
According to embodiments of the present disclosure, a semantic understanding model may be employed to determine conclusion information from the stored text information. The semantic understanding model may be used to recognize entity words in the text information and to embed the recognized entity words into a preset template to obtain the conclusion information. The input of the semantic understanding model may be the stored text information, and the output may be conclusion text. The semantic understanding model may be constructed based on, for example, Bidirectional Encoder Representations from Transformers (BERT) or a knowledge-graph-enhanced language representation model (ERNIE). It will be appreciated that the types of semantic understanding model described above are merely examples to facilitate understanding of the present disclosure, which is not limited thereto. For example, the semantic understanding model may also include a long short-term memory network model, and the like.
For example, the semantic understanding model may perform entity word recognition based on the semantics of the stored text information. For example, for the stored text information "project X will be followed up by Zhang San, and Li Si will no longer follow up", the recognized entity words may include "project X" and "Zhang San" but not "Li Si". This is because, based on the semantics of the stored text information, "Li Si" and "project X" no longer have an association relationship.
According to the embodiment of the disclosure, after the conclusion information is obtained using the semantic understanding model, the conclusion information can be compared with the first text information to determine whether the two match. In particular, the similarity between the conclusion information and the first text information may be determined using at least one of the following measures: cosine similarity, the Pearson correlation coefficient, the Jaccard similarity coefficient, and the like. The conclusion information is determined to match the first text information when the similarity between them is greater than a predetermined similarity threshold.
For example, upon determining that the conclusion information does not match the first text information, it may be determined that the first text information fails verification. And when the conclusion information is determined to be matched with the first text information, determining that the first text information passes the verification.
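The matching step can be illustrated with a minimal sketch that computes two of the named measures over bag-of-words representations. The tokenization, the function names, and the threshold value of 0.8 are assumptions made for illustration; the disclosure does not fix how the texts are vectorized or what the predetermined similarity is.

```python
import re
from collections import Counter
from math import sqrt


def _tokens(text):
    # Assumed tokenization: lowercase word characters only.
    return re.findall(r"\w+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine similarity between word-count vectors of the two texts."""
    va, vb = Counter(_tokens(text_a)), Counter(_tokens(text_b))
    dot = sum(va[w] * vb[w] for w in va.keys() & vb.keys())
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


def jaccard_similarity(text_a, text_b):
    """Jaccard coefficient between the word sets of the two texts."""
    sa, sb = set(_tokens(text_a)), set(_tokens(text_b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def passes_verification(conclusion, first_text, threshold=0.8, measure=cosine_similarity):
    """Pass when the similarity exceeds the (assumed) predetermined threshold."""
    return measure(conclusion, first_text) > threshold
```

A surface measure of this kind rewards shared wording, so a first text that omits only a few short entity words may still score highly; this is one reason the choice of threshold and of text representation matters in practice.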
By adopting the semantic understanding model to obtain conclusion information, the fusion of the semantics of the plurality of stored text information can be realized, and the semantic relevance among the plurality of stored text information is fully considered. The accuracy of the verification of the first text information can be improved by verifying the first text information based on the comparison of the conclusion information obtained through fusion and the first text information.
In accordance with embodiments of the present disclosure, the first text information may, for example, also be verified using only a text processing model, which extracts key information from the stored text information. It is then determined whether the first text information includes the key information; if so, the first text information is determined to pass the verification, and otherwise it is determined to fail the verification. The text processing model may be constructed based on, for example, a bidirectional long short-term memory network model and a conditional random field model. It is to be understood that this type of text processing model is merely an example to facilitate understanding of the present disclosure, which is not limited thereto.
The text processing model may be constructed based on a keyword extraction algorithm, which may be based on a predetermined word stock, for example. The embodiment can maintain a predetermined word stock in advance according to actual requirements, and the embodiment does not limit words included in the predetermined word stock.
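The key-information check can be sketched with a simple word-stock lookup standing in for the text processing model. The function names and the contents of the word stock are hypothetical; a model such as the one described above would replace `extract_key_info` in practice.

```python
import re


def extract_key_info(text, word_stock):
    """Return the entries of the (hypothetical) word stock that occur in text."""
    tokens = set(re.findall(r"\w+", text.lower()))
    return {word for word in word_stock if word.lower() in tokens}


def verify_by_key_info(first_text, stored_texts, word_stock):
    """Pass only if first_text contains all key information found in stored_texts.

    Returns (passed, missing), where missing is the key information that the
    first text fails to mention.
    """
    required = set()
    for stored in stored_texts:
        required |= extract_key_info(stored, word_stock)
    missing = required - extract_key_info(first_text, word_stock)
    return not missing, missing
```

The `missing` set is one possible source for the difference information that may be included in a prompt when verification fails.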
According to the embodiment of the disclosure, in the case where the audio data is target data obtained by collecting the speech of the host, acquiring the text information of the audio data already recognized in the multi-party session allows the target data to be checked automatically against the context of the session. This can improve checking accuracy and broaden the coverage and commercial value of speech recognition technology.
Fig. 3 is a flow chart of a method of verifying information according to another embodiment of the present disclosure.
According to an embodiment of the present disclosure, as shown in fig. 3, the method 300 of checking information in this embodiment may include operations S310, S330, S350, and S370, and operations S391 and S393 performed between operations S330 and S350.
Operations S310 to S330 are performed in a loop during the multi-party session to acquire audio data from any one party in the multi-party session and to recognize the audio data to obtain first text information for the audio data.
After the first text information for the audio data is obtained, operation S391 may be performed to determine whether that party is a host. If that party is the host, operation S393 is performed; otherwise, the process returns to continue acquiring audio data.
In operation S393, it is determined whether the first text information is a conclusion type text.
For example, the first text information may be classified using a text classification model, the classification result indicating a probability that the first text information expresses a conclusion, and the audio data is determined to be target data when the probability is greater than a predetermined value. Or, the classification result indicates whether the first text information belongs to a conclusion category, and when the first text information belongs to the conclusion category, the audio data is determined to be target data. The text classification model may be, for example, a support vector machine model, a K Nearest Neighbor (KNN) classification algorithm, and the like. The predetermined value may be, for example, any value such as 0.8, and the predetermined value and the text classification model may be set according to actual requirements, which is not limited in this disclosure.
For example, the text classification model may first extract key information from the first text information. After the key information is extracted, the category of the first text information is determined according to the key information. In an embodiment, the text classification model may be constructed, for example, based on the text processing model described above to extract key information from the first text information.
For example, the foregoing predetermined word stock may maintain entity words, verbs associated with the scenario, words indicating conclusions, words indicating supplementary content, query words, etc. When the associated scenario is an enterprise scenario, the verbs may include, for example: follow-up, resolution, processing, validation, evaluation, promotion, etc.; words indicating conclusions may include: conclusion, brief summary, and the like; and words indicating supplementary content may include: supplement, remind, return, etc. The keyword extraction algorithm may be, for example, a term frequency-inverse document frequency (TF-IDF) algorithm, a web page ranking (PageRank) algorithm, or the like. It will be appreciated that the predetermined word stock and text processing model described above are merely examples to facilitate understanding of the present disclosure, and the present disclosure is not limited thereto. In the case where a word indicating a conclusion and a word indicating supplementary content are included in the key information and a query word is not included, the first text information may be determined to be of the conclusion type.
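A rule of this kind can be sketched as follows. The word lists are illustrative, and the query words in particular are an assumption, since the disclosure does not enumerate them. The sketch also assumes that a conclusion word or a supplementary-content word alone is sufficient; the rule as worded could be read as requiring both, but the example "I say conclusion of question Y, followed by E" contains only a conclusion word.

```python
import re

# Illustrative word-stock entries; the query words are assumed.
CONCLUSION_WORDS = {"conclusion", "summary"}
SUPPLEMENT_WORDS = {"supplement", "remind"}
QUERY_WORDS = {"who", "what", "when", "why", "how"}


def is_conclusion_type(text):
    """Rule-based check: a conclusion/supplement marker and no query word."""
    tokens = set(re.findall(r"\w+", text.lower()))
    has_marker = bool(tokens & (CONCLUSION_WORDS | SUPPLEMENT_WORDS))
    # A question mark is also treated as a query signal (assumption).
    has_query = bool(tokens & QUERY_WORDS) or "?" in text
    return has_marker and not has_query
```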
If the first text information is a conclusion type text, operations S350 and S370 are performed to acquire stored text information from a predetermined storage space and verify the first text information based on the stored text information. And if the first text information is not the text of the conclusion type, returning to acquire the audio data.
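The control flow of FIG. 3 can be sketched as a loop over recognized utterances. This is a sketch under stated assumptions: speech recognition is taken to have already produced `(account_id, first_text)` pairs, a plain list stands in for the predetermined storage space, and `is_conclusion` and `verify` are hypothetical callables supplied by the caller.

```python
def process_utterances(utterances, host_accounts, is_conclusion, verify):
    """Run the S391 / S393 / S350 / S370 decisions over recognized utterances.

    utterances: iterable of (account_id, first_text) pairs.
    Returns a list of (first_text, verification_result) for checked utterances.
    """
    stored = []   # stands in for the predetermined storage space
    results = []
    for account_id, first_text in utterances:
        # S391: is the speaker a host?  S393: is the text conclusion-type?
        if account_id in host_accounts and is_conclusion(first_text):
            # S350: acquire stored text; S370: verify against it.
            results.append((first_text, verify(first_text, list(stored))))
        # Text is stored in every case so that later checks see the full context.
        stored.append(first_text)
    return results
```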
According to the embodiment, the first text information is classified according to the text classification model, whether the audio data are target data or not is determined, the accuracy of the determined target data can be improved, verification of non-target data can be avoided, and verification efficiency is improved.
Fig. 4 is a schematic diagram of retrieving stored text information from a predetermined storage space according to an embodiment of the present disclosure.
According to the embodiment of the disclosure, when the stored text information is acquired from the predetermined storage space, for example, only the text information related to the subject expressed by the current first text information can be acquired, so that the efficiency and the accuracy of checking the first text information are improved.
As shown in FIG. 4, in this embodiment 400, when the stored text information is acquired, the subject information 420 included in the first text information 410 may be determined first; the subject information 420 may be determined from the aforementioned extracted key information. For example, for the first text information "I say conclusion of question Y, followed by E", the subject information may be determined as "question Y" from the key information. After the subject information of the first text information is obtained, text information including the subject information may be searched for in the predetermined storage space as second text information. The second text information, together with the text information stored in the predetermined storage space after the second text information was stored, is then acquired from the predetermined storage space as the stored text information for verifying the first text information.
Illustratively, as shown in fig. 4, text information 431 to 434 is stored in a predetermined storage space, and the text information 431 to 434 is stored in the predetermined storage space in chronological order. If the text information including the subject information in the first text information is determined to be the text information 432, the acquired stored text information is the text information 432-434.
According to an embodiment of the present disclosure, when a plurality of pieces of text information in the predetermined storage space include the subject information, the embodiment may determine the earliest stored of them as the second text information. In this way, if the subject is mentioned multiple times while it is being discussed, the acquired stored text information still covers the whole discussion of the subject. The predetermined storage space corresponds only to the current multi-party session, so that, when the same subject has been discussed on several occasions, text information obtained before the current discussion does not affect the current verification.
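The retrieval rule (start at the earliest stored text that mentions the subject and take everything stored after it) can be sketched as follows. The function name and the use of a chronologically ordered list of strings with substring matching are assumptions made for illustration.

```python
def get_stored_text(stored_texts, subject):
    """Return the stored texts from the earliest mention of subject onward.

    stored_texts is assumed to be in chronological order of storage.
    An empty list is returned when the subject is never mentioned.
    """
    for index, text in enumerate(stored_texts):
        if subject in text:
            return stored_texts[index:]
    return []
```

With four stored texts of which the second and fourth mention the subject, the second, third and fourth are returned, matching the example of text information 432 to 434.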
According to the embodiment of the disclosure, in the case where the acquired audio data is from a participant other than the host, or the audio data is non-target data, the text information for the audio data may be stored in the aforementioned predetermined storage space for use in the subsequent verification of text information for target data.
According to the embodiment of the disclosure, if the acquired audio data is from the host and the audio data is the target data, after the verification of the text information of the target data is completed, the text information of the target data may be stored in the predetermined storage space, so that the text information corresponding to the complete audio data of the multiparty session is stored in the predetermined storage space.
For example, when storing the first text information for the audio data, the text processing model described above may be used to extract key information from the first text information. In the case where the key information includes subject information, the subject information is added to the first text information. The first text information finally stored to the predetermined storage space is text information to which the subject information is added. The topic information may be, for example, a project name, a course chapter, or the like included in the text information. By adding the subject information to the first text information, it is possible to facilitate determination of the second text information in the predetermined storage space.
For example, after the first text information is obtained, the key information may be extracted from the first text information, and the key information may be used as a tag of the first text information, so as to facilitate text classification and conclusion information generation of the first text information in a subsequent processing process. Wherein the key information may also be obtained in connection with the source of the audio data, for example. For example, if the first text message is "i come to follow" obtained from the audio data of party B, the key information obtained may be "B follow".
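Tagging a piece of text with subject information and speaker-aware key information before it is stored can be sketched as follows. The `StoredText` record, the `make_record` helper, and the simple first-person pattern are all hypothetical; the disclosure obtains key information with a text processing model rather than a regular expression.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StoredText:
    speaker: str
    text: str
    subject: Optional[str] = None                  # e.g. a project name
    tags: List[str] = field(default_factory=list)  # key information


def make_record(speaker, text, subjects, verbs):
    """Build the record to store: add subject info and speaker-resolved tags."""
    lowered = text.lower()
    subject = next((s for s in subjects if s.lower() in lowered), None)
    # A first-person statement is attributed to the speaker (assumed rule).
    tags = [
        f"{speaker} {verb}"
        for verb in verbs
        if re.search(rf"\bI\b.*\b{re.escape(verb)}\b", text)
    ]
    return StoredText(speaker, text, subject, tags)
```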
Illustratively, when a multiparty conversation is initiated for the purpose of discussing project progress, the method of checking information of this embodiment may yield, for each piece of audio data, the text information and key information shown in the following table, where speaker A is the host and speakers B, D, E, F and G are participants.
For example, suppose the first text information is "Let me state the conclusion: question Y is followed by E", and the stored text information includes utterances such as "Next topic: the online question Y. Question Y still needs to be looked at; who will follow up?" and "I will follow question Y as well". Applying the semantic understanding model to the stored text information yields conclusion information to the effect that question Y is followed by E, D and G, together with a completion period. Comparing the conclusion information with the first text information shows that the first text information is incomplete: the key information "D", "G" and the period are not mentioned. The verification is therefore determined to have failed.
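The comparison in this example can be sketched as a set check. The sketch assumes that the conclusion information and the first text information have already been reduced to sets of key items by upstream models, and it interprets "matching" as the first text information covering every key item of the conclusion information; both points, and the function name `verify_against_conclusion`, are assumptions made for illustration.

```python
from typing import Set, Tuple


def verify_against_conclusion(first_text_keys: Set[str],
                              conclusion_keys: Set[str]) -> Tuple[bool, Set[str]]:
    """Return (passed, missing). Verification passes only if every key item of
    the conclusion information also appears in the first text information."""
    missing = conclusion_keys - first_text_keys
    return (not missing, missing)


# Key items assumed to come from upstream models (hypothetical values).
conclusion = {"question Y", "E", "D", "G", "period"}
host_summary = {"question Y", "E"}

passed, missing = verify_against_conclusion(host_summary, conclusion)
```

Here `passed` is false and `missing` holds the items that the host's summary left out.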
Fig. 5 is a flow diagram of verifying first text information based on stored text information according to an embodiment of the present disclosure.
According to the embodiment of the disclosure, in the case where it is determined that the first text information fails the verification, the server may further push predetermined prompt information to the party to indicate that the party's utterance is inaccurate. The predetermined prompt information may be, for example, text information or sound information, and may be pushed to the terminal device used by the party, so that the terminal device displays the text information or plays the sound information.
The prompt information may be, for example, a prompt indicating that the first text information is incorrect. It may further include the difference between the conclusion information and the first text information, so as to prompt the host to supplement the utterance.
According to the embodiments of the present disclosure, in a multiparty conference the host may speak several times, or may make several utterances when summarizing. As a result, for the subject expressed by the subject information, the acquired audio data may include a plurality of pieces of audio data that come from the host and belong to the target data. In this case, the text information for these pieces of audio data may first be selected from the stored text information as third text information. The third text information is then fused with the current first text information to obtain the text information to be verified, which is verified based on the stored text information other than the third text information. In this way, the accuracy of verifying the first text information can be improved.
As shown in Fig. 5, in this embodiment, after the stored text information is acquired through operation S550 from the predetermined storage space that uniquely corresponds to the multi-party conversation, the process of verifying the first text information based on the stored text information may include operations S571 to S577.
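A prompt that optionally carries the difference information could be composed along the following lines. The wording of the message and the function name `build_prompt` are hypothetical; the disclosure does not prescribe a particular format.

```python
from typing import Iterable


def build_prompt(missing_items: Iterable[str], include_difference: bool = True) -> str:
    """Compose a text prompt for the party whose statement failed verification."""
    base = "The statement does not match the discussion record."
    items = sorted(missing_items)
    if include_difference and items:
        # Append the difference between the conclusion and the statement.
        return f"{base} Not mentioned: {', '.join(items)}."
    return base


prompt = build_prompt({"G", "D", "period"})
```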
In operation S571, it is determined whether the third text information is included in the stored text information. If so, operation S573 is performed, otherwise operation S575 is performed. The third text information is text information which contains the subject contained in the first text information and is obtained by identifying target data of the host. The target data of the host is the audio data which comes from the host and belongs to the target data.
In operation S573, text information to be verified is generated based on the third text information and the first text information. For example, keywords may be extracted from the third text information and the first text information and then filled into a predetermined template to obtain the text information to be verified. For example, if the third text information is "question Y is followed by E", the first text information is "question Y is followed by D", and the predetermined template is "XX is followed by XXX", the extracted keywords include "question Y", "E", "D" and "follow", and the generated text information to be verified may be "question Y is followed by E and D". The text information to be verified may also be generated, for example, by the foregoing semantic understanding model, which is not limited by the present disclosure.
In operation S575, the first text information is taken as the text information to be checked.
After the text information to be verified is obtained, operation S577 is performed to verify it based on the stored text information other than the third text information. It is to be appreciated that operation S577 is implemented in a manner similar to the method for verifying the first text information based on the stored text information described above, and it will not be described again here.
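Operations S571 to S575 can be sketched as below. This is a minimal illustration under assumptions: the `Record` fields, the regular-expression template in `fuse_followers`, and the function `select_text_to_verify` are hypothetical, and the template-based fusion merely stands in for whatever keyword extraction or semantic understanding model is actually used. The returned `others` list is what operation S577 would verify the candidate against.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Record:
    text: str
    topic: str
    from_host: bool
    is_target: bool  # True if the utterance was classified as target data


def fuse_followers(third_texts: List[str], first_text: str) -> str:
    """Toy fusion for S573 using the template "<topic> is followed by <names>"."""
    pattern = re.compile(r"(?P<topic>.+?) is followed by (?P<names>.+)")
    topic, names = None, []
    for text in [*third_texts, first_text]:
        match = pattern.fullmatch(text)
        if match is None:
            continue
        topic = topic or match.group("topic")
        for name in re.split(r",\s*|\s+and\s+", match.group("names")):
            if name and name not in names:
                names.append(name)
    if topic is None:
        return first_text
    return f"{topic} is followed by {' and '.join(names)}"


def select_text_to_verify(first_text: str, topic: str, stored: List[Record],
                          fuse: Callable[[List[str], str], str]
                          ) -> Tuple[str, List[Record]]:
    """Return the text to be verified and the stored records to verify it against."""
    def is_third(record: Record) -> bool:
        # S571: earlier host conclusions on the same topic count as third text.
        return record.from_host and record.is_target and record.topic == topic

    third = [r for r in stored if is_third(r)]
    others = [r for r in stored if not is_third(r)]
    if third:
        candidate = fuse([r.text for r in third], first_text)   # S573
    else:
        candidate = first_text                                   # S575
    return candidate, others


stored = [
    Record("question Y is followed by E", "question Y", True, True),
    Record("I will follow question Y as well", "question Y", False, False),
]
candidate, others = select_text_to_verify(
    "question Y is followed by D", "question Y", stored, fuse_followers)
```

With these illustrative inputs, the earlier host conclusion is fused with the current one and only the participant's utterance remains as reference text.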
Based on the method for checking information provided by the present disclosure, the present disclosure further provides an apparatus for checking information, which will be described in detail below with reference to fig. 6.
Fig. 6 is a block diagram of an apparatus for checking information according to an embodiment of the present disclosure.
As shown in fig. 6, the apparatus 600 for checking information of this embodiment includes a data acquisition module 610, a data identification module 630, an information acquisition module 650, and a checking module 670.
The data acquisition module 610 is configured to acquire audio data from any one of the parties in the multi-party conversation. In an embodiment, the data acquisition module 610 may be used to perform operation S210 described above, which will not be described again here.
The data recognition module 630 is configured to recognize the audio data and obtain first text information for the audio data. In an embodiment, the data recognition module 630 may be used to perform operation S230 described above, which will not be described again here.
The information acquisition module 650 is configured to acquire stored text information from a predetermined storage space in a case where the party is the host and the audio data is determined to be target data based on the first text information. The stored text information includes text information for the audio data in the multiparty conversation that has already been recognized. In an embodiment, the information acquisition module 650 may be used to perform operation S250 described above, which will not be described again here.
The verification module 670 is configured to verify the first text information based on the stored text information. In an embodiment, the verification module 670 may be used to perform operation S270 described above, which will not be described again here.
The apparatus 600 for verifying information according to the embodiment of the present disclosure may further include, for example, a target data determination module for determining whether the audio data is target data based on the first text information. The target data determination module may include, for example, a text type determination sub-module and a data determination sub-module. The text type determination sub-module is configured to determine, in the case where the party is the host, whether the first text information is text of the conclusion type by using a text classification model. The data determination sub-module is configured to determine that the audio data is target data in the case where the first text information is text of the conclusion type.
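The decision made by the target data determination module can be sketched as follows. The cue-phrase classifier `toy_classifier` is a hypothetical stand-in for the text classification model, and the role labels and function names are assumptions made for illustration.

```python
from typing import Callable


def toy_classifier(text: str) -> str:
    """Hypothetical stand-in for the text classification model."""
    cues = ("conclusion", "to summarize", "in summary")
    return "conclusion" if any(cue in text.lower() for cue in cues) else "other"


def is_target_data(speaker_role: str, first_text: str,
                   classify: Callable[[str], str]) -> bool:
    """Audio data is target data only if it comes from the host and its
    recognized text is classified as text of the conclusion type."""
    if speaker_role != "host":
        return False
    return classify(first_text) == "conclusion"


host_summary = is_target_data(
    "host", "Let me state the conclusion: question Y is followed by E", toy_classifier)
host_question = is_target_data("host", "Who will follow up?", toy_classifier)
participant = is_target_data("participant", "In summary, I agree", toy_classifier)
```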
According to an embodiment of the present disclosure, the first text information includes subject information, and the predetermined storage space uniquely corresponds to the multiparty session. The information acquisition module 650 may include a text determination sub-module and an acquisition sub-module. The text determination sub-module is configured to determine second text information in the predetermined storage space, where the second text information includes the subject information. The acquisition sub-module is configured to acquire, from the predetermined storage space, the second text information and the text information that was stored in the predetermined storage space after the second text information was stored.
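The acquisition described above amounts to slicing an ordered store from the second text information onward. The sketch below assumes the storage space is an ordered list of (subject, text) pairs for a single session and takes the earliest entry carrying the subject as the second text information; the data layout and the name `acquire_stored_text` are assumptions made for illustration.

```python
from typing import List, Tuple


def acquire_stored_text(storage: List[Tuple[str, str]], subject: str) -> List[str]:
    """Return the first stored text carrying the subject, plus every text
    stored after it. An empty subject string means no subject was attached."""
    for index, (stored_subject, _) in enumerate(storage):
        if stored_subject == subject:
            return [text for _, text in storage[index:]]
    return []


session_storage = [
    ("question X", "First topic: question X."),
    ("question Y", "Next topic: question Y. Who will follow up?"),
    ("", "I will follow up"),
    ("", "I will follow as well"),
]
relevant = acquire_stored_text(session_storage, "question Y")
```

Texts stored after the second text information are included even when they carry no subject of their own, which is why tagging the utterance that opens a topic is enough to recover the whole discussion of that topic.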
According to an embodiment of the present disclosure, the verification module 670 may include an information determination sub-module and a verification sub-module. The information determination sub-module is configured to determine, based on the stored text information, conclusion information for the stored text information by using a semantic understanding model. The verification sub-module is configured to determine that the first text information passes the verification in the case where the conclusion information matches the first text information.
According to an embodiment of the present disclosure, the apparatus 600 for checking information may further include an information pushing module configured to push predetermined prompt information to the party in the case where it is determined that the first text information fails the verification.
According to an embodiment of the present disclosure, the first text information includes subject information, and the predetermined storage space uniquely corresponds to the multiparty session. The verification module includes an information generation sub-module and a verification sub-module. The information generation sub-module is configured to generate text information to be verified based on the third text information and the first text information in the case where the stored text information includes the third text information. The verification sub-module is configured to verify the text information to be verified based on the stored text information other than the third text information. The third text information is text information that includes the subject information and that is obtained by recognizing target data of the host.
According to an embodiment of the present disclosure, the apparatus 600 for verifying information may further include a storage module. The storage module is configured, for example, to store the first text information in the predetermined storage space in a case where the party is a participant or the audio data is determined to be non-target data based on the first text information. Alternatively, the storage module is configured to store the first text information in the predetermined storage space after the first text information has been verified based on the stored text information.
According to an embodiment of the present disclosure, the storage module includes an information extraction sub-module, a subject information adding sub-module, and a storage sub-module. The information extraction sub-module is configured to extract key information from the first text information by using a text processing model. The subject information adding sub-module is configured to add the subject information to the first text information in the case where the key information includes subject information. The storage sub-module is configured to store the first text information with the added subject information in the predetermined storage space.
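How the modules of the apparatus 600 cooperate on a single utterance can be sketched as below. This is a simplified composition under assumptions: each module is reduced to a plain callable, the storage space is an in-memory list, and `CheckingApparatus`, `toy_verify` and the lambda stand-ins are hypothetical rather than part of the disclosure.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class CheckingApparatus:
    recognize: Callable[[bytes], str]           # data identification module 630
    is_conclusion: Callable[[str], bool]        # target data determination module
    verify: Callable[[str, List[str]], bool]    # verification module 670
    storage: List[str] = field(default_factory=list)  # predetermined storage space

    def handle(self, role: str, audio: bytes) -> Optional[bool]:
        """Process one utterance. Return the verification result, or None when
        no verification is needed (participant or non-target data)."""
        first_text = self.recognize(audio)                  # operation S230
        result = None
        if role == "host" and self.is_conclusion(first_text):
            stored = list(self.storage)                     # operation S250
            result = self.verify(first_text, stored)        # operation S270
        self.storage.append(first_text)                     # storage module
        return result


def toy_verify(text: str, stored: List[str]) -> bool:
    """Pass only if every party who volunteered ("<name> follows") is named."""
    volunteers = [s.split()[0] for s in stored
                  if s.endswith(" follows") and not s.startswith("Conclusion")]
    return all(name in text.split() for name in volunteers)


apparatus = CheckingApparatus(
    recognize=lambda audio: audio.decode("utf-8"),   # pretend the audio is already text
    is_conclusion=lambda text: text.startswith("Conclusion"),
    verify=toy_verify,
)
r1 = apparatus.handle("participant", b"D follows")
r2 = apparatus.handle("host", b"Conclusion: E follows")
r3 = apparatus.handle("host", b"Conclusion: D and E follow")
```

In this toy run the first host summary omits D and fails, while the corrected summary passes; every utterance is appended to the storage so that the record of the session stays complete.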
It should be noted that, in the technical solution of the present disclosure, the acquisition, storage, and application of the personal information of users all comply with the relevant laws and regulations and do not violate public order and good customs.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
Fig. 7 is a block diagram of an electronic device for implementing a method of verifying information in accordance with an embodiment of the present disclosure.
Fig. 7 illustrates a schematic block diagram of an example electronic device 700 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in Fig. 7, the device 700 includes a computing unit 701 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 702 or a computer program loaded from a storage unit 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the device 700 may also be stored. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other through a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
Various components in device 700 are connected to I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, etc.; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, an optical disk, or the like; and a communication unit 709 such as a network card, modem, wireless communication transceiver, etc. The communication unit 709 allows the device 700 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The computing unit 701 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of the computing unit 701 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 701 performs the respective methods and processes described above, for example, the method of checking information. For example, in some embodiments, the method of checking information may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 708. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 700 via the ROM 702 and/or the communication unit 709. When the computer program is loaded into the RAM 703 and executed by the computing unit 701, one or more steps of the method of checking information described above may be performed. Alternatively, in other embodiments, the computing unit 701 may be configured to perform the method of checking information by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor and which may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include local area networks (LANs), wide area networks (WANs), and the Internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions of the present disclosure can be achieved, which is not limited herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (16)

1. A method of verifying information, comprising:
acquiring audio data from any one party in a multi-party session;
recognizing the audio data to obtain first text information for the audio data;
acquiring stored text information from a predetermined storage space in a case where the party is a host and the audio data is determined to be target data based on the first text information; and
verifying the first text information based on the stored text information,
wherein the stored text information includes text information for audio data in the multi-party session that has already been recognized;
wherein the first text information comprises subject information, and the predetermined storage space uniquely corresponds to the multi-party session; and the acquiring the stored text information from the predetermined storage space comprises:
determining second text information in the predetermined storage space, wherein the second text information comprises the subject information; and
acquiring, from the predetermined storage space, the second text information and the text information stored in the predetermined storage space after the second text information was stored.
2. The method of claim 1, further comprising determining whether the audio data is target data based on the first text information, comprising:
in a case where the party is the host, determining whether the first text information is text of a conclusion type by using a text classification model; and
determining that the audio data is target data in a case where the first text information is text of the conclusion type.
3. The method of claim 1, wherein the verifying the first text information based on the stored text information comprises:
determining, based on the stored text information, conclusion information for the stored text information by using a semantic understanding model; and
determining that the first text information passes the verification in a case where the conclusion information matches the first text information.
4. The method of claim 1, further comprising:
pushing predetermined prompt information to the party in a case where it is determined that the first text information fails the verification.
5. The method of claim 1, wherein the verifying the first text information based on the stored text information comprises:
generating text information to be verified based on third text information and the first text information in a case where the stored text information comprises the third text information; and
verifying the text information to be verified based on the stored text information other than the third text information,
wherein the third text information is text information including the subject information, and the third text information is obtained by recognizing target data of the host.
6. The method of claim 1, further comprising:
storing the first text information in the predetermined storage space in a case where the party is a participant or the audio data is determined to be non-target data based on the first text information; or
storing the first text information in the predetermined storage space after the first text information is verified based on the stored text information.
7. The method of claim 6, wherein storing the first text information in the predetermined storage space comprises:
extracting key information from the first text information by using a text processing model;
in a case where the key information includes subject information, adding the subject information to the first text information; and
storing the first text information with the added subject information in the predetermined storage space.
8. An apparatus for verifying information, comprising:
a data acquisition module configured to acquire audio data from any one party in a multi-party session;
a data recognition module configured to recognize the audio data to obtain first text information for the audio data;
an information acquisition module configured to acquire stored text information from a predetermined storage space in a case where the party is a host and the audio data is determined to be target data based on the first text information; and
a verification module configured to verify the first text information based on the stored text information,
wherein the stored text information includes text information for audio data in the multi-party session that has already been recognized;
wherein the first text information comprises subject information, and the predetermined storage space uniquely corresponds to the multi-party session; and the information acquisition module comprises:
a text determination sub-module configured to determine second text information in the predetermined storage space, wherein the second text information comprises the subject information; and
an acquisition sub-module configured to acquire, from the predetermined storage space, the second text information and the text information stored in the predetermined storage space after the second text information was stored.
9. The apparatus of claim 8, further comprising a target data determination module configured to determine whether the audio data is target data based on the first text information, wherein the target data determination module comprises:
a text type determination sub-module configured to determine, in a case where the party is the host, whether the first text information is text of a conclusion type by using a text classification model; and
a data determination sub-module configured to determine that the audio data is target data in a case where the first text information is text of the conclusion type.
10. The apparatus of claim 8, wherein the verification module comprises:
an information determination sub-module configured to determine, based on the stored text information, conclusion information for the stored text information by using a semantic understanding model; and
a verification sub-module configured to determine that the first text information passes the verification in a case where the conclusion information matches the first text information.
11. The apparatus of claim 10, further comprising:
an information pushing module configured to push predetermined prompt information to the party in a case where it is determined that the first text information fails the verification.
12. The apparatus of claim 8, wherein the verification module comprises:
an information generation sub-module configured to generate text information to be verified based on third text information and the first text information in a case where the stored text information includes the third text information; and
a verification sub-module configured to verify the text information to be verified based on the stored text information other than the third text information,
wherein the third text information is text information including the subject information, and the third text information is obtained by recognizing target data of the host.
13. The apparatus of claim 8, further comprising a storage module configured to:
store the first text information in the predetermined storage space in a case where the party is a participant or the audio data is determined to be non-target data based on the first text information; or
store the first text information in the predetermined storage space after the first text information is verified based on the stored text information.
14. The apparatus of claim 13, wherein the storage module comprises:
an information extraction sub-module configured to extract key information from the first text information by using a text processing model;
a subject information adding sub-module configured to add subject information to the first text information in a case where the key information includes the subject information; and
a storage sub-module configured to store the first text information with the added subject information in the predetermined storage space.
15. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.
16. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-7.
CN202110380128.5A 2021-04-08 2021-04-08 Method, device, equipment and storage medium for checking information Active CN113111658B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110380128.5A CN113111658B (en) 2021-04-08 2021-04-08 Method, device, equipment and storage medium for checking information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110380128.5A CN113111658B (en) 2021-04-08 2021-04-08 Method, device, equipment and storage medium for checking information

Publications (2)

Publication Number Publication Date
CN113111658A CN113111658A (en) 2021-07-13
CN113111658B true CN113111658B (en) 2023-08-18

Family

ID=76714932

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110380128.5A Active CN113111658B (en) 2021-04-08 2021-04-08 Method, device, equipment and storage medium for checking information

Country Status (1)

Country Link
CN (1) CN113111658B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109887508A (en) * 2019-01-25 2019-06-14 广州富港万嘉智能科技有限公司 A kind of meeting automatic record method, electronic equipment and storage medium based on vocal print
CN110139062A (en) * 2019-05-09 2019-08-16 平安科技(深圳)有限公司 A kind of creation method, device and the terminal device of video conference record
CN110379429A (en) * 2019-07-16 2019-10-25 招联消费金融有限公司 Method of speech processing, device, computer equipment and storage medium
CN111277589A (en) * 2020-01-19 2020-06-12 腾讯云计算(北京)有限责任公司 Conference document generation method and device
CN112528660A (en) * 2020-12-04 2021-03-19 北京百度网讯科技有限公司 Method, apparatus, device, storage medium and program product for processing text

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9269073B2 (en) * 2012-09-20 2016-02-23 Avaya Inc. Virtual agenda participant
CN109285548A (en) * 2017-07-19 2019-01-29 阿里巴巴集团控股有限公司 Information processing method, system, electronic equipment and computer storage medium

Also Published As

Publication number Publication date
CN113111658A (en) 2021-07-13

Similar Documents

Publication Publication Date Title
US10176804B2 (en) Analyzing textual data
US10777207B2 (en) Method and apparatus for verifying information
US11063890B2 (en) Technology for multi-recipient electronic message modification based on recipient subset
US11250839B2 (en) Natural language processing models for conversational computing
US9483582B2 (en) Identification and verification of factual assertions in natural language
US10956480B2 (en) System and method for generating dialogue graphs
CN107430616A (en) Interactive reformulation of speech queries
CN111428010B (en) Man-machine intelligent question-answering method and device
US10180988B2 (en) Persona-based conversation
WO2023142451A1 (en) Workflow generation methods and apparatuses, and electronic device
US10762906B2 (en) Automatically identifying speakers in real-time through media processing with dialog understanding supported by AI techniques
CN111832308A (en) Method and device for processing consistency of voice recognition text
US10102289B2 (en) Ingesting forum content
KR102030551B1 (en) Instant messenger driving apparatus and operating method thereof
CN110738056A (en) Method and apparatus for generating information
WO2020199590A1 (en) Mood detection analysis method and related device
CN113111658B (en) Method, device, equipment and storage medium for checking information
CN116049370A (en) Information query method and training method and device of information generation model
CN113470625A (en) Voice conversation processing method, device, equipment and storage medium
CN112632241A (en) Method, device, equipment and computer readable medium for intelligent conversation
CN112969000A (en) Control method and device of network conference, electronic equipment and storage medium
CN114501112B (en) Method, apparatus, device, medium, and article for generating video notes
CN113241061B (en) Method and device for processing voice recognition result, electronic equipment and storage medium
US20240143678A1 (en) Intelligent content recommendation within a communication session
US11741298B1 (en) Real-time meeting notes within a communication platform

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant