CN112599130A - Intelligent conference system based on intelligent screen - Google Patents

Intelligent conference system based on intelligent screen

Info

Publication number
CN112599130A
Authority
CN
China
Prior art keywords
information
voice
text
display
voice information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011408172.4A
Other languages
Chinese (zh)
Other versions
CN112599130B (en)
Inventor
李广垒
陈祖涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Baoxin Information Technology Co ltd
Original Assignee
Anhui Baoxin Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui Baoxin Information Technology Co ltd
Priority to CN202011408172.4A
Publication of CN112599130A
Application granted
Publication of CN112599130B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/10 Speech classification or search using distance or distortion measures between unknown speech and reference templates
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems

Abstract

The invention discloses an intelligent conference system based on an intelligent screen, which comprises: an acquisition and recognition module, used for acquiring voice information and performing voice recognition on the acquired voice information; a voice amplification module, used for receiving the voice information sent by the acquisition and recognition module, amplifying it, and transmitting the amplified voice information to a loudspeaker, where it is converted into sound and emitted in a directional manner; an information processing module, used for receiving the voice information sent by the acquisition and recognition module and converting it into text information; and a subtitle display module, used for receiving the text information sent by the information processing module and displaying it on a screen. The invention improves the accuracy with which a speaker's information is conveyed, so that the information is transmitted better.

Description

Intelligent conference system based on intelligent screen
Technical field:
The invention relates to the technical field of intelligent conference systems, and in particular to an intelligent conference system based on an intelligent screen.
Background art:
Since the beginning of the 21st century, society has gradually entered the age of multimedia information. Mass media now consist mainly of the internet, television and mobile phones, and multimedia information has become an indispensable part of daily life. Live broadcasting combines sound, text and images, the three elements of a media carrier, and is the most direct way for people to convey and understand information; this is particularly evident at press conferences, large-scale conferences, live television broadcasts, and education and training events. Any setting in which information is conveyed through sound, images and text, such as interviews, meetings, legal disputes and medical consultations, needs a system that can put the content on screen in real time.
To achieve live broadcasting in these settings, the conventional solution is as follows: during on-site recording, a professional shorthand team transcribes and proofreads the audio; once transcription is complete, the audio is matched with the video or with pictures and text, and the result is then released. This approach has the following limitations: 1. the message is delayed, because the video is released only after manual transcription, so there is a time lag between the video and the live event; 2. information is acquired inefficiently; 3. subsequent processing consumes resources, since manual effort is needed to align the timestamps of the transcribed text with the video to form subtitles for the live video broadcast.
Summary of the invention:
The invention aims to overcome the above shortcomings and provides an intelligent conference system based on an intelligent screen.
The invention provides an intelligent conference display method based on an intelligent screen, which comprises the following steps:
acquiring the voice signal of a speech;
amplifying the voice signal and transmitting the amplified voice signal to a loudspeaker, where it is converted into sound and output;
recognizing the acquired voice signal, converting it into text information, and displaying the converted text information;
wherein the text information is updated in real time in the following manner: the text information comprises first display text and calibration display text; the first display text consists of keywords selected when the matching degree between the voice information collected in a first preset time period and the corresponding text in the sample information exceeds a first threshold matching degree, and the first display text displays the identified keywords at intervals; the calibration display text is a sentence determined when the matching degree between the voice information collected in a second preset time period and the corresponding text in the sample information exceeds a second threshold matching degree; the first preset time period is shorter than the second preset time period, and the first threshold matching degree is lower than the second threshold matching degree; the calibration display text overwrites the first display text.
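The two orderings stated in the method (shorter time period and lower threshold for the keyword stage, longer time period and higher threshold for the calibration stage) can be sketched as a small configuration check. This is only an illustration: the names `StageConfig`, `validate_stages` and `choose_display`, the use of seconds, the assumption that the matching degree lies in [0, 1], and joining keywords with spaces are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageConfig:
    """One recognition stage (hypothetical structure, not from the patent)."""
    window_seconds: float     # preset time period over which speech is collected
    match_threshold: float    # threshold matching degree, assumed to lie in [0, 1]


def validate_stages(first: StageConfig, calibration: StageConfig) -> None:
    """Check the two orderings the method requires between the stages."""
    if not first.window_seconds < calibration.window_seconds:
        raise ValueError("first preset time period must be shorter than the second")
    if not first.match_threshold < calibration.match_threshold:
        raise ValueError("first threshold must be lower than the second")


def choose_display(keywords, sentence, keyword_score, sentence_score,
                   first: StageConfig, calibration: StageConfig):
    """Return the text to show, or None if neither stage passed its threshold."""
    if sentence is not None and sentence_score > calibration.match_threshold:
        return sentence                  # calibration text overwrites the keywords
    if keywords and keyword_score > first.match_threshold:
        return " ".join(keywords)        # keyword-only first display text
    return None
```

With a first stage of `StageConfig(0.5, 0.6)` and a calibration stage of `StageConfig(3.0, 0.9)`, a keyword result scoring 0.7 is shown on its own until a whole-sentence result scoring above 0.9 arrives and replaces it.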
In another aspect, the present invention provides an intelligent conference system based on an intelligent screen, including:
an acquisition and recognition module, used for acquiring voice information and performing voice recognition on the acquired voice information;
a voice amplification module, used for receiving the voice information sent by the acquisition and recognition module, amplifying it, and transmitting the amplified voice information to a loudspeaker, where it is converted into sound and emitted in a directional manner;
an information processing module, used for receiving the voice information sent by the acquisition and recognition module and converting it into text information;
and a subtitle display module, used for receiving the text information sent by the information processing module and displaying it on a screen.
Further, the information processing module converts voice information into text information that is updated in real time; the text information comprises first display text and calibration display text; the first display text consists of keywords selected when the matching degree between the voice information collected in a first preset time period and the corresponding text in the sample information exceeds a first threshold matching degree, and the first display text displays the identified keywords at intervals; the calibration display text is a sentence determined when the matching degree between the voice information collected in a second preset time period and the corresponding text in the sample information exceeds a second threshold matching degree; the first preset time period is shorter than the second preset time period, and the first threshold matching degree is lower than the second threshold matching degree; the calibration display text overwrites the first display text.
Further, the acquisition and recognition module comprises a microphone, a sound console and a sound card; the sound console is connected to the microphone through a data line and to the sound card through a data line, and the sound card is used for conveying signals to the voice amplification module.
Another aspect of the invention provides a display device comprising a display, a memory and a processor, wherein:
the processor is used for executing the intelligent conference display method;
the memory is used for storing instructions executable by the processor;
the display is used for displaying the text information processed by the processor.
Another aspect of the present invention provides a storage medium storing a computer program for executing the intelligent conference display method.
The intelligent conference system based on an intelligent screen disclosed by the invention has the following beneficial effects. The voice information produced by a speech can be amplified and transmitted, and at the same time recognized, with the resulting text displayed, so that conference participants can both hear the speaker and read the corresponding text on the display screen. People take in text faster than speech, and when reading they can skip content that does not concern them, so the text display lets them grasp what is being expressed more quickly. The accuracy of information transmission is thereby improved, and the information is transmitted better. When the voice information is recognized and converted into text, the keywords that have been accurately recognized in the speaker's sentence are displayed first, so that participants can respond to the key information immediately; the whole sentence, obtained by whole-sentence recognition, is then displayed a short time after the keywords, which improves the completeness of what is expressed. The speech is thus displayed both promptly and completely, making information transmission faster and more accurate.
Drawings
Fig. 1 is a schematic diagram of an intelligent conference system based on an intelligent screen.
Detailed Description
The present invention is further illustrated by the following examples. The examples are implemented on the basis of the technical scheme of the invention, and detailed embodiments and specific operating procedures are given, but the protection scope of the invention is not limited to the following examples:
Exemplary method:
The application provides an intelligent conference display method based on an intelligent screen, which comprises the following steps:
acquiring the voice signal of a speech;
amplifying the voice signal and transmitting the amplified voice signal to a loudspeaker, where it is converted into sound and output;
recognizing the acquired voice signal, converting it into text information, and displaying the converted text information;
wherein the text information is updated in real time in the following manner, the text information comprising first display text and calibration display text:
the first display text consists of keywords selected when the matching degree between the voice information collected in a first preset time period and the corresponding text in the sample information exceeds a first threshold matching degree, and the first display text displays the identified keywords at intervals; in some embodiments, the keywords can be displayed in a flashing manner. Because the keywords can be recognized and displayed as soon as the speaker has uttered them, timeliness is improved, and a reader can use the keyword text to skip content that does not concern them, grasp the general meaning, and quickly understand what the speech expresses;
the calibration display text is a sentence determined when the matching degree between the voice information collected in a second preset time period and the corresponding text in the sample information exceeds a second threshold matching degree, the first preset time period being shorter than the second preset time period and the first threshold matching degree being lower than the second threshold matching degree; the calibration display text overwrites the first display text; in some embodiments, the whole sentence is matched over the longer time period, with a certain lag, and converted into suitable calibration text, and the more accurate calibration text overwrites the originally displayed keyword text, which improves accuracy; in some embodiments, several second preset time periods with different lag times can be added, so that calibration display text is generated for several different time periods, further improving the accuracy of the text.
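The overwrite behaviour described above, including the embodiment with several calibration passes at different lag times, might be sketched as follows. This is a minimal illustration under stated assumptions: the class `SubtitleLine`, the function `replay`, the `STAGES` table and all numeric values are hypothetical, and the recognizer results are supplied by hand rather than produced by a real speech recognizer.

```python
from dataclasses import dataclass, field

# (lag in seconds, threshold matching degree) per stage; illustrative values only.
# Stage 0 is the keyword stage, stages 1 and 2 are calibration passes.
STAGES = [(0.5, 0.60), (2.0, 0.80), (5.0, 0.90)]


@dataclass
class SubtitleLine:
    """Text currently shown for one utterance and the stage that produced it."""
    text: str = ""
    stage: int = -1                      # -1 means nothing has been shown yet
    history: list = field(default_factory=list)

    def offer(self, stage: int, text: str, score: float, threshold: float) -> bool:
        """Overwrite the line if the result passed its threshold and is from a later stage."""
        if score <= threshold or stage <= self.stage:
            return False
        self.history.append(self.text)   # keep what was overwritten, for inspection
        self.text, self.stage = text, stage
        return True


def replay(results):
    """Apply (stage, text, score) results in order of lag; return each text shown."""
    line = SubtitleLine()
    shown = []
    for stage, text, score in sorted(results, key=lambda r: STAGES[r[0]][0]):
        if line.offer(stage, text, score, STAGES[stage][1]):
            shown.append(line.text)
    return shown
```

In this sketch a later pass only replaces the displayed text when it clears its own, stricter threshold, so a low-confidence whole-sentence result leaves the keywords on screen until a better pass arrives.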
Exemplary system:
An intelligent conference system based on an intelligent screen, as shown in Fig. 1, comprises:
an acquisition and recognition module, used for acquiring voice information and performing voice recognition on the acquired voice information;
a voice amplification module, used for receiving the voice information sent by the acquisition and recognition module, amplifying it, and transmitting the amplified voice information to a loudspeaker, where it is converted into sound and emitted in a directional manner;
an information processing module, used for receiving the voice information sent by the acquisition and recognition module and converting it into text information;
and a subtitle display module, used for receiving the text information sent by the information processing module and displaying it on a screen.
Specifically, the information processing module converts voice information into text information that is updated in real time; the text information comprises first display text and calibration display text; the first display text consists of keywords selected when the matching degree between the voice information collected in a first preset time period and the corresponding text in the sample information exceeds a first threshold matching degree, and the first display text displays the identified keywords at intervals; the calibration display text is a sentence determined when the matching degree between the voice information collected in a second preset time period and the corresponding text in the sample information exceeds a second threshold matching degree; the first preset time period is shorter than the second preset time period, and the first threshold matching degree is lower than the second threshold matching degree; the calibration display text overwrites the first display text.
It should be noted that the acquisition and recognition module comprises a microphone, a sound console and a sound card; the sound console is connected to the microphone through a data line and to the sound card through a data line, and the sound card is used for transmitting signals to the voice amplification module.
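The text does not say how the matching degree against the sample information is computed. Purely as a stand-in, the sketch below uses the character-level similarity ratio from Python's standard `difflib` module and compares already-recognized text with sample entries; a real system would more likely score acoustic features. The function names `matching_degree` and `best_match` are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Optional, Sequence


def matching_degree(recognized: str, sample: str) -> float:
    """Similarity in [0, 1] between recognized text and one sample entry."""
    return SequenceMatcher(None, recognized, sample).ratio()


def best_match(recognized: str, samples: Sequence[str],
               threshold: float) -> Optional[str]:
    """Return the closest sample entry if its matching degree exceeds the threshold."""
    if not samples:
        return None
    best = max(samples, key=lambda s: matching_degree(recognized, s))
    return best if matching_degree(recognized, best) > threshold else None
```

Calling `best_match` with a low threshold on a short window and a higher threshold on a longer window would give the two stages their different strictness.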
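The data flow between the four modules can be sketched as below: the acquisition side fans the same captured audio out to the amplification path and to the text path. All class and method names are hypothetical, audio is modelled as a list of floats in [-1, 1], the loudspeaker is replaced by a list that records what would be played, and the speech-to-text step is an injected callable rather than a real recognizer. For simplicity the sketch places recognition inside the information processing module, whereas the text assigns voice recognition to the acquisition and recognition module.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AmplificationModule:
    gain: float = 2.0
    played: List[List[float]] = field(default_factory=list)

    def receive(self, samples: List[float]) -> None:
        # Scale and clip to [-1, 1]; stands in for driving a loudspeaker.
        self.played.append([max(-1.0, min(1.0, s * self.gain)) for s in samples])


@dataclass
class SubtitleDisplayModule:
    lines: List[str] = field(default_factory=list)

    def show(self, text: str) -> None:
        self.lines.append(text)


@dataclass
class InformationProcessingModule:
    recognize: Callable[[List[float]], str]   # hypothetical speech-to-text callable
    display: SubtitleDisplayModule

    def receive(self, samples: List[float]) -> None:
        text = self.recognize(samples)
        if text:                               # show nothing for empty results
            self.display.show(text)


@dataclass
class AcquisitionModule:
    amplifier: AmplificationModule
    processor: InformationProcessingModule

    def capture(self, samples: List[float]) -> None:
        # Send the same captured audio to both downstream modules.
        self.amplifier.receive(samples)
        self.processor.receive(samples)
```

Injecting the recognizer as a callable keeps the wiring testable without audio hardware or a speech engine.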
Exemplary device:
A display device comprising a display, a memory and a processor;
the processor is used for executing the intelligent conference display method;
the memory is used for storing instructions executable by the processor;
the display is used for displaying the text information processed by the processor.
Exemplary computer program products and computer-readable storage media:
In addition to the above-described methods and apparatus, embodiments of the present disclosure may also be a computer program product comprising computer program instructions that, when executed by a processor, cause the processor to perform the steps in the intelligent conference display method according to the various embodiments of the present disclosure described in the "exemplary methods" section of this specification above.
Program code for carrying out operations of embodiments of the present disclosure may be written for the computer program product in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server.
Furthermore, embodiments of the present disclosure may also be a computer-readable storage medium having stored thereon computer program instructions that, when executed by a processor, cause the processor to perform the steps in the intelligent conference display method according to various embodiments of the present disclosure described in the "exemplary methods" section above in this specification.
The computer-readable storage medium may take any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may include, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The foregoing is merely an example of the present invention; common general knowledge about features of the schemes is not described here in detail. It should be noted that a person skilled in the art can make several modifications without departing from the invention, and these should also be considered to fall within the protection scope of the invention.

Claims (6)

1. An intelligent conference system based on an intelligent screen, characterized by comprising:
an acquisition and recognition module, used for acquiring voice information and performing voice recognition on the acquired voice information;
a voice amplification module, used for receiving the voice information sent by the acquisition and recognition module, amplifying it, and transmitting the amplified voice information to a loudspeaker, where it is converted into sound and emitted in a directional manner;
an information processing module, used for receiving the voice information sent by the acquisition and recognition module and converting it into text information;
and a subtitle display module, used for receiving the text information sent by the information processing module and displaying it on a screen.
2. The intelligent conference system based on an intelligent screen as claimed in claim 1, wherein: the information processing module converts voice information into text information that is updated in real time; the text information comprises first display text and calibration display text; the first display text consists of keywords selected when the matching degree between the voice information collected in a first preset time period and the corresponding text in the sample information exceeds a first threshold matching degree, and the first display text displays the identified keywords at intervals; the calibration display text is a sentence determined when the matching degree between the voice information collected in a second preset time period and the corresponding text in the sample information exceeds a second threshold matching degree; the first preset time period is shorter than the second preset time period, and the first threshold matching degree is lower than the second threshold matching degree; the calibration display text overwrites the first display text.
3. The intelligent conference system based on an intelligent screen as claimed in claim 2, wherein: the acquisition and recognition module comprises a microphone, a sound console and a sound card; the sound console is connected to the microphone through a data line and to the sound card through a data line, and the sound card is used for conveying signals to the voice amplification module.
4. An intelligent conference display method based on an intelligent screen, characterized by comprising the following steps:
acquiring the voice signal of a speech;
amplifying the voice signal and transmitting the amplified voice signal to a loudspeaker, where it is converted into sound and output;
recognizing the acquired voice signal, converting it into text information, and displaying the converted text information;
wherein the text information is updated in real time in the following manner: the text information comprises first display text and calibration display text; the first display text consists of keywords selected when the matching degree between the voice information collected in a first preset time period and the corresponding text in the sample information exceeds a first threshold matching degree, and the first display text displays the identified keywords at intervals; the calibration display text is a sentence determined when the matching degree between the voice information collected in a second preset time period and the corresponding text in the sample information exceeds a second threshold matching degree; the first preset time period is shorter than the second preset time period, and the first threshold matching degree is lower than the second threshold matching degree; the calibration display text overwrites the first display text.
5. A display device, characterized by comprising a display, a memory and a processor, wherein:
the processor is configured to execute the intelligent conference display method of claim 4;
the memory is used for storing instructions executable by the processor;
the display is used for displaying the text information processed by the processor.
6. A storage medium, characterized in that the storage medium stores a computer program for executing the intelligent conference display method according to claim 4.
CN202011408172.4A 2020-12-03 2020-12-03 Intelligent conference system based on intelligent screen Active CN112599130B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011408172.4A CN112599130B (en) 2020-12-03 2020-12-03 Intelligent conference system based on intelligent screen

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011408172.4A CN112599130B (en) 2020-12-03 2020-12-03 Intelligent conference system based on intelligent screen

Publications (2)

Publication Number Publication Date
CN112599130A (en) 2021-04-02
CN112599130B (en) 2022-08-19

Family

ID=75188448

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011408172.4A Active CN112599130B (en) 2020-12-03 2020-12-03 Intelligent conference system based on intelligent screen

Country Status (1)

Country Link
CN (1) CN112599130B (en)

Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6243676B1 (en) * 1998-12-23 2001-06-05 Openwave Systems Inc. Searching and retrieving multimedia information
US20090287488A1 (en) * 2006-03-24 2009-11-19 Nec Corporation Text display, text display method, and program
CN105007557A (en) * 2014-04-16 2015-10-28 上海柏润工贸有限公司 Intelligent hearing aid with voice identification and subtitle display functions
CN106487757A (en) * 2015-08-28 2017-03-08 华为技术有限公司 Carry out method, conference client and the system of voice conferencing
CN106331893A (en) * 2016-08-31 2017-01-11 科大讯飞股份有限公司 Real-time subtitle display method and system
CN108366216A (en) * 2018-02-28 2018-08-03 深圳市爱影互联文化传播有限公司 TV news recording, record and transmission method, device and server
CN108566565A (en) * 2018-03-30 2018-09-21 科大讯飞股份有限公司 Barrage methods of exhibiting and device
CN108536654A (en) * 2018-04-13 2018-09-14 科大讯飞股份有限公司 Identify textual presentation method and device
CN110444197A (en) * 2018-05-10 2019-11-12 腾讯科技(北京)有限公司 Data processing method, device, system and storage medium based on simultaneous interpretation
CN109361825A (en) * 2018-11-12 2019-02-19 平安科技(深圳)有限公司 Meeting summary recording method, terminal and computer storage medium
CN110505201A (en) * 2019-07-10 2019-11-26 平安科技(深圳)有限公司 Conferencing information processing method, device, computer equipment and storage medium
CN110784751A (en) * 2019-08-21 2020-02-11 腾讯科技(深圳)有限公司 Information display method and device
CN111798835A (en) * 2020-07-25 2020-10-20 深圳市维度统计咨询股份有限公司 Voice recognition conversion system and method
CN111899742A (en) * 2020-08-06 2020-11-06 广州科天视畅信息科技有限公司 Method and system for improving conference efficiency

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
于琦: "数字会议系统的设计与应用", 《大众科技》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113992926A (en) * 2021-10-19 2022-01-28 北京有竹居网络技术有限公司 Interface display method and device, electronic equipment and storage medium
CN113992926B (en) * 2021-10-19 2023-09-12 北京有竹居网络技术有限公司 Interface display method, device, electronic equipment and storage medium
CN116405836A (en) * 2023-06-08 2023-07-07 安徽声讯信息技术有限公司 Microphone tuning method and system based on Internet
CN116405836B (en) * 2023-06-08 2023-09-08 安徽声讯信息技术有限公司 Microphone tuning method and system based on Internet

Also Published As

Publication number Publication date
CN112599130B (en) 2022-08-19

Similar Documents

Publication Publication Date Title
CN108847214B (en) Voice processing method, client, device, terminal, server and storage medium
CN103327181B (en) Voice chatting method capable of improving efficiency of voice information learning for users
JP6928642B2 (en) Audio broadcasting method and equipment
US8315866B2 (en) Generating representations of group interactions
CN106941619A (en) Program prompting method, device and system based on artificial intelligence
CN111161714B (en) Voice information processing method, electronic equipment and storage medium
CN112599130B (en) Intelligent conference system based on intelligent screen
US20240021202A1 (en) Method and apparatus for recognizing voice, electronic device and medium
CN104078044A (en) Mobile terminal and sound recording search method and device of mobile terminal
US20110112821A1 (en) Method and apparatus for multimodal content translation
CN111919249A (en) Continuous detection of words and related user experience
WO2023029904A1 (en) Text content matching method and apparatus, electronic device, and storage medium
CN112399269B (en) Video segmentation method, device, equipment and storage medium
CN108899036A (en) A kind of processing method and processing device of voice data
CN111479124A (en) Real-time playing method and device
CN110990534A (en) Data processing method and device and data processing device
CN110379406B (en) Voice comment conversion method, system, medium and electronic device
EP2503545A1 (en) Arrangement and method relating to audio recognition
WO2021169825A1 (en) Speech synthesis method and apparatus, device and storage medium
CN113761986A (en) Text acquisition method, text live broadcast equipment and storage medium
CN111354362A (en) Method and device for assisting hearing-impaired communication
CN116629236A (en) Backlog extraction method, device, equipment and storage medium
CN111522992A (en) Method, device and equipment for putting questions into storage and storage medium
CN111339881A (en) Baby growth monitoring method and system based on emotion recognition
CN112669847A (en) Intelligent screen capable of being used for automatic editing and sorting of conference records

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant