CN105827618A - Method for improving speech communication quality of fragment asynchronous conference system - Google Patents
- Publication number
- CN105827618A
- Authority
- CN
- China
- Prior art keywords
- speech
- meeting
- signal
- conference
- cloud server
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M3/00—Automatic or semi-automatic exchanges
- H04M3/42—Systems providing special services or facilities to subscribers
- H04M3/56—Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
- H04M3/568—Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities audio processing specific to telephonic conferencing, e.g. spatial distribution, mixing of participants
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L21/0232—Processing in the frequency domain
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/45—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of analysis window
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/1066—Session management
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/02—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/14—Session management
Landscapes
- Engineering & Computer Science (AREA)
- Signal Processing (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- Computer Networks & Wireless Communication (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Quality & Reliability (AREA)
- General Business, Economics & Management (AREA)
- Business, Economics & Management (AREA)
- Telephonic Communication Services (AREA)
Abstract
The invention discloses a method for improving the speech communication quality of a fragment asynchronous conference system. The method includes the following steps: S1, a conference speaker records voice statement information through a conference terminal; S2, the conference terminal performs de-noising processing on the voice statement information; S3, the conference terminal sends de-noised voice statement information to a conference cloud server; S4, the conference cloud server performs de-noising processing on received voice statement information; and S5, the conference cloud server sends the voice statement information to participants in a pre-created statement content receiving member list. According to the method of the invention, first de-noising processing is performed on voice statement content on the conference terminal used by the conference speaker, and then, second de-noising processing is performed on the voice statement content in the conference cloud server, and therefore, noise signals in the voice statement content can be greatly inhibited, and the speech communication quality can be improved.
Description
Technical field
The present invention relates to the technical field of teleconference speech quality, and in particular to a method for improving the speech communication quality of a fragment asynchronous conference system.
Background
Nowadays, many enterprises and institutions have offices distributed around the world. In day-to-day operation, staff frequently need to sit down together to solve operational problems, but the traditional conference model gathers all participants in one place. For an enterprise with many branches, this model suffers from high cost and poor timeliness, and teleconferencing has emerged as a result.
As the domestic teleconference service market gradually leaves its incubation phase, both teleconferencing and mobile-phone conferencing have begun to grow rapidly. Yet even in such a bustling market, almost all teleconferencing products are confined to traditional synchronous voice conferencing, using either VoIP or SS7 voice technology. They make effective use of neither powerful network capabilities nor fragmented time, so whichever product is used, participants must set aside a large amount of synchronized time to take part in a meeting. Existing teleconferences also frequently suffer from noise, which makes the speech content hard to follow and seriously degrades meeting quality.
Summary of the invention
The object of the invention is to overcome the deficiencies of the prior art by providing a method for improving the speech communication quality of a fragment asynchronous conference system, in which the voice statement content undergoes two successive rounds of noise reduction, improving speech quality.
The object of the invention is achieved through the following technical solution: a method for improving the speech communication quality of a fragment asynchronous conference system, comprising the following steps:
S1. The conference speaker records voice statement information through a conference terminal;
S2. The conference terminal performs noise reduction on the voice statement information;
S3. The conference terminal sends the noise-reduced voice statement information to a conference cloud server;
S4. The conference cloud server performs noise reduction on the received voice statement information;
S5. The conference cloud server sends the voice statement information to each participant in a pre-created statement content receiving member list.
In step S1, the conference speaker records the voice statement information through a noise-reducing microphone on the conference terminal.
Step S2 includes the following sub-steps:
S21. Divide the input voice statement signal into frames and apply a Hamming window;
S22. Convert the time-domain signal to a frequency-domain signal and compute the spectral energy distribution of the signal;
S23. Perform gain oscillation detection on the received signal according to the state of the received signal, and after detection update the noise-floor spectral energy distribution according to the current state;
S24. Use the spectral energy distribution of the received signal and that of the noise floor to compute the spectral posterior signal-to-noise ratio, compute the gain coefficient by MMSE estimation, and use the gain coefficient to suppress the noise;
S25. Use the noise-reduced spectral energy distribution and the noise-floor spectral energy distribution to compute the frame signal-to-noise ratio, and store and update the frame signal-to-noise ratios over a recent period of time;
S26. Perform a multi-state transition of the spectral envelope according to the frame signal-to-noise ratio and the recorded spectral-envelope signal-to-noise ratio information, and decide from the resulting state whether the input signal is speech or noise;
S27. Apply the frequency-to-time transform and windowed overlap-add to the noise-reduced signal, apply speech-onset protection to the output signal, and output either the noise-reduced speech or silence according to the result of silence detection.
A step of creating the statement content receiving member list is also included before step S5.
Step S5 includes the following sub-steps:
S51. The conference cloud server determines the type of each participant in the pre-created statement content receiving member list;
S52. If the participant is a conference terminal member, the conference cloud server sends the voice statement information to that participant's conference terminal;
S53. If the participant is a mobile phone member, the conference cloud server sends the voice statement information to that participant's mobile phone.
The noise reduction applied to the voice statement information is the same in step S2 and in step S4.
The beneficial effect of the invention is as follows: the conference terminal used by the conference speaker performs a first round of noise reduction on the voice statement content, and the conference cloud server then performs a second round, which greatly suppresses the noise in the voice statement content and improves speech quality.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the present invention for improving the speech communication quality of a fragment asynchronous conference system.
Detailed description of the invention
The technical solution of the present invention is described in further detail below with reference to the accompanying drawing, but the scope of protection of the present invention is not limited to the following.
As shown in Fig. 1, the method for improving the speech communication quality of a fragment asynchronous conference system comprises the following steps:
S1. The conference speaker records voice statement information through a conference terminal.
In step S1, the conference speaker records the voice statement information through a noise-reducing microphone on the conference terminal. Recording with a noise-reducing microphone reduces the noise in the voice statement information at the source and thereby improves the speech quality of the conference.
S2. The conference terminal performs noise reduction on the voice statement information.
Step S2 includes the following sub-steps:
S21. Divide the input voice statement signal into frames and apply a Hamming window;
S22. Convert the time-domain signal to a frequency-domain signal and compute the spectral energy distribution of the signal;
S23. Perform gain oscillation detection on the received signal according to the state of the received signal, and after detection update the noise-floor spectral energy distribution according to the current state;
S24. Use the spectral energy distribution of the received signal and that of the noise floor to compute the spectral posterior signal-to-noise ratio, compute the gain coefficient by MMSE estimation, and use the gain coefficient to suppress the noise;
S25. Use the noise-reduced spectral energy distribution and the noise-floor spectral energy distribution to compute the frame signal-to-noise ratio, and store and update the frame signal-to-noise ratios over a recent period of time;
S26. Perform a multi-state transition of the spectral envelope according to the frame signal-to-noise ratio and the recorded spectral-envelope signal-to-noise ratio information, and decide from the resulting state whether the input signal is speech or noise;
S27. Apply the frequency-to-time transform and windowed overlap-add to the noise-reduced signal, apply speech-onset protection to the output signal, and output either the noise-reduced speech or silence according to the result of silence detection.
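The noise-floor update of S23 and the gain computation of S24 can be illustrated with a minimal sketch. The description names an MMSE estimator but gives no formula, so the sketch below substitutes a simpler Wiener-style gain derived from the posterior signal-to-noise ratio; it is not the estimator of the invention. The function names, the smoothing factor `alpha`, and the `gain_floor` value are assumptions made for illustration.

```python
def update_noise_floor(noise_psd, frame_psd, is_speech, alpha=0.9):
    """Smooth the noise-floor power spectrum during non-speech frames.

    alpha is an assumed smoothing factor, not a value from the description.
    """
    if is_speech:
        return list(noise_psd)  # freeze the estimate while speech is present
    return [alpha * n + (1.0 - alpha) * p for n, p in zip(noise_psd, frame_psd)]


def suppression_gain(frame_psd, noise_psd, gain_floor=0.1, eps=1e-12):
    """Per-bin Wiener-style gain (a stand-in for the MMSE gain of S24)."""
    gains = []
    for p, n in zip(frame_psd, noise_psd):
        posterior = p / (n + eps)              # posterior SNR per bin
        prior = max(posterior - 1.0, 0.0)      # maximum-likelihood a priori SNR
        gains.append(max(prior / (1.0 + prior), gain_floor))
    return gains


# A bin 10x above the noise floor is mostly kept; a bin at the floor is attenuated.
gains = suppression_gain([10.0, 1.0], [1.0, 1.0])
```

Each gain multiplies the corresponding spectrum bin before the frequency-to-time transform of S27. The gain floor limits how far any bin is attenuated, a common way to reduce musical-noise artifacts.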
In step S21, the input voice statement signal is divided into frames of 128 to 512 samples each. Each new frame advances by half a frame length, so consecutive frames overlap by 50%. Every frame is multiplied by a Hamming window whose length equals the frame length.
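The framing described for S21 can be sketched as follows, using only the Python standard library. The function names and the default frame length of 256 are assumptions; the description only bounds the frame length to 128-512 samples.

```python
import math


def hamming(n):
    """Symmetric Hamming window of length n."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def frame_signal(samples, frame_len=256):
    """Split samples into half-overlapping frames and window each frame."""
    if not 128 <= frame_len <= 512:
        raise ValueError("the description gives a frame length of 128-512 samples")
    hop = frame_len // 2          # each frame brings in half a frame of new samples
    window = hamming(frame_len)   # window length equals frame length
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        chunk = samples[start:start + frame_len]
        frames.append([s * w for s, w in zip(chunk, window)])
    return frames


# 1024 samples with 256-sample frames and a 128-sample hop give 7 frames.
frames = frame_signal([1.0] * 1024, frame_len=256)
```

Trailing samples that do not fill a whole frame are dropped in this sketch; a real implementation would more likely buffer them until the next block arrives.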
In step S22, the received time-domain signal is converted to a frequency-domain signal by a fast Fourier transform. In line with the characteristics of human speech production, the spectral energy below 300 Hz and above 3400 Hz is set to zero.
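A minimal sketch of S22 under stated assumptions: the description does not give a sampling rate, so 8000 Hz is assumed here, and the frame length is assumed to be a power of two (128, 256, or 512) so that a simple radix-2 FFT applies. The function names are illustrative.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def band_limited_power(frame, sample_rate=8000, low_hz=300.0, high_hz=3400.0):
    """Power spectrum of one frame with bins outside 300-3400 Hz set to zero."""
    n = len(frame)
    if n < 2 or n & (n - 1):
        raise ValueError("frame length must be a power of two")
    spectrum = fft(frame)
    power = []
    for k in range(n // 2 + 1):           # non-negative frequencies only
        freq = k * sample_rate / n
        in_band = low_hz <= freq <= high_hz
        power.append(abs(spectrum[k]) ** 2 if in_band else 0.0)
    return power


# A 1000 Hz tone sampled at the assumed 8000 Hz falls exactly on bin 32 of 256.
tone = [math.sin(2 * math.pi * 1000 * i / 8000) for i in range(256)]
power = band_limited_power(tone)
```

With these assumed values each bin spans 31.25 Hz, so bins 0 through 9 and bins 109 through 128 are zeroed.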
S3. The conference terminal sends the noise-reduced voice statement information to the conference cloud server.
S4. The conference cloud server performs noise reduction on the received voice statement information.
The noise reduction applied to the voice statement information is the same in step S2 and in step S4.
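Because S2 and S4 use the same noise reduction, the S2 to S4 path can be sketched as one routine applied twice around a transmission step. The names `two_stage_denoise`, `toy_denoise`, and `send` are hypothetical, and the toy denoiser is only a placeholder for the S21-S27 chain.

```python
from typing import Callable, List

Samples = List[float]


def two_stage_denoise(
    recording: Samples,
    denoise: Callable[[Samples], Samples],
    send: Callable[[Samples], Samples],
) -> Samples:
    """Run the same denoiser on the terminal (S2) and on the server (S4)."""
    terminal_output = denoise(recording)   # S2: first pass on the conference terminal
    received = send(terminal_output)       # S3: upload to the conference cloud server
    return denoise(received)               # S4: second pass on the server


def toy_denoise(samples: Samples) -> Samples:
    """Placeholder denoiser: halve low-level samples, keep the rest unchanged."""
    return [s if abs(s) >= 0.1 else s * 0.5 for s in samples]


# The low-level sample is attenuated once per pass; the loud sample is untouched.
result = two_stage_denoise([0.5, 0.04], toy_denoise, send=lambda s: list(s))
```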
S5. The conference cloud server sends the voice statement information to each participant in a pre-created statement content receiving member list.
A step of creating the statement content receiving member list is also included before step S5. Creating the list comprises the following sub-steps: first, the conference speaker edits a roster of statement content recipients on the conference terminal; second, the conference terminal encrypts the roster and sends the encrypted roster to the conference cloud server; third, the conference cloud server decrypts the roster and creates the statement content receiving member list from it.
Step S5 includes the following sub-steps:
S51. The conference cloud server determines the type of each participant in the pre-created statement content receiving member list;
S52. If the participant is a conference terminal member, the conference cloud server sends the voice statement information to that participant's conference terminal;
S53. If the participant is a mobile phone member, the conference cloud server sends the voice statement information to that participant's mobile phone.
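The type-based delivery of S51-S53 can be sketched as a simple dispatch loop. The `Member` record, the `MemberType` enum, the `address` field, and the two sender callbacks are hypothetical names introduced for illustration; the description does not specify a data model or a transport.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Iterable


class MemberType(Enum):
    CONFERENCE_TERMINAL = "conference_terminal"
    MOBILE_PHONE = "mobile_phone"


@dataclass
class Member:
    name: str
    kind: MemberType
    address: str  # hypothetical delivery address of the terminal or phone


def dispatch(
    statement: bytes,
    members: Iterable[Member],
    send_to_terminal: Callable[[str, bytes], None],
    send_to_phone: Callable[[str, bytes], None],
) -> None:
    """Deliver one voice statement to every member according to its type."""
    for member in members:                                   # S51: inspect each member
        if member.kind is MemberType.CONFERENCE_TERMINAL:
            send_to_terminal(member.address, statement)      # S52
        elif member.kind is MemberType.MOBILE_PHONE:
            send_to_phone(member.address, statement)         # S53
        else:
            raise ValueError(f"unknown member type: {member.kind}")


# Record deliveries in a list instead of sending anything over a network.
log = []
members = [
    Member("A", MemberType.CONFERENCE_TERMINAL, "terminal-1"),
    Member("B", MemberType.MOBILE_PHONE, "phone-1"),
]
dispatch(
    b"audio",
    members,
    send_to_terminal=lambda addr, data: log.append(("terminal", addr)),
    send_to_phone=lambda addr, data: log.append(("phone", addr)),
)
```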
The above is only a preferred embodiment of the present invention. It should be understood that the invention is not limited to the form disclosed herein, which should not be regarded as excluding other embodiments; the invention may be used in various other combinations, modifications, and environments, and may be modified within the scope contemplated herein through the above teachings or the skill or knowledge of the related art. Changes and variations made by those skilled in the art that do not depart from the spirit and scope of the invention shall all fall within the scope of protection of the appended claims.
Claims (6)
1. A method for improving the speech communication quality of a fragment asynchronous conference system, characterized in that it comprises the following steps:
S1. The conference speaker records voice statement information through a conference terminal;
S2. The conference terminal performs noise reduction on the voice statement information;
S3. The conference terminal sends the noise-reduced voice statement information to a conference cloud server;
S4. The conference cloud server performs noise reduction on the received voice statement information;
S5. The conference cloud server sends the voice statement information to each participant in a pre-created statement content receiving member list.
2. The method for improving the speech communication quality of a fragment asynchronous conference system according to claim 1, characterized in that: in step S1, the conference speaker records the voice statement information through a noise-reducing microphone on the conference terminal.
3. The method for improving the speech communication quality of a fragment asynchronous conference system according to claim 1, characterized in that step S2 includes the following sub-steps:
S21. Divide the input voice statement signal into frames and apply a Hamming window;
S22. Convert the time-domain signal to a frequency-domain signal and compute the spectral energy distribution of the signal;
S23. Perform gain oscillation detection on the received signal according to the state of the received signal, and after detection update the noise-floor spectral energy distribution according to the current state;
S24. Use the spectral energy distribution of the received signal and that of the noise floor to compute the spectral posterior signal-to-noise ratio, compute the gain coefficient by MMSE estimation, and use the gain coefficient to suppress the noise;
S25. Use the noise-reduced spectral energy distribution and the noise-floor spectral energy distribution to compute the frame signal-to-noise ratio, and store and update the frame signal-to-noise ratios over a recent period of time;
S26. Perform a multi-state transition of the spectral envelope according to the frame signal-to-noise ratio and the recorded spectral-envelope signal-to-noise ratio information, and decide from the resulting state whether the input signal is speech or noise;
S27. Apply the frequency-to-time transform and windowed overlap-add to the noise-reduced signal, apply speech-onset protection to the output signal, and output either the noise-reduced speech or silence according to the result of silence detection.
4. The method for improving the speech communication quality of a fragment asynchronous conference system according to claim 1, characterized in that a step of creating the statement content receiving member list is also included before step S5.
5. The method for improving the speech communication quality of a fragment asynchronous conference system according to claim 1, characterized in that step S5 includes the following sub-steps:
S51. The conference cloud server determines the type of each participant in the pre-created statement content receiving member list;
S52. If the participant is a conference terminal member, the conference cloud server sends the voice statement information to that participant's conference terminal;
S53. If the participant is a mobile phone member, the conference cloud server sends the voice statement information to that participant's mobile phone.
6. The method for improving the speech communication quality of a fragment asynchronous conference system according to claim 1, characterized in that the noise reduction applied to the voice statement information is the same in step S2 and in step S4.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610258803.6A CN105827618A (en) | 2016-04-25 | 2016-04-25 | Method for improving speech communication quality of fragment asynchronous conference system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610258803.6A CN105827618A (en) | 2016-04-25 | 2016-04-25 | Method for improving speech communication quality of fragment asynchronous conference system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105827618A (en) | 2016-08-03 |
Family
ID=56526529
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610258803.6A (Pending) CN105827618A (en) | 2016-04-25 | 2016-04-25 | Method for improving speech communication quality of fragment asynchronous conference system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105827618A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111968638A (en) * | 2020-08-14 | 2020-11-20 | 上海茂声智能科技有限公司 | Method, system, equipment and storage medium for voice control display terminal |
CN115550595A (en) * | 2021-06-30 | 2022-12-30 | Oppo广东移动通信有限公司 | Online conference implementation method, device, equipment and readable storage medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103632676A (en) * | 2013-11-12 | 2014-03-12 | 广州海格通信集团股份有限公司 | Low SNR (signal to noise ratio) speech noise reduction method |
CN103632681A (en) * | 2013-11-12 | 2014-03-12 | 广州海格通信集团股份有限公司 | Spectral envelope silence detection method |
US20140105407A1 (en) * | 2012-10-11 | 2014-04-17 | International Business Machines Corporation | Reducing noise in a shared media sesssion |
CN104200811A (en) * | 2014-08-08 | 2014-12-10 | 华迪计算机集团有限公司 | Self-adaption spectral subtraction and noise elimination processing method and device for voice signals |
CN104427068A (en) * | 2013-09-06 | 2015-03-18 | 中兴通讯股份有限公司 | Voice communication method and device |
CN104579710A (en) * | 2015-01-16 | 2015-04-29 | 四川联友电讯技术有限公司 | Method for conference member to issue voice information in fragmentation asynchronous conference system |
Similar Documents
Publication | Title |
---|---|
US9640194B1 (en) | Noise suppression for speech processing based on machine-learning mask estimation |
Li et al. | ICASSP 2021 deep noise suppression challenge: Decoupling magnitude and phase optimization with a two-stage deep network |
CN103124165B (en) | Automatic gain control |
US10832696B2 (en) | Speech signal cascade processing method, terminal, and computer-readable storage medium |
US10237412B2 (en) | System and method for audio conferencing |
US9979769B2 (en) | System and method for audio conferencing |
CN108447496B (en) | Speech enhancement method and device based on microphone array |
CN106887239A (en) | Enhanced blind source separation algorithm for highly correlated mixtures |
CN107371079B (en) | Dual-microphone noise reduction system and noise reduction method for an earphone |
CN101207663A (en) | Internet communication device and method for controlling noise thereof |
CN105304079A (en) | Multi-party call multi-mode speech synthesis method and system |
US10504538B2 (en) | Noise reduction by application of two thresholds in each frequency band in audio signals |
CN109040501A (en) | Echo cancellation method for improving VoIP call quality |
US10540983B2 (en) | Detecting and reducing feedback |
CN105793922A (en) | Multi-path audio processing |
US10192566B1 (en) | Noise reduction in an audio system |
US20110137644A1 (en) | Decoding speech signals |
CN105827618A (en) | Method for improving speech communication quality of fragment asynchronous conference system |
CN104618616B (en) | Teleconference participant identification system and method based on speech feature extraction |
CN106328160B (en) | Noise reduction method based on double microphones |
EP3414889B1 (en) | Bi-magnitude processing framework for nonlinear echo cancellation in mobile devices |
WO2022142984A1 (en) | Voice processing method, apparatus and system, smart terminal and electronic device |
WO2022166738A1 (en) | Speech enhancement method and apparatus, and device and storage medium |
US20210389925A1 (en) | Volume adjustments |
US9258428B2 (en) | Audio bandwidth extension for conferencing |
Legal Events
Code | Title | Description |
---|---|---|
C06 | Publication | |
PB01 | Publication | |
C10 | Entry into substantive examination | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20160803 |