CN105827618A - Method for improving speech communication quality of fragment asynchronous conference system - Google Patents

Publication number
CN105827618A
Authority
CN
China
Prior art keywords
speech
meeting
signal
conference
cloud server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610258803.6A
Other languages
Chinese (zh)
Inventor
王学宗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan Lianyou Telecom Technology Co Ltd
Original Assignee
Sichuan Lianyou Telecom Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan Lianyou Telecom Technology Co Ltd
Priority to CN201610258803.6A
Publication of CN105827618A
Current legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M3/00 Automatic or semi-automatic exchanges
    • H04M3/42 Systems providing special services or facilities to subscribers
    • H04M3/56 Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities
    • H04M3/568 Arrangements for connecting several subscribers to a common circuit, i.e. affording conference facilities; audio processing specific to telephonic conferencing, e.g. spatial distribution, mixing of participants
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L21/0232 Processing in the frequency domain
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/45 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00, characterised by the type of analysis window
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/1066 Session management
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/14 Session management

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Quality & Reliability (AREA)
  • General Business, Economics & Management (AREA)
  • Business, Economics & Management (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention discloses a method for improving the speech communication quality of a fragment asynchronous conference system. The method includes the following steps: S1, a conference speaker records voice statement information through a conference terminal; S2, the conference terminal performs de-noising processing on the voice statement information; S3, the conference terminal sends the de-noised voice statement information to a conference cloud server; S4, the conference cloud server performs de-noising processing on the received voice statement information; and S5, the conference cloud server sends the voice statement information to the participants in a pre-created statement content receiving member list. In this method, a first de-noising pass is performed on the voice statement content at the conference terminal used by the conference speaker, and a second de-noising pass is then performed at the conference cloud server, so that noise in the voice statement content is greatly suppressed and speech communication quality is improved.

Description

Method for improving the speech quality of a fragmented asynchronous conference system
Technical field
The present invention relates to the technical field of teleconference speech quality, and in particular to a method for improving the speech quality of a fragmented asynchronous conference system.
Background art
Today, many enterprises and institutions have offices distributed around the world. In day-to-day operations, staff often need to meet to resolve operational problems, but the traditional conference model requires all participants to gather in one place. For enterprises with many branches, this model suffers from high cost and poor timeliness, and teleconferencing has emerged as a result.
As the domestic teleconference service market moves past its incubation stage, both teleconferencing and mobile phone conferencing have grown rapidly. Even in this crowded market, however, almost all teleconference products are confined to traditional synchronous voice conferencing, whether based on VoIP or on SS7 voice technology. They make effective use of neither powerful network capabilities nor fragmented time: whichever product is used, participants must set aside a large block of synchronized time to attend. In addition, existing teleconferences often suffer from noise, which makes the speech content hard to make out and seriously degrades meeting quality.
Summary of the invention
The object of the present invention is to overcome the deficiencies of the prior art by providing a method for improving the speech quality of a fragmented asynchronous conference system, in which the voice statement content undergoes noise reduction twice in succession, improving speech quality.
This object is achieved through the following technical solution: a method for improving the speech quality of a fragmented asynchronous conference system, comprising the following steps:
S1. The conference speaker records voice statement information through a conference terminal;
S2. The conference terminal performs noise reduction on the voice statement information;
S3. The conference terminal sends the noise-reduced voice statement information to the conference cloud server;
S4. The conference cloud server performs noise reduction on the received voice statement information;
S5. The conference cloud server sends the voice statement information to each participant in the pre-created statement-content receiving member list.
In step S1, the conference speaker records the voice statement information through a noise-reduction microphone on the conference terminal.
Step S2 comprises the following sub-steps:
S21. Divide the input voice statement signal into frames and apply a Hamming window;
S22. Convert the time-domain signal to a frequency-domain signal and compute the spectral power distribution of the signal;
S23. Based on the state of the received signal, perform gain oscillation detection on the received signal; after detection, update the noise-floor spectral power distribution according to the current state;
S24. Use the spectral power distribution of the received signal and that of the noise floor to compute the a posteriori signal-to-noise ratio (SNR) of the spectral distribution, compute a general gain coefficient by the MMSE estimation method, and use the gain coefficient to suppress noise;
S25. Use the noise-reduced spectral power distribution and the noise-floor spectral power distribution to compute the frame SNR, and store and update the frame SNRs for a recent period of time;
S26. Based on the frame SNR and the recorded spectral-envelope SNR information, perform spectral-envelope multi-state transitions, and judge from the state output of the transitions whether the input signal is speech or noise;
S27. Apply the frequency-to-time transform and window overlap-add to the noise-reduced signal, apply speech-onset protection to the output signal, and output noise-reduced speech or silence according to the result of silence detection.
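The gain computation in S24 can be sketched as follows. The text names an MMSE estimator but gives no formula, so this sketch deliberately substitutes a simpler Wiener-style gain as a stand-in; the function names, the a priori SNR approximation, and the gain_floor parameter are all assumptions made for illustration, not part of the patent.

```python
def posterior_snr(signal_power, noise_power, eps=1e-12):
    """Per-bin a posteriori SNR: received power divided by noise-floor power."""
    return [s / max(n, eps) for s, n in zip(signal_power, noise_power)]


def suppression_gain(post_snr, gain_floor=0.1):
    """Wiener-style gain per bin (a stand-in, not the MMSE estimator itself).

    The a priori SNR is approximated as max(post_snr - 1, 0) and the gain is
    xi / (1 + xi), clamped from below by gain_floor (a hypothetical parameter).
    """
    gains = []
    for gamma in post_snr:
        xi = max(gamma - 1.0, 0.0)
        gains.append(max(xi / (1.0 + xi), gain_floor))
    return gains


def suppress(signal_power, noise_power):
    """Scale each bin's power by the squared gain to suppress noise."""
    gains = suppression_gain(posterior_snr(signal_power, noise_power))
    return [g * g * s for g, s in zip(gains, signal_power)]
```

Bins whose power is close to the noise floor receive the floor gain, while bins well above the floor pass almost unchanged. A real MMSE gain would replace suppression_gain without changing the surrounding flow.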
A step of creating the statement-content receiving member list is also included before step S5.
Step S5 comprises the following sub-steps:
S51. The conference cloud server determines the type of each participant in the pre-created statement-content receiving member list;
S52. If the participant is a conference terminal member, the conference cloud server sends the voice statement information to that participant's conference terminal;
S53. If the participant is a mobile phone member, the conference cloud server sends the voice statement information to that participant's mobile phone.
Steps S2 and S4 perform noise reduction on the voice statement information in the same way.
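Because S2 and S4 use the same noise reduction, the overall flow amounts to applying one routine twice, once on each side of the upload. The following is a minimal sketch under that reading; the toy denoiser and all names are hypothetical and stand in for the S21 to S27 procedure.

```python
from typing import Callable, List

Samples = List[float]


def two_stage_denoise(recorded: Samples,
                      denoise: Callable[[Samples], Samples]) -> Samples:
    """Run the same denoise routine at the terminal (S2) and at the server (S4)."""
    terminal_output = denoise(recorded)   # S2: on the conference terminal
    received = list(terminal_output)      # S3: stands in for the upload
    return denoise(received)              # S4: on the conference cloud server


def halve_small_samples(samples: Samples, threshold: float = 0.1) -> Samples:
    """Toy denoiser for illustration only: attenuate samples below threshold."""
    return [x * 0.5 if abs(x) < threshold else x for x in samples]
```

Passing the routine in as a parameter keeps the terminal and server stages identical by construction, which is the property the text requires.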
The beneficial effects of the invention are as follows: the conference terminal used by the conference speaker performs a first noise reduction on the voice statement content, and the conference cloud server then performs a second noise reduction on it, which greatly suppresses the noise in the voice statement content and improves speech quality.
Brief description of the drawings
Fig. 1 is a flow chart of the method of the present invention for improving the speech quality of a fragmented asynchronous conference system.
Detailed description of the embodiments
The technical solution of the present invention is described in further detail below with reference to the accompanying drawing, but the scope of protection of the present invention is not limited to the following.
As shown in Fig. 1, the method for improving the speech quality of a fragmented asynchronous conference system comprises the following steps:
S1. The conference speaker records voice statement information through a conference terminal.
In step S1, the conference speaker records the voice statement information through a noise-reduction microphone on the conference terminal. Recording with a noise-reduction microphone reduces noise in the voice statement information at the source and thereby improves the speech quality of the conference.
S2. The conference terminal performs noise reduction on the voice statement information.
Step S2 comprises the following sub-steps:
S21. Divide the input voice statement signal into frames and apply a Hamming window;
S22. Convert the time-domain signal to a frequency-domain signal and compute the spectral power distribution of the signal;
S23. Based on the state of the received signal, perform gain oscillation detection on the received signal; after detection, update the noise-floor spectral power distribution according to the current state;
S24. Use the spectral power distribution of the received signal and that of the noise floor to compute the a posteriori signal-to-noise ratio (SNR) of the spectral distribution, compute a general gain coefficient by the MMSE estimation method, and use the gain coefficient to suppress noise;
S25. Use the noise-reduced spectral power distribution and the noise-floor spectral power distribution to compute the frame SNR, and store and update the frame SNRs for a recent period of time;
S26. Based on the frame SNR and the recorded spectral-envelope SNR information, perform spectral-envelope multi-state transitions, and judge from the state output of the transitions whether the input signal is speech or noise;
S27. Apply the frequency-to-time transform and window overlap-add to the noise-reduced signal, apply speech-onset protection to the output signal, and output noise-reduced speech or silence according to the result of silence detection.
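The bookkeeping in S25 can be illustrated with a bounded history of frame SNRs. This is a sketch only: the text does not say how the frame SNR is defined or how long the recent period is, so the ratio of summed powers expressed in decibels and the max_frames value are assumptions.

```python
import math
from collections import deque


def frame_snr_db(denoised_power, noise_power, eps=1e-12):
    """Frame SNR in dB: total denoised power over total noise-floor power."""
    signal_total = max(sum(denoised_power), eps)
    noise_total = max(sum(noise_power), eps)
    return 10.0 * math.log10(signal_total / noise_total)


class FrameSnrHistory:
    """Keeps the frame SNRs of the most recent max_frames frames."""

    def __init__(self, max_frames=50):  # hypothetical history length
        self._values = deque(maxlen=max_frames)

    def update(self, snr_db):
        # When the deque is full, appending drops the oldest entry.
        self._values.append(snr_db)

    def values(self):
        return list(self._values)
```

A later stage such as the state transitions in S26 could read values() to decide whether recent frames look like speech or noise.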
In step S21, the input voice statement signal is divided into frames of 128 to 512 samples each. Each update advances by half the frame length, so consecutive frames overlap by half. Every frame is multiplied by a Hamming window whose length equals the frame length.
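The framing described for S21 can be sketched in a few lines. The frame length of 256 is one value picked from the stated 128 to 512 range, and the symmetric form of the Hamming window is an assumption, since the text does not say which variant is used.

```python
import math


def hamming(n):
    """Symmetric Hamming window of length n (n must be at least 2)."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1))
            for i in range(n)]


def windowed_frames(samples, frame_len=256):
    """Split samples into half-overlapping frames and window each frame."""
    hop = frame_len // 2         # each update advances by half a frame
    window = hamming(frame_len)  # window length equals frame length
    frames = []
    # Trailing samples that do not fill a whole frame are dropped.
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = samples[start:start + frame_len]
        frames.append([x * w for x, w in zip(frame, window)])
    return frames
```

With 512 input samples and a 256-sample frame, this yields three frames starting at samples 0, 128, and 256.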
In step S22, the received time-domain signal is converted to a frequency-domain signal by the fast Fourier transform (FFT). In line with the characteristics of human speech production, spectral energy below 300 Hz and above 3400 Hz is set to zero.
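The transform and band limiting in S22 can be sketched as follows, using a small recursive FFT so that the example needs only the standard library. The 8000 Hz sample rate is an assumption (the text gives none), and the function names are hypothetical.

```python
import cmath


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def band_limited_power(frame, sample_rate=8000, low_hz=300.0, high_hz=3400.0):
    """Power for bins 0..n/2, with bins outside [low_hz, high_hz] set to zero."""
    n = len(frame)
    spectrum = fft(frame)
    power = []
    for k in range(n // 2 + 1):
        freq = k * sample_rate / n  # centre frequency of bin k
        in_band = low_hz <= freq <= high_hz
        power.append(abs(spectrum[k]) ** 2 if in_band else 0.0)
    return power
```

For a 256-sample frame at the assumed sample rate, the bins are 31.25 Hz apart, so a 125 Hz component is zeroed while a 1000 Hz component is kept.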
S3. The conference terminal sends the noise-reduced voice statement information to the conference cloud server.
S4. The conference cloud server performs noise reduction on the received voice statement information.
Steps S2 and S4 perform noise reduction on the voice statement information in the same way.
S5. The conference cloud server sends the voice statement information to each participant in the pre-created statement-content receiving member list.
A step of creating the statement-content receiving member list is also included before step S5. Creating the list comprises the following sub-steps: first, the conference speaker edits a statement-content receiving member roster on the conference terminal; second, the conference terminal encrypts the roster and sends the encrypted roster to the conference cloud server; third, the conference cloud server decrypts the roster and creates the statement-content receiving member list from it.
Step S5 comprises the following sub-steps:
S51. The conference cloud server determines the type of each participant in the pre-created statement-content receiving member list;
S52. If the participant is a conference terminal member, the conference cloud server sends the voice statement information to that participant's conference terminal;
S53. If the participant is a mobile phone member, the conference cloud server sends the voice statement information to that participant's mobile phone.
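Sub-steps S51 to S53 amount to a dispatch on member type. A minimal sketch follows, assuming a hypothetical Member record and two caller-supplied send functions; the text does not describe the actual delivery mechanism, so none is modeled here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class MemberType(Enum):
    CONFERENCE_TERMINAL = "conference_terminal"
    MOBILE_PHONE = "mobile_phone"


@dataclass
class Member:
    name: str
    member_type: MemberType
    address: str  # hypothetical: a terminal ID or a phone number


def dispatch(audio: bytes,
             receivers: List[Member],
             send_to_terminal: Callable[[str, bytes], None],
             send_to_phone: Callable[[str, bytes], None]) -> None:
    """Check each member's type (S51) and send accordingly (S52, S53)."""
    for member in receivers:
        if member.member_type is MemberType.CONFERENCE_TERMINAL:
            send_to_terminal(member.address, audio)  # S52
        elif member.member_type is MemberType.MOBILE_PHONE:
            send_to_phone(member.address, audio)     # S53
        else:
            raise ValueError(f"unknown member type: {member.member_type}")
```

Keeping the send functions as parameters leaves the transport open, which matches the level of detail the text provides.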
The above is only a preferred embodiment of the present invention. It should be understood that the invention is not limited to the form disclosed herein, that this description should not be regarded as excluding other embodiments, and that the invention can be used in various other combinations, modifications, and environments and can be modified within the scope of the concept described herein through the above teachings or the skill or knowledge of the relevant art. Changes and variations made by those skilled in the art that do not depart from the spirit and scope of the present invention shall all fall within the scope of protection of the appended claims.

Claims (6)

1. A method for improving the speech quality of a fragmented asynchronous conference system, characterized in that it comprises the following steps:
S1. The conference speaker records voice statement information through a conference terminal;
S2. The conference terminal performs noise reduction on the voice statement information;
S3. The conference terminal sends the noise-reduced voice statement information to the conference cloud server;
S4. The conference cloud server performs noise reduction on the received voice statement information;
S5. The conference cloud server sends the voice statement information to each participant in the pre-created statement-content receiving member list.
2. The method for improving the speech quality of a fragmented asynchronous conference system according to claim 1, characterized in that: in step S1, the conference speaker records the voice statement information through a noise-reduction microphone on the conference terminal.
3. The method for improving the speech quality of a fragmented asynchronous conference system according to claim 1, characterized in that step S2 comprises the following sub-steps:
S21. Divide the input voice statement signal into frames and apply a Hamming window;
S22. Convert the time-domain signal to a frequency-domain signal and compute the spectral power distribution of the signal;
S23. Based on the state of the received signal, perform gain oscillation detection on the received signal; after detection, update the noise-floor spectral power distribution according to the current state;
S24. Use the spectral power distribution of the received signal and that of the noise floor to compute the a posteriori signal-to-noise ratio (SNR) of the spectral distribution, compute a general gain coefficient by the MMSE estimation method, and use the gain coefficient to suppress noise;
S25. Use the noise-reduced spectral power distribution and the noise-floor spectral power distribution to compute the frame SNR, and store and update the frame SNRs for a recent period of time;
S26. Based on the frame SNR and the recorded spectral-envelope SNR information, perform spectral-envelope multi-state transitions, and judge from the state output of the transitions whether the input signal is speech or noise;
S27. Apply the frequency-to-time transform and window overlap-add to the noise-reduced signal, apply speech-onset protection to the output signal, and output noise-reduced speech or silence according to the result of silence detection.
4. The method for improving the speech quality of a fragmented asynchronous conference system according to claim 1, characterized in that a step of creating the statement-content receiving member list is also included before step S5.
5. The method for improving the speech quality of a fragmented asynchronous conference system according to claim 1, characterized in that step S5 comprises the following sub-steps:
S51. The conference cloud server determines the type of each participant in the pre-created statement-content receiving member list;
S52. If the participant is a conference terminal member, the conference cloud server sends the voice statement information to that participant's conference terminal;
S53. If the participant is a mobile phone member, the conference cloud server sends the voice statement information to that participant's mobile phone.
6. The method for improving the speech quality of a fragmented asynchronous conference system according to claim 1, characterized in that steps S2 and S4 perform noise reduction on the voice statement information in the same way.
CN201610258803.6A 2016-04-25 2016-04-25 Method for improving speech communication quality of fragment asynchronous conference system Pending CN105827618A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610258803.6A CN105827618A (en) 2016-04-25 2016-04-25 Method for improving speech communication quality of fragment asynchronous conference system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610258803.6A CN105827618A (en) 2016-04-25 2016-04-25 Method for improving speech communication quality of fragment asynchronous conference system

Publications (1)

Publication Number Publication Date
CN105827618A true CN105827618A (en) 2016-08-03

Family

ID=56526529

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610258803.6A Pending CN105827618A (en) 2016-04-25 2016-04-25 Method for improving speech communication quality of fragment asynchronous conference system

Country Status (1)

Country Link
CN (1) CN105827618A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111968638A (en) * 2020-08-14 2020-11-20 上海茂声智能科技有限公司 Method, system, equipment and storage medium for voice control display terminal
CN115550595A (en) * 2021-06-30 2022-12-30 Oppo广东移动通信有限公司 Online conference implementation method, device, equipment and readable storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103632676A (en) * 2013-11-12 2014-03-12 广州海格通信集团股份有限公司 Low SNR (signal to noise ratio) speech noise reduction method
CN103632681A (en) * 2013-11-12 2014-03-12 广州海格通信集团股份有限公司 Spectral envelope silence detection method
US20140105407A1 (en) * 2012-10-11 2014-04-17 International Business Machines Corporation Reducing noise in a shared media sesssion
CN104200811A (en) * 2014-08-08 2014-12-10 华迪计算机集团有限公司 Self-adaption spectral subtraction and noise elimination processing method and device for voice signals
CN104427068A (en) * 2013-09-06 2015-03-18 中兴通讯股份有限公司 Voice communication method and device
CN104579710A (en) * 2015-01-16 2015-04-29 四川联友电讯技术有限公司 Method for conference member to issue voice information in fragmentation asynchronous conference system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20160803

RJ01 Rejection of invention patent application after publication