CN110010130A - An intelligent method for synchronous speech-to-text transcription for meeting participants - Google Patents

An intelligent method for synchronous speech-to-text transcription for meeting participants

Info

Publication number
CN110010130A
Authority
CN
China
Prior art keywords
participant
control centre
spectrogram
information
voice
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910263845.2A
Other languages
Chinese (zh)
Inventor
汪丹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Anhui Broad Sound Technology Co Ltd
Original Assignee
Anhui Broad Sound Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Anhui Broad Sound Technology Co Ltd
Priority to CN201910263845.2A
Publication of CN110010130A
Legal status: Pending

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/20 Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction

Abstract

The invention discloses an intelligent method for synchronous speech-to-text transcription for meeting participants, comprising the following steps: each participant registers by speaking their position and name into a sign-in microphone connected to the control centre; the control centre stores the recorded voice information in order of input and preprocesses it; the preprocessed voice information is converted into a spectrogram and text, and both are stored in that participant's information bank; the control centre divides the spectrogram into several groups of frame spectra; the control centre extracts features from the frame spectra; the participant's spectral energy difference DelNd is stored; the speaker's identity is recognized; the spectral energy difference DelNf is calculated and compared; and a meeting minutes document is formed. The invention can record the entire meeting and can identify each speaker at the meeting.

Description

An intelligent method for synchronous speech-to-text transcription for meeting participants
Technical field
The invention belongs to the field of intelligent speech technology, and specifically relates to an intelligent method for synchronous speech-to-text transcription for meeting participants.
Background
For some important meetings, the full content of the meeting needs to be recorded. Manual note-taking is labor-intensive, and existing techniques for automatically recording meeting content usually convert the participants' speech signals directly into text for storage.
In the course of realizing the invention, the inventor found at least the following problem in the related art: minutes formed by converting participants' speech signals directly into text are lengthy, and it is difficult to tell which content was said by which speaker.
Summary of the invention
The object of the invention is to overcome the above deficiencies of the prior art by providing an intelligent method for synchronous speech-to-text transcription for meeting participants.
An intelligent method for synchronous speech-to-text transcription for meeting participants, characterized by comprising the following steps:
1) each participant registers by speaking their position and name into a sign-in microphone connected to the control centre; the control centre stores the recorded voice information in order of input and preprocesses it;
2) the preprocessed voice information is converted into a spectrogram and text, and the spectrogram and text are stored in that participant's information bank;
3) the control centre divides each participant's spectrogram into frames at a fixed time interval, splitting the spectrogram into several groups of frame spectra;
4) the control centre extracts features from the frame spectra; the extracted features include the frame spectral centroid Ci (i = 1, 2, ..., n), the spectral energy value Ni (i = 1, 2, ..., n), and the spectral energy difference DelNd;
5) the participant's spectral energy difference DelNd is stored in that participant's information bank;
6) when a participant speaks, the speech is input to the control centre through the microphone at the participant's seat; the control centre preprocesses the speech, converts the preprocessed speech into a speech spectrogram and speech text, and stores the speech text in the meeting minutes document;
7) the control centre calculates the spectral energy difference DelNf as in steps 3) and 4);
8) the control centre compares DelNf with the DelNd values stored in the participants' information banks; when the difference between DelNf and a stored DelNd is less than a set value, the speaker's identity is determined, and the speaker's name is added in front of the text in the minutes document;
9) after the meeting ends, the minutes document is finalized, printed, and signed.
Preferably, in step 4) the spectral energy value Ni = [formula missing], and DelNd = Ni - N(i-1).
Preferably, the preprocessing consists of noise removal and signal amplification.
Preferably, the fixed time interval is 10-20 ms.
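Steps 3) and 4) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the applicant's implementation: the function names (frame_signal, magnitude_spectrum, frame_features) are hypothetical, the frame spectrum is obtained with a naive DFT, the spectral centroid uses the common magnitude-weighted mean frequency, and, because the formula for Ni is missing from this text, the energy of a frame is assumed to be the sum of its squared spectral magnitudes. Only the relation DelNd = Ni - N(i-1) and the 10-20 ms frame length are taken from the text.

```python
import cmath
import math


def frame_signal(samples, sample_rate, frame_ms=10):
    """Split samples into consecutive, non-overlapping frames of frame_ms milliseconds."""
    frame_len = int(sample_rate * frame_ms / 1000)
    if frame_len <= 0:
        raise ValueError("frame is too short for this sample rate")
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..n//2 (adequate for short frames)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]


def frame_features(samples, sample_rate, frame_ms=10):
    """Return (centroids C_i, energies N_i, differences N_i - N_(i-1))."""
    centroids, energies = [], []
    for frame in frame_signal(samples, sample_rate, frame_ms):
        mags = magnitude_spectrum(frame)
        freqs = [k * sample_rate / len(frame) for k in range(len(mags))]
        total = sum(mags)
        # Spectral centroid: magnitude-weighted mean frequency (assumed definition).
        centroids.append(
            sum(f * m for f, m in zip(freqs, mags)) / total if total else 0.0)
        # ASSUMPTION: frame energy = sum of squared magnitudes.
        energies.append(sum(m * m for m in mags))
    # Energy difference between consecutive frames, as in DelNd = Ni - N(i-1).
    deltas = [energies[i] - energies[i - 1] for i in range(1, len(energies))]
    return centroids, energies, deltas
```

With this sketch, a recording of k full frames yields k centroids, k energy values, and k - 1 energy differences; the list of differences plays the role of DelNd at enrollment and DelNf during the meeting.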
Compared with prior art, beneficial effects of the present invention:
In use, the invention confirms the identity of each participant by timbre recognition, so that meeting content can be attributed to each participant, avoiding the defect that speakers are hard to distinguish in the minutes. Preprocessing denoises and amplifies the signal, which helps ensure the accuracy of the signal. Identifying the participants' timbre automatically offers good accuracy. Each participant needs to enroll their identity only once; no re-enrollment is needed in later use.
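The text does not say how the noise removal and amplification are carried out. As a rough illustration only, the sketch below swaps in two simple stand-ins: a centered moving average (a crude low-pass filter) for noise removal, and peak normalization for amplification. The function names, the window size, and the target peak of 0.9 are all assumptions made for this example.

```python
def moving_average(samples, window=3):
    """Crude noise reduction: centered moving average (a simple low-pass filter)."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    smoothed = []
    for i in range(len(samples)):
        lo = max(0, i - half)
        hi = min(len(samples), i + half + 1)
        smoothed.append(sum(samples[lo:hi]) / (hi - lo))
    return smoothed


def amplify_to_peak(samples, target_peak=0.9):
    """Scale the signal so that its largest absolute value equals target_peak."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak == 0.0:
        return list(samples)  # silent input: nothing to amplify
    gain = target_peak / peak
    return [s * gain for s in samples]


def preprocess(samples, window=3, target_peak=0.9):
    """Noise removal followed by amplification, in the order given in the text."""
    return amplify_to_peak(moving_average(samples, window), target_peak)
```

A production system would more likely use a dedicated noise-suppression method; the point here is only the order of operations, denoise first and then amplify.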
Specific embodiment
An intelligent method for synchronous speech-to-text transcription for meeting participants, characterized by comprising the following steps:
1) each participant registers by speaking their position and name into a sign-in microphone connected to the control centre; the control centre stores the recorded voice information in order of input and preprocesses it;
2) the preprocessed voice information is converted into a spectrogram and text, and the spectrogram and text are stored in that participant's information bank;
3) the control centre divides each participant's spectrogram into frames at a fixed time interval, splitting the spectrogram into several groups of frame spectra;
4) the control centre extracts features from the frame spectra; the extracted features include the frame spectral centroid Ci (i = 1, 2, ..., n), the spectral energy value Ni (i = 1, 2, ..., n), and the spectral energy difference DelNd;
5) the participant's spectral energy difference DelNd is stored in that participant's information bank;
6) when a participant speaks, the speech is input to the control centre through the microphone at the participant's seat; the control centre preprocesses the speech, converts the preprocessed speech into a speech spectrogram and speech text, and stores the speech text in the meeting minutes document;
7) the control centre calculates the spectral energy difference DelNf as in steps 3) and 4);
8) the control centre compares DelNf with the DelNd values stored in the participants' information banks; when the difference between DelNf and a stored DelNd is less than a set value, the speaker's identity is determined, and the speaker's name is added in front of the text in the minutes document;
9) after the meeting ends, the minutes document is finalized, printed, and signed.
Preferably, in step 4) the spectral energy value Ni = [formula missing], and DelNd = Ni - N(i-1).
Preferably, the preprocessing consists of noise removal and signal amplification.
Preferably, the fixed time interval is 10-20 ms.
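Steps 6), 8) and 9) together produce a minutes document in which each transcribed utterance carries the speaker's name. One possible shape for that document is sketched below. The Minutes class, its method names, the "name: text" layout, and the "Unknown speaker" fallback are hypothetical choices for illustration; the text only states that the speech text is stored in the minutes and that the speaker's name is added in front.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Minutes:
    """Accumulates transcribed utterances, each prefixed with a speaker label."""
    lines: List[str] = field(default_factory=list)

    def add_utterance(self, text: str, speaker: Optional[str] = None) -> None:
        # ASSUMPTION: unidentified speakers get a placeholder label.
        label = speaker if speaker is not None else "Unknown speaker"
        self.lines.append(f"{label}: {text}")

    def render(self) -> str:
        """Return the full minutes as plain text, one utterance per line."""
        return "\n".join(self.lines)
```

For example, adding "Let us begin." for "Participant A" and then an unattributed "Agreed." renders as two lines, the first starting with "Participant A:" and the second with "Unknown speaker:".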
The working principle of the invention is:
In use, the invention identifies the speaker by calculating the spectral energy difference.
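The comparison in step 8) can be sketched as follows. The text does not state how DelNf and DelNd are compared, so this example assumes each is a sequence of per-frame energy differences and uses the mean absolute difference over the overlapping positions as the distance; the function names, the participant names, and the threshold value are hypothetical.

```python
def mean_abs_distance(a, b):
    """Mean absolute difference over the positions both sequences share."""
    n = min(len(a), len(b))
    if n == 0:
        raise ValueError("empty feature sequence")
    return sum(abs(x - y) for x, y in zip(a, b)) / n


def identify_speaker(del_nf, enrolled, threshold):
    """Return the enrolled name whose DelNd is closest to del_nf, or None.

    enrolled maps participant name -> stored DelNd sequence.
    A match is accepted only if its distance is below threshold.
    """
    best_name, best_dist = None, None
    for name, del_nd in enrolled.items():
        dist = mean_abs_distance(del_nf, del_nd)
        if best_dist is None or dist < best_dist:
            best_name, best_dist = name, dist
    if best_dist is not None and best_dist < threshold:
        return best_name
    return None


# Hypothetical enrollment data for two participants.
enrolled = {
    "Participant A": [1.0, -2.0, 0.5],
    "Participant B": [4.0, 4.0, -3.0],
}
match = identify_speaker([1.1, -1.9, 0.4], enrolled, threshold=0.5)
```

Returning None when no enrolled participant is within the threshold leaves the utterance unattributed rather than forcing a wrong name into the minutes.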
It should be noted that the specific implementation of the invention is not limited to the above. Any insubstantial improvement made using the method concept and technical solution of the invention, or any direct application of the concept and technical solution of the invention to other occasions without improvement, falls within the protection scope of the invention.

Claims (4)

1. An intelligent method for synchronous speech-to-text transcription for meeting participants, characterized by comprising the following steps:
1) each participant registers by speaking their position and name into a sign-in microphone connected to the control centre; the control centre stores the recorded voice information in order of input and preprocesses it;
2) the preprocessed voice information is converted into a spectrogram and text, and the spectrogram and text are stored in that participant's information bank;
3) the control centre divides each participant's spectrogram into frames at a fixed time interval, splitting the spectrogram into several groups of frame spectra;
4) the control centre extracts features from the frame spectra; the extracted features include the frame spectral centroid Ci (i = 1, 2, ..., n), the spectral energy value Ni (i = 1, 2, ..., n), and the spectral energy difference DelNd;
5) the participant's spectral energy difference DelNd is stored in that participant's information bank;
6) when a participant speaks, the speech is input to the control centre through the microphone at the participant's seat; the control centre preprocesses the speech, converts the preprocessed speech into a speech spectrogram and speech text, and stores the speech text in the meeting minutes document;
7) the control centre calculates the spectral energy difference DelNf as in steps 3) and 4);
8) the control centre compares DelNf with the DelNd values stored in the participants' information banks; when the difference between DelNf and a stored DelNd is less than a set value, the speaker's identity is determined, and the speaker's name is added in front of the text in the minutes document;
9) after the meeting ends, the minutes document is finalized, printed, and signed.
2. The intelligent method for synchronous speech-to-text transcription for meeting participants according to claim 1, characterized in that in step 4) the spectral energy value Ni = [formula missing], and DelNd = Ni - N(i-1).
3. The intelligent method for synchronous speech-to-text transcription for meeting participants according to claim 1, characterized in that the preprocessing consists of noise removal and signal amplification.
4. The intelligent method for synchronous speech-to-text transcription for meeting participants according to claim 1, characterized in that the fixed time interval is 10-20 ms.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910263845.2A CN110010130A (en) 2019-04-03 2019-04-03 An intelligent method for synchronous speech-to-text transcription for meeting participants

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910263845.2A CN110010130A (en) 2019-04-03 2019-04-03 An intelligent method for synchronous speech-to-text transcription for meeting participants

Publications (1)

Publication Number Publication Date
CN110010130A true CN110010130A (en) 2019-07-12

Family

ID=67169523

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910263845.2A Pending CN110010130A (en) 2019-04-03 2019-04-03 An intelligent method for synchronous speech-to-text transcription for meeting participants

Country Status (1)

Country Link
CN (1) CN110010130A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130144603A1 (en) * 2011-12-01 2013-06-06 Richard T. Lord Enhanced voice conferencing with history
US20170277784A1 (en) * 2016-03-22 2017-09-28 International Business Machines Corporation Audio summarization of meetings driven by user participation
CN107452166A (en) * 2017-06-27 2017-12-08 长江大学 A kind of library book-borrowing method and device based on Application on Voiceprint Recognition
CN108022583A (en) * 2017-11-17 2018-05-11 平安科技(深圳)有限公司 Meeting summary generation method, application server and computer-readable recording medium
CN109388701A (en) * 2018-08-17 2019-02-26 深圳壹账通智能科技有限公司 Minutes generation method, device, equipment and computer storage medium

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110600039A (en) * 2019-09-27 2019-12-20 百度在线网络技术(北京)有限公司 Speaker attribute determination method and device, electronic equipment and readable storage medium
CN110600039B (en) * 2019-09-27 2022-05-20 百度在线网络技术(北京)有限公司 Method and device for determining speaker attribute, electronic equipment and readable storage medium
CN113660378A (en) * 2020-05-12 2021-11-16 宁波维度数字科技有限公司 Intelligent voice automatic conference record generation system

Similar Documents

Publication Publication Date Title
CN103236260B (en) Speech recognition system
TW201824250A (en) Method and apparatus for speaker diarization
CN108922518A (en) voice data amplification method and system
CN108986824B (en) Playback voice detection method
JP2017207770A (en) System and method for fingerprinting datasets
CN104091603B (en) Endpoint detection system and its computational methods based on fundamental frequency
CN105244023A (en) System and method for reminding teacher emotion in classroom teaching
CN104883437B (en) The method and system of speech analysis adjustment reminding sound volume based on environment
WO2021082572A1 (en) Wake-up model generation method, smart terminal wake-up method, and devices
CN108766441A (en) A kind of sound control method and device based on offline Application on Voiceprint Recognition and speech recognition
CN110010130A (en) An intelligent method for synchronous speech-to-text transcription for meeting participants
CN109256150A (en) Speech emotion recognition system and method based on machine learning
CN106297776A (en) A kind of voice keyword retrieval method based on audio template
Sun et al. Speaker diarization system for RT07 and RT09 meeting room audio
CN104103272B (en) Audio recognition method, device and bluetooth earphone
CN107705791A (en) Caller identity confirmation method, device and Voiceprint Recognition System based on Application on Voiceprint Recognition
CN108806698A (en) A kind of camouflage audio recognition method based on convolutional neural networks
CN110136709A (en) Audio recognition method and video conferencing system based on speech recognition
CN101625858B (en) Method for extracting short-time energy frequency value in voice endpoint detection
CN108735200A (en) A kind of speaker's automatic marking method
CN110148419A (en) Speech separating method based on deep learning
CN109377986B (en) Non-parallel corpus voice personalized conversion method
CN109274819A (en) User emotion method of adjustment, device, mobile terminal and storage medium when call
WO2023088083A1 (en) Speech enhancement method and apparatus
CN106887231A (en) A kind of identification model update method and system and intelligent terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20190712