CN105869626A - Automatic speech rate adjusting method and terminal - Google Patents
- Publication number
- CN105869626A (CN201610375868.9A)
- Authority
- CN
- China
- Prior art keywords
- information
- speed
- speech information
- playback
- said speech
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/60—Information retrieval; Database structures therefor; File system structures therefor of audio data
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
- H04M1/724—User interfaces specially adapted for cordless or mobile telephones
- H04M1/72403—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
- H04M1/7243—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
- H04M1/72433—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages for voice messaging, e.g. dictaphones
Abstract
The invention discloses a method for automatically adjusting speech rate. The method comprises: acquiring input speech information; extracting speech feature information from the speech information; querying a speech database for the playback rate of the speech information corresponding to the speech feature information; and adjusting the rate at which the speech information is played according to that playback rate. The method can determine, from the speech feature information of speech information input in real time, the preset playback rate corresponding to that speech feature information, and adjust the speech rate of the input speech information accordingly to meet the requirements of various users. It thus adjusts the playback rate adaptively according to the content of the speech information, and it adapts well to settings such as calls and program playback. The invention further discloses a terminal that likewise adjusts the playback rate adaptively according to the content of the speech information.
Description
Technical field
The present invention relates to the field of communication technology, and in particular to a method and a terminal for automatically adjusting speech rate.
Background art
Because people differ in hearing ability, content played at the same speech rate seems too fast for some people to hear clearly, while others find it so slow that it wastes their time. The speech rate of content played on a terminal therefore needs to be set according to each person's actual needs.
In the prior art, a speech rate control is added to the client application on the user's mobile phone so that the user can choose a speech rate level; the phone then plays the voice content at the level the user selected. This approach has several shortcomings. First, although the speech rate is divided into several levels, the user must set the level manually in advance; the speech rate cannot be adjusted dynamically, that is, adaptively. Second, the adjustment applies only to content played by the phone's client software; the speech rate cannot be adjusted in real time during a call. Finally, the approach cannot adapt to other languages, so the speech rate cannot be adjusted according to the languages of the two parties to a call. How to adjust speech rate adaptively is therefore a technical problem that those skilled in the art need to solve.
Summary of the invention
The object of the present invention is to provide a method and a terminal for automatically adjusting speech rate, which can determine, from the speech feature information of speech information input in real time, the predetermined playback rate corresponding to that speech feature information, and adjust the speech rate of the input speech information according to that playback rate, thereby adjusting the playback rate adaptively according to the content of the speech information.
To solve the above technical problem, the present invention provides a method for automatically adjusting speech rate, comprising:
obtaining input speech information;
extracting speech feature information from the speech information;
querying a speech database for the playback rate of the speech information corresponding to the speech feature information;
adjusting the rate at which the speech information is played according to the playback rate.
Wherein extracting the speech feature information of the speech information comprises:
identifying language feature information of the speech information; and/or
extracting at least one of speech rate information, feature word information, and audio information of the speech information.
Wherein the speech information is the speech information of the local user, and the method further comprises:
obtaining vital sign information of the local user;
in which case querying the speech database for the playback rate of the speech information corresponding to the speech feature information comprises:
querying the speech database for the playback rate of the speech information corresponding to both the speech feature information and the vital sign information.
Wherein, after querying the speech database for the playback rate of the speech information corresponding to both the speech feature information and the vital sign information, the method further comprises:
using the speech feature information and the vital sign information to update, according to a machine learning algorithm, the correspondence of playback rates in the speech database.
Wherein adjusting the rate at which the speech information is played according to the playback rate comprises:
resampling the digital signal of the speech information by interpolation or decimation, so that the time scale of the speech information is adjusted to reach the playback rate.
The present invention also provides a terminal, comprising:
a speech information acquisition module, configured to obtain input speech information;
a speech feature extraction module, configured to extract speech feature information from the speech information;
a playback rate determination module, configured to query a speech database for the playback rate of the speech information corresponding to the speech feature information;
a playback rate adjustment module, configured to adjust the rate at which the speech information is played according to the playback rate.
Wherein the speech feature extraction module comprises:
a first speech feature extraction unit, configured to identify language feature information of the speech information; and/or
a second speech feature extraction unit, configured to extract at least one of speech rate information, feature word information, and audio information of the speech information.
Wherein the speech information is the speech information of the local user, and the terminal further comprises:
a vital sign information acquisition module, configured to obtain vital sign information of the local user.
Wherein the terminal further comprises:
a machine learning module, configured to use the speech feature information and the vital sign information to update, according to a machine learning algorithm, the correspondence of playback rates in the speech database.
Wherein the playback rate adjustment module is specifically a module that resamples the digital signal of the speech information by interpolation or decimation, so that the time scale of the speech information is adjusted to reach the playback rate.
The method for automatically adjusting speech rate provided by the present invention comprises: obtaining input speech information; extracting speech feature information from the speech information; querying a speech database for the playback rate of the speech information corresponding to the speech feature information; and adjusting the rate at which the speech information is played according to the playback rate.
It can be seen that the method determines, from the speech feature information of speech information input in real time, the predetermined playback rate corresponding to that speech feature information, and adjusts the speech rate of the input speech information according to that playback rate, so as to meet the needs of various users. The playback rate is thus adjusted adaptively according to the content of the speech information, and because the method can be used in settings such as user calls and program playback, its adaptability is improved. The present invention also provides a terminal with the same beneficial effects, which are not repeated here.
Brief description of the drawings
To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are merely embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from the drawings provided without creative effort.
Fig. 1 is a flow chart of the method for automatically adjusting speech rate provided by an embodiment of the present invention;
Fig. 2 is a structural block diagram of a terminal provided by an embodiment of the present invention;
Fig. 3 is a structural block diagram of another terminal provided by an embodiment of the present invention;
Fig. 4 is a structural block diagram of yet another terminal provided by an embodiment of the present invention.
Detailed description of the invention
The core of the present invention is to provide a method and a terminal for automatically adjusting speech rate, which can determine, from the speech feature information of speech information input in real time, the predetermined playback rate corresponding to that speech feature information, and adjust the speech rate of the input speech information according to that playback rate, thereby adjusting the playback rate adaptively according to the content of the speech information.
To make the purpose, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Referring to Fig. 1, Fig. 1 is a flow chart of the method for automatically adjusting speech rate provided by an embodiment of the present invention. The executing entity in this embodiment is a terminal, which may be a mobile phone. The method may include:
S100: obtaining input speech information.
The speech information can be obtained by monitoring the call service and the applications capable of voice playback. That is, it may be the speech of the local user while making or answering a call, the speech of the remote user while making or answering a call, or the speech played by an application with a voice playback function.
S110: extracting speech feature information from the speech information.
The kinds of speech feature information extracted, and how many kinds, can be decided according to the user's actual needs, as long as the speech feature information contained in the speech information can be mapped, according to a preset standard, to a playback rate for that speech information. In other words, automatic speech rate adjustment is achieved by using the speech feature information in the speech information to adjust its playback speech rate according to a preset standard. For example, the speech feature information may include feature information such as emotion, language, phonetic features, speech rate, and intonation.
S120: querying a speech database for the playback rate of the speech information corresponding to the speech feature information.
Once the speech feature information to be extracted has been decided, the user can preset a playback rate for each kind of speech feature information, or have several kinds of speech feature information jointly determine one playback rate. These correspondences can be stored in the speech database as a correspondence list or as a mapping table. The user can also modify, delete, or add correspondences stored in the speech database as circumstances change, so that the playback rate set for each kind of speech feature information stays up to date and meets the user's actual needs.
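As a minimal sketch of such a mapping table, the following assumes the speech database is an in-memory dictionary keyed by a (feature kind, feature value) pair. The key layout, the function names, and the example rates are illustrative assumptions, not details given in the patent.

```python
# Hypothetical mapping table: (feature kind, feature value) -> playback rate.
# A rate of 1.0 means unchanged; values below 1.0 slow playback down.
speech_database = {
    ("language", "english"): 0.8,
    ("emotion", "irritable"): 0.85,
}


def query_playback_rate(db, kind, value, default=1.0):
    """Return the preset rate for one feature, or the default if none is stored."""
    return db.get((kind, value), default)


def set_playback_rate(db, kind, value, rate):
    """Add a new correspondence or modify an existing one."""
    if rate <= 0:
        raise ValueError("playback rate must be positive")
    db[(kind, value)] = rate


def delete_playback_rate(db, kind, value):
    """Remove a correspondence; do nothing if it is absent."""
    db.pop((kind, value), None)
```

A feature with no stored correspondence falls back to the default rate, so unknown input is played unchanged.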
Querying the speech database may also include comparing the extracted speech feature information with the range intervals stored in the speech database for that kind of speech feature information, determining which range the value of the extracted speech feature information falls in, and then determining the preset playback rate for that range. The user can also modify the range intervals of the speech feature information, or the preset playback rate for each range, according to actual needs, to suit individual requirements and improve the user experience.
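The range comparison can be sketched as a sorted-boundary lookup. The unit (syllables per second), the boundary values, and the preset rates below are assumptions chosen only for illustration.

```python
from bisect import bisect_right

# Hypothetical speech-rate boundaries in syllables per second.
# Three boundaries split the axis into four ranges.
BOUNDARIES = [3.0, 5.0, 7.0]
# One preset playback rate per range, slowest speech first.
PRESET_RATES = [1.0, 1.0, 0.9, 0.75]


def rate_for_speech_rate(value, boundaries=BOUNDARIES, rates=PRESET_RATES):
    """Find the range the measured value falls in and return its preset rate."""
    if len(rates) != len(boundaries) + 1:
        raise ValueError("need exactly one rate per range")
    # bisect_right returns the index of the range containing the value;
    # a value equal to a boundary belongs to the range above it.
    return rates[bisect_right(boundaries, value)]
```

Editing `BOUNDARIES` or `PRESET_RATES` corresponds to the user modifying the range intervals or the preset rate of each range.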
S130: adjusting the rate at which the speech information is played according to the playback rate.
The speech information is adjusted according to the playback rate obtained, so that it reaches that playback rate. The specific adjustment method is not limited here, as long as the obtained speech information can be adjusted to the corresponding playback rate and played. One concrete speech rate adjustment process is as follows: resample the digital signal of the speech information by interpolation or decimation, so that the time scale of the speech information is adjusted to reach the playback rate. That is, resampling the digital signal by interpolation or decimation lengthens or shortens the time scale of the speech, thereby changing the speech rate.
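The resampling step can be sketched with plain linear interpolation over a list of samples. This is one simple way to realize "interpolation or decimation", not necessarily the filter the patent intends; the function name and the convention that a rate above 1.0 means faster playback are assumptions. Note that plain resampling played back at the original sample rate also shifts pitch, which the source text does not address.

```python
def resample_linear(samples, playback_rate):
    """Stretch or shrink a sample sequence by linear interpolation.

    The result has about len(samples) / playback_rate samples, so a
    playback_rate above 1.0 shortens the speech (faster) and a rate
    below 1.0 lengthens it (slower).
    """
    if playback_rate <= 0:
        raise ValueError("playback_rate must be positive")
    if len(samples) < 2:
        return list(samples)
    out_len = max(2, round(len(samples) / playback_rate))
    # Distance between output samples, measured in input sample positions.
    step = (len(samples) - 1) / (out_len - 1)
    out = []
    for i in range(out_len):
        pos = i * step
        left = min(int(pos), len(samples) - 2)
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[left + 1] * frac)
    return out
```

For example, `resample_linear(frame, 0.8)` would lengthen a frame to roughly 1.25 times its original duration before it is sent to the earpiece.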
For example, calling is a basic service, and one of the most important functions, when people use a mobile phone. But some people speak quickly and some people have poor hearing, which makes communication difficult. While the user is on a call, the method collects speech feature information such as the emotion, language, and phonetic features of both parties from the input speech information obtained and compares it with the information in the speech database to make a judgment. If the speech rate is too fast, or the remote end gives abnormal feedback, the method determines the playback rate corresponding to that speech rate or to that abnormal feedback, and resamples the digital signal by interpolation or decimation, lengthening or shortening the time scale of the speech and so changing the speech rate. In this way, the speed of the sound played from the earpiece is adjusted automatically according to factors such as the language used and the changes in emotion of the local or remote user during the call, to suit the needs of different people.
Optionally, a machine learning algorithm is used to learn from and update the speech database. A speech database is maintained in the terminal and can store the user's speech feature information parameters; the machine learning algorithm takes these parameters as input and learns from them to update the speech database. The adjustment can then follow the long-term usage habits of different user groups rather than relying entirely on the initially configured data, which gives better adaptability.
The above example can be implemented as follows:
When the local user, i.e. the calling user, is anxious about something or emotionally agitated, and the words and phrases in the speech content match the database's definition of an "irritable" user, the speech rate of the input speech information obtained is reduced according to the playback rate corresponding to "irritable". This has a calming effect, so that users can use the phone's call function more efficiently and amicably.
As another example, when the calling user speaks English, the speech feature information is used to determine that the language is English, and the speech rate of the input speech information is adjusted according to the playback rate corresponding to English. After adjustment, the called user, i.e. the remote user, hears the slowed-down speech, which to some extent solves the problem of users having difficulty understanding when communicating with non-native speakers.
Based on the above technical solution, the method for automatically adjusting speech rate provided by the embodiment of the present invention can determine, from the speech feature information of speech information input in real time, the predetermined playback rate corresponding to that speech feature information, and adjust the speech rate of the input speech information according to that playback rate, so as to meet the needs of various users. The playback rate is thus adjusted adaptively according to the content of the speech information, and because the method can be used in settings such as user calls and program playback, its adaptability is improved. Different users can have the voice playback rate adapted to their own needs, which improves the user experience.
Building on the above embodiment, this embodiment can adaptively adjust the speech information to the playback rate corresponding to each language according to the language of the input speech information; that is, it adjusts the playback rate adaptively by language. Preferably, extracting the speech feature information of the speech information is specifically:
identifying language feature information of the speech information.
By recognizing the input speech information obtained, the language feature information of the speech information can be obtained. This language feature information may include audio parameters and feature word information. The rate at which the speech information is played is determined from the preset playback rate corresponding to the language feature information. The user may set a playback rate for every language individually; or set playback rates for a predetermined number of languages; or divide languages into several broad classes and set a playback rate only for each class. In the last case, the language feature information may itself be the class information, or the language may be identified first, the class it belongs to determined, and the corresponding playback rate then determined. The correspondence between languages and playback rates can be implemented as a correspondence list or a mapping table.
The language feature information can be recognized by using a user language recognition system and a language text translation system to synthesize a "reference voice" of the user for each language, and by recognition based on Markov models of segments and syllables, pitch contours, formant vectors, acoustic features, the phonemes and prosodic features of dialects, and the features of the original speech waveform. The classification methods used may include HMMs, expert systems, clustering algorithms, secondary classification, and artificial neural networks.
The above embodiment is illustrated below through several concrete application scenarios:
When input speech information is detected in an application on the terminal, the speech information obtained is recognized. If the language feature information is determined to be English, the playback rate the user has preset for English is determined, and the speech rate of the speech information is adjusted to that playback rate. English is used here only as an example.
During a call, the language of the local user's speech information alone may be detected, the language of the remote user's speech information alone may be detected, or the languages of both users' speech information may be detected. The last case is used as an example below:
At the start, the mobile phone is in a normal call state and the calling and called parties are connected. The speech information acquisition module obtains the input speech information. The speech feature extraction module extracts the audio parameters and the key words and phrases of both parties. The playback rate determination module parses the extracted audio parameters, queries the speech database to determine the language, and determines the playback rate the user has preset for that language. The playback rate adjustment module lengthens or shortens the speech information in time. The earpiece plays the processed speech information. Both parties hang up and the call is complete.
In this embodiment, users can decide how well they can follow each language according to their own circumstances and set the playback rates accordingly, which solves the problem of users having difficulty understanding when communicating with non-native speakers.
Building on any of the above embodiments, this embodiment mainly addresses voice communication between users, in which the speech rate may be too fast or a user may be emotionally agitated. So that communication between users can proceed smoothly in these situations, the user's state is determined from the speech feature information of the user's speech information, and the playback rate set for that state is determined; that is, the playback rate is adjusted adaptively according to the user's speaking state. Preferably, extracting the speech feature information of the speech information is specifically:
extracting at least one of speech rate information, feature word information, and audio information of the speech information.
This requires first determining the user state that each kind of speech feature information corresponds to or reflects, and then determining what playback rate should be set for that state. The judgment may be based on speech rate information alone, on feature word information alone, and so on; that is, speech rate information, feature word information, and audio information can be combined in any way.
When a single kind is used, the cases of that kind of speech feature information are classified and a playback rate is set for each class. Take speech rate information as an example: a user who is irritable generally speaks too fast, so when the speech rate information exceeds a certain value the user can be considered irritable, and the speech information is set to the playback rate predetermined for the irritable state. Alternatively, the speech rate can be divided into several speech rate ranges, with a playback rate set for each range.
To improve the accuracy of speech rate adjustment, speech rate information, feature word information, and audio information are preferably used together, that is, the playback rate is determined from the three features combined. For example, a user who is irritable generally speaks too fast, uses certain particular words (users can set, based on their own characteristics, the words they habitually use when irritable), and raises their voice. If all three, or at least two, of these occur, the user can be considered irritable, and the speech information is set to the playback rate predetermined for the irritable state.
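The "all three, or at least two" rule amounts to counting how many cues are present. The sketch below assumes each cue reduces to a simple threshold test; the thresholds, the habit words, the normalized `voice_level` scale, and the function names are all hypothetical.

```python
def is_irritable(
    speech_rate,
    words,
    voice_level,
    *,
    rate_threshold=6.0,                      # assumed syllables per second
    habit_words=frozenset({"hurry", "now"}),  # assumed user-defined words
    level_threshold=0.7,                     # assumed normalized voice level
    min_cues=2,
):
    """Return True when at least min_cues of the three cues are present."""
    cues = [
        speech_rate > rate_threshold,
        any(word.lower() in habit_words for word in words),
        voice_level > level_threshold,
    ]
    return sum(cues) >= min_cues


def playback_rate_for_state(irritable, irritable_rate=0.8, normal_rate=1.0):
    """Pick the preset rate for the irritable state, else the normal rate."""
    return irritable_rate if irritable else normal_rate
```

Setting `min_cues=3` gives the stricter variant in which all three cues must occur together.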
The speech rate information, feature word information, and audio information in this embodiment can be combined freely with the language feature information, for example by setting a playback rate for each speech rate range in English and a playback rate for each speech rate range in Chinese.
Based on the above embodiment, the speech rate of a call can be adjusted adaptively for the user. Different users can change the voice playback rate according to their own needs, which improves the user experience.
Building on any of the above embodiments, this embodiment mainly aims to determine the state of the local user more accurately, and thereby the playback rate for the local user in that state; the playback rate can be adjusted adaptively according to the local user's speaking state. Preferably, the speech information is the speech information of the local user, and the method may further include:
obtaining vital sign information of the local user;
in which case querying the speech database for the playback rate of the speech information corresponding to the speech feature information comprises:
querying the speech database for the playback rate of the speech information corresponding to both the speech feature information and the vital sign information.
The above embodiments can determine the user's state from speech rate information, feature word information, and audio information. To determine more accurately whether the local user is in that state, vital sign information of the local user can also be obtained; the vital sign information may include the local user's body temperature, pulse, and so on. It can be collected by a smart wearable device paired with the terminal, such as a smart bracelet.
For example, when the local user, i.e. the calling user, is anxious about something or emotionally agitated, the words and phrases in the speech content match the database's definition of an irritable user, and the smart bracelet reports information such as a quickened pulse, the user can be judged to be in an irritable state, and the speech rate of the input speech information obtained can be reduced according to the playback rate corresponding to the irritable state. This has a calming effect, so that users can use the phone's call function more efficiently and amicably. The specific process can be as follows:
The mobile phone is in a normal call state and the calling and called parties are connected. The user's speech information is collected, and information such as body temperature and pulse during the call is collected through the smart bracelet. The speech database is queried, and the changes in body temperature and pulse during the call are combined with the use of key words and phrases, i.e. feature word information, to judge whether the user is emotionally agitated. Whether adjustment is needed is judged from the speech rate information. If the conditions for adjustment are met, the adjustment is made according to the preset value in the speech database, and a new playback rate is determined. The speech data is lengthened or shortened in time, and the earpiece plays the processed speech data. The user's emotion change information and characteristic expressions can also be written to the speech database to optimize subsequent emotion judgments.
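The decision flow above can be sketched as two checks: an agitation check that combines vital sign changes with feature word use, followed by a speech rate check. The field names, the thresholds, and the way the conditions are combined are assumptions for illustration; the patent does not fix them.

```python
from dataclasses import dataclass


@dataclass
class CallSnapshot:
    """Hypothetical measurements gathered during a call."""
    speech_rate: float         # syllables per second
    feature_word_hits: int     # key words and phrases matched in the database
    pulse_bpm: float           # current pulse from the wearable
    resting_pulse_bpm: float   # baseline pulse
    body_temp_c: float         # current body temperature
    baseline_temp_c: float     # baseline body temperature


def is_agitated(s, pulse_rise=15.0, temp_rise=0.3):
    """Combine vital sign changes with feature word use."""
    vital_change = (
        s.pulse_bpm - s.resting_pulse_bpm >= pulse_rise
        or s.body_temp_c - s.baseline_temp_c >= temp_rise
    )
    return vital_change and s.feature_word_hits > 0


def choose_playback_rate(s, preset_rate=0.8, rate_threshold=6.0):
    """Return the preset rate only when the user is agitated and speaking fast."""
    if is_agitated(s) and s.speech_rate > rate_threshold:
        return preset_rate
    return 1.0
```

A snapshot with a pulse 25 bpm above baseline, one matched feature word, and a speech rate of 7.0 would be slowed to the preset rate; the same speech without any vital sign change would be played unchanged.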
Based on any of the above embodiments, this embodiment mainly improves the accuracy of the voice database. Therefore, the method further includes:
Using the voice feature information and the vital-sign information, updating the correspondence of playback speeds in the voice database according to a machine learning algorithm.
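The patent does not name a particular machine learning algorithm. As a hedged illustration only, the sketch below uses a simple online update (an exponential moving average) that nudges the stored playback speed for a feature key toward a newly observed suitable speed. The class name `SpeedTable`, the 20-bpm pulse buckets, and the learning rate are all assumptions made for this example.

```python
from typing import Dict, Tuple


class SpeedTable:
    """Hypothetical voice-database table mapping (feature word, pulse band) to a speed."""

    def __init__(self, default_speed: float = 1.0, learning_rate: float = 0.2) -> None:
        self.default_speed = default_speed
        self.learning_rate = learning_rate
        self._speeds: Dict[Tuple[str, int], float] = {}

    @staticmethod
    def key(feature_word: str, pulse_bpm: float) -> Tuple[str, int]:
        # Bucket the pulse into 20-bpm bands so nearby readings share an entry.
        return (feature_word.lower(), int(pulse_bpm // 20))

    def lookup(self, feature_word: str, pulse_bpm: float) -> float:
        return self._speeds.get(self.key(feature_word, pulse_bpm), self.default_speed)

    def update(self, feature_word: str, pulse_bpm: float, observed_speed: float) -> float:
        """Move the stored speed a fraction of the way toward the observed speed."""
        k = self.key(feature_word, pulse_bpm)
        old = self._speeds.get(k, self.default_speed)
        new = old + self.learning_rate * (observed_speed - old)
        self._speeds[k] = new
        return new
```

A small learning rate keeps one unusual call from overwriting the preset value, while repeated observations gradually shift the entry toward the user's long-term habits.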
Here, a voice database is maintained in the terminal and can store the user's audio information parameters, so that the terminal has a learning capability for speech-rate adjustment. Adjustment can follow the long-term usage habits of different user groups rather than relying entirely on the originally preset data, which gives better adaptability. With this learning capability, the key terms that the user frequently uses (that is, the feature-word information) can be continually updated, to optimize subsequent judgments of the user's emotion.
Based on the above technical solution, the method for automatically adjusting speech rate provided by the embodiments of the present invention can determine, from the voice feature information of voice information input in real time, the predetermined playback speed corresponding to that voice feature information, and adjust the speech rate of the input voice information according to this playback speed, so as to meet the needs of various users. That is, the playback speed is adjusted adaptively according to the content of the voice information. The method can be used in scenarios such as user calls and program playback, which improves its adaptability. Different users can thus have the voice playback speed adapted to their own needs, which improves the user experience.
The embodiments of the present invention provide a method for automatically adjusting speech rate, which can determine, from the voice feature information of voice information input in real time, the predetermined playback speed corresponding to that voice feature information, and adjust the speech rate of the input voice information according to this playback speed.
The terminal provided by the embodiments of the present invention is introduced below. The terminal described below and the method for automatically adjusting speech rate described above may be referred to in correspondence with each other.
Refer to Fig. 2, which is a structural block diagram of the terminal provided by an embodiment of the present invention. The terminal may include:
Voice information acquisition module 100, configured to obtain input voice information;
Voice feature extraction module 200, configured to extract voice feature information of the voice information;
Playback speed determination module 300, configured to query, from a voice database, the playback speed of the voice information corresponding to the voice feature information;
Playback speed adjustment module 400, configured to adjust the speed at which the voice information is played according to the playback speed.
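How modules 100 to 300 might hand data to one another can be sketched as follows. This is an illustrative arrangement, not the patent's implementation: the data classes, the words-per-second speech-rate measure, and the contents of `VOICE_DATABASE` are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class VoiceInfo:
    """Output of the acquisition module (100): an utterance plus recognition results."""
    words: List[str]
    duration_s: float
    language: str


@dataclass
class VoiceFeatures:
    """Output of the feature extraction module (200)."""
    language: str
    speech_rate_wps: float  # words per second


# Hypothetical voice database: per language, (upper bound on speech rate, playback speed).
VOICE_DATABASE: Dict[str, List[Tuple[float, float]]] = {
    "en": [(2.0, 1.2), (3.5, 1.0), (float("inf"), 0.8)],
}


def extract_features(voice: VoiceInfo) -> VoiceFeatures:
    """Module 200: derive language and speech-rate features."""
    rate = len(voice.words) / voice.duration_s if voice.duration_s > 0 else 0.0
    return VoiceFeatures(language=voice.language, speech_rate_wps=rate)


def query_playback_speed(features: VoiceFeatures, default: float = 1.0) -> float:
    """Module 300: look up the preset speed for the first matching rate band."""
    for upper_bound, speed in VOICE_DATABASE.get(features.language, []):
        if features.speech_rate_wps < upper_bound:
            return speed
    return default


def determine_speed(voice: VoiceInfo) -> float:
    """Chain modules 200 and 300; module 400 would then rescale the audio."""
    return query_playback_speed(extract_features(voice))
```

In this sketch fast speech is slowed down and slow speech is sped up; a language with no database entry falls back to the unchanged speed of 1.0.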
Optionally, the voice feature extraction module 200 includes:
A first voice feature extraction unit, configured to identify language feature information of the voice information; and/or
A second voice feature extraction unit, configured to extract at least one of speech-rate information, feature-word information, and audio information of the voice information.
Optionally, referring to Fig. 3, the voice information is the voice information of the local user, and the terminal further includes:
Vital-sign information acquisition module 500, configured to obtain the vital-sign information of the local user.
In this case, the playback speed determination module 300 is specifically a module that queries, from the voice database, the playback speed of the voice information corresponding to both the voice feature information and the vital-sign information.
Optionally, referring to Fig. 4, the terminal further includes:
Machine learning module 600, configured to use the voice feature information and the vital-sign information to update the correspondence of playback speeds in the voice database according to a machine learning algorithm.
Optionally, the playback speed adjustment module 400 is specifically a module that resamples the digital signal of the voice information by interpolation or decimation, adjusting the time scale of the voice information until it reaches the playback speed.
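A minimal sketch of such resampling, assuming linear interpolation between neighboring samples and playback at the original sample rate, is shown below. The function name and the choice of linear interpolation are assumptions for illustration; the patent only states that interpolation or decimation is used.

```python
from typing import List


def resample_for_speed(samples: List[float], playback_speed: float) -> List[float]:
    """Stretch or compress a signal in time by linear-interpolation resampling.

    playback_speed > 1 shortens the signal (decimation-like);
    playback_speed < 1 lengthens it (interpolation).
    """
    if playback_speed <= 0:
        raise ValueError("playback_speed must be positive")
    if len(samples) < 2:
        return list(samples)

    last = len(samples) - 1
    out_len = max(1, round(len(samples) / playback_speed))
    out: List[float] = []
    for i in range(out_len):
        # Position of output sample i on the input time axis, clamped to the end.
        pos = min(i * playback_speed, float(last))
        left = int(pos)
        right = min(left + 1, last)
        frac = pos - left
        out.append(samples[left] * (1.0 - frac) + samples[right] * frac)
    return out
```

Note that plain resampling played back at the original sample rate also shifts pitch, and decimation without a low-pass filter can introduce aliasing. A practical implementation would likely add filtering or use a pitch-preserving time-scale modification technique; that is outside what this sketch shows.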
Based on any of the above embodiments, the terminal may specifically be a mobile phone.
In this specification, the embodiments are described progressively. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for relevant details, refer to the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.
The steps of the method or algorithm described in connection with the embodiments disclosed herein can be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module can reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, a register, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The method and terminal for automatically adjusting speech rate provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention; the description of the above embodiments is only intended to help in understanding the method of the present invention and its core idea. It should be pointed out that those of ordinary skill in the art can make several improvements and modifications to the present invention without departing from its principle, and these improvements and modifications also fall within the protection scope of the claims of the present invention.
Claims (10)
1. A method for automatically adjusting speech rate, characterized by comprising:
obtaining input voice information;
extracting voice feature information of the voice information;
querying, from a voice database, the playback speed of the voice information corresponding to the voice feature information;
adjusting the speed at which the voice information is played according to the playback speed.
2. The method for automatically adjusting speech rate as claimed in claim 1, characterized in that extracting the voice feature information of the voice information comprises:
identifying language feature information of the voice information; and/or
extracting at least one of speech-rate information, feature-word information, and audio information of the voice information.
3. The method for automatically adjusting speech rate as claimed in claim 1 or 2, characterized in that the voice information is the voice information of a local user, and the method further comprises:
obtaining vital-sign information of the local user;
wherein querying, from the voice database, the playback speed of the voice information corresponding to the voice feature information comprises:
querying, from the voice database, the playback speed of the voice information corresponding to the voice feature information and the vital-sign information.
4. The method for automatically adjusting speech rate as claimed in claim 3, characterized in that, after querying, from the voice database, the playback speed of the voice information corresponding to the voice feature information and the vital-sign information, the method further comprises:
using the voice feature information and the vital-sign information, updating the correspondence of playback speeds in the voice database according to a machine learning algorithm.
5. The method for automatically adjusting speech rate as claimed in claim 1, characterized in that adjusting the speed at which the voice information is played according to the playback speed comprises:
resampling the digital signal of the voice information by interpolation or decimation, and adjusting the time scale of the voice information to reach the playback speed.
6. A terminal, characterized by comprising:
a voice information acquisition module, configured to obtain input voice information;
a voice feature extraction module, configured to extract voice feature information of the voice information;
a playback speed determination module, configured to query, from a voice database, the playback speed of the voice information corresponding to the voice feature information;
a playback speed adjustment module, configured to adjust the speed at which the voice information is played according to the playback speed.
7. The terminal as claimed in claim 6, characterized in that the voice feature extraction module comprises:
a first voice feature extraction unit, configured to identify language feature information of the voice information; and/or
a second voice feature extraction unit, configured to extract at least one of speech-rate information, feature-word information, and audio information of the voice information.
8. The terminal as claimed in claim 6 or 7, characterized in that the voice information is the voice information of a local user, and the terminal further comprises:
a vital-sign information acquisition module, configured to obtain vital-sign information of the local user.
9. The terminal as claimed in claim 8, characterized by further comprising:
a machine learning module, configured to use the voice feature information and the vital-sign information to update the correspondence of playback speeds in the voice database according to a machine learning algorithm.
10. The terminal as claimed in claim 6, characterized in that the playback speed adjustment module is specifically a module that resamples the digital signal of the voice information by interpolation or decimation, adjusting the time scale of the voice information to reach the playback speed.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610375868.9A CN105869626B (en) | 2016-05-31 | 2016-05-31 | A kind of method and terminal of word speed automatic adjustment |
PCT/CN2016/087741 WO2017206256A1 (en) | 2016-05-31 | 2016-06-29 | Method for automatically adjusting speaking speed and terminal |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610375868.9A CN105869626B (en) | 2016-05-31 | 2016-05-31 | A kind of method and terminal of word speed automatic adjustment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105869626A true CN105869626A (en) | 2016-08-17 |
CN105869626B CN105869626B (en) | 2019-02-05 |
Family
ID=56643245
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610375868.9A Active CN105869626B (en) | 2016-05-31 | 2016-05-31 | A kind of method and terminal of word speed automatic adjustment |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN105869626B (en) |
WO (1) | WO2017206256A1 (en) |
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106448653A (en) * | 2016-09-27 | 2017-02-22 | 惠州市德赛工业研究院有限公司 | Wearable intelligent terminal |
CN106486111A (en) * | 2016-10-14 | 2017-03-08 | 北京光年无限科技有限公司 | Many tts engines output word speed control method and system based on intelligent robot |
CN106534964A (en) * | 2016-11-23 | 2017-03-22 | 广东小天才科技有限公司 | Speed adjusting method and device |
CN107689229A (en) * | 2017-09-25 | 2018-02-13 | 广东小天才科技有限公司 | A kind of method of speech processing and device for wearable device |
CN108630224A (en) * | 2018-03-22 | 2018-10-09 | 北京云知声信息技术有限公司 | Control the method and device of word speed |
CN108984078A (en) * | 2017-05-31 | 2018-12-11 | 联想(新加坡)私人有限公司 | The method and information processing unit of output setting are adjusted based on the user identified |
CN109119088A (en) * | 2018-08-29 | 2019-01-01 | 歌尔科技有限公司 | A kind of adjusting method of audio signal, device, equipment and computer storage medium |
CN109147802A (en) * | 2018-10-22 | 2019-01-04 | 珠海格力电器股份有限公司 | A kind of broadcasting word speed adjusting method and device |
CN109348068A (en) * | 2018-12-03 | 2019-02-15 | 咪咕数字传媒有限公司 | A kind of information processing method, device and storage medium |
CN109521718A (en) * | 2019-01-11 | 2019-03-26 | 深圳汉尼康科技有限公司 | Electronic speech device and control method |
CN109582275A (en) * | 2018-12-03 | 2019-04-05 | 珠海格力电器股份有限公司 | Voice regulation method, device, storage medium and electronic device |
CN109979474A (en) * | 2019-03-01 | 2019-07-05 | 珠海格力电器股份有限公司 | Speech ciphering equipment and its user speed modification method, device and storage medium |
CN110798327A (en) * | 2019-09-04 | 2020-02-14 | 腾讯科技(深圳)有限公司 | Message processing method, device and storage medium |
CN111031386A (en) * | 2019-12-17 | 2020-04-17 | 腾讯科技(深圳)有限公司 | Video dubbing method and device based on voice synthesis, computer equipment and medium |
CN111292737A (en) * | 2018-12-07 | 2020-06-16 | 阿里巴巴集团控股有限公司 | Voice interaction and voice awakening detection method, device, equipment and storage medium |
CN112185363A (en) * | 2020-10-21 | 2021-01-05 | 北京猿力未来科技有限公司 | Audio processing method and device |
CN112185403A (en) * | 2020-09-07 | 2021-01-05 | 广州多益网络股份有限公司 | Voice signal processing method and device, storage medium and terminal equipment |
CN112423019A (en) * | 2020-11-17 | 2021-02-26 | 北京达佳互联信息技术有限公司 | Method and device for adjusting audio playing speed, electronic equipment and storage medium |
CN112565880A (en) * | 2020-12-28 | 2021-03-26 | 北京五街科技有限公司 | Method for playing explanation videos |
CN112565881A (en) * | 2020-12-28 | 2021-03-26 | 北京五街科技有限公司 | Self-adaptive video playing method |
CN112750456A (en) * | 2020-09-11 | 2021-05-04 | 腾讯科技(深圳)有限公司 | Voice data processing method and device in instant messaging application and electronic equipment |
CN112820289A (en) * | 2020-12-31 | 2021-05-18 | 广东美的厨房电器制造有限公司 | Voice playing method, voice playing system, electric appliance and readable storage medium |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112750436B (en) * | 2020-12-29 | 2022-12-30 | 上海掌门科技有限公司 | Method and equipment for determining target playing speed of voice message |
CN113470617A (en) * | 2021-06-28 | 2021-10-01 | 科大讯飞股份有限公司 | Speech recognition method, electronic device and storage device |
CN114979798B (en) * | 2022-04-21 | 2024-03-22 | 维沃移动通信有限公司 | Playing speed control method and electronic equipment |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070177633A1 (en) * | 2006-01-30 | 2007-08-02 | Inventec Multimedia & Telecom Corporation | Voice speed adjusting system of voice over Internet protocol (VoIP) phone and method therefor |
CN101427314A (en) * | 2006-04-25 | 2009-05-06 | 英特尔公司 | Method and apparatus for automatic adjustment of play speed of audio data |
CN101860617A (en) * | 2009-04-12 | 2010-10-13 | 比亚迪股份有限公司 | Mobile terminal with voice processing effect and method thereof |
JP2011087196A (en) * | 2009-10-16 | 2011-04-28 | Nec Saitama Ltd | Telephone set, and speech speed conversion method of telephone set |
JP2015184349A (en) * | 2014-03-20 | 2015-10-22 | 日本放送協会 | Voice signal processing device and program |
CN105405439A (en) * | 2015-11-04 | 2016-03-16 | 科大讯飞股份有限公司 | Voice playing method and device |
-
2016
- 2016-05-31 CN CN201610375868.9A patent/CN105869626B/en active Active
- 2016-06-29 WO PCT/CN2016/087741 patent/WO2017206256A1/en active Application Filing
Cited By (30)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106448653A (en) * | 2016-09-27 | 2017-02-22 | 惠州市德赛工业研究院有限公司 | Wearable intelligent terminal |
CN106486111A (en) * | 2016-10-14 | 2017-03-08 | 北京光年无限科技有限公司 | Many tts engines output word speed control method and system based on intelligent robot |
CN106534964B (en) * | 2016-11-23 | 2020-02-14 | 广东小天才科技有限公司 | Method and device for adjusting speech rate |
CN106534964A (en) * | 2016-11-23 | 2017-03-22 | 广东小天才科技有限公司 | Speed adjusting method and device |
CN108984078A (en) * | 2017-05-31 | 2018-12-11 | 联想(新加坡)私人有限公司 | The method and information processing unit of output setting are adjusted based on the user identified |
CN107689229A (en) * | 2017-09-25 | 2018-02-13 | 广东小天才科技有限公司 | A kind of method of speech processing and device for wearable device |
CN108630224A (en) * | 2018-03-22 | 2018-10-09 | 北京云知声信息技术有限公司 | Control the method and device of word speed |
CN108630224B (en) * | 2018-03-22 | 2020-06-09 | 云知声智能科技股份有限公司 | Method and device for controlling speech rate |
CN109119088A (en) * | 2018-08-29 | 2019-01-01 | 歌尔科技有限公司 | A kind of adjusting method of audio signal, device, equipment and computer storage medium |
CN109147802A (en) * | 2018-10-22 | 2019-01-04 | 珠海格力电器股份有限公司 | A kind of broadcasting word speed adjusting method and device |
CN109348068A (en) * | 2018-12-03 | 2019-02-15 | 咪咕数字传媒有限公司 | A kind of information processing method, device and storage medium |
CN109582275A (en) * | 2018-12-03 | 2019-04-05 | 珠海格力电器股份有限公司 | Voice regulation method, device, storage medium and electronic device |
CN111292737A (en) * | 2018-12-07 | 2020-06-16 | 阿里巴巴集团控股有限公司 | Voice interaction and voice awakening detection method, device, equipment and storage medium |
CN109521718A (en) * | 2019-01-11 | 2019-03-26 | 深圳汉尼康科技有限公司 | Electronic speech device and control method |
CN109979474A (en) * | 2019-03-01 | 2019-07-05 | 珠海格力电器股份有限公司 | Speech ciphering equipment and its user speed modification method, device and storage medium |
CN109979474B (en) * | 2019-03-01 | 2021-04-13 | 珠海格力电器股份有限公司 | Voice equipment and user speech rate correction method and device thereof and storage medium |
CN110798327A (en) * | 2019-09-04 | 2020-02-14 | 腾讯科技(深圳)有限公司 | Message processing method, device and storage medium |
CN111031386A (en) * | 2019-12-17 | 2020-04-17 | 腾讯科技(深圳)有限公司 | Video dubbing method and device based on voice synthesis, computer equipment and medium |
CN111031386B (en) * | 2019-12-17 | 2021-07-30 | 腾讯科技(深圳)有限公司 | Video dubbing method and device based on voice synthesis, computer equipment and medium |
CN112185403A (en) * | 2020-09-07 | 2021-01-05 | 广州多益网络股份有限公司 | Voice signal processing method and device, storage medium and terminal equipment |
CN112750456A (en) * | 2020-09-11 | 2021-05-04 | 腾讯科技(深圳)有限公司 | Voice data processing method and device in instant messaging application and electronic equipment |
CN112185363A (en) * | 2020-10-21 | 2021-01-05 | 北京猿力未来科技有限公司 | Audio processing method and device |
CN112185363B (en) * | 2020-10-21 | 2024-02-13 | 北京猿力未来科技有限公司 | Audio processing method and device |
CN112423019A (en) * | 2020-11-17 | 2021-02-26 | 北京达佳互联信息技术有限公司 | Method and device for adjusting audio playing speed, electronic equipment and storage medium |
CN112423019B (en) * | 2020-11-17 | 2022-11-22 | 北京达佳互联信息技术有限公司 | Method and device for adjusting audio playing speed, electronic equipment and storage medium |
CN112565880A (en) * | 2020-12-28 | 2021-03-26 | 北京五街科技有限公司 | Method for playing explanation videos |
CN112565881A (en) * | 2020-12-28 | 2021-03-26 | 北京五街科技有限公司 | Self-adaptive video playing method |
CN112565881B (en) * | 2020-12-28 | 2023-03-24 | 北京五街科技有限公司 | Self-adaptive video playing method and system |
CN112565880B (en) * | 2020-12-28 | 2023-03-24 | 北京五街科技有限公司 | Method and system for playing explanation videos |
CN112820289A (en) * | 2020-12-31 | 2021-05-18 | 广东美的厨房电器制造有限公司 | Voice playing method, voice playing system, electric appliance and readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
WO2017206256A1 (en) | 2017-12-07 |
CN105869626B (en) | 2019-02-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN105869626A (en) | Automatic speech rate adjusting method and terminal | |
CN103903627B (en) | The transmission method and device of a kind of voice data | |
CN109979457A (en) | A method of thousand people, thousand face applied to Intelligent dialogue robot | |
CN104538043A (en) | Real-time emotion reminder for call | |
US7792673B2 (en) | Method of generating a prosodic model for adjusting speech style and apparatus and method of synthesizing conversational speech using the same | |
JP5507260B2 (en) | System and technique for creating spoken voice prompts | |
CN105991847A (en) | Call communication method and electronic device | |
CN108184032B (en) | Service method and device of customer service system | |
CN104811559A (en) | Noise reduction method, communication method and mobile terminal | |
CN109599094A (en) | The method of sound beauty and emotion modification | |
EP1280137B1 (en) | Method for speaker identification | |
CN106981289A (en) | A kind of identification model training method and system and intelligent terminal | |
CN104485100A (en) | Text-to-speech pronunciation person self-adaptive method and system | |
CN106887231A (en) | A kind of identification model update method and system and intelligent terminal | |
CN107910004A (en) | Voiced translation processing method and processing device | |
CN110198381A (en) | A kind of method and device of identification AI incoming call | |
Babel et al. | 19 Producing Linguistic Variation Socially Meaningful | |
CN112349266A (en) | Voice editing method and related equipment | |
CN101460994A (en) | Speech differentiation | |
CN104427125A (en) | Method and mobile terminal for answering call | |
CN113643684A (en) | Speech synthesis method, speech synthesis device, electronic equipment and storage medium | |
Hempel et al. | Sound branding and corporate voice–strategic brand management using sound | |
CN109616116B (en) | Communication system and communication method thereof | |
CN112102807A (en) | Speech synthesis method, apparatus, computer device and storage medium | |
CN112185341A (en) | Dubbing method, apparatus, device and storage medium based on speech synthesis |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |