CN105869626B - Method and terminal for automatic speech-rate adjustment

Info

Publication number: CN105869626B
Authority: CN (China)
Prior art keywords: voice, speed, voice information, information, playback speed
Legal status: Active (an assumption, not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Application number: CN201610375868.9A
Other languages: Chinese (zh)
Other versions: CN105869626A (en)
Inventor: 王晓军
Current assignee: Yulong Computer Telecommunication Scientific Shenzhen Co Ltd (the listed assignees may be inaccurate)
Original assignee: Yulong Computer Telecommunication Scientific Shenzhen Co Ltd
Application filed by: Yulong Computer Telecommunication Scientific Shenzhen Co Ltd
Priority: CN201610375868.9A (CN105869626B); PCT/CN2016/087741 (WO2017206256A1)
Publications: CN105869626A; CN105869626B (granted)

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72403 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H04M1/7243 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
    • H04M1/72433 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages for voice messaging, e.g. dictaphones

Abstract

The invention discloses a method for automatically adjusting speech rate, comprising: obtaining input voice information; extracting voice feature information from the voice information; querying a speech database for the playback speed of the voice information corresponding to the voice feature information; and adjusting the speed at which the voice information is played according to that playback speed. The method can thus determine, from the voice feature information of voice information input in real time, the preset playback speed corresponding to that voice feature information, and adjust the speech rate of the input voice information accordingly, so as to meet the needs of different users. The playback speed is adjusted adaptively according to the content of the voice information, and the method can be used in scenarios such as calls and program playback, which gives it wide applicability. The invention also discloses a terminal that likewise adjusts the playback speed adaptively according to the content of the voice information.

Description

Method and terminal for automatic speech-rate adjustment
Technical field
The present invention relates to the field of communication technology, and in particular to a method and a terminal for automatically adjusting speech rate.
Background
Because people's hearing ability differs, content played at one and the same speech rate seems too fast for some listeners to make out clearly, while others find it so slow that it wastes their time. The speech rate of content played on a terminal therefore needs to be set according to the listener's actual needs.
In the prior art, a speech-rate control is added to a client application on the user's mobile phone; the user selects a speech-rate level, and the phone plays voice content at the level the user has set. This approach has several disadvantages. First, although the speech rate is divided into several levels, it must be preset manually and cannot be adjusted dynamically, that is, adaptively. Second, the adjustment applies only to content played by the phone's client software; the speech rate cannot be adjusted in real time during a call. Finally, it cannot adapt to other languages, that is, adjust the speech rate according to the languages used by the two parties to a call. How to adjust speech rate adaptively is therefore a technical problem that those skilled in the art need to solve.
Summary of the invention
The object of the present invention is to provide a method and a terminal for automatically adjusting speech rate that can determine, from the voice feature information of voice information input in real time, the preset playback speed corresponding to that voice feature information, and adjust the speech rate of the input voice information according to that playback speed, so that the playback speed is adjusted adaptively according to the content of the voice information.
To solve the above technical problem, the present invention provides a method for automatically adjusting speech rate, comprising:
obtaining input voice information;
extracting voice feature information from the voice information;
querying a speech database for the playback speed of the voice information corresponding to the voice feature information;
adjusting the speed at which the voice information is played according to the playback speed.
Wherein extracting the voice feature information of the voice information comprises:
identifying language feature information of the voice information; and/or
extracting at least one of speech-rate information, feature-word information, and audio information of the voice information.
Wherein the voice information is the voice information of the local user, and the method further comprises:
obtaining vital-sign information of the local user;
in which case querying the speech database for the playback speed of the voice information corresponding to the voice feature information comprises:
querying the speech database for the playback speed of the voice information corresponding to both the voice feature information and the vital-sign information.
Wherein, after querying the speech database for the playback speed of the voice information corresponding to the voice feature information and the vital-sign information, the method further comprises:
updating, by a machine learning algorithm and using the voice feature information and the vital-sign information, the correspondences with playback speeds stored in the speech database.
Wherein adjusting the speed at which the voice information is played according to the playback speed comprises:
resampling the digital signal of the voice information by interpolation or decimation, so that the time scale of the voice information is adjusted to reach the playback speed.
The present invention also provides a terminal, comprising:
a voice information obtaining module, configured to obtain input voice information;
a voice feature extraction module, configured to extract voice feature information from the voice information;
a playback speed determining module, configured to query a speech database for the playback speed of the voice information corresponding to the voice feature information;
a playback speed adjustment module, configured to adjust the speed at which the voice information is played according to the playback speed.
Wherein the voice feature extraction module comprises:
a first voice feature extraction unit, configured to identify language feature information of the voice information; and/or
a second voice feature extraction unit, configured to extract at least one of speech-rate information, feature-word information, and audio information of the voice information.
Wherein the voice information is the voice information of the local user, and the terminal further comprises:
a vital-sign information obtaining module, configured to obtain vital-sign information of the local user.
Wherein the terminal further comprises:
a machine learning module, configured to update, by a machine learning algorithm and using the voice feature information and the vital-sign information, the correspondences with playback speeds stored in the speech database.
Wherein the playback speed adjustment module is specifically a module that resamples the digital signal of the voice information by interpolation or decimation, so that the time scale of the voice information is adjusted to reach the playback speed.
The method for automatically adjusting speech rate provided by the present invention comprises: obtaining input voice information; extracting voice feature information from the voice information; querying a speech database for the playback speed of the voice information corresponding to the voice feature information; and adjusting the speed at which the voice information is played according to the playback speed.
The method can thus determine, from the voice feature information of voice information input in real time, the preset playback speed corresponding to that voice feature information, and adjust the speech rate of the input voice information according to that playback speed, so as to meet the needs of different users. The playback speed is adjusted adaptively according to the content of the voice information, and the method can be used in scenarios such as user calls and program playback, which broadens its applicability. The present invention also provides a terminal with the same beneficial effects, which are not repeated here.
Brief description of the drawings
To explain the embodiments of the invention or the technical solutions of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely embodiments of the invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of the method for automatic speech-rate adjustment provided by an embodiment of the invention;
Fig. 2 is a structural block diagram of a terminal provided by an embodiment of the invention;
Fig. 3 is a structural block diagram of another terminal provided by an embodiment of the invention;
Fig. 4 is a structural block diagram of a further terminal provided by an embodiment of the invention.
Detailed description of the embodiments
The core of the invention is to provide a method and a terminal for automatically adjusting speech rate that can determine, from the voice feature information of voice information input in real time, the preset playback speed corresponding to that voice feature information, and adjust the speech rate of the input voice information according to that playback speed, so that the playback speed is adjusted adaptively according to the content of the voice information.
To make the objects, technical solutions, and advantages of the embodiments of the invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. The described embodiments are some, not all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort fall within the protection scope of the invention.
Refer to Fig. 1, which is a flowchart of the method for automatic speech-rate adjustment provided by an embodiment of the invention. The method in this embodiment is executed by a terminal, which may be a mobile phone. The method may include:
S100: obtaining input voice information;
The voice information may be obtained by monitoring the call service and applications capable of playing voice. It may be the local user's voice information when making or answering a call, the remote user's voice information when making or answering a call, or voice information played by an application with a voice playback function.
S110: extracting voice feature information from the voice information;
The kinds of voice feature information extracted, and how many kinds, can be chosen according to the user's actual needs, provided that the playback speed of the voice information, adjusted according to a preset standard, can be obtained from the voice feature information present in the voice information. In other words, the playback speech rate is adjusted according to a preset standard on the basis of the voice feature information in the voice information, which achieves automatic speech-rate adjustment. For example, the voice feature information may include features such as mood, language, voice features, speech rate, and intonation.
S120: querying a speech database for the playback speed of the voice information corresponding to the voice feature information;
Once the voice feature information to be extracted has been settled, the user can preset a playback speed for each kind of voice feature information, or have several kinds of voice feature information jointly determine one playback speed. The speech database may store these correspondences as a correspondence list or as a mapping table. The user may also modify, delete, or add correspondences stored in the speech database as circumstances change, so that the playback speeds associated with the voice feature information stay up to date and meet the user's actual needs.
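The mapping-table form of the speech database described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the names `speech_db`, `query_playback_factor`, `set_playback_factor`, and `delete_playback_factor`, the feature labels, and the numeric factors are all assumptions introduced here.

```python
# Hypothetical speech database: each key is a set of (feature, value) pairs,
# each value is a playback-speed factor (1.0 = unchanged, < 1.0 = slower).
DEFAULT_FACTOR = 1.0

speech_db = {
    frozenset({("language", "english")}): 0.8,
    frozenset({("mood", "agitated")}): 0.85,
    # Several kinds of feature information jointly determine one speed.
    frozenset({("language", "english"), ("mood", "agitated")}): 0.7,
}


def query_playback_factor(features: dict) -> float:
    """Look up the preset factor for exactly this combination of features."""
    return speech_db.get(frozenset(features.items()), DEFAULT_FACTOR)


def set_playback_factor(features: dict, factor: float) -> None:
    """Add a new correspondence or modify an existing one."""
    speech_db[frozenset(features.items())] = factor


def delete_playback_factor(features: dict) -> None:
    """Remove a correspondence; unknown combinations are ignored."""
    speech_db.pop(frozenset(features.items()), None)
```

Keying on a `frozenset` makes the lookup independent of the order in which features were extracted; combinations that were never configured fall back to the unchanged speed.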
Querying the speech database may also include comparing the extracted voice feature information with the value ranges stored in the speech database for that kind of voice feature information, determining which range the extracted value falls in, and thereby identifying the preset playback speed for that range. The user may modify the value ranges of the voice feature information, as well as the preset playback speed for each range, according to actual needs, to suit individual requirements and improve the user experience.
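The range comparison can be sketched with a sorted list of boundaries. The boundary values, the unit (syllables per second), and the factors below are illustrative assumptions; the source does not specify any of them.

```python
from bisect import bisect_right

# Hypothetical range boundaries for speech rate (syllables per second) and
# the preset playback-speed factor for each of the resulting ranges.
RATE_BOUNDARIES = [3.0, 5.0, 7.0]
PLAYBACK_FACTORS = [1.0, 0.9, 0.8, 0.7]  # one more entry than boundaries


def lookup_playback_factor(speech_rate: float) -> float:
    """Return the preset factor for the range that contains speech_rate."""
    return PLAYBACK_FACTORS[bisect_right(RATE_BOUNDARIES, speech_rate)]
```

With these values, a rate below 3.0 is left unchanged and each faster range is slowed a little more; editing either list corresponds to the user modifying the ranges or their preset speeds.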
S130: adjusting the speed at which the voice information is played according to the playback speed.
The voice information is adjusted according to the playback speed obtained so that it reaches that playback speed. The specific adjustment method is not limited here, as long as the obtained voice information can be adjusted to the corresponding playback speed for playback. One specific speech-rate adjustment process is as follows: the digital signal of the voice information is resampled by interpolation or decimation so that the time scale of the voice information is adjusted to reach the playback speed. Resampling the digital signal by interpolation or decimation lengthens or shortens the time scale of the speech, which changes the speech rate.
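As a rough illustration of time-scale change by resampling, the sketch below uses linear interpolation between neighbouring samples. The function name `resample_linear` and the convention that `speed` is a multiplier (2.0 = twice as fast, 0.5 = half as fast) are assumptions; the source does not prescribe a particular interpolation scheme.

```python
def resample_linear(samples, speed):
    """Resample so that playback at the original sample rate runs `speed` times as fast.

    speed > 1 keeps fewer samples (decimation, shorter time scale);
    speed < 1 creates samples in between (interpolation, longer time scale).
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    if not samples:
        return []
    n_out = max(1, round(len(samples) / speed))
    out = []
    for i in range(n_out):
        pos = i * speed              # fractional position in the input signal
        lo = int(pos)
        if lo >= len(samples) - 1:   # at or past the last sample: hold it
            out.append(float(samples[-1]))
        else:
            frac = pos - lo
            out.append(samples[lo] * (1 - frac) + samples[lo + 1] * frac)
    return out
```

Plain resampling of this kind also shifts the pitch when the result is played at the original sample rate, and decimation without a low-pass filter can alias; a practical implementation would likely need to address both.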
For example, calling is a basic service and an important function of a mobile phone. Some people speak quickly, however, and some have poor hearing, which makes communication difficult. While the user is on a call, this method collects voice feature information such as the mood, language, and voice features of both parties from the obtained input voice information and compares it with the information in the speech database. If the speech rate is too fast, or the remote end gives abnormal feedback, the playback speed corresponding to that speech rate or to that abnormal feedback is determined, and the digital signal is resampled by interpolation or decimation to lengthen or shorten the time scale of the speech and so change the speech rate. During a call, the speed of the sound played from the earpiece is thus adjusted automatically according to factors such as the language used by the local or remote user and changes in mood, to suit the needs of different people.
Optionally, the speech database is updated through learning by a machine learning algorithm.
A speech database is maintained in the terminal and can store the user's voice feature parameters; a machine learning algorithm takes these voice feature parameters as input and learns from them in order to update the speech database. Adjustment can then follow the long-term usage habits of different user groups rather than relying entirely on the initial preset data, which gives better adaptability.
The above example may be implemented as follows:
When the local user, that is, the calling user, is anxious to say something or is agitated, and the words used in the speech match the database's definition of an "agitated" user, the speech rate of the obtained input voice information is reduced according to the playback speed corresponding to "agitated". This has a calming effect and lets the user use the phone's call function more efficiently and more amicably.
As another example, when the calling user speaks English, the language is identified as English from the voice feature information, and the speech rate of the input voice information is adjusted according to the playback speed corresponding to English. After this adjustment the called user, that is, the remote user, hears the slowed-down voice information, which can to some extent solve the difficulty of understanding speech when communicating with a user who speaks a different native language.
Based on the above technical solution, the method for automatic speech-rate adjustment provided by the embodiment of the invention can determine, from the voice feature information of voice information input in real time, the preset playback speed corresponding to that voice feature information, and adjust the speech rate of the input voice information according to that playback speed, so as to meet the needs of different users. The playback speed is adjusted adaptively according to the content of the voice information, and the method can be used in scenarios such as user calls and program playback, which broadens its applicability. Different users can have the voice playback speed adapted to their own needs, which improves the user experience.
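Steps S110 to S130 can be tied together as in the following sketch. The function `auto_adjust` and its three callable parameters are hypothetical; they stand in for whatever feature extractor, database query, and speed adjuster a terminal actually uses.

```python
from typing import Callable, Sequence

Samples = Sequence[float]


def auto_adjust(
    voice: Samples,
    extract_features: Callable[[Samples], dict],
    query_speed: Callable[[dict], float],
    adjust: Callable[[Samples, float], Samples],
) -> Samples:
    """Run the obtained voice information (S100) through S110, S120, and S130."""
    features = extract_features(voice)  # S110: extract voice feature information
    speed = query_speed(features)       # S120: query the speech database
    return adjust(voice, speed)         # S130: adjust the playback speed
```

Passing the three stages in as callables keeps the sketch independent of any particular recognizer or resampler.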
Building on the above embodiment, this embodiment adaptively applies the voice playback speed corresponding to each language according to the language of the input voice information; that is, the playback speed is adjusted adaptively by language. Preferably, extracting the voice feature information of the voice information is specifically:
identifying the language feature information of the voice information.
By recognizing the obtained input voice information, the language feature information of the voice information can be obtained; this may include audio parameters and feature-word information. The speed at which the voice information is played is determined from the preset playback speed corresponding to the language feature information. The user may set a playback speed for every language individually; or set playback speeds for a predetermined number of languages; or divide languages into several broad categories and set a playback speed only per category. In the last case the language feature information may itself be category information, or the language may first be identified, then assigned to a category, and the corresponding playback speed determined last. The correspondence between languages and playback speeds can be implemented as a correspondence list or a mapping table.
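The per-language and per-category options can be combined in one lookup, sketched below. The table names, the category labels "native" and "foreign", and the factors are assumptions made for illustration only.

```python
# Hypothetical tables: a few languages have their own preset factor; the rest
# are grouped into broad categories that each carry one factor.
PER_LANGUAGE_FACTOR = {"english": 0.85}
LANGUAGE_CATEGORY = {"english": "foreign", "french": "foreign", "mandarin": "native"}
PER_CATEGORY_FACTOR = {"native": 1.0, "foreign": 0.8}


def factor_for_language(language: str) -> float:
    """Prefer a language-specific preset, then the category preset, then 1.0."""
    if language in PER_LANGUAGE_FACTOR:
        return PER_LANGUAGE_FACTOR[language]
    category = LANGUAGE_CATEGORY.get(language)
    return PER_CATEGORY_FACTOR.get(category, 1.0)
```

A language that is neither listed nor categorized is played at the unchanged speed.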
The language feature information may be recognized using a "reference voice" for each of the user's languages, synthesized by a user language recognition system and a language text translation system, together with segment- and syllable-based Markov models, pitch contours, formant vectors, acoustic features, phonemes, dialect and prosodic features, and features of the original speech waveform. The classification methods used may include HMMs, expert systems, clustering algorithms, secondary classification, and artificial neural networks.
The above embodiment is illustrated below through several specific application scenarios:
When an application on the terminal detects input voice information, it recognizes the obtained voice information; if the language feature information is determined to be English, the playback speed the user has preset for English is determined, and the speech rate of the voice information is adjusted to that playback speed. English is only an example here.
During a call, the language may be detected for the local user's voice information only, for the remote user's voice information only, or for both. The last case is used as an example below:
At the start, the phone is in a normal call state and both the calling and called parties are connected. The voice information obtaining module obtains the input voice information; the voice feature extraction module extracts the audio parameters and key words and phrases of both parties. The playback speed determining module parses the extracted audio parameters, queries the speech database, identifies the language, and determines the user's preset playback speed for that language. The playback speed adjustment module lengthens or shortens the voice information in time. The earpiece plays the processed voice information. Both parties hang up, and the call ends.
In this embodiment, users can gauge their own ability to follow each language and set the playback speed accordingly, which can solve the difficulty of understanding speech when communicating with a user who speaks a different native language.
Building on any of the above embodiments, this embodiment mainly addresses spoken exchanges between users, in which a speaker may talk too fast or become agitated. So that the exchange can proceed smoothly in such cases, the user's state is determined from the voice feature information of the user's voice information, and the playback speed set for that state is applied; that is, the playback speed is adjusted adaptively according to the state in which the user is speaking. Preferably, extracting the voice feature information of the voice information is specifically:
extracting at least one of speech-rate information, feature-word information, and audio information of the voice information.
Here it is first necessary to determine which user state each kind of voice feature information corresponds to or reflects, and then which playback speed should be set for that state. The determination may be based on speech-rate information alone, on feature-word information alone, and so on; that is, speech-rate information, feature-word information, and audio information may be combined in any way.
When one kind is used alone, the values of that kind of voice feature information are divided into cases and a playback speed is set for each case. Take speech-rate information: an agitated user generally speaks too fast, so when the speech-rate information exceeds a certain value the user can be regarded as agitated and the voice information is set to the playback speed preset for the agitated state. The speech rate may of course also be divided into several ranges, with a playback speed set for each range.
To improve the accuracy of the adjustment, speech-rate information, feature-word information, and audio information are preferably used together, so that the playback speed is determined from all three features combined. For example, an agitated user generally speaks too fast, uses certain particular words (users can register the words they habitually use when agitated), and raises their voice; if all three of these occur, or at least two of them, the user can be regarded as agitated and the voice information is set to the playback speed preset for the agitated state.
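The "all three, or at least two" rule can be sketched as a simple vote over three cues. The thresholds, the reading of "raised voice" as a level in decibels, the sample trigger words, and the function name `is_agitated` are all assumptions; the source names the three kinds of information but not how each is measured.

```python
def is_agitated(
    speech_rate: float,
    words: list,
    voice_level_db: float,
    *,
    rate_limit: float = 6.0,                      # hypothetical, syllables per second
    trigger_words: frozenset = frozenset({"hurry", "now"}),  # user-registered words
    level_limit_db: float = 70.0,                 # hypothetical loudness threshold
) -> bool:
    """Treat the user as agitated when at least two of the three cues fire."""
    cues = [
        speech_rate > rate_limit,                  # speech-rate information
        any(w in trigger_words for w in words),    # feature-word information
        voice_level_db > level_limit_db,           # audio information
    ]
    return sum(cues) >= 2
```

Requiring two cues rather than one reduces the chance that a naturally fast or loud speaker is misjudged on a single feature.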
The speech-rate information, feature-word information, and audio information in this embodiment can be freely combined with the language feature information. For example, a playback speed can be set for each speech-rate range in English, and likewise for each speech-rate range in Chinese.
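One way to picture this combination is a table of speech-rate ranges kept separately per language. The boundaries and factors below are invented for illustration, as is the name `factor_for`.

```python
from bisect import bisect_right

# Hypothetical per-language tables: (range boundaries, factor per range).
LANGUAGE_RATE_RANGES = {
    "english": ([2.5, 4.0], [1.0, 0.85, 0.7]),
    "chinese": ([4.0, 6.0], [1.0, 0.9, 0.8]),
}


def factor_for(language: str, speech_rate: float) -> float:
    """Pick the factor for the speech-rate range within the given language."""
    if language not in LANGUAGE_RATE_RANGES:
        return 1.0  # no preset for this language: leave the speed unchanged
    boundaries, factors = LANGUAGE_RATE_RANGES[language]
    return factors[bisect_right(boundaries, speech_rate)]
```

The same measured rate can therefore map to different playback speeds depending on the language being spoken.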
With the above embodiment, the call speech rate is adjusted adaptively for the user. Different users can change the voice playback speed according to their own needs, which improves the user experience.
Building on any of the above embodiments, this embodiment aims mainly to determine the state of the local user more accurately and thereby determine the playback speed for the local user in that state; that is, the playback speed is adjusted adaptively according to the state in which the local user is speaking. The voice information is the voice information of the local user, and the method may further include:
obtaining vital-sign information of the local user;
correspondingly, querying the speech database for the playback speed of the voice information corresponding to the voice feature information includes:
querying the speech database for the playback speed of the voice information corresponding to both the voice feature information and the vital-sign information.
The above embodiment can determine the user's state from speech-rate information, feature-word information, and audio information. To determine more accurately whether the local user is in that state, vital-sign information of the local user, such as body temperature and pulse, can also be obtained. The vital-sign information can be collected by a smart wearable device paired with the terminal, such as a smart band.
For example, when the local user, that is, the calling user, is anxious to say something or is agitated, the words used in the speech match the database's definition of an agitated user, and the smart band reports information such as a quickened pulse, it can be determined that the user is agitated, and the speech rate of the obtained input voice information can be reduced according to the playback speed corresponding to the agitated state. This has a calming effect and lets the user use the phone's call function more efficiently and more amicably. The detailed process may be as follows:
The phone is in a normal call state and both the calling and called parties are connected. The user's voice information is collected, and information such as body temperature and pulse during the call is collected through the smart band. The speech database is queried and, using the changes in body temperature and pulse during the call together with the key words and phrases, that is, the feature-word information, it is judged whether the user is emotionally agitated. Whether adjustment is needed is then judged from the speech-rate information. If the conditions for adjustment are met, the adjustment follows the preset values in the speech database and a new playback speed is determined. The voice data is lengthened or shortened in time. The earpiece plays the processed voice data. The user's emotional-change information and feature sentences can also be written to the speech database to optimize subsequent mood judgments.
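The decision part of this flow might look like the sketch below. The database layout, the use of a pulse ratio against a resting baseline, and every threshold are assumptions; the source only says that vital-sign changes and feature words are combined and that speech rate decides whether adjustment is needed.

```python
def choose_playback_factor(words, pulse_bpm, resting_bpm, speech_rate, db):
    """Return the factor to apply and log agitated episodes in the database."""
    # Mood judgment: a quickened pulse together with a registered feature word.
    excited = (
        pulse_bpm >= resting_bpm * db["pulse_ratio"]
        and any(w in db["feature_words"] for w in words)
    )
    # Adjustment is needed only if the speech rate is also too fast.
    if excited and speech_rate > db["rate_limit"]:
        # Write the episode back so later mood judgments can be refined.
        db["history"].append({"pulse_bpm": pulse_bpm, "words": list(words)})
        return db["agitated_factor"]
    return 1.0
```

A caller would pass the returned factor to whatever routine lengthens or shortens the voice data before it reaches the earpiece.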
Based on any of the above embodiments, this embodiment mainly improves the accuracy of the speech database; the method therefore further includes:
Using the voice feature information and the physical-sign information, updating the correspondence with playback speeds in the speech database according to a machine learning algorithm.
Maintaining the speech database in the terminal allows the user's audio parameters to be stored, which gives the terminal the ability to learn speech-rate adjustment. Adjustment can then follow the long-term usage habits of different user groups rather than relying entirely on the initially configured data, which gives better adaptability. With the learning function, the key terms the user often uses (the feature-word information) can be continuously updated, optimizing subsequent judgments of the user's emotion.
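The text does not name a particular machine learning algorithm. As one possible illustration only, the sketch below nudges a stored playback speed toward an observed preference with an exponential moving average and keeps a running count of the user's feature words. The function names, the dictionary layout of the database, the `learning_rate` parameter, and the idea of an "observed speed" signal are all assumptions made for this example.

```python
from collections import Counter
from typing import Dict, Iterable


def update_speed(database: Dict[str, float],
                 state: str,
                 observed_speed: float,
                 learning_rate: float = 0.1) -> float:
    """Move the stored speed for `state` a small step toward `observed_speed`."""
    current = database.get(state, 1.0)  # assumed default: unchanged playback
    database[state] = current + learning_rate * (observed_speed - current)
    return database[state]


def record_feature_words(counts: Counter, words: Iterable[str]) -> None:
    """Accumulate how often each word is used, so frequent terms can be promoted."""
    counts.update(w.lower() for w in words)
```

A small learning rate keeps the preset values stable while still letting long-term habits shift them, which matches the stated goal of not relying only on the initial settings.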
Based on the above technical solution, the method for automatically adjusting speech rate described in the embodiment of the present invention can determine, from the voice feature information of voice information input in real time, the predetermined playback speed corresponding to that voice feature information, and adjust the speech rate of the input voice information according to that playback speed, so as to meet the needs of various users. The playback speed is thus adjusted adaptively according to the content of the voice information. The method can be used in settings such as phone calls and program playback, which broadens its applicability. Each user gets a playback speed adapted to their own needs, which improves the user experience.
In summary, the embodiment of the present invention provides a method for automatically adjusting speech rate that determines, from the voice feature information of voice information input in real time, the predetermined playback speed corresponding to that voice feature information, and adjusts the speech rate of the input voice information according to that playback speed.
The terminal provided by the embodiment of the present invention is described below. The terminal described below and the method for automatically adjusting speech rate described above may be read in correspondence with each other.
Referring to FIG. 2, FIG. 2 is a structural block diagram of the terminal provided by the embodiment of the present invention. The terminal may include:
A voice information acquisition module 100, configured to acquire input voice information;
A voice feature extraction module 200, configured to extract the voice feature information of the voice information;
A playback speed determination module 300, configured to query a speech database for the playback speed of the voice information corresponding to the voice feature information;
A playback speed adjustment module 400, configured to adjust the speed at which the voice information is played according to the playback speed.
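The way the four modules hand data to one another can be sketched as a simple pipeline. This is only an assumed arrangement for illustration: the class name `Terminal`, the use of plain callables for modules 100, 200, and 400, a dictionary standing in for the speech database queried by module 300, and the default speed of 1.0 are not taken from the patent.

```python
from typing import Callable, Dict, List

Samples = List[float]


class Terminal:
    """Chains the four modules: acquire -> extract feature -> look up speed -> adjust."""

    def __init__(self,
                 acquire: Callable[[], Samples],               # stands in for module 100
                 extract: Callable[[Samples], str],            # stands in for module 200
                 speed_table: Dict[str, float],                # database used by module 300
                 adjust: Callable[[Samples, float], Samples],  # stands in for module 400
                 default_speed: float = 1.0) -> None:
        self.acquire = acquire
        self.extract = extract
        self.speed_table = speed_table
        self.adjust = adjust
        self.default_speed = default_speed

    def process(self) -> Samples:
        voice = self.acquire()
        feature = self.extract(voice)
        # Assumed fallback: an unknown feature leaves the playback speed unchanged.
        speed = self.speed_table.get(feature, self.default_speed)
        return self.adjust(voice, speed)
```

Keeping each module behind a callable makes it easy to swap in a different feature extractor or adjustment routine without touching the rest of the chain.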
Optionally, the voice feature extraction module 200 includes:
A first voice feature extraction unit, configured to identify the language feature information of the voice information; and/or
A second voice feature extraction unit, configured to extract at least one of the speech-rate information, the feature-word information, and the audio information of the voice information.
Optionally, referring to FIG. 3, the voice information is the voice information of the local user, and the terminal further includes:
A physical-sign information acquisition module 500, configured to acquire the physical-sign information of the local user.
In this case, the playback speed determination module 300 is specifically a module that queries the speech database for the playback speed of the voice information corresponding to both the voice feature information and the physical-sign information.
Optionally, referring to FIG. 4, the terminal further includes:
A machine learning module 600, configured to use the voice feature information and the physical-sign information to update the correspondence with playback speeds in the speech database according to a machine learning algorithm.
Optionally, the playback speed adjustment module 400 is specifically a module that resamples the digital signal of the voice information by interpolation or decimation, adjusting the time scale of the voice information to reach the playback speed.
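Resampling by interpolation or decimation can be illustrated with a short sketch. It assumes the playback speed is expressed as a factor (above 1 shortens the signal, below 1 lengthens it) and uses plain linear interpolation; the function name `resample` and this convention are assumptions, not details from the patent. Note that simple resampling played back at the original sample rate also shifts pitch, and the text does not say whether or how that is compensated.

```python
from typing import List


def resample(samples: List[float], speed: float) -> List[float]:
    """Time-scale `samples` by `speed` using linear interpolation.

    speed > 1 drops samples (decimation); speed < 1 inserts interpolated ones.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    n = len(samples)
    if n == 0:
        return []
    out_len = max(1, round(n / speed))
    out: List[float] = []
    for i in range(out_len):
        pos = i * speed              # fractional read position in the input
        lo = int(pos)
        if lo >= n - 1:              # at or past the last sample: hold its value
            out.append(samples[-1])
            continue
        frac = pos - lo
        out.append(samples[lo] * (1 - frac) + samples[lo + 1] * frac)
    return out
```

For instance, a factor of 2.0 keeps every second sample of a four-sample ramp, while a factor of 0.5 doubles the length of a two-sample input by inserting a midpoint and holding the final value.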
Based on any of the above embodiments, the terminal may specifically be a mobile phone.
The embodiments in this specification are described in a progressive manner: each embodiment focuses on its differences from the others, and for the same or similar parts the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief; for the relevant details, refer to the description of the method.
Those skilled in the art will further appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are implemented in hardware or in software depends on the specific application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.
The steps of a method or algorithm described in connection with the embodiments disclosed herein may be implemented directly in hardware, in a software module executed by a processor, or in a combination of the two. The software module may reside in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.
The method and terminal for automatically adjusting speech rate provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the invention, and the description of the above embodiments is intended only to help in understanding the method of the invention and its core idea. It should be noted that those of ordinary skill in the art may make several improvements and modifications to the present invention without departing from its principle, and these improvements and modifications also fall within the scope of protection of the claims of the present invention.

Claims (8)

1. A method for automatically adjusting speech rate, characterized by comprising:
acquiring input voice information;
extracting voice feature information of the voice information, wherein the types of the voice feature information and the number of types are determined according to the user's actual needs, and the voice feature information includes emotion, language, speech rate, and intonation;
querying a speech database for the playback speed of the voice information corresponding to the voice feature information;
adjusting the speed at which the voice information is played according to the playback speed.
2. The method for automatically adjusting speech rate according to claim 1, characterized in that the voice information is voice information of a local user, and the method further comprises:
acquiring physical-sign information of the local user;
wherein querying the speech database for the playback speed of the voice information corresponding to the voice feature information comprises:
querying the speech database for the playback speed of the voice information corresponding to the voice feature information and the physical-sign information.
3. The method for automatically adjusting speech rate according to claim 2, characterized in that, after querying the speech database for the playback speed of the voice information corresponding to the voice feature information and the physical-sign information, the method further comprises:
using the voice feature information and the physical-sign information to update the correspondence with playback speeds in the speech database according to a machine learning algorithm.
4. The method for automatically adjusting speech rate according to claim 1, characterized in that adjusting the speed at which the voice information is played according to the playback speed comprises:
resampling the digital signal of the voice information by interpolation or decimation, so that the time scale of the voice information is adjusted to reach the playback speed.
5. A terminal, characterized by comprising:
a voice information acquisition module, configured to acquire input voice information;
a voice feature extraction module, configured to extract voice feature information of the voice information, wherein the types of the voice feature information and the number of types are determined according to the user's actual needs, and the voice feature information includes emotion, language, speech rate, and intonation;
a playback speed determination module, configured to query a speech database for the playback speed of the voice information corresponding to the voice feature information;
a playback speed adjustment module, configured to adjust the speed at which the voice information is played according to the playback speed.
6. The terminal according to claim 5, characterized in that the voice information is voice information of a local user, and the terminal further comprises:
a physical-sign information acquisition module, configured to acquire physical-sign information of the local user.
7. The terminal according to claim 6, characterized by further comprising:
a machine learning module, configured to use the voice feature information and the physical-sign information to update the correspondence with playback speeds in the speech database according to a machine learning algorithm.
8. The terminal according to claim 5, characterized in that the playback speed adjustment module is specifically a module that resamples the digital signal of the voice information by interpolation or decimation, adjusting the time scale of the voice information to reach the playback speed.
CN201610375868.9A 2016-05-31 2016-05-31 A kind of method and terminal of word speed automatic adjustment Active CN105869626B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201610375868.9A CN105869626B (en) 2016-05-31 2016-05-31 A kind of method and terminal of word speed automatic adjustment
PCT/CN2016/087741 WO2017206256A1 (en) 2016-05-31 2016-06-29 Method for automatically adjusting speaking speed and terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610375868.9A CN105869626B (en) 2016-05-31 2016-05-31 A kind of method and terminal of word speed automatic adjustment

Publications (2)

Publication Number Publication Date
CN105869626A CN105869626A (en) 2016-08-17
CN105869626B true CN105869626B (en) 2019-02-05

Family

ID=56643245

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610375868.9A Active CN105869626B (en) 2016-05-31 2016-05-31 A kind of method and terminal of word speed automatic adjustment

Country Status (2)

Country Link
CN (1) CN105869626B (en)
WO (1) WO2017206256A1 (en)

Families Citing this family (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106448653A (en) * 2016-09-27 2017-02-22 惠州市德赛工业研究院有限公司 Wearable intelligent terminal
CN106486111B (en) * 2016-10-14 2020-02-07 北京光年无限科技有限公司 Multi-TTS engine output speech speed adjusting method and system based on intelligent robot
CN106534964B (en) * 2016-11-23 2020-02-14 广东小天才科技有限公司 Method and device for adjusting speech rate
US20180350371A1 (en) * 2017-05-31 2018-12-06 Lenovo (Singapore) Pte. Ltd. Adjust output settings based on an identified user
CN107689229A (en) * 2017-09-25 2018-02-13 广东小天才科技有限公司 A kind of method of speech processing and device for wearable device
CN108630224B (en) * 2018-03-22 2020-06-09 云知声智能科技股份有限公司 Method and device for controlling speech rate
CN109119088A (en) * 2018-08-29 2019-01-01 歌尔科技有限公司 A kind of adjusting method of audio signal, device, equipment and computer storage medium
CN109147802B (en) * 2018-10-22 2020-10-20 珠海格力电器股份有限公司 Playing speed adjusting method and device
CN109348068A (en) * 2018-12-03 2019-02-15 咪咕数字传媒有限公司 A kind of information processing method, device and storage medium
CN109582275A (en) * 2018-12-03 2019-04-05 珠海格力电器股份有限公司 Voice regulation method, device, storage medium and electronic device
CN111292737A (en) * 2018-12-07 2020-06-16 阿里巴巴集团控股有限公司 Voice interaction and voice awakening detection method, device, equipment and storage medium
CN109521718A (en) * 2019-01-11 2019-03-26 深圳汉尼康科技有限公司 Electronic speech device and control method
CN109979474B (en) * 2019-03-01 2021-04-13 珠海格力电器股份有限公司 Voice equipment and user speech rate correction method and device thereof and storage medium
CN110798327B (en) * 2019-09-04 2022-09-30 腾讯科技(深圳)有限公司 Message processing method, device and storage medium
CN111031386B (en) * 2019-12-17 2021-07-30 腾讯科技(深圳)有限公司 Video dubbing method and device based on voice synthesis, computer equipment and medium
CN112185403A (en) * 2020-09-07 2021-01-05 广州多益网络股份有限公司 Voice signal processing method and device, storage medium and terminal equipment
CN112750456A (en) * 2020-09-11 2021-05-04 腾讯科技(深圳)有限公司 Voice data processing method and device in instant messaging application and electronic equipment
CN112185363B (en) * 2020-10-21 2024-02-13 北京猿力未来科技有限公司 Audio processing method and device
CN112423019B (en) * 2020-11-17 2022-11-22 北京达佳互联信息技术有限公司 Method and device for adjusting audio playing speed, electronic equipment and storage medium
CN112565880B (en) * 2020-12-28 2023-03-24 北京五街科技有限公司 Method and system for playing explanation videos
CN112565881B (en) * 2020-12-28 2023-03-24 北京五街科技有限公司 Self-adaptive video playing method and system
CN112750436B (en) * 2020-12-29 2022-12-30 上海掌门科技有限公司 Method and equipment for determining target playing speed of voice message
CN112820289A (en) * 2020-12-31 2021-05-18 广东美的厨房电器制造有限公司 Voice playing method, voice playing system, electric appliance and readable storage medium
CN113470617A (en) * 2021-06-28 2021-10-01 科大讯飞股份有限公司 Speech recognition method, electronic device and storage device
CN114979798B (en) * 2022-04-21 2024-03-22 维沃移动通信有限公司 Playing speed control method and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101427314A (en) * 2006-04-25 2009-05-06 英特尔公司 Method and apparatus for automatic adjustment of play speed of audio data
CN101860617A (en) * 2009-04-12 2010-10-13 比亚迪股份有限公司 Mobile terminal with voice processing effect and method thereof
JP2011087196A (en) * 2009-10-16 2011-04-28 Nec Saitama Ltd Telephone set, and speech speed conversion method of telephone set
JP2015184349A (en) * 2014-03-20 2015-10-22 日本放送協会 Voice signal processing device and program
CN105405439A (en) * 2015-11-04 2016-03-16 科大讯飞股份有限公司 Voice playing method and device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070177633A1 (en) * 2006-01-30 2007-08-02 Inventec Multimedia & Telecom Corporation Voice speed adjusting system of voice over Internet protocol (VoIP) phone and method therefor

Also Published As

Publication number Publication date
CN105869626A (en) 2016-08-17
WO2017206256A1 (en) 2017-12-07

Similar Documents

Publication Publication Date Title
CN105869626B (en) A kind of method and terminal of word speed automatic adjustment
CN103903627B (en) The transmission method and device of a kind of voice data
US9571638B1 (en) Segment-based queueing for audio captioning
US10229668B2 (en) Systems and techniques for producing spoken voice prompts
CN109979457A (en) A method of thousand people, thousand face applied to Intelligent dialogue robot
CN104538043A (en) Real-time emotion reminder for call
CN104811559B (en) Noise-reduction method, communication means and mobile terminal
KR20190037363A (en) Method and apparatus for processing voice information
JP5051882B2 (en) Voice dialogue apparatus, voice dialogue method, and robot apparatus
US20220019746A1 (en) Determination of transcription accuracy
CN106887231A (en) A kind of identification model update method and system and intelligent terminal
CN111294471A (en) Intelligent telephone answering method and system
KR20150017662A (en) Method, apparatus and storing medium for text to speech conversion
CN107910004A (en) Voiced translation processing method and processing device
CN109599094A (en) The method of sound beauty and emotion modification
CN109545203A (en) Audio recognition method, device, equipment and storage medium
CN113643684A (en) Speech synthesis method, speech synthesis device, electronic equipment and storage medium
CN109616116B (en) Communication system and communication method thereof
EP4006903A1 (en) System with post-conversation representation, electronic device, and related methods
JP2021196462A (en) Telephone answering device
CN106971734A (en) It is a kind of that the method and system of identification model can be trained according to the extraction frequency of model
JP2004252085A (en) System and program for voice conversion
JPWO2007015319A1 (en) Audio output device, audio communication device, and audio output method
EP4006900A1 (en) System with speaker representation, electronic device and related methods
KR20210085777A (en) Delayed Display Method for Enhancing Hearing Efficiency

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant