CN106383648A - Intelligent terminal voice display method and apparatus - Google Patents
- Publication number
- CN106383648A
- Authority
- CN
- China
- Prior art keywords
- loudness
- bubble
- tone
- animation
- sound
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Processing Or Creating Images (AREA)
Abstract
The embodiment of the invention discloses a voice display method and apparatus for an intelligent terminal. The method comprises: receiving the voice data stream from the participants in a conversation; sampling and analyzing the stream at fixed time intervals to obtain the loudness, pitch and speech-rate information of each sampled sound; determining a loudness bubble by comparing the loudness of the sampled sound with preset thresholds and representing the sampled sound by a first-type circle whose diameter depends on the threshold interval in which the loudness lies; determining a pitch bubble by comparing the frequency of the sampled sound with preset frequency thresholds and representing the sampled sound by a second-type circle whose diameter depends on the threshold interval in which the frequency lies; combining the loudness bubbles and pitch bubbles into an animation object; determining the playback speed of the animation object from the speech-rate information of the sampled sound; determining the motion curve of the animation object from the loudness, pitch and speech-rate information; and playing the configured bubble animation on a display screen. With this method, voice is displayed in an emotional and personalized way through a bubble animation effect, which enhances the user experience, and the animated bubble scenes deepen the user's understanding of the voice.
Description
Technical field
The present invention relates to intelligent terminals, and more particularly to a method and apparatus for displaying voice on an intelligent terminal.
Background
With the rapid development of the communications industry, intelligent mobile terminals such as smartphones, smart watches and smart bracelets are increasingly popular. As these terminals diversify, users inevitably expect more and more of human-computer interaction, and the resulting demands keep growing. Take the smartphone: from the original basic functions of making calls and sending text messages, user demand has gradually expanded to browsing the web, taking photos, listening to music, watching videos, reading and other functions. Human-computer interaction has likewise developed from keyboard and touch to voice and video. For the sake of the user-interface experience, voice communication between two parties calls for a voice interaction interface that is vivid and easy to understand.
Current voice interaction interfaces mainly include the waveform display used by Apple Siri, the bar-chart display used by the WeChat platform, and the halo display used by the Wormhole voice assistant.
In the course of realizing the present invention, the inventor found that the animation effects of prior-art voice communication interfaces are stiff and cold, and that their design lacks emotional expression and affinity.
Summary of the invention
To solve the above technical problem, the voice display method for an intelligent terminal provided by the present invention can be realized by the following technical solution:
Receive the audio data stream from the participants in a session, sample and analyze it at fixed time intervals, and obtain the loudness, tone and speech-rate information of the sound;
Display the audio data stream as an animation in bubble form according to the analyzed loudness, tone and speech-rate information, where the bubbles are composed of first-type circles of different diameters and second-type circles of different diameters and have a certain speed and motion curve.
A voice display method for an intelligent terminal comprises:
Receiving the audio data stream from the participants in a session, sampling and analyzing it at fixed time intervals, and obtaining the loudness, tone and speech-rate information of each sampled sound, where the tone information is characterized by the frequency of the sound and the speech-rate information is characterized by the zero-crossing rate of the sound;
Determining a loudness bubble: the loudness of the sampled sound is compared with preset loudness thresholds, and the sampled sound is represented by a first-type circle whose diameter depends on the threshold interval in which the loudness lies;
Determining a tone bubble: the frequency of the sampled sound is compared with preset frequency thresholds, and the sampled sound is represented by a second-type circle whose diameter depends on the threshold interval in which the frequency lies;
Combining the obtained loudness bubble and tone bubble into an animation object;
Determining the playback speed of the animation object from the speech-rate information of the sampled sound, and determining the motion curve of the animation object from the obtained loudness, tone and speech-rate information;
Playing the configured bubble animation on the display screen.
A voice display apparatus for an intelligent terminal comprises:
A sampling module, configured to sample the received session audio data stream at fixed time intervals to obtain sound samples;
A speech analysis module, configured to analyze the obtained sound samples to obtain the loudness, tone and speech-rate information of each sampled sound;
An animation object determining module, configured to determine the loudness bubble and tone bubble of each sound sample and to combine the loudness bubbles and tone bubbles of all sound samples into an animation object;
An animation object setting module, configured to determine the playback speed of the animation object from the speech rate of the sampled sound, and to determine the motion curve of the animation object from the loudness, intonation and speech rate of the sampled sound;
An animation playing module, configured to play the configured animation object on the display screen.
Brief description of the drawings
Fig. 1 is a schematic flow chart of the method of an embodiment of the present invention.
Fig. 2 is the loudness bubble definition diagram of an embodiment of the present invention.
Fig. 3 is the tone bubble definition diagram of an embodiment of the present invention.
Fig. 4 is a static schematic diagram of fast and slow speech rates in an embodiment of the present invention.
Fig. 5 is the sound-wave fluctuation definition diagram of an embodiment of the present invention.
Fig. 6 is a static schematic diagram of sound-wave fluctuation in an embodiment of the present invention.
Detailed description
The technical solutions of the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art on the basis of the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
As shown in Fig. 1, an embodiment of the present invention provides a method for displaying voice on an intelligent terminal, comprising:
Receiving the voice media stream from the participants in a session, sampling and analyzing it at fixed time intervals, and obtaining the loudness, tone and speech-rate information of the sound;
Displaying the voice as an animation in bubble form according to the loudness, tone and speech-rate information of the analyzed sampled sound, where the bubbles are composed of first-type circles of different diameters and second-type circles of different diameters and have a certain speed and motion curve.
A voice display method for an intelligent terminal comprises:
101: Receive the audio data stream from the participants in a session, sample and analyze it at a fixed time interval, and obtain the loudness h_t, frequency f_t and sound-wave zero-crossing rate λ_t of each sampled sound;
The fixed time interval is set to 100 ms. The loudness is the pulse-code modulation (PCM) quantized loudness value of the sound and describes a person's subjective perception of how loud the sound is. The tone is the frequency of the sound and describes a person's subjective perception of how high or low the sound is.
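Step 101 can be sketched in Python as follows. This is only an illustration under stated assumptions: the samples are taken to be 16-bit PCM integers in a list, loudness is approximated by the mean absolute amplitude scaled to 0..100, and frequency is estimated crudely from zero crossings. The function names and these estimators are hypothetical; the text does not say how the terminal computes the three quantities.

```python
FRAME_MS = 100  # fixed sampling interval stated in the text


def split_frames(samples, sample_rate, frame_ms=FRAME_MS):
    """Split PCM samples into consecutive, non-overlapping frames of frame_ms."""
    size = sample_rate * frame_ms // 1000
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def loudness(frame, full_scale=32768):
    """Assumed loudness measure: mean absolute amplitude mapped onto 0..100."""
    if not frame:
        return 0.0
    return 100.0 * sum(abs(s) for s in frame) / (len(frame) * full_scale)


def count_crossings(frame):
    """Number of adjacent sample pairs whose signs differ."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


def zero_crossing_rate(frame):
    """Crossings per sample, so the result always satisfies 0 <= rate < 1."""
    return count_crossings(frame) / len(frame) if frame else 0.0


def estimate_frequency(frame, sample_rate):
    """Crude estimate (assumption): a pure tone crosses zero twice per cycle."""
    if not frame:
        return 0.0
    return count_crossings(frame) * sample_rate / (2 * len(frame))
```

Each frame then yields one triple (h_t, f_t, λ_t) that the later steps consume. A real implementation would likely use a more robust pitch estimator than zero-crossing counting.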
102: Determine the loudness bubble: compare the loudness of the sampled sound with the preset loudness thresholds, and represent the sampled sound by a first-type circle whose diameter depends on the threshold interval in which the loudness lies;
Specifically, assume that there are two loudness thresholds and that the first-type circle is a solid circle. The loudness value h_t of each sampled sound is taken as the input variable of the loudness bubble determination algorithm and compared with the preset loudness thresholds. The algorithm is as follows:
The loudness range is divided into three equal intervals which, in descending order, correspond to large, medium and small loudness and are represented by large, medium and small solid circles respectively. That is, I_t is P_1 when h_t lies in the highest interval, P_2 when it lies in the middle interval, and P_3 when it lies in the lowest interval, where Δ = (H_max − H_min)/3 is the interval width.
Here h_t is the loudness of the sound sample at time t, I_t is the loudness bubble selected for the sound sample at time t, and P_1, P_2 and P_3 denote the large, medium and small solid circles respectively. Because the volume quantization values of different recording devices differ, H_max is set to 100 and H_min to 0 as a compromise.
The preset loudness thresholds are H_min + Δ and H_min + 2Δ.
As shown in Fig. 2, the large solid circle has a diameter of 10 pixels, the medium solid circle a diameter of 7 pixels, and the small solid circle a diameter of 4 pixels.
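The threshold comparison of step 102 can be sketched as below, using the values given in the text (H_min = 0, H_max = 100, diameters of 10, 7 and 4 pixels). The function name and the choice to assign a loudness exactly equal to a threshold to the upper interval are assumptions; the text does not state how boundary values are handled.

```python
H_MIN, H_MAX = 0, 100                                # loudness range from the text
DIAMETERS = {"large": 10, "medium": 7, "small": 4}   # pixels, as defined for Fig. 2


def loudness_bubble(h, h_min=H_MIN, h_max=H_MAX):
    """Return (size name, diameter in pixels) of the solid circle for loudness h."""
    delta = (h_max - h_min) / 3        # width of each of the three equal intervals
    if h >= h_min + 2 * delta:         # highest interval -> large solid circle
        size = "large"
    elif h >= h_min + delta:           # middle interval -> medium solid circle
        size = "medium"
    else:                              # lowest interval -> small solid circle
        size = "small"
    return size, DIAMETERS[size]
```

For example, `loudness_bubble(90)` selects the large circle and `loudness_bubble(10)` the small one.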
103: Determine the tone bubble: compare the frequency of the sampled sound with the preset frequency thresholds, and represent the sampled sound by a second-type circle whose diameter depends on the threshold interval in which the frequency lies, the frequency being the parameter that characterizes the tone as subjectively perceived;
Specifically, assume that there are two frequency thresholds and that the second-type circle is a hollow circle. The frequency f_t of each sampled sound is taken as the input variable of the tone bubble determination algorithm and compared with the preset frequency thresholds. The algorithm is as follows:
The frequency range is divided into three equal intervals which, in descending order, correspond to high, medium and low frequency and are represented by large, medium and small hollow circles respectively. That is, X_t is B_1 when f_t lies in the highest interval, B_2 when it lies in the middle interval, and B_3 when it lies in the lowest interval, where δ = (F_max − F_min)/3 is the interval width.
Here F_max is the maximum frequency, F_min is the minimum frequency, f_t is the frequency of the sound sample at time t, X_t is the tone bubble selected for the sound sample at time t, and B_1, B_2 and B_3 denote the large, medium and small hollow circles respectively.
The preset frequency thresholds are F_min + δ and F_min + 2δ.
As shown in Fig. 3, the large hollow circle has a diameter of 10 pixels, the medium hollow circle a diameter of 7 pixels, and the small hollow circle a diameter of 4 pixels.
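Written out as a piecewise rule, the tone bubble selection described above might read as follows. This is a reconstruction from the stated thresholds F_min + δ and F_min + 2δ, not the patent's own formula; in particular, which interval a frequency exactly equal to a threshold belongs to is an assumption.

```latex
\delta = \frac{F_{\max} - F_{\min}}{3}, \qquad
X_t =
\begin{cases}
B_1, & f_t \ge F_{\min} + 2\delta \\
B_2, & F_{\min} + \delta \le f_t < F_{\min} + 2\delta \\
B_3, & f_t < F_{\min} + \delta
\end{cases}
```

The loudness rule of step 102 has the same shape, with h_t, H_min, Δ and P_1 to P_3 in place of f_t, F_min, δ and B_1 to B_3.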
104: Combine the obtained loudness bubble and tone bubble into an animation object;
The bubbles are combined by placing them at random within a two-dimensional planar region whose length and width are both set to the sum of the diameters of the largest loudness bubble and the largest tone bubble, i.e. 20 pixels.
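A possible sketch of the random placement in step 104 is shown below. The data layout, the function names and the rule that each circle must lie entirely inside the 20-pixel square are assumptions made for illustration; the text only says the bubbles are placed at random within the region and does not say whether they may overlap.

```python
import random

REGION = 20  # side in pixels: 10 (largest loudness bubble) + 10 (largest tone bubble)


def place_bubble(diameter, rng, region=REGION):
    """Pick a random centre such that the whole circle stays inside the square."""
    r = diameter / 2
    return (rng.uniform(r, region - r), rng.uniform(r, region - r))


def make_animation_object(loudness_diameter, tone_diameter, seed=None):
    """Combine one solid (loudness) and one hollow (tone) circle into one object."""
    rng = random.Random(seed)  # seeded generator keeps the layout reproducible
    return {
        "loudness": {"style": "solid", "diameter": loudness_diameter,
                     "center": place_bubble(loudness_diameter, rng)},
        "tone": {"style": "hollow", "diameter": tone_diameter,
                 "center": place_bubble(tone_diameter, rng)},
    }
```

Passing a `seed` is only a convenience for testing; a live display would normally leave it unset.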
105: Set the playback speed of the animation object according to the zero-crossing rate of the sampled sound;
Because the width of the path along which the animation object plays is fixed, the playback speed can be varied by setting the playing duration of the animation object. As shown in Fig. 4, the shorter the playing duration of a sampled sound, the denser the voice bubbles within a given area of the screen; the longer the duration, the sparser they are.
Specifically, to ensure that the animation plays neither too fast to be seen nor too slowly to be useful, the playing duration of the animation object is limited to the range [L_min, L_max], and the zero-crossing rate takes values in (0, λ_max), where 0 ≤ λ_max < 1. The playing duration of the animation object is obtained from a formula in which l_t is the playing duration of the sound sample at time t, L_max is the longest recording duration, L_min is the shortest recording duration, λ_t is the short-time average zero-crossing rate of the sound sample at time t, and λ_max is the maximum short-time average zero-crossing rate over the frames of the sound signal within the 100 ms interval.
The short-time average zero-crossing rate is the number of times each frame of the signal crosses zero; it is related to frequency and reflects the speech rate. The faster the speech, the faster the animation plays; conversely, the slower the speech, the slower the animation plays.
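One mapping consistent with this description is a linear interpolation from the zero-crossing rate onto [L_min, L_max], with a higher rate giving a shorter duration. The sketch below assumes that linear form; the patent's own formula is not reproduced in this text and may differ. The function name and the clamping behaviour are hypothetical.

```python
def playing_duration(zcr, zcr_max, l_min, l_max):
    """Map a zero-crossing rate onto [l_min, l_max]; faster speech -> shorter duration.

    Assumed linear interpolation, not the patent's own formula.
    """
    if zcr_max <= 0:
        return l_max                                # no usable rate: slowest playback
    ratio = min(max(zcr / zcr_max, 0.0), 1.0)       # clamp the ratio to [0, 1]
    return l_max - ratio * (l_max - l_min)
```

With `l_min=1.0` and `l_max=3.0` seconds, a sample at half the maximum rate plays for 2.0 seconds, and a sample at the maximum rate plays for 1.0 second.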
106: Determine the motion curve of the animation object from the loudness, tone and speech-rate information of the sampled sound;
The motion trajectory of the animation is set to a sine curve whose amplitude is determined jointly by the loudness, tone and speech rate of the sampled sound. Specifically, the loudness, frequency and speech-rate information of the sampled sound are combined with different weights into the amplitude of the corresponding sampled sound on the sine curve.
In the weighting formula, the influence coefficients can be set dynamically according to time or application and take values in (0, 1); T_i is the share of the influence of loudness, tone and speech rate on the curve amplitude and can be set to a fixed value for a given application, within a value range bounded in terms of A_max, the height of the space in which the bubble animation plays in the application.
As shown in Figs. 5 and 6, the rise and fall between the circles when the animation object is displayed depends on the amplitude of the corresponding sampled sound.
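The weighted combination in step 106 could look like the sketch below. The normalisation of each feature by an assumed maximum, the division by the sum of the weights and the scaling by A_max are all illustrative choices, since the weighting formula itself is not reproduced in this text. The function and parameter names are hypothetical.

```python
import math


def curve_amplitude(loudness, frequency, zcr, weights, maxima, a_max):
    """Weighted mix of normalised loudness, frequency and zero-crossing rate.

    weights and maxima are 3-tuples ordered (loudness, frequency, zcr).
    The result lies in [0, a_max] for non-negative weights.
    """
    values = (loudness, frequency, zcr)
    normalised = [min(max(v / m, 0.0), 1.0) for v, m in zip(values, maxima)]
    mix = sum(w * n for w, n in zip(weights, normalised)) / sum(weights)
    return a_max * mix


def sine_offset(amplitude, phase):
    """Vertical offset of a bubble at a given phase (radians) on the sine track."""
    return amplitude * math.sin(phase)
```

Each sampled sound then contributes one amplitude, and successive bubbles are drawn at `sine_offset(amplitude, phase)` as the phase advances along the track.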
107: Play the configured bubble animation.
A voice display apparatus for an intelligent terminal comprises:
801: A sampling module, configured to sample the received session audio data stream at fixed time intervals to obtain sound samples;
802: A speech analysis module, configured to analyze the obtained sound samples to obtain the loudness, tone and speech-rate information of each sampled sound;
803: An animation object determining module, configured to determine the loudness bubble and tone bubble of each sound sample and to combine the loudness bubbles and tone bubbles of all sound samples into an animation object;
804: An animation object setting module, configured to determine the playback speed of the animation object from the speech rate of the sampled sound, and to determine the motion curve of the animation object from the loudness, intonation and speech rate of the sampled sound;
Specifically, this module includes a sampled-sound playing duration calculation unit and a motion-curve amplitude calculation unit. The former calculates the playing time of the bubble corresponding to each sampled sound; the latter calculates the amplitude of the bubble corresponding to each sampled sound.
805: An animation playing module, configured to play the configured animation object on the display screen.
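The way modules 801 to 805 hand data to one another can be sketched as a small pipeline. The class name, the use of plain callables for each module and the dictionary-based data are assumptions for illustration only; the text describes the modules functionally and does not prescribe an interface. For simplicity the sketch builds one object per sample rather than one object for all samples.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class VoiceDisplayDevice:
    """Wires the five modules together; each module is supplied as a callable."""
    sample: Callable[[Sequence[int]], List[Sequence[int]]]  # 801 sampling
    analyze: Callable[[Sequence[int]], dict]                # 802 speech analysis
    build_object: Callable[[dict], dict]                    # 803 object determining
    configure: Callable[[dict, dict], dict]                 # 804 object setting
    play: Callable[[List[dict]], None]                      # 805 animation playing

    def run(self, stream: Sequence[int]) -> List[dict]:
        """Push an audio stream through the modules in order 801 to 805."""
        configured = []
        for frame in self.sample(stream):
            features = self.analyze(frame)           # loudness, tone, speech rate
            obj = self.build_object(features)        # loudness and tone bubbles
            configured.append(self.configure(obj, features))  # duration, amplitude
        self.play(configured)
        return configured
```

Keeping each module behind a callable makes it straightforward to swap, for example, a different analysis routine without touching the playback code.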
The voice display method and apparatus for an intelligent terminal of the embodiments of the present invention display the tone, loudness and speech-rate information of a sound on the mobile terminal screen as different bubbles according to certain rules. This produces a dynamic and engaging voice bubble recognition process, so that the voice interaction is no longer dull and the voice information input by the user is expressed with emotion.
The voice display method and apparatus for an intelligent terminal provided by the embodiments of the present invention have been described in detail above. The description of the above embodiments is intended only to help in understanding the method and core idea of the present invention and does not limit the present invention. For those skilled in the art, any modification, equivalent substitution, improvement and the like made within the spirit and principles of the present invention falls within the scope of protection of the claims of the present invention.
Claims (8)
1. A voice display method for an intelligent terminal, characterized by comprising:
receiving the audio data stream from the participants in a session, sampling and analyzing it at fixed time intervals, and obtaining the loudness, tone and speech-rate information of each sampled sound;
determining a loudness bubble: the loudness of the sampled sound is compared with preset loudness thresholds, and the sampled sound is represented by a first-type circle whose diameter depends on the threshold interval in which the loudness lies, wherein the greater the loudness, the larger the diameter of the corresponding first-type circle;
determining a tone bubble: the frequency of the sampled sound is compared with preset frequency thresholds, and the sampled sound is represented by a second-type circle whose diameter depends on the threshold interval in which the frequency lies, wherein the higher the frequency, the larger the diameter of the corresponding second-type circle;
combining the obtained loudness bubble and tone bubble into an animation object;
determining the playback speed and motion curve: the playback speed of the animation object is determined from the obtained speech-rate information of the sampled sound, and the motion curve of the animation object is determined from the obtained loudness, tone and speech-rate information;
playing the configured bubble animation on the display screen.
2. The method according to claim 1, characterized in that the speech-rate information is the short-time average zero-crossing rate of the sound, i.e. the number of times each frame of the signal passes through zero.
3. The method according to claim 1, characterized in that the first-type circle and the second-type circle may be a solid circle and a hollow circle respectively.
4. The method according to claim 1, characterized in that the number of loudness thresholds may be a preset value determined by the number of different diameters of the first-type circle, and the number of frequency thresholds may be a preset value determined by the number of different diameters of the second-type circle.
5. The method according to claim 1, characterized in that the obtained loudness bubble and tone bubble are combined into the animation object by placing the bubbles at random within a two-dimensional planar region, the length and width of which are both set to the sum of the diameters of the largest loudness bubble and the largest tone bubble.
6. The method according to claim 1, characterized in that the playback speed of the animation object is determined from the speech-rate information of the sampled sound by a formula in which l_t is the playing duration of the sound sample at time t, L_max is the longest recording duration, L_min is the shortest recording duration, λ_t is the short-time average zero-crossing rate of the sound sample at time t, and λ_max is the maximum short-time average zero-crossing rate over the frames of sound data within the 100 ms interval.
7. The method according to claim 1, characterized in that the motion curve of the animation object is determined from the obtained loudness, tone and speech-rate information by a formula in which the influence coefficients can be set dynamically according to time or application and take values in (0, 1); T_i is the share of the influence of loudness, tone and speech rate on the curve amplitude and can be set to a fixed value for a given application, within a value range bounded in terms of A_max, the height of the space in which the bubble animation plays in the application.
8. A voice display apparatus for an intelligent terminal, characterized by comprising:
a sampling module, configured to sample the received session audio data stream at fixed time intervals to obtain sound samples;
a speech analysis module, configured to analyze the obtained sound samples to obtain the loudness, tone and speech-rate information of each sampled sound;
an animation object determining module, configured to determine the loudness bubble and tone bubble of each sound sample and to combine the loudness bubbles and tone bubbles of the sound samples into an animation object;
an animation object setting module, configured to determine the playback speed of the animation object from the speech rate of the sampled sound, and to determine the motion curve of the animation object from the loudness, intonation and speech-rate information of the sampled sound;
an animation playing module, configured to play the configured animation object on the display screen.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510448262.9A CN106383648A (en) | 2015-07-27 | 2015-07-27 | Intelligent terminal voice display method and apparatus |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510448262.9A CN106383648A (en) | 2015-07-27 | 2015-07-27 | Intelligent terminal voice display method and apparatus |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106383648A true CN106383648A (en) | 2017-02-08 |
Family
ID=57916090
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510448262.9A Pending CN106383648A (en) | 2015-07-27 | 2015-07-27 | Intelligent terminal voice display method and apparatus |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106383648A (en) |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN201273843Y (en) * | 2008-08-15 | 2009-07-15 | 宇龙计算机通信科技(深圳)有限公司 | Music player |
CN101419796A (en) * | 2008-12-02 | 2009-04-29 | 无敌科技(西安)有限公司 | Device and method for automatically splitting speech signal of single character |
CN102760051A (en) * | 2012-03-26 | 2012-10-31 | 联想(北京)有限公司 | Method for obtaining voice signal and electronic equipment |
CN102780646A (en) * | 2012-07-19 | 2012-11-14 | 上海量明科技发展有限公司 | Method for achieving sound icon in instant messaging, client and system |
WO2015100923A1 (en) * | 2013-12-31 | 2015-07-09 | 中兴通讯股份有限公司 | User information obtaining method and mobile terminal |
CN103927175A (en) * | 2014-04-18 | 2014-07-16 | 深圳市中兴移动通信有限公司 | Method with background interface dynamically changing along with audio and terminal equipment |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109213545A (en) * | 2017-06-30 | 2019-01-15 | 北京国双科技有限公司 | Based reminding method and device in court trial process |
US10599391B2 (en) * | 2017-11-06 | 2020-03-24 | Google Llc | Parsing electronic conversations for presentation in an alternative interface |
US11036469B2 (en) | 2017-11-06 | 2021-06-15 | Google Llc | Parsing electronic conversations for presentation in an alternative interface |
CN108776985A (en) * | 2018-06-05 | 2018-11-09 | 科大讯飞股份有限公司 | A kind of method of speech processing, device, equipment and readable storage medium storing program for executing |
CN109361888A (en) * | 2018-10-25 | 2019-02-19 | 北京小鱼在家科技有限公司 | Method of adjustment and device, the video call device and storage medium of call background |
CN110311858A (en) * | 2019-07-23 | 2019-10-08 | 上海盛付通电子支付服务有限公司 | A kind of method and apparatus sending conversation message |
CN110417641A (en) * | 2019-07-23 | 2019-11-05 | 上海盛付通电子支付服务有限公司 | A kind of method and apparatus sending conversation message |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106383648A (en) | Intelligent terminal voice display method and apparatus | |
CN106504304B (en) | A kind of method and device of animation compound | |
CN104780093B (en) | Expression information processing method and processing device during instant messaging | |
JP7436664B2 (en) | Method for constructing a listening scene and related devices | |
CN101482976B (en) | Method for driving change of lip shape by voice, method and apparatus for acquiring lip cartoon | |
Goodwin | Embedded context | |
CN112099628A (en) | VR interaction method and device based on artificial intelligence, computer equipment and medium | |
CN103269405A (en) | Method and device for hinting friendlily | |
CN108206027A (en) | A kind of audio quality evaluation method and system | |
CN104505103B (en) | Voice quality assessment equipment, method and system | |
Schmidt | Ambience and ubiquity | |
CN116009748A (en) | Picture information interaction method and device in children interaction story | |
CN104123857B (en) | A kind of Apparatus and method for realizing personalized some reading | |
CN104853257A (en) | Subtitle display method and device | |
CN108322868A (en) | Improve the method that loud speaker plays piano voice sound quality | |
CN104978966A (en) | Method and apparatus realizing compensation of frame loss in audio stream | |
CN103701982B (en) | The method of adjustment of user terminal displays content, device and system | |
CN106653003A (en) | Voice recognition method and device | |
Dudley | Fundamentals of speech synthesis | |
CN114446316B (en) | Audio separation method, training method, device and equipment of audio separation model | |
CN115619897A (en) | Image processing method, image processing device, electronic equipment and storage medium | |
CN108854062A (en) | A kind of voice-enabled chat module of moving game | |
Risca et al. | INOVASI MEDIA PEMBELAJARAN TAHFIDZ UNTUK ANAK BERKEBUTUHAN MENGGUNAKAN POP UP BOOK MAURO | |
Volkmann et al. | Age-appropriate Participatory Design of a Storytelling Voice Input in the Context of Historytelling. | |
Thacker | Experiencing the moment in song: An analysis of the Irish traditional singing session |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170208 |