CN101273405B - Optional encoding system and method for operating the system - Google Patents

Optional encoding system and method for operating the system

Publication number
CN101273405B
Authority
CN
China
Prior art keywords
audio data
user
data
terminal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2006800359075A
Other languages
Chinese (zh)
Other versions
CN101273405A (en)
Inventor
全允豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Intel Corp
Original Assignee
WIDERTHAN Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by WIDERTHAN Co Ltd
Publication of CN101273405A
Application granted
Publication of CN101273405B

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/78 Detection of presence or absence of voice signals
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/18 Vocoders using multiple modes
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L19/00 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L19/04 Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis using predictive techniques
    • G10L19/16 Vocoder architecture
    • G10L19/18 Vocoders using multiple modes
    • G10L19/20 Vocoders using multiple modes using sound class specific coding, hybrid encoders or object based coding
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/90 Pitch determination of speech signals

Abstract

A variable encoding system and method for operating the system, the method including: receiving audio data from a predetermined server, encoding the audio data via a predetermined encoder and providing a user terminal with the audio data. The variable encoding system and method for operating the system can increase usage efficiency of a memory device of a mobile terminal recording audio data and reduce a load of a wireless communication network by encoding audio data in a variable encoding system based on characteristics of data, and transmitting the audio data to a second user terminal via the wireless communication network.

Description

Optional (selective) encoding system and method for operating the system
Technical field
The present invention provides a method and system that receive audio data from a predetermined server, encode the audio data with a predetermined encoder, and provide the audio data to a user terminal. The encoder can be set freely based on the characteristics of the audio data. When the audio data contains speech data above a predetermined ratio, the encoder may be a vocoder such as Qualcomm Code Excited Linear Prediction (QCELP), the Enhanced Variable Rate Codec (EVRC), or Adaptive Multi-Rate (AMR) speech coding.
Background art
With the rapid development of the Internet, mobile devices that store audio content and play it back on demand are used ever more widely. For example, when audio content is downloaded to a mobile device and used through a service such as podcasting, the audio content must first be downloaded to a computer terminal. The audio content downloaded to the computer terminal is then transferred in encoded form, based on an audio compression technique such as MP3 or Advanced Audio Coding (AAC), to the mobile device, for example an MP3 player or a mobile phone. The mobile device plays back the compressed audio content by decoding it. The computer terminal can also download audio content, for example a news broadcast, at a predetermined cycle from the server that provides the audio data, encode the audio content, and provide it to the mobile device.
In this case, the mobile device further includes a memory device for recording audio content, and the audio content is recorded in that memory device.
However, mobile devices in wide use today typically have a storage capacity of tens or hundreds of megabytes (MB). That capacity may be insufficient for recording audio content encoded at a high bit rate. For practical use, a technique for compressing or encoding the audio content recorded in the memory device is therefore essential.
Specifically, audio data received from a predetermined server has already been encoded by a particular method when it is received. What is needed is a technique that re-encodes such audio data according to its characteristics and transfers it to the mobile device, thereby improving the memory usage efficiency of the mobile device and reducing the load on the transmission channel.
For example, audio data such as music is usually compressed at a bit rate of 128 Kbps or more, and speech-centered audio, for which sound quality matters little, usually requires a bit rate of at least 32 Kbps. A vocoder suited to human speech, for example the Enhanced Variable Rate Codec (EVRC), can compress speech to a bit rate as low as 8 Kbps.
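The saving implied by these bit rates can be worked through with simple arithmetic. The sketch below is illustrative only: the 30-minute duration and the helper name `size_megabytes` are assumptions, while the three bit rates are the ones quoted above.

```python
def size_megabytes(bitrate_kbps: float, duration_s: float) -> float:
    """Approximate size of a constant-bit-rate stream in megabytes (10**6 bytes)."""
    bits = bitrate_kbps * 1000 * duration_s
    return bits / 8 / 1_000_000


DURATION_S = 30 * 60  # hypothetical 30-minute programme

for label, kbps in [("music, 128 Kbps", 128), ("speech, 32 Kbps", 32), ("EVRC, 8 Kbps", 8)]:
    print(f"{label}: {size_megabytes(kbps, DURATION_S):.1f} MB")
```

Under these assumptions the same programme occupies 28.8 MB at 128 Kbps, 7.2 MB at 32 Kbps, and 1.8 MB at 8 Kbps, which matters on a device with only tens of megabytes of storage.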
However, in the current art, sound sources in Rich Site Summary (RSS) feeds or podcasts are usually provided in a single uniform encoding, such as MP3 or AAC, regardless of whether they contain speech data or music data. As a result, the mobile device ends up holding speech data compressed at a higher bit rate than necessary, and the memory of the mobile device is not used efficiently.
Summary of the invention
Technical objective
One aspect of the present invention provides a method and system for improving the usage efficiency of the memory device of a mobile device.
Another aspect of the present invention provides a method and system that reduce the load on a wireless communication network by selectively encoding audio data based on the characteristics of the data and transmitting the audio data to a second user terminal over the wireless communication network.
Technical solution
According to one aspect of the present invention, a method of selectively encoding audio data is provided, the method including: receiving audio data from a predetermined server; determining whether speech data is included in the audio data by analyzing the data format of the audio data; when speech data is included in the audio data, generating second audio data by using a vocoder to encode only the part of the audio data that corresponds to the speech data, the second audio data including conversion information about the vocoder and the encoding; and transmitting the generated second audio data to a second user terminal, wherein the second user terminal decodes the second audio data based on the conversion information.
According to another aspect of the present invention, a system for selectively encoding audio data is provided, the system including: a receiving unit that receives audio data from a predetermined server; a conversion unit that determines whether speech data is included in the audio data by analyzing the data format of the audio data and, when speech data is included in the audio data, generates second audio data by using a vocoder to encode only the part of the audio data that corresponds to the speech data, the second audio data including conversion information about the vocoder and the encoding; and a transmission unit that transmits the generated second audio data to a second user terminal, wherein the second user terminal decodes the second audio data based on the conversion information.
Brief description of the drawings
Fig. 1 is a diagram of a network including the selective encoding system, a server, and a second user terminal according to the present invention;
Fig. 2 is a flowchart of a method of selectively encoding audio data according to the present invention;
Fig. 3 and Fig. 4 are example diagrams of networks including the selective encoding system, a server, and a second user terminal according to the present invention;
Fig. 5 is a diagram of an audio data format and second audio data according to an exemplary embodiment of the present invention;
Fig. 6 is a block diagram of the internal configuration of the selective encoding system according to an exemplary embodiment of the present invention;
Fig. 7 is an internal block diagram of a general-purpose computer that can be used to carry out the selective encoding method according to the present invention.
Detailed description of embodiments
Exemplary embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Fig. 1 is a diagram of a network including the selective encoding system, a server, and a second user terminal according to the present invention.
Referring to Fig. 1, the selective encoding system 100 according to the present invention receives predetermined audio data from a web server 110. The web server 110 according to an exemplary embodiment of the present invention provides a podcast service or a Rich Site Summary (RSS) service. The selective encoding system 100 can therefore receive audio data from the web server 110 at a predetermined cycle. The audio data can include music data, speech data, or broadcast data.
On receiving the audio data, the selective encoding system 100 analyzes it and determines whether speech data is included in it. Conventional techniques can be used to determine whether speech data is included in the audio data by analyzing the data format of the audio data. For example, to identify whether the audio consists of a human voice, a method that checks whether the sound is clipped beyond a predetermined ratio can be applied. Whether speech data is included can also be detected by checking for a predetermined pitch in the audio data, or by checking whether the frequencies of the audio data are concentrated in a predetermined frequency band. Current mobile communication terminals control the transmission band in real time by means of a voice activity detector (VAD), discontinuous transmission (DTX), a variable rate codec (VRC), and the like. Unlike a mobile communication terminal, which must identify speech data in real time, the selective encoding system 100 according to the present invention has more time available for analysis and can therefore judge in greater detail and over a wider scope whether speech data is included in the audio data.
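One of the cues mentioned above, whether the energy of the signal is concentrated in a predetermined frequency band, can be sketched as follows. This is a minimal illustration, not the detection method of the patent: the 300 to 3400 Hz band, the 0.6 threshold, the frame length, and all function names are assumptions, and a practical detector would combine several cues over many frames.

```python
import cmath
import math


def band_energy_ratio(frame, sample_rate, low_hz=300.0, high_hz=3400.0):
    """Fraction of spectral energy (DC excluded) that lies between low_hz and high_hz."""
    n = len(frame)
    total = 0.0
    in_band = 0.0
    for k in range(1, n // 2):  # positive-frequency bins, skipping DC
        # Naive DFT coefficient for bin k; adequate for a short analysis frame.
        coeff = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        power = abs(coeff) ** 2
        total += power
        if low_hz <= k * sample_rate / n <= high_hz:
            in_band += power
    return in_band / total if total > 0 else 0.0


def is_probably_speech(frame, sample_rate, threshold=0.6):
    """Assumed rule: treat a frame as speech-like if most energy is in the band."""
    return band_energy_ratio(frame, sample_rate) >= threshold


def tone(freq_hz, sample_rate=16000, n=256):
    """Synthetic test signal: a pure sine tone."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate) for t in range(n)]


print(is_probably_speech(tone(1000), 16000))  # 1 kHz lies inside the assumed band
print(is_probably_speech(tone(6000), 16000))  # 6 kHz lies outside it
```

A pure tone is of course not speech; the example only demonstrates the band-concentration test itself, which would be one input among several (pitch, clipping) in the analysis described above.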
The selective encoding system 100, having received the audio data from the server 110, determines whether speech data is included in the audio data and, when it is, encodes the speech data with a predetermined vocoder. The selective encoding system 100 according to an exemplary embodiment of the present invention can use a vocoder such as QCELP, EVRC, or AMR speech coding.
Second audio data is generated from the audio data by encoding with the vocoder. When EVRC is applied to audio data that includes speech data, the second audio data is encoded at a bit rate of approximately 8 Kbps. When the audio data contains no speech data but only music data or song data, the selective encoding system 100 does not re-encode the audio data.
The second user terminal 120 receives the second audio data from the selective encoding system 100.
The selective encoding system 100 according to an exemplary embodiment of the present invention is a computer terminal that is supplied with audio data by a podcast service or a similar audio content service. In that case the selective encoding system 100 receives the audio data from the server over a wired or wireless Internet connection, then either selectively generates the second audio data by encoding or leaves the audio data unchanged, and transfers the result to the second user terminal 120. Here the second user terminal 120 is a mobile device such as a mobile communication terminal, an MP3 player, a PlayStation Portable (PSP), a portable multimedia player (PMP), a personal digital assistant (PDA), or an electronic notebook, and the computer terminal transfers the second audio data over a connection to the second user terminal 120.
In another exemplary embodiment of the present invention, the selective encoding system 100 is a predetermined stand-alone server. In that case the selective encoding system 100 receives the audio data from the server 110 over a wired or wireless communication network and either selectively generates the second audio data from the audio data or sends the original audio data to the second user terminal 120. Here the second user terminal 120 is a mobile communication terminal, and the selective encoding system 100 transmits the second audio data wirelessly to the mobile communication terminal over a data channel.
The selective encoding system 100 according to the present invention can thus improve the memory efficiency of the second user terminal 120, reduce the load on the transmission channel, and so on. Specifically, when speech data is partially or wholly included in the audio data, the system reduces the total size of the audio data by re-encoding only the speech data at a lower bit rate.
Fig. 2 is a flowchart of the operations of a method of selectively encoding audio data according to the present invention.
In operation 201, a server according to an exemplary embodiment of the present invention sends predetermined audio data to the selective encoding system. The server is a system that provides a podcast service or an RSS service. The selective encoding system checks the server at a predetermined cycle for an updated list of audio data and, when updated audio data exists, requests that the audio data be transmitted.
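The periodic check in operation 201 can be sketched as a comparison between the list the server currently publishes and the items already seen. Everything named here is hypothetical: the item fields `id` and `url`, the callables `fetch_feed` and `request_audio`, and the example URLs are stand-ins, and a real client would parse an RSS feed and repeat the call on a timer.

```python
def find_new_items(known_ids, feed_items):
    """Return feed entries whose id has not been seen yet, in feed order."""
    return [item for item in feed_items if item["id"] not in known_ids]


def poll_once(fetch_feed, known_ids, request_audio):
    """One polling cycle: fetch the list, request each new item, remember its id."""
    new_items = find_new_items(known_ids, fetch_feed())
    for item in new_items:
        request_audio(item["url"])
        known_ids.add(item["id"])
    return new_items


# Stand-in data in place of a real feed and a real download request.
feed = [
    {"id": "ep1", "url": "https://example.com/ep1.mp3"},
    {"id": "ep2", "url": "https://example.com/ep2.mp3"},
]
known = {"ep1"}
requested = []
poll_once(lambda: feed, known, requested.append)
print(requested)
```

Keeping the set of known ids separate from the fetch step means a second cycle over an unchanged list requests nothing.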
In operation 202, the selective encoding system receives the audio data from the server and analyzes its data format. The audio data includes, for example, broadcast, music, song, and speech data. Each kind of audio data has characteristic attributes in its data format, and these characteristics are determined by analyzing the frequency band, detecting pitch, checking whether the sound is clipped, and so on, using the conventional techniques described above.
In operation 203, the selective encoding system determines, based on the analysis of the data format, whether speech data is included in the audio data. It does so by analyzing the frequency band, detecting pitch, checking whether the sound is clipped, and so on. Here, an index for each part, together with whether the part at that index contains speech data, is recorded in a predetermined memory device.
If the analysis of the data format in operation 203 shows that no speech data is included in the audio data, the process branches to operation 206 and the selective encoding system sends the audio data to the second user terminal.
When speech data is partially or wholly included in the audio data, the selective encoding system, in operation 204, uses a predetermined vocoder to encode only the part of the audio data that corresponds to the speech data and, in operation 205, generates the second audio data.
The selective encoding system according to an exemplary embodiment of the present invention uses the vocoder to encode a predetermined part of the audio data that corresponds to speech data. For example, when a middle section of the audio data corresponds to speech data, the system encodes only that middle section with the vocoder and generates the second audio data by inserting identifying information, such as a predetermined flag or index information, at the start position of the middle section, or by reconstructing conversion information such as vocoder information. Specifically, when the audio data partly contains speech data and partly contains music data, the second audio data has a different bit rate for each section. For example, the part corresponding to speech data can be encoded at a bit rate of 8 Kbps and the part corresponding to music data at 128 Kbps.
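The per-section treatment described above can be sketched as a plan that records, for each section, where it starts and how it is encoded. This is a hedged illustration: the `Segment` type, the field names of the plan entries, and the label "original" for untouched music sections are assumptions, while the 8 Kbps and 128 Kbps figures come from the text.

```python
from dataclasses import dataclass

SPEECH_KBPS = 8    # vocoder rate quoted in the text
MUSIC_KBPS = 128   # music rate quoted in the text


@dataclass
class Segment:
    start_s: float
    end_s: float
    is_speech: bool


def plan_encoding(segments):
    """Build one conversion-information entry per section (field names are illustrative)."""
    plan = []
    for index, seg in enumerate(segments):
        plan.append({
            "index": index,
            "start_s": seg.start_s,
            "codec": "EVRC" if seg.is_speech else "original",
            "kbps": SPEECH_KBPS if seg.is_speech else MUSIC_KBPS,
        })
    return plan


def estimated_kilobytes(segments):
    """Estimated size of the second audio data in kilobytes (10**3 bytes)."""
    bits = sum(
        (s.end_s - s.start_s) * (SPEECH_KBPS if s.is_speech else MUSIC_KBPS) * 1000
        for s in segments
    )
    return bits / 8 / 1000


segments = [Segment(0, 60, False), Segment(60, 180, True), Segment(180, 240, False)]
print(plan_encoding(segments))
print(estimated_kilobytes(segments))
```

For this four-minute example with two minutes of speech in the middle, the estimate is 2040 kB, against 3840 kB if the whole file stayed at 128 Kbps.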
When speech data makes up more than a predetermined ratio of the audio data, the selective encoding system according to an exemplary embodiment of the present invention can encode the entire audio data at the bit rate appropriate to speech data. The predetermined ratio can be set by the developer or the operator of the selective encoding system.
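The three outcomes described for operations 203 to 205 and for the ratio rule can be summarized in a small decision function. The sketch below is an assumption-laden illustration: the 0.8 default threshold, the mode names, and the representation of the audio as `(duration_s, is_speech)` pairs are all hypothetical, since the text leaves the ratio to the developer or operator.

```python
def speech_ratio(segments):
    """segments: list of (duration_s, is_speech) pairs; returns the speech share of total time."""
    total = sum(duration for duration, _ in segments)
    if total == 0:
        return 0.0
    return sum(duration for duration, is_speech in segments if is_speech) / total


def choose_mode(segments, threshold=0.8):
    """Pick how to handle the audio data; threshold is an operator-set assumption."""
    ratio = speech_ratio(segments)
    if ratio == 0.0:
        return "pass_through"              # no speech: send the audio data unchanged
    if ratio > threshold:
        return "encode_all_as_speech"      # mostly speech: vocoder rate for everything
    return "encode_speech_parts_only"      # mixed: re-encode only the speech sections


print(choose_mode([(90, True), (10, False)]))
```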
In operation 206, the selective encoding system sends the generated second audio data to the second user terminal.
The selective encoding system according to an exemplary embodiment of the present invention can be implemented on a user's computer terminal, and the second user terminal can be a mobile device such as a mobile phone, PDA, electronic notebook, PMP, PSP, or MP3 player. This exemplary embodiment is described in detail with reference to Fig. 3.
Fig. 3 is an example diagram of a network including the selective encoding system, a server, and a second user terminal according to the present invention.
Referring to Fig. 3, the selective encoding system 300 can be implemented on a computer terminal 310. Specifically, the selective encoding system 300 is a predetermined application program or hardware located in the computer terminal 310. A server 301 sends audio data to the computer terminal 310 at a predetermined cycle over a network 302, based on the podcast service or the RSS service. The network 302 can be any wired or wireless network that gives the computer terminal 310 network connectivity. The computer terminal 310, having received the audio data over the network 302, determines in the selective encoding system 300 whether speech data is included in the audio data. When it is, the selective encoding system 300 generates the second audio data by encoding the audio data with the vocoder. When a second user terminal is connected to the computer terminal 310, the computer terminal 310 transfers the second audio data generated by the selective encoding system 300 to that terminal. The second user terminal is a mobile device with a predetermined memory device, for example an MP3 player 304, a mobile communication terminal 305, or a game console 306.
The second user terminal is connected to the selective encoding system 300 through a short-range communication module, such as a USB module, an RS-232C module, or a Bluetooth module, and the selective encoding system 300 transfers the second audio data to the second user terminal when it detects that the terminal is connected.
In another exemplary embodiment of the present invention, the selective encoding system is a predetermined stand-alone server and the second user terminal is a mobile communication terminal. This exemplary embodiment is described in detail with reference to Fig. 4.
Fig. 4 is an example diagram of a network including the selective encoding system, a server, and a second user terminal according to the present invention.
Referring to Fig. 4, the selective encoding system 400 receives predetermined audio data from a server 401 over a network 402. The network 402 is to be understood broadly as including all wired and wireless communication networks.
As in the exemplary embodiment of Fig. 3, the selective encoding system 400, having received the audio data, determines whether speech data is included in the audio data and, when it is, generates the second audio data by encoding the audio data with the predetermined vocoder. The generated second audio data is sent to the second user terminal over a network 403. The second user terminal is a mobile communication terminal 404, and the network 403 includes a wireless communication network containing the system of a predetermined communication provider.
Specifically, the selective encoding system 400 requests the communication provider's system to set up a channel to the mobile communication terminal 404. The communication provider's system then sets up a wireless channel between the selective encoding system 400 and the mobile communication terminal 404, and the selective encoding system 400 transmits the second audio data wirelessly to the mobile communication terminal 404 over that channel. In addition, the mobile communication terminal 404 according to an exemplary embodiment of the present invention queries the selective encoding system 400 at a predetermined cycle as to whether there is second audio data that has not yet been transmitted and, when there is, requests the selective encoding system 400 to transmit it. Finally, by reducing the size of the audio data, the selective encoding system 400 according to the present invention can reduce the memory usage of the mobile communication terminal 404 and also lower the load on the transmission channels of the mobile communication network.
Referring again to Fig. 2, in operation 207 the second user terminal decodes the second audio data based on the conversion information and plays it to the user through a predetermined speaker device.
The selective encoding system according to an exemplary embodiment of the present invention includes a user database that holds records of user information for at least one user. The user information includes identifying information for the second user terminal that corresponds to the user; a telephone number is one example of such identifying information. Specifically, the selective encoding system looks up in the user database the user information that corresponds to the second user terminal and transmits the generated second audio data wirelessly to the second user terminal based on the identifying information in that user information. In this case, the second user terminal is a mobile communication terminal such as a mobile phone.
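The lookup described above amounts to mapping a user record to the identifier of that user's terminal before transmission. The sketch below is purely illustrative: the in-memory dictionary, the keys `user-001` and `terminal_id`, and the placeholder number are assumptions and do not reflect any actual database schema.

```python
# Hypothetical in-memory stand-in for the user database; the number is a placeholder.
USER_DB = {
    "user-001": {"terminal_id": "+00-000-0000-0001"},
}


def terminal_id_for(user_id, db):
    """Return the identifying information (e.g. a telephone number) for a user, or None."""
    record = db.get(user_id)
    return record["terminal_id"] if record else None


print(terminal_id_for("user-001", USER_DB))
```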
Fig. 5 illustrates the diagrammatic sketch of audio data format and second voice data according to an exemplary embodiment of the present invention.
With reference among the figure 5 with numeral 501 parts of representing, be ' A.MP3 ' according to the voice data of exemplary embodiment of the present invention.Above-mentioned ' A.MP3 ' comprises that a plurality of playlists and above-mentioned optionally coded system discern above-mentioned speech data and whether be included in the above-mentioned voice data by analyzing each playlist.For example, ' A.MP3 ' is an audio broadcasting and narration data and the music data that comprises the announcer.As the result who analyzes above-mentioned playlist, above-mentioned optionally coded system decision ' A1 ' and ' A3 ' is above-mentioned music data, and ' A2 ' and ' A4 ' is above-mentioned announcer's narration data.And the predetermined vocoder of above-mentioned optionally coded system utilization is that ' A1 ' and ' A3 ' of music data encodes and ' A2 ' and ' A4 ' also encoded to being judged to be.Specifically, each voice data that the analysis of above-mentioned optionally coded system is classified by each playlist is carried out different codings as the result who analyzes at each playlist.In this case, above-mentioned second user's terminating machine requires to have the function of playing each tabulation based on above-mentioned playlist again.Similar with the part of numeral 501 expressions, the voice data that comprises above-mentioned speech data can prevent that above-mentioned voice data is judged as the problem of above-mentioned music data or above-mentioned song data.
Referring to the part denoted by reference numeral 502 in Fig. 5, the optional encoding system has removed the playlists from the part denoted by reference numeral 501, inserted conversion information about the encoding into each playlist, and recombined the playlists to form a single piece of voice data. In the case of the part denoted by reference numeral 502, predetermined software capable of decoding voice data encoded by a plurality of encoders is required. Because such software is a publicly known and shared technology, a detailed description is omitted.
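The recombination described for part 502 can be sketched as follows. The container layout used here (a two-byte header length, a JSON header naming the encoder, then the payload) is purely an assumption chosen for illustration; the specification does not define the format of the conversion information, and `pack` and `unpack` are hypothetical names.

```python
import json
import struct
from typing import List, Tuple

Segment = Tuple[str, bytes]  # (encoder label, encoded payload)


def pack(segments: List[Segment]) -> bytes:
    """Join encoded segments into one blob, each prefixed by its conversion info."""
    out = bytearray()
    for encoder, payload in segments:
        header = json.dumps({"encoder": encoder, "length": len(payload)}).encode("utf-8")
        out += struct.pack(">H", len(header))  # header size, big-endian 16-bit
        out += header
        out += payload
    return bytes(out)


def unpack(blob: bytes) -> List[Segment]:
    """Split the blob back into segments so each can go to the matching decoder."""
    pos = 0
    result: List[Segment] = []
    while pos < len(blob):
        (header_len,) = struct.unpack_from(">H", blob, pos)
        pos += 2
        header = json.loads(blob[pos:pos + header_len].decode("utf-8"))
        pos += header_len
        payload = blob[pos:pos + header["length"]]
        pos += header["length"]
        result.append((header["encoder"], payload))
    return result


segments = [("music_codec", b"\x01\x02"), ("vocoder", b"\x03")]
restored = unpack(pack(segments))
```

A receiving terminal would read each header first and dispatch the payload to the decoder it names, which corresponds to decoding based on the conversion information.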
Fig. 6 is a block diagram illustrating the internal configuration of the optional encoding system according to an exemplary embodiment of the present invention.
Referring to Fig. 6, the optional encoding system 600 according to an exemplary embodiment of the present invention comprises a receiving unit 601, a converting unit 602, and a delivery unit 603.
The receiving unit 601 receives voice data from a predetermined server. The server is a general server that provides voice data such as speech, music, songs, and broadcasts. The voice data includes both encoded data and unprocessed data.
The converting unit 602 determines whether speech data is included in the voice data by analyzing the data format of the voice data received by the receiving unit 601 and, when the speech data is included in the voice data, produces second voice data by encoding the voice data via a predetermined vocoder. The converting unit 602 according to an exemplary embodiment of the present invention divides the received voice data into a plurality of data based on a predetermined list and determines whether each of the plurality of data is speech data. Differentiated encoding is then applied to the plurality of data, and the second voice data is generated from each of them individually. In this case, the second voice data includes conversion information about the vocoder and the encoding.
According to an exemplary embodiment of the present invention, the converting unit 602 generates the second voice data from the voice data via a specific encoder in accordance with the user's instruction. The user can set, according to the user's preference, which voice data is to be encoded into the second voice data and which specific encoder is to be used. For example, depending on the content capacity of the second user terminal, the user can set music data or song data to be encoded with the vocoder.
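A user setting of this kind could look like the sketch below. The threshold field, the function name `select_encoder`, and the encoder labels are all hypothetical; the specification only states that the user may choose the encoder, for example in view of the capacity of the second user terminal.

```python
from dataclasses import dataclass


@dataclass
class UserPreference:
    # Hypothetical setting: when the terminal has less free space than this,
    # music and song data are also encoded with the vocoder.
    vocoder_below_free_bytes: int


def select_encoder(kind: str, free_bytes: int, pref: UserPreference) -> str:
    if kind == "speech":
        return "vocoder"  # speech is always sent to the vocoder in this sketch
    if free_bytes < pref.vocoder_below_free_bytes:
        return "vocoder"  # user-configured override for a nearly full terminal
    return "music_codec"


pref = UserPreference(vocoder_below_free_bytes=10_000_000)
roomy = select_encoder("music", free_bytes=50_000_000, pref=pref)
tight = select_encoder("music", free_bytes=1_000_000, pref=pref)
```

The trade-off is the usual one: a vocoder yields smaller files at the cost of music quality, so the override is left to the user rather than applied automatically.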
The delivery unit 603 sends the generated second voice data to the second user terminal.
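The cooperation of the receiving unit 601, the converting unit 602, and the delivery unit 603 can be outlined in a minimal sketch. Each unit is represented by a plain callable supplied by the caller; the class name, the parameter names, and the stand-in lambdas are hypothetical. Passing the data through unchanged when no speech is detected is an assumption of this sketch, since that case is not described here.

```python
from typing import Callable


class OptionalEncodingSystem:
    def __init__(
        self,
        receive: Callable[[], bytes],             # stands in for receiving unit 601
        contains_speech: Callable[[bytes], bool],  # analysis step of converting unit 602
        vocode: Callable[[bytes], bytes],          # vocoder step of converting unit 602
        deliver: Callable[[bytes], None],          # stands in for delivery unit 603
    ) -> None:
        self.receive = receive
        self.contains_speech = contains_speech
        self.vocode = vocode
        self.deliver = deliver

    def run(self) -> bytes:
        voice_data = self.receive()
        if self.contains_speech(voice_data):
            second_voice_data = self.vocode(voice_data)
        else:
            # Assumption: data without speech is forwarded unchanged.
            second_voice_data = voice_data
        self.deliver(second_voice_data)
        return second_voice_data


sent = []
system = OptionalEncodingSystem(
    receive=lambda: b"speech:hello",
    contains_speech=lambda data: data.startswith(b"speech:"),
    vocode=lambda data: b"VOC" + data,
    deliver=sent.append,
)
result = system.run()
```

Injecting the units as callables keeps the sketch independent of any particular server, codec, or transport.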
The optional encoding system 600 according to an exemplary embodiment of the present invention is included in a predetermined computer terminal in the form of an application software program or hardware. Specifically, the receiving unit 601 receives the voice data from a predetermined server over the Internet via a wired or wireless connection, and the converting unit 602 determines whether the speech data is included in the voice data and, when it is, produces the second voice data by encoding the voice data via the vocoder. Then, when the second user terminal is connected via a short-range communication module, for example a USB module, an RS-232C module, an ultra-wideband (UWB) module, a Bluetooth module, or a wireless local area network (LAN), the delivery unit 603 sends the second voice data to the second user terminal.
The optional encoding system 600 according to an exemplary embodiment of the present invention is a predetermined independent server. In this case, the receiving unit 601 receives the voice data from the server via a wired/wireless communication network, and the converting unit 602 produces the second voice data according to whether the speech data is included in the voice data. The delivery unit 603 then wirelessly sends the second voice data to the second user terminal. The second user terminal is a predetermined communication terminal such as a mobile communication terminal, a public switched telephone network (PSTN) terminal, a Voice over IP (VoIP) terminal, a Session Initiation Protocol (SIP) terminal, a Media Gateway Control (Megaco) terminal, a personal digital assistant (PDA), a mobile phone, a Personal Communication Service (PCS) phone, a hand-held PC, a Code Division Multiple Access (CDMA)-2000 (1X, 3X) phone, a Wideband CDMA (WCDMA) phone, a dual band/dual mode phone, a Global System for Mobile Communications (GSM) phone, a Mobile Broadband System (MBS) phone, or a satellite/terrestrial Digital Multimedia Broadcasting (DMB) phone.
The optional encoding system 600 according to an exemplary embodiment of the present invention further comprises a user database 604 and a database management unit 605.
The user database 604 contains user information about at least one user. The user information includes identification information of the second user terminal corresponding to the user. The database management unit 605 reads the user information corresponding to the second user terminal by referring to the user database 604, and controls the delivery unit 603 to wirelessly send the second voice data to the second user terminal based on the identification information corresponding to that user information.
For example, before wirelessly sending the second voice data to the second user terminal, the delivery unit 603 parses the user database 604 and reads the predetermined user information. The user information includes the identification information, for example the telephone number information of the second user terminal, and the delivery unit 603 sends the second voice data to the second user terminal based on that identification information.
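The lookup performed against the user database 604 before delivery can be sketched with an in-memory SQLite table. The table layout, the function names, the sample telephone number, and the `send` callback are assumptions made for illustration; the specification states only that identification information such as a telephone number is read from the database and used to address the second user terminal.

```python
import sqlite3
from typing import Callable, Optional


def open_user_db() -> sqlite3.Connection:
    # Hypothetical schema: one row per user, keyed by a user id.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (user_id TEXT PRIMARY KEY, phone_number TEXT NOT NULL)"
    )
    return conn


def lookup_phone_number(conn: sqlite3.Connection, user_id: str) -> Optional[str]:
    row = conn.execute(
        "SELECT phone_number FROM users WHERE user_id = ?", (user_id,)
    ).fetchone()
    return row[0] if row else None


def deliver(
    conn: sqlite3.Connection,
    user_id: str,
    second_voice_data: bytes,
    send: Callable[[str, bytes], None],
) -> str:
    """Read the identification information, then hand the data to the transport."""
    number = lookup_phone_number(conn, user_id)
    if number is None:
        raise KeyError(f"no user information for {user_id!r}")
    send(number, second_voice_data)
    return number


conn = open_user_db()
conn.execute("INSERT INTO users VALUES (?, ?)", ("user-1", "010-0000-0000"))
outbox = []
used_number = deliver(conn, "user-1", b"\x00\x01", lambda n, d: outbox.append((n, d)))
```

The `send` callback stands in for whatever wireless transport the delivery unit 603 uses, which this sketch does not model.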
Fig. 7 is an internal block diagram of a general-purpose computer that can be used to carry out the optional encoding method according to the present invention.
The computer device 700 comprises at least one processor 710 connected to main memory devices including a RAM (Random Access Memory) 720 and a ROM (Read Only Memory) 730. The processor 710 is also called a central processing unit (CPU). As is well known in the art, the ROM 730 transfers data and instructions to the CPU unidirectionally, while the RAM 720 is typically used to transfer data and instructions bidirectionally. The RAM 720 and the ROM 730 may include any suitable form of computer-readable recording medium. A mass storage device 740 is connected bidirectionally to the processor 710 to provide additional data storage capacity and may be any of a number of computer-readable recording media. The mass storage device 740 is used to store programs, data, and the like, and is a secondary storage device, such as a hard disk, that is usually slower than the main memory devices. A special mass storage device such as a CD-ROM 760 may also be used. The processor 710 is connected to at least one input/output interface 750, such as a video display, trackball, mouse, keyboard, microphone, touch-screen display, card reader, magnetic or paper tape reader, voice or handwriting recognizer, joystick, or other known computer input/output device. The processor 710 can be connected to a wired/wireless communication network through a network interface 770. The steps of the method described above can be carried out over this network connection. The devices and tools described above are well known to those skilled in the computer hardware and software arts.
The hardware devices described above may be configured to operate as one or more software modules in order to perform the operations of the present invention, and vice versa.
Although the present invention has been shown and described with reference to certain exemplary embodiments thereof, those skilled in the art will understand that various changes in form and detail may be made therein without departing from the spirit and scope of the present invention as defined by the claims.
Industrial Applicability
One aspect of the present invention provides a method and system that improve the utilization efficiency of the storage device of a mobile phone that records voice data.
Another aspect of the present invention provides a method and system that reduce the load on a wireless communication network by selectively encoding voice data in the encoding system based on the characteristics of the voice data, and that send the voice data to a second user terminal over the wireless communication network.

Claims (8)

1. A method of optionally encoding voice data in an optional encoding system, the method comprising:
receiving voice data from a predetermined server;
determining whether speech data is included in the voice data by analyzing the data format of the voice data;
when the speech data is included in the voice data, producing second voice data by using a vocoder to encode only the part of the voice data corresponding to the speech data, the second voice data including conversion information about the vocoder and the encoding; and
sending the generated second voice data to a second user terminal,
wherein the second user terminal decodes the second voice data based on the conversion information.
2. The method of claim 1, wherein the optional encoding system comprises a terminal, and when the second user terminal is connected to the terminal, the terminal sends the second voice data to the second user terminal.
3. The method of claim 1, further comprising:
maintaining a user database recording user information about at least one user, the user information including identification information of the second user terminal corresponding to the user,
wherein the sending comprises:
reading the user information corresponding to the second user terminal by referring to the user database; and
wirelessly sending the second voice data to the second user terminal based on the identification information corresponding to the user information.
4. The method of claim 1, wherein the vocoder comprises at least one of Qualcomm Code Excited Linear Prediction (QCELP), Enhanced Variable Rate Codec (EVRC), and Adaptive Multi-Rate (AMR) speech coding.
5. The method of claim 1, wherein the voice data is received from the server by a Rich Site Summary (RSS) method.
6. The method of claim 1, wherein the second voice data is produced by dividing the voice data into a plurality of voice data and encoding them via different vocoders.
7. A system for optionally encoding voice data, the system comprising:
a receiving unit that receives voice data from a predetermined server;
a converting unit that determines whether speech data is included in the voice data by analyzing the data format of the voice data and, when the speech data is included in the voice data, produces second voice data by using a vocoder to encode only the part of the voice data corresponding to the speech data, the second voice data including conversion information about the vocoder and the encoding; and
a delivery unit that sends the generated second voice data to a second user terminal,
wherein the second user terminal decodes the second voice data based on the conversion information.
8. The system of claim 7, further comprising:
a user database recording user information about at least one user, the user information including identification information of the second user terminal corresponding to the user; and
a database management unit that reads the user information corresponding to the second user terminal by referring to the user database and, based on the identification information corresponding to the user information, controls the delivery unit to wirelessly send the second voice data to the second user terminal.
CN2006800359075A 2005-09-30 2006-09-28 Optional encoding system and method for operating the system Expired - Fee Related CN101273405B (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
KR10-2005-0091846 2005-09-30
KR1020050091846A KR100757858B1 (en) 2005-09-30 2005-09-30 Optional encoding system and method for operating the system
PCT/KR2006/003903 WO2007037641A1 (en) 2005-09-30 2006-09-28 Optional encoding system and method for operating the system

Publications (2)

Publication Number Publication Date
CN101273405A CN101273405A (en) 2008-09-24
CN101273405B true CN101273405B (en) 2011-12-21

Family

ID=37900009

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2006800359075A Expired - Fee Related CN101273405B (en) 2005-09-30 2006-09-28 Optional encoding system and method for operating the system

Country Status (3)

Country Link
KR (1) KR100757858B1 (en)
CN (1) CN101273405B (en)
WO (1) WO2007037641A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112967735A (en) * 2021-02-23 2021-06-15 北京达佳互联信息技术有限公司 Training method of voice quality detection model and voice quality detection method

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5742734A (en) * 1994-08-10 1998-04-21 Qualcomm Incorporated Encoding rate selection in a variable rate vocoder
US5940796A (en) * 1991-11-12 1999-08-17 Fujitsu Limited Speech synthesis client/server system employing client determined destination control
US6490553B2 (en) * 2000-05-22 2002-12-03 Compaq Information Technologies Group, L.P. Apparatus and method for controlling rate of playback of audio data

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100754439B1 (en) * 2003-01-09 2007-08-31 와이더댄 주식회사 Preprocessing of Digital Audio data for Improving Perceptual Sound Quality on a Mobile Phone
KR20030057494A (en) * 2003-01-16 2003-07-04 (주)유토포스 The advanced digital audio contents service system and its implementation method for mobile wireless device on wireless and wired internet communication network
KR100597964B1 (en) * 2003-01-16 2006-08-21 (주)유토포스 The advanced digital audio contents service system and its implementation method for mobile wireless device on wireless and wired internet communication network
KR20060027246A (en) * 2004-09-22 2006-03-27 (주)믹스크리에이티브 The digital audio streaming service system and its implementation method for non-mpeg4 mobile handsets on wireless communication network

Also Published As

Publication number Publication date
WO2007037641A1 (en) 2007-04-05
KR100757858B1 (en) 2007-09-11
KR20070036870A (en) 2007-04-04
CN101273405A (en) 2008-09-24

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
ASS Succession or assignment of patent right

Owner name: REAL DIGITAL CO., LTD.

Free format text: FORMER OWNER: WIDERTHAN CO., LTD.

Effective date: 20120229

C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20120229

Address after: Washington State

Patentee after: Realnetworks Asia Pacific Co., Ltd.

Address before: Seoul, South Korea

Patentee before: Widerthan Co., Ltd.

ASS Succession or assignment of patent right

Owner name: INTEL CORP .

Free format text: FORMER OWNER: REAL DIGITAL CO., LTD.

Effective date: 20150528

C41 Transfer of patent application or patent right or utility model
TR01 Transfer of patent right

Effective date of registration: 20150528

Address after: California, USA

Patentee after: Intel Corporation

Address before: Washington State

Patentee before: Realnetworks Asia Pacific Co., Ltd.

CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20111221

Termination date: 20170928
