WO2013167023A2 - Mobile terminal with built-in voice message search function and search method thereof - Google Patents

Mobile terminal with built-in voice message search function and search method thereof

Info

Publication number
WO2013167023A2
Authority
WO
WIPO (PCT)
Prior art keywords
signal
module
voice
mobile terminal
similarity
Prior art date
Application number
PCT/CN2013/079091
Other languages
English (en)
French (fr)
Other versions
WO2013167023A3 (zh)
Inventor
党正
Original Assignee
中兴通讯股份有限公司
Priority date
Filing date
Publication date
Application filed by 中兴通讯股份有限公司 filed Critical 中兴通讯股份有限公司
Priority to EP13788386.4A priority Critical patent/EP2919429A4/en
Priority to US14/649,658 priority patent/US9992321B2/en
Publication of WO2013167023A2 publication Critical patent/WO2013167023A2/zh
Publication of WO2013167023A3 publication Critical patent/WO2013167023A3/zh

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M1/00: Substation equipment, e.g. for use by subscribers
    • H04M1/72: Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724: User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72403: User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H04M1/7243: User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/08: Speech classification or search
    • G10L15/10: Speech classification or search using distance or distortion measures between unknown speech and reference templates
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60: Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/683: Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L51/00: User-to-user messaging in packet-switching networks, transmitted according to store-and-forward or real-time protocols, e.g. e-mail
    • H04L51/58: Message adaptation for wireless communication
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M1/00: Substation equipment, e.g. for use by subscribers
    • H04M1/72: Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M1/724: User interfaces specially adapted for cordless or mobile telephones
    • H04M1/72403: User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H04M1/7243: User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
    • H04M1/72436: User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages for text messaging, e.g. short messaging services [SMS] or e-mails
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M3/00: Automatic or semi-automatic exchanges
    • H04M3/42: Systems providing special services or facilities to subscribers
    • H04M3/50: Centralised arrangements for answering calls; Centralised arrangements for recording messages for absent or busy subscribers; Centralised arrangements for recording messages
    • H04M3/53: Centralised arrangements for recording incoming messages, i.e. mailbox systems
    • H04M3/5307: Centralised arrangements for recording incoming messages, i.e. mailbox systems for recording messages comprising any combination of audio and non-audio components
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M3/00: Automatic or semi-automatic exchanges
    • H04M3/42: Systems providing special services or facilities to subscribers
    • H04M3/50: Centralised arrangements for answering calls; Centralised arrangements for recording messages for absent or busy subscribers; Centralised arrangements for recording messages
    • H04M3/53: Centralised arrangements for recording incoming messages, i.e. mailbox systems
    • H04M3/533: Voice mail systems
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W28/00: Network traffic management; Network resource management
    • H04W28/02: Traffic management, e.g. flow control or congestion control
    • H04W28/04: Error control
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W4/00: Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/12: Messaging; Mailboxes; Announcements
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04M: TELEPHONIC COMMUNICATION
    • H04M2250/00: Details of telephonic subscriber devices
    • H04M2250/74: Details of telephonic subscriber devices with voice recognition means
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W4/00: Services specially adapted for wireless communication networks; Facilities therefor
    • H04W4/12: Messaging; Mailboxes; Announcements
    • H04W4/14: Short messaging services, e.g. short message services [SMS] or unstructured supplementary service data [USSD]
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04W: WIRELESS COMMUNICATION NETWORKS
    • H04W88/00: Devices specially adapted for wireless communication networks, e.g. terminals, base stations or access point devices
    • H04W88/02: Terminal devices

Definitions

  • The present invention relates to the field of information search in mobile terminal technology, and in particular to a mobile terminal with a built-in voice message search function and a search method thereof.
  Background
  • In a voice message service, a user records what they want to say on a mobile terminal such as a mobile phone and sends the recording to one or more friends to listen to. Following the prompt tones of the mobile terminal, the user can also receive, forward, query, and reply to voice messages, and request them on demand. Voice messages make up for the shortcomings of traditional text messages, which cannot convey sound and are inconvenient to type, and they let people who have long stayed away from text messaging because they are unfamiliar with pinyin input send messages.
  • However, a voice message received by the user is an audio file sent by the sender, so the user cannot see its content at a glance.
  • When a mobile terminal such as a mobile phone stores many voice messages locally, finding a specific one becomes extremely difficult: the user has to open and listen to the messages one by one. Searching voice messages is therefore very inconvenient, which greatly degrades the user experience.
  • In view of this, the main purpose of the embodiments of the present invention is to provide a mobile terminal with a built-in voice message search function and a search method thereof, which can quickly search the voice messages stored in the mobile terminal.
  • An embodiment of the invention provides a mobile terminal with a built-in voice message search function. The mobile terminal includes a voice input module, a pre-processing module, a matching module, and a result output module, wherein:
  • the voice input module is configured to record the user's voice search signal and send it to the pre-processing module for pre-processing;
  • the pre-processing module is configured to pre-process the voice search signal and send the pre-processed signal to the matching module for signal matching;
  • the matching module is configured to extract feature parameters from the pre-processed signal, calculate the similarity between the extracted feature parameters and the feature parameters of the stored voice messages, and send the voice messages whose similarity is greater than or equal to a threshold to the result output module;
  • the result output module is configured to display the voice messages whose similarity is greater than or equal to the threshold as a list on the screen of the mobile terminal.
  • In the above solution, the result output module is further configured to prompt the user on the screen of the mobile terminal whether to search again when more than one voice message has a similarity greater than or equal to the threshold.
  • In the above solution, the pre-processing module includes a signal normalization module, a signal downsampling module, an anti-aliasing filter module, a signal amplification module, an endpoint detection module, and a noise filter module, wherein:
  • the signal normalization module is configured to normalize the amplitude, frequency, and phase of the voice search signal to a unified amplitude, frequency, and phase, and send the normalized signal to the signal downsampling module;
  • the signal downsampling module is configured to perform low-frequency sampling on the normalized signal and send the sampled signal to the anti-aliasing filter module;
  • the anti-aliasing filter module is configured to filter out the aliased frequency components in the downsampled signal and send the filtered signal to the signal amplification module;
  • the signal amplification module is configured to amplify the signal from which the aliased frequency components have been removed and send the amplified signal to the endpoint detection module;
  • the endpoint detection module is configured to determine the start point and end point of the valid speech in the amplified signal and send the valid speech signal to the noise filter module;
  • the noise filter module is configured to filter out the noise in the valid speech signal and send the denoised signal to the matching module for signal matching.
  • In the above solution, the unified amplitude, frequency, and phase are a set amplitude, frequency, and phase within the range of human hearing.
  • The sampling frequency used for the low-frequency sampling is greater than twice the highest frequency of the signal being sampled.
  • In the above solution, the matching module includes a feature extraction module, a similarity measurement module, and a voice message library module, wherein:
  • the feature extraction module is configured to extract the feature parameters of the pre-processed signal and send them to the similarity measurement module;
  • the similarity measurement module is configured to calculate the similarity between the extracted feature parameters and the voice message feature parameters sent by the voice message library module, and send the voice messages whose similarity is greater than or equal to the threshold to the result output module;
  • the voice message library module is configured to store the feature parameters of the voice messages and send the feature parameters of each voice message, one message at a time, to the similarity measurement module for similarity calculation.
  • In the above solution, the feature parameters include linear prediction coefficients, linear prediction cepstral coefficients, and Mel-frequency cepstral coefficients (MFCC);
  • the similarity calculation methods include a Euclidean distance similarity method, a cosine similarity method, a Manhattan distance method, and a grey relational degree method.
  • An embodiment of the invention provides a search method for a mobile terminal with a built-in voice message search function. The method includes the following steps:
  • recording the user's voice search signal and pre-processing it; extracting feature parameters from the pre-processed signal and calculating the similarity between the extracted feature parameters and the feature parameters of the stored voice messages;
  • displaying the voice messages whose similarity is greater than or equal to the threshold as a list on the screen of the mobile terminal.
  • In the above solution, pre-processing the voice search signal includes the following steps:
  • normalizing the amplitude, frequency, and phase of the voice signal to a unified amplitude, frequency, and phase; performing low-frequency sampling on the normalized signal; filtering out the aliased frequency components in the low-frequency sampled signal and amplifying the result; determining the start point and end point of the valid speech in the amplified signal; and filtering out the noise in the valid speech signal.
  • In the above solution, when more than one voice message has a similarity greater than or equal to the threshold, the method further includes: the mobile terminal prompting the user whether to search again.
  • With the mobile terminal with a built-in voice message search function and the search method thereof provided by the embodiments of the present invention, the voice input module records a voice search signal, the pre-processing module pre-processes it, the matching module calculates the similarity between the voice search signal and the stored voice messages, and the result output module displays the voice messages whose similarity is greater than or equal to the threshold as a list on the screen of the mobile terminal. The voice messages stored in the mobile terminal can thus be searched by recording a voice search signal, and the user no longer needs to open and listen to them one by one, which makes searching voice messages very convenient and improves the user experience.
  • Preferably, when more than one voice message has a similarity greater than or equal to the threshold, the result output module prompts the user whether to search again. In this way, the voice messages found by the previous search can be searched again by recording another voice search signal.
  • FIG. 1 is a schematic structural diagram of a mobile terminal with a built-in voice message search function according to an embodiment of the present invention;
  • FIG. 2 is a schematic structural diagram of the voice message library module according to an embodiment of the present invention;
  • FIG. 3 is a schematic flowchart of the search method implemented by a mobile terminal with a built-in voice message search function according to an embodiment of the present invention.
  Detailed description
  • As shown in FIG. 1, the mobile terminal includes a voice input module 11, a pre-processing module 12, a matching module 13, and a result output module 14, wherein:
  • the voice input module 11 is configured to record the user's voice search signal and send it to the pre-processing module 12 for pre-processing;
  • the pre-processing module 12 is configured to receive the voice search signal sent by the voice input module 11, pre-process it, and send the pre-processed signal to the matching module 13 for signal matching;
  • the matching module 13 is configured to receive the pre-processed signal sent by the pre-processing module 12, extract feature parameters from it, calculate the similarity between the extracted feature parameters and the feature parameters of the stored voice messages, and send the voice messages whose similarity is greater than or equal to the threshold to the result output module 14;
  • the result output module 14 is configured to receive the voice messages sent by the matching module 13 whose similarity is greater than or equal to the threshold and display them as a list on the screen of the mobile terminal.
  • The list includes at least one voice message entry, and the entries are arranged vertically on the screen of the mobile terminal.
  • Each voice message entry includes a voice message connection identifier and may further include one or more of the voice message creation time, the voice message duration, and the voice message size; the connection identifier, creation time, duration, and size are arranged horizontally on the screen of the mobile terminal.
  • The threshold is a preset similarity threshold. A similarity greater than or equal to the threshold indicates that the voice message contains the voice search signal; a similarity below the threshold indicates that the voice message does not contain the voice search signal.
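The threshold test and the list layout described above can be sketched as follows. This is only an illustration under stated assumptions: the names `VoiceMessageEntry`, `filter_hits`, `format_entry`, and `render_list` are hypothetical, the similarity is treated as a score where higher means more similar, and sorting the hits by score is an added convenience that the source does not mention.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class VoiceMessageEntry:
    """One row of the result list (field names are hypothetical)."""
    connection_id: str                  # identifier used to open the message
    created: Optional[str] = None       # creation time, optional
    duration_s: Optional[float] = None  # duration in seconds, optional
    size_kb: Optional[float] = None     # size in kilobytes, optional


def filter_hits(scored: List[Tuple[VoiceMessageEntry, float]],
                threshold: float) -> List[VoiceMessageEntry]:
    """Keep entries whose similarity is >= threshold, best match first."""
    hits = [(entry, sim) for entry, sim in scored if sim >= threshold]
    hits.sort(key=lambda pair: pair[1], reverse=True)  # assumption: ranked output
    return [entry for entry, _ in hits]


def format_entry(entry: VoiceMessageEntry) -> str:
    """Lay out the fields of one entry horizontally on a single row."""
    fields = [entry.connection_id]
    if entry.created is not None:
        fields.append(entry.created)
    if entry.duration_s is not None:
        fields.append(f"{entry.duration_s:.0f} s")
    if entry.size_kb is not None:
        fields.append(f"{entry.size_kb:.0f} KB")
    return " | ".join(fields)


def render_list(entries: List[VoiceMessageEntry]) -> str:
    """Stack the entries vertically, one row per voice message."""
    return "\n".join(format_entry(e) for e in entries)
```

If a distance such as the Euclidean distance is used instead of a score, the comparison would have to be inverted or the distance converted to a score first, because a smaller distance means a closer match.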
  • The voice search signal may be a keyword or a key sentence of a voice message; the voice messages are the at least one piece of voice information stored on the mobile terminal.
  • The result output module 14 is further configured to prompt the user on the screen of the mobile terminal whether to search again when more than one voice message has a similarity greater than or equal to the threshold. Correspondingly, when the search is performed again:
  • the voice input module 11 records a voice search signal that is a second keyword or a second key sentence, and the matching module 13 calculates the similarity between the second keyword or second key sentence and the voice messages whose similarity was greater than or equal to the threshold in the previous search; the second keyword or second key sentence differs from the keyword or key sentence used in the previous search.
  • The pre-processing module 12 includes a signal normalization module 121, a signal downsampling module 122, an anti-aliasing filter module 123, a signal amplification module 124, an endpoint detection module 125, and a noise filter module 126, wherein:
  • the signal normalization module 121 is configured to receive the voice search signal sent by the voice input module 11, normalize its amplitude, frequency, and phase to a unified amplitude, frequency, and phase, and send the normalized signal to the signal downsampling module 122;
  • the signal downsampling module 122 is configured to receive the signal sent by the signal normalization module 121, perform low-frequency sampling on it, and send the sampled signal to the anti-aliasing filter module 123;
  • the anti-aliasing filter module 123 is configured to receive the signal sent by the signal downsampling module 122, filter out its aliased frequency components, and send the filtered signal to the signal amplification module 124;
  • the signal amplification module 124 is configured to receive the signal sent by the anti-aliasing filter module 123, amplify it, and send the amplified signal to the endpoint detection module 125;
  • the endpoint detection module 125 is configured to receive the signal sent by the signal amplification module 124, determine the start point and end point of the valid speech in it, and send the valid speech signal to the noise filter module 126;
  • the noise filter module 126 is configured to receive the valid speech signal sent by the endpoint detection module 125, filter out the noise in it, and send the denoised signal to the matching module 13 for signal matching.
  • The unified amplitude, frequency, and phase are a set amplitude, frequency, and phase within the range of human hearing.
  • The sampling frequency used for the low-frequency sampling is greater than twice the highest frequency of the signal being sampled, which ensures that the sampling frequency is high enough.
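Two of the pre-processing checks above lend themselves to a short sketch: the sampling-rate condition and endpoint detection. The source does not say how the endpoint detection module finds the start and end of the valid speech, so the short-time energy rule below is an assumption, and the names `satisfies_sampling_condition` and `detect_endpoints` and the parameters `frame_len` and `energy_ratio` are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def satisfies_sampling_condition(sample_rate_hz: float,
                                 max_signal_hz: float) -> bool:
    """True if the sampling rate exceeds twice the highest signal frequency."""
    return sample_rate_hz > 2.0 * max_signal_hz


def detect_endpoints(samples: Sequence[float],
                     frame_len: int = 160,
                     energy_ratio: float = 0.1) -> Optional[Tuple[int, int]]:
    """Return (start, end) sample indices of the valid speech, or None.

    Assumed rule: a frame counts as speech when its energy is at least
    `energy_ratio` times the energy of the loudest frame.
    """
    n_frames = len(samples) // frame_len
    energies = [
        sum(s * s for s in samples[i * frame_len:(i + 1) * frame_len])
        for i in range(n_frames)
    ]
    if not energies or max(energies) == 0:
        return None  # too short or completely silent
    cutoff = energy_ratio * max(energies)
    active = [i for i, e in enumerate(energies) if e >= cutoff]
    # The end index is exclusive, so samples[start:end] is the valid speech.
    return active[0] * frame_len, (active[-1] + 1) * frame_len
```

A fixed energy ratio is the simplest possible rule; a real terminal would likely need something more robust against background noise, which is why the source places a noise filter after this stage.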
  • The matching module 13 includes a feature extraction module 131, a similarity measurement module 132, and a voice message library module 133, wherein:
  • the feature extraction module 131 is configured to receive the pre-processed signal sent by the pre-processing module 12, extract its feature parameters, and send them to the similarity measurement module 132;
  • the similarity measurement module 132 is configured to receive the feature parameters sent by the feature extraction module 131, calculate the similarity between them and the voice message feature parameters sent by the voice message library module 133, and send the voice messages whose similarity is greater than or equal to the threshold to the result output module 14;
  • the voice message library module 133 is configured to store the feature parameters of the voice messages and send the feature parameters of each voice message, one message at a time, to the similarity measurement module 132 for similarity calculation.
  • The feature parameters include linear prediction coefficients, linear prediction cepstral coefficients, Mel-frequency cepstral coefficients (MFCC), and the like.
  • Taking the Mel-frequency cepstral coefficients as an example, the feature parameters are extracted as follows:
  • the pre-processed signal is divided into frames and windowed, and a discrete Fourier transform is applied to obtain the spectral distribution; the squared magnitude of the spectrum gives the energy spectrum; the energy spectrum is passed through a bank of Mel-scale triangular filters;
  • the Mel-frequency cepstral coefficients are obtained by a discrete cosine transform; the Mel-frequency cepstral coefficients are then vector quantized.
  • Vector quantization of feature parameters such as the Mel-frequency cepstral coefficients can be implemented with a Principal Component Analysis (PCA) method, a Support Vector Machine (SVM) method, or a Wavelet Transform (WT) method.
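The extraction steps above (framing, windowing, discrete Fourier transform, energy spectrum, Mel-scale triangular filter bank, discrete cosine transform) can be sketched in plain Python. This is a minimal sketch under assumptions, not the patent's implementation: the Hamming window, the common 2595·log10(1 + f/700) Mel mapping, the logarithm taken before the cosine transform, and all function names and default sizes are choices made here for illustration. The final vector quantization step is left out.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def frames(signal, frame_len, hop):
    """Split the signal into overlapping frames of frame_len samples."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def power_spectrum(frame):
    """Hamming-window one frame and return |DFT|^2 for bins 0..N/2."""
    n = len(frame)
    windowed = [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, x in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the Mel scale."""
    top = hz_to_mel(sample_rate / 2)
    mels = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int((n_fft + 1) * mel_to_hz(m) / sample_rate) for m in mels]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = bins[j - 1], bins[j], bins[j + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):          # rising edge
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):          # falling edge, peak of 1.0 at mid
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank


def mfcc(frame, sample_rate, n_filters=20, n_coeffs=12):
    """Cepstral coefficients of one frame."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    # Log of each filter's energy; the small constant avoids log(0).
    log_e = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
             for row in bank]
    # DCT-II of the log energies gives the cepstral coefficients.
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / n_filters)
                for m, e in enumerate(log_e))
            for c in range(n_coeffs)]
```

The direct DFT here costs O(N²) per frame and is only meant to keep the sketch dependency-free; a fast Fourier transform would be the natural replacement on a real device.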
  • The similarity may be calculated with a Euclidean distance similarity method, a cosine similarity method, a Manhattan distance method, or a grey relational degree method.
  • For the Euclidean distance method: $d_2(X, Y) = \sqrt{\sum_{i=1}^{K} (X_i - Y_i)^2}$
  • where $X_i$ are the components of the feature parameter vector $X$ of the signal, $Y_i$ are the components of the feature parameter vector $Y$ of a piece of voice information, $d_2(X, Y)$ is the Euclidean distance similarity, and $i$ takes the values 1, 2, 3, ..., K.
  • The Euclidean distance similarity characterizes how similar the signal and the voice information are: the larger its value, the lower the similarity; the smaller its value, the higher the similarity.
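Three of the named measures are easy to write down for two feature vectors of equal length K. The sketch below is illustrative only; the function names are hypothetical, and `distance_to_similarity` is an assumed convention for turning a distance (smaller is closer) into a score (larger is closer) so that it can be compared against a threshold with "greater than or equal to". The grey relational degree method is not covered.

```python
import math


def euclidean_distance(x, y):
    """d_2(X, Y): square root of the summed squared component differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def manhattan_distance(x, y):
    """Sum of the absolute component differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def cosine_similarity(x, y):
    """Cosine of the angle between the vectors; 0.0 if either has zero length."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


def distance_to_similarity(d):
    """Map a distance in [0, inf) to a score in (0, 1]; an assumed convention."""
    return 1.0 / (1.0 + d)
```

For example, `euclidean_distance([0, 0], [3, 4])` evaluates to 5.0 and `manhattan_distance([0, 0], [3, 4])` to 7.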
  • The voice message library module 133 includes a voice message unit 133a, a pre-processing unit 133b, and a feature extraction unit 133c, wherein:
  • the voice message unit 133a is configured to store the recorded voice messages and send each voice message to the pre-processing unit 133b for pre-processing;
  • the pre-processing unit 133b is configured to receive the voice message sent by the voice message unit 133a, pre-process it, and send the pre-processed signal to the feature extraction unit 133c;
  • the feature extraction unit 133c is configured to receive the pre-processed signal sent by the pre-processing unit 133b and extract feature parameters from it.
  • The voice message is pre-processed as follows: the amplitude, frequency, and phase of the voice signal are normalized to a unified amplitude, frequency, and phase; low-frequency sampling is performed on the normalized signal;
  • the aliased frequency components in the low-frequency sampled signal are filtered out; the signal is then amplified; the start point and end point of the valid speech in the amplified signal are determined; finally, the noise in the valid speech signal is filtered out.
  • The feature parameters include linear prediction coefficients, linear prediction cepstral coefficients, Mel-frequency cepstral coefficients, and the like.
  • Taking the Mel-frequency cepstral coefficients as an example, the feature parameters are extracted as follows:
  • the signal is divided into frames and windowed, and a discrete Fourier transform is applied to obtain the spectral distribution; the squared magnitude of the spectrum gives the energy spectrum; the energy spectrum is passed through a bank of Mel-scale triangular filters; the Mel-frequency cepstral coefficients are obtained by a discrete cosine transform; the Mel-frequency cepstral coefficients are then vector quantized.
  • Vector quantization of feature parameters such as the Mel-frequency cepstral coefficients can be implemented with a PCA method, an SVM method, or a WT method.
  • The pre-processing of the voice messages and the extraction of feature parameters from the pre-processed signals may be performed in the background on the mobile terminal.
  • FIG. 3 is a schematic flowchart of the search method implemented by a mobile terminal with a built-in voice message search function according to an embodiment of the present invention. As shown in FIG. 3, the method includes the following steps:
  • Step 301: The mobile terminal records the user's voice search signal.
  • The voice search signal may be a keyword or a key sentence of a voice message.
  • Step 302: The mobile terminal pre-processes the voice search signal.
  • This step specifically includes: normalizing the amplitude, frequency, and phase of the voice signal to a unified amplitude, frequency, and phase; performing low-frequency sampling on the normalized signal; filtering out the aliased frequency components in the low-frequency sampled signal and then amplifying the signal; determining the start point and end point of the valid speech in the amplified signal; and finally filtering out the noise in the valid speech signal.
  • The sampling frequency used for the low-frequency sampling is greater than twice the highest frequency of the signal.
  • Step 303: The mobile terminal extracts feature parameters from the pre-processed signal and calculates the similarity between the extracted feature parameters and the feature parameters of the stored voice messages.
  • Specifically, for each stored voice message, the similarity between the extracted feature parameters and the voice feature parameters is first calculated at the start point of the valid speech; the comparison position is then moved back one syllable at a time (the syllable of one character, for example the syllable of "good"), and the similarity is calculated again at each position until the end point of the valid speech in the voice message is reached. The maximum similarity obtained is taken as the similarity of that voice message.
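The position-by-position comparison described above amounts to sliding the query over the stored message and keeping the best score. The sketch below assumes both are sequences of per-frame feature vectors and advances by a fixed number of frames (`step`) rather than by detected syllables, since the source does not say how syllable boundaries are found. The names `window_similarity` and `best_match` and the 1/(1 + distance) score are assumptions for illustration.

```python
import math


def window_similarity(query, window):
    """Mean per-frame score 1/(1+d) between two equal-length sequences."""
    total = 0.0
    for q, w in zip(query, window):
        d = math.sqrt(sum((a - b) ** 2 for a, b in zip(q, w)))
        total += 1.0 / (1.0 + d)
    return total / len(query)


def best_match(query, message, step=1):
    """Slide the query over the message; return (max similarity, offset).

    Returns (0.0, None) when the message is shorter than the query.
    """
    best, best_offset = 0.0, None
    for offset in range(0, len(message) - len(query) + 1, step):
        sim = window_similarity(query, message[offset:offset + len(query)])
        if sim > best:
            best, best_offset = sim, offset
    return best, best_offset
```

Taking the maximum over all offsets is what lets a short keyword match a long message: only the best-aligned stretch of the message contributes to the message's similarity.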
  • The feature parameters include linear prediction coefficients, linear prediction cepstral coefficients, Mel-frequency cepstral coefficients, and the like.
  • Taking the Mel-frequency cepstral coefficients as an example, the feature parameters are extracted as follows:
  • the signal is divided into frames and windowed, and a discrete Fourier transform is applied to obtain the spectral distribution; the squared magnitude of the spectrum gives the energy spectrum; the energy spectrum is passed through a bank of Mel-scale triangular filters; the Mel-frequency cepstral coefficients are obtained by a discrete cosine transform; the Mel-frequency cepstral coefficients are then vector quantized.
  • Vector quantization of feature parameters such as the Mel-frequency cepstral coefficients can be implemented with a PCA method, an SVM method, or a WT method.
  • The similarity may be calculated with a Euclidean distance similarity method, a cosine similarity method, a Manhattan distance method, or a grey relational degree method.
  • Step 304: The mobile terminal displays the voice messages whose similarity is greater than or equal to the threshold as a list on the screen of the mobile terminal.
  • The list includes at least one voice message entry, and the entries are arranged vertically on the screen of the mobile terminal.
  • Each voice message entry includes a voice message connection identifier and may further include one or more of the voice message creation time, the voice message duration, and the voice message size; the connection identifier, creation time, duration, and size are arranged horizontally on the screen of the mobile terminal.
  • The threshold is a preset similarity threshold. A similarity greater than or equal to the threshold indicates that the voice message contains the voice search signal; a similarity below the threshold indicates that the voice message does not contain the voice search signal.
  • When more than one voice message has a similarity greater than or equal to the threshold, this step further includes: the mobile terminal prompting the user whether to search again.
  • When the search is performed again, steps 301 to 304 are repeated. The voice search signal recorded this time is a second keyword or a second key sentence, and the recalculated similarity is the similarity between the second keyword or second key sentence and the voice messages whose similarity was greater than or equal to the threshold in the previous search; the second keyword or second key sentence differs from the keyword or key sentence used in the previous search.
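The repeated search described above narrows the candidate set: the second query is scored only against the messages returned by the previous search. A minimal sketch follows, assuming the library is a dictionary from message identifier to stored features and that `score` is any function returning a higher-is-better similarity. The names `search` and `refine` and this data layout are hypothetical.

```python
def search(query_features, candidates, score, threshold):
    """Return ids of candidates whose score against the query is >= threshold."""
    return [msg_id for msg_id, feats in candidates.items()
            if score(query_features, feats) >= threshold]


def refine(second_query, previous_hits, library, score, threshold):
    """Repeat the search, but only over the hits of the previous search."""
    narrowed = {msg_id: library[msg_id] for msg_id in previous_hits}
    return search(second_query, narrowed, score, threshold)
```

With a toy library in which each message is represented by the set of words it contains, a first search for "hello" might return two messages, and refining with "lunch" would keep only the one that also contains "lunch". Restricting the second pass to the earlier hits keeps the cost of each refinement proportional to the size of the previous result list rather than the whole library.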

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Business, Economics & Management (AREA)
  • Business, Economics & Management (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Telephone Function (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

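The invention discloses a mobile terminal with a built-in voice message search function. The mobile terminal includes: a voice input module, configured to record a user's voice search signal and send it to a pre-processing module for pre-processing; the pre-processing module, configured to pre-process the voice search signal and send the pre-processed signal to a matching module for signal matching; the matching module, configured to extract feature parameters from the pre-processed signal, calculate the similarity between the extracted feature parameters and the feature parameters of the stored voice messages, and send the voice messages whose similarity is greater than or equal to a threshold to a result output module; and the result output module, configured to display the voice messages whose similarity is greater than or equal to the threshold as a list on the screen of the mobile terminal. The invention also discloses a search method for a mobile terminal with a built-in voice message search function. With the invention, the voice messages stored in the mobile terminal can be searched quickly.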

Description

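Mobile terminal with built-in voice message search function and search method thereof
Technical Field
The present invention relates to the field of information search in mobile terminal technology, and in particular to a mobile terminal with a built-in voice message search function and a search method thereof.
Background
In a voice message service, a user records what they want to say on a mobile terminal such as a mobile phone and sends the recording to one or more friends to listen to. Following the prompt tones of the mobile terminal, the user can also receive, forward, query, and reply to voice messages, and request them on demand. Voice messages make up for the shortcomings of traditional text messages, which cannot convey sound and are inconvenient to type, and they let people who have long stayed away from text messaging because they are unfamiliar with pinyin input send messages.
However, because a voice message received by the user is an audio file sent by the sender, the user cannot see its content at a glance. When a mobile terminal such as a mobile phone stores many voice messages locally, finding a specific one becomes extremely difficult: the user has to open and listen to the messages one by one. Searching voice messages is therefore very inconvenient, which greatly degrades the user experience.
Summary of the Invention
In view of this, the main purpose of the embodiments of the present invention is to provide a mobile terminal with a built-in voice message search function and a search method thereof, which can quickly search the voice messages stored in the mobile terminal.
To achieve the above purpose, the technical solution of the embodiments of the present invention is implemented as follows: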
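An embodiment of the present invention provides a mobile terminal with a built-in voice message search function. The mobile terminal includes a voice input module, a pre-processing module, a matching module, and a result output module, wherein:
the voice input module is configured to record the user's voice search signal and send it to the pre-processing module for pre-processing;
the pre-processing module is configured to pre-process the voice search signal and send the pre-processed signal to the matching module for signal matching;
the matching module is configured to extract feature parameters from the pre-processed signal, calculate the similarity between the extracted feature parameters and the feature parameters of the stored voice messages, and send the voice messages whose similarity is greater than or equal to a threshold to the result output module;
the result output module is configured to display the voice messages whose similarity is greater than or equal to the threshold as a list on the screen of the mobile terminal.
In the above solution, the result output module is further configured to prompt the user on the screen of the mobile terminal whether to search again when more than one voice message has a similarity greater than or equal to the threshold.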
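In the above solution, the preprocessing module includes a signal normalization module, a signal downsampling module, an anti-aliasing filter module, a signal amplification module, an endpoint detection module, and a noise filtering module, wherein:
the signal normalization module is configured to normalize the amplitude, frequency, and phase of the voice search signal to a unified amplitude, frequency, and phase, respectively, and send the normalized signal to the signal downsampling module;
the signal downsampling module is configured to perform low-frequency sampling on the normalized signal and send the sampled signal to the anti-aliasing filter module;
the anti-aliasing filter module is configured to filter out the aliased frequency components in the downsampled signal and send the filtered signal to the signal amplification module;
the signal amplification module is configured to amplify the signal from which the aliased frequency components have been removed and send the amplified signal to the endpoint detection module;
the endpoint detection module is configured to determine the start point and end point of the valid speech in the amplified signal and send the valid speech signal to the noise filtering module;
the noise filtering module is configured to filter out the noise in the valid speech signal and send the denoised signal to the matching module for signal matching.
In the above solution, the unified amplitude, frequency, and phase are, respectively, a set amplitude, frequency, and phase within the range of human hearing;
the sampling frequency used in the low-frequency sampling is greater than twice the highest frequency of the sampled signal.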
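In the above solution, the matching module includes a feature extraction module, a similarity measurement module, and a voice message library module, wherein:
the feature extraction module is configured to extract the feature parameters of the preprocessed signal and send the extracted feature parameters to the similarity measurement module;
the similarity measurement module is configured to compute the similarity between the extracted feature parameters and the voice message feature parameters sent by the voice message library module, and send the voice messages whose similarity is greater than or equal to the threshold to the result output module;
the voice message library module is configured to store the feature parameters of the voice messages and send the feature parameters of each voice message in turn to the similarity measurement module for similarity computation.
In the above solution, the feature parameters include linear prediction coefficients, linear prediction cepstral coefficients, and Mel-frequency cepstral coefficients;
the similarity computation methods include the Euclidean distance similarity method, the cosine similarity method, the Manhattan distance method, and the grey relational degree method.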
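An embodiment of the present invention provides a searching method for a mobile terminal with a built-in voice message searching function. The method includes the steps of:
recording the user's voice search signal and preprocessing the voice search signal;
extracting feature parameters from the preprocessed signal and computing the similarity between the extracted feature parameters and the feature parameters of the stored voice messages;
displaying the voice messages whose similarity is greater than or equal to a threshold as a list on the screen of the mobile terminal.
In the above solution, preprocessing the voice search signal includes the steps of:
normalizing the amplitude, frequency, and phase of the voice signal to a unified amplitude, frequency, and phase, respectively; performing low-frequency sampling on the normalized signal; filtering out the aliased frequency components in the low-frequency sampled signal and amplifying the result; determining the start point and end point of the valid speech in the amplified signal; and filtering out the noise in the valid speech signal.
In the above solution, when more than one voice message has a similarity greater than or equal to the threshold, the method further includes: the mobile terminal prompting the user whether to search again.
With the mobile terminal and searching method provided by the embodiments of the present invention, the voice input module records the voice search signal, the preprocessing module preprocesses it, the matching module computes the similarity between the voice search signal and the stored voice messages, and the result output module displays the voice messages whose similarity is greater than or equal to the threshold as a list on the screen of the mobile terminal. In this way, the voice messages inside the mobile terminal can be searched by recording a voice search signal, and the user no longer needs to open and listen to the voice messages one by one, which makes searching voice messages convenient and improves the user experience.
Preferably, when more than one voice message has a similarity greater than or equal to the threshold, the result output module prompts the user whether to search again, so that the voice messages found by the previous search can be searched again by recording another voice search signal.
Brief Description of the Drawings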
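Fig. 1 is a schematic diagram of the structure of a mobile terminal with a built-in voice message searching function according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the structure of the voice message library module according to an embodiment of the present invention;
Fig. 3 is a schematic flowchart of the searching method implemented by the mobile terminal with a built-in voice message searching function according to an embodiment of the present invention.
Detailed Description
To provide a more thorough understanding of the features and technical content of the embodiments of the present invention, their implementation is described in detail below with reference to the accompanying drawings. The drawings are for reference and illustration only and are not intended to limit the embodiments of the present invention.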
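Fig. 1 is a schematic diagram of the structure of a mobile terminal with a built-in voice message searching function according to an embodiment of the present invention. As shown in Fig. 1, the mobile terminal includes a voice input module 11, a preprocessing module 12, a matching module 13, and a result output module 14, wherein:
the voice input module 11 is configured to record the user's voice search signal and send it to the preprocessing module 12 for preprocessing;
the preprocessing module 12 is configured to receive the voice search signal sent by the voice input module 11, preprocess it, and send the preprocessed signal to the matching module 13 for signal matching;
the matching module 13 is configured to receive the preprocessed signal sent by the preprocessing module 12, extract feature parameters from it, compute the similarity between the extracted feature parameters and the feature parameters of the stored voice messages, and send the voice messages whose similarity is greater than or equal to the threshold to the result output module 14;
the result output module 14 is configured to receive the voice messages sent by the matching module 13 whose similarity is greater than or equal to the threshold and display them as a list on the screen of the mobile terminal.
Here, the list includes at least one voice message entry, and the entries are arranged vertically on the screen of the mobile terminal. Each voice message entry includes a voice message link identifier and may further include one or more of the creation time, the duration, and the size of the voice message. Within an entry, the link identifier, creation time, duration, and size are arranged horizontally on the screen.
Here, the threshold is a preset similarity threshold. A similarity greater than or equal to the threshold indicates that the voice message contains the voice search signal; a similarity below the threshold indicates that it does not.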
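The threshold test and the list layout described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the class `VoiceMessageEntry`, its field names, the functions `select_matches` and `render_list`, and the sample scores are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VoiceMessageEntry:
    """One row of the result list (field names are illustrative)."""
    link_id: str                        # identifier that links to the stored message
    created_at: Optional[str] = None    # optional: creation time
    duration_s: Optional[float] = None  # optional: duration in seconds
    size_bytes: Optional[int] = None    # optional: file size


def select_matches(scored, threshold):
    """Keep the entries whose similarity is >= threshold, in stored order."""
    return [entry for entry, similarity in scored if similarity >= threshold]


def render_list(entries):
    """One entry per line (vertical); its fields side by side (horizontal)."""
    lines = []
    for e in entries:
        fields = [e.link_id, e.created_at, e.duration_s, e.size_bytes]
        lines.append("  ".join(str(f) for f in fields if f is not None))
    return "\n".join(lines)


# Hypothetical similarity scores for three stored messages.
scored = [
    (VoiceMessageEntry("msg-001", "2012-12-01 09:30", 12.5, 20480), 0.91),
    (VoiceMessageEntry("msg-002", "2012-12-02 18:05", 4.0, 8192), 0.42),
    (VoiceMessageEntry("msg-003"), 0.80),
]
hits = select_matches(scored, threshold=0.8)
print(render_list(hits))
```

A message scoring exactly at the threshold (0.80 here) is kept, matching the "greater than or equal to" wording.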
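In the above solution, the voice search signal may be a keyword or a key sentence of a voice message; the voice messages are at least one piece of voice information stored in the mobile terminal.
Preferably, the result output module 14 is further configured to prompt the user on the screen of the mobile terminal whether to search again when more than one voice message has a similarity greater than or equal to the threshold. Correspondingly, when searching again, the voice search signal recorded by the voice input module 11 is a second keyword or a second key sentence, and the matching module 13 computes the similarity between the second keyword or second key sentence and the voice messages found by the previous search, that is, those whose similarity was greater than or equal to the threshold. The second keyword or second key sentence differs from the keyword or key sentence used in the previous search.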
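The repeated search can be sketched as a second pass restricted to the previous hits. The functions `search` and `refine`, the `toy_similarity` measure, and the two-dimensional feature vectors below are assumptions made for illustration only; the source does not prescribe this representation.

```python
import math


def search(candidates, query_features, similarity, threshold):
    """Return the ids in `candidates` whose similarity to the query is >= threshold."""
    return [msg_id for msg_id, feats in candidates.items()
            if similarity(query_features, feats) >= threshold]


def refine(library, previous_hits, second_query, similarity, threshold):
    """Second search: compare only against messages found by the previous search."""
    narrowed = {msg_id: library[msg_id] for msg_id in previous_hits}
    return search(narrowed, second_query, similarity, threshold)


def toy_similarity(query, segments):
    """Best match of the query vector against any segment of a message (toy measure)."""
    return max(1.0 / (1.0 + math.dist(query, seg)) for seg in segments)


# Each message is a list of per-segment feature vectors (hypothetical data).
library = {
    "msg-001": [(1.0, 0.0), (0.0, 1.0)],  # contains both keywords
    "msg-002": [(1.0, 0.0), (5.0, 5.0)],  # contains only the first keyword
    "msg-003": [(9.0, 9.0)],              # contains neither keyword
}
first_hits = search(library, (1.0, 0.0), toy_similarity, threshold=0.9)
second_hits = refine(library, first_hits, (0.0, 1.0), toy_similarity, threshold=0.9)
print(first_hits, second_hits)
```

Restricting the second pass to `previous_hits` is what lets a second, different keyword narrow the result list instead of starting a fresh search.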
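Preferably, the preprocessing module 12 includes a signal normalization module 121, a signal downsampling module 122, an anti-aliasing filter module 123, a signal amplification module 124, an endpoint detection module 125, and a noise filtering module 126, wherein:
the signal normalization module 121 is configured to receive the voice search signal sent by the voice input module 11, normalize its amplitude, frequency, and phase to a unified amplitude, frequency, and phase, respectively, and send the normalized signal to the signal downsampling module 122;
the signal downsampling module 122 is configured to receive the signal sent by the signal normalization module 121, perform low-frequency sampling on it, and send the sampled signal to the anti-aliasing filter module 123;
the anti-aliasing filter module 123 is configured to receive the signal sent by the signal downsampling module 122, filter out its aliased frequency components, and send the filtered signal to the signal amplification module 124;
the signal amplification module 124 is configured to receive the signal sent by the anti-aliasing filter module 123, amplify it, and send the amplified signal to the endpoint detection module 125;
the endpoint detection module 125 is configured to receive the signal sent by the signal amplification module 124, determine the start point and end point of the valid speech in it, and send the valid speech signal to the noise filtering module 126;
the noise filtering module 126 is configured to receive the valid speech signal sent by the endpoint detection module 125, filter out the noise in it, and send the denoised signal to the matching module 13 for signal matching.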
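The source does not say how the endpoint detection module locates the start and end of valid speech. One common choice is short-time energy thresholding, sketched below under that assumption; the function name `detect_endpoints`, the frame length, and the energy threshold are illustrative values, not taken from the patent.

```python
def detect_endpoints(samples, frame_len=4, energy_threshold=0.01):
    """Return (start, end) sample indices of the active region, or None.

    `end` is exclusive. A frame counts as speech when its mean squared
    amplitude reaches `energy_threshold` (an assumed criterion).
    """
    active = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        if energy >= energy_threshold:
            active.append(start)
    if not active:
        return None
    return active[0], active[-1] + frame_len


# Silence, then a burst of "speech", then silence again.
signal = [0.0] * 8 + [0.5, -0.5, 0.4, -0.4, 0.6, -0.6, 0.5, -0.5] + [0.0] * 8
print(detect_endpoints(signal))
```

Only the samples between the two returned indices would then be passed on to noise filtering and feature extraction.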
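In the above solution, the unified amplitude, frequency, and phase are, respectively, a particular set amplitude, frequency, and phase within the range of human hearing.
The sampling frequency used in the low-frequency sampling is greater than twice the highest frequency of the sampled signal, which ensures that the sampling frequency is high enough.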
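The "more than twice the highest frequency" condition is the Nyquist criterion. The sketch below checks it and shows what happens when it is violated; the function names are hypothetical, and the 3400 Hz band limit is only an illustrative value for speech, not a figure from the source.

```python
def is_alias_free(sampling_hz, max_signal_hz):
    """True when the sampling rate exceeds twice the highest signal frequency."""
    return sampling_hz > 2.0 * max_signal_hz


def apparent_frequency(tone_hz, sampling_hz):
    """Frequency at which a pure tone appears after sampling at `sampling_hz`."""
    return abs(tone_hz - sampling_hz * round(tone_hz / sampling_hz))


max_speech_hz = 3400.0  # illustrative band limit, an assumption

print(is_alias_free(8000.0, max_speech_hz))       # rate is high enough
print(is_alias_free(6000.0, max_speech_hz))       # rate is too low
print(apparent_frequency(max_speech_hz, 6000.0))  # where the 3400 Hz tone lands
```

At 6000 Hz the 3400 Hz component folds back to 2600 Hz, which is the kind of aliased component the anti-aliasing filter module is meant to remove.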
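Preferably, the matching module 13 includes a feature extraction module 131, a similarity measurement module 132, and a voice message library module 133, wherein:
the feature extraction module 131 is configured to receive the preprocessed signal sent by the preprocessing module 12, extract its feature parameters, and send the extracted feature parameters to the similarity measurement module 132;
the similarity measurement module 132 is configured to receive the feature parameters sent by the feature extraction module 131, compute the similarity between them and the voice message feature parameters sent by the voice message library module 133, and send the voice messages whose similarity is greater than or equal to the threshold to the result output module 14;
the voice message library module 133 is configured to store the feature parameters of the voice messages and send the feature parameters of each voice message in turn to the similarity measurement module 132 for similarity computation.
In the above solution, the feature parameters include linear prediction coefficients, linear prediction cepstral coefficients, Mel-frequency cepstral coefficients, and the like.
Taking the Mel-frequency cepstral coefficients as an example, the feature parameters are extracted as follows:
The preprocessed signal is divided into frames and windowed, and a discrete Fourier transform is then applied to obtain the spectral distribution; the squared magnitude of the spectrum gives the energy spectrum; the energy spectrum is passed through a bank of Mel-scale triangular filters, and a discrete cosine transform yields the Mel-frequency cepstral coefficients; the Mel-frequency cepstral coefficients are then vector-quantized.
Here, the vector quantization of feature parameters such as the Mel-frequency cepstral coefficients can be implemented with the principal component analysis (PCA) method, the support vector machine (SVM) method, or the wavelet transform (WT) method.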
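The extraction steps above (framing, windowing, DFT, energy spectrum, Mel filter bank, DCT) can be sketched in pure Python as follows. Several details are assumptions not stated in the source: the Hamming window, the common Mel formula 2595·log10(1 + f/700), the logarithm taken before the DCT (standard practice for MFCCs), and every frame and filter size. Vector quantization is left out.

```python
import cmath
import math


def hz_to_mel(hz):
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def frames(signal, frame_len, hop):
    """Split the signal into overlapping frames."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]


def hamming(frame):
    n = len(frame)
    return [x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
            for i, x in enumerate(frame)]


def power_spectrum(frame):
    """Squared DFT magnitude for bins 0..N/2 (naive O(N^2) DFT)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame))) ** 2
            for k in range(n // 2 + 1)]


def mel_filterbank(num_filters, num_bins, sample_rate):
    """Triangular filters whose edges are evenly spaced on the Mel scale."""
    nyquist = sample_rate / 2.0
    top = hz_to_mel(nyquist)
    edges = [mel_to_hz(top * i / (num_filters + 1)) / nyquist * (num_bins - 1)
             for i in range(num_filters + 2)]
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for b in range(num_bins):
            if lo < b <= mid:
                row.append((b - lo) / (mid - lo))   # rising slope
            elif mid < b < hi:
                row.append((hi - b) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def dct(values, num_coeffs):
    """DCT-II, keeping the first `num_coeffs` coefficients."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (i + 0.5) / n)
                for i, v in enumerate(values))
            for k in range(num_coeffs)]


def mfcc(signal, sample_rate, frame_len=64, hop=32, num_filters=8, num_coeffs=5):
    bank = mel_filterbank(num_filters, frame_len // 2 + 1, sample_rate)
    result = []
    for frame in frames(signal, frame_len, hop):
        spec = power_spectrum(hamming(frame))
        energies = [sum(w * p for w, p in zip(row, spec)) for row in bank]
        log_energies = [math.log(e + 1e-10) for e in energies]  # assumed log step
        result.append(dct(log_energies, num_coeffs))
    return result


# A 440 Hz tone, 128 samples at 8000 Hz: 3 frames of 5 coefficients each.
tone = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(128)]
coeffs = mfcc(tone, 8000)
print(len(coeffs), len(coeffs[0]))
```

The naive DFT keeps the sketch dependency-free; a real implementation would use an FFT and much longer frames (typically 20 to 30 ms of audio).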
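In the above solution, the similarity may be computed with the Euclidean distance similarity method, the cosine similarity method, the Manhattan distance method, the grey relational degree method, or the like.
Taking the Euclidean distance similarity method as an example, the similarity is computed as:
$$d^2(X, Y) = \sum_{i=1}^{K} (X_i - Y_i)^2$$
where $X_i$ is the feature parameter vector of the signal, $Y_i$ is the feature parameter vector of one piece of voice information, $d^2(X, Y)$ is the Euclidean distance similarity, $\sum$ is the summation symbol, and $i$ takes the values 1, 2, 3, ..., K. The Euclidean distance similarity characterizes how similar the signal is to the voice information: the larger its value, the lower the similarity; the smaller its value, the higher the similarity.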
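Three of the measures named above can be sketched as follows, assuming the squared Euclidean form of the distance. Because a smaller distance means a closer match while the threshold test expects a larger value for a closer match, the sketch also includes one possible mapping from distance to similarity; `distance_to_similarity` is an assumed convention, not something the source specifies. The grey relational degree is omitted.

```python
import math


def squared_euclidean(x, y):
    """Sum of squared component differences."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def manhattan(x, y):
    """Sum of absolute component differences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def cosine_similarity(x, y):
    """Cosine of the angle between x and y; 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return dot / norm if norm else 0.0


def distance_to_similarity(distance):
    """Map a distance in [0, inf) to a similarity in (0, 1] (assumed convention)."""
    return 1.0 / (1.0 + distance)


x = [1.0, 2.0, 3.0]
y = [1.0, 2.0, 5.0]
d = squared_euclidean(x, y)
print(d, manhattan(x, y), distance_to_similarity(d))
```

With this mapping, identical feature vectors give a similarity of 1.0, so a threshold such as 0.8 corresponds to a squared distance of at most 0.25.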
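Fig. 2 is a schematic diagram of the structure of the voice message library module according to an embodiment of the present invention. As shown in Fig. 2, the module includes a voice message unit 133a, a preprocessing unit 133b, and a feature extraction unit 133c, wherein:
the voice message unit 133a is configured to store the recorded voice messages and send them to the preprocessing unit 133b for preprocessing;
the preprocessing unit 133b is configured to receive a voice message sent by the voice message unit 133a, preprocess it, and send the preprocessed signal to the feature extraction unit 133c;
the feature extraction unit 133c is configured to receive the preprocessed signal sent by the preprocessing unit 133b and extract feature parameters from it.
In the above solution, preprocessing a voice message consists of: normalizing the amplitude, frequency, and phase of the voice signal to a unified amplitude, frequency, and phase, respectively; performing low-frequency sampling on the normalized signal; filtering out the aliased frequency components in the low-frequency sampled signal; then amplifying the signal; determining the start point and end point of the valid speech in the amplified signal; and finally filtering out the noise in the valid speech signal.
In the above solution, the feature parameters include linear prediction coefficients, linear prediction cepstral coefficients, Mel-frequency cepstral coefficients, and the like.
Taking the Mel-frequency cepstral coefficients as an example, the feature parameters are extracted as follows:
The signal is divided into frames and windowed, and a discrete Fourier transform is then applied to obtain the spectral distribution; the squared magnitude of the spectrum gives the energy spectrum; the energy spectrum is passed through a bank of Mel-scale triangular filters; a discrete cosine transform yields the Mel-frequency cepstral coefficients; the Mel-frequency cepstral coefficients are then vector-quantized.
Here, the vector quantization of feature parameters such as the Mel-frequency cepstral coefficients can be implemented with the PCA method, the SVM method, or the WT method.
In the above solution, preprocessing the voice messages and extracting feature parameters from the preprocessed signals can be carried out in the background of the mobile terminal.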
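The library module can be read as a small feature cache: features are computed once when a message is stored and are later handed to the similarity measurement one message at a time. The sketch below assumes that reading. The class `VoiceMessageLibrary`, its method names, and the two placeholder functions are hypothetical, and `toy_features` merely stands in for real feature extraction such as MFCCs.

```python
class VoiceMessageLibrary:
    """Stores feature parameters per message and yields them one at a time."""

    def __init__(self, preprocess, extract_features):
        self._preprocess = preprocess
        self._extract = extract_features
        self._features = {}

    def add_message(self, msg_id, samples):
        """Run once when a message is stored, so that searches reuse the result."""
        self._features[msg_id] = self._extract(self._preprocess(samples))

    def iter_features(self):
        """Yield (msg_id, features) in storage order for similarity measurement."""
        yield from self._features.items()


def peak_normalize(samples):
    """Scale so that the largest absolute sample is 1.0 (placeholder preprocessing)."""
    peak = max((abs(s) for s in samples), default=0.0)
    return [s / peak for s in samples] if peak else list(samples)


def toy_features(samples):
    """Placeholder for real feature extraction: mean absolute amplitude and length."""
    return (sum(abs(s) for s in samples) / len(samples), len(samples))


library = VoiceMessageLibrary(peak_normalize, toy_features)
library.add_message("msg-001", [0.1, -0.2, 0.1])
library.add_message("msg-002", [0.5, 0.5])
stored = list(library.iter_features())
print(stored)
```

Doing the work in `add_message` means a search only has to process the short query signal, which is one way to realize the background processing mentioned above.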
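Fig. 3 is a schematic flowchart of the searching method implemented by the mobile terminal with a built-in voice message searching function according to an embodiment of the present invention. As shown in Fig. 3, the method includes the following steps:
Step 301: the mobile terminal records the user's voice search signal.
The voice search signal may be a keyword or a key sentence of a voice message.
Step 302: the mobile terminal preprocesses the voice search signal.
This step consists of: normalizing the amplitude, frequency, and phase of the voice signal to a unified amplitude, frequency, and phase, respectively; performing low-frequency sampling on the normalized signal; filtering out the aliased frequency components in the low-frequency sampled signal; then amplifying the signal; determining the start point and end point of the valid speech in the amplified signal; and finally filtering out the noise in the valid speech signal.
Here, the sampling frequency used in the low-frequency sampling is greater than twice the highest frequency of the signal.
Step 303: the mobile terminal extracts feature parameters from the preprocessed signal and computes the similarity between the extracted feature parameters and the feature parameters of the stored voice messages.
Preferably, starting from the start point of the valid speech in a voice message, the similarity between the extracted feature parameters and the speech feature parameters at that start point is computed. The comparison position is then moved back by one syllable at a time (for example, the syllable of the character "好"), and the similarity between the extracted feature parameters and the speech feature parameters at each new position is computed, until the end point of the valid speech in the voice message is reached. The maximum of the computed similarities is taken as the similarity of that voice message.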
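The sliding comparison in step 303 can be sketched as follows. The source shifts the comparison position by one syllable at a time; this sketch shifts by a fixed number of feature frames instead, which is an assumption, as are the names `window_similarity` and `best_match` and the mapping from distance to similarity.

```python
import math


def window_similarity(query, window):
    """Similarity derived from the mean per-frame Euclidean distance (assumed mapping)."""
    dist = sum(math.dist(q, w) for q, w in zip(query, window)) / len(query)
    return 1.0 / (1.0 + dist)


def best_match(query, message, step=1):
    """Slide the query over the message and keep the maximum similarity."""
    best = 0.0
    for start in range(0, len(message) - len(query) + 1, step):
        window = message[start:start + len(query)]
        best = max(best, window_similarity(query, window))
    return best


# Per-frame feature vectors (hypothetical); the query occurs at offset 1.
message = [(0.0, 0.0), (1.0, 2.0), (3.0, 1.0), (0.0, 0.0)]
query = [(1.0, 2.0), (3.0, 1.0)]
print(best_match(query, message))
```

Taking the maximum over all offsets lets a short keyword score highly even when it occupies only a small part of a long message.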
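The feature parameters include linear prediction coefficients, linear prediction cepstral coefficients, Mel-frequency cepstral coefficients, and the like.
Taking the Mel-frequency cepstral coefficients as an example, the feature parameters are extracted as follows:
The signal is divided into frames and windowed, and a discrete Fourier transform is then applied to obtain the spectral distribution; the squared magnitude of the spectrum gives the energy spectrum; the energy spectrum is passed through a bank of Mel-scale triangular filters; a discrete cosine transform yields the Mel-frequency cepstral coefficients; the Mel-frequency cepstral coefficients are then vector-quantized.
Here, the vector quantization of feature parameters such as the Mel-frequency cepstral coefficients can be implemented with the PCA method, the SVM method, or the WT method.
The similarity may be computed with the Euclidean distance similarity method, the cosine similarity method, the Manhattan distance method, the grey relational degree method, or the like.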
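Taking the Euclidean distance similarity method as an example, the similarity is computed as:
$$d^2(X, Y) = \sum_{i=1}^{K} (X_i - Y_i)^2$$
where $X_i$ is the feature parameter vector of the signal, $Y_i$ is the feature parameter vector of one piece of voice information, $d^2(X, Y)$ is the Euclidean distance similarity, $\sum$ is the summation symbol, and $i$ takes the values 1, 2, 3, ..., K. The Euclidean distance similarity characterizes how similar the signal is to the voice information: the larger its value, the lower the similarity; the smaller its value, the higher the similarity.
Step 304: the mobile terminal displays the voice messages whose similarity is greater than or equal to the threshold as a list on the screen of the mobile terminal.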
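Here, the list includes at least one voice message entry, and the entries are arranged vertically on the screen of the mobile terminal. Each voice message entry includes a voice message link identifier and may further include one or more of the creation time, the duration, and the size of the voice message. Within an entry, the link identifier, creation time, duration, and size are arranged horizontally on the screen.
Here, the threshold is a preset similarity threshold. A similarity greater than or equal to the threshold indicates that the voice message contains the voice search signal; a similarity below the threshold indicates that it does not.
When more than one voice message has a similarity greater than or equal to the threshold, this step further includes the mobile terminal prompting the user whether to search again.
Correspondingly, after the user confirms the repeated search, steps 301 to 304 are repeated. Here, the newly recorded voice search signal is a second keyword or a second key sentence, and the newly computed similarity is the similarity between the second keyword or second key sentence and the voice messages found by the previous search, that is, those whose similarity was greater than or equal to the threshold. The second keyword or second key sentence differs from the keyword or key sentence used in the previous search.
The above are merely preferred embodiments of the present invention and are not intended to limit the scope of protection of the present invention.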

Claims

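1. A mobile terminal with a built-in voice message searching function, the mobile terminal comprising a voice input module, a preprocessing module, a matching module, and a result output module, wherein:
the voice input module is configured to record a user's voice search signal and send the voice search signal to the preprocessing module for preprocessing;
the preprocessing module is configured to preprocess the voice search signal and send the preprocessed signal to the matching module for signal matching;
the matching module is configured to extract feature parameters from the preprocessed signal, compute the similarity between the extracted feature parameters and the feature parameters of stored voice messages, and send the voice messages whose similarity is greater than or equal to a threshold to the result output module;
the result output module is configured to display the voice messages whose similarity is greater than or equal to the threshold as a list on the screen of the mobile terminal.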
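2. The mobile terminal according to claim 1, wherein the result output module is further configured to prompt the user on the screen of the mobile terminal whether to search again when more than one voice message has a similarity greater than or equal to the threshold.
3. The mobile terminal according to claim 1, wherein the preprocessing module comprises a signal normalization module, a signal downsampling module, an anti-aliasing filter module, a signal amplification module, an endpoint detection module, and a noise filtering module, wherein:
the signal normalization module is configured to normalize the amplitude, frequency, and phase of the voice search signal to a unified amplitude, frequency, and phase, respectively, and send the normalized signal to the signal downsampling module;
the signal downsampling module is configured to perform low-frequency sampling on the normalized signal and send the sampled signal to the anti-aliasing filter module;
the anti-aliasing filter module is configured to filter out the aliased frequency components in the downsampled signal and send the filtered signal to the signal amplification module;
the signal amplification module is configured to amplify the signal from which the aliased frequency components have been removed and send the amplified signal to the endpoint detection module;
the endpoint detection module is configured to determine the start point and end point of the valid speech in the amplified signal and send the valid speech signal to the noise filtering module;
the noise filtering module is configured to filter out the noise in the valid speech signal and send the denoised signal to the matching module for signal matching.
4. The mobile terminal according to claim 3, wherein:
the unified amplitude, frequency, and phase are, respectively, a set amplitude, frequency, and phase within the range of human hearing;
the sampling frequency used in the low-frequency sampling is greater than twice the highest frequency of the sampled signal.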
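5. The mobile terminal according to any one of claims 1 to 4, wherein the matching module comprises a feature extraction module, a similarity measurement module, and a voice message library module, wherein:
the feature extraction module is configured to extract the feature parameters of the preprocessed signal and send the extracted feature parameters to the similarity measurement module;
the similarity measurement module is configured to compute the similarity between the extracted feature parameters and the voice message feature parameters sent by the voice message library module, and send the voice messages whose similarity is greater than or equal to the threshold to the result output module;
the voice message library module is configured to store the feature parameters of the voice messages and send the feature parameters of each voice message in turn to the similarity measurement module for similarity computation.
6. The mobile terminal according to claim 5, wherein:
the feature parameters include linear prediction coefficients, linear prediction cepstral coefficients, and Mel-frequency cepstral coefficients;
the similarity computation methods include the Euclidean distance similarity method, the cosine similarity method, the Manhattan distance method, and the grey relational degree method.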
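7. A searching method for a mobile terminal with a built-in voice message searching function, the method comprising the steps of:
recording a user's voice search signal and preprocessing the voice search signal;
extracting feature parameters from the preprocessed signal and computing the similarity between the extracted feature parameters and the feature parameters of stored voice messages;
displaying the voice messages whose similarity is greater than or equal to a threshold as a list on the screen of the mobile terminal.
8. The method according to claim 7, wherein preprocessing the voice search signal comprises the steps of:
normalizing the amplitude, frequency, and phase of the voice signal to a unified amplitude, frequency, and phase, respectively; performing low-frequency sampling on the normalized signal; filtering out the aliased frequency components in the low-frequency sampled signal and amplifying the result; determining the start point and end point of the valid speech in the amplified signal; and filtering out the noise in the valid speech signal.
9. The method according to claim 7 or 8, wherein, when more than one voice message has a similarity greater than or equal to the threshold, the method further comprises: the mobile terminal prompting the user whether to search again.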

PCT/CN2013/079091 2012-12-04 2013-07-09 Mobile terminal with a built-in voice message searching function and corresponding searching method WO2013167023A2 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP13788386.4A EP2919429A4 (en) 2012-12-04 2013-07-09 MOBILE TERMINAL INCORPORATING SHORT MESSAGE SEARCHING FUNCTION AND ASSOCIATED SEARCH METHOD
US14/649,658 US9992321B2 (en) 2012-12-04 2013-07-09 Mobile terminal with a built-in voice message searching function and corresponding searching method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201210512740.4A CN103856600B (zh) 2012-12-04 2012-12-04 Mobile terminal with a built-in voice message searching function and corresponding searching method
CN201210512740.4 2012-12-04

Publications (2)

Publication Number Publication Date
WO2013167023A2 true WO2013167023A2 (zh) 2013-11-14
WO2013167023A3 WO2013167023A3 (zh) 2013-12-27

Family

ID=49551349

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2013/079091 WO2013167023A2 (zh) 2012-12-04 2013-07-09 Mobile terminal with a built-in voice message searching function and corresponding searching method

Country Status (4)

Country Link
US (1) US9992321B2 (zh)
EP (1) EP2919429A4 (zh)
CN (1) CN103856600B (zh)
WO (1) WO2013167023A2 (zh)

Also Published As

Publication number Publication date
CN103856600B (zh) 2016-09-28
EP2919429A2 (en) 2015-09-16
WO2013167023A3 (zh) 2013-12-27
US20150319286A1 (en) 2015-11-05
US9992321B2 (en) 2018-06-05
EP2919429A4 (en) 2015-12-09
CN103856600A (zh) 2014-06-11

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 13788386

Country of ref document: EP

Kind code of ref document: A2

WWE Wipo information: entry into national phase

Ref document number: 14649658

Country of ref document: US

REEP Request for entry into the european phase

Ref document number: 2013788386

Country of ref document: EP

WWE Wipo information: entry into national phase

Ref document number: 2013788386

Country of ref document: EP