WO2008138257A1 - Voice recognition device and voice communication method - Google Patents

Voice recognition device and voice communication method

Info

Publication number
WO2008138257A1
Authority
WO
WIPO (PCT)
Prior art keywords
user
information
voice
access
parsing
Prior art date
Application number
PCT/CN2008/070906
Other languages
English (en)
Chinese (zh)
Inventor
Liang Liang
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2008138257A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3226 Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using a predetermined code, e.g. password, passphrase or PIN
    • H04L9/3231 Biological data, e.g. fingerprint, voice or retina
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/26 Devices for calling a subscriber
    • H04M1/27 Devices whereby a plurality of signals may be stored simultaneously
    • H04M1/274 Devices whereby a plurality of signals may be stored simultaneously with provision for storing more than one subscriber number at a time, e.g. using toothed disc
    • H04M1/2745 Devices whereby a plurality of signals may be stored simultaneously with provision for storing more than one subscriber number at a time, e.g. using toothed disc using static electronic memories, e.g. chips
    • H04M1/2753 Devices whereby a plurality of signals may be stored simultaneously with provision for storing more than one subscriber number at a time, e.g. using toothed disc using static electronic memories, e.g. chips providing data content
    • H04M1/2757 Devices whereby a plurality of signals may be stored simultaneously with provision for storing more than one subscriber number at a time, e.g. using toothed disc using static electronic memories, e.g. chips providing data content by data transmission, e.g. downloading
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers
    • H04M1/66 Substation equipment, e.g. for use by subscribers with means for preventing unauthorised or fraudulent calling
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M2250/00 Details of telephonic subscriber devices
    • H04M2250/74 Details of telephonic subscriber devices with voice recognition means

Definitions

  • The present invention relates to speech recognition technology, and more particularly to a speech recognition apparatus and a voice communication method for recognizing and processing speech in a network.
  • Background
  • Speech recognition technology has gradually matured, and recognition modules are integrated into many user terminals: the terminal recognizes the name of the party the user wants to call and automatically dials the called party.
  • However, the user terminal must integrate a complex recognition module whose algorithm changes quickly and is not easy to update on the terminal, which also increases the complexity and cost of the user terminal.
  • An embodiment of the present invention provides a voice recognition device coupled to at least one user terminal, the voice recognition device including:
  • a voice recognition analysis unit configured to identify voice information of the user;
  • a user information storage unit configured to store user information, including user contact information, user personal information, and user voice information; and
  • a user identification unit configured to identify a user, to determine whether the user is a legitimate user and whether the user has the right to access related data information, where the related data information includes one or more of the following: information of the legitimate user, information that the legitimate user can access, and information that other users are allowed to access.
  • An embodiment of the invention further provides a voice communication method, the method comprising:
  • performing a related operation based on the content of the voice information, the related operation including connecting a call for the user.
  • The calling number of an incoming call is identified. If it is a new number, the name of the calling party is parsed from the speech during the call and associated with the number or address information, so that the user can subsequently connect by simply speaking the name.
  • The above method may further involve a user identification unit, which can establish an association between the number or address information of the user terminal and the user information in order to identify whether the user terminal has the right to access related data.
  • Alternatively, the user identification unit may automatically recognize the identity of the user from the personal attributes of the user's voice and determine whether the user has the right to use the related data information.
  • The network device can establish an independent user information area for each user and protect the confidentiality of user information through access rights. Because each person's everyday contacts are relatively few, the recognition rate is greatly improved; and because the related user data is established per individual, a variety of services can be provided with great flexibility.
  • FIG. 1 is a schematic block diagram of an embodiment of the present invention.
  • FIG. 2 is a block diagram of an embodiment of the present invention applied to a digital subscriber line broadband access network.
  • FIG. 3 is a block diagram of an embodiment of the present invention applied to an Ethernet system.
  • FIG. 4 is a block diagram of an embodiment of the present invention applied to a switch-based network system.
  • FIG. 5 is a block diagram of an embodiment of the present invention applied to a distributed network.
  • Detailed description
  • A network virtual entity corresponding to an individual is established in the network; it stores the user's personal identity information and the information associated with the user.
  • The network recognizes the identity of the user and, at the same time, analyzes the user's voice to separate instruction information from non-instruction information. It then performs the related network operations according to the instruction or non-instruction information: for example, it connects a call for the user, executes the user's recording instruction, or performs other related operations.
  • FIG. 1 is a schematic block diagram of an embodiment of the present invention.
  • When terminal 1 accesses the network, it sends the voice instruction "connecting terminal 2" to the network. The voice recognition analysis unit in the network analyzes the feature information of the voice and parses the meaning of "connecting terminal 2".
  • The number information of "terminal 2" is then queried, and terminal 2 is connected to terminal 1.
  • The voice recognition analysis unit can also parse and execute other user instructions. For "recording", for example, the voice recognition analysis unit records the user's voice and further parses it into text information, which is stored on the network for the user to retrieve.
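The instruction handling described above can be illustrated with a minimal Python sketch. It assumes the speech has already been converted to text; the address book contents, the number format, and the function names `parse_instruction` and `handle` are hypothetical and are not taken from the patent.

```python
import re
from typing import Optional, Tuple

# Hypothetical per-user address book: spoken name -> number or address.
ADDRESS_BOOK = {"terminal 2": "tel:1002"}


def parse_instruction(text: str) -> Tuple[str, Optional[str]]:
    """Split recognized text into (action, argument)."""
    match = re.match(r"\s*connect(?:ing)?\s+(.+?)\s*$", text, re.IGNORECASE)
    if match:
        return "connect", match.group(1).lower()
    if text.strip().lower() == "recording":
        return "record", None
    return "none", None  # non-instruction speech


def handle(text: str) -> str:
    """Map a recognized utterance to the network operation to perform."""
    action, argument = parse_instruction(text)
    if action == "connect":
        number = ADDRESS_BOOK.get(argument)
        return f"dial {number}" if number else f"unknown contact: {argument}"
    if action == "record":
        return "start recording"
    return "no operation"
```

Calling `handle("connecting terminal 2")` looks up the stored number for "terminal 2" and returns the dial operation, while ordinary speech that matches no instruction results in no operation.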
  • The speech recognition analysis unit can also identify the pronunciation attributes of the user. If the user speaks Cantonese, the speech recognition analysis unit parses the user's voice with a Cantonese parsing template; parsing templates for other languages or dialects can be used in the same way.
  • The speech recognition analysis unit includes an adaptive algorithm that automatically matches the relevant speech parsing parameters or language parsing template to the user's pronunciation attributes, such as the user's voice or dialect.
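As a rough illustration of template matching by pronunciation attribute, the following Python sketch selects a parsing template from a detected attribute. The template names, the fallback to a default template, and the function `select_template` are assumptions made for this example; the patent does not specify how the adaptive algorithm works.

```python
from typing import Optional

# Hypothetical registry: pronunciation attribute -> parsing template name.
PARSING_TEMPLATES = {
    "cantonese": "cantonese-template",
    "mandarin": "mandarin-template",
}
DEFAULT_ATTRIBUTE = "mandarin"  # assumed fallback, not stated in the source


def select_template(pronunciation_attribute: Optional[str]) -> str:
    """Return the parsing template for the detected pronunciation attribute."""
    key = (pronunciation_attribute or "").strip().lower()
    return PARSING_TEMPLATES.get(key, PARSING_TEMPLATES[DEFAULT_ATTRIBUTE])
```

A real system would presumably score the utterance against several templates instead of relying on a single label, but the lookup shows where the per-user attribute enters the parsing step.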
  • The user identification unit can be used to identify whether the user has the right to access related data information.
  • The user identification unit may store the number or address information of the user's terminal, such as that of a VoIP phone, or an attribute of the user's terminal, such as the device identifier of the terminal. The user identification unit can use this information to identify whether the user has the right to access the relevant user information.
  • The user identification unit may also store personal attributes of the user's voice, such as voice quality features or voiceprint feature information. It can use these attributes to identify whether the user has the right to access related data information, or use them as an index to match the user information that should be accessed.
  • The related data information may be one or more of: information of the legitimate user, information that the legitimate user can access, and information that other users are allowed to access.
  • The user's voice quality features or voiceprint feature information can be entered into the system when the user registers.
  • Table 1 associates user voiceprint feature information with user information.
  • The personal information of the user in Table 1 may be the user's name and telephone number; address information, such as that of a VoIP phone; an attribute of the terminal, such as its device identifier; or network access link information, such as the virtual local area network (VLAN) or port number.
  • Because the user information is associated with the user's voiceprint feature information, the network can authenticate the user's voice when the user uses the network and decide whether the user may access the network, use a service, or access the user's personal data, that is, the user information.
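The access decision described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the record layout, the identifiers such as `tel:1001`, and the functions `identify_by_terminal` and `may_access` are hypothetical, and only terminal-based identification is shown.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class UserRecord:
    name: str
    # Numbers, addresses, or device identifiers of the user's terminals.
    terminal_ids: Set[str]
    # Other users whose information this user is allowed to access.
    readable_owners: Set[str] = field(default_factory=set)


# Hypothetical user information store.
USERS = {
    "li_jian": UserRecord("Li Jian", {"tel:1001", "dev:AA-01"}, {"zhang_feng"}),
    "zhang_feng": UserRecord("Zhang Feng", {"tel:1002"}),
}


def identify_by_terminal(terminal_id: str) -> Optional[str]:
    """Return the user id associated with a terminal, or None if unknown."""
    for user_id, record in USERS.items():
        if terminal_id in record.terminal_ids:
            return user_id
    return None


def may_access(user_id: Optional[str], owner_id: str) -> bool:
    """A legitimate user may read their own data and data shared with them."""
    if user_id is None or user_id not in USERS:
        return False
    return user_id == owner_id or owner_id in USERS[user_id].readable_owners
```

An unknown terminal yields no user id, so `may_access` refuses every request from it; a voiceprint-based identifier could return the same user id and reuse the same access check.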
  • The speech recognition analysis unit can also build a fast matching index from the relationships in Table 2: it indexes the user's historical voiceprint feature information, such as waveform data, and compares it with the voiceprint feature information the user inputs this time.
  • As in Table 2, when the user calls Wang Yikui, the input waveform data is first compared with the stored voiceprint feature information for "Li Jian", "Zhang Feng" and "Wang Yikui". Because the amount of data to search is small, the corresponding waveform data can be matched quickly. Recognition of other data can be set up in the same way. It should be understood that the form of Table 2 and the related waveform data are only examples and are not limiting.
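The benefit of searching only the user's own contacts can be illustrated with a small Python sketch. It assumes each stored entry is a numeric feature vector and uses cosine similarity with a fixed threshold; the patent does not name a comparison method, so the similarity measure, the threshold of 0.9, and the vectors themselves are illustrative assumptions.

```python
import math
from typing import Dict, List, Optional


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def best_match(features: List[float],
               candidates: Dict[str, List[float]],
               threshold: float = 0.9) -> Optional[str]:
    """Compare the input only against this user's stored entries."""
    best_name, best_score = None, threshold
    for name, stored in candidates.items():
        score = cosine(features, stored)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Hypothetical stored features for the three contacts named in the text.
CONTACT_FEATURES = {
    "Li Jian": [1.0, 0.0, 0.2],
    "Zhang Feng": [0.0, 1.0, 0.1],
    "Wang Yikui": [0.3, 0.3, 1.0],
}
```

Because the candidate set holds only three entries, the loop finishes after three comparisons; an input that resembles none of the stored entries falls below the threshold and yields no match.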
  • The user's address book can be built in several ways, such as importing the address book from the user's original terminal in one pass, entering it on the service website after the user logs in, or collecting it automatically.
  • For automatic collection, the calling number of an incoming call is identified. If it is a new number, the name of the calling party is parsed from the speech during the call and associated with the number or address information, so that the called user can subsequently connect by simply speaking the name.
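A minimal Python sketch of the automatic collection step follows. It assumes the caller's name has already been parsed from the conversation by the recognition unit; the function name `register_caller` and the dictionary-based address book are hypothetical.

```python
from typing import Dict, Optional


def register_caller(address_book: Dict[str, str],
                    number: str,
                    parsed_name: Optional[str]) -> bool:
    """Associate a parsed caller name with a new number.

    Returns True if an entry was added, False if the number was already
    known or no name could be parsed from the call.
    """
    if number in address_book.values():
        return False  # not a new number
    if not parsed_name:
        return False  # nothing to associate yet
    address_book[parsed_name] = number
    return True
```

After a successful registration the name can serve as the key for later spoken dialing, in the same way as imported address book entries.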
  • FIG. 2 is a block diagram of an embodiment of the present invention applied to a digital subscriber line (xDSL) broadband access network.
  • The control center holds the user information, which includes information such as the user's personal address book.
  • The user identification unit is coupled to a Digital Subscriber Line Access Multiplexer (DSLAM) device or a Broadband Access Server (BAS) (not shown).
  • The voice recognition analysis unit is coupled to the DSLAM device or the BAS (not shown). The voice recognition analysis unit can convert the user's voice signal and extract feature data for local recognition; it can also convert the user's voice signal, extract the feature data, and transfer the feature data to the control center for recognition.
  • At least one of the user identification unit and the voice recognition analysis unit may be disposed at …
  • The control center completes the relevant processing and controls the softswitch center to make the related connections.
  • The control center includes user information, which includes contact information such as the user's personal address book.
  • The gateway device is coupled to the user identification unit, which authenticates the user and verifies the user's authority.
  • The gateway device is coupled to the voice recognition analysis unit. The voice recognition analysis unit can convert the user's voice signal and extract feature data for local recognition; it can also convert the user's voice signal, extract the feature data, and transfer the feature data to the control center for recognition.
  • At least one of the user identification unit and the voice recognition analysis unit may be disposed on the gateway device to enhance the function of the device.
  • FIG. 4 is a block diagram of a network system applied to a switch according to an embodiment of the present invention, where a voice recognition analysis unit and a subscriber identity unit are coupled to a switch. At least one unit of the subscriber identity unit and the voice recognition analysis unit may be disposed on the switch to enhance the function of the device.
  • In the embodiments above, the voice recognition analysis unit and the user identification unit are separate from the control center and are each coupled to a network device close to the user terminal. They may also be integrated with the control center or centralized in the server center.
  • The user identification unit can also be placed on a work platform connected to the Internet or on a work platform located on the core network.
  • User information, such as the user's contact information, personal information, and voice information, may be distributed or stored in the user information storage unit, such as a controller.
  • With the voice recognition module integrated in the network, the identified user instruction is used to look up the telephone number of the contact in the user's personal address book and to dial and connect the call for the user, or to execute the user's other instructions. The key point is the association established between the user's personal characteristic information and the personal address book, which greatly improves the recognition rate and the efficiency of execution.
  • A centralized recognition module array or recognition module group can be established in the server center for centralized speech recognition processing.
  • FIG. 5 is a block diagram of speech recognition in a distributed network according to an embodiment of the present invention.
  • The original speech or speech data is sampled and its features are extracted. The sampling function can be integrated into the terminal, and the feature extraction function can be integrated in the speech recognition analysis unit. The endpoint detection module detects the specific endpoints used for user identification, and the recognition module can adopt various algorithms.
  • The recognition module can be divided into multiple levels. For simple commands and functions with low security requirements, recognition can be performed by the execution unit itself. Commands or functions that are complex or have high security requirements can be identified and authenticated at multiple levels, and can also be authenticated based on combined personal characteristic data, before the related operations are performed.
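The multi-level scheme can be pictured with a short Python sketch in which low-security commands need a single successful check and high-security commands need the combined checks. The two-level split, the command names, and the function `authorize` are assumptions chosen for illustration; the patent does not define concrete levels or policies.

```python
from enum import IntEnum


class Level(IntEnum):
    LOW = 1
    HIGH = 2


# Hypothetical mapping from command to required security level.
COMMAND_LEVEL = {
    "recording": Level.LOW,
    "read address book": Level.HIGH,
}


def authorize(command: str, terminal_ok: bool, voiceprint_ok: bool) -> bool:
    """Decide whether a command may run, given the identification results.

    Unknown commands are treated as high security (an assumption).
    """
    level = COMMAND_LEVEL.get(command, Level.HIGH)
    if level is Level.LOW:
        # A single successful identification is enough.
        return terminal_ok or voiceprint_ok
    # High security: require the combined personal characteristic checks.
    return terminal_ok and voiceprint_ok
```

Treating unknown commands as high security keeps the default conservative; a deployment could add further levels or additional factors without changing the shape of the check.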

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Multimedia (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention relates to a voice recognition device coupled to at least one user terminal, comprising a voice analysis unit for identifying user data; a user data storage unit for storing user data, including contact data, personal data and voice data of the user; and a voice recognition unit for identifying the user in order to judge whether the user is a valid user and whether the user is authorized to access associated data. The associated data comprise one or more of the following: personal data of the valid user, data that the valid user can access, and data that another user can access. A voice communication method is also provided.
PCT/CN2008/070906 2007-05-14 2008-05-08 Voice recognition device and voice communication method WO2008138257A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN200710074460.9 2007-05-14
CN2007100744609A CN101308654B (zh) 2007-05-14 2007-05-14 Speech analysis and recognition method, system and apparatus

Publications (1)

Publication Number Publication Date
WO2008138257A1 (fr) 2008-11-20

Family

ID=40001699

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2008/070906 WO2008138257A1 (fr) 2007-05-14 2008-05-08 Voice recognition device and voice communication method

Country Status (2)

Country Link
CN (1) CN101308654B (fr)
WO (1) WO2008138257A1 (fr)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH07162453A (ja) * 1993-12-13 1995-06-23 Nec Corp Electronic mail system
JPH0936980A (ja) * 1995-07-19 1997-02-07 Murata Mach Ltd Communication terminal device
CN1244984A (zh) * 1996-11-22 2000-02-16 T-Netix, Inc. Voice recognition for information system access and transaction processing
CA2256781A1 (fr) * 1998-09-14 2000-03-14 Northern Telecom Limited Method and apparatus for automatically dialing a desired telephone number using voice commands
CN1611056A (zh) * 2001-09-04 2005-04-27 李文燮 Automatic voice call connection service method using a personal phone book database constructed through voice recognition

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE10050360B4 (de) * 2000-10-11 2004-12-09 Siemens Ag Method for activating and/or deactivating services in a switching system

Also Published As

Publication number Publication date
CN101308654A (zh) 2008-11-19
CN101308654B (zh) 2012-11-07

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08734261

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 08734261

Country of ref document: EP

Kind code of ref document: A1