CN105391873A - Method for realizing local voice recognition in mobile device - Google Patents
- Publication number
- CN105391873A (application CN201510834406.4A)
- Authority
- CN
- China
- Prior art keywords
- mobile device
- nonvolatile memory
- local voice
- distinguishing
- voice
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
- H04M1/724—User interfaces specially adapted for cordless or mobile telephones
- H04M1/72403—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
- H04M1/7243—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
- H04M1/72433—User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages for voice messaging, e.g. dictaphones
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/06—Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/065—Adaptation
- G10L15/07—Adaptation to the speaker
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04M—TELEPHONIC COMMUNICATION
- H04M1/00—Substation equipment, e.g. for use by subscribers
- H04M1/72—Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
- H04M1/725—Cordless telephones
- H04M1/73—Battery saving arrangements
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D30/00—Reducing energy consumption in communication networks
- Y02D30/70—Reducing energy consumption in communication networks in wireless communication networks
Landscapes
- Engineering & Computer Science (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Multimedia (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Business, Economics & Management (AREA)
- General Business, Economics & Management (AREA)
- Artificial Intelligence (AREA)
- Telephone Function (AREA)
Abstract
The invention discloses a method for implementing local intelligent speech recognition in a mobile handheld device. A 3D nonvolatile memory locally stores the speech database information for each device user together with an artificial neural network learning database. The 3D nonvolatile memory technology is characterized not by chip stacking or 3D packaging but by memory cells fabricated with a 3D process, which allows high storage density. Because the per-user speech database information and the artificial neural network learning database are stored locally in the 3D nonvolatile memory, the device no longer has to exchange data with a cloud data center over a network, which greatly improves the response speed of speech recognition and safeguards security of use. To further reduce power consumption, a baseband processor responds to user voice commands while the application processor and memory, which suffer from heavy leakage current, remain dormant.
Description
Technical field
The present invention relates to the field of speech recognition, and in particular to a method for implementing local speech recognition in a mobile device.
Background art
With the continued deepening of research on artificial neural networks (ANNs), modern science and technology have made great progress in artificial intelligence. Fields such as pattern recognition, intelligent robotics, automatic control, and speech recognition all exhibit good intelligent characteristics. Among these, speech recognition is expected to replace key input and become the next development direction for mobile handheld devices.
Speech recognition requires a speech database capable of storing very large volumes of data, artificial neural network learning likewise requires mass storage, and the data processing demands are high. Intelligent speech recognition is therefore generally implemented in cloud data centers. The storage capacity and data processing capability of current mobile handheld devices (for example, mobile phones and tablet computers) are quite limited, which makes intelligent speech recognition difficult to implement on the device itself.
The speech database of a cloud data center serves the general population and cannot provide an individual speech database tailored to a particular person's accent, intonation, wording habits, and speaking rate. The recognition accuracy of a cloud data center's speech database therefore varies from one individual to another.
Figure 1 shows the basic flow by which a mobile handheld device processes speech. The device receives the user's speech data and sends it over the network to a cloud data center. After the data center processes the speech data, it sends the parsed command back to the device over the network, and the device responds according to that command. Real-time network transmission speed therefore determines whether the device can respond quickly. In a typical mobile handheld device, speech recognition and processing are performed by the device's internal application processor and carried out in memory. The application processor and memory must therefore remain powered on in order to respond to voice commands promptly, which greatly increases power consumption. To ensure adequate battery life, the device needs a high-capacity battery, which inevitably increases its cost.
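The dependence of the cloud flow on network speed can be made concrete with a small latency model. The following Python sketch is illustrative only: the class and function names and all millisecond figures are assumptions chosen for the example and do not appear in the patent.

```python
from dataclasses import dataclass


@dataclass
class CloudPath:
    """One round trip through the cloud flow of Figure 1 (all values hypothetical)."""
    uplink_ms: float      # sending the speech data to the cloud data center
    processing_ms: float  # recognition and parsing at the data center
    downlink_ms: float    # sending the parsed command back to the device

    def total_ms(self) -> float:
        return self.uplink_ms + self.processing_ms + self.downlink_ms


def response_latency_ms(path: CloudPath, device_action_ms: float) -> float:
    """Time from the end of the utterance until the device has acted on the command."""
    return path.total_ms() + device_action_ms


# Same data center processing time, different network conditions.
fast_network = CloudPath(uplink_ms=40.0, processing_ms=100.0, downlink_ms=20.0)
slow_network = CloudPath(uplink_ms=400.0, processing_ms=100.0, downlink_ms=200.0)
```

With these assumed figures, only the two network terms differ between the scenarios, which is the point the text makes: the response time of the cloud flow is bounded below by the network round trip regardless of how fast the data center is.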
Those skilled in the art are therefore working to develop a method for implementing local intelligent speech recognition in a mobile handheld device that improves recognition accuracy, speeds up voice response, and reduces power consumption.
Summary of the invention
In view of the above defects of the prior art, the technical problem to be solved by the present invention is how to implement speech recognition locally on a mobile device so as to improve recognition accuracy and speed up response.
To achieve the above object, the invention provides a method for implementing local speech recognition in a mobile device in which, based on a 3D nonvolatile memory, a speech database and an artificial neural network learning database are established in the local storage of the mobile device.
Further, the speech database learns from, analyzes, and stores each device user's accent, intonation, wording habits, and speaking rate.
Further, the mobile device is a mobile phone or a tablet computer.
Further, a baseband processor module is configured to perform intelligent recognition of user speech, including responding to user voice commands.
Further, the baseband processor module is integrated with the 3D nonvolatile memory, the baseband processor module being fabricated on the silicon substrate of the 3D nonvolatile memory.
Further, the mobile device is configured to perform local intelligent speech recognition whether or not the screen is lit.
Further, the 3D nonvolatile memory is one whose memory cell array is fabricated with a 3D process.
Further, the silicon substrate of the 3D nonvolatile memory is bulk silicon or silicon-on-insulator.
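The per-user speech database described above might be organized along the following lines. This is a minimal Python sketch under stated assumptions: the patent does not specify a data layout or a learning algorithm, so the class names, the fields, and the running-mean update used here are hypothetical stand-ins for the artificial neural network learning it mentions.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class UserSpeechProfile:
    """Hypothetical per-user record: speaking rate and wording habits."""
    samples: int = 0
    mean_rate_wps: float = 0.0  # running mean of speaking rate, words per second
    phrase_counts: Dict[str, int] = field(default_factory=dict)

    def learn(self, phrase: str, duration_s: float) -> None:
        """Update the profile from one utterance of known text and duration."""
        if duration_s <= 0:
            raise ValueError("duration_s must be positive")
        rate = len(phrase.split()) / duration_s
        self.samples += 1
        # Incremental mean: avoids storing every past utterance.
        self.mean_rate_wps += (rate - self.mean_rate_wps) / self.samples
        self.phrase_counts[phrase] = self.phrase_counts.get(phrase, 0) + 1


class LocalSpeechDatabase:
    """Hypothetical on-device store holding one profile per device user."""

    def __init__(self) -> None:
        self._profiles: Dict[str, UserSpeechProfile] = {}

    def profile_for(self, user_id: str) -> UserSpeechProfile:
        return self._profiles.setdefault(user_id, UserSpeechProfile())
```

Keeping one profile per user is what allows recognition to adapt to an individual, in contrast to the shared cloud database described in the background section.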
The invention proposes a method for implementing local intelligent speech recognition in a mobile handheld device, using a 3D nonvolatile memory to store locally the speech database information for each device user and an artificial neural network learning database. The mobile handheld device may be a mobile phone, a tablet computer, or the like. The 3D nonvolatile memory technology of the invention is characterized not by chip stacking or 3D packaging but by memory cells fabricated with a 3D process, which allows very high storage density.
Figure 2 is a structural diagram of the 3D nonvolatile memory of the invention. In the figure, 1 is the storage array of the 3D nonvolatile memory, which stores locally the speech database information for each device user and the artificial neural network learning database; 2 is the silicon substrate, which may be bulk silicon or silicon-on-insulator and implements the peripheral logic circuits of the 3D nonvolatile memory (for example, decoding circuits, read/write circuits, control circuits, and input/output circuits). In addition, this very high density 3D nonvolatile memory (NVM) can replace the storage chip (generally a NAND flash chip) in a conventional mobile handheld device.
By storing the speech database information for each device user and the artificial neural network learning database locally in the 3D nonvolatile memory, the invention removes the step in which the mobile handheld device exchanges data with a cloud data center over a network. This greatly improves the response speed of speech recognition and also safeguards security of use. Because the data is specific to a given user, the device can learn from, analyze, and store each user's accent, intonation, wording habits, and speaking rate, and can therefore recognize that user's speech more accurately. To further reduce power consumption, the baseband processor can respond to user voice commands while the application processor and memory, which suffer from heavy leakage current, remain dormant.
To reduce power consumption further and improve response speed, the invention can also integrate the 3D nonvolatile memory with the baseband processor, as shown in Figure 3, in which (1) is a three-dimensional view of the integrated 3D nonvolatile memory and baseband processor and (2) is a cross-sectional view. The storage array of the 3D nonvolatile memory sits above the silicon wafer. The silicon substrate implements not only the peripheral logic circuits of the 3D nonvolatile memory (for example, decoding circuits, read/write circuits, control circuits, and input/output circuits) but also the baseband processor logic. Integrating the 3D nonvolatile memory and the baseband processor on a single chip greatly improves silicon utilization and reduces manufacturing cost; it also further improves the response speed of speech recognition and further saves power.
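The power argument above can be illustrated with a simple duty-cycle calculation. The Python sketch below is a rough model only: the milliwatt figures and the 12.5% wake fraction are hypothetical values chosen for the example, and none of them comes from the patent.

```python
def average_power_mw(active_mw: float, dormant_mw: float, active_fraction: float) -> float:
    """Time-weighted average power of a component that is active part of the time."""
    if not 0.0 <= active_fraction <= 1.0:
        raise ValueError("active_fraction must be between 0 and 1")
    return active_mw * active_fraction + dormant_mw * (1.0 - active_fraction)


# Hypothetical figures, for illustration only.
AP_AND_MEMORY_ACTIVE_MW = 300.0   # application processor plus memory, powered on
AP_AND_MEMORY_DORMANT_MW = 2.0    # the same components in the dormant state
BASEBAND_LISTENING_MW = 20.0      # baseband processor waiting for voice commands

# Conventional approach: application processor and memory stay on all the time.
always_on_mw = average_power_mw(AP_AND_MEMORY_ACTIVE_MW, AP_AND_MEMORY_DORMANT_MW, 1.0)

# Approach described above: the baseband processor listens continuously and the
# application processor and memory are awake only a fraction of the time.
baseband_first_mw = BASEBAND_LISTENING_MW + average_power_mw(
    AP_AND_MEMORY_ACTIVE_MW, AP_AND_MEMORY_DORMANT_MW, 0.125
)
```

Under these assumptions the saving comes almost entirely from the time the application processor and memory spend dormant, so the benefit shrinks as the wake fraction approaches 1.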
The method of the invention for implementing local intelligent speech recognition in a mobile handheld device therefore stores the speech database information for each device user and the artificial neural network learning database in a local, high-capacity 3D nonvolatile memory, which improves recognition accuracy, speeds up voice response, and reduces power consumption. Furthermore, integrating the 3D nonvolatile memory and the baseband processor on a single chip greatly improves silicon utilization and reduces manufacturing cost.
The concept, specific structure, and technical effects of the invention are further described below with reference to the accompanying drawings, so that its objects, features, and effects can be fully understood.
Brief description of the drawings
Fig. 1 is a functional diagram of a prior-art mobile device that relies on the cloud for speech recognition;
Fig. 2 is a structural diagram of the 3D nonvolatile memory of a preferred embodiment of the invention;
Fig. 3 shows a three-dimensional view and a cross-sectional view of the integrated 3D nonvolatile memory and baseband processor of a preferred embodiment of the invention;
Fig. 4 is a diagram of the mobile device of a preferred embodiment of the invention responding to a user's voice operation.
Detailed description
A specific example is described below.
Conventional mobile phones support only key operation. If a user who is driving suddenly wants to check an email, he or she must pick up the phone, light the screen with a key press, find the mailbox, and then open the desired message with further key presses. Doing this while driving is very dangerous. With the method of the invention for implementing local intelligent speech recognition in a mobile handheld device, things are much simpler: the user simply operates the phone by voice rather than through key presses. As shown in Figure 4, when the phone responds to the user's voice operation, the baseband processor looks up, analyzes, and matches the input against the speech database and the artificial neural network learning database in the local 3D nonvolatile memory. The screen may light up, and the phone then quickly carries out the corresponding voice operation and reports the requested email information back to the user in spoken form. The local speech recognition method of the invention thus requires no key operation and is faster, safer, and more power-efficient. Besides checking email, this intelligent voice operation can also be used to make phone calls, read or reply to text messages, enter voice passwords, play music, read articles aloud, and so on, and is widely applicable in everyday life.
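The flow of Figure 4 (receive a voice command, match it against the local database, optionally light the screen, then respond) can be sketched as follows. This Python sketch makes several assumptions that are not in the patent: the input is taken to be an already transcribed string, approximate string matching from the standard `difflib` module stands in for the neural-network matching, and the class name, command table, and reply texts are hypothetical.

```python
import difflib
from typing import Callable, Dict, Optional


class BasebandVoiceHandler:
    """Hypothetical handler standing in for the baseband processor of Figure 4."""

    def __init__(self, commands: Dict[str, Callable[[], str]]) -> None:
        self.commands = commands  # stand-in for the locally stored speech database
        self.screen_on = False

    def handle(self, transcript: str) -> Optional[str]:
        """Match the utterance locally; return a spoken reply or None if unmatched."""
        matches = difflib.get_close_matches(
            transcript.lower(), list(self.commands), n=1, cutoff=0.6
        )
        if not matches:
            return None  # nothing recognized: the rest of the device stays dormant
        self.screen_on = True  # lighting the screen is optional in the patent
        return self.commands[matches[0]]()


handler = BasebandVoiceHandler({
    "check mail": lambda: "You have two new messages.",
    "play music": lambda: "Playing music.",
})
reply = handler.handle("Check Mail")
```

No network call appears anywhere in the handler, which reflects the central claim of the method: lookup and matching happen entirely against locally stored data.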
The preferred embodiments of the invention have been described in detail above. It should be understood that those of ordinary skill in the art can make many modifications and variations according to the concept of the invention without creative effort. Therefore, any technical solution that a person skilled in the art can obtain on the basis of the prior art through logical analysis, reasoning, or limited experimentation under the concept of the invention shall fall within the scope of protection determined by the claims.
Claims (8)
1. A method for implementing local speech recognition in a mobile device, characterized in that, based on a 3D nonvolatile memory, a speech database and an artificial neural network learning database are established in the local storage of the mobile device.
2. The method for implementing local speech recognition in a mobile device as claimed in claim 1, characterized in that the speech database learns from, analyzes, and stores each device user's accent, intonation, wording habits, and speaking rate.
3. The method for implementing local speech recognition in a mobile device as claimed in claim 1, characterized in that the mobile device is a mobile phone or a tablet computer.
4. The method for implementing local speech recognition in a mobile device as claimed in claim 3, characterized in that a baseband processor module is configured to perform intelligent recognition of user speech, including responding to user voice commands.
5. The method for implementing local speech recognition in a mobile device as claimed in claim 3, characterized in that a baseband processor module is integrated with the 3D nonvolatile memory, the baseband processor module being fabricated on the silicon substrate of the 3D nonvolatile memory.
6. The method for implementing local speech recognition in a mobile device as claimed in any one of claims 3 to 5, characterized in that the mobile device is configured to perform local intelligent speech recognition whether or not the screen is lit.
7. The method for implementing local speech recognition in a mobile device as claimed in any one of claims 1 to 5, characterized in that the 3D nonvolatile memory is one whose memory cell array is fabricated with a 3D process.
8. The method for implementing local speech recognition in a mobile device as claimed in any one of claims 1 to 5, characterized in that the silicon substrate of the 3D nonvolatile memory is bulk silicon or silicon-on-insulator.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510834406.4A CN105391873A (en) | 2015-11-25 | 2015-11-25 | Method for realizing local voice recognition in mobile device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510834406.4A CN105391873A (en) | 2015-11-25 | 2015-11-25 | Method for realizing local voice recognition in mobile device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105391873A (en) | 2016-03-09 |
Family
ID=55423696
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510834406.4A Pending CN105391873A (en) | 2015-11-25 | 2015-11-25 | Method for realizing local voice recognition in mobile device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105391873A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107919120A (en) * | 2017-11-16 | 2018-04-17 | 百度在线网络技术(北京)有限公司 | Voice interactive method and device, terminal, server and readable storage medium storing program for executing |
CN108172218A (en) * | 2016-12-05 | 2018-06-15 | 中国移动通信有限公司研究院 | A kind of pronunciation modeling method and device |
CN110874343A (en) * | 2018-08-10 | 2020-03-10 | 北京百度网讯科技有限公司 | Method for processing voice based on deep learning chip and deep learning chip |
CN112417129A (en) * | 2021-01-22 | 2021-02-26 | 北京新广视通科技有限公司 | Intelligent quick response method and equipment with AI learning function |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6035270A (en) * | 1995-07-27 | 2000-03-07 | British Telecommunications Public Limited Company | Trained artificial neural networks using an imperfect vocal tract model for assessment of speech signal quality |
CN1272198A (en) * | 1997-07-04 | 2000-11-01 | 三星电子株式会社 | Digital cellular phone with voice recognition function and method for controlling same |
CN1543640A (en) * | 2001-06-14 | 2004-11-03 | �����ɷ� | Method and apparatus for transmitting speech activity in distributed voice recognition systems |
CN102800695A (en) * | 2011-05-24 | 2012-11-28 | 爱思开海力士有限公司 | 3-dimensional non-volatile memory device and method of manufacturing the same |
CN103187053A (en) * | 2011-12-31 | 2013-07-03 | 联想(北京)有限公司 | Input method and electronic equipment |
CN103514879A (en) * | 2013-09-18 | 2014-01-15 | 广东欧珀移动通信有限公司 | Local voice recognition method based on BP neural network |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108172218A (en) * | 2016-12-05 | 2018-06-15 | 中国移动通信有限公司研究院 | A kind of pronunciation modeling method and device |
CN107919120A (en) * | 2017-11-16 | 2018-04-17 | 百度在线网络技术(北京)有限公司 | Voice interactive method and device, terminal, server and readable storage medium storing program for executing |
US10811010B2 (en) | 2017-11-16 | 2020-10-20 | Baidu Online Network Technology (Beijing) Co., Ltd. | Voice interaction method and apparatus, terminal, server and readable storage medium |
CN110874343A (en) * | 2018-08-10 | 2020-03-10 | 北京百度网讯科技有限公司 | Method for processing voice based on deep learning chip and deep learning chip |
CN110874343B (en) * | 2018-08-10 | 2023-04-21 | 北京百度网讯科技有限公司 | Method for processing voice based on deep learning chip and deep learning chip |
CN112417129A (en) * | 2021-01-22 | 2021-02-26 | 北京新广视通科技有限公司 | Intelligent quick response method and equipment with AI learning function |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
TWI692751B (en) | Voice wake-up method, device and electronic equipment | |
CN105391873A (en) | Method for realizing local voice recognition in mobile device | |
CN103632165B (en) | A kind of method of image procossing, device and terminal device | |
EP3087801B1 (en) | Interchangable charm messaging wearable electronic device for wireless communication | |
CN107924288A (en) | Electronic equipment and its method for carrying out perform function using speech recognition | |
CN106020763B (en) | For providing the method and electronic equipment of content | |
CN108242235A (en) | Electronic equipment and its audio recognition method | |
CN104102410B (en) | Method and apparatus for showing the screen of mobile terminal device | |
CN105810194B (en) | Speech-controlled information acquisition methods and intelligent terminal under standby mode | |
CN106030440A (en) | Smart circular audio buffer | |
CN103870547A (en) | Grouping processing method and device of contact persons | |
CN104765446A (en) | Electronic device and method of controlling electronic device | |
US11381527B2 (en) | Information prompt method and apparatus | |
CN103155428A (en) | Apparatus and method for adaptive gesture recognition in portable terminal | |
CN107809542A (en) | application control method, device, storage medium and electronic equipment | |
CN108494947A (en) | A kind of images share method and mobile terminal | |
CN104700842A (en) | Sound signal time delay estimation method and device | |
CN106201427A (en) | A kind of application program launching method and terminal unit | |
CN107483751A (en) | Terminal device and its power energy allocation method, computer-readable recording medium | |
CN103189853A (en) | Method and apparatus for providing efficient context classification | |
CN110334334A (en) | A kind of abstraction generating method, device and computer equipment | |
CN107256334A (en) | recipe matching method and related product | |
CN104992715A (en) | Interface switching method and system of intelligent device | |
CN106534528A (en) | Processing method and device of text information and mobile terminal | |
CN103871050A (en) | Image partition method, device and terminal |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20160309 |