CN110097876A - Voice wake-up processing method and awakened device - Google Patents

Voice wake-up processing method and awakened device

Info

Publication number
CN110097876A
Authority
CN
China
Prior art keywords
wake
word
waken
server
equipment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810088343.6A
Other languages
Chinese (zh)
Inventor
刘勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN201810088343.6A
Publication of CN110097876A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/08 Speech classification or search
    • G10L15/14 Speech classification or search using statistical models, e.g. Hidden Markov Models [HMMs]
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training
    • G10L2015/0631 Creating reference templates; Clustering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Artificial Intelligence (AREA)
  • Probability & Statistics with Applications (AREA)
  • Telephonic Communication Services (AREA)

Abstract

This application provides a voice wake-up processing method and an awakened device. The voice wake-up processing method includes: acquiring voice data; identifying, by a wake-up model, whether a wake-up word is present in the voice data; and, when a wake-up word is identified, uploading to a server, as the wake-up word voice segment, the part of the voice data that runs from a predetermined number of frames before the start position of the wake-up word to a predetermined number of frames after the end position of the wake-up word, wherein the server updates the wake-up model using the wake-up word voice segment. The technical solution provided by the embodiments of this application can reduce the likelihood of false wake-ups and improve the accuracy of wake-up word recognition.

Description

Voice wake-up processing method and awakened device
Technical field
This application belongs to the field of Internet technology, and in particular relates to a voice wake-up processing method and an awakened device.
Background art
With the continuous development of intelligent recognition technology, artificial-intelligence wake-up is used more and more widely. For example, smart speakers, smart TVs, and smart cars can increasingly be woken up by artificial intelligence. The main wake-up mode at present is still the wake-up word. For example, if the wake-up word "Bei Bei" is set for a smart car, the car monitors external sound in real time and wakes up when it recognizes that the voice data "Bei Bei" has been input; that is, the device is woken up by means of the wake-up word.
However, waking a device with a wake-up word easily causes false wake-ups: the user says something other than "Bei Bei", but the recognition system misrecognizes it as "Bei Bei". This results in a false wake-up and seriously affects the user experience.
No effective solution to this problem has yet been proposed.
Summary of the invention
The purpose of this application is to provide a voice wake-up processing method and an awakened device, so as to reduce the probability of false wake-ups and improve the user experience.
The voice wake-up processing method and awakened device provided by this application are implemented as follows:
A voice wake-up processing method, applied in an awakened device, the method comprising:
acquiring voice data;
identifying, by a wake-up model, whether a wake-up word is present in the voice data;
when a wake-up word is identified, uploading to a server, as the wake-up word voice segment, the part of the voice data that runs from a predetermined number of frames before the start position of the wake-up word to a predetermined number of frames after the end position of the wake-up word, wherein the server updates the wake-up model using the wake-up word voice segment.
A data processing method, applied in a server, the method comprising:
acquiring wake-up word voice data from an awakened device;
updating the wake-up model of the awakened device according to the wake-up word voice data;
pushing the updated wake-up model to the awakened device.
An awakened device, comprising a processor and a memory for storing processor-executable instructions, wherein the processor implements the steps of the above method when executing the instructions.
A server, comprising a processor and a memory for storing processor-executable instructions, wherein the processor implements the steps of the above method when executing the instructions.
A cloud server, comprising a processor and a memory for storing processor-executable instructions, wherein the processor implements the steps of the above method when executing the instructions.
A computer-readable storage medium storing computer instructions that, when executed, implement the steps of the above method.
A vehicle-mounted device, comprising a processor and a memory for storing processor-executable instructions, wherein the processor implements the above method when executing the instructions.
A mobile device, comprising a processor and a memory for storing processor-executable instructions, wherein the processor implements the above method when executing the instructions.
A conference device, comprising a processor and a memory for storing processor-executable instructions, wherein the processor implements the above method when executing the instructions.
With the voice wake-up processing method and data processing method provided by this application, the awakened device is controlled to recognize the wake-up word locally. After a wake-up word is recognized, the wake-up word segment can be uploaded to the cloud, and the cloud can update and optimize the wake-up model based on these data, thereby reducing the likelihood of false wake-ups and improving the accuracy of wake-up word recognition.
Brief description of the drawings
To describe the technical solutions in the embodiments of this application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are only some of the embodiments recorded in this application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is an architecture diagram of a data processing system provided by this application;
Fig. 2 is a schematic diagram of wake-up word voice segment extraction provided by this application;
Fig. 3 is a schematic diagram of cloud data storage in the case of multiple devices provided by this application;
Fig. 4 is a flow diagram of wake-up word recognition and wake-up model updating provided by this application;
Fig. 5 is a flowchart of the voice wake-up processing method provided by this application;
Fig. 6 is an architecture diagram of a terminal device provided by this application;
Fig. 7 is a structural block diagram of the voice wake-up processing apparatus provided by this application;
Fig. 8 is a structural block diagram of the data processing apparatus provided by this application.
Detailed description of the embodiments
To help those skilled in the art better understand the technical solutions in this application, the technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art from the embodiments in this application without creative effort fall within the scope of protection of this application.
The false wake-up problem in existing wake-up word recognition technology can be addressed by improving the accuracy of the recognition model. To this end, in this example, the awakened device is controlled to recognize the wake-up word locally. After a wake-up word is recognized, the wake-up word segment can be uploaded to the cloud. Based on these data, the cloud can reproduce what happened on the device side and thereby determine the device's false wake-up probability. The cloud can also update and optimize the wake-up model based on these data, reducing the likelihood of false wake-ups and improving the accuracy of wake-up word recognition.
On this basis, this example provides a data processing system which, as shown in Fig. 1, may include an awakened device 101 and a server side 102.
In one embodiment, the system may further include a wake-up word detection module 103 for acquiring voice data and determining whether the voice data contains a wake-up word. The wake-up word detection module may be arranged in the awakened device 101, or may exist independently of both the server side 102 and the awakened device 101, located between the server side 102 and the awakened device 101.
The wake-up word detection module 103 can acquire voice data through the microphone (MIC) in the awakened device 101. The voice data passed to the wake-up word detection module 103 may be voice data that has already been denoised, since denoising can effectively improve the accuracy of wake-up word recognition.
A signal processing algorithm and a wake-up word recognition model (which may also be called a wake-up engine) can run on the wake-up word detection module 103. The signal processing algorithm processes the acquired voice data, for example by denoising it, converting it to text, or recognizing the speech in it. The result is then fed as input to the wake-up word recognition model, which identifies whether the voice data contains a wake-up word.
If the wake-up word detection module 103 detects no wake-up word in the voice data, it can return the detection result to the awakened device 101 so that the awakened device 101 continues audio monitoring. If a wake-up word is detected, the voice segment data of the wake-up word can be obtained from the voice data. The voice segment data is sent to the server side 102, where it is used to reproduce the voice scene on the awakened device and so detect whether a false wake-up occurred; the wake-up word recognition model can then be further optimized based on the detection result.
In one embodiment, if only the stretch of voice data containing the wake-up word itself is intercepted, the information is incomplete and recognition performance drops. To solve this, as shown in Fig. 2, the detected start point of the wake-up word can be extended forward by a preset number of frames (for example, 20 frames) and the end point extended backward by a preset number of frames (for example, 10 frames), so that the voice data is more complete and recognition is more accurate. The extension can also be specified as a preset time instead of a frame count, for example 5 seconds: the detected start point is extended forward by 5 seconds and the end point backward by 5 seconds.
Note, however, that the frame counts and durations cited above are only illustrative. In actual implementations other frame counts may be used, such as 1, 3, 9, 12, 35, or 40 frames; this application does not specifically limit this, and the value can be chosen according to actual needs.
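The segment extraction described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `extract_wake_segment`, the representation of audio as a sequence of frames, and the clamping at the buffer edges are all assumptions; the default paddings of 20 and 10 frames are the example values from the text.

```python
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")


def extract_wake_segment(
    frames: Sequence[Frame],
    start: int,
    end: int,
    pre_frames: int = 20,
    post_frames: int = 10,
) -> List[Frame]:
    """Return the wake-word frames plus padding on both sides.

    `start` and `end` are the inclusive frame indices of the detected
    wake-up word (hypothetical detector output). The padded range is
    clamped so it never runs outside the buffer.
    """
    if not 0 <= start <= end < len(frames):
        raise ValueError("need 0 <= start <= end < len(frames)")
    lo = max(0, start - pre_frames)
    hi = min(len(frames), end + post_frames + 1)  # exclusive upper bound
    return list(frames[lo:hi])
```

Clamping is one possible choice for a wake-up word detected near the edge of the buffer; a real device might instead keep a longer ring buffer so that the full padding is always available.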
Take a vehicle-mounted smart device whose wake-up word is "piggy is hurried up" as an example. After the vehicle-mounted smart device recognizes the wake-up word "piggy is hurried up" in the surrounding sound, it can obtain the voice data corresponding to "piggy is hurried up" together with the 10 frames before it and the 10 frames after it, forming the voice segment data that carries the wake-up word. The voice segment data can then be uploaded to the server side.
It is worth noting that the ways of extending forward or backward and the durations cited above are only illustrative; in actual implementations other ways and lengths may be used, and this application does not limit this. If transmission and storage costs are not a concern, the data need not be intercepted at all, and all data can be transmitted to the server side 102. Interception effectively saves traffic and improves performance, so that the voice data on the awakened device side can be reproduced using less traffic.
In one embodiment, the awakened device 101 can upload every voice segment in which a wake-up word was recognized to the server side 102, or it can upload at intervals, for example once every five or every three occurrences, which reduces the load on the server side 102. The wake-up word voice segments from a period of time can also be uploaded together. That is, the segments can all be uploaded or can be uploaded in a certain proportion, and the data can be compressed before upload to save traffic.
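One way to read the interval-upload option is as batching: the device buffers segments and releases a compressed batch after every N detections. The sketch below assumes that reading; the class name `IntervalUploader`, the `every_n` parameter, and the use of zlib compression are illustrative assumptions, not details from the patent.

```python
import zlib
from typing import List


class IntervalUploader:
    """Buffer wake-word segments and release them in compressed batches."""

    def __init__(self, every_n: int = 5) -> None:
        if every_n < 1:
            raise ValueError("every_n must be at least 1")
        self.every_n = every_n
        self._pending: List[bytes] = []

    def add(self, segment: bytes) -> List[bytes]:
        """Record one detected segment.

        Returns the compressed batch to upload now, or an empty list
        while fewer than `every_n` segments have accumulated.
        """
        self._pending.append(segment)
        if len(self._pending) < self.every_n:
            return []
        batch = [zlib.compress(s) for s in self._pending]
        self._pending.clear()
        return batch
```

With `every_n=1` this degenerates to uploading every segment immediately, which matches the "upload all" option in the text. Uploading only a sampled proportion of segments would need a different policy.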
In one embodiment, the server side 102 can serve multiple devices, that is, perform false wake-up processing for multiple devices. To this end, as shown in Fig. 3, an identifier for the device can be added to the intercepted voice data transmitted to the server side 102, so that the server side 102 can tell which awakened device the voice data came from. For example, an ID can be set for or assigned to each awakened device, and the device's ID can be carried in the voice data sent to the server side 102.
When the server side 102 handles multiple awakened devices at the same time, the received voice data can be stored separately by device ID. For example, wake-up word voice data from device 1 is stored in the storage unit corresponding to device 1, and wake-up word voice data from device 2 is stored in the storage unit corresponding to device 2; that is, data is stored per device.
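Per-device storage can be sketched with an in-memory mapping keyed by device ID. The class `SegmentStore` and its method names are hypothetical, and a real server would presumably use persistent storage rather than a dictionary.

```python
from collections import defaultdict
from typing import DefaultDict, List


class SegmentStore:
    """Keep uploaded wake-word segments in one bucket per device ID."""

    def __init__(self) -> None:
        self._by_device: DefaultDict[str, List[bytes]] = defaultdict(list)

    def add(self, device_id: str, segment: bytes) -> None:
        """Append a segment to the bucket of the device that sent it."""
        self._by_device[device_id].append(segment)

    def segments_for(self, device_id: str) -> List[bytes]:
        """Return a copy of the segments stored for one device."""
        # .get avoids creating an empty bucket for an unknown device.
        return list(self._by_device.get(device_id, []))
```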
In one embodiment, after identifying a wake-up word, the awakened device 101 can also pass the wake-up word text to the server side 102. When the server side later checks whether a false wake-up occurred, it can compare its own result directly with this wake-up word text.
In one embodiment, the server side 102 can process the multiple wake-up word voice segments it has acquired, recognizing the segments one by one and determining whether each recognition result is consistent with the wake-up word. If the result is consistent, the segment is treated as a positive sample; if the result is inconsistent with the wake-up word, the segment is treated as a negative sample.
In implementation, positive and negative samples can be identified automatically or labeled manually; this application does not limit this, and the approach can be chosen according to actual needs.
Based on the above recognition results, the server side 102 can compute the recognition accuracy of the corresponding device and can further train the wake-up word recognition model for that device. For example, if a device has 100 wake-up word voice segments, of which 90 have recognition results consistent with the wake-up word, the recognition accuracy of that device can be taken to be 90%.
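The labeling and accuracy computation can be sketched as below. The function names are hypothetical, and the server-side recognizer is assumed to have already produced one text per segment; exact string equality stands in for whatever consistency check the server actually applies.

```python
from typing import Iterable, List


def label_segments(recognized_texts: Iterable[str], wake_word: str) -> List[bool]:
    """True marks a positive sample (server result matches the wake-up word)."""
    return [text == wake_word for text in recognized_texts]


def recognition_accuracy(labels: List[bool]) -> float:
    """Fraction of segments whose server-side result matched the wake-up word."""
    if not labels:
        raise ValueError("no segments to score")
    return sum(labels) / len(labels)
```

For the example in the text, 90 matching results out of 100 segments give an accuracy of 0.9.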
Based on the positive and negative samples obtained above, the server side 102 can further train the wake-up word recognition model.
In one embodiment, the server side 102 can be a cloud server. The cloud platform can be implemented by a server cluster composed of a group of processing servers, or by a single device with relatively high processing capability; any device that can process the data centrally and has sufficient processing capability can serve as the server side. In implementation, choosing the cloud as the server side gives comparatively stronger processing capability and makes it easier to establish connections with multiple awakened devices.
In one embodiment, the server side 102 can update the recognition model of the awakened device 101 according to the wake-up word voice segments it has acquired and stored. Specifically, a model update can be triggered by one of the following conditions:
1) the amount of accumulated data reaches a certain quantity;
2) the proportion of negative samples for a given device exceeds a certain threshold;
3) a user actively reports that the wake-up error rate is high.
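The three trigger conditions above can be combined into a single check, as in the following sketch. The function name and the threshold defaults (`min_segments=1000`, `max_negative_ratio=0.2`) are placeholders chosen for illustration; the patent only says "a certain quantity" and "a certain threshold".

```python
def should_update_model(
    segment_count: int,
    negative_count: int,
    user_reported_errors: bool,
    min_segments: int = 1000,         # hypothetical threshold for condition 1
    max_negative_ratio: float = 0.2,  # hypothetical threshold for condition 2
) -> bool:
    """Return True if any one of the three trigger conditions holds."""
    enough_data = segment_count >= min_segments
    negative_ratio = negative_count / segment_count if segment_count else 0.0
    too_many_negatives = negative_ratio > max_negative_ratio
    return enough_data or too_many_negatives or user_reported_errors
```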
In one embodiment, the server side 102 can train the recognition model with the above positive and negative samples. Specifically, in implementation, a recognition model can be trained for each awakened device, or a single recognition model can be updated iteratively for all devices.
The awakened device 101 can be a smart speaker, a smart car, a smart TV, or any other device that needs to be woken up.
A specific scenario is described below. Note, however, that this specific embodiment is only intended to better illustrate this application and does not unduly limit it.
Take an in-vehicle system as an example. If the wake-up word recognition model of the in-vehicle system performs poorly and produces false wake-ups, the user experience is seriously affected. In this example, the wake-up segment data of the awakened device is uploaded; with these data the results on the awakened device can be reproduced, and false wake-up analysis can be carried out entirely offline. In addition, once wake-up data (including false wake-ups) has been captured, the wake-up word recognition model on the awakened device can be proactively updated to improve the user experience. As shown in Fig. 4, this may include the following steps:
S1: the awakened device uploads the wake-up segment data;
S2: the cloud determines whether the uploaded data is a positive sample or a negative sample;
S3: the wake-up model is updated using the positive and negative samples;
S4: after regression testing passes, the new wake-up model is deployed to the awakened device.
Specifically, the following steps may be included:
S1: The voice data recorded by the head-unit MIC is input to the head-unit device side, and the signal processing algorithm and wake-up engine running on the head-unit device side process the input data.
A wake-up word detection module can be provided on the head-unit device side. The module detects whether the voice data is a wake-up word and returns the detection result to the head-unit device side. If the result is not a wake-up word, the device continues to monitor the data collected by the MIC. If it is a wake-up word, the module obtains the voice segment data of the wake-up word, the text of the wake-up word, and the device ID, and uploads these to the cloud.
When the head-unit device side performs wake-up word detection, there is a start point and an end point. For the cloud, however, the segment between these two time points alone is usually not enough to reproduce the result of the head-unit device side. For this reason, the wake-up word detection module can extend the detected start point forward by 20 frames and the end point backward by 10 frames. The data uploaded to the cloud can then be used both to reproduce the result on the head-unit device side, so as to detect false wake-ups, and directly for update training of the model;
S2: The cloud stores the wake-up segment data uploaded by the head-unit device side, classified by device ID;
S3: After the data has accumulated to a certain quantity, the cloud can run the accumulated voice segment data through a recognition engine. If the recognition result is consistent with the wake-up word, the segment is taken as a positive sample; if it is inconsistent with the wake-up word, the segment is taken as a negative sample.
S4: From the identified positive and negative samples, the wake-up accuracy of the corresponding head-unit device can be calculated;
S5: The wake-up model can be trained with the identified positive and negative samples.
The update of the recognition model can be triggered by one of the following: the amount of data accumulated in the cloud reaches a certain quantity; the proportion of negative samples for a given head-unit device exceeds a certain threshold; a user actively reports that the wake-up error rate is high.
S6: Regression testing is performed on the updated wake-up model. If the test passes, the model can be pushed directly to the corresponding head-unit device; if the regression test shows an anomaly, the model can be pushed after the cause has been identified.
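The gating in step S6 can be sketched as a small function that pushes a candidate model only when a regression check passes. Both callables are hypothetical stand-ins: `passes_regression` for the regression test suite and `push` for whatever delivery mechanism the server uses.

```python
from typing import Callable, TypeVar

Model = TypeVar("Model")


def release_model(
    candidate: Model,
    passes_regression: Callable[[Model], bool],
    push: Callable[[Model], None],
) -> bool:
    """Push the candidate only if the regression test passes.

    Returns True when the model was pushed. On False the caller is
    expected to investigate the cause before trying again.
    """
    if not passes_regression(candidate):
        return False
    push(candidate)
    return True
```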
After the updated wake-up model has been pushed to the head-unit device side, the user can use the new model the next time the device is restarted.
Note, however, that the above takes an in-vehicle system as the awakened device only for illustration. In actual implementations, the awakened device can also be a vehicle-mounted smart device, a mobile device, a conference device, a smart speaker, and so on. Any device that needs a wake-up word can use the method provided by this application, which does not specifically limit the type or form of the awakened device.
Fig. 5 is a flowchart of one embodiment of the voice wake-up processing method described in this application. Although this application provides method operation steps or apparatus structures as shown in the following embodiments or drawings, the method or apparatus may include more or fewer operation steps or module units based on conventional practice or without creative effort. For steps or structures that have no logically necessary causal relationship, the execution order of the steps or the module structure of the apparatus is not limited to that described in the embodiments or shown in the drawings. When the method or module structure is applied in an actual apparatus or end product, it can be executed sequentially or in parallel according to the embodiments or the drawings (for example, in a parallel-processor or multi-threaded environment, or even a distributed processing environment). As shown in Fig. 5, the voice wake-up processing method is applied in an awakened device and may include the following steps:
Step 501: acquire voice data;
Step 502: identify, by a wake-up model, whether a wake-up word is present in the voice data;
Step 503: when a wake-up word is identified, upload to a server, as the wake-up word voice segment, the part of the voice data that runs from a predetermined number of frames before the start position of the wake-up word to a predetermined number of frames after the end position of the wake-up word, wherein the server updates the wake-up model using the wake-up word voice segment.
Uploading the wake-up word voice segment in the voice data to the server may include:
S1: identifying the wake-up word voice segment from the voice data;
S2: obtaining the wake-up word and the device identifier of the awakened device;
S3: uploading the wake-up word voice segment, the wake-up word, and the device identifier of the awakened device to the server.
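The three items uploaded in step S3 could be packaged as in the sketch below. The JSON layout, the field names (`device_id`, `wake_word`, `segment_b64`), and the base64 encoding of the audio are all assumptions made for illustration; the patent does not specify a wire format.

```python
import base64
import json


def build_upload_payload(segment: bytes, wake_word: str, device_id: str) -> str:
    """Bundle the voice segment, wake-up word text, and device ID as JSON."""
    payload = {
        "device_id": device_id,
        "wake_word": wake_word,
        # Raw audio bytes are not valid JSON, so encode them as base64 text.
        "segment_b64": base64.b64encode(segment).decode("ascii"),
    }
    # ensure_ascii=False keeps non-ASCII wake-up words readable in the output.
    return json.dumps(payload, ensure_ascii=False)
```

Carrying the device ID lets the server store segments per device, and carrying the wake-up word text lets it compare its own recognition result against what the device believed it heard.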
A condition for triggering the model update can be set. For example, the model update can be triggered when at least one of the following conditions is met:
1) the number of wake-up word voice segments accumulated by the server reaches a preset threshold;
2) the server detects that the proportion of negative samples for the awakened device exceeds a preset sample threshold, where a negative sample is one for which the server-side wake-up word recognition result is inconsistent with the wake-up word recognition result on the awakened device side;
3) the server receives model update instruction information from a user.
Note, however, that the conditions for triggering a model update cited above are only illustrative; in actual implementations, model updates can also be triggered in other ways.
When performing a model update, the server can update the wake-up model with the positive and negative samples within a preset time.
To collect data effectively or in an orderly manner, the wake-up word voice segments on the awakened device side can be uploaded to the server according to a preset proportion or period.
The server can be a cloud server or another type of server.
The method embodiments provided by the embodiments of this application can be executed in a mobile terminal, a computer terminal, a server, or a similar computing apparatus. Taking execution on a computer terminal as an example, Fig. 6 is a hardware block diagram of a computer terminal for a voice wake-up processing method according to an embodiment of the present invention. As shown in Fig. 6, the computer terminal 10 may include one or more processors 102 (only one is shown in the figure; the processor 102 may include, but is not limited to, a processing device such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory 104 for storing data, and a transmission module 106 for communication. A person of ordinary skill in the art will understand that the structure shown in Fig. 6 is only illustrative and does not limit the structure of the above electronic device. For example, the computer terminal 10 may include more or fewer components than shown in Fig. 6, or have a configuration different from that shown in Fig. 6.
The memory 104 can be used to store software programs and modules of application software, such as the program instructions/modules corresponding to the voice wake-up processing method in the embodiments of the present invention. The processor 102 runs the software programs and modules stored in the memory 104 to execute various functional applications and data processing, that is, to implement the voice wake-up processing method of the above application program. The memory 104 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, and such remote memory can be connected to the computer terminal 10 through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission module 106 is used to receive or send data via a network. A specific example of the network is a wireless network provided by the communication provider of the computer terminal 10. In one example, the transmission module 106 includes a network adapter (Network Interface Controller, NIC) that can connect to other network devices through a base station so as to communicate with the Internet. In one example, the transmission module 106 can be a radio frequency (RF) module used to communicate with the Internet wirelessly.
Referring to FIG. 7, in a software implementation, the voice wake-up processing apparatus is applied in a client-side terminal and may include an acquiring unit, a recognition unit, and an uploading unit, wherein:
Acquiring unit, for obtaining voice data;
Recognition unit, for identifying, by means of a wake-up model, whether a wake-up word is present in the voice data;
Uploading unit, for, when a wake-up word has been identified, uploading to a server, as the wake-up word voice segment, the portion of the voice data that runs from a predetermined number of frames before the start position of the wake-up word to a predetermined number of frames after the end position of the wake-up word, wherein the server updates the wake-up model using the wake-up word voice segment.
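The segment selection performed by the uploading unit can be sketched as follows. This is a minimal illustration only, not the applicant's implementation: the function name `extract_wake_segment`, its parameters, and the convention that `end` is the index of the last wake-word frame (inclusive) are assumptions made for the example.

```python
from typing import List, Sequence, TypeVar

Frame = TypeVar("Frame")


def extract_wake_segment(
    frames: Sequence[Frame],
    start: int,
    end: int,
    pre_frames: int,
    post_frames: int,
) -> List[Frame]:
    """Return the frames from `pre_frames` before `start` through
    `post_frames` after `end` (inclusive), clamped to the buffer.

    `start` and `end` are the (hypothetical) frame indices of the first and
    last frame of the detected wake-up word.
    """
    if not 0 <= start <= end < len(frames):
        raise ValueError("wake-up word span must lie inside the buffer")
    if pre_frames < 0 or post_frames < 0:
        raise ValueError("padding must be non-negative")
    lo = max(0, start - pre_frames)                # clamp at buffer start
    hi = min(len(frames), end + 1 + post_frames)   # exclusive upper bound
    return list(frames[lo:hi])
```

Padding the detected span on both sides keeps some context around the wake-up word, which is what allows the server to re-examine segments that the on-device model may have bounded imprecisely.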
In one embodiment, the uploading unit may specifically: identify the wake-up word voice segment in the voice data; obtain the wake-up word and the device identifier of the awakened device; and upload the wake-up word voice segment, the wake-up word, and the device identifier to the server.
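The three items uploaded together in this embodiment could be packaged as in the sketch below. The JSON layout, the field names (`device_id`, `wake_word`, `audio_b64`), and the use of Base64 for the audio bytes are all assumptions made for illustration; the application does not specify a wire format.

```python
import base64
import json
from dataclasses import asdict, dataclass


@dataclass
class WakeUpload:
    """Hypothetical upload record: segment, wake-up word, device identifier."""
    device_id: str
    wake_word: str
    audio_b64: str  # wake-up word voice segment, Base64-encoded


def build_upload(segment: bytes, wake_word: str, device_id: str) -> str:
    """Serialize one wake-up word voice segment with its metadata as JSON."""
    record = WakeUpload(
        device_id=device_id,
        wake_word=wake_word,
        audio_b64=base64.b64encode(segment).decode("ascii"),
    )
    # ensure_ascii=False keeps non-ASCII wake-up words readable in the payload
    return json.dumps(asdict(record), ensure_ascii=False)
```

Carrying the device identifier with each segment is what would let the server attribute samples to a particular awakened device and later push an updated model back to it.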
In one embodiment, the uploading unit may specifically upload the wake-up word voice segments on the awakened device side to the server according to a preset ratio or a preset period.
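One possible reading of uploading "according to a preset ratio or period" is sketched below: a ratio-based sampler that lets through a fixed fraction of segments, and a period-based gate that lets through at most one segment per interval. Both classes and their names are hypothetical, and the application may intend a different scheme (for example, batching segments and uploading the batch once per period).

```python
from typing import Optional


class RatioSampler:
    """Let through a fixed fraction of segments, e.g. ratio=0.25 is 1 in 4."""

    def __init__(self, ratio: float) -> None:
        if not 0.0 < ratio <= 1.0:
            raise ValueError("ratio must be in (0, 1]")
        self.ratio = ratio
        self._credit = 0.0

    def should_upload(self) -> bool:
        # Accumulate credit per segment; upload whenever a full unit is reached.
        self._credit += self.ratio
        if self._credit >= 1.0:
            self._credit -= 1.0
            return True
        return False


class PeriodGate:
    """Let through at most one segment per `period_s` seconds."""

    def __init__(self, period_s: float) -> None:
        self.period_s = period_s
        self._last: Optional[float] = None

    def should_upload(self, now_s: float) -> bool:
        if self._last is None or now_s - self._last >= self.period_s:
            self._last = now_s
            return True
        return False
```

Either policy limits how much audio leaves the device, which matters for bandwidth on the awakened device and for the volume of data the server has to process.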
Referring to FIG. 8, in a software implementation, the data processing apparatus on the server side may include an acquiring unit, an updating unit, and a push unit, wherein:
Acquiring unit, for obtaining wake-up word voice data from the awakened device;
Updating unit, for updating the wake-up model of the awakened device according to the wake-up word voice data;
Push unit, for pushing the updated wake-up model to the awakened device.
In one embodiment, the data processing apparatus may further include a determining unit, for determining the false wake-up ratio of the awakened device according to the wake-up word voice data.
In one embodiment, the updating unit may specifically update the model according to the following steps:
S1: feed the items of wake-up word voice data one by one into the recognition model in the server for wake-up word recognition;
S2: compare the recognition result with the recognition result on the awakened device side;
S3: if the two are consistent, use the item as a positive sample; if they are inconsistent, use it as a negative sample;
S4: update the wake-up model using the positive samples and negative samples.
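Steps S1 to S3 can be sketched as a labeling pass over the uploaded data. In this illustration `server_recognize` stands in for the server-side recognition model and is a hypothetical callable supplied by the caller; the retraining of step S4 is left out because the application does not describe the training procedure. The helper `negative_ratio` is likewise an assumption, shown only as one way a server might summarize disagreement between device and server.

```python
from typing import Callable, Iterable, List, Tuple


def label_samples(
    uploads: Iterable[Tuple[bytes, bool]],
    server_recognize: Callable[[bytes], bool],
) -> Tuple[List[bytes], List[bytes]]:
    """Split uploads into positive and negative samples.

    Each upload is (segment, device_result), where device_result is the
    wake-up word decision made on the awakened device.
    """
    positives: List[bytes] = []
    negatives: List[bytes] = []
    for segment, device_result in uploads:
        server_result = server_recognize(segment)   # S1: server-side recognition
        if server_result == device_result:          # S2: compare with device
            positives.append(segment)               # S3: consistent -> positive
        else:
            negatives.append(segment)               # S3: inconsistent -> negative
    return positives, negatives


def negative_ratio(positives: List[bytes], negatives: List[bytes]) -> float:
    """Fraction of labeled samples on which server and device disagreed."""
    total = len(positives) + len(negatives)
    return len(negatives) / total if total else 0.0
```

The positive and negative lists returned here would then be the inputs to whatever model update step S4 uses.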
In one embodiment, the updating unit may specifically trigger the update of the wake-up model of the awakened device according to the wake-up word voice data upon determining that one of the following conditions (without limitation) is met:
1) the accumulated wake-up word voice data reaches a preset data volume threshold;
2) the proportion of negative samples occurring on the awakened device exceeds a preset sample threshold;
3) indication information that the wake-up error rate is high is received.
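The three trigger conditions can be expressed as a single predicate, as in the sketch below. The `UpdateState` record, its field names, and the threshold parameters are hypothetical; the application leaves the concrete threshold values and bookkeeping unspecified.

```python
from dataclasses import dataclass


@dataclass
class UpdateState:
    """Hypothetical per-device bookkeeping kept by the server."""
    accumulated_segments: int   # wake-up word voice data received so far
    negative_count: int         # segments labeled as negative samples
    high_error_reported: bool   # a "wake-up error rate is high" indication arrived


def should_update(
    state: UpdateState,
    data_threshold: int,
    negative_ratio_threshold: float,
) -> bool:
    """Return True if any of the three trigger conditions holds."""
    # 1) accumulated data reaches the preset data volume threshold
    if state.accumulated_segments >= data_threshold:
        return True
    # 2) proportion of negative samples exceeds the preset sample threshold
    if state.accumulated_segments > 0:
        ratio = state.negative_count / state.accumulated_segments
        if ratio > negative_ratio_threshold:
            return True
    # 3) an indication of a high wake-up error rate has been received
    return state.high_error_reported
```

Because the conditions are combined with a logical OR, a device with little accumulated data can still trigger an update when its negative-sample proportion is high or when an error indication arrives.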
Although this application provides method operation steps as described in the embodiments or flowcharts, more or fewer operation steps may be included based on conventional or non-inventive effort. The order of steps listed in the embodiments is only one of many possible execution orders and does not represent the only execution order. When an actual device or client product executes the method, the steps may be executed sequentially or in parallel (for example, in a parallel-processor or multi-threaded environment) according to the embodiments or the method shown in the drawings.
The devices or modules described in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function. For convenience of description, the above device is described with its functions divided into various modules. When implementing this application, the functions of the modules may be implemented in one or more pieces of software and/or hardware. Of course, a module that implements a certain function may also be implemented by a combination of multiple submodules or subunits.
The methods, devices, or modules described in this application may be implemented as computer-readable program code, and a controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (such as software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art also know that, in addition to implementing a controller purely as computer-readable program code, the method steps can be logically programmed so that the controller achieves the same functions in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the devices it contains for implementing various functions may also be regarded as structures within the hardware component. Indeed, the devices for implementing various functions may be regarded both as software modules implementing a method and as structures within a hardware component.
Some of the modules in the device described in this application may be described in the general context of computer-executable instructions executed by a computer, such as program modules. Generally, program modules include routines, programs, objects, components, data structures, classes, and the like that perform particular tasks or implement particular abstract data types. This application may also be practiced in distributed computing environments, in which tasks are executed by remote processing devices connected through a communication network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including storage devices.
From the above description of the embodiments, those skilled in the art can clearly understand that this application can be implemented by software plus the necessary hardware. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, may be embodied in the form of a software product, or may be embodied in the course of implementing data migration. The computer software product may be stored in a storage medium, such as ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a mobile terminal, a server, a network device, or the like) to execute the methods described in the embodiments of this application or in certain parts of the embodiments.
The embodiments in this specification are described in a progressive manner; the same or similar parts of the embodiments may be referred to one another, and each embodiment focuses on its differences from the other embodiments. All or part of this application can be used in numerous general-purpose or special-purpose computing system environments or configurations, for example: personal computers, server computers, handheld or portable devices, laptop devices, mobile communication terminals, multiprocessor systems, microprocessor-based systems, programmable electronic devices, network PCs, minicomputers, mainframe computers, distributed computing environments including any of the above systems or devices, and so on.
Although this application has been described through embodiments, a person of ordinary skill in the art will appreciate that the application has many variations and changes that do not depart from its spirit, and it is intended that the appended claims cover these variations and changes without departing from the spirit of the application.

Claims (10)

1. A voice wake-up processing method, characterized in that it is applied to an awakened device, the method comprising:
obtaining voice data;
identifying, by means of a wake-up model, whether a wake-up word is present in the voice data;
when a wake-up word has been identified, uploading to a server, as a wake-up word voice segment, the portion of the voice data from a predetermined number of frames before the start position of the wake-up word to a predetermined number of frames after the end position of the wake-up word, wherein the server updates the wake-up model using the wake-up word voice segment.
2. The method according to claim 1, characterized in that uploading the wake-up word voice segment in the voice data to the server comprises:
identifying the wake-up word voice segment in the voice data;
obtaining the wake-up word and the device identifier of the awakened device;
uploading the wake-up word voice segment, the wake-up word, and the device identifier to the server.
3. The method according to claim 1, characterized in that the server updating the wake-up model using the wake-up word voice segment comprises:
when at least one of the following conditions is met, triggering the server to update the wake-up model with positive and negative samples within a preset time:
the number of wake-up word voice segments accumulated by the server reaches a preset threshold;
the server detects that the proportion of negative samples occurring on the awakened device exceeds a preset sample threshold, wherein a negative sample is one for which the server-side wake-up word recognition result is inconsistent with the wake-up word recognition result on the awakened device side;
the server receives model update indication information from a user.
4. The method according to claim 1, characterized in that uploading the wake-up word voice segment in the voice data to the server comprises:
uploading the wake-up word voice segments on the awakened device side to the server according to a preset ratio or a preset period.
5. The method according to claim 1, characterized in that the server comprises a cloud server.
6. An awakened device, comprising a processor and a memory for storing processor-executable instructions, wherein the processor implements the method of any one of claims 1 to 5 when executing the instructions.
7. A computer-readable storage medium storing computer instructions which, when executed, implement the steps of the method of any one of claims 1 to 5.
8. A mobile unit, comprising a processor and a memory for storing processor-executable instructions, wherein the processor implements the method of any one of claims 1 to 5 when executing the instructions.
9. A mobile device, comprising a processor and a memory for storing processor-executable instructions, wherein the processor implements the method of any one of claims 1 to 5 when executing the instructions.
10. A conference device, comprising a processor and a memory for storing processor-executable instructions, wherein the processor implements the method of any one of claims 1 to 5 when executing the instructions.
CN201810088343.6A 2018-01-30 2018-01-30 Voice wakes up processing method and is waken up equipment Pending CN110097876A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810088343.6A CN110097876A (en) 2018-01-30 2018-01-30 Voice wakes up processing method and is waken up equipment

Publications (1)

Publication Number Publication Date
CN110097876A true CN110097876A (en) 2019-08-06

Family

ID=67441872

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810088343.6A Pending CN110097876A (en) 2018-01-30 2018-01-30 Voice wakes up processing method and is waken up equipment

Country Status (1)

Country Link
CN (1) CN110097876A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40012138)
RJ01 Rejection of invention patent application after publication (Application publication date: 20190806)