CN112233681A - Method and device for determining mistakenly awakened corpus, electronic equipment and storage medium - Google Patents

Publication number
CN112233681A
Authority
CN
China
Prior art keywords
awakening
engine
corpus
audio data
interference
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011076956.1A
Other languages
Chinese (zh)
Inventor
彭经伟
左声勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Apollo Zhilian Beijing Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202011076956.1A
Publication of CN112233681A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/22 Interactive procedures; Man-machine interfaces
    • G10L17/04 Training, enrolment or model building
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G10L17/20 Pattern transformations or operations aimed at increasing system robustness, e.g. against channel noise or different working conditions

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Business, Economics & Management (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Game Theory and Decision Science (AREA)
  • Traffic Control Systems (AREA)

Abstract

The application discloses a method and an apparatus for determining a false wake-up corpus, an electronic device, and a storage medium, and relates to the field of voice recognition. The specific implementation is as follows: interference audio data is collected through at least one audio collector, wherein the audio acquisition area in which each audio collector is located includes a wake-up engine associated with that audio collector; the interference audio data collected by an audio collector is input into the associated wake-up engine; and when the interference audio data input into the wake-up engine successfully wakes it up, the interference audio data is taken as a false wake-up corpus of that wake-up engine. In the embodiments of the application, the collected interference audio is passed directly to the corresponding wake-up engine, and the interference audio data is taken as a false wake-up corpus whenever the wake-up engine is woken up. False wake-up corpora are thus identified and collected automatically, manual selection of false wake-up corpora from local recordings is avoided, and the efficiency of identifying and collecting false wake-up corpora is improved.

Description

Method and device for determining mistakenly awakened corpus, electronic equipment and storage medium
Technical Field
The present application relates to the field of artificial intelligence, and in particular to a method and an apparatus for determining a false wake-up corpus, an electronic device, and a storage medium.
Background
A voice assistant is an application that helps a user solve problems through intelligent dialogue and instant question answering. With the development of artificial intelligence technology, voice assistants have become widely used in automobiles.
An in-vehicle voice assistant is often woken up by mistake, so the corpora that falsely wake it up need to be collected. At present, such corpora are collected in one of two ways: either an external tool (an artificial mouth or a loudspeaker) plays audio at random and the in-vehicle head unit captures the played audio through a microphone, or the head unit turns on the microphone in different environments (such as a driving environment, a quiet environment, or a scene with several people talking) to capture audio. In both cases the audio is saved locally, and the audio data that triggered a false wake-up (that is, the false wake-up corpus) is then picked out of the local recordings by hand.
Disclosure of Invention
The embodiments of the application provide a method, an apparatus, a device, and a storage medium for determining a false wake-up corpus.
According to a first aspect, a method for determining a false wake-up corpus is provided, including:
collecting interference audio data through at least one audio collector, wherein the audio acquisition area in which the audio collector is located includes a wake-up engine associated with the audio collector;
inputting the interference audio data collected by the audio collector into the associated wake-up engine;
and, when the interference audio data input into the wake-up engine successfully wakes up the wake-up engine, taking the interference audio data as a false wake-up corpus of the wake-up engine.
According to a second aspect, an apparatus for determining a false wake-up corpus is provided, including:
a voice acquisition module, configured to collect interference audio data through at least one audio collector, wherein the audio acquisition area in which the audio collector is located includes a wake-up engine associated with the audio collector;
a data input module, configured to input the interference audio data collected by the audio collector into the associated wake-up engine;
and a false wake-up corpus determining module, configured to take the interference audio data as a false wake-up corpus of the wake-up engine when the interference audio data input into the wake-up engine successfully wakes up the wake-up engine.
According to a third aspect, there is provided an electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method for determining a false wake corpus according to any of the embodiments of the present application.
According to a fourth aspect, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing a computer to perform the method of false wake corpus determination of any embodiment of the present application.
According to the technology of the application, false wake-up corpora are collected automatically, and the efficiency of determining false wake-up corpora is improved.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not intended to limit the present application. Wherein:
FIG. 1 is a flowchart of a method for determining a false wake-up corpus according to an embodiment of the present application;
FIG. 2 is a flowchart of a method for determining a false wake-up corpus according to an embodiment of the present application;
FIG. 3 is a flowchart of a method for determining a false wake-up corpus according to an embodiment of the present application;
FIG. 4 is a flowchart of a method for determining a false wake-up corpus according to an embodiment of the present application;
FIG. 5 is a flowchart of a method for determining a false wake-up corpus according to an embodiment of the present application;
FIG. 6 is a schematic structural diagram of an apparatus for determining a false wake-up corpus according to an embodiment of the present application;
FIG. 7 is a block diagram of an electronic device for implementing the method for determining a false wake-up corpus according to an embodiment of the present application.
Detailed Description
The following describes exemplary embodiments of the present application with reference to the accompanying drawings. Various details of the embodiments are included to aid understanding and are to be considered exemplary only. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a schematic flowchart of a method for determining a false wake-up corpus according to an embodiment of the present application. The embodiment is applicable to the case where an in-vehicle head unit determines, from audio data collected by an audio collector, the corpus that falsely wakes up the voice assistant. The method may be performed by an apparatus for determining a false wake-up corpus, which is implemented in software and/or hardware and is preferably configured in an electronic device.
Referring to Fig. 1, the method for determining the false wake-up corpus is as follows:
S101, collecting interference audio data through at least one audio collector.
The audio collector may be a microphone. A plurality of audio acquisition areas (that is, sound zones) are preset in the automobile, and each sound zone is provided with a microphone that collects the audio data of the sound zone in which it is located. Interference audio data is audio that is not a voice input from a user; optionally, it is audio played by an external tool (an artificial mouth or a loudspeaker).
S102, inputting the interference audio data collected by the audio collector into the associated wake-up engine.
In the embodiment of the application, the audio acquisition area in which an audio collector is located includes a wake-up engine associated with that audio collector. For example, microphone MIC1 is arranged in sound zone 1 and is associated with wake-up engine A; microphone MIC2 is arranged in sound zone 2 and is associated with wake-up engine B. After an audio collector collects interference audio data, the data is input directly into the wake-up engine associated with that collector. For example, interference audio data collected by MIC1 is passed directly to wake-up engine A and to no other wake-up engine. Note that the collected audio data is input directly into the wake-up engine; it does not have to be saved locally first and then searched by hand for false wake-up corpora, which keeps the subsequent determination of false wake-up corpora efficient.
S103, when the interference audio data input into the wake-up engine successfully wakes up the wake-up engine, taking the interference audio data as a false wake-up corpus of the wake-up engine.
In the embodiment of the application, a wake-up monitoring mechanism monitors in real time whether each wake-up engine has been successfully woken up. Because the audio input into the wake-up engine is interference audio data and the user has issued no voice command, any successful wake-up is a false wake-up. The interference audio data is therefore taken as a false wake-up corpus of the wake-up engine whenever the monitoring mechanism detects that the input interference audio data has successfully woken up the engine.
In the embodiment of the application, the collected interference audio is passed directly to the corresponding wake-up engine, and the interference audio data is taken as a false wake-up corpus whenever the wake-up engine is woken up. False wake-up corpora are thus identified and collected automatically, without saving all collected audio locally and then selecting the false wake-up corpora from the local recordings by hand, which improves the efficiency of identifying and collecting false wake-up corpora.
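Steps S101 to S103 can be sketched roughly as follows. This is a minimal illustration, not the implementation of the application: the WakeEngine and FalseWakeCollector classes and the byte-matching detector are hypothetical stand-ins for a real wake-up engine and monitoring mechanism.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class WakeEngine:
    """Hypothetical stand-in for a wake-up engine; `detector` fakes the model."""
    name: str
    detector: Callable[[bytes], bool]

    def feed(self, audio: bytes) -> bool:
        # Returns True when the input audio wakes the engine up.
        return self.detector(audio)


@dataclass
class FalseWakeCollector:
    # Maps each audio collector (microphone) id to its associated wake-up engine.
    engines: Dict[str, WakeEngine]
    # False wake-up corpora collected so far, keyed by engine name.
    corpora: Dict[str, List[bytes]] = field(default_factory=dict)

    def on_audio(self, mic_id: str, audio: bytes) -> bool:
        engine = self.engines[mic_id]  # S102: route only to the associated engine
        woke = engine.feed(audio)
        if woke:
            # S103: the input is interference audio, so any wake-up is a false one.
            self.corpora.setdefault(engine.name, []).append(audio)
        return woke


collector = FalseWakeCollector(engines={
    "MIC1": WakeEngine("A", lambda audio: b"WAKE" in audio),
    "MIC2": WakeEngine("B", lambda audio: False),
})
collector.on_audio("MIC1", b"noise")     # no wake-up, nothing stored
collector.on_audio("MIC1", b"..WAKE..")  # engine A wakes, audio is kept
collector.on_audio("MIC2", b"..WAKE..")  # engine B does not wake in this toy setup
```

Only the audio that actually woke an engine is retained, and it is attributed to the engine associated with the microphone that captured it.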
Fig. 2 is a schematic flowchart of a method for determining a false wake-up corpus according to an embodiment of the present application. This embodiment is optimized on the basis of the foregoing embodiment. Referring to Fig. 2, the method for determining the false wake-up corpus is as follows:
S201, collecting interference audio data through at least one audio collector.
The audio acquisition area in which the audio collector is located includes a wake-up engine associated with the audio collector.
S202, inputting the interference audio data collected by the audio collector into the associated wake-up engine.
S203, when the interference audio data input into the wake-up engine successfully wakes up the wake-up engine, taking the interference audio data of a preset duration before the successful wake-up as the false wake-up corpus of the wake-up engine.
In the prior art, when false wake-up corpora are selected by hand, the complete audio file recorded by the microphone is usually taken as the false wake-up corpus, so the corpus contains too much irrelevant data. For example, in one false wake-up corpus (say, 8 hours of recorded data), only the last 10 seconds of audio actually woke up the wake-up engine, and everything except those last 10 seconds is irrelevant audio. Storing the complete audio file as the false wake-up corpus also takes up a large amount of storage space.
Based on this, the inventors propose that, when the interference audio data input into the wake-up engine successfully wakes up the wake-up engine, the interference audio data of a preset duration immediately before the successful wake-up is taken as the false wake-up corpus of the wake-up engine. Optionally, the monitoring mechanism determines the moment at which the wake-up engine was woken up, and the audio data input within the preset duration before that moment is taken as the false wake-up corpus. The preset duration can be set as needed, for example to 10 seconds.
In the embodiment of the application, taking only the interference audio data of the preset duration before the successful wake-up as the false wake-up corpus reduces the irrelevant data in the corpus and saves storage space when the corpus is stored.
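As a rough illustration of S203, the following sketch cuts the preset duration out of a raw PCM recording, given the moment of the wake-up. The function name, the 16 kHz sampling rate, and the 16-bit mono format are assumptions made for the example; the application does not specify them.

```python
def clip_before_wake(pcm: bytes, wake_offset_s: float, preset_s: float = 10.0,
                     sample_rate: int = 16000, bytes_per_sample: int = 2) -> bytes:
    """Return the audio of length `preset_s` that ends at the wake-up moment.

    `pcm` is assumed to be raw mono PCM; `wake_offset_s` is the wake-up time
    measured from the start of the recording.
    """
    total_samples = len(pcm) // bytes_per_sample
    # Work in whole samples so the slice never splits a sample in half.
    end_sample = min(int(wake_offset_s * sample_rate), total_samples)
    start_sample = max(0, end_sample - int(preset_s * sample_rate))
    return pcm[start_sample * bytes_per_sample:end_sample * bytes_per_sample]


# One minute of silence as a placeholder recording.
recording = bytes(16000 * 2 * 60)
late_clip = clip_before_wake(recording, wake_offset_s=60.0)  # full 10 s window
early_clip = clip_before_wake(recording, wake_offset_s=6.0)  # only 6 s available
```

Under these assumed parameters, a 10-second clip occupies 320,000 bytes, whereas the 8-hour recording from the example above would occupy about 921.6 MB.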
Fig. 3 is a schematic flowchart of a method for determining a false wake-up corpus according to an embodiment of the present application. This embodiment is optimized on the basis of the above embodiment. Referring to Fig. 3, the method for determining the false wake-up corpus is as follows:
S301, collecting interference audio data through at least one audio collector.
The audio acquisition area in which the audio collector is located includes a wake-up engine associated with the audio collector.
S302, inputting the interference audio data collected by the audio collector into the associated wake-up engine.
S303, inputting the interference audio data into a cache unit associated with the wake-up engine, wherein the cache unit is configured to store the most recent preset duration of the input interference audio data.
In this embodiment, each wake-up engine is associated with a cache unit configured to store the most recent preset duration of the input interference audio data. In other words, the cache unit uses a first-in, first-out mechanism: as interference audio data is input into the cache unit associated with the wake-up engine, the cache unit always holds the most recently input data of the preset duration. The preset duration can be set as needed, for example to 10 seconds.
It should be noted that S302 and S303 are executed synchronously: while the interference audio data collected by the audio collector is input into the associated wake-up engine, the same data is input into the cache unit associated with that wake-up engine. Storing the data synchronously ensures that, at the moment the wake-up engine is successfully woken up, the interference audio data that caused the wake-up is in the cache unit.
In an alternative embodiment, the cache unit is, for example, an LRU (least recently used) cache queue whose capacity is calculated from the preset duration, the sampling rate, and the bit depth. Once these three parameters are fixed, the capacity of the LRU cache queue is bounded. When collected audio data is stored into the LRU cache queue in real time and the cache is full, the earliest-stored audio data is deleted on a first-in, first-out basis to make room for newly input audio data.
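The capacity calculation and the first-in, first-out eviction can be sketched as follows. The text names only the three inputs (preset duration, sampling rate, bit depth); the formula below, the mono-channel default, and the AudioRingBuffer class built on collections.deque are assumptions chosen for illustration.

```python
from collections import deque


def buffer_capacity_bytes(preset_s: float, sample_rate: int, bit_depth: int,
                          channels: int = 1) -> int:
    """Bytes needed to hold `preset_s` seconds of uncompressed PCM audio."""
    return int(preset_s * sample_rate * channels) * (bit_depth // 8)


class AudioRingBuffer:
    """Fixed-capacity byte queue that always keeps the most recent audio."""

    def __init__(self, preset_s: float = 10.0, sample_rate: int = 16000,
                 bit_depth: int = 16) -> None:
        self.capacity = buffer_capacity_bytes(preset_s, sample_rate, bit_depth)
        # A deque with maxlen drops the oldest bytes automatically when full.
        self._data = deque(maxlen=self.capacity)

    def write(self, chunk: bytes) -> None:
        self._data.extend(chunk)

    def snapshot(self) -> bytes:
        # Copy of the current contents, oldest byte first.
        return bytes(self._data)


# Tiny parameters so the eviction is easy to see: capacity is 16 bytes.
small = AudioRingBuffer(preset_s=1, sample_rate=8, bit_depth=16)
small.write(bytes(range(6)))
partial = small.snapshot()   # buffer not yet full: all 6 bytes are kept
small.write(bytes(range(6, 20)))
latest = small.snapshot()    # oldest 4 bytes were evicted
```

With a 10-second preset duration, a 16 kHz sampling rate, and a 16-bit depth, this calculation gives a capacity of 320,000 bytes per queue.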
S304, in response to any wake-up engine being successfully woken up, taking the interference audio data cached in the cache unit associated with that wake-up engine as the false wake-up corpus of the wake-up engine.
In this embodiment, when the wake-up monitoring mechanism detects that a wake-up engine has been successfully woken up, the interference audio data passed to that engine has also been stored synchronously in the LRU cache queue associated with it. The interference audio data held in the associated cache unit at that moment can therefore be taken directly as the false wake-up corpus of the wake-up engine. For example, if the cache unit is configured to store the latest 10 seconds of interference audio data and the wake-up engine is successfully woken up at some moment, the 10 seconds of interference audio data held in the cache unit at that moment is taken as the false wake-up corpus. Note that in the initial stage the cache unit may hold less than 10 seconds of interference audio data. For example, if the wake-up engine is successfully woken up when only 6 seconds of interference audio data have been stored, those 6 seconds are taken as the false wake-up corpus.
According to the embodiment of the application, when any wake-up engine is successfully woken up, the interference audio data is obtained directly from the cache unit associated with that engine as the false wake-up corpus, which improves the efficiency of obtaining false wake-up corpora.
Fig. 4 is a schematic flowchart of a method for determining a false wake-up corpus according to an embodiment of the present application. This embodiment is optimized on the basis of the foregoing embodiment. Referring to Fig. 4, the method for determining the false wake-up corpus is as follows:
S401, collecting interference audio data through at least one audio collector.
The audio acquisition area in which the audio collector is located includes a wake-up engine associated with the audio collector.
S402, inputting the interference audio data collected by the audio collector into the associated wake-up engine.
S403, when the interference audio data input into the wake-up engine successfully wakes up the wake-up engine, taking the interference audio data as the false wake-up corpus of the wake-up engine.
S404, identifying the wake-up word included in the false wake-up corpus, and naming the false wake-up corpus according to the wake-up word.
After the false wake-up corpus is obtained, it can be recognized to determine the wake-up word it contains, and the corpus is then named according to that wake-up word. A user can later tell directly from the name of a false wake-up corpus which wake-up word caused the wake-up engine to be falsely woken up. After the false wake-up corpus is named according to the wake-up word, it can be saved locally on the head unit. In an alternative embodiment, the naming and saving of the false wake-up corpus may be done through a file writing tool.
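A file writing tool of this kind might look like the sketch below, which uses Python's standard wave module. The wake-up word is assumed to have been recognized already and is passed in as a string; the function name, the "<wake word>__zone<N>_<timestamp>.wav" naming scheme, and the audio format are all illustrative assumptions, not details given in the application.

```python
import time
import wave
from pathlib import Path
from typing import Optional


def save_false_wake(pcm: bytes, wake_word: str, out_dir: str, zone: int,
                    sample_rate: int = 16000, sample_width: int = 2,
                    timestamp: Optional[int] = None) -> Path:
    """Save one false wake-up corpus as a mono WAV file named after its wake word."""
    ts = int(time.time() * 1000) if timestamp is None else timestamp
    directory = Path(out_dir)
    directory.mkdir(parents=True, exist_ok=True)
    # Assumed naming scheme: the wake word comes first, followed by "__".
    path = directory / f"{wake_word}__zone{zone}_{ts}.wav"
    with wave.open(str(path), "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(sample_width)
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm)
    return path
```

Putting the wake-up word at the front of the file name lets a later step group the saved corpora by wake-up word without opening the audio.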
S405, determining the false wake-up rate of the wake-up word according to the named false wake-up corpora.
Further, after the false wake-up stress test ends, that is, after the external tool stops playing interference audio data, the false wake-up rate of each wake-up word per unit time is determined from the names of all locally stored false wake-up corpora. The user thus learns which wake-up word has the highest false wake-up rate and can avoid it when issuing voice commands later.
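The rate calculation can be sketched as a count of saved corpora per wake-up word divided by the test duration. The file-name convention (wake-up word, then "__", then any suffix) and the choice of hours as the unit of time are assumptions made for this example.

```python
from collections import Counter
from typing import Dict, Iterable


def false_wake_rates(file_names: Iterable[str], test_hours: float) -> Dict[str, float]:
    """False wake-ups per hour for each wake word, derived from corpus file names.

    Each name is assumed to start with the wake word followed by "__".
    """
    counts = Counter(name.split("__", 1)[0] for name in file_names)
    return {word: count / test_hours for word, count in counts.items()}


names = [
    "hello_car__zone1_1.wav",
    "hello_car__zone2_5.wav",
    "hi_car__zone1_9.wav",
]
rates = false_wake_rates(names, test_hours=4)
worst_word = max(rates, key=rates.get)  # wake word with the highest rate
```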
S406, taking the false wake-up corpus of the wake-up engine as a negative sample of the wake-up model in the wake-up engine, and training the wake-up model.
The head unit can use the locally stored false wake-up corpora as negative samples of the wake-up model in the wake-up engine and train the wake-up model, thereby optimizing the wake-up engine. It should be noted that the false wake-up corpora stored locally on the head unit can also be sent to the cloud, where the corrective training of the wake-up model can be carried out.
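The application does not describe the training procedure itself. As a small illustration of the data-preparation side only, the sketch below labels saved false wake-up corpora as negative samples (label 0) alongside genuine wake-up recordings (label 1). The directory layout, the label values, and the function name are assumptions; the resulting list would be handed to whatever training code the wake-up model uses.

```python
from pathlib import Path
from typing import List, Tuple


def build_training_manifest(positive_dir: str,
                            false_wake_dir: str) -> List[Tuple[str, int]]:
    """List (path, label) pairs: 1 for genuine wake-ups, 0 for false wake-up corpora."""
    positives = [(str(p), 1) for p in sorted(Path(positive_dir).glob("*.wav"))]
    negatives = [(str(p), 0) for p in sorted(Path(false_wake_dir).glob("*.wav"))]
    return positives + negatives
```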
Fig. 5 is a flowchart of a method for determining a false wake-up corpus according to an embodiment of the present application. Referring to Fig. 5, the automobile in this embodiment is, for example, provided with four sound zones. MIC1 to MIC4 in Fig. 5 are four audio collectors (for example, microphones) arranged in different sound zones of the head unit to collect audio data. Each microphone is associated with a wake-up engine. Four LRU cache queues are also provided, and the wake-up engine of each sound zone is associated with one LRU cache queue. In addition, a wake-up monitoring mechanism monitors in real time whether a wake-up engine has been successfully woken up.
In operation, the external tool plays interference audio data inside the automobile, and the microphones MIC1, MIC2, MIC3, and MIC4 each collect the interference audio data of the sound zone in which they are located. Each microphone then feeds the collected interference audio data into its wake-up engine and, synchronously, stores it in the LRU cache queue corresponding to that wake-up engine according to the preset duration. After a wake-up engine is successfully woken up, a callback identifies which sound zone's wake-up engine was woken up, which in turn determines the LRU cache queue associated with that engine; the interference audio data currently cached in that queue is taken as the false wake-up corpus. To save the false wake-up corpus, it is popped from the LRU cache queue and passed to a file writing tool, which saves it locally on the head unit, thereby collecting false wake-up corpora automatically.
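The four-zone workflow can be sketched end to end as below. Everything here is a simplified assumption: the Zone and Pipeline classes are hypothetical, the detector is a toy predicate on a single chunk rather than a streaming wake-up model, the cache is bounded by a number of chunks rather than by bytes, and the writer simply records what it is given instead of writing a file.

```python
from collections import deque
from typing import Callable, Deque, Dict, List, Tuple


class Zone:
    """One sound zone: a wake detector plus its own bounded cache queue."""

    def __init__(self, zone_id: int, detector: Callable[[bytes], bool],
                 max_chunks: int) -> None:
        self.zone_id = zone_id
        self.detector = detector
        # Oldest chunk is dropped first once the queue is full.
        self.cache: Deque[bytes] = deque(maxlen=max_chunks)


class Pipeline:
    def __init__(self, zones: List[Zone],
                 writer: Callable[[int, bytes], None]) -> None:
        self.zones: Dict[int, Zone] = {zone.zone_id: zone for zone in zones}
        self.writer = writer

    def on_chunk(self, zone_id: int, chunk: bytes) -> None:
        zone = self.zones[zone_id]
        zone.cache.append(chunk)      # cached in step with the engine input
        if zone.detector(chunk):      # the zone's engine was woken up
            self.on_wake(zone_id)

    def on_wake(self, zone_id: int) -> None:
        # Callback: the zone id selects the cache queue tied to the woken engine.
        zone = self.zones[zone_id]
        corpus = b"".join(zone.cache)
        zone.cache.clear()            # pop the corpus out of the queue
        self.writer(zone_id, corpus)  # hand over to the file writing step


saved: List[Tuple[int, bytes]] = []
pipeline = Pipeline(
    zones=[Zone(i, lambda chunk: chunk == b"W", max_chunks=3) for i in range(1, 5)],
    writer=lambda zone_id, corpus: saved.append((zone_id, corpus)),
)
pipeline.on_chunk(1, b"x")            # zone 1 hears noise only
for chunk in (b"a", b"b", b"c", b"d", b"W"):
    pipeline.on_chunk(2, chunk)       # zone 2 is woken by the last chunk
```

Only zone 2's queue is emptied and written out; the cached audio of the other zones is left untouched.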
Fig. 6 is a schematic structural diagram of an apparatus for determining a false wake-up corpus according to an embodiment of the present application. The embodiment is applicable to the case where an in-vehicle head unit determines, from audio data collected by an audio collector, the corpus that falsely wakes up the voice assistant. As shown in Fig. 6, the apparatus 600 specifically includes:
a voice acquisition module 601, configured to collect interference audio data through at least one audio collector, wherein the audio acquisition area in which the audio collector is located includes a wake-up engine associated with the audio collector;
a data input module 602, configured to input the interference audio data collected by the audio collector into the associated wake-up engine;
a false wake-up corpus determining module 603, configured to take the interference audio data as a false wake-up corpus of the wake-up engine when the interference audio data input into the wake-up engine successfully wakes up the wake-up engine.
On the basis of the foregoing embodiment, optionally, the false wake-up corpus determining module includes:
a false wake-up corpus determining unit, configured to, when the interference audio data input into the wake-up engine successfully wakes up the wake-up engine, take the interference audio data of a preset duration before the successful wake-up as the false wake-up corpus of the wake-up engine.
On the basis of the above embodiment, optionally, the apparatus further includes:
a cache module, configured to input the interference audio data into a cache unit associated with the wake-up engine, wherein the cache unit is configured to store the most recent preset duration of the input interference audio data;
the false wake-up corpus determining unit is specifically configured to:
in response to any wake-up engine being successfully woken up, take the interference audio data cached in the cache unit associated with that wake-up engine as the false wake-up corpus of the wake-up engine.
On the basis of the above embodiment, optionally, the apparatus further includes:
an identification and naming module, configured to, after the interference audio data is taken as the false wake-up corpus of the wake-up engine, identify the wake-up word included in the false wake-up corpus and name the false wake-up corpus according to the wake-up word.
On the basis of the above embodiment, optionally, the apparatus further includes:
a computing module, configured to, after the false wake-up corpus is named according to the wake-up word, determine the false wake-up rate of the wake-up word according to the named false wake-up corpora.
On the basis of the above embodiment, optionally, the apparatus further includes:
a training module, configured to, after the interference audio data is taken as the false wake-up corpus of the wake-up engine, take the false wake-up corpus as a negative sample of the wake-up model in the wake-up engine and train the wake-up model.
The apparatus 600 for determining a false wake-up corpus provided in the embodiment of the present application can execute the method for determining a false wake-up corpus provided in any embodiment of the present application, and has the functional modules and beneficial effects corresponding to the executed method. For details not explicitly described in this embodiment, reference may be made to the description of any method embodiment of the present application.
According to an embodiment of the present application, an electronic device and a readable storage medium are also provided.
Fig. 7 is a block diagram of an electronic device according to an embodiment of the disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are meant to be examples only, and are not meant to limit implementations of the present application that are described and/or claimed herein.
As shown in Fig. 7, the electronic device includes one or more processors 701, a memory 702, and interfaces for connecting the various components, including a high-speed interface and a low-speed interface. The various components are interconnected using different buses and may be mounted on a common motherboard or in other manners as desired. The processor may process instructions for execution within the electronic device, including instructions stored in or on the memory to display graphical information of a GUI on an external input/output apparatus (such as a display device coupled to the interface). In other embodiments, multiple processors and/or multiple buses may be used, along with multiple memories, as desired. Also, multiple electronic devices may be connected, with each device providing portions of the necessary operations (e.g., as a server array, a group of blade servers, or a multi-processor system). In Fig. 7, one processor 701 is taken as an example.
The memory 702 is a non-transitory computer readable storage medium as provided herein. The memory stores instructions executable by the at least one processor, so that the at least one processor executes the method for determining the false wake-up corpus. The non-transitory computer readable storage medium of the present application stores computer instructions for causing a computer to perform the false wake corpus determination method provided herein.
The memory 702 is a non-transitory computer readable storage medium, and can be used to store non-transitory software programs, non-transitory computer executable programs, and modules, such as program instructions/modules corresponding to the false wake corpus determination method in the embodiment of the present application (for example, the voice acquisition module 601, the data input module 602, and the false wake corpus determination module 603 shown in fig. 6). The processor 701 executes various functional applications and data processing of the server by running non-transitory software programs, instructions and modules stored in the memory 702, that is, implements the false wake corpus determination method in the above method embodiment.
The memory 702 may include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function; the storage data area may store data created according to use of the electronic device that implements the false wake corpus determination method according to the embodiment of the present application, and the like. Further, the memory 702 may include high speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some embodiments, the memory 702 may optionally include a memory remotely located from the processor 701, and the remote memory may be connected to an electronic device implementing the false wake corpus determination method of the present embodiments via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The electronic device implementing the false wake-up corpus determination method according to the embodiment of the present application may further include: an input device 703 and an output device 704. The processor 701, the memory 702, the input device 703 and the output device 704 may be connected by a bus or other means, and fig. 7 illustrates an example of a connection by a bus.
The input device 703 may receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic device implementing the false wake-up corpus determination method of the embodiment of the present application. Examples of the input device include a touch screen, a keypad, a mouse, a track pad, a touch pad, a pointing stick, one or more mouse buttons, a track ball, and a joystick. The output device 704 may include a display device, auxiliary lighting devices (e.g., LEDs), and tactile feedback devices (e.g., vibrating motors), among others. The display device may include, but is not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, and a plasma display. In some implementations, the display device can be a touch screen.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, application specific integrated circuits (ASICs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, coupled to receive data and instructions from, and to transmit data and instructions to, a storage system, at least one input device, and at least one output device.
These computer programs (also known as programs, software applications, or code) include machine instructions for a programmable processor, and may be implemented using high-level procedural and/or object-oriented programming languages, and/or assembly/machine languages. As used herein, the terms "machine-readable medium" and "computer-readable medium" refer to any computer program product, apparatus, and/or device (e.g., magnetic discs, optical disks, memory, Programmable Logic Devices (PLDs)) used to provide machine instructions and/or data to a programmable processor, including a machine-readable medium that receives machine instructions as a machine-readable signal. The term "machine-readable signal" refers to any signal used to provide machine instructions and/or data to a programmable processor.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), the internet, and blockchain networks.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or a cloud host, which is a host product in a cloud computing service system and overcomes the high management difficulty and weak service scalability of traditional physical hosts and VPS services.
The technical solution of the embodiments of the present application collects false wake-up corpora automatically, which improves the efficiency of determining false wake-up corpora.
It should be understood that the flows shown above may be used in various forms, with steps reordered, added, or deleted. For example, the steps described in the present application may be executed in parallel, sequentially, or in a different order, as long as the desired results of the technical solutions disclosed in the present application can be achieved, which is not limited herein.
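As a rough illustration of the automatic collection summarized above, the following Python sketch feeds interference audio frames into a wake-up engine while a fixed-length cache keeps only the most recent frames; when the engine reports a wake-up, the cached audio is saved as a false wake-up corpus. This is a minimal sketch under stated assumptions, not the implementation of the application: the class name `FalseWakeCollector`, the `engine_detects` callable that stands in for the wake-up engine, the representation of audio as byte frames, and the choice to clear the cache after each capture are all hypothetical.

```python
from collections import deque
from typing import Callable, Deque, List


class FalseWakeCollector:
    """Hypothetical collector pairing one wake-up engine with one cache unit."""

    def __init__(self, engine_detects: Callable[[bytes], bool], max_frames: int) -> None:
        # engine_detects is a stand-in for the wake-up engine (assumption):
        # it returns True when the frame it receives triggers a wake-up.
        self.engine_detects = engine_detects
        # The deque drops its oldest frame once max_frames is reached, so it
        # always holds the most recent "preset duration" of audio.
        self.cache: Deque[bytes] = deque(maxlen=max_frames)
        self.corpora: List[bytes] = []

    def feed(self, frame: bytes) -> bool:
        """Cache the frame, pass it to the engine, and capture on wake-up."""
        self.cache.append(frame)
        woke = self.engine_detects(frame)
        if woke:
            # Only interference audio is played, so any wake-up is a false one.
            self.corpora.append(b"".join(self.cache))
            # Clearing avoids overlapping captures (a design assumption).
            self.cache.clear()
        return woke


# Usage with a stub engine that "wakes" on the frame b"W".
collector = FalseWakeCollector(lambda frame: frame == b"W", max_frames=3)
for frame in [b"a", b"b", b"c", b"d", b"W", b"e"]:
    collector.feed(frame)
print(collector.corpora)  # [b'cdW']
```

In this sketch the captured segment includes the frame that triggered the wake-up; a real system would size `max_frames` from the preset duration and the frame length.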
The above-described embodiments should not be construed as limiting the scope of the present application. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present application shall be included in the protection scope of the present application.

Claims (14)

1. A false wake-up corpus determination method, comprising:
collecting interference audio data through at least one audio collector, wherein the audio collection area in which the audio collector is located includes a wake-up engine associated with the audio collector;
inputting the interference audio data collected by the audio collector into the associated wake-up engine; and
in a case where the wake-up engine is successfully woken up by the input interference audio data, taking the interference audio data as a false wake-up corpus of the wake-up engine.
2. The method according to claim 1, wherein taking the interference audio data as the false wake-up corpus of the wake-up engine in a case where the wake-up engine is successfully woken up by the input interference audio data comprises:
in a case where the wake-up engine is successfully woken up by the input interference audio data, taking the interference audio data of a preset duration before the successful wake-up as the false wake-up corpus of the wake-up engine.
3. The method of claim 2, wherein:
the method further comprises:
inputting the interference audio data into a cache unit associated with the wake-up engine, wherein the cache unit is configured to store the most recent preset duration of the input interference audio data; and
taking, in a case where the wake-up engine is successfully woken up by the input interference audio data, the interference audio data of the preset duration before the successful wake-up as the false wake-up corpus of the wake-up engine comprises:
in response to any wake-up engine being successfully woken up, taking the interference audio data cached in the cache unit associated with that wake-up engine as the false wake-up corpus of that wake-up engine.
4. The method of claim 1, further comprising, after taking the interference audio data as the false wake-up corpus of the wake-up engine:
identifying a wake-up word included in the false wake-up corpus, and naming the false wake-up corpus according to the wake-up word.
5. The method according to claim 4, further comprising, after naming the false wake-up corpus according to the wake-up word:
determining a false wake-up rate of the wake-up word according to the named false wake-up corpus.
6. The method of claim 1, further comprising, after taking the interference audio data as the false wake-up corpus of the wake-up engine:
using the false wake-up corpus of the wake-up engine as a negative sample of a wake-up model in the wake-up engine to train the wake-up model.
7. A false wake-up corpus determination device, comprising:
a voice acquisition module, configured to collect interference audio data through at least one audio collector, wherein the audio collection area in which the audio collector is located includes a wake-up engine associated with the audio collector;
a data input module, configured to input the interference audio data collected by the audio collector into the associated wake-up engine; and
a false wake-up corpus determination module, configured to take the interference audio data as a false wake-up corpus of the wake-up engine in a case where the wake-up engine is successfully woken up by the input interference audio data.
8. The apparatus of claim 7, wherein the false wake-up corpus determination module comprises:
a false wake-up corpus determination unit, configured to take, in a case where the wake-up engine is successfully woken up by the input interference audio data, the interference audio data of a preset duration before the successful wake-up as the false wake-up corpus of the wake-up engine.
9. The apparatus of claim 8, further comprising:
a cache module, configured to input the interference audio data into a cache unit associated with the wake-up engine, wherein the cache unit is configured to store the most recent preset duration of the input interference audio data;
wherein the false wake-up corpus determination unit is specifically configured to:
in response to any wake-up engine being successfully woken up, take the interference audio data cached in the cache unit associated with that wake-up engine as the false wake-up corpus of that wake-up engine.
10. The apparatus of claim 7, further comprising:
an identification and naming module, configured to identify, after the interference audio data is taken as the false wake-up corpus of the wake-up engine, a wake-up word included in the false wake-up corpus, and to name the false wake-up corpus according to the wake-up word.
11. The apparatus of claim 10, further comprising:
a calculation module, configured to determine, after the false wake-up corpus is named according to the wake-up word, a false wake-up rate of the wake-up word according to the named false wake-up corpus.
12. The apparatus of claim 7, further comprising:
a training module, configured to use, after the interference audio data is taken as the false wake-up corpus of the wake-up engine, the false wake-up corpus of the wake-up engine as a negative sample of a wake-up model in the wake-up engine to train the wake-up model.
13. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the false wake-up corpus determination method of any one of claims 1-6.
14. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the false wake-up corpus determination method according to any one of claims 1-6.
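
Illustrative note (not part of the claims): the naming and rate steps recited in claims 4 and 5 could be sketched as follows. The claims do not define how the false wake-up rate is computed, so this sketch assumes it is the number of false wake-ups for a wake-up word divided by the hours of interference audio played. The function names, the file-naming pattern, and the example wake-up words are hypothetical, and wake-up word recognition is assumed to have already produced one label per captured corpus.

```python
from collections import Counter
from typing import Dict, List


def name_corpora(wake_words: List[str]) -> List[str]:
    """Name each corpus after its recognized wake-up word plus a running index.

    The "<word>_<index>.wav" pattern is an assumption made for illustration.
    """
    seen: Counter = Counter()
    names = []
    for word in wake_words:
        seen[word] += 1
        names.append(f"{word}_{seen[word]:04d}.wav")
    return names


def false_wake_rate(names: List[str], audio_hours: float) -> Dict[str, float]:
    """Return false wake-ups per hour for each wake-up word (assumed definition)."""
    if audio_hours <= 0:
        raise ValueError("audio_hours must be positive")
    # Recover the wake-up word by stripping the trailing "_<index>.wav" part.
    counts = Counter(name.rsplit("_", 1)[0] for name in names)
    return {word: count / audio_hours for word, count in counts.items()}


names = name_corpora(["wake_a", "wake_b", "wake_a"])
print(names)                        # ['wake_a_0001.wav', 'wake_b_0001.wav', 'wake_a_0002.wav']
print(false_wake_rate(names, 4.0))  # {'wake_a': 0.5, 'wake_b': 0.25}
```

Grouping by file name keeps the rate calculation independent of the recognizer, which is one reason naming each corpus by its wake-up word is convenient.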
CN202011076956.1A 2020-10-10 2020-10-10 Method and device for determining mistakenly awakened corpus, electronic equipment and storage medium Pending CN112233681A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011076956.1A CN112233681A (en) 2020-10-10 2020-10-10 Method and device for determining mistakenly awakened corpus, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011076956.1A CN112233681A (en) 2020-10-10 2020-10-10 Method and device for determining mistakenly awakened corpus, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112233681A true CN112233681A (en) 2021-01-15

Family

ID=74113061

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011076956.1A Pending CN112233681A (en) 2020-10-10 2020-10-10 Method and device for determining mistakenly awakened corpus, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112233681A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115588435A (en) * 2022-11-08 2023-01-10 荣耀终端有限公司 Voice wake-up method and electronic equipment

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140163978A1 (en) * 2012-12-11 2014-06-12 Amazon Technologies, Inc. Speech recognition power management
CN107808670A (en) * 2017-10-25 2018-03-16 百度在线网络技术(北京)有限公司 Voice data processing method, device, equipment and storage medium
CN109448708A (en) * 2018-10-15 2019-03-08 四川长虹电器股份有限公司 Far field voice wakes up system
CN110070857A (en) * 2019-04-25 2019-07-30 北京梧桐车联科技有限责任公司 The model parameter method of adjustment and device, speech ciphering equipment of voice wake-up model
CN111640426A (en) * 2020-06-10 2020-09-08 北京百度网讯科技有限公司 Method and apparatus for outputting information
CN112071323A (en) * 2020-09-18 2020-12-11 北京百度网讯科技有限公司 Method and device for acquiring false wake-up sample data and electronic equipment

Similar Documents

Publication Publication Date Title
JP6751433B2 (en) Processing method, device and storage medium for waking up application program
CN111192591B (en) Awakening method and device of intelligent equipment, intelligent sound box and storage medium
AU2019246868B2 (en) Method and system for voice activation
CN104133652A (en) Audio playing control method and terminal
CN111755002B (en) Speech recognition device, electronic apparatus, and speech recognition method
CN112634890B (en) Method, device, equipment and storage medium for waking up playing equipment
CN112382285B (en) Voice control method, voice control device, electronic equipment and storage medium
CN111177453B (en) Method, apparatus, device and computer readable storage medium for controlling audio playing
CN107180631A (en) A kind of voice interactive method and device
CN109192208A (en) A kind of control method of electrical equipment, system, device, equipment and medium
CN111968642A (en) Voice data processing method and device and intelligent vehicle
CN106935253A (en) The method of cutting out of audio file, device and terminal device
CN111640426A (en) Method and apparatus for outputting information
CN112382279B (en) Voice recognition method and device, electronic equipment and storage medium
CN112466296A (en) Voice interaction processing method and device, electronic equipment and storage medium
CN112530419A (en) Voice recognition control method and device, electronic equipment and readable storage medium
CN111883127A (en) Method and apparatus for processing speech
CN112071323B (en) Method and device for acquiring false wake-up sample data and electronic equipment
CN112652304B (en) Voice interaction method and device of intelligent equipment and electronic equipment
CN112233681A (en) Method and device for determining mistakenly awakened corpus, electronic equipment and storage medium
CN111292716A (en) Voice chip and electronic equipment
CN111261143B (en) Voice wakeup method and device and computer readable storage medium
US20210201894A1 (en) N/a
CN113961619A (en) Data synchronization method and device, computer equipment and storage medium
CN112382292A (en) Voice-based control method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20211022

Address after: 100176 101, floor 1, building 1, yard 7, Ruihe West 2nd Road, Beijing Economic and Technological Development Zone, Daxing District, Beijing

Applicant after: Apollo Zhilian (Beijing) Technology Co.,Ltd.

Address before: 2 / F, baidu building, 10 Shangdi 10th Street, Haidian District, Beijing 100085

Applicant before: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY Co.,Ltd.
