CN111429911A - Method and device for reducing power consumption of speech recognition engine in noise scene

Info

Publication number
CN111429911A
Authority
CN
China
Prior art keywords
voice recognition
voice
recognition result
recognition engine
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010163866.XA
Other languages
Chinese (zh)
Inventor
闫子魁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Unisound Intelligent Technology Co Ltd
Xiamen Yunzhixin Intelligent Technology Co Ltd
Original Assignee
Unisound Intelligent Technology Co Ltd
Xiamen Yunzhixin Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Unisound Intelligent Technology Co Ltd and Xiamen Yunzhixin Intelligent Technology Co Ltd
Priority to CN202010163866.XA
Publication of CN111429911A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/26: Speech to text systems
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02: Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208: Noise filtering

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The invention discloses a method and a device for reducing power consumption of a speech recognition engine in a noise scene. The method comprises the following steps: acquiring a first voice recognition engine and a second voice recognition engine, wherein the power consumption of the first voice recognition engine is less than that of the second voice recognition engine; acquiring a preset number of pieces of voice information in a noise scene; preliminarily recognizing the preset number of pieces of voice information through the first voice recognition engine to obtain a first voice recognition result; and recognizing the first voice recognition result through the second voice recognition engine to obtain a second voice recognition result. According to the technical scheme of the invention, the low-power first speech recognition engine recognizes and filters useless noise, so the high-power second speech recognition engine does not have to recognize useless noise frequently. The second engine therefore runs less often, power consumption is greatly reduced, and the accuracy of the obtained second speech recognition result is high.

Description

Method and device for reducing power consumption of speech recognition engine in noise scene
Technical Field
The invention relates to the technical field of voice interaction and speech recognition, in particular to a method and a device for reducing power consumption of a voice recognition engine in a noise scene.
Background
Speech recognition is an interdisciplinary field. Over the last two decades, speech recognition technology has advanced significantly and has begun to move from the laboratory to the market. Within the next 10 years, it is expected to enter fields such as industry, home appliances, communications, automotive electronics, medical care, home services, and consumer electronics. In these fields, noise is a major obstacle to putting speech recognition into practical use.
At present, a voice recognition engine that uses VAD (voice activity detection) recognizes speech poorly in a noise scene, while a voice recognition engine without VAD consumes relatively high power when performing voice recognition in a noise scene. How to ensure a good recognition effect in a noise scene while keeping power consumption low is therefore a problem that urgently needs to be solved.
Disclosure of Invention
The invention provides a method and a device for reducing power consumption of a speech recognition engine in a noise scene, wherein the technical scheme is as follows:
according to a first aspect of the embodiments of the present invention, there is provided a method for reducing power consumption of a speech recognition engine in a noise scene, including:
acquiring a first voice recognition engine and a second voice recognition engine, wherein the power consumption of the first voice recognition engine is less than that of the second voice recognition engine;
acquiring a preset number of pieces of voice information in a noise scene;
preliminarily recognizing the preset number of pieces of voice information through the first voice recognition engine to obtain a first voice recognition result;
and recognizing the first voice recognition result through the second voice recognition engine to obtain a second voice recognition result.
In one embodiment, the method for reducing power consumption of a speech recognition engine in a noisy scene further comprises:
judging whether the first voice recognition result meets a preset condition or not;
when the first voice recognition result does not meet the preset condition, recognizing the first voice recognition result through the first voice recognition engine to obtain a third voice recognition result;
and recognizing the third voice recognition result through the second voice recognition engine to obtain a fourth voice recognition result.
And when the first voice recognition result meets the preset condition, recognizing the first voice recognition result through the second voice recognition engine to obtain a second voice recognition result.
In one embodiment, the preliminary recognition of the preset number of pieces of speech information by the first speech recognition engine to obtain a first speech recognition result includes:
acquiring various noise information in the noise scene;
extracting various noise characteristics in the various noise information;
extracting the voice characteristics corresponding to the preset number of pieces of voice information;
and the first voice recognition engine filters the preset number of pieces of voice information according to the voice characteristics and the various noise characteristics to obtain the first voice recognition result.
In one embodiment, the recognizing, by the second speech recognition engine, the first speech recognition result to obtain a second speech recognition result includes:
judging whether the first voice recognition result has voice information of a user;
when the first voice recognition result has the voice information of the user, calculating the first recognition result to obtain a living body detection score;
judging whether the living body detection score is larger than a preset threshold value or not;
and when the living body detection score is larger than a preset threshold value, identifying the first voice recognition result through the second voice recognition engine to obtain a second voice recognition result.
In one embodiment, the filtering, by the first speech recognition engine, the preset number of pieces of speech information according to the speech features and the various noise features to obtain the first speech recognition result includes:
respectively judging whether the voice characteristics corresponding to the preset number of pieces of voice information are matched with the various noise characteristics;
and the first voice recognition engine filters a plurality of pieces of voice information of which the voice characteristics are matched with the various noise characteristics in the preset number of pieces of voice information to obtain a first voice recognition result.
According to a second aspect of the embodiments of the present invention, there is provided an apparatus for reducing power consumption of a speech recognition engine in a noise scene, including:
the first acquisition module is used for acquiring a first voice recognition engine and a second voice recognition engine, wherein the power consumption of the first voice recognition engine is less than that of the second voice recognition engine;
the second acquisition module is used for acquiring a preset number of pieces of voice information in a noise scene;
the first recognition module is used for carrying out preliminary recognition on the preset number of pieces of voice information through the first voice recognition engine so as to obtain a first voice recognition result;
and the second recognition module is used for recognizing the first voice recognition result through the second voice recognition engine so as to obtain a second voice recognition result.
In one embodiment, the apparatus for reducing power consumption of a speech recognition engine in a noisy scene further comprises:
the judging module is used for judging whether the first voice recognition result meets a preset condition or not;
the third recognition module is used for recognizing the first voice recognition result through the first voice recognition engine when the first voice recognition result does not meet the preset condition so as to obtain a third voice recognition result;
and the fourth recognition module is used for recognizing the third voice recognition result through the second voice recognition engine so as to obtain a fourth voice recognition result.
And the fifth recognition module is used for recognizing the first voice recognition result through the second voice recognition engine when the first voice recognition result meets the preset condition so as to obtain a second voice recognition result.
In one embodiment, the first recognition module includes:
the obtaining submodule is used for obtaining various noise information in the noise scene;
the first extraction submodule is used for extracting various noise characteristics in the various noise information;
the second extraction submodule is used for extracting the voice features corresponding to the preset number of pieces of voice information;
and the filtering submodule is used for filtering the preset number of pieces of voice information by the first voice recognition engine according to the voice characteristics and the various noise characteristics so as to obtain the first voice recognition result.
In one embodiment, the second recognition module includes:
the first judgment submodule is used for judging whether the first voice recognition result has voice information of a user;
the calculation submodule is used for calculating the first recognition result to obtain a living body detection score when the first voice recognition result has the voice information of the user;
the second judgment submodule is used for judging whether the living body detection score is larger than a preset threshold value or not;
and the recognition submodule is used for recognizing the first voice recognition result through the second voice recognition engine when the living body detection score is larger than a preset threshold value so as to obtain a second voice recognition result.
In one embodiment, the filtering submodule includes:
the judging unit is used for respectively judging whether the voice characteristics corresponding to the preset number of pieces of voice information are matched with the various noise characteristics;
and the filtering unit is used for filtering a plurality of pieces of voice information of which the voice characteristics are matched with the various noise characteristics in the preset number of pieces of voice information by the first voice recognition engine so as to obtain a first voice recognition result.
The technical scheme provided by the embodiment of the invention has the following beneficial effects:
A first voice recognition engine and a second voice recognition engine are acquired, and a preset number of pieces of voice information are then acquired in a noise scene. The low-power first voice recognition engine preliminarily recognizes the preset number of pieces of voice information and filters out useless noise, producing a first voice recognition result. The high-power, more capable second voice recognition engine then recognizes the filtered first voice recognition result to obtain a second voice recognition result. Because useless noise is recognized and filtered by the low-power first voice recognition engine, the high-power second voice recognition engine does not have to recognize useless noise frequently; it therefore runs less often, power consumption is greatly reduced, and the accuracy of the obtained second voice recognition result is high.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a flow chart of a method for reducing power consumption of a speech recognition engine in a noisy scene according to an embodiment of the present invention;
FIG. 2 is a flow chart of another method for reducing power consumption of a speech recognition engine in a noisy environment according to an embodiment of the present invention;
FIG. 3 is a block diagram of an apparatus for reducing power consumption of a speech recognition engine in a noisy scene according to an embodiment of the present invention;
FIG. 4 is a block diagram of another apparatus for reducing power consumption of a speech recognition engine in a noisy scene according to an embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.
FIG. 1 is a flowchart illustrating a method for reducing power consumption of a speech recognition engine in a noisy scene according to an embodiment of the present invention. As shown in FIG. 1, the method can be implemented as the following steps S11-S14:
in step S11, a first speech recognition engine and a second speech recognition engine are obtained, wherein the power consumption of the first speech recognition engine is less than that of the second speech recognition engine; the first speech recognition engine has few parameters, so its power consumption is very low but its capability is limited, and its main function is to filter useless noise out of the speech information, while the second speech recognition engine has high power consumption but a good recognition effect.
In step S12, a preset number of pieces of voice information in a noise scene are acquired;
in step S13, a preset number of pieces of speech information are preliminarily recognized by the first speech recognition engine to obtain a first speech recognition result; that is, the first speech recognition engine recognizes the useful voice in the preset number of pieces of voice information and filters out useless noise, and the remaining voice information is the first voice recognition result.
In step S14, the first speech recognition result is recognized by the second speech recognition engine to obtain a second speech recognition result.
In steps S11-S14, a first voice recognition engine and a second voice recognition engine are acquired, and a preset number of pieces of voice information are then acquired in a noise scene. The low-power first voice recognition engine preliminarily recognizes the preset number of pieces of voice information and filters out useless noise, producing a first voice recognition result. The high-power, more capable second voice recognition engine then recognizes the filtered first voice recognition result to obtain a second voice recognition result. Because useless noise is recognized and filtered by the low-power first voice recognition engine, the high-power second voice recognition engine does not have to recognize useless noise frequently; it therefore runs less often, power consumption is greatly reduced, and the accuracy of the obtained second voice recognition result is high.
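The two-stage flow of steps S11-S14 can be illustrated with a minimal Python sketch. It is not the patented implementation: the `Utterance` type, the `CascadeRecognizer` class, and the two callables standing in for the engines are hypothetical names introduced only for illustration, and the `is_noise` flag stands in for whatever a real low-power model would predict.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Utterance:
    """One piece of captured voice information (hypothetical structure)."""
    audio_id: str
    is_noise: bool  # stand-in for the decision a real low-power model would make


class CascadeRecognizer:
    """Runs a cheap filter first and an expensive recognizer only on what survives."""

    def __init__(self,
                 light_filter: Callable[[Utterance], bool],
                 heavy_recognize: Callable[[Utterance], str]) -> None:
        self.light_filter = light_filter        # first engine: low power (assumed callable)
        self.heavy_recognize = heavy_recognize  # second engine: high power (assumed callable)
        self.heavy_calls = 0                    # counts invocations of the expensive engine

    def run(self, batch: List[Utterance]) -> List[str]:
        # S13: preliminary recognition keeps only the pieces judged to be useful voice.
        first_result = [u for u in batch if self.light_filter(u)]
        # S14: the expensive engine sees only the filtered pieces.
        second_result = []
        for utterance in first_result:
            self.heavy_calls += 1
            second_result.append(self.heavy_recognize(utterance))
        return second_result
```

With a batch of four pieces of which two are noise, the expensive engine in this sketch is called twice instead of four times, which is the source of the power saving the description refers to.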
In one embodiment, the method for reducing power consumption of a speech recognition engine in a noisy scene further comprises:
judging whether the first voice recognition result meets a preset condition; the preset condition may be, but is not limited to, that no noise information exists in the first voice recognition result.
When the first voice recognition result does not meet the preset condition, the first voice recognition result is recognized through the first voice recognition engine to obtain a third voice recognition result; that is, when noise information remains in the first voice recognition result, the first voice recognition engine recognizes the first voice recognition result again to filter out the remaining noise information.
And recognizing the third voice recognition result through the second voice recognition engine to obtain a fourth voice recognition result.
And when the first voice recognition result meets the preset condition, recognizing the first voice recognition result through the second voice recognition engine to obtain a second voice recognition result.
After judging whether the first voice recognition result meets the preset condition, the subsequent recognition operation is selected according to the judgment. This keeps the second voice recognition engine from recognizing useless voice information, reduces the use of the high-power second voice recognition engine, and thereby further reduces power consumption.
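The branch on the preset condition can be sketched as follows. This is an illustration under stated assumptions, not the method as claimed: each piece carries a hypothetical noise score, and the second pass through the first engine is assumed to use a stricter cutoff than the first pass, since a deterministic filter applied twice with the same setting would return the same result. The source does not say how the second pass differs; the function names and the values 0.8 and 0.5 are made up.

```python
from typing import List, Tuple

# (label, hypothetical noise score in [0, 1]); higher means more noise-like.
Piece = Tuple[str, float]


def light_filter(pieces: List[Piece], drop_at: float) -> List[Piece]:
    """First engine (assumed): drop pieces whose noise score reaches the cutoff."""
    return [p for p in pieces if p[1] < drop_at]


def meets_preset_condition(pieces: List[Piece], noise_at: float = 0.5) -> bool:
    """Preset condition used here: no remaining piece looks like noise."""
    return all(score < noise_at for _, score in pieces)


def heavy_recognize(pieces: List[Piece]) -> List[str]:
    """Second engine (assumed): stand-in that returns one text per piece."""
    return [f"text:{label}" for label, _ in pieces]


def recognize(pieces: List[Piece]) -> List[str]:
    first_result = light_filter(pieces, drop_at=0.8)
    if meets_preset_condition(first_result):
        # Condition met: this is the second voice recognition result.
        return heavy_recognize(first_result)
    # Condition not met: filter again (stricter cutoff is an assumption of this sketch).
    third_result = light_filter(first_result, drop_at=0.5)
    # This is the fourth voice recognition result.
    return heavy_recognize(third_result)
```

In either branch the expensive engine runs once, and only on pieces that have passed the cheap filter.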
As shown in fig. 2, in one embodiment, the above step S13 can be implemented as the following steps S131-S134, including:
in step S131, various noise information in a noise scene is acquired; the noise information refers to useless sounds, such as the cry of an animal, the booming of a machine, and the like.
In step S132, various noise features in various noise information are extracted;
in step S133, extracting respective corresponding voice features of a preset number of pieces of voice information;
in step S134, the first speech recognition engine filters a preset number of pieces of speech information according to the speech characteristics and various noise characteristics to obtain a first speech recognition result.
The noise features are extracted from the acquired noise information, and the voice features are extracted from the voice information. The first voice recognition engine then uses the voice features and the noise features to filter out the noise information, and the remaining voice information is the first voice recognition result. Filtering on both the voice features and the noise features prevents useful voice information from being filtered out.
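Steps S131-S134 can be sketched in Python as below. The patent does not specify which features are used, so this example assumes two simple ones (RMS energy and zero-crossing rate) and a Euclidean distance tolerance; `extract_feature`, `filter_pieces`, and `tolerance` are hypothetical names chosen for illustration.

```python
import math
from typing import List, Sequence, Tuple

Feature = Tuple[float, float]  # (RMS energy, zero-crossing rate); assumed feature set


def extract_feature(samples: Sequence[float]) -> Feature:
    """Compute two simple features from raw samples (at least two samples expected)."""
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    crossings = sum(1 for a, b in zip(samples, samples[1:]) if (a < 0) != (b < 0))
    zcr = crossings / (len(samples) - 1)
    return (rms, zcr)


def filter_pieces(pieces: List[Sequence[float]],
                  noise_clips: List[Sequence[float]],
                  tolerance: float = 0.1) -> List[Sequence[float]]:
    # S131-S132: acquire the noise clips and extract their features.
    noise_features = [extract_feature(clip) for clip in noise_clips]
    kept = []
    for piece in pieces:
        # S133: extract the feature of each piece of voice information.
        feature = extract_feature(piece)
        # S134: drop the piece if its feature is close to any known noise feature.
        is_noise = any(math.dist(feature, nf) <= tolerance for nf in noise_features)
        if not is_noise:
            kept.append(piece)
    return kept
```

A production system would more likely use spectral features such as MFCCs and a trained classifier; the two features here were chosen only because they can be computed with the standard library.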
In one embodiment, the recognizing, by the second speech recognition engine, the first speech recognition result to obtain a second speech recognition result includes:
judging whether the first voice recognition result has voice information of a user;
when the first voice recognition result has the voice information of the user, calculating the first recognition result to obtain a living body detection score;
judging whether the living body detection score is larger than a preset threshold value or not;
and when the living body detection score is larger than a preset threshold value, identifying the first voice recognition result through the second voice recognition engine to obtain a second voice recognition result.
A living body detection score is calculated from the first voice recognition result that contains the user's voice information and is compared with the preset threshold value. When the living body detection score is larger than the preset threshold value, the voice is judged to come from a live user, and the first voice recognition result is recognized by the second voice recognition engine. Living body detection thus prevents unnecessary recognition work by the second voice recognition engine and avoids wasting resources.
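The gating logic of this embodiment can be sketched as follows. The source does not say how the living body detection score is computed, so the sketch takes the scoring function, the user-voice check, and the second engine as caller-supplied callables; `gated_recognize` and its parameter names are hypothetical.

```python
from typing import Callable, List, Optional


def gated_recognize(first_result: List[str],
                    has_user_voice: Callable[[List[str]], bool],
                    liveness_score: Callable[[List[str]], float],
                    heavy_recognize: Callable[[List[str]], List[str]],
                    threshold: float = 0.5) -> Optional[List[str]]:
    """Invoke the high-power engine only when both checks pass; otherwise return None."""
    # Check 1: does the first result contain a user's voice at all?
    if not has_user_voice(first_result):
        return None
    # Check 2: the living body detection score must exceed the preset threshold.
    if liveness_score(first_result) <= threshold:
        return None
    # Both checks passed: run the second (high-power) engine.
    return heavy_recognize(first_result)
```

Returning `None` when a check fails is a design choice of this sketch; it lets the caller distinguish "engine not invoked" from an empty recognition result.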
In one embodiment, the filtering, by the first speech recognition engine, the preset number of pieces of speech information according to the speech features and the various noise features to obtain the first speech recognition result includes:
respectively judging whether the voice characteristics corresponding to the preset number of pieces of voice information are matched with the various noise characteristics;
and the first voice recognition engine filters a plurality of pieces of voice information of which the voice characteristics are matched with the various noise characteristics in the preset number of pieces of voice information to obtain a first voice recognition result.
Each piece of voice information is judged by whether its voice features match any of the various noise features. A piece whose features match is determined to be noise and is filtered out by the first voice recognition engine; the remaining voice information is the first voice recognition result. This matching mechanism prevents useful voice information from being filtered out.
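One possible reading of the matching mechanism is sketched below, assuming that features are already available as numeric vectors and that "matching" means cosine similarity at or above a cutoff. Both the similarity measure and the 0.95 threshold are assumptions made for this example; the source does not define how a match is decided.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def matches_noise(feature: Sequence[float],
                  noise_features: List[Sequence[float]],
                  threshold: float = 0.95) -> bool:
    """A piece matches when it is similar enough to any known noise feature."""
    return any(cosine_similarity(feature, nf) >= threshold for nf in noise_features)


def first_recognition_result(features_by_piece: Dict[str, Sequence[float]],
                             noise_features: List[Sequence[float]]) -> List[str]:
    """Return the ids of the pieces that remain after matched (noise) pieces are dropped."""
    return [piece_id for piece_id, feature in features_by_piece.items()
            if not matches_noise(feature, noise_features)]
```

Raising the threshold makes the filter more conservative, so fewer pieces are dropped as noise, which trades some power saving for a lower risk of discarding useful speech.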
For the method for reducing the power consumption of the speech recognition engine in the noise scene provided by the embodiment of the present invention, the embodiment of the present invention also provides a device for reducing the power consumption of the speech recognition engine in the noise scene, as shown in fig. 3, the device includes:
a first obtaining module 31, configured to obtain a first speech recognition engine and a second speech recognition engine, where power consumption of the first speech recognition engine is less than that of the second speech recognition engine;
a second obtaining module 32, configured to obtain a preset number of pieces of voice information in a noise scene;
the first recognition module 33 is configured to perform preliminary recognition on the preset number of pieces of speech information by using the first speech recognition engine to obtain a first speech recognition result;
and a second recognition module 34, configured to recognize the first speech recognition result through the second speech recognition engine to obtain a second speech recognition result.
In one embodiment, the apparatus for reducing power consumption of a speech recognition engine in a noisy scene further comprises:
the judging module is used for judging whether the first voice recognition result meets a preset condition or not;
the third recognition module is used for recognizing the first voice recognition result through the first voice recognition engine when the first voice recognition result does not meet the preset condition so as to obtain a third voice recognition result;
and the fourth recognition module is used for recognizing the third voice recognition result through the second voice recognition engine so as to obtain a fourth voice recognition result.
And the fifth recognition module is used for recognizing the first voice recognition result through the second voice recognition engine when the first voice recognition result meets the preset condition so as to obtain a second voice recognition result.
As shown in FIG. 4, in one embodiment, the first recognition module 33 includes:
the obtaining sub-module 331 is configured to obtain various noise information in the noise scene;
a first extraction submodule 332, configured to extract various noise features in the various noise information;
the second extraction submodule 333 is configured to extract respective voice features corresponding to the preset number of pieces of voice information;
and a filtering submodule 334, configured to filter, by the first speech recognition engine, the preset number of pieces of speech information according to the speech features and the various noise features, so as to obtain the first speech recognition result.
In one embodiment, the second recognition module includes:
the first judgment submodule is used for judging whether the first voice recognition result has voice information of a user;
the calculation submodule is used for calculating the first recognition result to obtain a living body detection score when the first voice recognition result has the voice information of the user;
the second judgment submodule is used for judging whether the living body detection score is larger than a preset threshold value or not;
and the recognition submodule is used for recognizing the first voice recognition result through the second voice recognition engine when the living body detection score is larger than a preset threshold value so as to obtain a second voice recognition result.
In one embodiment, the filtering submodule includes:
the judging unit is used for respectively judging whether the voice characteristics corresponding to the preset number of pieces of voice information are matched with the various noise characteristics;
and the filtering unit is used for filtering a plurality of pieces of voice information of which the voice characteristics are matched with the various noise characteristics in the preset number of pieces of voice information by the first voice recognition engine so as to obtain a first voice recognition result.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A method for reducing power consumption of a speech recognition engine in a noisy scene, comprising:
acquiring a first voice recognition engine and a second voice recognition engine, wherein the power consumption of the first voice recognition engine is less than that of the second voice recognition engine;
acquiring a preset number of pieces of voice information in a noise scene;
preliminarily recognizing the preset number of pieces of voice information through the first voice recognition engine to obtain a first voice recognition result;
and recognizing the first voice recognition result through the second voice recognition engine to obtain a second voice recognition result.
2. The method of claim 1, further comprising:
judging whether the first voice recognition result meets a preset condition or not;
when the first voice recognition result does not meet the preset condition, recognizing the first voice recognition result through the first voice recognition engine to obtain a third voice recognition result;
recognizing the third voice recognition result through the second voice recognition engine to obtain a fourth voice recognition result;
and when the first voice recognition result meets the preset condition, recognizing the first voice recognition result through the second voice recognition engine to obtain a second voice recognition result.
3. The method of claim 1, wherein the preliminarily recognizing the preset number of pieces of voice information through the first voice recognition engine to obtain a first voice recognition result comprises:
acquiring various noise information in the noise scene;
extracting various noise characteristics in the various noise information;
extracting the voice characteristics corresponding to the preset number of pieces of voice information;
and the first voice recognition engine filters the preset number of pieces of voice information according to the voice characteristics and the various noise characteristics to obtain the first voice recognition result.
4. The method of claim 1, wherein the recognizing the first voice recognition result through the second voice recognition engine to obtain a second voice recognition result comprises:
judging whether the first voice recognition result has voice information of a user;
when the first voice recognition result has the voice information of the user, calculating a living body detection score from the first voice recognition result;
judging whether the living body detection score is larger than a preset threshold value or not;
and when the living body detection score is larger than the preset threshold value, recognizing the first voice recognition result through the second voice recognition engine to obtain the second voice recognition result.
5. The method of claim 3, wherein the first voice recognition engine filtering the preset number of pieces of voice information according to the voice characteristics and the various noise characteristics to obtain the first voice recognition result comprises:
respectively judging whether the voice characteristics corresponding to the preset number of pieces of voice information are matched with the various noise characteristics;
and the first voice recognition engine filters out, from the preset number of pieces of voice information, the pieces of voice information whose voice characteristics are matched with the various noise characteristics, to obtain the first voice recognition result.
6. An apparatus for reducing power consumption of a speech recognition engine in a noisy scene, comprising:
the first acquisition module is used for acquiring a first voice recognition engine and a second voice recognition engine, wherein the power consumption of the first voice recognition engine is greater than that of the second voice recognition engine;
the second acquisition module is used for acquiring a preset number of pieces of voice information in a noise scene;
the first recognition module is used for carrying out preliminary recognition on the preset number of pieces of voice information through the first voice recognition engine so as to obtain a first voice recognition result;
and the second recognition module is used for recognizing the first voice recognition result through the second voice recognition engine so as to obtain a second voice recognition result.
7. The apparatus of claim 6, further comprising:
the judging module is used for judging whether the first voice recognition result meets a preset condition or not;
the third recognition module is used for recognizing the first voice recognition result through the first voice recognition engine when the first voice recognition result does not meet the preset condition so as to obtain a third voice recognition result;
the fourth recognition module is used for recognizing the third voice recognition result through the second voice recognition engine to obtain a fourth voice recognition result;
and the fifth recognition module is used for recognizing the first voice recognition result through the second voice recognition engine when the first voice recognition result meets the preset condition so as to obtain a second voice recognition result.
8. The apparatus of claim 6, wherein the first recognition module comprises:
the obtaining submodule is used for obtaining various noise information in the noise scene;
the first extraction submodule is used for extracting various noise characteristics in the various noise information;
the second extraction submodule is used for extracting the voice features corresponding to the preset number of pieces of voice information;
and the filtering submodule is used for filtering the preset number of pieces of voice information by the first voice recognition engine according to the voice characteristics and the various noise characteristics so as to obtain the first voice recognition result.
9. The apparatus of claim 6, wherein the second recognition module comprises:
the first judgment submodule is used for judging whether the first voice recognition result has voice information of a user;
the calculation submodule is used for calculating a living body detection score from the first voice recognition result when the first voice recognition result has the voice information of the user;
the second judgment submodule is used for judging whether the living body detection score is larger than a preset threshold value or not;
and the recognition submodule is used for recognizing the first voice recognition result through the second voice recognition engine when the living body detection score is larger than the preset threshold value so as to obtain a second voice recognition result.
10. The apparatus of claim 8, wherein the filtering submodule comprises:
the judging unit is used for respectively judging whether the voice characteristics corresponding to the preset number of pieces of voice information are matched with the various noise characteristics;
and the filtering unit is used for filtering out, by the first voice recognition engine, the pieces of voice information whose voice characteristics are matched with the various noise characteristics from the preset number of pieces of voice information, so as to obtain the first voice recognition result.
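
The flow recited in claims 1, 3, 4 and 5 can be illustrated with a minimal Python sketch. It is not the applicant's implementation: the `VoicePiece` fields, the tolerance-based `matches_noise` test, the averaging used for the living body detection score, and the `recognize` callback are all assumptions introduced here for illustration, since the claims do not specify how features are compared or how the score is computed.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

Features = Tuple[float, ...]


@dataclass(frozen=True)
class VoicePiece:
    """One piece of voice information (all fields are hypothetical)."""
    features: Features      # extracted voice features
    has_user_voice: bool    # whether a user's voice is present
    liveness: float         # assumed per-piece living body score in [0, 1]


def matches_noise(features: Features, noise_features: Sequence[Features],
                  tol: float = 0.1) -> bool:
    """Assumed matching rule: every component lies within `tol` of a noise vector."""
    return any(
        len(features) == len(noise)
        and all(abs(a - b) <= tol for a, b in zip(features, noise))
        for noise in noise_features
    )


def first_engine_filter(pieces: Sequence[VoicePiece],
                        noise_features: Sequence[Features]) -> List[VoicePiece]:
    """Claims 3 and 5: drop the pieces whose features match a noise feature."""
    return [p for p in pieces if not matches_noise(p.features, noise_features)]


def second_engine_recognize(first_result: Sequence[VoicePiece],
                            threshold: float,
                            recognize: Callable[[Sequence[VoicePiece]], str],
                            ) -> Optional[str]:
    """Claim 4: run the second engine only if user voice is present and the
    living body detection score exceeds the preset threshold."""
    with_user = [p for p in first_result if p.has_user_voice]
    if not with_user:
        return None  # no user voice: the second engine is not invoked
    # Assumption: the score is the mean of the per-piece liveness values.
    score = sum(p.liveness for p in with_user) / len(with_user)
    if score <= threshold:
        return None  # score not larger than the threshold: skip recognition
    return recognize(first_result)
```

In this sketch the second engine is represented only by the `recognize` callback, so the gating logic can be exercised without any real recognizer; returning `None` stands for the case in which the second engine is never run.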

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010163866.XA CN111429911A (en) 2020-03-11 2020-03-11 Method and device for reducing power consumption of speech recognition engine in noise scene

Publications (1)

Publication Number Publication Date
CN111429911A true CN111429911A (en) 2020-07-17

Family

ID=71547678

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010163866.XA Pending CN111429911A (en) 2020-03-11 2020-03-11 Method and device for reducing power consumption of speech recognition engine in noise scene

Country Status (1)

Country Link
CN (1) CN111429911A (en)

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2002055691A (en) * 2000-08-08 2002-02-20 Sanyo Electric Co Ltd Voice-recognition method
CN1723487A (en) * 2002-12-13 2006-01-18 摩托罗拉公司 Method and apparatus for selective speech recognition
JP2006208486A (en) * 2005-01-25 2006-08-10 Matsushita Electric Ind Co Ltd Voice inputting device
CN103594088A (en) * 2013-11-11 2014-02-19 联想(北京)有限公司 Information processing method and electronic equipment
US20140136215A1 (en) * 2012-11-13 2014-05-15 Lenovo (Beijing) Co., Ltd. Information Processing Method And Electronic Apparatus
US20140348345A1 (en) * 2013-05-23 2014-11-27 Knowles Electronics, Llc Vad detection microphone and method of operating the same
CN104423925A (en) * 2013-08-26 2015-03-18 联想(北京)有限公司 Information processing method and electronic device
US20170178628A1 (en) * 2015-12-22 2017-06-22 Nxp B.V. Voice activation system
CN108538305A (en) * 2018-04-20 2018-09-14 百度在线网络技术(北京)有限公司 Audio recognition method, device, equipment and computer readable storage medium
CN109036428A (en) * 2018-10-31 2018-12-18 广东小天才科技有限公司 A kind of voice wake-up device, method and computer readable storage medium
CN109215647A (en) * 2018-08-30 2019-01-15 出门问问信息科技有限公司 Voice awakening method, electronic equipment and non-transient computer readable storage medium
CN110021307A (en) * 2019-04-04 2019-07-16 Oppo广东移动通信有限公司 Audio method of calibration, device, storage medium and electronic equipment
CN110312235A (en) * 2019-05-16 2019-10-08 深圳市豪恩声学股份有限公司 Audio frequency apparatus, operation method, device and the storage medium that real-time voice wakes up
CN110427097A (en) * 2019-06-18 2019-11-08 华为技术有限公司 Voice data processing method, apparatus and system
CN110853644A (en) * 2019-11-20 2020-02-28 Oppo(重庆)智能科技有限公司 Voice wake-up method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200717