CN112765335A - Voice calling landing system - Google Patents

Publication number
CN112765335A
Authority
CN
China
Prior art keywords
voice
keyword
calling
call
recognition device
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110107551.8A
Other languages
Chinese (zh)
Other versions
CN112765335B (en)
Inventor
金鑫嘉
何晓光
翁彬
刘德辉
姚飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Mitsubishi Elevator Co Ltd
Original Assignee
Shanghai Mitsubishi Elevator Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Mitsubishi Elevator Co Ltd
Priority to CN202110107551.8A
Publication of CN112765335A
Application granted
Publication of CN112765335B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3343 Query execution using phonetics

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Mathematical Physics (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Health & Medical Sciences (AREA)
  • Acoustics & Sound (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Telephonic Communication Services (AREA)
  • Indicating And Signalling Devices For Elevators (AREA)

Abstract

The invention discloses a voice calling landing system. A voice acquisition device acquires an audio signal and sends it to a voice recognition device. The voice recognition device judges that a valid call instruction keyword exists in the audio signal when a call instruction keyword is recognized in the audio signal and the maximum amplitude of that keyword in the audio signal is greater than a trigger threshold. If a valid call instruction keyword exists and the audio signal amplitudes within a first set time before and after the keyword are all smaller than a speech threshold, the corresponding call instruction is sent to a microprocessor; otherwise, no call instruction is sent to the microprocessor. The speech threshold is less than or equal to the trigger threshold. The invention also discloses two other voice call systems. The voice call systems provided by the invention can avoid false triggering of call instructions.

Description

Voice calling landing system
Technical Field
The invention relates to elevators, and in particular to a voice call system.
Background
To reduce the probability of false triggering, a voice command is treated as valid only when its volume reaches a certain threshold. Accordingly, a voice call device of an elevator is triggered only when the volume of a spoken keyword reaches that threshold. However, because ambient noise levels differ from site to site, a fixed threshold is difficult to adapt to different situations, as shown in fig. 1. If the threshold is set for a quiet environment, false triggering still occurs in a relatively noisy environment; if the threshold is set for a noisy environment, a correct command may fail to trigger in a quiet environment.
Normal triggering occurs when there is no voice input before or after the keyword, as shown in fig. 2, but false triggering may occur when there is voice input before or after the keyword, as shown in fig. 3. When people are chatting, a sentence that happens to contain a keyword can be captured by the voice device and cause false triggering. For example, in a voice call device of an elevator, the keywords used as control commands are generally "go up/go down" or "I go up/I go down". These keywords are short, differ only in the words "up" and "down", and are similar in pronunciation. As a result, the voice call device is frequently triggered by mistake when used at the call site, either because a chat sentence contains a keyword or because one keyword is mistaken for the other (for example, "going upstairs" is recognized as "going downstairs", or "going downstairs" is recognized as "going upstairs"). In an elevator application scenario, "false triggering" is worse than "no response".
Disclosure of Invention
The invention aims to provide a voice call system that avoids false triggering of call instructions.
In order to solve the above technical problem, the invention provides a voice call system comprising a voice acquisition device, a voice recognition device and a microprocessor;
the voice acquisition device is used for acquiring audio signals and sending the audio signals to the voice recognition device;
the voice recognition device judges that a valid call instruction keyword exists in the audio signal when a call instruction keyword is recognized in the audio signal and the maximum amplitude of the call instruction keyword in the audio signal is greater than a trigger threshold;
if a valid call instruction keyword exists in the audio signal and the audio signal amplitudes within a first set time before and after the call instruction keyword are all smaller than a speech threshold, the voice recognition device sends the corresponding call instruction to the microprocessor; otherwise, it does not send a call instruction to the microprocessor;
the speech threshold is less than or equal to the trigger threshold.
Preferably, the voice recognition device receives the audio signal collected by the voice acquisition device, takes the mean amplitude of the audio signal received during the preceding second set time as the ambient volume mean, and adds a fixed offset volume to the ambient volume mean to obtain the current trigger threshold; the second set time is longer than the first set time;
and when the voice recognition device recognizes that a call instruction keyword exists in the audio signal and the amplitude of the audio signal of the call instruction keyword is greater than the current trigger threshold, it judges that a valid call instruction keyword exists in the audio signal.
Preferably, the second set time is between 5 seconds and 30 minutes.
Preferably, the speech threshold is less than or equal to 1/4 of the trigger threshold.
In order to solve the above technical problem, the invention provides another voice call system comprising a voice acquisition device, a voice recognition device and a microprocessor;
the voice acquisition device is used for acquiring audio signals and sending the audio signals to the voice recognition device;
the voice recognition device performs recognition calculation on the audio signal to obtain n keyword feature scores, where n is the number of keywords that the voice recognition device can recognize and n is an integer greater than 1; each keyword feature score represents the confidence of the corresponding keyword, and the higher the score, the higher the probability that the audio signal is that keyword;
when the highest of the n keyword feature scores is greater than a first set feature threshold and the difference between the two highest keyword feature scores is greater than or equal to a second set feature threshold, the keyword with the highest feature score is taken as the valid call instruction keyword obtained by recognition calculation, the second set feature threshold being smaller than the first set feature threshold;
when a valid call instruction keyword is recognized from the audio signal, the voice recognition device sends the corresponding call instruction to the microprocessor; otherwise, no call instruction is sent to the microprocessor.
Preferably, the voice recognition device is provided with a calling instruction keyword sample set;
the call instruction keyword sample set comprises n samples of call instruction keywords;
and the voice recognition device recognizes and calculates n call instruction keyword feature scores according to the matching degree of the audio signal and n call instruction keyword samples in the call instruction keyword sample set.
In order to solve the above technical problem, the invention provides a third voice call system comprising a voice acquisition device, a voice recognition device and a microprocessor;
the voice acquisition device is used for acquiring audio signals and sending the audio signals to the voice recognition device;
the voice recognition device performs recognition calculation on the audio signal to obtain k keyword feature scores, where k is the number of keywords that the voice recognition device can recognize and k is an integer greater than 1; each keyword feature score represents the confidence of the corresponding keyword, and the higher the score, the higher the probability that the audio signal is that keyword;
when the keyword with the highest feature score is not a junk keyword, that keyword is taken as the valid call instruction keyword obtained by recognition calculation, and the corresponding call instruction is sent to the microprocessor;
and when the keyword with the highest feature score is a junk keyword, no call instruction is sent to the microprocessor.
Preferably, the voice recognition device is provided with a calling instruction keyword sample set;
the call instruction keyword sample set comprises i samples of standard call instruction keywords and j samples of junk keywords, wherein i and j are positive integers, and k is i + j;
and the voice recognition device recognizes and calculates k keyword feature scores according to the matching degree of the audio signal and k call instruction keyword samples in the call instruction keyword sample set.
Preferably, when the keyword with the highest keyword feature score is not a junk keyword, that keyword is taken as the valid call instruction keyword and the corresponding call instruction is sent to the microprocessor only if its feature score is greater than a first set feature threshold; otherwise, no call instruction is sent to the microprocessor.
Preferably, after receiving the calling instruction, the microprocessor sends the information related to the calling instruction to the elevator controller;
and the elevator controller controls the elevator to run according to the call instruction related information and the elevator running state.
Preferably, the voice calling system is a landing voice calling system;
the voice acquisition device is used for acquiring a landing audio signal and sending the landing audio signal to the voice recognition device;
the call instruction keywords include "up" and "down".
Preferably, the voice calling system is an in-car voice calling system;
the voice acquisition device is used for acquiring audio signals in the car and sending the audio signals to the voice recognition device;
the call instruction keywords comprise at least one of "floor *", "open door" and "close door", where * is a floor number.
The voice call system can effectively avoid the false triggering of the call instruction.
Drawings
To illustrate the technical solution of the present invention more clearly, the drawings used in the description are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of voice call system triggering with fixed trigger thresholds;
FIG. 2 is a schematic diagram of normal triggering of a voice call system without voice input before and after a keyword;
FIG. 3 is a schematic diagram of a voice call system false triggering with voice input before and after a keyword;
FIG. 4 is a schematic diagram of a voice call system architecture;
FIG. 5 is a schematic illustration of the operation of the voice recognition device of a first embodiment of the voice call system of the present invention;
FIG. 6 is a schematic diagram of normal triggering when there is no voice input in the period before and after a call instruction keyword;
FIG. 7 is a schematic diagram of no triggering when there is voice input in the period before and after a call instruction keyword;
FIG. 8 is a dynamic schematic diagram of the change of the trigger threshold of the voice call system of the present invention;
FIG. 9 is a schematic illustration of the operation of the voice recognition device of the second embodiment of the voice call system of the present invention;
FIG. 10 is a schematic illustration of the operation of the voice recognition device of a third embodiment of the voice call system of the present invention;
FIG. 11 is a schematic illustration of the operation of the voice recognition device of a fourth embodiment of the voice call system of the present invention.
Detailed Description
The technical solutions in the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example one
As shown in fig. 4, the voice call system includes a voice acquisition device (MIC), a voice recognition device, and a microprocessor (MCU);
the voice acquisition device is used for acquiring audio signals and sending the audio signals to the voice recognition device;
as shown in fig. 5, when a call instruction keyword is recognized in the audio signal and the maximum amplitude of the call instruction keyword in the audio signal is greater than the trigger threshold, the voice recognition device judges that a valid call instruction keyword exists in the audio signal;
if a valid call instruction keyword exists in the audio signal and the audio signal amplitudes within a first set time before and after the call instruction keyword are all smaller than the speech threshold, the voice recognition device sends the corresponding call instruction to the microprocessor; otherwise, it does not send a call instruction to the microprocessor;
the speech threshold is less than or equal to the trigger threshold.
In the voice call system of the first embodiment, when there is no voice input before and after the call instruction keyword (the amplitude of the audio signal is smaller than the speech threshold), as shown in fig. 6, the keyword is accepted and the corresponding call instruction is sent to the microprocessor. When there is voice input before or after the call instruction keyword (the amplitude of the audio signal reaches or exceeds the speech threshold), as shown in fig. 7, the keyword is rejected and no call instruction is sent to the microprocessor. In other words, the first embodiment adds the absence of voice input before and after the keyword to the trigger condition: after a call instruction keyword is captured, the system first checks whether there is voice input within a period of time before and after the keyword, and triggers the call function only after confirming that there is none. This stricter trigger condition effectively avoids false triggering caused by a keyword being captured in the middle of a sentence.
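The check described in this embodiment can be sketched as follows. This is a minimal illustration in Python, not the patented implementation: the function name `should_send_call`, the representation of the audio as a sequence of amplitude samples, the keyword boundaries `kw_start`/`kw_end`, and the guard length `guard_len` (the first set time expressed in samples) are all assumptions made for the example.

```python
from typing import Sequence


def should_send_call(
    samples: Sequence[float],
    kw_start: int,
    kw_end: int,
    guard_len: int,
    trigger_threshold: float,
    speech_threshold: float,
) -> bool:
    """Return True only if the keyword is loud enough and surrounded by quiet.

    samples           -- amplitude samples of the captured audio (assumed format)
    kw_start, kw_end  -- sample range [kw_start, kw_end) of the recognized keyword
    guard_len         -- first set time, expressed as a number of samples
    """
    if speech_threshold > trigger_threshold:
        raise ValueError("speech threshold must not exceed trigger threshold")

    # Condition 1: the keyword's peak amplitude must exceed the trigger threshold.
    keyword = samples[kw_start:kw_end]
    if len(keyword) == 0 or max(abs(s) for s in keyword) <= trigger_threshold:
        return False

    # Condition 2: every sample in the guard windows before and after the
    # keyword must stay below the speech threshold (no surrounding speech).
    before = samples[max(0, kw_start - guard_len):kw_start]
    after = samples[kw_end:kw_end + guard_len]
    return all(abs(s) < speech_threshold for s in list(before) + list(after))
```

With a trigger threshold of 0.5 and a speech threshold of 0.1, a loud keyword surrounded by near-silence is accepted, while the same keyword preceded by chatter at amplitude 0.4 is rejected.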
Example two
Based on the voice call system of the first embodiment, the voice recognition device receives the audio signal acquired by the voice acquisition device, takes the mean amplitude of the audio signal received during the preceding second set time as the ambient volume mean, and adds a fixed offset volume to the ambient volume mean to obtain the current trigger threshold, as shown in fig. 8; the second set time is longer than the first set time;
when the voice recognition device recognizes that a call instruction keyword exists in the audio signal and the amplitude of the audio signal of the call instruction keyword is greater than the current trigger threshold, it judges that a valid call instruction keyword exists in the audio signal, as shown in fig. 9.
Preferably, the second set time is between 5 seconds and 30 minutes, and the second set time is determined according to an installation place of the elevator.
Preferably, the speech threshold is less than or equal to 1/4 of the trigger threshold.
The voice call system of the second embodiment makes the trigger threshold for call instruction keywords dynamic, adjusting it according to the ambient volume mean. When the voice collected by the voice acquisition device does not reach the trigger threshold, it is regarded as invalid voice and collection continues; when it reaches the trigger threshold, the continuous audio signal stream containing the voice is extracted and passed to the next processing step. This reduces false triggering while remaining adaptable to different application scenarios.
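The dynamic threshold of this embodiment can be sketched as a rolling mean plus a fixed offset. The Python below is an illustrative sketch only: the class name `DynamicTrigger`, the use of a fixed-length sample window to stand in for the second set time, and the fallback to the bare offset when no history exists are assumptions of the example, not details given in the source. The source also does not say whether keyword samples themselves are included in the ambient mean; here every sample passed to `update` is included.

```python
from collections import deque


class DynamicTrigger:
    """Trigger threshold = mean ambient amplitude over a window + fixed offset."""

    def __init__(self, window_len: int, offset: float) -> None:
        # window_len stands in for the second set time, counted in samples.
        self._history = deque(maxlen=window_len)
        self._offset = offset

    def update(self, amplitude: float) -> None:
        """Record one ambient amplitude sample; the oldest sample drops out."""
        self._history.append(abs(amplitude))

    def threshold(self) -> float:
        """Current trigger threshold (assumption: offset alone if no history)."""
        if not self._history:
            return self._offset
        return sum(self._history) / len(self._history) + self._offset

    def is_valid_keyword(self, keyword_peak: float) -> bool:
        """A recognized keyword is valid only if it exceeds the current threshold."""
        return keyword_peak > self.threshold()
```

In a quiet lobby (ambient mean 0.1, offset 0.3) a keyword peaking at 0.5 clears the threshold of about 0.4; once the ambient mean rises to 0.5, the threshold rises to about 0.8 and the same keyword no longer triggers.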
Example three
As shown in fig. 4, the voice calling system includes a voice collecting device, a voice recognition device, and a microprocessor;
the voice acquisition device is used for acquiring audio signals and sending the audio signals to the voice recognition device;
as shown in fig. 10, the voice recognition device performs recognition calculation on the audio signal to obtain n keyword feature scores, where n is the number of keywords that the voice recognition device can recognize and n is an integer greater than 1; each keyword feature score represents the confidence of the corresponding keyword, and the higher the score, the higher the probability that the audio signal is that keyword;
when the highest of the n keyword feature scores is greater than a first set feature threshold and the difference between the two highest keyword feature scores is greater than or equal to a second set feature threshold, the keyword with the highest feature score is taken as the valid call instruction keyword obtained by recognition calculation, the second set feature threshold being smaller than the first set feature threshold;
when a valid call instruction keyword is recognized from the audio signal, the voice recognition device sends the corresponding call instruction to the microprocessor; otherwise, no call instruction is sent to the microprocessor.
Preferably, the voice recognition device is provided with a calling instruction keyword sample set;
the call instruction keyword sample set comprises n samples of call instruction keywords;
and the voice recognition device recognizes and calculates n call instruction keyword feature scores according to the matching degree of the audio signal and n call instruction keyword samples in the call instruction keyword sample set.
The voice call system of the third embodiment addresses the case in which some keywords have similar pronunciations and are frequently misrecognized in field use (for example, "going upstairs" is recognized as "going downstairs", or the reverse) by adopting a scoring strategy. As shown in fig. 10, the voice recognition device performs recognition calculation on the audio signal to obtain n keyword feature scores. If the highest of the n scores is greater than the first set feature threshold, the audio signal is likely to be the corresponding keyword and processing continues; otherwise, no further processing is performed. The two highest keyword feature scores are then selected and their difference is calculated. If the difference does not reach the second set feature threshold, the audio signal is regarded as invalid and processing ends; if the difference reaches the second set feature threshold, the keyword with the highest feature score is taken as the valid call instruction keyword and passed to the next processing step. This scoring strategy avoids misrecognition when the top feature scores differ only slightly, effectively reduces misrecognition caused by similar pronunciation, and improves the accuracy of voice call instruction recognition.
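The two-stage scoring strategy can be sketched as below. This is a hedged Python illustration: the function name `select_keyword`, the dictionary of scores keyed by keyword label, and the example score values are assumptions; how the feature scores themselves are computed is outside the scope of the sketch.

```python
from typing import Mapping, Optional


def select_keyword(
    scores: Mapping[str, float],
    first_threshold: float,
    second_threshold: float,
) -> Optional[str]:
    """Return the valid call instruction keyword, or None if the audio is rejected."""
    if len(scores) < 2:
        raise ValueError("at least two keywords are required (n > 1)")
    if second_threshold >= first_threshold:
        raise ValueError("second threshold must be smaller than first threshold")

    # Rank keywords by feature score, highest first.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    (best_kw, best_score), (_, runner_up_score) = ranked[0], ranked[1]

    # Stage 1: the best score must exceed the first set feature threshold.
    if best_score <= first_threshold:
        return None

    # Stage 2: the margin over the runner-up must reach the second threshold,
    # otherwise the two keywords are too close to tell apart.
    if best_score - runner_up_score < second_threshold:
        return None

    return best_kw
```

For instance, with thresholds 0.6 and 0.2, scores of 0.9 for "up" and 0.3 for "down" select "up", whereas scores of 0.9 and 0.8 are rejected because the margin is too small.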
Example four
As shown in fig. 4, the voice calling system includes a voice collecting device, a voice recognition device, and a microprocessor;
the voice acquisition device is used for acquiring audio signals and sending the audio signals to the voice recognition device;
as shown in fig. 11, after the audio signal is identified and calculated, the speech recognition device obtains k keyword feature scores, where k is the number of keywords that can be identified by the speech recognition device, and k is an integer greater than 1; the keyword feature score value represents the confidence rate of the corresponding keyword, and the higher the score value is, the higher the probability that the audio signal is identified as the keyword is;
when the keyword with the highest keyword feature score value is not a junk keyword, taking the keyword with the highest keyword feature score value as an effective call instruction keyword obtained by identification and calculation, and sending a corresponding call instruction to the microprocessor;
and when the keyword with the highest keyword feature score value is a junk keyword, not sending a call instruction to the microprocessor.
Preferably, the voice recognition device is provided with a calling instruction keyword sample set;
the call instruction keyword sample set comprises i samples of standard call instruction keywords and j samples of junk keywords, wherein i and j are positive integers, and k is i + j;
and the voice recognition device recognizes and calculates k keyword feature scores according to the matching degree of the audio signal and k call instruction keyword samples in the call instruction keyword sample set.
Preferably, when the keyword with the highest keyword feature score is not a junk keyword, that keyword is taken as the valid call instruction keyword and the corresponding call instruction is sent to the microprocessor only if its feature score is greater than a first set feature threshold; otherwise, no call instruction is sent to the microprocessor.
Standard call instruction keywords are the standard words commonly used by passengers to place calls. Junk keywords are words that resemble the standard call instruction keywords but do not meet the standard requirements.
In the voice call system of the fourth embodiment, as shown in fig. 11, when the keyword with the highest keyword feature score is a junk keyword, processing ends and no call instruction is generated; when the keyword with the highest keyword feature score is not a junk keyword, the corresponding call instruction is generated from that keyword.
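The junk-keyword filter can be sketched as follows. This Python is illustrative only: the function name `select_with_junk_filter`, the keyword labels, and the score values are hypothetical, and the optional `first_threshold` argument models the preferred variant in which the winning score must also exceed a first set feature threshold.

```python
from typing import Mapping, Optional, Set


def select_with_junk_filter(
    scores: Mapping[str, float],
    junk_keywords: Set[str],
    first_threshold: Optional[float] = None,
) -> Optional[str]:
    """Return the winning standard keyword, or None if no call should be sent.

    scores        -- feature score for each of the k = i + j keywords
    junk_keywords -- the j labels that belong to the junk keyword samples
    """
    if len(scores) < 2:
        raise ValueError("at least two keywords are required (k > 1)")

    # Pick the keyword with the highest feature score.
    best_kw = max(scores, key=lambda kw: scores[kw])

    # A junk keyword winning means the utterance is rejected outright.
    if best_kw in junk_keywords:
        return None

    # Preferred variant: the winning score must also exceed the threshold.
    if first_threshold is not None and scores[best_kw] <= first_threshold:
        return None

    return best_kw
```

The junk samples act as competitors that absorb near-miss utterances: a nonstandard phrase scores highest against its own junk sample, so it never reaches the standard keyword that it resembles.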
Example five
Based on the voice call system of the first, second, third and fourth embodiments, as shown in fig. 4, after receiving the call instruction, the microprocessor sends the information related to the call instruction to the elevator controller;
and the elevator controller controls the elevator to run according to the call instruction related information and the elevator running state.
Example six
Based on the voice calling systems of the first embodiment, the second embodiment, the third embodiment, the fourth embodiment and the fifth embodiment, the voice calling system is a landing voice calling system;
the voice acquisition device is used for acquiring a landing audio signal and sending the landing audio signal to the voice recognition device;
the call instruction keywords include "up" and "down".
In a landing voice call system, a standard call instruction keyword is generally "going upstairs/going downstairs" or "I am going upstairs/I am going downstairs". In actual use, however, a passenger may say a nonstandard word such as "up/down"; such words are classified as junk keywords.
Example seven
Based on the voice calling systems of the first embodiment, the second embodiment, the third embodiment, the fourth embodiment and the fifth embodiment, the voice calling system is an in-car voice calling system;
the voice acquisition device is used for acquiring audio signals in the car and sending the audio signals to the voice recognition device;
the call instruction keywords comprise at least one of "floor *", "open door" and "close door", where * is a floor number.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims (18)

1. A voice calling landing system is characterized by comprising a voice acquisition device, a voice recognition device and a microprocessor;
the voice acquisition device is used for acquiring audio signals and sending the audio signals to the voice recognition device;
the voice recognition device judges that a valid call instruction keyword exists in the audio signal when a call instruction keyword is recognized in the audio signal and the maximum amplitude of the call instruction keyword in the audio signal is greater than a trigger threshold;
if a valid call instruction keyword exists in the audio signal and the audio signal amplitudes within a first set time before and after the call instruction keyword are all smaller than a speech threshold, the voice recognition device sends the corresponding call instruction to the microprocessor; otherwise, it does not send a call instruction to the microprocessor;
the speech threshold is less than or equal to the trigger threshold.
2. The voice call system according to claim 1, wherein
the voice recognition device receives the audio signal collected by the voice acquisition device, takes the mean amplitude of the audio signal received during the second set time as the ambient volume mean, and adds a fixed offset volume to the ambient volume mean to obtain the current trigger threshold, the second set time being longer than the first set time;
and when the voice recognition device recognizes that a call instruction keyword exists in the audio signal and the amplitude of the audio signal of the call instruction keyword is greater than the current trigger threshold, it judges that a valid call instruction keyword exists in the audio signal.
3. The voice call system according to claim 2, wherein
the second set time is between 5 seconds and 30 minutes.
4. The voice call system according to claim 1, wherein
the speech threshold is less than or equal to 1/4 of the trigger threshold.
5. The voice call system according to claim 1, wherein
after receiving the calling instruction, the microprocessor sends the information related to the calling instruction to an elevator controller;
and the elevator controller controls the elevator to run according to the call instruction related information and the elevator running state.
6. The voice call system according to claim 1, wherein
the voice calling system is a landing voice calling system;
the voice acquisition device is used for acquiring a landing audio signal and sending the landing audio signal to the voice recognition device;
the call instruction keywords include "up" and "down".
7. The voice call system according to claim 1, wherein
the voice call system is an in-car voice call system;
the voice acquisition device is used for acquiring audio signals in the car and sending the audio signals to the voice recognition device;
the call instruction keywords comprise at least one of "floor *", "open door" and "close door", where * is a floor number.
8. A voice calling landing system is characterized by comprising a voice acquisition device, a voice recognition device and a microprocessor;
the voice acquisition device is used for acquiring audio signals and sending the audio signals to the voice recognition device;
the voice recognition device is used for recognizing the audio signal and calculating n keyword feature scores, wherein n is the number of keywords which can be recognized by the voice recognition device, and n is an integer greater than 1; each keyword feature score represents the confidence of the corresponding keyword, and the higher the score, the higher the probability that the audio signal is identified as that keyword;
when the highest of the n keyword feature scores is greater than a first set feature threshold and the difference between the feature scores of the two highest-scoring keywords is greater than or equal to a second set feature threshold, the keyword with the highest keyword feature score is taken as the effective call instruction keyword obtained through identification and calculation, wherein the second set feature threshold is smaller than the first set feature threshold;
when an effective call instruction keyword is identified from the audio signal, the voice recognition device sends the corresponding call instruction to the microprocessor; otherwise no call instruction is sent to the microprocessor.
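The two-threshold decision in claim 8 can be sketched as below. This is a minimal illustration, not the patented implementation: the function name `select_call_keyword`, the dictionary representation of the scores, and the numeric score scale are assumptions made for the example.

```python
from typing import Dict, Optional


def select_call_keyword(scores: Dict[str, float],
                        first_threshold: float,
                        second_threshold: float) -> Optional[str]:
    """Return the effective call keyword, or None if no instruction is sent.

    scores maps each of the n recognizable keywords to its feature score.
    """
    if len(scores) < 2:
        raise ValueError("n must be an integer greater than 1")
    if not second_threshold < first_threshold:
        raise ValueError("second threshold must be smaller than the first")

    # Rank keywords from highest to lowest feature score.
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    best_keyword, best_score = ranked[0]
    runner_up_score = ranked[1][1]

    # Condition 1: the best score must exceed the first threshold.
    # Condition 2: the margin over the runner-up must reach the second threshold.
    if best_score > first_threshold and best_score - runner_up_score >= second_threshold:
        return best_keyword
    return None
```

The margin test rejects ambiguous audio in which two keywords score almost equally, even when the best score alone would pass the first threshold.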
9. Voice call system according to claim 8,
the voice recognition device is provided with a calling instruction keyword sample set;
the call instruction keyword sample set comprises n samples of call instruction keywords;
and the voice recognition device recognizes and calculates n call instruction keyword feature scores according to the matching degree of the audio signal and n call instruction keyword samples in the call instruction keyword sample set.
10. Voice call system according to claim 8,
after receiving the calling instruction, the microprocessor sends the information related to the calling instruction to an elevator controller;
and the elevator controller controls the elevator to run according to the call instruction related information and the elevator running state.
11. Voice call system according to claim 8,
the voice calling system is a landing voice calling system;
the voice acquisition device is used for acquiring a landing audio signal and sending the landing audio signal to the voice recognition device;
the call instruction keywords include "up" and "down".
12. Voice call system according to claim 8,
the voice calling system is a voice calling system in the car;
the voice acquisition device is used for acquiring audio signals in the car and sending the audio signals to the voice recognition device;
the call instruction keywords comprise at least one of "floor *", "open door" and "close door", where * is a floor number.
13. A voice call system, characterized by comprising a voice acquisition device, a voice recognition device and a microprocessor;
the voice acquisition device is used for acquiring audio signals and sending the audio signals to the voice recognition device;
the voice recognition device is used for recognizing the audio signal and calculating k keyword feature scores, wherein k is the number of keywords which can be recognized by the voice recognition device, and k is an integer greater than 1; each keyword feature score represents the confidence of the corresponding keyword, and the higher the score, the higher the probability that the audio signal is identified as that keyword;
when the keyword with the highest keyword feature score value is not a junk keyword, taking the keyword with the highest keyword feature score value as an effective call instruction keyword obtained by identification and calculation, and sending a corresponding call instruction to the microprocessor;
and when the keyword with the highest keyword feature score value is a junk keyword, not sending a call instruction to the microprocessor.
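The junk-keyword rejection in claim 13 can be sketched as follows. This is an illustration under assumptions, not the patented implementation: the function name `select_non_junk_keyword`, the example keyword labels, and the use of a set to mark junk keywords are all hypothetical.

```python
from typing import Dict, Optional, Set


def select_non_junk_keyword(scores: Dict[str, float],
                            junk_keywords: Set[str]) -> Optional[str]:
    """Return the top-scoring keyword unless it is a junk keyword.

    scores maps each of the k recognizable keywords (standard and junk)
    to its feature score. On a tie, the first keyword in dict order wins.
    """
    best_keyword = max(scores, key=scores.get)
    if best_keyword in junk_keywords:
        return None  # no call instruction is sent to the microprocessor
    return best_keyword
```

Because the junk keywords compete with the standard keywords for the highest score, background speech that resembles a junk sample more closely than any call keyword produces no call instruction.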
14. Voice call system according to claim 13,
the voice recognition device is provided with a calling instruction keyword sample set;
the call instruction keyword sample set comprises i samples of standard call instruction keywords and j samples of junk keywords, wherein i and j are positive integers, and k = i + j;
and the voice recognition device recognizes and calculates k keyword feature scores according to the matching degree of the audio signal and k call instruction keyword samples in the call instruction keyword sample set.
15. Voice call system according to claim 13,
when the keyword with the highest keyword feature score is not a junk keyword, if that feature score is greater than a first set feature threshold, the keyword is taken as the effective call instruction keyword obtained by identification and calculation, and a corresponding call instruction is sent to the microprocessor; otherwise no call instruction is sent to the microprocessor.
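Claim 15 adds a score threshold on top of the junk-keyword check. A minimal sketch, assuming hypothetical names (`select_keyword_with_threshold`, `junk_keywords`, `first_threshold`) and an arbitrary score scale:

```python
from typing import Dict, Optional, Set


def select_keyword_with_threshold(scores: Dict[str, float],
                                  junk_keywords: Set[str],
                                  first_threshold: float) -> Optional[str]:
    """Return the top-scoring keyword only if it is not a junk keyword
    and its score exceeds the first set feature threshold."""
    best_keyword = max(scores, key=scores.get)
    if best_keyword in junk_keywords:
        return None  # top match is junk: send nothing
    if scores[best_keyword] > first_threshold:
        return best_keyword
    return None  # top match is a call keyword but not confident enough
```

The threshold catches the case where a call keyword narrowly outscores every junk keyword but is still a weak match overall.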
16. Voice call system according to claim 13,
after receiving the calling instruction, the microprocessor sends the information related to the calling instruction to an elevator controller;
and the elevator controller controls the elevator to run according to the call instruction related information and the elevator running state.
17. Voice call system according to claim 13,
the voice calling system is a landing voice calling system;
the voice acquisition device is used for acquiring a landing audio signal and sending the landing audio signal to the voice recognition device;
the call instruction keywords include "up" and "down".
18. Voice call system according to claim 13,
the voice calling system is a voice calling system in the car;
the voice acquisition device is used for acquiring audio signals in the car and sending the audio signals to the voice recognition device;
the call instruction keywords comprise at least one of "floor *", "open door" and "close door", where * is a floor number.
CN202110107551.8A 2021-01-27 2021-01-27 Voice call system Active CN112765335B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110107551.8A CN112765335B (en) 2021-01-27 2021-01-27 Voice call system

Publications (2)

Publication Number Publication Date
CN112765335A (en) 2021-05-07
CN112765335B (en) 2024-03-08

Family

ID=75705942

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110107551.8A Active CN112765335B (en) 2021-01-27 2021-01-27 Voice call system

Country Status (1)

Country Link
CN (1) CN112765335B (en)

Citations (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105473483A (en) * 2013-08-21 2016-04-06 三菱电机株式会社 Elevator control device
CN106653024A (en) * 2016-12-30 2017-05-10 首都师范大学 Speech control method and device, balance car control method and device and balance car
CN106686191A (en) * 2015-11-06 2017-05-17 北京奇虎科技有限公司 Processing method for adaptively identifying harassing call and processing system thereof
CN106920558A (en) * 2015-12-25 2017-07-04 展讯通信(上海)有限公司 Keyword recognition method and device
CN108182937A (en) * 2018-01-17 2018-06-19 出门问问信息科技有限公司 Keyword recognition method, device, equipment and storage medium
CN109147780A (en) * 2018-08-15 2019-01-04 重庆柚瓣家科技有限公司 Audio recognition method and system under free chat scenario
CN110155830A (en) * 2019-06-17 2019-08-23 上海三菱电梯有限公司 Elevator control system
CN110211592A (en) * 2019-05-17 2019-09-06 北京华控创为南京信息技术有限公司 Intelligent sound data processing equipment and method
CN110232917A (en) * 2019-05-21 2019-09-13 平安科技(深圳)有限公司 Voice login method, device, equipment and storage medium based on artificial intelligence
CN110232916A (en) * 2019-05-10 2019-09-13 平安科技(深圳)有限公司 Method of speech processing, device, computer equipment and storage medium
CN110300001A (en) * 2019-05-21 2019-10-01 深圳壹账通智能科技有限公司 Conference audio control method, system, equipment and computer readable storage medium
CN111128183A (en) * 2019-12-19 2020-05-08 北京搜狗科技发展有限公司 Speech recognition method, apparatus and medium
CN111477219A (en) * 2020-05-08 2020-07-31 合肥讯飞数码科技有限公司 Keyword distinguishing method and device, electronic equipment and readable storage medium
CN111883121A (en) * 2020-07-20 2020-11-03 北京声智科技有限公司 Awakening method and device and electronic equipment
CN112201246A (en) * 2020-11-19 2021-01-08 深圳市欧瑞博科技股份有限公司 Intelligent control method and device based on voice, electronic equipment and storage medium

Also Published As

Publication number Publication date
CN112765335B (en) 2024-03-08

Similar Documents

Publication Publication Date Title
US8170875B2 (en) Speech end-pointer
EP2898510B1 (en) Method, system and computer program for adaptive control of gain applied to an audio signal
EP0655732A2 (en) Soft decision speech recognition
JP5156043B2 (en) Voice discrimination device
CN109256134B (en) Voice awakening method, storage medium and terminal
US10115399B2 (en) Audio classifier that includes analog signal voice activity detection and digital signal voice activity detection
US9031841B2 (en) Speech recognition apparatus, speech recognition method, and speech recognition program
JP3255584B2 (en) Sound detection device and method
CN102194452A (en) Voice activity detection method in complex background noise
EP2504745B1 (en) Communication interface apparatus and method for multi-user
US20180158462A1 (en) Speaker identification
GB2408133A (en) Adaptable speech recognition system
CN107274895B (en) Voice recognition device and method
Selfridge et al. Continuously predicting and processing barge-in during a live spoken dialogue task
CN112765335A (en) Voice calling landing system
CN113330513A (en) Voice information processing method and device
CN115567336B (en) Wake-free voice control system and method based on smart home
WO2019169272A1 (en) Enhanced barge-in detector
US20220114447A1 (en) Adaptive tuning parameters for a classification neural network
EP2551227B1 (en) Elevator device
CN112435691B (en) Online voice endpoint detection post-processing method, device, equipment and storage medium
US20100106495A1 (en) Voice recognition system, method, and program
KR20080061901A (en) System and method of effcient speech recognition by input/output device of robot
JP3484559B2 (en) Voice recognition device and voice recognition method
JP3114757B2 (en) Voice recognition device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant