CN115206323A - Voice recognition method of fan voice control system


Info

Publication number: CN115206323A (granted as CN115206323B)
Application number: CN202211125810.0A
Authority: CN (China)
Prior art keywords: control, voice, word, fan, control voice
Legal status: Granted; Active
Other languages: Chinese (zh)
Inventor: 杨伟鸿
Assignee (current and original): Jiangmen Hongyuda Machinery & Electrical Manufacturing Co ltd

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/20 Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 2015/223 Execution procedure of a spoken command
    • G10L 21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L 21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L 21/0208 Noise filtering
    • F MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
    • F04 POSITIVE-DISPLACEMENT MACHINES FOR LIQUIDS; PUMPS FOR LIQUIDS OR ELASTIC FLUIDS
    • F04D NON-POSITIVE-DISPLACEMENT PUMPS
    • F04D 27/00 Control, e.g. regulation, of pumps, pumping installations or pumping systems specially adapted for elastic fluids

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Quality & Reliability (AREA)
  • Signal Processing (AREA)
  • Mechanical Engineering (AREA)
  • General Engineering & Computer Science (AREA)
  • Control Of Positive-Displacement Air Blowers (AREA)

Abstract

The invention provides a voice recognition method of a fan voice control system, which comprises the following steps: S1: acquiring an audio signal within a preset range around the fan; S2: removing the background sound signal in the audio signal to obtain a control voice signal; S3: identifying a fan control instruction in the control voice signal based on the voice features of the control voice signal, and controlling the fan to execute the corresponding operation based on the fan control instruction. The method performs voice semantic recognition after background sound elimination on the audio signal acquired around the fan, thereby obtaining an accurate fan control instruction and realizing voice-based instruction control of the fan.

Description

Voice recognition method of fan voice control system
Technical Field
The invention relates to the technical field of voice recognition, in particular to a voice recognition method of a fan voice control system.
Background
Speech recognition technology, also known as automatic speech recognition (ASR), aims at converting the lexical content of human speech into computer-readable input such as keystrokes, binary codes, or character sequences. Applications of speech recognition technology include voice dialing, voice navigation, indoor device control, voice document retrieval, and simple dictation data entry. Combined with other natural language processing techniques such as machine translation and speech synthesis, speech recognition can support more complex applications such as fan voice control systems.
Speech recognition technology must be able to exclude the influence of various environmental factors. At present, environmental noise and surrounding speech have the most significant influence on recognition accuracy: in a public place it is very difficult for a computer to understand a particular speaker, which greatly limits the application range of voice technology. Intelligently discarding environmental voices while extracting the voice of the user issuing control instructions is a difficult task.
Therefore, the invention provides a voice recognition method of the fan voice control system.
Disclosure of Invention
The invention provides a voice recognition method of a fan voice control system, which performs voice semantic recognition after background sound elimination on the audio signal acquired around the fan, so as to obtain an accurate fan control instruction, intelligently discard environmental voices while extracting the voice of the user issuing the control instruction, and realize voice-based instruction control of the fan.
The invention provides a voice recognition method of a fan voice control system, which comprises the following steps:
s1: acquiring audio signals within a preset range around the fan;
s2: removing a background sound signal in the audio signal to obtain a control voice signal;
s3: and identifying a fan control instruction in the control voice signal based on the voice characteristics of the control voice signal, and controlling the fan to execute corresponding operation based on the fan control instruction.
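The three steps can be pictured as a small pipeline. The following sketch is illustrative only: every function name is an assumption, and S2/S3 are stubbed where the later embodiments (Examples 3 to 8) define the actual processing.

```python
# Illustrative sketch of steps S1-S3. Every name here is an assumption;
# S2 and S3 are stubbed where Examples 3-8 define the real processing.
import numpy as np


def acquire_audio(duration_s: float = 3.0, rate: int = 16000) -> np.ndarray:
    """S1: acquire the audio signal within the preset range around the fan.
    Stubbed with synthetic noise; a real system would read a microphone."""
    rng = np.random.default_rng(0)
    return rng.normal(scale=0.01, size=int(duration_s * rate)).astype(np.float32)


def remove_background(audio: np.ndarray) -> np.ndarray:
    """S2: remove the background sound signal (spectrum-curve analysis,
    Example 3). Pass-through placeholder."""
    return audio


def recognize_command(control_speech: np.ndarray) -> str:
    """S3a: identify the fan control instruction in the control voice signal.
    Placeholder; Examples 4-8 describe the actual recognition."""
    return "fan_on"


def execute_fan_command(command: str) -> None:
    """S3b: control the fan to execute the corresponding operation."""
    print(f"executing fan command: {command}")


if __name__ == "__main__":
    execute_fan_command(recognize_command(remove_background(acquire_audio())))
```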
Preferably, in the voice recognition method of the voice control system for the fan, S2: removing the background sound signal in the audio signal to obtain a control voice signal, comprising:
s201: determining a background sound signal in the audio signal based on an original audio frequency spectrum curve of the audio signal;
s202: and removing the background sound signal in the audio signal to obtain the control voice signal.
Preferably, in the voice recognition method of the voice control system for the fan, S201: determining a background sound signal in the audio signal based on an original audio frequency spectrum curve of the audio signal, including:
acquiring an original audio frequency spectrum curve in the preset period of the audio signal, judging whether abrupt change points exist in the original audio frequency spectrum curve, if so, determining first interval time between adjacent abrupt change points in the original audio frequency spectrum curve, and fitting a corresponding first interval time change curve based on the first interval time;
judging whether the abrupt change points are regularly distributed or not based on the first interval time change curve, if so, taking the original audio frequency spectrum curve as a first audio frequency spectrum curve, otherwise, taking the abrupt change points as a starting point to divide an analysis section curve corresponding to the abrupt change points in the original audio frequency spectrum curve;
determining the sudden change amplitude of the sudden change point, determining a reasonable fluctuation amplitude based on the sudden change amplitude, judging whether a fluctuation point with a difference value not exceeding the corresponding reasonable fluctuation amplitude exists in the analysis section curve, connecting continuous fluctuation points to obtain a reasonable fluctuation curve, and determining a second interval time between the reasonable fluctuation curve and the corresponding sudden change point;
determining a reasonable interval time threshold value based on the time span of the reasonable fluctuation curve, when the second interval time is not more than the reasonable interval time threshold value, taking the reasonable fluctuation curve as an analysis curve segment corresponding to the sudden change point, otherwise, judging that the analysis curve segment does not exist in the corresponding sudden change point;
marking all analysis curve segments in the original audio frequency spectrum curve to obtain a first marking result, determining a third interval time between adjacent analysis curve segments based on the first marking result, and fitting a second interval time change curve based on the third interval time;
judging whether the analysis curve segments are regularly distributed or not based on the second interval time variation curve, if so, taking the original audio frequency spectrum curve as a first audio frequency spectrum curve, otherwise, fitting a time span variation curve based on the time span of the analysis curve segments, judging whether the analysis curve segments have a rule or not based on the time span variation curve, if so, taking the original audio frequency spectrum curve as a first audio frequency spectrum curve, otherwise, deleting all the analysis curve segments in the original audio frequency spectrum curve to obtain a first audio frequency spectrum curve;
when the original audio frequency spectrum curve does not have the abrupt change point, the original audio frequency spectrum curve is taken as a first audio frequency spectrum curve;
and converting the first audio frequency spectrum curve into a sound signal to obtain a background sound signal in the audio signal.
Preferably, in the voice recognition method of the voice control system for a fan, S3: based on the voice characteristics of the control voice signal, a fan control instruction is identified in the control voice signal, and the fan is controlled to execute corresponding operations based on the fan control instruction, including:
s301: extracting the voice characteristics of the control voice signal;
s302: judging whether only one source user exists in the control voice signal or not based on the voice characteristics, if so, performing semantic recognition on the control voice signal to obtain a fan control instruction, otherwise, performing primary and secondary recognition on the control voice signal based on the voice characteristics to obtain a primary and secondary recognition result, and obtaining the fan control instruction based on the primary and secondary recognition result;
s303: and controlling the fan to execute corresponding operation based on the fan control instruction.
Preferably, in the voice recognition method of the voice control system for the fan, S301: extracting the voice features of the control voice signal, including:
extracting fundamental frequency characteristics of the control voice signal, and calculating a short-time average zero-crossing rate of the control voice signal;
and taking the fundamental frequency characteristic and the short-time average zero crossing rate as the voice characteristic of the control voice signal.
Preferably, in the voice recognition method of the fan voice control system, performing primary and secondary recognition on the control voice signal to obtain a primary and secondary recognition result, and obtaining a fan control instruction based on the primary and secondary recognition result, comprises:
when more than one source user exists in the control voice signal, dividing the control voice signal based on the voice characteristics to obtain a sub-control voice signal set, and marking each sub-control voice signal in the sub-control voice signal set on the control voice signal to obtain a second marking result;
calculating a first weight of each sub-control voice signal based on the position of each sub-control voice signal in the second marking result in the control voice signal;
calculating the average decibel of each sub-control voice signal, and calculating the final weight of the corresponding sub-control voice signal based on the first weight and the corresponding average decibel;
taking the sub-control voice signal corresponding to the maximum final weight as a main control voice signal, taking the remaining sub-control voice signals except the main control voice signal in the sub-control voice signal set as secondary control voice signals, and taking the main control voice signal and the secondary control voice signals as corresponding primary and secondary recognition results;
and carrying out semantic recognition on the main control voice signals in the primary and secondary recognition results to obtain a fan control instruction.
Preferably, in the voice recognition method of the fan voice control system, performing semantic recognition on the control voice signal to obtain a fan control instruction comprises:
performing semantic recognition on the control voice signal to obtain a semantic recognition result, and aligning the semantic recognition result with the control voice signal to obtain control voice distribution information;
and integrating the control voice distribution information to obtain a fan control instruction.
Preferably, in the voice recognition method of the fan voice control system, integrating the control voice distribution information to obtain a fan control instruction comprises:
determining fourth interval time between adjacent control voice information in the control voice distribution information, and calculating average interval time based on the fourth interval time;
setting a dividing boundary between adjacent control voice information of which the interval time is greater than the average interval time in the control voice distribution information, and dividing the control voice distribution information based on all the dividing boundaries to obtain a first dividing result;
sequentially matching original control voice information in the control voice distribution information with word segments in a preset instruction library to obtain a matching result;
judging whether the original control voice information contains unmatched residual word segments or not based on the matching result, if so, calculating the association degree between the residual word segments and the adjacent matched word segments based on the first occurrence probability of the residual word segments in a preset instruction information base, the second occurrence probability of each adjacent matched word segment in the preset instruction information base and the first simultaneous occurrence probability of the residual word segments and the adjacent matched word segments in the preset instruction information base:
$$D = \log_2\!\left(\frac{P_{12}}{P_1 \cdot P_2}\right)$$
where $D$ is the degree of association between the remaining word segment and the adjacent matched word segment, $\log_2$ is the logarithm with base 2, $P_1$ is the first occurrence probability, $P_2$ is the second occurrence probability, and $P_{12}$ is the first simultaneous occurrence probability;
dividing the residual word segments and adjacent matched word segments corresponding to the larger association degree into the same word segments, and dividing the original control voice information by combining the matching results to obtain second division results;
otherwise, dividing the original control voice information based on the matching result to obtain a second division result;
determining control instruction related words contained in the original control voice information based on a control instruction related word list, and performing minimum unit word segment division on the remaining voice information except the control instruction related words in the original control voice information to obtain a minimum division result;
determining the part of speech of each minimum unit word segment in the minimum division result, taking any control instruction related word in the original control voice information as a starting point, simultaneously searching from two ends, and determining the minimum unit word consistent with the part of speech of the corresponding control instruction related word as a secondary related word at each end;
summarizing all the minimum unit word segments between the control instruction related words and the corresponding secondary related words and the control instruction related words to obtain at least one word segment set of each control instruction related word;
integrating the word segment sets based on the part of speech and an instruction grammar frame list of each word segment in the word segment sets to obtain integrated word segments corresponding to each word segment set, determining part of speech connection weight between adjacent word segments in the integrated word segments, third occurrence probability of each word segment in the preset instruction information base and second simultaneous occurrence probability of adjacent word segments in the integrated word segments in the preset instruction information base, and calculating semantic score values of the integrated word segments based on the part of speech connection weight, the third occurrence probability and the second simultaneous occurrence probability:
$$S = \sum_{i=1}^{n-1} w_i \cdot \log_2\!\left(\frac{P_{i,i+1}}{P_i \cdot P_{i+1}}\right)$$
where $S$ is the semantic score value of the integrated speech segment, $n$ is the total number of word segments contained in the integrated speech segment, $w_i$ is the part-of-speech connection weight between the $i$-th word segment and the $(i+1)$-th word segment, $P_i$ and $P_{i+1}$ are the third occurrence probabilities of the $i$-th and $(i+1)$-th word segments, and $P_{i,i+1}$ is the second simultaneous occurrence probability of the $i$-th and $(i+1)$-th word segments;
taking the word segment set corresponding to the maximum semantic score value as a final segmentation word segment set of the control instruction related words, taking secondary related words in the opposite direction of the retrieval direction corresponding to the final segmentation word segment set as new starting points to perform word segment segmentation, and obtaining a third segmentation result based on all the final segmentation word segment sets until all the original control voice information is completely segmented;
summarizing the first division result, the second division result and the third division result to obtain a division result set, and obtaining a control speech segment sequence of each division result in the division result set;
performing semantic scoring on the control word segment sequences to obtain word segment semantic scoring values, and sequencing all the control word segment sequences based on the word segment semantic scoring values to obtain sequence sequencing results;
determining capacity parameters of an iteration matrix determinant based on all control language segment sequences, generating an iteration characterization matrix corresponding to each control language segment sequence based on the capacity parameters and the number of fields contained in a division result corresponding to the control language segment sequences, performing cumulative-weighing operation on all the iteration characterization matrices based on a sequence sequencing result to obtain a final division characterization matrix, and dividing and performing semantic recognition on the original control voice information based on the final division characterization matrix to obtain a fan control instruction.
Preferably, in the voice recognition method of the fan voice control system, after S3 of identifying the fan control instruction in the control voice signal and controlling the fan to execute the corresponding operation based on the fan control instruction, the method further comprises:
judging whether a source user of the control voice signal is a historical user in a historical user library or not based on the voice characteristics to obtain a judgment result;
storing the fan control instructions in a historical instruction library of the source user.
Preferably, in the voice recognition method of the fan voice control system, judging, based on the voice features, whether the source user of the control voice signal is a historical user in the historical user library to obtain a judgment result comprises:
judging whether a user voice characteristic consistent with the voice characteristic exists in a historical user library, if so, taking the historical user of which the source user of the control voice signal is in the historical user library as the judgment result;
otherwise, the source user of the control voice signal is not the historical user in the historical user library as the judgment result.
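As a hedged illustration of this lookup, the sketch below compares a feature vector against a stored library; the feature layout, tolerance, and container types are assumptions, since the text only requires "consistent" voice features.

```python
# Hedged sketch of the historical-user check. The feature vector layout,
# tolerance, and container types are assumptions; the text only requires
# "consistent" voice features.
import numpy as np


def find_historical_user(features: np.ndarray,
                         user_library: dict[str, np.ndarray],
                         tol: float = 1e-2) -> str | None:
    """Return the id of the historical user whose stored voice features are
    consistent with `features`, or None if the source user is not in the
    historical user library."""
    for user_id, stored in user_library.items():
        if stored.shape == features.shape and np.allclose(features, stored, atol=tol):
            return user_id
    return None


def store_command(user_id: str, command: str,
                  history: dict[str, list[str]]) -> None:
    """Store the fan control instruction in the source user's history library."""
    history.setdefault(user_id, []).append(command)
```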
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a flow chart of a voice recognition method of a voice control system of a fan according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a voice recognition method of a voice control system of a fan according to another embodiment of the present invention;
FIG. 3 is a flowchart illustrating a voice recognition method of a voice control system of a fan according to another embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.
Example 1:
the invention provides a voice recognition method of a fan voice control system, which comprises the following steps with reference to fig. 1:
s1: acquiring audio signals within a preset range around the fan;
s2: background sound signals in the audio signals are removed, and control voice signals are obtained;
s3: and identifying a fan control instruction in the control voice signal based on the voice characteristics of the control voice signal, and controlling the fan to execute corresponding operation based on the fan control instruction.
In this embodiment, the preset range is a preset range for acquiring the control command.
In this embodiment, the audio signal is an information carrier with frequency and amplitude variation of the sound wave within a preset range around the fan.
In this embodiment, the background sound signal is the portion of the audio signal other than the voice of the user issuing the control instruction.
In this embodiment, the control voice signal is the audio signal obtained after removing the background sound signal from the audio signal.
In this embodiment, the speech features are the fundamental frequency features of the control speech signal and the short-time average zero-crossing rate.
In this embodiment, the fan control command is a voice command for controlling the fan, which is recognized in the control voice signal.
The beneficial effects of the above technology are: after background sound of an audio signal acquired around the fan is removed, voice semantic recognition is carried out to acquire an accurate fan control instruction, environmental voice is intelligently abandoned, and the sound of a user for sending the control instruction is acquired, so that instruction control of the fan based on voice is realized.
Example 2:
on the basis of embodiment 1, in the voice recognition method of the fan voice control system, S2: removing the background sound signal in the audio signal to obtain a control voice signal, with reference to fig. 2, includes:
s201: determining a background sound signal in the audio signal based on an original audio frequency spectrum curve of the audio signal;
s202: and removing the background sound signal in the audio signal to obtain the control voice signal.
In this embodiment, the original audio frequency spectrum curve is an audio frequency spectrum curve corresponding to the audio signal.
The beneficial effects of the above technology are: the background sound signal in the audio signal is determined based on the original audio frequency spectrum curve, so that the recognition accuracy of the background sound signal is improved, the background sound signal in the audio signal is removed, background noise is filtered, and accurate extraction of the control voice signal is realized.
Example 3:
on the basis of the embodiment 2, the voice recognition method of the fan voice control system includes step S201: determining a background sound signal in the audio signal based on an original audio frequency spectrum curve of the audio signal, including:
acquiring an original audio frequency spectrum curve in the preset period of the audio signal, judging whether abrupt change points exist in the original audio frequency spectrum curve, if so, determining first interval time between adjacent abrupt change points in the original audio frequency spectrum curve, and fitting a corresponding first interval time change curve based on the first interval time;
judging whether the abrupt change points are regularly distributed or not based on the first interval time change curve, if so, taking the original audio frequency spectrum curve as a first audio frequency spectrum curve, otherwise, taking the abrupt change points as a starting point to divide an analysis section curve corresponding to the abrupt change points in the original audio frequency spectrum curve;
determining the sudden change amplitude of the sudden change point, determining a reasonable fluctuation amplitude based on the sudden change amplitude, judging whether a fluctuation point with a difference value not exceeding the corresponding reasonable fluctuation amplitude exists in the analysis section curve, connecting continuous fluctuation points to obtain a reasonable fluctuation curve, and determining a second interval time between the reasonable fluctuation curve and the corresponding sudden change point;
determining a reasonable interval time threshold value based on the time span of the reasonable fluctuation curve, when the second interval time is not more than the reasonable interval time threshold value, taking the reasonable fluctuation curve as an analysis curve section of the corresponding abrupt change point, otherwise, judging that the analysis curve section does not exist in the corresponding abrupt change point;
marking all analysis curve segments in the original audio frequency spectrum curve to obtain a first marking result, determining a third interval time between adjacent analysis curve segments based on the first marking result, and fitting a second interval time change curve based on the third interval time;
judging whether the analysis curve segments are regularly distributed or not based on the second interval time variation curve, if so, taking the original audio frequency spectrum curve as a first audio frequency spectrum curve, otherwise, fitting a time span variation curve based on the time span of the analysis curve segments, judging whether the analysis curve segments have a rule or not based on the time span variation curve, if so, taking the original audio frequency spectrum curve as a first audio frequency spectrum curve, otherwise, deleting all the analysis curve segments in the original audio frequency spectrum curve to obtain a first audio frequency spectrum curve;
when the sudden change point does not exist in the original audio frequency spectrum curve, the original audio frequency spectrum curve is taken as a first audio frequency spectrum curve;
and converting the first audio frequency spectrum curve into a sound signal to obtain a background sound signal in the audio signal.
In this embodiment, the original audio frequency spectrum curve is the audio frequency spectrum curve corresponding to the portion of the audio signal within the preset period.
In this embodiment, the abrupt change point is a point where the amplitude of the audio frequency spectrum curve suddenly increases, that is, a point where the difference between the amplitude and the amplitude of the previous point is greater than the average value of the amplitude differences of all the previous adjacent points.
In this embodiment, the first interval time is the interval time between adjacent abrupt change points in the original audio frequency spectrum curve.
In this embodiment, the first interval time variation curve is a curve representing the variation of the first time interval between adjacent abrupt change points in the original audio frequency spectrum curve from left to right.
In this embodiment, whether the abrupt change points are regularly distributed is determined based on the first interval time variation curve, which is: and judging whether the function corresponding to the first interval time change curve is a linear function, if so, judging that the abrupt change points are regularly distributed, and otherwise, judging that the abrupt change points are not regularly distributed.
In this embodiment, the first audio spectrum curve is an audio spectrum curve corresponding to the background sound signal.
In this embodiment, the analysis section curve is a partial curve segment that is divided from left to right in the original audio frequency spectrum curve with the abrupt change point as the starting point.
In this embodiment, the abrupt change amplitude is an amplitude difference between an amplitude corresponding to an abrupt change point of the original audio frequency spectrum curve and an amplitude corresponding to a point before the abrupt change point in the original audio frequency spectrum curve.
In this embodiment, the reasonable fluctuation amplitude is [-0.1m, 0.1m], where m is the sudden change amplitude.
In this embodiment, the fluctuation point is a point in the analysis section curve whose difference from the abrupt change point does not exceed a corresponding reasonable fluctuation range.
In this embodiment, the reasonable fluctuation curve is the curve obtained by connecting continuous fluctuation points.
In this embodiment, the second interval time is the interval time between the reasonable fluctuation curve and the corresponding abrupt change point.
In this embodiment, the reasonable interval time threshold is determined based on the time span of the reasonable fluctuation curve, that is:
the product of the time span of the reasonable fluctuation curve and 0.1 is the reasonable interval time threshold.
In this embodiment, the time span of the reasonable fluctuation curve is the span of the reasonable fluctuation curve on the time coordinate axis.
In this embodiment, the analysis curve segment is a reasonable fluctuation curve when the second interval time does not exceed the reasonable interval time threshold.
In this embodiment, the first marking result is a result obtained after marking all the analysis curve segments in the original audio frequency spectrum curve.
In this embodiment, the third interval time is the interval time between adjacent analysis curve segments.
In this embodiment, the second interval time variation curve is a curve representing a third interval time between adjacent analysis curve segments in the original audio spectrum curve from left to right.
In this embodiment, it is determined whether the analysis curve segment is regularly distributed based on the second interval time variation curve, that is: and judging whether the function corresponding to the second interval time change curve is a linear function, if so, judging that the analysis curve section is regularly distributed, and otherwise, judging that the analysis curve section is not regularly distributed.
In this embodiment, the time span variation curve is a curve representing the time span of the analysis curve segment in the original audio frequency spectrum curve from left to right.
In this embodiment, whether the analysis curve segment has a rule or not is determined based on the time span change curve, which is:
and judging whether the function corresponding to the time span change curve is a linear function, if so, judging that the analysis curve segment has a rule, and otherwise, judging that the analysis curve segment does not have a rule.
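A minimal sketch of the two tests defined above follows; reading "regularly distributed" as a near-perfect straight-line fit of the interval-time change curve is an assumption, since the text specifies a linear function but no tolerance.

```python
# Sketch of the abrupt-change-point test and the regularity check. Treating
# "linear function" as a degree-1 fit with (almost) no residual is an
# assumption; the patent gives no numeric tolerance.
import numpy as np


def abrupt_points(amplitudes: np.ndarray) -> list[int]:
    """Indices where the jump from the previous point exceeds the average
    amplitude difference of all previous adjacent points."""
    diffs = np.diff(amplitudes)
    points = []
    for i in range(1, len(diffs)):
        if diffs[i] > np.mean(np.abs(diffs[:i])):
            points.append(i + 1)  # index of the point after the jump
    return points


def is_regular(intervals: np.ndarray, tol: float = 1e-3) -> bool:
    """Regularly distributed <=> the interval-time change curve is linear,
    i.e. a degree-1 polynomial fit leaves (almost) no residual."""
    if len(intervals) < 3:
        return True  # too few samples to reject linearity
    x = np.arange(len(intervals))
    residual = intervals - np.polyval(np.polyfit(x, intervals, 1), x)
    return float(np.max(np.abs(residual))) <= tol
```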
The beneficial effects of the above technology are: abrupt change points contained in the original audio frequency spectrum curve within the preset period of the audio signal are identified, and analysis curve segments are divided starting from those points; whether the abrupt change points and the analysis curve segments follow a rule is then judged based on the intervals between adjacent abrupt change points, the time spans of the analysis curve segments, and the intervals between adjacent analysis curve segments. Regular points and segments are retained, while irregular ones are deleted from the curve, so that an accurate background signal is obtained; by analyzing the amplitude distribution of the audio frequency spectrum curve, abnormal points are identified, realizing accurate extraction of the background sound in the audio signal.
Example 4:
on the basis of the embodiment 1, the voice recognition method of the fan voice control system includes the following steps: based on the voice feature of the control voice signal, a fan control instruction is recognized in the control voice signal, and the fan is controlled to perform corresponding operations based on the fan control instruction, which includes, with reference to fig. 3:
s301: extracting the voice characteristics of the control voice signal;
s302: judging whether only one source user exists in the control voice signal or not based on the voice characteristics, if so, performing semantic recognition on the control voice signal to obtain a fan control instruction, otherwise, performing primary and secondary recognition on the control voice signal based on the voice characteristics to obtain a primary and secondary recognition result, and obtaining the fan control instruction based on the primary and secondary recognition result;
s303: and controlling the fan to execute corresponding operation based on the fan control instruction.
In this embodiment, the source user is the user who sends the instruction in the control voice signal.
In this embodiment, the primary and secondary recognition results are obtained by performing primary and secondary recognition on the control speech signal based on the speech features.
The beneficial effects of the above technology are: the number of users sending control instructions in the control voice signals is judged, and when more than one user sends the control instructions, primary and secondary recognition is carried out on the control voice signals of each user in the control voice signals, so that accurate control over the fan is achieved, interference of the voice recognition control instructions is further reduced, and the accuracy of intelligent fan control is achieved to a greater extent.
Example 5:
on the basis of the embodiment 4, in the voice recognition method of the fan voice control system, S301: extracting the voice features of the control voice signal, including:
extracting fundamental frequency characteristics of the control voice signal, and calculating a short-time average zero crossing rate of the control voice signal;
and taking the fundamental frequency characteristic and the short-time average zero crossing rate as the voice characteristic of the control voice signal.
In this embodiment, the fundamental frequency feature is the pitch frequency of the control speech signal.
In this embodiment, the short-time average zero crossing rate is the number of times that the signal passes through a zero value in each frame of the control speech signal.
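The sketch below computes both features; autocorrelation is one assumed pitch estimator among many, and the frame length and search band are illustrative.

```python
# Sketch of the two voice features. Autocorrelation is one assumed pitch
# estimator among many; the frame length and search band are illustrative.
import numpy as np


def fundamental_frequency(frame: np.ndarray, rate: int = 16000) -> float:
    """Pitch (fundamental frequency) from the strongest autocorrelation peak
    in a typical 80-400 Hz speech range."""
    ac = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    lag_min, lag_max = rate // 400, rate // 80
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max]))
    return rate / lag


def short_time_zcr(signal: np.ndarray, frame_len: int = 400) -> np.ndarray:
    """Short-time average zero-crossing rate: sign changes per frame,
    normalised by the frame length."""
    n_frames = len(signal) // frame_len
    frames = signal[:n_frames * frame_len].reshape(n_frames, frame_len)
    signs = np.signbit(frames).astype(np.int8)
    return np.abs(np.diff(signs, axis=1)).sum(axis=1) / frame_len
```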
The beneficial effects of the above technology are: the fundamental frequency characteristic and the short-time average zero crossing rate of the control voice signal are used as voice characteristics, the difference of speaking characteristics among individuals can be distinguished based on the voice characteristics, and an important distinguishing basis is provided for the subsequent identification of the total number of source users in the control voice signal.
Example 6:
on the basis of embodiment 4, in the voice recognition method of the fan voice control system, performing primary and secondary recognition on the control voice signal to obtain a primary and secondary recognition result, and obtaining a fan control instruction based on the primary and secondary recognition result, includes:
when more than one source user exists in the control voice signal, dividing the control voice signal based on the voice characteristics to obtain a sub-control voice signal set, and marking each sub-control voice signal in the sub-control voice signal set on the control voice signal to obtain a second marking result;
calculating a first weight of each sub-control voice signal based on the position of each sub-control voice signal in the second marking result in the control voice signal;
calculating the average decibel of each sub-control voice signal, and calculating the final weight of the corresponding sub-control voice signal based on the first weight and the corresponding average decibel;
taking the sub-control voice signal corresponding to the maximum final weight as a main control voice signal, taking the remaining sub-control voice signals except the main control voice signal in the sub-control voice signal set as secondary control voice signals, and taking the main control voice signal and the secondary control voice signals as corresponding primary and secondary recognition results;
and carrying out semantic recognition on the main control voice signals in the primary and secondary recognition results to obtain a fan control instruction.
In this embodiment, the sub-control voice signal set is the set formed by the plurality of sub-control voice signals obtained by dividing the control voice signal based on the voice features when more than one source user exists in the control voice signal.
In this embodiment, the sub-control speech signal is a control speech signal included in the sub-control speech signal set.
In this embodiment, the second marking result is the result obtained after marking each sub-control voice signal in the sub-control voice signal set on the control voice signal.
In this embodiment, based on the position of each sub-control speech signal in the second labeling result in the control speech signal, a first weight of each sub-control speech signal is calculated, that is:
and the ratio of the time span of the sub-control voice signal to the time span between the end point of the sub-control voice signal and the end point of the control voice signal is the first weight of the corresponding sub-control voice signal.
In this embodiment, the average decibel is an average value of decibel amplitudes of the sub-control speech signal.
In this embodiment, based on the first weight and the corresponding average decibel, the final weight of the corresponding sub-control speech signal is calculated, that is:
and taking the product of the average decibel and the first weight as the final weight of the corresponding sub-control voice signal.
In this embodiment, the main control speech signal is the sub-control speech signal corresponding to the maximum final weight.
In this embodiment, the secondary control speech signal is the remaining sub-control speech signals in the set of sub-control speech signals except the primary control speech signal.
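The weighting rules above reduce to a few lines. Representing each sub-control voice signal as a (start, end, mean decibel) triple and guarding against a zero tail are assumptions; the weights themselves follow the definitions above.

```python
# Sketch of the primary/secondary weighting. The (start, end, mean decibel)
# representation and the division-by-zero guard are assumptions; the weight
# formulas follow the definitions in this embodiment.
from dataclasses import dataclass


@dataclass
class SubSignal:
    start_s: float   # where the sub-control voice signal begins
    end_s: float     # where it ends
    mean_db: float   # its average decibel


def final_weight(sub: SubSignal, control_end_s: float) -> float:
    """first weight = span / (control end - sub end);
    final weight = average decibel * first weight."""
    span = sub.end_s - sub.start_s
    tail = max(control_end_s - sub.end_s, 1e-9)  # avoid division by zero
    return sub.mean_db * (span / tail)


def split_primary_secondary(subs: list[SubSignal], control_end_s: float):
    """Main control voice signal = maximum final weight; the rest are
    secondary control voice signals."""
    main = max(subs, key=lambda s: final_weight(s, control_end_s))
    return main, [s for s in subs if s is not main]
```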
The beneficial effects of the above technology are: and determining the final weight of the sub-control voice signals based on the sequencing position of the sub-control voice signals in the control voice signals and the average decibel of the sub-control voice signals, and performing primary and secondary recognition on the sub-control voice signals based on the final weight, so that the fans are respectively controlled according to the primary and secondary status of the control voice signals, and the abnormal control condition caused by the existence of conflicting control instructions in the control voice signals is avoided.
Example 7:
on the basis of embodiment 4, in the voice recognition method of the fan voice control system, performing semantic recognition on the control voice signal to obtain a fan control instruction includes:
carrying out semantic recognition on the control voice signal to obtain a semantic recognition result, and aligning the semantic recognition result with the control voice signal to obtain control voice distribution information;
and integrating the control voice distribution information to obtain a fan control instruction.
In this embodiment, the semantic recognition result is a result obtained by performing semantic recognition on the control speech signal.
In this embodiment, the control speech distribution information is a result obtained by aligning the semantic recognition result with the control speech signal.
The beneficial effects of the above technology are: the control voice distribution information is integrated based on the time distribution information of the voice information after the semantic recognition is carried out on the control voice signal, so that the control instruction of the fan can be better understood, the accuracy of the voice instruction recognition is improved, and the control of the fan is more accurate and intelligent.
Example 8:
on the basis of embodiment 7, in the voice recognition method of the fan voice control system, integrating the control voice distribution information to obtain a fan control instruction includes:
determining fourth interval time between adjacent control voice information in the control voice distribution information, and calculating average interval time based on the fourth interval time;
setting a dividing boundary between adjacent control voice information of which the interval time is greater than the average interval time in the control voice distribution information, and dividing the control voice distribution information based on all the dividing boundaries to obtain a first dividing result;
sequentially matching original control voice information in the control voice distribution information with word segments in a preset instruction library to obtain a matching result;
judging whether unmatched residual word segments exist in the original control voice information or not based on the matching result, if so, calculating the association degree between the residual word segments and the adjacent matched word segments based on the first occurrence probability of the residual word segments in a preset instruction information base, the second occurrence probability of each adjacent matched word segment in the preset instruction information base and the first simultaneous occurrence probability of the residual word segments and the adjacent matched word segments in the preset instruction information base:
$$D = \log_2\!\left(\frac{P_{12}}{P_1 \cdot P_2}\right)$$
where $D$ is the degree of association between the remaining word segment and the adjacent matched word segment, $\log_2$ is the logarithm with base 2, $P_1$ is the first occurrence probability, $P_2$ is the second occurrence probability, and $P_{12}$ is the first simultaneous occurrence probability;
dividing the residual word segments and adjacent matched word segments corresponding to the larger association degree into the same word segments, and dividing the original control voice information by combining the matching results to obtain second division results;
otherwise, dividing the original control voice information based on the matching result to obtain a second division result;
determining control instruction related words contained in the original control voice information based on a control instruction related word list, and performing minimum unit word segment division on the remaining voice information except the control instruction related words in the original control voice information to obtain a minimum division result;
determining the part of speech of each minimum unit word segment in the minimum division result, taking any control instruction related word in the original control voice information as a starting point, simultaneously searching from two ends, and determining the minimum unit word consistent with the part of speech of the corresponding control instruction related word as a secondary related word at each end;
summarizing all the minimum unit word segments between the control instruction related words and the corresponding secondary related words and the control instruction related words to obtain at least one word segment set of each control instruction related word;
integrating the word segment sets based on the part of speech and an instruction grammar frame list of each word segment in the word segment sets to obtain integrated word segments corresponding to each word segment set, determining part of speech connection weight between adjacent word segments in the integrated word segments, third occurrence probability of each word segment in the preset instruction information base and second simultaneous occurrence probability of adjacent word segments in the integrated word segments in the preset instruction information base, and calculating semantic score values of the integrated word segments based on the part of speech connection weight, the third occurrence probability and the second simultaneous occurrence probability:
$$S = \sum_{i=1}^{n-1} w_i \cdot \log_2\!\left(\frac{P_{i,i+1}}{P_i \cdot P_{i+1}}\right)$$
where $S$ is the semantic score value of the integrated speech segment, $n$ is the total number of word segments contained in the integrated speech segment, $w_i$ is the part-of-speech connection weight between the $i$-th word segment and the $(i+1)$-th word segment, $P_i$ and $P_{i+1}$ are the third occurrence probabilities of the $i$-th and $(i+1)$-th word segments, and $P_{i,i+1}$ is the second simultaneous occurrence probability of the $i$-th and $(i+1)$-th word segments;
taking a word segment set corresponding to the maximum semantic score value as a final division word segment set of the control instruction related words, taking a secondary related word in the opposite direction of the retrieval direction corresponding to the final division word segment set as a new starting point to divide word segments, and obtaining a third division result based on all the final division word segment sets until the original control voice information is completely divided;
summarizing the first division result, the second division result and the third division result to obtain a division result set, and obtaining a control speech segment sequence of each division result in the division result set;
performing semantic scoring on the control word segment sequences to obtain word segment semantic scoring values, and sequencing all the control word segment sequences based on the word segment semantic scoring values to obtain sequence sequencing results;
determining capacity parameters of an iteration matrix determinant based on all control language segment sequences, generating an iteration characterization matrix corresponding to each control language segment sequence based on the capacity parameters and the number of fields contained in a division result corresponding to the control language segment sequences, performing cumulative-weighing operation on all the iteration characterization matrices based on a sequence sequencing result to obtain a final division characterization matrix, and dividing and performing semantic recognition on the original control voice information based on the final division characterization matrix to obtain a fan control instruction.
In this embodiment, the fourth interval time is the interval time between adjacent control voice information in the control voice distribution information.
In this embodiment, the average interval time is the average of all fourth interval times.
In this embodiment, the division boundary is used for subsequently dividing the control voice distribution information and obtaining a division position of the first division result.
In this embodiment, the first division result is a result obtained by dividing the control speech distribution information based on all the division boundaries.
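A sketch of this first division step, assuming each item of control voice information is reduced to a (start, end) pair on the time axis:

```python
# Sketch of the first division: place a dividing boundary wherever the gap
# between adjacent control voice information items exceeds the average gap.
# Representing each item as a (start, end) pair is an assumption.
import numpy as np


def first_division(items: list[tuple[float, float]]) -> list[list[tuple[float, float]]]:
    if not items:
        return []
    gaps = np.array([items[i + 1][0] - items[i][1] for i in range(len(items) - 1)])
    avg = gaps.mean() if gaps.size else 0.0
    groups, current = [], [items[0]]
    for gap, item in zip(gaps, items[1:]):
        if gap > avg:          # dividing boundary
            groups.append(current)
            current = []
        current.append(item)
    groups.append(current)
    return groups
```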
In this embodiment, the preset command library is a command library prepared in advance and storing all fan control commands.
In this embodiment, the matching result is a result obtained by sequentially matching the original control speech information in the control speech distribution information with the word segments in the preset instruction library.
In this embodiment, the remaining word segments are the word segments that are not matched in the original control speech information.
In this embodiment, the first occurrence probability is the occurrence probability of the remaining word segments in the preset instruction information base.
In this embodiment, the second occurrence probability is the occurrence probability of the adjacent matched word segment in the preset instruction information base.
In this embodiment, the first simultaneous occurrence probability is a probability that the remaining word segment and the adjacent matched word segment occur simultaneously in the preset instruction information base.
In this embodiment, the adjacent matched word segments are word segments adjacent to the remaining word segments and already matched with word segments in the preset instruction library.
In this embodiment, the association degree is a numerical value representing the association degree between the remaining word segment and the adjacent matched word segment.
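The association degree as reconstructed above is pointwise mutual information; the one-line transcription below assumes the caller estimates the three probabilities from the preset instruction information base.

```python
# Direct transcription of the association degree (pointwise mutual
# information); the caller is assumed to estimate the three probabilities
# from the preset instruction information base.
import math


def association_degree(p_first: float, p_second: float, p_joint: float) -> float:
    """log2(P12 / (P1 * P2)): positive when the remaining word segment and
    the adjacent matched word segment co-occur more often than independent
    segments would."""
    return math.log2(p_joint / (p_first * p_second))
```

For example, association_degree(0.02, 0.05, 0.004) returns log2(4) = 2, indicating the two segments co-occur four times more often than chance.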
In this embodiment, the second division result is a division result obtained by dividing the remaining word segments and the adjacent matched word segments corresponding to the larger association degree into the same word segment and dividing the original control speech information by combining the matching result.
In this embodiment, the list of words related to the control command is a list of words included in the fan control command.
In this embodiment, the control instruction related word is a word in a control instruction related word list included in the original control voice information.
In this embodiment, the minimum division result is a result obtained by performing minimum unit word segment division on the remaining voice information except the control instruction related word in the original control voice information.
In this embodiment, the secondary related word is the smallest unit word determined at each end of the control instruction related word and having the same part of speech as the corresponding control instruction related word and the closest distance to the corresponding control instruction related word.
In this embodiment, the minimum unit word is a word in the minimum division result.
In this embodiment, the word segment set is a set formed by all the minimum unit word segments between the control instruction related word and the corresponding secondary related word and the word segment of each control instruction related word obtained after summarizing the control instruction related word.
In this embodiment, the integrated speech segment is a speech segment obtained by integrating the word segment set based on the part of speech of each word segment in the word segment set and the instruction grammar frame list.
In this embodiment, the instruction syntax frame list is a list including instruction syntax frames, for example: verb plus noun.
In this embodiment, the part-of-speech connection weight is a value representing the degree of part-of-speech connection between adjacent word segments in the integrated speech segment, determined according to a preset part-of-speech connection weight list, for example: the connection weight of a noun and a verb is 1, and the connection weight of a noun and a noun is 0.5.
In this embodiment, the third occurrence probability is the occurrence probability of each word segment in the integrated speech segment in the preset instruction information base.
In this embodiment, the second simultaneous occurrence probability is a probability that adjacent word segments in the integrated speech segment occur simultaneously in the preset instruction information base.
In this embodiment, the semantic score value is a numerical value obtained by scoring the whole speech segment.
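Under the reconstruction given above (a part-of-speech-weighted sum of pairwise co-occurrence terms, inferred from the listed variables rather than quoted from the original formula image), the semantic score value can be sketched as:

```python
# Sketch of the semantic score under the reconstruction above: a part-of-
# speech-weighted sum of pairwise PMI terms. The combination is inferred
# from the listed variables, not quoted from the original formula image.
import math


def semantic_score(pos_weights: list[float],   # w_i for adjacent pairs
                   probs: list[float],         # P_i, third occurrence probs
                   joint_probs: list[float]) -> float:  # P_{i,i+1}
    """S = sum_i w_i * log2(P_{i,i+1} / (P_i * P_{i+1})), i = 1..n-1."""
    assert len(pos_weights) == len(probs) - 1 == len(joint_probs)
    return sum(w * math.log2(pj / (probs[i] * probs[i + 1]))
               for i, (w, pj) in enumerate(zip(pos_weights, joint_probs)))
```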
In this embodiment, the final divided word segment set is the word segment set corresponding to the maximum semantic score value.
In this embodiment, the third division result is the result obtained by summarizing all the final division word segment sets.
In this embodiment, the partitioning result set is a set obtained by summarizing the first partitioning result, the second partitioning result, and the third partitioning result.
In this embodiment, the control speech segment sequence is a speech segment sequence obtained by dividing the original control speech information based on the corresponding division result in the division result set.
In this embodiment, the word segment semantic score value is the score obtained by semantically scoring the control speech segment sequence (on the same principle as the semantic score value of the integrated speech segment).
In this embodiment, the sequence ordering result is a result obtained by ordering all the control word segment sequences from large to small based on the word segment semantic score value.
In this embodiment, the iteration matrix determinant is the matrix, corresponding to each control speech segment sequence, that is used in the subsequent cumulative-weighing iteration.
In this embodiment, the capacity parameter of the iteration matrix determinant is determined based on all the control speech segment sequences, that is:
determining the total number of speech segments contained in each control speech segment sequence, and taking the maximum total number of speech segments across all the control speech segment sequences as the number of rows and columns of the iteration matrix determinant.
In this embodiment, the capacity parameter is the number of rows and columns of the determinant of the iterative matrix.
In this embodiment, an iterative characterization matrix corresponding to each control speech segment sequence is generated based on the capacity parameter and the number of fields contained in the division result corresponding to that control speech segment sequence, that is:
determining the total number of fields contained in each speech segment in the control speech segment sequence, and forming the corresponding field total number sequence from these field totals;
determining, from the ordinal of each control speech segment sequence in the sequence ordering result, the row of the corresponding iterative characterization matrix that is to hold its field total number sequence, setting that row to the field total number sequence, and setting every other value in the matrix to 0, so as to obtain the corresponding iterative characterization matrix.
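A minimal Python sketch of this matrix construction, assuming fields are counted as characters per speech segment and that the number of sequences does not exceed the capacity parameter; the data below is illustrative.

```python
import numpy as np

def iteration_matrix(field_counts, rank, capacity):
    """Iterative characterization matrix for one control speech segment
    sequence: a capacity x capacity matrix whose row `rank` (the sequence's
    ordinal in the sequence ordering result) holds the field total number
    sequence, with every other entry set to 0."""
    m = np.zeros((capacity, capacity))
    m[rank, :len(field_counts)] = field_counts
    return m

# Field total number sequences of three division results (characters per segment).
sequences = [[4, 3, 5], [4, 8], [4, 3, 5]]
# Capacity parameter: the maximum total number of speech segments in any sequence.
capacity = max(len(s) for s in sequences)
matrices = [iteration_matrix(s, rank, capacity) for rank, s in enumerate(sequences)]
print(matrices[1])  # row 1 carries [4, 8, 0]; all other entries are 0
```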
In this embodiment, a cumulative-weighing operation is performed on all the iterative characterization matrices based on the sequence ordering result to obtain the final division characterization matrix, that is:
following the order given by the sequence ordering result, cumulatively weighing the iterative characterization matrix corresponding to each control speech segment sequence in turn to obtain the final division characterization matrix.
In this embodiment, the final division characterization matrix is the matrix obtained by performing this cumulative-weighing operation on all the iterative characterization matrices.
In this embodiment, the original control voice information is divided and semantically recognized based on the final division characterization matrix to obtain a fan control instruction, that is:
determining the total number u of iterative characterization matrices, taking the u-th root of the final division characterization matrix element-wise, taking the non-zero values in the corresponding row of the resulting matrix as a field total number sequence, and dividing the original control voice information from left to right according to the field total number sequence to obtain a voice information division result;
and performing semantic recognition on the voice information division result to obtain a fan control instruction.
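The exact arithmetic of the cumulative-weighing step is hard to pin down from the translated text, so the Python sketch below only mirrors the recovery and division steps: taking the u-th root of a final division characterization matrix element-wise, reading a non-zero row as the field total number sequence, and splitting the transcript left to right. The sample matrix and transcript are assumptions.

```python
import numpy as np

def field_sequence_from(final_matrix, u):
    """Take the u-th root of the final division characterization matrix
    element-wise (u = total number of iterative characterization matrices),
    then read the non-zero values of the first surviving row as the field
    total number sequence."""
    rooted = np.power(final_matrix, 1.0 / u)
    for row in rooted:
        if np.any(row != 0):
            return [int(round(v)) for v in row if v > 0]
    return []

def divide_by_field_counts(text, field_counts):
    """Divide the original control voice information from left to right
    according to the field total number sequence."""
    segments, pos = [], 0
    for n in field_counts:
        segments.append(text[pos:pos + n])
        pos += n
    return segments

# Hypothetical final matrix after combining u = 3 iterative matrices; the row
# entries are the cubes of the agreed field counts 4, 3, 5.
final = np.zeros((3, 3))
final[0] = [64, 27, 125]
counts = field_sequence_from(final, 3)                  # -> [4, 3, 5]
print(divide_by_field_counts("openfanspeed", counts))   # ['open', 'fan', 'speed']
```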
The beneficial effects of the above technology are: the control voice distribution information is divided in three ways, namely by interval time within the control voice distribution information, by associating the remaining word segments with the results matched against the preset instruction library, and by part-of-speech-based integration. An iterative characterization matrix is generated from the total number of fields in each word segment of each division result, so that each matrix characterizes one division result. Based on the semantic score values of the control speech segment sequences, the iterative characterization matrices are combined by the cumulative-weighing iteration and the subsequent root operation, which identifies the word segments divided identically by all three modes, so that the final division result retains the divisions on which the three modes agree. Dividing and semantically recognizing the original control voice information with these three division modes combined improves the recognition accuracy of the fan control instruction.
Example 9:
on the basis of embodiment 1, in the voice recognition method of the fan voice control system, S3: after a fan control instruction is recognized in the control voice signal and the fan is controlled to execute the corresponding operation based on the fan control instruction, the method further comprises:
judging whether a source user of the control voice signal is a historical user in a historical user library or not based on the voice characteristics to obtain a judgment result;
storing the fan control instructions in a historical instruction library of the source user.
In this embodiment, the judgment result indicates whether the source user of the control voice signal is a historical user in the historical user library.
In this embodiment, the historical instruction library is the instruction library storing all the fan control instructions issued by the corresponding source user.
The beneficial effects of the above technology are: the fan control instructions are stored separately for each user, providing an information basis for optimizing voice instruction recognition.
Example 10:
on the basis of embodiment 9, in the voice recognition method of the fan voice control system, judging whether the source user of the control voice signal is a historical user in the historical user library based on the voice feature to obtain a judgment result comprises:
judging whether a user voice feature consistent with the voice feature exists in the historical user library, and if so, taking as the judgment result that the source user of the control voice signal is a historical user in the historical user library;
otherwise, taking as the judgment result that the source user of the control voice signal is not a historical user in the historical user library.
In this embodiment, the user voice features are the voice features of each user stored in the historical user library, against which the voice feature of the control voice signal is compared.
In this embodiment, the historical user library is used to store each user who has performed voice control on the fan and the corresponding user voice characteristics.
The beneficial effects of the above technology are: by matching the voice feature of the control voice signal against the user voice features in the historical user library, whether the source user of the control voice signal is a historical user can be judged, providing a basis for the subsequent per-user storage of fan control instructions.
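A toy Python sketch of the matching and per-user storage described in examples 9 and 10; the (fundamental frequency, short-time average zero-crossing rate) feature pair, the tolerances, and the library layout are assumptions, not the patent's implementation.

```python
def is_historical_user(voice_feature, user_library, f0_tol=1.0, zcr_tol=0.01):
    """Judge whether the source user is a historical user by comparing the
    extracted voice feature, here a (fundamental frequency in Hz, short-time
    average zero-crossing rate) pair, against each stored user feature."""
    f0, zcr = voice_feature
    for user, (f0_stored, zcr_stored) in user_library.items():
        if abs(f0 - f0_stored) <= f0_tol and abs(zcr - zcr_stored) <= zcr_tol:
            return True, user
    return False, None

# Historical user library and per-user historical instruction libraries.
users = {"user_a": (182.0, 0.12), "user_b": (118.5, 0.09)}
history = {"user_a": [], "user_b": []}

matched, user = is_historical_user((182.3, 0.12), users)
if matched:
    # Store the recognized fan control instruction under the source user.
    history[user].append("increase fan speed")
print(matched, user, history)  # True user_a {'user_a': ['increase fan speed'], ...}
```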
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (8)

1. A voice recognition method of a fan voice control system is characterized by comprising the following steps:
s1: acquiring audio signals within a preset range around the fan;
s2: removing a background sound signal in the audio signal to obtain a control voice signal;
s3: identifying a fan control instruction in the control voice signal based on the voice feature of the control voice signal, and controlling the fan to execute corresponding operation based on the fan control instruction;
wherein, S2: removing the background sound signal in the audio signal to obtain a control voice signal, comprising:
s201: determining a background sound signal in the audio signal based on an original audio frequency spectrum curve of the audio signal;
s202: removing a background sound signal in the audio signal to obtain the control voice signal;
wherein, S201: determining a background sound signal in the audio signal based on an original audio frequency spectrum curve of the audio signal, including:
acquiring an original audio frequency spectrum curve in the preset period of the audio signal, judging whether abrupt change points exist in the original audio frequency spectrum curve, if so, determining first interval time between adjacent abrupt change points in the original audio frequency spectrum curve, and fitting a corresponding first interval time change curve based on the first interval time;
judging whether the abrupt change points are regularly distributed or not based on the first interval time change curve, if so, taking the original audio frequency spectrum curve as a first audio frequency spectrum curve, otherwise, taking the abrupt change points as a starting point to divide an analysis section curve corresponding to the abrupt change points in the original audio frequency spectrum curve;
determining the sudden change amplitude of the sudden change point, determining a reasonable fluctuation amplitude based on the sudden change amplitude, judging whether a fluctuation point with a difference value not exceeding the corresponding reasonable fluctuation amplitude exists in the analysis section curve, connecting continuous fluctuation points to obtain a reasonable fluctuation curve, and determining a second interval time between the reasonable fluctuation curve and the corresponding sudden change point;
determining a reasonable interval time threshold value based on the time span of the reasonable fluctuation curve, when the second interval time is not more than the reasonable interval time threshold value, taking the reasonable fluctuation curve as an analysis curve segment corresponding to the sudden change point, otherwise, judging that the analysis curve segment does not exist in the corresponding sudden change point;
marking all analysis curve segments in the original audio frequency spectrum curve to obtain a first marking result, determining a third interval time between adjacent analysis curve segments based on the first marking result, and fitting a second interval time change curve based on the third interval time;
judging whether the analysis curve segments are regularly distributed or not based on the second interval time variation curve, if so, taking the original audio frequency spectrum curve as a first audio frequency spectrum curve, otherwise, fitting a time span variation curve based on the time span of the analysis curve segments, judging whether the analysis curve segments have a rule or not based on the time span variation curve, if so, taking the original audio frequency spectrum curve as a first audio frequency spectrum curve, otherwise, deleting all the analysis curve segments in the original audio frequency spectrum curve to obtain a first audio frequency spectrum curve;
when the sudden change point does not exist in the original audio frequency spectrum curve, the original audio frequency spectrum curve is taken as a first audio frequency spectrum curve;
and converting the first audio frequency spectrum curve into a sound signal to obtain a background sound signal in the audio signal.
2. The voice recognition method of the voice control system of the fan according to claim 1, wherein S3: based on the voice characteristics of the control voice signal, a fan control instruction is identified in the control voice signal, and the fan is controlled to execute corresponding operations based on the fan control instruction, including:
s301: extracting the voice characteristics of the control voice signal;
s302: judging whether only one source user exists in the control voice signal or not based on the voice characteristics, if so, performing semantic recognition on the control voice signal to obtain a fan control instruction, otherwise, performing primary and secondary recognition on the control voice signal based on the voice characteristics to obtain a primary and secondary recognition result, and obtaining the fan control instruction based on the primary and secondary recognition result;
s303: and controlling the fan to execute corresponding operation based on the fan control instruction.
3. The voice recognition method of the voice control system for the fan as claimed in claim 2, wherein S301: extracting the voice features of the control voice signal, including:
extracting fundamental frequency characteristics of the control voice signal, and calculating a short-time average zero-crossing rate of the control voice signal;
and taking the fundamental frequency characteristic and the short-time average zero crossing rate as the voice characteristic of the control voice signal.
4. The voice recognition method of claim 2, wherein performing primary and secondary recognition on the control voice signal to obtain a primary and secondary recognition result, and obtaining a fan control command based on the primary and secondary recognition result comprises:
when more than one source user exists in the control voice signal, dividing the control voice signal based on the voice characteristics to obtain a sub-control voice signal set, and marking each sub-control voice signal in the sub-control voice signal set on the control voice signal to obtain a second marking result;
calculating a first weight of each sub-control voice signal based on the position of each sub-control voice signal in the second marking result in the control voice signal;
calculating the average decibel of each sub-control voice signal, and calculating the final weight of the corresponding sub-control voice signal based on the first weight and the corresponding average decibel;
taking the sub-control voice signal corresponding to the maximum final weight as a main control voice signal, taking the remaining sub-control voice signals except the main control voice signal in the sub-control voice signal set as secondary control voice signals, and taking the main control voice signal and the secondary control voice signals as corresponding primary and secondary recognition results;
and carrying out semantic recognition on the main control voice signals in the primary and secondary recognition results to obtain a fan control instruction.
5. The voice recognition method of claim 2, wherein performing semantic recognition on the control voice signal to obtain a fan control command comprises:
performing semantic recognition on the control voice signal to obtain a semantic recognition result, and aligning the semantic recognition result with the control voice signal to obtain control voice distribution information;
and integrating the control voice distribution information to obtain a fan control instruction.
6. The method as claimed in claim 5, wherein the step of integrating the control speech distribution information to obtain the fan control command comprises:
determining fourth interval time between adjacent control voice information in the control voice distribution information, and calculating average interval time based on the fourth interval time;
setting a dividing boundary between adjacent control voice information with the interval time larger than the average interval time in the control voice distribution information, and dividing the control voice distribution information based on all the dividing boundaries to obtain a first dividing result;
sequentially matching original control voice information in the control voice distribution information with word segments in a preset instruction library to obtain a matching result;
judging whether unmatched residual word segments exist in the original control voice information or not based on the matching result, if so, calculating the association degree between the residual word segments and the adjacent matched word segments based on the first occurrence probability of the residual word segments in a preset instruction information base, the second occurrence probability of each adjacent matched word segment in the preset instruction information base and the first simultaneous occurrence probability of the residual word segments and the adjacent matched word segments in the preset instruction information base:
$$D = \log_2\frac{P_{12}}{P_1 \cdot P_2}$$
in the formula, $D$ is the degree of association between the remaining word segment and the adjacent matched word segment, $\log_2$ is the logarithm with base 2, $P_1$ is the first occurrence probability, $P_2$ is the second occurrence probability, and $P_{12}$ is the first simultaneous occurrence probability;
dividing each remaining word segment and the adjacent matched word segment corresponding to the larger association degree into the same word segment, and dividing the original control voice information by combining the matching result to obtain a second division result;
otherwise, the original control voice information is divided based on the matching result to obtain a second division result;
determining control instruction related words contained in the original control voice information based on a control instruction related word list, and performing minimum unit word segment division on the remaining voice information except the control instruction related words in the original control voice information to obtain a minimum division result;
determining the part of speech of each minimum unit word segment in the minimum division result, taking any control instruction related word in the original control voice information as a starting point, simultaneously searching from two ends, and determining the minimum unit word consistent with the part of speech of the corresponding control instruction related word as a secondary related word at each end;
summarizing all the minimum unit word segments between the control instruction related words and the corresponding secondary related words and the control instruction related words to obtain at least one word segment set of each control instruction related word;
integrating the word segment sets based on the part of speech and an instruction grammar frame list of each word segment in the word segment sets to obtain integrated word segments corresponding to each word segment set, determining part of speech connection weight between adjacent word segments in the integrated word segments, third occurrence probability of each word segment in the preset instruction information base and second simultaneous occurrence probability of adjacent word segments in the integrated word segments in the preset instruction information base, and calculating semantic score values of the integrated word segments based on the part of speech connection weight, the third occurrence probability and the second simultaneous occurrence probability:
$$S = \sum_{i=1}^{n-1} q_i \cdot \log_2\frac{P_{i,i+1}}{P_i \cdot P_{i+1}}$$
in the formula, $S$ is the semantic score value of the integrated speech segment, $n$ is the total number of word segments contained in the integrated speech segment, $q_i$ is the part-of-speech connection weight between the $i$th word segment and the $(i+1)$th word segment in the integrated speech segment, $P_i$ and $P_{i+1}$ are the third occurrence probabilities of the $i$th and $(i+1)$th word segments in the preset instruction information base, and $P_{i,i+1}$ is the second simultaneous occurrence probability of the $i$th and $(i+1)$th word segments in the integrated speech segment;
taking a word segment set corresponding to the maximum semantic score value as a final division word segment set of the control instruction related words, taking a secondary related word in the opposite direction of the retrieval direction corresponding to the final division word segment set as a new starting point to divide word segments, and obtaining a third division result based on all the final division word segment sets until the original control voice information is completely divided;
summarizing the first division result, the second division result and the third division result to obtain a division result set, and obtaining a control speech segment sequence of each division result in the division result set;
semantic scoring is carried out on the control word segment sequences to obtain word segment semantic scoring values, all the control word segment sequences are ranked based on the word segment semantic scoring values, and sequence ranking results are obtained;
determining capacity parameters of an iteration matrix determinant based on all control language segment sequences, generating an iteration characterization matrix corresponding to each control language segment sequence based on the capacity parameters and the number of fields contained in a division result corresponding to the control language segment sequences, performing cumulative-weighing operation on all the iteration characterization matrices based on a sequence sequencing result to obtain a final division characterization matrix, and dividing and performing semantic recognition on the original control voice information based on the final division characterization matrix to obtain a fan control instruction.
7. The voice recognition method of the voice control system for the fan according to claim 1, wherein S3: after a fan control instruction is recognized in the control voice signal and the fan is controlled to execute corresponding operation based on the fan control instruction, the method comprises the following steps:
judging whether a source user of the control voice signal is a historical user in a historical user library or not based on the voice characteristics to obtain a judgment result;
storing the fan control instructions in a historical instruction library of the source user.
8. The method as claimed in claim 7, wherein judging whether the source user of the control voice signal is a historical user in the historical user library based on the voice feature to obtain a judgment result comprises:
judging whether a user voice feature consistent with the voice feature exists in the historical user library, and if so, taking as the judgment result that the source user of the control voice signal is a historical user in the historical user library;
otherwise, taking as the judgment result that the source user of the control voice signal is not a historical user in the historical user library.
CN202211125810.0A 2022-09-16 2022-09-16 Voice recognition method of fan voice control system Active CN115206323B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211125810.0A CN115206323B (en) 2022-09-16 2022-09-16 Voice recognition method of fan voice control system

Publications (2)

Publication Number Publication Date
CN115206323A true CN115206323A (en) 2022-10-18
CN115206323B CN115206323B (en) 2022-11-29

Family

ID=83572796

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211125810.0A Active CN115206323B (en) 2022-09-16 2022-09-16 Voice recognition method of fan voice control system

Country Status (1)

Country Link
CN (1) CN115206323B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011033717A (en) * 2009-07-30 2011-02-17 Secom Co Ltd Noise suppression device
US20130010974A1 (en) * 2011-07-06 2013-01-10 Honda Motor Co., Ltd. Sound processing device, sound processing method, and sound processing program
US20140180682A1 (en) * 2012-12-21 2014-06-26 Sony Corporation Noise detection device, noise detection method, and program
US9431024B1 (en) * 2015-03-02 2016-08-30 Faraday Technology Corp. Method and apparatus for detecting noise of audio signals
JP2017191332A (en) * 2017-06-22 2017-10-19 株式会社Jvcケンウッド Noise detection device, noise detection method, noise reduction device, noise reduction method, communication device, and program
CN113035192A (en) * 2021-02-26 2021-06-25 深圳市超维实业有限公司 Voice recognition method of fan voice control system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YOSHIHISA UEMURA等: ""Musical noise generation analysis for noise reduction methods based on spectral subtraction and MMSE STSA estimation"", 《IEEE XPLORE》 *
刘静: ""机载环境下语音噪声抑制技术研究及实现"", 《中国优秀硕士学位论文全文数据库》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant