CN115206323A - Voice recognition method of fan voice control system - Google Patents
- Publication number
- CN115206323A CN115206323A CN202211125810.0A CN202211125810A CN115206323A CN 115206323 A CN115206323 A CN 115206323A CN 202211125810 A CN202211125810 A CN 202211125810A CN 115206323 A CN115206323 A CN 115206323A
- Authority
- CN
- China
- Prior art keywords
- control
- voice
- word
- fan
- control voice
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- F—MECHANICAL ENGINEERING; LIGHTING; HEATING; WEAPONS; BLASTING
- F04—POSITIVE - DISPLACEMENT MACHINES FOR LIQUIDS; PUMPS FOR LIQUIDS OR ELASTIC FLUIDS
- F04D—NON-POSITIVE-DISPLACEMENT PUMPS
- F04D27/00—Control, e.g. regulation, of pumps, pumping installations or pumping systems specially adapted for elastic fluids
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
Abstract
The invention provides a voice recognition method of a fan voice control system, which comprises the following steps: s1: acquiring audio signals within a preset range around the fan; s2: removing a background sound signal in the audio signal to obtain a control voice signal; s3: identifying a fan control instruction in the control voice signal based on the voice characteristics of the control voice signal, and controlling the fan to execute corresponding operation based on the fan control instruction; the method is used for performing voice semantic recognition after background sound elimination is performed on the audio signals acquired around the fan, so that an accurate fan control instruction is acquired, and instruction control of the fan based on voice is realized.
Description
Technical Field
The invention relates to the technical field of voice recognition, in particular to a voice recognition method of a fan voice control system.
Background
Speech recognition technology, also known as automatic speech recognition (ASR), aims to convert the lexical content of human speech into computer-readable input, such as keystrokes, binary codes or character sequences. Applications of speech recognition technology include voice dialing, voice navigation, indoor device control, voice document retrieval, simple dictation data entry, and the like. Combined with other natural language processing techniques such as machine translation and speech synthesis, speech recognition can support more complex applications such as fan voice control systems.
Speech recognition technology must be able to exclude the influence of various environmental factors. At present, environmental noise and competing speech have the most significant influence on recognition accuracy; in a public place, it is nearly impossible for a computer to reliably understand a particular speaker, which greatly limits the application range of voice technology. Intelligently discarding environmental speech and isolating the voice of the user who issues control instructions in such settings is a difficult task.
Therefore, the invention provides a voice recognition method of the fan voice control system.
Disclosure of Invention
The invention provides a voice recognition method of a fan voice control system, which is used for carrying out voice semantic recognition after background sound elimination is carried out on an audio signal acquired around a fan to acquire an accurate fan control instruction, intelligently abandoning environmental voice and acquiring the sound of a user for sending the control instruction from the environmental voice, and realizing instruction control on the fan based on voice.
The invention provides a voice recognition method of a fan voice control system, which comprises the following steps:
S1: acquiring audio signals within a preset range around the fan;
S2: removing a background sound signal in the audio signal to obtain a control voice signal;
S3: identifying a fan control instruction in the control voice signal based on the voice characteristics of the control voice signal, and controlling the fan to execute a corresponding operation based on the fan control instruction.
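As an illustrative sketch only (the function name and the three callable parameters below are hypothetical, not part of the claimed method), steps S1-S3 can be expressed as a simple pipeline:

```python
def fan_voice_pipeline(audio_signal, remove_background, identify_command, execute):
    """Orchestrate steps S1-S3: the audio has been acquired (S1),
    background sound is removed (S2), and the recognized command
    drives the fan (S3)."""
    control_voice = remove_background(audio_signal)   # S2: strip background sound
    command = identify_command(control_voice)         # S3: recognize the instruction
    execute(command)                                  # S3: drive the fan
    return command
```

Concrete implementations of the three callables would follow the preferred embodiments described below.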
Preferably, in the voice recognition method of the voice control system for the fan, S2: removing the background sound signal in the audio signal to obtain a control voice signal, comprising:
S201: determining a background sound signal in the audio signal based on an original audio frequency spectrum curve of the audio signal;
S202: removing the background sound signal from the audio signal to obtain the control voice signal.
Preferably, in the voice recognition method of the voice control system for the fan, S201: determining a background sound signal in the audio signal based on an original audio frequency spectrum curve of the audio signal, including:
acquiring an original audio frequency spectrum curve in the preset period of the audio signal, judging whether abrupt change points exist in the original audio frequency spectrum curve, if so, determining first interval time between adjacent abrupt change points in the original audio frequency spectrum curve, and fitting a corresponding first interval time change curve based on the first interval time;
judging whether the abrupt change points are regularly distributed or not based on the first interval time change curve, if so, taking the original audio frequency spectrum curve as a first audio frequency spectrum curve, otherwise, taking the abrupt change points as a starting point to divide an analysis section curve corresponding to the abrupt change points in the original audio frequency spectrum curve;
determining the sudden change amplitude of the sudden change point, determining a reasonable fluctuation amplitude based on the sudden change amplitude, judging whether a fluctuation point with a difference value not exceeding the corresponding reasonable fluctuation amplitude exists in the analysis section curve, connecting continuous fluctuation points to obtain a reasonable fluctuation curve, and determining a second interval time between the reasonable fluctuation curve and the corresponding sudden change point;
determining a reasonable interval time threshold value based on the time span of the reasonable fluctuation curve, when the second interval time is not more than the reasonable interval time threshold value, taking the reasonable fluctuation curve as an analysis curve segment corresponding to the sudden change point, otherwise, judging that the analysis curve segment does not exist in the corresponding sudden change point;
marking all analysis curve segments in the original audio frequency spectrum curve to obtain a first marking result, determining a third interval time between adjacent analysis curve segments based on the first marking result, and fitting a second interval time change curve based on the third interval time;
judging whether the analysis curve segments are regularly distributed or not based on the second interval time variation curve, if so, taking the original audio frequency spectrum curve as a first audio frequency spectrum curve, otherwise, fitting a time span variation curve based on the time span of the analysis curve segments, judging whether the analysis curve segments have a rule or not based on the time span variation curve, if so, taking the original audio frequency spectrum curve as a first audio frequency spectrum curve, otherwise, deleting all the analysis curve segments in the original audio frequency spectrum curve to obtain a first audio frequency spectrum curve;
when the original audio frequency spectrum curve does not have the abrupt change point, the original audio frequency spectrum curve is taken as a first audio frequency spectrum curve;
and converting the first audio frequency spectrum curve into a sound signal to obtain a background sound signal in the audio signal.
Preferably, in the voice recognition method of the voice control system for a fan, S3: based on the voice characteristics of the control voice signal, a fan control instruction is identified in the control voice signal, and the fan is controlled to execute corresponding operations based on the fan control instruction, including:
S301: extracting the voice characteristics of the control voice signal;
S302: judging whether only one source user exists in the control voice signal based on the voice characteristics; if so, performing semantic recognition on the control voice signal to obtain a fan control instruction; otherwise, performing primary and secondary recognition on the control voice signal based on the voice characteristics to obtain a primary and secondary recognition result, and obtaining the fan control instruction based on the primary and secondary recognition result;
S303: controlling the fan to execute the corresponding operation based on the fan control instruction.
Preferably, in the voice recognition method of the voice control system for the fan, S301: extracting the voice features of the control voice signal, including:
extracting fundamental frequency characteristics of the control voice signal, and calculating a short-time average zero-crossing rate of the control voice signal;
and taking the fundamental frequency characteristic and the short-time average zero crossing rate as the voice characteristic of the control voice signal.
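The short-time average zero-crossing rate named above is a standard speech feature; a minimal sketch follows (the frame length and hop size are free parameters, not specified by the method):

```python
import numpy as np

def short_time_zcr(signal, frame_len, hop):
    """Short-time average zero-crossing rate: for each frame, count sign
    changes between consecutive samples and normalize by the frame length.
    Note: np.sign maps exact zeros to 0, which this simple sketch ignores."""
    rates = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        crossings = np.sum(np.abs(np.diff(np.sign(frame)))) / 2
        rates.append(crossings / frame_len)
    return np.array(rates)
```

Voiced speech typically shows a low zero-crossing rate and unvoiced or noisy segments a high one, which is why the method pairs it with the fundamental frequency feature.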
Preferably, in the voice recognition method of the fan voice control system, performing primary and secondary recognition on the control voice signal to obtain a primary and secondary recognition result, and obtaining the fan control instruction based on the primary and secondary recognition result, includes:
when more than one source user exists in the control voice signal, dividing the control voice signal based on the voice characteristics to obtain a sub-control voice signal set, and marking each sub-control voice signal in the sub-control voice signal set on the control voice signal to obtain a second marking result;
calculating a first weight of each sub-control voice signal based on the position of each sub-control voice signal in the second marking result in the control voice signal;
calculating the average decibel of each sub-control voice signal, and calculating the final weight of the corresponding sub-control voice signal based on the first weight and the corresponding average decibel;
taking the sub-control voice signal corresponding to the maximum final weight as a main control voice signal, taking the remaining sub-control voice signals except the main control voice signal in the sub-control voice signal set as secondary control voice signals, and taking the main control voice signal and the secondary control voice signals as corresponding primary and secondary recognition results;
and carrying out semantic recognition on the main control voice signals in the primary and secondary recognition results to obtain a fan control instruction.
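The weighting scheme above leaves the exact formulas open. The sketch below assumes, purely for illustration, that the first weight decays with the sub-signal's start position (earlier speech weighted more heavily) and that the final weight is the product of the first weight and the average decibel level; both choices are assumptions, not part of the disclosure:

```python
def split_primary_secondary(sub_signals):
    """sub_signals: list of dicts with 'position' (start time within the
    control voice signal) and 'avg_db' (average decibel level).
    Returns (primary signal, list of secondary signals)."""
    scored = []
    for s in sub_signals:
        first_weight = 1.0 / (1.0 + s["position"])  # assumed: earlier -> heavier
        final_weight = first_weight * s["avg_db"]   # assumed: product combination
        scored.append((final_weight, s))
    scored.sort(key=lambda t: t[0], reverse=True)   # max final weight -> primary
    primary = scored[0][1]
    secondary = [s for _, s in scored[1:]]
    return primary, secondary
```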
Preferably, in the voice recognition method of the fan voice control system, performing semantic recognition on the control voice signal to obtain a fan control instruction includes:
performing semantic recognition on the control voice signal to obtain a semantic recognition result, and aligning the semantic recognition result with the control voice signal to obtain control voice distribution information;
and integrating the control voice distribution information to obtain a fan control instruction.
Preferably, in the voice recognition method of the fan voice control system, integrating the control voice distribution information to obtain a fan control instruction includes:
determining fourth interval time between adjacent control voice information in the control voice distribution information, and calculating average interval time based on the fourth interval time;
setting a dividing boundary between adjacent control voice information of which the interval time is greater than the average interval time in the control voice distribution information, and dividing the control voice distribution information based on all the dividing boundaries to obtain a first dividing result;
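The first-division rule above (place a boundary wherever the gap between adjacent control voice information exceeds the average gap) can be sketched as follows, representing each piece of control voice information by its timestamp:

```python
def first_division(timestamps):
    """Group a sorted sequence of word timestamps, splitting wherever the
    gap to the next word exceeds the mean gap (the fourth-interval rule)."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    mean_gap = sum(gaps) / len(gaps)
    groups, current = [], [timestamps[0]]
    for t, gap in zip(timestamps[1:], gaps):
        if gap > mean_gap:          # dividing boundary before this word
            groups.append(current)
            current = []
        current.append(t)
    groups.append(current)
    return groups
```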
sequentially matching original control voice information in the control voice distribution information with word segments in a preset instruction library to obtain a matching result;
judging whether the original control voice information contains unmatched residual word segments or not based on the matching result, if so, calculating the association degree between the residual word segments and the adjacent matched word segments based on the first occurrence probability of the residual word segments in a preset instruction information base, the second occurrence probability of each adjacent matched word segment in the preset instruction information base and the first simultaneous occurrence probability of the residual word segments and the adjacent matched word segments in the preset instruction information base:
D = log2( p12 / (p1 · p2) )
in the formula, D is the degree of association between the remaining word segment and the adjacent matched word segment, log2 is a logarithmic function with base 2, p1 is the first occurrence probability, p2 is the second occurrence probability, and p12 is the first simultaneous occurrence probability;
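The association degree described here, a log-base-2 ratio of the joint probability to the product of the individual occurrence probabilities, is the pointwise mutual information of the two word segments; a one-line sketch:

```python
import math

def association_degree(p_rest, p_adj, p_joint):
    """Pointwise mutual information between the remaining word segment and
    an adjacent matched segment: positive when they co-occur more often
    than independence would predict, zero when independent."""
    return math.log2(p_joint / (p_rest * p_adj))
```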
dividing the residual word segments and adjacent matched word segments corresponding to the larger association degree into the same word segments, and dividing the original control voice information by combining the matching results to obtain second division results;
otherwise, dividing the original control voice information based on the matching result to obtain a second division result;
determining control instruction related words contained in the original control voice information based on a control instruction related word list, and performing minimum unit word segment division on the remaining voice information except the control instruction related words in the original control voice information to obtain a minimum division result;
determining the part of speech of each minimum unit word segment in the minimum division result, taking any control instruction related word in the original control voice information as a starting point, simultaneously searching from two ends, and determining the minimum unit word consistent with the part of speech of the corresponding control instruction related word as a secondary related word at each end;
summarizing all the minimum unit word segments between the control instruction related words and the corresponding secondary related words and the control instruction related words to obtain at least one word segment set of each control instruction related word;
integrating the word segment sets based on the part of speech and an instruction grammar frame list of each word segment in the word segment sets to obtain integrated word segments corresponding to each word segment set, determining part of speech connection weight between adjacent word segments in the integrated word segments, third occurrence probability of each word segment in the preset instruction information base and second simultaneous occurrence probability of adjacent word segments in the integrated word segments in the preset instruction information base, and calculating semantic score values of the integrated word segments based on the part of speech connection weight, the third occurrence probability and the second simultaneous occurrence probability:
S = Σ_{i=1}^{n-1} c_i · log2( q_i / (p_i · p_{i+1}) )
in the formula, S is the semantic score value of the integrated speech segment, n is the total number of word segments contained in the integrated speech segment, w_i is the ith word segment in the integrated speech segment, c_i is the part-of-speech connection weight between the ith word segment and the (i+1)th word segment, p_i and p_{i+1} are the third occurrence probabilities of the ith and (i+1)th word segments, and q_i is the second simultaneous occurrence probability of the ith word segment and the (i+1)th word segment;
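A sketch of a semantic score consistent with the legend above: for each adjacent pair of word segments it combines the part-of-speech connection weight with the occurrence and co-occurrence probabilities. The exact combination in the original formula is not fully recoverable from the text, so a connection-weighted sum of log-base-2 ratio terms is assumed here:

```python
import math

def semantic_score(probs, pair_probs, conn_weights):
    """probs[i]: occurrence probability of word segment i (n entries);
    pair_probs[i]: joint probability of segments i and i+1 (n-1 entries);
    conn_weights[i]: part-of-speech connection weight between i and i+1
    (n-1 entries). Assumed form: sum of connection-weighted PMI terms."""
    return sum(
        c * math.log2(q / (probs[i] * probs[i + 1]))
        for i, (q, c) in enumerate(zip(pair_probs, conn_weights))
    )
```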
taking the word segment set corresponding to the maximum semantic score value as a final segmentation word segment set of the control instruction related words, taking secondary related words in the opposite direction of the retrieval direction corresponding to the final segmentation word segment set as new starting points to perform word segment segmentation, and obtaining a third segmentation result based on all the final segmentation word segment sets until all the original control voice information is completely segmented;
summarizing the first division result, the second division result and the third division result to obtain a division result set, and obtaining a control speech segment sequence for each division result in the division result set;
performing semantic scoring on the control speech segment sequences to obtain word segment semantic score values, and sorting all the control speech segment sequences based on the word segment semantic score values to obtain a sequence sorting result;
determining capacity parameters of an iteration matrix determinant based on all the control speech segment sequences, generating an iteration characterization matrix for each control speech segment sequence based on the capacity parameters and the number of fields contained in the division result corresponding to that sequence, performing a weighted accumulation operation on all the iteration characterization matrices based on the sequence sorting result to obtain a final division characterization matrix, and dividing and semantically recognizing the original control voice information based on the final division characterization matrix to obtain the fan control instruction.
Preferably, in the voice recognition method of the fan voice control system, after S3: identifying a fan control instruction in the control voice signal and controlling the fan to execute the corresponding operation based on the fan control instruction, the method further comprises:
judging whether a source user of the control voice signal is a historical user in a historical user library or not based on the voice characteristics to obtain a judgment result;
storing the fan control instructions in a historical instruction library of the source user.
Preferably, in the voice recognition method of the fan voice control system, judging, based on the voice features, whether the source user of the control voice signal is a historical user in the historical user library to obtain a judgment result includes:
judging whether a user voice feature consistent with the voice feature exists in the historical user library; if so, taking "the source user of the control voice signal is a historical user in the historical user library" as the judgment result;
otherwise, taking "the source user of the control voice signal is not a historical user in the historical user library" as the judgment result.
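The historical-user check can be sketched as a lookup over stored voice features. The `matches` predicate is an assumption: the text only requires the features to be "consistent", without fixing a comparison rule:

```python
def is_historical_user(voice_feature, user_library, matches):
    """Judgment result: True when any stored user's voice feature is
    consistent with the extracted feature, under the given predicate."""
    return any(matches(voice_feature, stored) for stored in user_library.values())
```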
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a flow chart of a voice recognition method of a voice control system of a fan according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a voice recognition method of a voice control system of a fan according to another embodiment of the present invention;
FIG. 3 is a flowchart illustrating a voice recognition method of a voice control system of a fan according to another embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.
Example 1:
the invention provides a voice recognition method of a fan voice control system, which comprises the following steps with reference to fig. 1:
S1: acquiring audio signals within a preset range around the fan;
S2: removing a background sound signal in the audio signal to obtain a control voice signal;
S3: identifying a fan control instruction in the control voice signal based on the voice characteristics of the control voice signal, and controlling the fan to execute a corresponding operation based on the fan control instruction.
In this embodiment, the preset range is a preset range for acquiring the control command.
In this embodiment, the audio signal is an information carrier with frequency and amplitude variation of the sound wave within a preset range around the fan.
In this embodiment, the background sound signal is the portion of the audio signal other than the voice of the user issuing the control instruction.
In this embodiment, the control voice signal is the audio signal obtained after removing the background sound signal from the audio signal.
In this embodiment, the speech features are the fundamental frequency features of the control speech signal and the short-time average zero-crossing rate.
In this embodiment, the fan control command is a voice command for controlling the fan, which is recognized in the control voice signal.
The beneficial effects of the above technology are: after background sound of an audio signal acquired around the fan is removed, voice semantic recognition is carried out to acquire an accurate fan control instruction, environmental voice is intelligently abandoned, and the sound of a user for sending the control instruction is acquired, so that instruction control of the fan based on voice is realized.
Example 2:
On the basis of embodiment 1, in the voice recognition method of the fan voice control system, S2: removing the background sound signal in the audio signal to obtain the control voice signal, with reference to fig. 2, includes:
S201: determining a background sound signal in the audio signal based on an original audio frequency spectrum curve of the audio signal;
S202: removing the background sound signal from the audio signal to obtain the control voice signal.
In this embodiment, the original audio frequency spectrum curve is an audio frequency spectrum curve corresponding to the audio signal.
The beneficial effects of the above technology are: the background sound signal in the audio signal is determined based on the original audio frequency spectrum curve, so that the recognition accuracy of the background sound signal is improved, the background sound signal in the audio signal is removed, background noise is filtered, and accurate extraction of the control voice signal is realized.
Example 3:
On the basis of embodiment 2, in the voice recognition method of the fan voice control system, step S201: determining the background sound signal in the audio signal based on the original audio frequency spectrum curve of the audio signal includes:
acquiring an original audio frequency spectrum curve in the preset period of the audio signal, judging whether abrupt change points exist in the original audio frequency spectrum curve, if so, determining first interval time between adjacent abrupt change points in the original audio frequency spectrum curve, and fitting a corresponding first interval time change curve based on the first interval time;
judging whether the abrupt change points are regularly distributed or not based on the first interval time change curve, if so, taking the original audio frequency spectrum curve as a first audio frequency spectrum curve, otherwise, taking the abrupt change points as a starting point to divide an analysis section curve corresponding to the abrupt change points in the original audio frequency spectrum curve;
determining the sudden change amplitude of the sudden change point, determining a reasonable fluctuation amplitude based on the sudden change amplitude, judging whether a fluctuation point with a difference value not exceeding the corresponding reasonable fluctuation amplitude exists in the analysis section curve, connecting continuous fluctuation points to obtain a reasonable fluctuation curve, and determining a second interval time between the reasonable fluctuation curve and the corresponding sudden change point;
determining a reasonable interval time threshold value based on the time span of the reasonable fluctuation curve, when the second interval time is not more than the reasonable interval time threshold value, taking the reasonable fluctuation curve as an analysis curve section of the corresponding abrupt change point, otherwise, judging that the analysis curve section does not exist in the corresponding abrupt change point;
marking all analysis curve segments in the original audio frequency spectrum curve to obtain a first marking result, determining a third interval time between adjacent analysis curve segments based on the first marking result, and fitting a second interval time change curve based on the third interval time;
judging whether the analysis curve segments are regularly distributed or not based on the second interval time variation curve, if so, taking the original audio frequency spectrum curve as a first audio frequency spectrum curve, otherwise, fitting a time span variation curve based on the time span of the analysis curve segments, judging whether the analysis curve segments have a rule or not based on the time span variation curve, if so, taking the original audio frequency spectrum curve as a first audio frequency spectrum curve, otherwise, deleting all the analysis curve segments in the original audio frequency spectrum curve to obtain a first audio frequency spectrum curve;
when the sudden change point does not exist in the original audio frequency spectrum curve, the original audio frequency spectrum curve is taken as a first audio frequency spectrum curve;
and converting the first audio frequency spectrum curve into a sound signal to obtain a background sound signal in the audio signal.
In this embodiment, the original audio frequency spectrum curve is the audio frequency spectrum curve corresponding to the audio signal within the preset period.
In this embodiment, the abrupt change point is a point where the amplitude of the audio frequency spectrum curve suddenly increases, that is, a point where the difference between the amplitude and the amplitude of the previous point is greater than the average value of the amplitude differences of all the previous adjacent points.
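That definition of an abrupt change point can be sketched directly: a sample is abrupt when its jump from the previous sample exceeds the average of all earlier adjacent-point differences.

```python
def abrupt_points(amplitudes):
    """Indices i where amplitudes[i] - amplitudes[i-1] exceeds the mean of
    the absolute differences of all earlier adjacent point pairs."""
    points, diffs = [], []
    for i in range(1, len(amplitudes)):
        d = amplitudes[i] - amplitudes[i - 1]
        if diffs and d > sum(abs(x) for x in diffs) / len(diffs):
            points.append(i)
        diffs.append(d)
    return points
```

The very first difference has no history to compare against, so index 1 can never be flagged in this sketch.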
In this embodiment, the first interval time is the interval time between adjacent abrupt change points in the original audio frequency spectrum curve.
In this embodiment, the first interval time variation curve is a curve representing the variation of the first interval time between adjacent abrupt change points in the original audio frequency spectrum curve from left to right.
In this embodiment, whether the abrupt change points are regularly distributed is determined based on the first interval time variation curve, which is: and judging whether the function corresponding to the first interval time change curve is a linear function, if so, judging that the abrupt change points are regularly distributed, and otherwise, judging that the abrupt change points are not regularly distributed.
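As an illustrative sketch (not part of the patent text), the abrupt-change-point rule defined above can be written in Python; the function name is hypothetical, and a strictly-greater comparison is assumed:

```python
def find_abrupt_points(amplitudes):
    """Indices where the rise over the previous point exceeds the mean
    absolute amplitude difference of all earlier adjacent point pairs
    (one reading of the embodiment's abrupt-change-point rule)."""
    abrupt = []
    for i in range(2, len(amplitudes)):
        prev_diffs = [abs(amplitudes[j] - amplitudes[j - 1]) for j in range(1, i)]
        mean_diff = sum(prev_diffs) / len(prev_diffs)
        if amplitudes[i] - amplitudes[i - 1] > mean_diff:
            abrupt.append(i)
    return abrupt
```

For example, in the amplitude sequence [0, 1, 2, 10, 3] only the jump to 10 exceeds the running mean of earlier adjacent differences, so only its index is reported.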
In this embodiment, the first audio spectrum curve is an audio spectrum curve corresponding to the background sound signal.
In this embodiment, the analysis section curve is a partial curve segment that is divided from left to right in the original audio frequency spectrum curve with the abrupt change point as the starting point.
In this embodiment, the abrupt change amplitude is an amplitude difference between an amplitude corresponding to an abrupt change point of the original audio frequency spectrum curve and an amplitude corresponding to a point before the abrupt change point in the original audio frequency spectrum curve.
In this embodiment, the reasonable fluctuation range is [-0.1m, 0.1m], where m is the abrupt change amplitude.
In this embodiment, the fluctuation point is a point in the analysis section curve whose amplitude difference from the abrupt change point falls within the corresponding reasonable fluctuation range.
In this embodiment, the rational fluctuation curve is a curve obtained by connecting continuous fluctuation points.
In this embodiment, the second interval time is the interval time between the reasonable fluctuation curve and the corresponding abrupt change point.
In this embodiment, the reasonable interval time threshold is determined based on the time span of the reasonable fluctuation curve, which is:
the product of the time span of the rational fluctuation curve and 0.1 is the rational interval time threshold.
In this embodiment, the time span of the reasonable fluctuation curve is the span of the reasonable fluctuation curve on the time coordinate axis.
In this embodiment, the analysis curve segment is a reasonable fluctuation curve when the second interval time does not exceed the reasonable interval time threshold.
In this embodiment, the first marking result is a result obtained after marking all the analysis curve segments in the original audio frequency spectrum curve.
In this embodiment, the third interval time is the interval time between adjacent analysis curve segments.
In this embodiment, the second interval time variation curve is a curve representing the variation of the third interval time between adjacent analysis curve segments in the original audio frequency spectrum curve from left to right.
In this embodiment, it is determined whether the analysis curve segment is regularly distributed based on the second interval time variation curve, that is: and judging whether the function corresponding to the second interval time change curve is a linear function, if so, judging that the analysis curve section is regularly distributed, and otherwise, judging that the analysis curve section is not regularly distributed.
In this embodiment, the time span variation curve is a curve representing the time span of the analysis curve segment in the original audio frequency spectrum curve from left to right.
In this embodiment, whether the analysis curve segment has a rule or not is determined based on the time span change curve, which is:
and judging whether the function corresponding to the time span change curve is a linear function, if so, judging that the analysis curve segment has a rule, and otherwise, judging that the analysis curve segment does not have a rule.
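Each of the regularity judgments above reduces to testing whether a curve is a linear function of its index. A minimal sketch, assuming a least-squares fit with a small residual tolerance (the patent does not specify how exact the linearity must be):

```python
def is_regular(values, tol=1e-6):
    """True when the sequence lies on a straight line, i.e. the
    interval-time or time-span curve is a linear function."""
    n = len(values)
    if n < 3:
        return True  # two points are always collinear
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return all(abs(y - (slope * x + intercept)) <= tol
               for x, y in zip(xs, values))
```

The same test serves for the first interval time variation curve, the second interval time variation curve, and the time span variation curve.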
The beneficial effects of the above technology are: abrupt change points contained in the original audio frequency spectrum curve within a preset period of the audio signal are identified and analyzed, and analysis curve segments are divided at the abrupt change points. Whether the abrupt change points and the analysis curve segments are regular is judged based on the time intervals between adjacent abrupt change points, the time spans of the analysis curve segments, and the interval times between adjacent analysis curve segments; if regular, the abrupt change points and analysis curve segments are retained, otherwise the corresponding abrupt change points or analysis curve segments are deleted, so that an accurate background signal is obtained. By analyzing the amplitude distribution in the audio frequency spectrum curve, abnormal points are identified, realizing accurate extraction of the background sound in the audio signal.
Example 4:
on the basis of the embodiment 1, the voice recognition method of the fan voice control system includes the following steps: based on the voice feature of the control voice signal, a fan control instruction is recognized in the control voice signal, and the fan is controlled to perform corresponding operations based on the fan control instruction, which includes, with reference to fig. 3:
S301: extracting the voice characteristics of the control voice signal;
S302: judging whether only one source user exists in the control voice signal or not based on the voice characteristics, if so, performing semantic recognition on the control voice signal to obtain a fan control instruction, otherwise, performing primary and secondary recognition on the control voice signal based on the voice characteristics to obtain a primary and secondary recognition result, and obtaining the fan control instruction based on the primary and secondary recognition result;
S303: and controlling the fan to execute corresponding operation based on the fan control instruction.
In this embodiment, the source user is the user who sends the instruction in the control voice signal.
In this embodiment, the primary and secondary recognition results are obtained by performing primary and secondary recognition on the control speech signal based on the speech features.
The beneficial effects of the above technology are: the number of users sending control instructions in the control voice signals is judged, and when more than one user sends the control instructions, primary and secondary recognition is carried out on the control voice signals of each user in the control voice signals, so that accurate control over the fan is achieved, interference of the voice recognition control instructions is further reduced, and the accuracy of intelligent fan control is achieved to a greater extent.
Example 5:
on the basis of the embodiment 4, in the voice recognition method of the fan voice control system, S301: extracting the voice features of the control voice signal, including:
extracting fundamental frequency characteristics of the control voice signal, and calculating a short-time average zero crossing rate of the control voice signal;
and taking the fundamental frequency characteristic and the short-time average zero crossing rate as the voice characteristic of the control voice signal.
In this embodiment, the fundamental frequency feature is the pitch frequency of the control speech signal.
In this embodiment, the short-time average zero crossing rate is the number of times that the signal passes through a zero value in each frame of the control speech signal.
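A minimal sketch of the short-time average zero crossing rate as defined here, assuming non-overlapping frames and counting a crossing whenever adjacent samples straddle zero:

```python
def short_time_zcr(signal, frame_len):
    """Short-time average zero crossing rate over non-overlapping frames:
    count sign changes between adjacent samples, normalized by frame length."""
    rates = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        crossings = sum(1 for a, b in zip(frame, frame[1:])
                        if (a >= 0) != (b >= 0))
        rates.append(crossings / frame_len)
    return rates
```

An alternating frame such as [1, -1, 1, -1] crosses zero three times, giving a per-frame rate of 0.75, while a constant frame gives 0.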
The beneficial effects of the above technology are: the fundamental frequency characteristic and the short-time average zero crossing rate of the control voice signal are used as voice characteristics, the difference of speaking characteristics among individuals can be distinguished based on the voice characteristics, and an important distinguishing basis is provided for the subsequent identification of the total number of source users in the control voice signal.
Example 6:
on the basis of embodiment 4, the voice recognition method of the fan voice control system, which performs primary and secondary recognition on the control voice signal based on the voice features to obtain a primary and secondary recognition result, and obtains a fan control instruction based on the primary and secondary recognition result, includes:
when more than one source user exists in the control voice signal, dividing the control voice signal based on the voice characteristics to obtain a sub-control voice signal set, and marking each sub-control voice signal in the sub-control voice signal set on the control voice signal to obtain a second marking result;
calculating a first weight of each sub-control voice signal based on the position of each sub-control voice signal in the second marking result in the control voice signal;
calculating the average decibel of each sub-control voice signal, and calculating the final weight of the corresponding sub-control voice signal based on the first weight and the corresponding average decibel;
taking the sub-control voice signal corresponding to the maximum final weight as a main control voice signal, taking the remaining sub-control voice signals except the main control voice signal in the sub-control voice signal set as secondary control voice signals, and taking the main control voice signal and the secondary control voice signals as corresponding primary and secondary recognition results;
and carrying out semantic recognition on the main control voice signals in the primary and secondary recognition results to obtain a fan control instruction.
In this embodiment, the sub-control speech signal set is the set formed by the plurality of sub-control speech signals obtained by dividing the control speech signal based on speech features when more than one source user exists in the control speech signal.
In this embodiment, the sub-control speech signal is a control speech signal included in the sub-control speech signal set.
In this embodiment, the second marking result is the result obtained after marking each sub-control speech signal in the sub-control speech signal set on the control speech signal.
In this embodiment, based on the position of each sub-control speech signal in the second labeling result in the control speech signal, a first weight of each sub-control speech signal is calculated, that is:
and the ratio of the time span of the sub-control voice signal to the time span between the end point of the sub-control voice signal and the end point of the control voice signal is the first weight of the corresponding sub-control voice signal.
In this embodiment, the average decibel is an average value of decibel amplitudes of the sub-control speech signal.
In this embodiment, based on the first weight and the corresponding average decibel, the final weight of the corresponding sub-control speech signal is calculated, that is:
and taking the product of the average decibel and the first weight as the final weight of the corresponding sub-control voice signal.
In this embodiment, the main control speech signal is the sub-control speech signal corresponding to the maximum final weight.
In this embodiment, the secondary control speech signal is the remaining sub-control speech signals in the set of sub-control speech signals except the primary control speech signal.
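The weighting scheme of this embodiment can be sketched as follows; the tuple layout and function names are illustrative, and it is assumed that each sub-signal ends strictly before the control signal does (otherwise the first-weight ratio is undefined):

```python
def final_weight(sub_start, sub_end, control_end, avg_db):
    """Final weight of a sub-control voice signal: first weight (ratio of
    the sub-signal's time span to the span from its end point to the end
    of the whole control voice signal) multiplied by its average decibel."""
    first_weight = (sub_end - sub_start) / (control_end - sub_end)
    return avg_db * first_weight


def pick_main(sub_signals, control_end):
    """Main control voice signal = the sub-signal with the largest final
    weight; each entry is a (start, end, average_decibel) tuple."""
    return max(sub_signals,
               key=lambda s: final_weight(s[0], s[1], control_end, s[2]))
```

Note that the first weight grows as the sub-signal sits later in the control signal, so louder, later speakers tend to be recognized as the main signal.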
The beneficial effects of the above technology are: and determining the final weight of the sub-control voice signals based on the sequencing position of the sub-control voice signals in the control voice signals and the average decibel of the sub-control voice signals, and performing primary and secondary recognition on the sub-control voice signals based on the final weight, so that the fans are respectively controlled according to the primary and secondary status of the control voice signals, and the abnormal control condition caused by the existence of conflicting control instructions in the control voice signals is avoided.
Example 7:
on the basis of embodiment 4, the voice recognition method for a fan voice control system performs semantic recognition on the control voice signal to obtain a fan control instruction, and includes:
carrying out semantic recognition on the control voice signal to obtain a semantic recognition result, and aligning the semantic recognition result with the control voice signal to obtain control voice distribution information;
and integrating the control voice distribution information to obtain a fan control instruction.
In this embodiment, the semantic recognition result is a result obtained by performing semantic recognition on the control speech signal.
In this embodiment, the control speech distribution information is a result obtained by aligning the semantic recognition result with the control speech signal.
The beneficial effects of the above technology are: the control voice distribution information is integrated based on the time distribution information of the voice information after the semantic recognition is carried out on the control voice signal, so that the control instruction of the fan can be better understood, the accuracy of the voice instruction recognition is improved, and the control of the fan is more accurate and intelligent.
Example 8:
on the basis of embodiment 7, the voice recognition method for a fan voice control system integrates the control voice distribution information to obtain a fan control instruction, and includes:
determining fourth interval time between adjacent control voice information in the control voice distribution information, and calculating average interval time based on the fourth interval time;
setting a dividing boundary between adjacent control voice information of which the interval time is greater than the average interval time in the control voice distribution information, and dividing the control voice distribution information based on all the dividing boundaries to obtain a first dividing result;
sequentially matching original control voice information in the control voice distribution information with word segments in a preset instruction library to obtain a matching result;
judging whether unmatched residual word segments exist in the original control voice information or not based on the matching result, if so, calculating the association degree between the residual word segments and the adjacent matched word segments based on the first occurrence probability of the residual word segments in a preset instruction information base, the second occurrence probability of each adjacent matched word segment in the preset instruction information base and the first simultaneous occurrence probability of the residual word segments and the adjacent matched word segments in the preset instruction information base:
W = log2( P12 / (P1 · P2) )
in the formula, W is the degree of association between the remaining word segment and the adjacent matched word segment, log2 is the logarithm to base 2, P1 is the first occurrence probability, P2 is the second occurrence probability, and P12 is the first simultaneous occurrence probability;
dividing the residual word segments and adjacent matched word segments corresponding to the larger association degree into the same word segments, and dividing the original control voice information by combining the matching results to obtain second division results;
otherwise, dividing the original control voice information based on the matching result to obtain a second division result;
determining control instruction related words contained in the original control voice information based on a control instruction related word list, and performing minimum unit word segment division on the remaining voice information except the control instruction related words in the original control voice information to obtain a minimum division result;
determining the part of speech of each minimum unit word segment in the minimum division result, taking any control instruction related word in the original control voice information as a starting point, simultaneously searching from two ends, and determining the minimum unit word consistent with the part of speech of the corresponding control instruction related word as a secondary related word at each end;
summarizing all the minimum unit word segments between the control instruction related words and the corresponding secondary related words and the control instruction related words to obtain at least one word segment set of each control instruction related word;
integrating the word segment sets based on the part of speech and an instruction grammar frame list of each word segment in the word segment sets to obtain integrated word segments corresponding to each word segment set, determining part of speech connection weight between adjacent word segments in the integrated word segments, third occurrence probability of each word segment in the preset instruction information base and second simultaneous occurrence probability of adjacent word segments in the integrated word segments in the preset instruction information base, and calculating semantic score values of the integrated word segments based on the part of speech connection weight, the third occurrence probability and the second simultaneous occurrence probability:
S = Σ_{i=1}^{n-1} c_i · log2( P(w_i, w_{i+1}) / ( P(w_i) · P(w_{i+1}) ) )
in the formula, S is the semantic score value of the integrated speech segment, n is the total number of word segments contained in the integrated speech segment, w_i is the ith word segment in the integrated speech segment, c_i is the part-of-speech connection weight between the ith word segment and the (i+1)th word segment, P(w_i) and P(w_{i+1}) are the third occurrence probabilities of the ith and (i+1)th word segments, and P(w_i, w_{i+1}) is the second simultaneous occurrence probability of the ith and (i+1)th word segments;
taking a word segment set corresponding to the maximum semantic score value as a final division word segment set of the control instruction related words, taking a secondary related word in the opposite direction of the retrieval direction corresponding to the final division word segment set as a new starting point to divide word segments, and obtaining a third division result based on all the final division word segment sets until the original control voice information is completely divided;
summarizing the first division result, the second division result and the third division result to obtain a division result set, and obtaining a control speech segment sequence of each division result in the division result set;
performing semantic scoring on the control word segment sequences to obtain word segment semantic scoring values, and sequencing all the control word segment sequences based on the word segment semantic scoring values to obtain sequence sequencing results;
determining capacity parameters of an iteration matrix determinant based on all control language segment sequences, generating an iteration characterization matrix corresponding to each control language segment sequence based on the capacity parameters and the number of fields contained in a division result corresponding to the control language segment sequences, performing cumulative-weighing operation on all the iteration characterization matrices based on a sequence sequencing result to obtain a final division characterization matrix, and dividing and performing semantic recognition on the original control voice information based on the final division characterization matrix to obtain a fan control instruction.
In this embodiment, the fourth interval is an interval between adjacent control speech information in the control speech distribution information.
In this embodiment, the average interval time is the average of all the fourth interval times.
In this embodiment, the division boundary is used for subsequently dividing the control voice distribution information and obtaining a division position of the first division result.
In this embodiment, the first division result is a result obtained by dividing the control speech distribution information based on all the division boundaries.
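The gap-based first division can be sketched as follows, assuming each piece of control voice information is given as a (start, end) time pair; names are illustrative:

```python
def split_by_gaps(segments):
    """First division of Example 8: set a boundary wherever the gap
    (fourth interval time) between adjacent segments exceeds the average
    gap; `segments` is a time-ordered list of (start, end) pairs."""
    gaps = [nxt[0] - cur[1] for cur, nxt in zip(segments, segments[1:])]
    if not gaps:
        return [list(segments)]
    avg_gap = sum(gaps) / len(gaps)
    groups, current = [], [segments[0]]
    for gap, seg in zip(gaps, segments[1:]):
        if gap > avg_gap:
            groups.append(current)
            current = []
        current.append(seg)
    groups.append(current)
    return groups
```

Only gaps strictly greater than the average become division boundaries, so uniformly spaced speech stays in one group.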
In this embodiment, the preset command library is a command library prepared in advance and storing all fan control commands.
In this embodiment, the matching result is a result obtained by sequentially matching the original control speech information in the control speech distribution information with the word segments in the preset instruction library.
In this embodiment, the remaining word segments are the word segments that are not matched in the original control speech information.
In this embodiment, the first occurrence probability is the occurrence probability of the remaining word segments in the preset instruction information base.
In this embodiment, the second occurrence probability is the occurrence probability of the adjacent matched word segment in the preset instruction information base.
In this embodiment, the first simultaneous occurrence probability is a probability that the remaining word segment and the adjacent matched word segment occur simultaneously in the preset instruction information base.
In this embodiment, the adjacent matched word segments are word segments adjacent to the remaining word segments and already matched with word segments in the preset instruction library.
In this embodiment, the association degree is a numerical value representing the association degree between the remaining word segment and the adjacent matched word segment.
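The association degree combines a base-2 logarithm with the three probabilities named above; assuming it takes the standard pointwise mutual information form, a sketch:

```python
import math

def association_degree(p_first, p_second, p_joint):
    """Association degree between a remaining word segment and an adjacent
    matched word segment: base-2 pointwise mutual information of the two
    segments' occurrence probabilities in the instruction information base."""
    return math.log2(p_joint / (p_first * p_second))
```

Independent segments (joint probability equal to the product of the marginals) score 0; segments that co-occur more often than chance score positively.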
In this embodiment, the second division result is a division result obtained by dividing the remaining word segments and the adjacent matched word segments corresponding to the larger association degree into the same word segment and dividing the original control speech information by combining the matching result.
In this embodiment, the list of words related to the control command is a list of words included in the fan control command.
In this embodiment, the control instruction related word is a word in a control instruction related word list included in the original control voice information.
In this embodiment, the minimum division result is a result obtained by performing minimum unit word segment division on the remaining voice information except the control instruction related word in the original control voice information.
In this embodiment, the secondary related word is the smallest unit word determined at each end of the control instruction related word and having the same part of speech as the corresponding control instruction related word and the closest distance to the corresponding control instruction related word.
In this embodiment, the minimum unit word is a word in the minimum division result.
In this embodiment, the word segment set is a set formed by all the minimum unit word segments between the control instruction related word and the corresponding secondary related word and the word segment of each control instruction related word obtained after summarizing the control instruction related word.
In this embodiment, the integrated speech segment is a speech segment obtained by integrating the word segment set based on the part of speech of each word segment in the word segment set and the instruction grammar frame list.
In this embodiment, the instruction syntax frame list is a list including instruction syntax frames, for example: verb plus noun.
In this embodiment, the part-of-speech connection weight is a value representing a moderate part-of-speech connection level between adjacent word segments in the integrated word segment, and is determined according to a preset part-of-speech connection weight list, for example: the connection weight of nouns and verbs is 1, and the connection weight of nouns and nouns is 0.5.
In this embodiment, the third occurrence probability is the occurrence probability of each word segment in the integrated speech segment in the preset instruction information base.
In this embodiment, the second simultaneous occurrence probability is a probability that adjacent word segments in the integrated speech segment occur simultaneously in the preset instruction information base.
In this embodiment, the semantic score value is a numerical value obtained by scoring the whole speech segment.
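The semantic score value combines part-of-speech connection weights with the occurrence and co-occurrence probabilities named above. A sketch under the assumption that it is a connection-weighted sum of base-2 pointwise mutual information over adjacent word-segment pairs:

```python
import math

def semantic_score(probs, joint_probs, conn_weights):
    """Semantic score of an integrated speech segment: sum over adjacent
    word-segment pairs of the part-of-speech connection weight times the
    base-2 PMI of the pair. probs[i] is the occurrence probability of
    segment i; joint_probs[i] and conn_weights[i] describe pair (i, i+1)."""
    return sum(conn_weights[i]
               * math.log2(joint_probs[i] / (probs[i] * probs[i + 1]))
               for i in range(len(probs) - 1))
```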
In this embodiment, the final divided word segment set is the word segment set corresponding to the maximum semantic score value.
In this embodiment, the third division result is the result obtained by summarizing all the final divided word segment sets.
In this embodiment, the partitioning result set is a set obtained by summarizing the first partitioning result, the second partitioning result, and the third partitioning result.
In this embodiment, the control speech segment sequence is a speech segment sequence obtained by dividing the original control speech information based on the corresponding division result in the division result set.
In this embodiment, the word segment semantic score value is the score obtained by semantically scoring the control word segment sequence (on the same principle as calculating the semantic score value of the integrated speech segment).
In this embodiment, the sequence ordering result is a result obtained by ordering all the control word segment sequences from large to small based on the word segment semantic score value.
In this embodiment, the iteration matrix determinant is a matrix determinant corresponding to each control speech segment sequence and used for subsequent cumulative-called iteration.
In this embodiment, the capacity parameter of the determinant of the iterative matrix is determined based on all the control speech segment sequences, that is, the capacity parameter is:
and determining the total number of the language segments contained in the control language segment sequence, and taking the maximum total number of the language segments in all the control language segment sequences as the row number and the column number of the determinant of the iterative matrix.
In this embodiment, the capacity parameter is the number of rows and columns of the determinant of the iterative matrix.
In this embodiment, based on the capacity parameter and the number of fields included in the division result corresponding to the control word sequence, an iterative characterization matrix corresponding to each control word sequence is generated, that is, the iterative characterization matrix is:
determining the total number of fields contained in each speech segment in the control speech segment sequence, and determining the total number sequence of the fields corresponding to the control speech segment sequence based on the total number of the fields;
and determining the number of rows of the corresponding field total number sequence in the corresponding iteration characterization matrix based on the ordinal number of each control language segment sequence in the sequence sequencing result, setting the corresponding rows in the corresponding iteration characterization matrix as the corresponding field total number sequence based on the determined number of rows, and setting other values except the field total number sequence in the corresponding iteration characterization matrix as 0 to obtain the corresponding iteration characterization matrix.
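The matrix construction of this embodiment can be sketched as follows; it assumes the sequences arrive already sorted by semantic score and that there are never more sequences than the capacity parameter:

```python
def iteration_matrices(field_counts_by_rank):
    """One square iteration characterization matrix per control speech
    segment sequence: capacity = largest segment count over all sequences;
    the sequence ranked k-th (0-based) fills row k with its per-segment
    field totals, every other entry is 0."""
    cap = max(len(seq) for seq in field_counts_by_rank)
    matrices = []
    for rank, seq in enumerate(field_counts_by_rank):
        m = [[0] * cap for _ in range(cap)]
        m[rank][:len(seq)] = seq  # assumes rank < cap
        matrices.append(m)
    return matrices
```

Because each ranked sequence occupies a distinct row, the matrices can later be combined without their field total sequences overwriting one another.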
In this embodiment, based on the sequence ordering result, performing a cumulative weighing operation on all iteration characterization matrices to obtain a final division characterization matrix, which is:
and on the basis of the sequence ordering result, sequentially carrying out cumulative weighing on the iteration characterization matrices corresponding to each control language segment sequence to obtain the final division characterization matrix.
In this embodiment, the final division characterization matrix is the matrix obtained by performing the cumulative weighing operation on all iteration characterization matrices based on the sequence ordering result.
In this embodiment, the original control speech information is divided and semantically identified based on the final division representation matrix, and a fan control instruction is obtained, that is:
determining the total number u of iteration characterization matrices, taking the u-th root of the final division characterization matrix, taking the values in the rows containing non-zero values as field total number sequences, and dividing the original control voice information from left to right according to the field total number sequence to obtain a voice information division result;
and performing semantic recognition on the voice information division result to obtain a fan control instruction.
The beneficial effects of the above technology are: the original control voice information is divided in three ways: by interval times in the control voice distribution information, by matching against word segments in the preset instruction library followed by association of the remaining word segments, and by integration based on part-of-speech association. An iteration characterization matrix is generated from the total number of fields of each word segment in each division result, the matrices are combined by cumulative weighing based on the semantic score values of the control word segment sequences of each division result, and a root operation then recovers the word segment divisions shared by the three division modes. The final division result therefore contains the divisions common to all three modes, so the original control voice information is divided and semantically recognized by the three division modes comprehensively, improving the recognition accuracy of the fan control instruction.
Example 9:
on the basis of embodiment 1, in the voice recognition method of the fan voice control system, after S3: identifying a fan control instruction in the control voice signal based on the voice features of the control voice signal and controlling the fan to execute the corresponding operation based on the fan control instruction, the method further includes:
judging whether a source user of the control voice signal is a historical user in a historical user library or not based on the voice characteristics to obtain a judgment result;
storing the fan control instructions in a historical instruction library of the source user.
In this embodiment, the determination result is the result of determining whether the source user of the control voice signal is a historical user in the historical user library.
In this embodiment, the historical command library is a command library for storing all the fan control commands issued by the corresponding source users.
The beneficial effects of the above technology are: the fan control instructions are stored respectively according to users, and an information basis is provided for optimization of voice instruction recognition.
Example 10:
on the basis of embodiment 9, the voice recognition method for a fan voice control system, which determines whether a source user of the control voice signal is a historical user in a historical user library based on the voice feature, and obtains a determination result, includes:
judging whether a user voice characteristic consistent with the voice characteristic exists in the historical user library; if so, taking "the source user of the control voice signal is a historical user in the historical user library" as the judgment result;
otherwise, taking "the source user of the control voice signal is not a historical user in the historical user library" as the judgment result.
In this embodiment, the user voice features are the voice features of each user stored in the historical user library, against which the voice feature of the control voice signal is compared.
In this embodiment, the historical user library is used to store each user who has performed voice control on the fan and the corresponding user voice characteristics.
The beneficial effects of the above technology are: the voice characteristics of the control voice signal are matched with the voice characteristics of the users in the historical user library, whether the source user of the control voice signal is the historical user in the historical user library or not can be judged, and a basis is provided for sub-user storage of the fan control instruction in the follow-up process.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.
Claims (8)
1. A voice recognition method of a fan voice control system is characterized by comprising the following steps:
s1: acquiring audio signals within a preset range around the fan;
s2: removing a background sound signal in the audio signal to obtain a control voice signal;
s3: identifying a fan control instruction in the control voice signal based on the voice feature of the control voice signal, and controlling the fan to execute corresponding operation based on the fan control instruction;
wherein, S2: removing the background sound signal in the audio signal to obtain a control voice signal, comprising:
s201: determining a background sound signal in the audio signal based on an original audio frequency spectrum curve of the audio signal;
s202: removing a background sound signal in the audio signal to obtain the control voice signal;
wherein, S201: determining a background sound signal in the audio signal based on an original audio frequency spectrum curve of the audio signal, including:
acquiring an original audio frequency spectrum curve in the preset period of the audio signal, judging whether abrupt change points exist in the original audio frequency spectrum curve, if so, determining first interval time between adjacent abrupt change points in the original audio frequency spectrum curve, and fitting a corresponding first interval time change curve based on the first interval time;
judging whether the abrupt change points are regularly distributed or not based on the first interval time change curve, if so, taking the original audio frequency spectrum curve as a first audio frequency spectrum curve, otherwise, taking the abrupt change points as a starting point to divide an analysis section curve corresponding to the abrupt change points in the original audio frequency spectrum curve;
determining the sudden change amplitude of the sudden change point, determining a reasonable fluctuation amplitude based on the sudden change amplitude, judging whether a fluctuation point with a difference value not exceeding the corresponding reasonable fluctuation amplitude exists in the analysis section curve, connecting continuous fluctuation points to obtain a reasonable fluctuation curve, and determining a second interval time between the reasonable fluctuation curve and the corresponding sudden change point;
determining a reasonable interval time threshold value based on the time span of the reasonable fluctuation curve, when the second interval time is not more than the reasonable interval time threshold value, taking the reasonable fluctuation curve as an analysis curve segment corresponding to the sudden change point, otherwise, judging that the analysis curve segment does not exist in the corresponding sudden change point;
marking all analysis curve segments in the original audio frequency spectrum curve to obtain a first marking result, determining a third interval time between adjacent analysis curve segments based on the first marking result, and fitting a second interval time change curve based on the third interval time;
judging whether the analysis curve segments are regularly distributed or not based on the second interval time variation curve, if so, taking the original audio frequency spectrum curve as a first audio frequency spectrum curve, otherwise, fitting a time span variation curve based on the time span of the analysis curve segments, judging whether the analysis curve segments have a rule or not based on the time span variation curve, if so, taking the original audio frequency spectrum curve as a first audio frequency spectrum curve, otherwise, deleting all the analysis curve segments in the original audio frequency spectrum curve to obtain a first audio frequency spectrum curve;
when the sudden change point does not exist in the original audio frequency spectrum curve, the original audio frequency spectrum curve is taken as a first audio frequency spectrum curve;
and converting the first audio frequency spectrum curve into a sound signal to obtain a background sound signal in the audio signal.
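The abrupt-point detection and interval-regularity test of step S201 can be sketched as follows (illustrative Python; the 3-sigma detection threshold and the coefficient-of-variation regularity criterion are assumptions, since the claim does not specify either):

```python
import numpy as np

def detect_abrupt_points(curve, k=3.0):
    """Indices where the first difference rises more than k sigma above the
    mean absolute difference (a simple abrupt-change detector)."""
    diff = np.diff(curve)
    thresh = np.mean(np.abs(diff)) + k * np.std(np.abs(diff))
    return np.where(diff > thresh)[0] + 1

def intervals_are_regular(points, tol=0.2):
    """True when the gaps between adjacent abrupt points vary by less than
    tol relative to their mean (regular distribution -> background sound)."""
    if len(points) < 3:
        return False
    gaps = np.diff(points)
    return (np.std(gaps) / np.mean(gaps)) < tol

# A periodic hum produces evenly spaced spikes; a speech transient does not.
curve = np.zeros(200)
curve[::20] += 5.0  # spikes every 20 samples, like a rotating-fan hum
pts = detect_abrupt_points(curve)
regular = intervals_are_regular(pts)  # regular -> treat as background sound
```

Under this reading, a regular spike train is kept in the first audio frequency spectrum curve (background), while irregular analysis curve segments are deleted before the curve is converted back to a sound signal.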
2. The voice recognition method of the voice control system of the fan according to claim 1, wherein S3: based on the voice characteristics of the control voice signal, a fan control instruction is identified in the control voice signal, and the fan is controlled to execute corresponding operations based on the fan control instruction, including:
s301: extracting the voice characteristics of the control voice signal;
s302: judging whether only one source user exists in the control voice signal or not based on the voice characteristics, if so, performing semantic recognition on the control voice signal to obtain a fan control instruction, otherwise, performing primary and secondary recognition on the control voice signal based on the voice characteristics to obtain a primary and secondary recognition result, and obtaining the fan control instruction based on the primary and secondary recognition result;
s303: and controlling the fan to execute corresponding operation based on the fan control instruction.
3. The voice recognition method of the voice control system for the fan as claimed in claim 2, wherein S301: extracting the voice features of the control voice signal, including:
extracting fundamental frequency characteristics of the control voice signal, and calculating a short-time average zero-crossing rate of the control voice signal;
and taking the fundamental frequency characteristic and the short-time average zero crossing rate as the voice characteristic of the control voice signal.
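The two voice features named in claim 3 can be computed roughly as follows (a sketch; the frame length, the autocorrelation-based F0 estimate, and the pitch search range are assumptions not fixed by the claim):

```python
import numpy as np

def short_time_zcr(signal, frame_len=400):
    """Short-time average zero-crossing rate: sign changes per sample,
    averaged over fixed-length frames."""
    rates = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        crossings = np.sum(np.abs(np.diff(np.sign(frame))) > 0)
        rates.append(crossings / frame_len)
    return float(np.mean(rates))

def fundamental_freq(signal, sr, fmin=60, fmax=400):
    """F0 estimate: autocorrelation peak inside a plausible pitch-lag range."""
    ac = np.correlate(signal, signal, mode="full")[len(signal) - 1:]
    lo, hi = int(sr / fmax), int(sr / fmin)
    lag = lo + int(np.argmax(ac[lo:hi]))
    return sr / lag

sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 120 * t)  # a 120 Hz "voiced" test signal
f0 = fundamental_freq(tone, sr)
zcr = short_time_zcr(tone)
```

The pair `(f0, zcr)` then serves as the voice feature of the control voice signal in the subsequent steps.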
4. The voice recognition method of claim 2, wherein performing primary and secondary recognition on the control voice signal to obtain a primary and secondary recognition result, and obtaining a fan control command based on the primary and secondary recognition result comprises:
when more than one source user exists in the control voice signal, dividing the control voice signal based on the voice characteristics to obtain a sub-control voice signal set, and marking each sub-control voice signal in the sub-control voice signal set on the control voice signal to obtain a second marking result;
calculating a first weight of each sub-control voice signal based on the position of each sub-control voice signal in the second marking result in the control voice signal;
calculating the average decibel of each sub-control voice signal, and calculating the final weight of the corresponding sub-control voice signal based on the first weight and the corresponding average decibel;
taking the sub-control voice signal corresponding to the maximum final weight as a main control voice signal, taking the remaining sub-control voice signals except the main control voice signal in the sub-control voice signal set as secondary control voice signals, and taking the main control voice signal and the secondary control voice signals as corresponding primary and secondary recognition results;
and carrying out semantic recognition on the main control voice signals in the primary and secondary recognition results to obtain a fan control instruction.
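The primary/secondary weighting of claim 4 can be illustrated as below. The decaying position weight and the multiplicative combination with average decibel are assumed forms: the claim only states that a first weight is computed from position and a final weight from the first weight and the average decibel:

```python
def pick_primary(segments):
    """segments: list of dicts with 'position' (start index in the control
    voice signal) and 'avg_db' (average decibel). The first weight decays
    with position (illustrative choice); the final weight scales it by the
    average decibel; the max-final-weight segment becomes primary."""
    total_span = max(s["position"] for s in segments) + 1
    scored = []
    for s in segments:
        first_w = 1.0 - s["position"] / (total_span + 1)  # earlier -> heavier (assumed)
        final_w = first_w * s["avg_db"]
        scored.append((final_w, s))
    scored.sort(key=lambda x: x[0], reverse=True)
    primary = scored[0][1]
    secondary = [s for _, s in scored[1:]]
    return primary, secondary

# Two hypothetical sub-control voice signals from different source users.
segs = [
    {"who": "A", "position": 0, "avg_db": 55.0},
    {"who": "B", "position": 40, "avg_db": 70.0},
]
primary, secondary = pick_primary(segs)
```

Only the primary signal is then passed to semantic recognition; the secondary signals are retained in the primary/secondary recognition result but not acted on.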
5. The voice recognition method of claim 2, wherein performing semantic recognition on the control voice signal to obtain a fan control command comprises:
performing semantic recognition on the control voice signal to obtain a semantic recognition result, and aligning the semantic recognition result with the control voice signal to obtain control voice distribution information;
and integrating the control voice distribution information to obtain a fan control instruction.
6. The method as claimed in claim 5, wherein the step of integrating the control speech distribution information to obtain the fan control command comprises:
determining fourth interval time between adjacent control voice information in the control voice distribution information, and calculating average interval time based on the fourth interval time;
setting a dividing boundary between adjacent control voice information with the interval time larger than the average interval time in the control voice distribution information, and dividing the control voice distribution information based on all the dividing boundaries to obtain a first dividing result;
sequentially matching original control voice information in the control voice distribution information with word segments in a preset instruction library to obtain a matching result;
judging whether unmatched residual word segments exist in the original control voice information or not based on the matching result, if so, calculating the association degree between the residual word segments and the adjacent matched word segments based on the first occurrence probability of the residual word segments in a preset instruction information base, the second occurrence probability of each adjacent matched word segment in the preset instruction information base and the first simultaneous occurrence probability of the residual word segments and the adjacent matched word segments in the preset instruction information base:
$$W = \log_2 \frac{P_{12}}{P_1 \cdot P_2}$$
in the formula, $W$ is the degree of association between the remaining word segment and the adjacent matched word segment, $\log_2(\cdot)$ is the logarithm to base 2, $P_1$ is the first occurrence probability, $P_2$ is the second occurrence probability, and $P_{12}$ is the first simultaneous occurrence probability;
dividing the rest word segments and adjacent matched word segments corresponding to larger association degrees into the same word segment, and dividing the original control voice information by combining the matching result to obtain a second division result;
otherwise, the original control voice information is divided based on the matching result to obtain a second division result;
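The association degree defined here has the shape of pointwise mutual information over the preset instruction information base. A sketch over a hypothetical mini instruction corpus (the corpus and its tokenization are invented for illustration):

```python
import math

def association(corpus, rest, neighbor):
    """log2( P(rest adjacent to neighbor) / (P(rest) * P(neighbor)) ),
    with probabilities estimated from a corpus of tokenized instructions."""
    n = sum(len(doc) for doc in corpus)
    p_rest = sum(doc.count(rest) for doc in corpus) / n
    p_nb = sum(doc.count(neighbor) for doc in corpus) / n
    pairs = sum(
        1
        for doc in corpus
        for a, b in zip(doc, doc[1:])
        if {a, b} == {rest, neighbor}
    )
    p_joint = pairs / n
    return math.log2(p_joint / (p_rest * p_nb))

# Hypothetical preset instruction information base.
corpus = [
    ["turn", "up", "wind", "speed"],
    ["turn", "down", "wind", "speed"],
    ["wind", "speed", "max"],
]
score = association(corpus, "wind", "speed")  # positive -> merge into one segment
```

A remaining word segment is merged with whichever adjacent matched word segment yields the larger association score.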
determining control instruction related words contained in the original control voice information based on a control instruction related word list, and performing minimum unit word segment division on the remaining voice information except the control instruction related words in the original control voice information to obtain a minimum division result;
determining the part of speech of each minimum unit word segment in the minimum division result, taking any control instruction related word in the original control voice information as a starting point, simultaneously searching from two ends, and determining the minimum unit word consistent with the part of speech of the corresponding control instruction related word as a secondary related word at each end;
summarizing all the minimum unit word segments between the control instruction related words and the corresponding secondary related words and the control instruction related words to obtain at least one word segment set of each control instruction related word;
integrating the word segment sets based on the part of speech and an instruction grammar frame list of each word segment in the word segment sets to obtain integrated word segments corresponding to each word segment set, determining part of speech connection weight between adjacent word segments in the integrated word segments, third occurrence probability of each word segment in the preset instruction information base and second simultaneous occurrence probability of adjacent word segments in the integrated word segments in the preset instruction information base, and calculating semantic score values of the integrated word segments based on the part of speech connection weight, the third occurrence probability and the second simultaneous occurrence probability:
$$S = \sum_{i=1}^{n-1} q_i \cdot \log_2 \frac{P(c_i, c_{i+1})}{P(c_i)\,P(c_{i+1})}$$
in the formula, $S$ is the semantic score value of the integrated speech segment, $n$ is the total number of word segments contained in the integrated speech segment, $c_i$ is the $i$-th word segment in the integrated speech segment, $q_i$ is the part-of-speech connection weight between the $i$-th and $(i+1)$-th word segments, $P(c_i)$ and $P(c_{i+1})$ are the third occurrence probabilities of the $i$-th and $(i+1)$-th word segments, and $P(c_i, c_{i+1})$ is the second simultaneous occurrence probability of the $i$-th and $(i+1)$-th word segments;
taking a word segment set corresponding to the maximum semantic score value as a final division word segment set of the control instruction related words, taking a secondary related word in the opposite direction of the retrieval direction corresponding to the final division word segment set as a new starting point to divide word segments, and obtaining a third division result based on all the final division word segment sets until the original control voice information is completely divided;
summarizing the first division result, the second division result and the third division result to obtain a division result set, and obtaining a control word segment sequence for each division result in the division result set;
performing semantic scoring on the control word segment sequences to obtain word segment semantic score values, and ranking all the control word segment sequences based on the word segment semantic score values to obtain a sequence ranking result;
determining capacity parameters of an iteration matrix determinant based on all the control word segment sequences, generating an iteration characterization matrix for each control word segment sequence based on the capacity parameters and the number of fields contained in the corresponding division result, performing an accumulated weighting iteration on all the iteration characterization matrices based on the sequence ranking result to obtain a final division characterization matrix, and dividing and semantically recognizing the original control voice information based on the final division characterization matrix to obtain a fan control instruction.
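The semantic score of an integrated speech segment combines part-of-speech connection weights with single and joint occurrence probabilities. One plausible reading, a connection-weighted sum of pairwise log-probability-ratio terms, is sketched below; the exact functional form and the probability tables are assumptions:

```python
import math

def semantic_score(segments, conn_weight, p_single, p_pair):
    """Sum over adjacent word-segment pairs of the part-of-speech connection
    weight times log2 of joint over independent probability (assumed form)."""
    score = 0.0
    for i in range(len(segments) - 1):
        a, b = segments[i], segments[i + 1]
        score += conn_weight[(a, b)] * math.log2(
            p_pair[(a, b)] / (p_single[a] * p_single[b])
        )
    return score

# Hypothetical probabilities from a preset instruction information base.
segments = ["wind", "speed"]
p_single = {"wind": 0.2, "speed": 0.2}
p_pair = {("wind", "speed"): 0.16}
conn_weight = {("wind", "speed"): 0.8}
s = semantic_score(segments, conn_weight, p_single, p_pair)
```

The word segment set whose integrated segment attains the maximum such score becomes the final division word segment set for its control instruction related word.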
7. The voice recognition method of the voice control system for the fan according to claim 1, wherein S3: after a fan control instruction is recognized in the control voice signal and the fan is controlled to execute corresponding operation based on the fan control instruction, the method comprises the following steps:
judging whether a source user of the control voice signal is a historical user in a historical user library or not based on the voice characteristics to obtain a judgment result;
storing the fan control instructions in a historical instruction library of the source user.
8. The method as claimed in claim 7, wherein the determining whether the source user of the control voice signal is a historical user in a historical user library based on the voice feature to obtain a determination result comprises:
judging whether a user voice characteristic consistent with the voice characteristic exists in a historical user library, if so, taking the historical user of which the source user of the control voice signal is in the historical user library as the judgment result;
otherwise, the source user of the control voice signal is not the historical user in the historical user library as the judgment result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211125810.0A CN115206323B (en) | 2022-09-16 | 2022-09-16 | Voice recognition method of fan voice control system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115206323A true CN115206323A (en) | 2022-10-18 |
CN115206323B CN115206323B (en) | 2022-11-29 |
Family
ID=83572796
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211125810.0A Active CN115206323B (en) | 2022-09-16 | 2022-09-16 | Voice recognition method of fan voice control system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115206323B (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2011033717A (en) * | 2009-07-30 | 2011-02-17 | Secom Co Ltd | Noise suppression device |
US20130010974A1 (en) * | 2011-07-06 | 2013-01-10 | Honda Motor Co., Ltd. | Sound processing device, sound processing method, and sound processing program |
US20140180682A1 (en) * | 2012-12-21 | 2014-06-26 | Sony Corporation | Noise detection device, noise detection method, and program |
US9431024B1 (en) * | 2015-03-02 | 2016-08-30 | Faraday Technology Corp. | Method and apparatus for detecting noise of audio signals |
JP2017191332A (en) * | 2017-06-22 | 2017-10-19 | 株式会社Jvcケンウッド | Noise detection device, noise detection method, noise reduction device, noise reduction method, communication device, and program |
CN113035192A (en) * | 2021-02-26 | 2021-06-25 | 深圳市超维实业有限公司 | Voice recognition method of fan voice control system |
Non-Patent Citations (2)
Title |
---|
YOSHIHISA UEMURA et al.: "Musical noise generation analysis for noise reduction methods based on spectral subtraction and MMSE STSA estimation", IEEE Xplore *
LIU Jing: "Research and Implementation of Speech Noise Suppression Technology in Airborne Environments", China Masters' Theses Full-text Database *
Also Published As
Publication number | Publication date |
---|---|
CN115206323B (en) | 2022-11-29 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
EP0535146B1 (en) | Continuous speech processing system | |
US20240028837A1 (en) | Device and method for machine reading comprehension question and answer | |
US6092044A (en) | Pronunciation generation in speech recognition | |
US6208971B1 (en) | Method and apparatus for command recognition using data-driven semantic inference | |
US6167377A (en) | Speech recognition language models | |
US6163768A (en) | Non-interactive enrollment in speech recognition | |
US6839667B2 (en) | Method of speech recognition by presenting N-best word candidates | |
US6311157B1 (en) | Assigning meanings to utterances in a speech recognition system | |
US6751595B2 (en) | Multi-stage large vocabulary speech recognition system and method | |
US7389229B2 (en) | Unified clustering tree | |
EP1012736B1 (en) | Text segmentation and identification of topics | |
US5613036A (en) | Dynamic categories for a speech recognition system | |
EP1922653B1 (en) | Word clustering for input data | |
EP0867858A2 (en) | Pronunciation generation in speech recognition | |
EP0867857A2 (en) | Enrolment in speech recognition | |
US20070094007A1 (en) | Conversation controller | |
KR20090004216A (en) | System and method for classifying named entities from speech recongnition | |
CN112151015A (en) | Keyword detection method and device, electronic equipment and storage medium | |
EP1152398B1 (en) | A speech recognition system | |
US5732190A (en) | Number-of recognition candidates determining system in speech recognizing device | |
CN115206323B (en) | Voice recognition method of fan voice control system | |
CN108899016B (en) | Voice text normalization method, device and equipment and readable storage medium | |
DeMori | Syntactic recognition of speech patterns | |
CN1061451C | Hidden-Markov-model-based Chinese speech recognition method and apparatus thereof |
Cerisara | Automatic discovery of topics and acoustic morphemes from speech |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||