CN108917283A

CN108917283A - Intelligent refrigerator control method, system, intelligent refrigerator and cloud server

Info

Publication number: CN108917283A
Application number: CN201810763475.4A
Authority: CN
Inventors: 文俊
Original assignee: Sichuan Hongmei Intelligent Technology Co Ltd
Current assignee: Sichuan Hongmei Intelligent Technology Co Ltd
Priority date: 2018-07-12
Filing date: 2018-07-12
Publication date: 2018-11-30

Abstract

The present invention provides an intelligent refrigerator control method, a system, an intelligent refrigerator and a cloud server. The method, applied to an intelligent refrigerator, includes: receiving a voice signal that carries a voice instruction input by a user; converting the voice signal into a digital signal; extracting a speech feature sequence from the digital signal; sending the speech feature sequence to an external cloud server; receiving and parsing the semantic representation that the external cloud server sends according to the speech feature sequence; and performing an operation according to the parsed semantic representation. This scheme can improve the user experience.

Description

Intelligent refrigerator control method, system, intelligent refrigerator and cloud server

Technical field

The present invention relates to the technical field of smart homes, and in particular to an intelligent refrigerator control method, a system, an intelligent refrigerator and a cloud server.

Background art

With the development of the mobile Internet and artificial intelligence, intelligent appliances have entered people's lives and changed the way people live. As an important category of household appliance, the refrigerator is inevitably developing in an intelligent direction. Most intelligent refrigerators currently on the market integrate refrigerator control, food management, recipe search, audio/video and other functions, so the way people interact with these functions on the refrigerator is particularly important.

At present, voice interaction between the user and the refrigerator is mostly based on offline speech recognition: command words are stored locally on the refrigerator, and speech recognition is performed locally on the refrigerator.

However, because the local storage capacity of an intelligent refrigerator is limited, the number of offline command words that can be stored locally is also limited. Offline speech recognition therefore supports only a single, fixed command word for a given function. The voice instruction spoken by the user must be present among the local offline command words; otherwise it cannot be recognized, which results in a poor user experience.

Summary of the invention

The embodiments of the present invention provide an intelligent refrigerator control method, a system, an intelligent refrigerator and a cloud server, which can improve the user experience.

In a first aspect, an embodiment of the present invention provides an intelligent refrigerator control method, applied to an intelligent refrigerator, including:

Receiving a voice signal that carries a voice instruction input by a user;

Converting the voice signal into a digital signal;

Extracting a speech feature sequence from the digital signal;

Sending the speech feature sequence to an external cloud server;

Receiving and parsing the semantic representation that the external cloud server sends according to the speech feature sequence;

Performing an operation according to the parsed semantic representation.

Preferably, after the voice signal is converted into a digital signal and before the speech feature sequence is extracted from the digital signal, the method further includes:

Dividing the digital signal into at least two frames of digital signal according to a preset frame length and in time order;

Detecting, among the divided frames, the start digital signal that marks the starting point of the voice instruction and the end digital signal that marks the ending point of the voice instruction;

The extracting of the speech feature sequence from the digital signal includes:

Extracting, in the time order, feature parameters from each frame from the start digital signal through the end digital signal, and forming the speech feature sequence from them.
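The framing step can be illustrated with a minimal sketch. The function name `split_into_frames`, the use of non-overlapping frames, and the zero-padding of a trailing partial frame are assumptions made for illustration; the source only states that the signal is divided by a preset frame length in time order.

```python
from typing import List, Sequence


def split_into_frames(samples: Sequence[float], frame_length: int) -> List[List[float]]:
    """Split digitized samples into consecutive frames, preserving time order.

    Assumption: frames do not overlap, and a trailing partial frame is
    zero-padded to the preset frame length.
    """
    if frame_length <= 0:
        raise ValueError("frame_length must be positive")
    frames: List[List[float]] = []
    for start in range(0, len(samples), frame_length):
        frame = list(samples[start:start + frame_length])
        # Pad the last frame so every frame has exactly frame_length samples.
        frame.extend([0.0] * (frame_length - len(frame)))
        frames.append(frame)
    return frames
```

Practical speech front ends often use overlapping, windowed frames; that variant would change only the step size of the loop.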

Preferably, detecting, among the divided frames, the start digital signal that marks the starting point of the voice instruction and the end digital signal that marks the ending point of the voice instruction includes:

Determining the short-time energy value and the zero-crossing rate value of each frame of digital signal;

Taking the first frame in the time order as the current digital signal and executing:

S0: Determine whether the short-time energy value of the current digital signal is greater than or equal to a preset first energy threshold; if so, execute S1; otherwise, execute S5;

S1: Determine whether the zero-crossing rate value of the current digital signal is greater than or equal to a preset first zero-crossing rate threshold; if so, determine the current digital signal to be the start digital signal marking the starting point of the voice instruction, and execute S2 and S3; otherwise, execute S2 and S0;

S2: Take the frame that follows the current digital signal in the time order as the current digital signal;

S3: Determine whether the short-time energy value of the current digital signal is less than a preset second energy threshold; if so, execute S4; otherwise, execute S2 and S3;

S4: Determine whether the zero-crossing rate value of the current digital signal is less than a preset second zero-crossing rate threshold; if so, determine the current digital signal to be the end digital signal marking the ending point of the voice instruction; otherwise, execute S2 and S3.
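The S0 to S4 procedure can be sketched as a two-phase scan over the frames. This is only an illustration under stated assumptions: the helper names are hypothetical, the energy and zero-crossing formulas are the usual textbook definitions (the formula images are not available in this text), and S5, which the text does not define, is assumed to mean "advance to the next frame and return to S0".

```python
from typing import List, Optional, Sequence, Tuple


def short_time_energy(frame: Sequence[float]) -> float:
    # Assumed definition: sum of squared samples in the frame.
    return sum(x * x for x in frame)


def sgn(x: float) -> int:
    # Assumed sign convention: zero counts as positive.
    return 1 if x >= 0 else -1


def zero_crossing_rate(frame: Sequence[float]) -> float:
    # Assumed definition: half the summed absolute sign differences.
    return 0.5 * sum(abs(sgn(frame[m]) - sgn(frame[m - 1])) for m in range(1, len(frame)))


def detect_endpoints(
    frames: List[Sequence[float]],
    e_start: float, z_start: float,   # first energy / zero-crossing thresholds
    e_end: float, z_end: float,       # second energy / zero-crossing thresholds
) -> Tuple[Optional[int], Optional[int]]:
    """Return (start_frame_index, end_frame_index); None where not found."""
    energies = [short_time_energy(f) for f in frames]
    zcrs = [zero_crossing_rate(f) for f in frames]
    i = 0
    # S0 + S1: look for the first frame whose energy and ZCR both reach
    # the first thresholds; otherwise move to the next frame (S2 / assumed S5).
    while i < len(frames) and not (energies[i] >= e_start and zcrs[i] >= z_start):
        i += 1
    if i == len(frames):
        return None, None
    start = i
    i += 1  # S2
    # S3 + S4: look for the first later frame whose energy and ZCR both
    # fall below the second thresholds; otherwise keep advancing (S2).
    while i < len(frames):
        if energies[i] < e_end and zcrs[i] < z_end:
            return start, i
        i += 1
    return start, None
```

Feature extraction would then run over `frames[start:end + 1]`, which matches the text's "from the start digital signal through the end digital signal".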

Preferably, determining the short-time energy value and the zero-crossing rate value of each frame of digital signal includes:

Determining the short-time energy value of each frame of digital signal using the following first formula:

First formula:

Where X_i(m) denotes the i-th frame of digital signal at input value m, E_i denotes the short-time energy value of X_i(m), and N denotes the frame length;

Determining the zero-crossing rate value of each frame of digital signal using the following second formula:

Second formula:

Where sgn[ ] is the sign function, namely:

Where X_i(m) denotes the i-th frame of digital signal at input value m, L_i denotes the zero-crossing rate value of X_i(m), and N denotes the frame length.
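The formula bodies are not reproduced in this text. Assuming the usual textbook definitions of short-time energy and zero-crossing rate, which are consistent with the symbols X_i(m), E_i, L_i, N and sgn[ ] defined above, the first formula, the second formula and the sign function could be written as follows. The summation limits, the normalization factor and the treatment of zero in sgn[ ] are assumptions, not confirmed by the source.

```latex
E_i = \sum_{m=1}^{N} X_i(m)^2

L_i = \frac{1}{2} \sum_{m=1}^{N-1} \bigl|\, \operatorname{sgn}[X_i(m+1)] - \operatorname{sgn}[X_i(m)] \,\bigr|

\operatorname{sgn}[x] =
\begin{cases}
  1,  & x \ge 0 \\
  -1, & x < 0
\end{cases}
```

Under these definitions each sign change between adjacent samples contributes 1 to L_i, so L_i counts the zero crossings within frame i.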

Preferably, after the voice signal is converted into a digital signal, the method further includes:

Enhancing the high-frequency component of the digital signal;

Then,

Extracting the speech feature sequence from the digital signal whose high-frequency component has been enhanced.
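One common way to enhance the high-frequency component is a first-order pre-emphasis filter, y[n] = x[n] - alpha * x[n-1]. The source does not name the filter it uses, so the sketch below is an assumption, as are the function name `pre_emphasize` and the default coefficient 0.97 (a value often used in speech processing).

```python
from typing import List, Sequence


def pre_emphasize(samples: Sequence[float], alpha: float = 0.97) -> List[float]:
    """Apply y[n] = x[n] - alpha * x[n-1]; the first sample is passed through."""
    if not samples:
        return []
    out = [float(samples[0])]
    for n in range(1, len(samples)):
        # Slowly varying (low-frequency) content is attenuated, while
        # rapidly alternating (high-frequency) content is boosted.
        out.append(samples[n] - alpha * samples[n - 1])
    return out
```

With `alpha=0.5`, a constant input `[1.0, 1.0, 1.0]` becomes `[1.0, 0.5, 0.5]`, while an alternating input `[1.0, -1.0, 1.0]` becomes `[1.0, -1.5, 1.5]`.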

Preferably, receiving and parsing the semantic representation that the external cloud server sends according to the speech feature sequence includes:

Receiving a JavaScript Object Notation (JSON) data packet sent by the external cloud server according to the speech feature sequence;

Parsing the JSON data packet to obtain the semantic representation in the JSON data packet.

Preferably, after the voice signal carrying the voice instruction input by the user is received and before the voice signal is converted into a digital signal, the method further includes:

Removing interference signals from the voice signal, where the interference signals include a noise signal and/or an echo signal;

Then,

The converting of the voice signal into a digital signal includes:

Converting the voice signal from which the interference signals have been removed into a digital signal.

In a second aspect, an embodiment of the present invention provides an intelligent refrigerator control method, applied to a cloud server, including:

Storing in advance at least one command word and at least one speech template, where the speech template includes at least one speech feature parameter corresponding to at least one command word;

Receiving a speech feature sequence sent by an external intelligent refrigerator;

Determining whether the stored speech templates include an approximate template corresponding to the speech feature sequence;

If so, determining, from the stored command words, the control command word corresponding to the approximate template;

Analyzing the semantics of the control command word and obtaining the semantic representation of the control command word, where the semantic representation includes the domain, intent and slot corresponding to the control command word;

Sending the semantic representation to the external intelligent refrigerator.

Preferably, determining whether the stored speech templates include an approximate template corresponding to the speech feature sequence includes:

D0: Determine an unprocessed set, and take the first speech template in a preset template sequence as the current speech template, where the unprocessed set includes the at least one speech template;

D1: Detect the similarity between the speech feature sequence and the current speech template, delete the current speech template from the unprocessed set, and execute D2;

D2: Determine whether the number of speech templates in the unprocessed set is 0; if so, execute D3; otherwise, take the speech template that follows the current speech template in the template sequence as the current speech template and return to D1;

D3: Determine the speech template with the highest similarity to the speech feature sequence to be the approximate template, and execute the step of determining, from the stored command words, the control command word corresponding to the approximate template.
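The D0 to D3 loop can be sketched as a scan that scores every stored template and keeps the best one. The source does not say how similarity is measured; the sketch below assumes dynamic time warping (DTW) over per-frame feature vectors and maps the DTW distance to a similarity with 1 / (1 + distance). The function names and the template representation are hypothetical.

```python
import math
from typing import List, Optional, Sequence, Tuple

Feature = Sequence[float]


def dtw_distance(a: Sequence[Feature], b: Sequence[Feature]) -> float:
    """Dynamic time warping distance between two feature sequences."""
    cost = [[math.inf] * (len(b) + 1) for _ in range(len(a) + 1)]
    cost[0][0] = 0.0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[len(a)][len(b)]


def find_approximate_template(
    features: Sequence[Feature],
    templates: List[Tuple[str, Sequence[Feature]]],
) -> Optional[str]:
    """Return the name of the most similar template, or None if none are stored."""
    unprocessed = list(templates)          # D0: the unprocessed set, in sequence order
    best_name: Optional[str] = None
    best_similarity = -math.inf
    while unprocessed:                     # D2: stop once the set is empty
        name, template = unprocessed.pop(0)  # D1: take and remove the current template
        similarity = 1.0 / (1.0 + dtw_distance(features, template))
        if similarity > best_similarity:
            best_name, best_similarity = name, similarity
    return best_name                       # D3: highest-similarity template
```

A production matcher would likely also apply a minimum-similarity cutoff so that unrelated speech yields no approximate template; the text leaves that threshold unspecified.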

Preferably, sending the semantic representation to the external intelligent refrigerator includes:

Encapsulating the semantic representation as a JSON data packet;

Sending the JSON data packet to the external intelligent refrigerator.
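The JSON exchange can be illustrated with a small round trip: the cloud server encapsulates the domain, intent and slot, and the refrigerator parses them back. The key names `domain`, `intent` and `slots`, the function names, and the example values are hypothetical; the source specifies only that the semantic representation is carried in a JSON data packet.

```python
import json
from typing import Any, Dict

REQUIRED_KEYS = ("domain", "intent", "slots")  # hypothetical key names


def encode_semantic_packet(domain: str, intent: str, slots: Dict[str, Any]) -> str:
    """Cloud side: encapsulate the semantic representation as a JSON string."""
    return json.dumps({"domain": domain, "intent": intent, "slots": slots},
                      ensure_ascii=False)


def parse_semantic_packet(packet: str) -> Dict[str, Any]:
    """Refrigerator side: parse the packet and check the expected keys."""
    data = json.loads(packet)
    if not isinstance(data, dict):
        raise ValueError("semantic packet must be a JSON object")
    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"semantic packet is missing keys: {missing}")
    return data
```

For example, `parse_semantic_packet(encode_semantic_packet("recipe", "search", {"dish": "braised aubergines"}))` returns the original dictionary, which the control logic can then dispatch on by intent.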

In a third aspect, an embodiment of the present invention provides an intelligent refrigerator, including:

A communication processing unit, configured to receive a voice signal that carries a voice instruction input by a user; to send the speech feature sequence extracted by the feature extraction unit to an external cloud server; and to receive and parse the semantic representation that the external cloud server sends according to the speech feature sequence extracted by the feature extraction unit;

A signal processing unit, configured to convert the voice signal received by the communication processing unit into a digital signal;

The feature extraction unit, configured to extract the speech feature sequence from the digital signal converted by the signal processing unit;

A control unit, configured to perform an operation according to the semantic representation parsed by the communication processing unit.

In a fourth aspect, an embodiment of the present invention provides a cloud server, including:

A cloud storage unit, configured to store in advance at least one command word and the speech template corresponding to the at least one command word, where the speech template includes the speech feature parameters corresponding to the command word;

A cloud interaction unit, configured to receive the speech feature sequence sent by an external intelligent refrigerator, and to send the semantic representation obtained by the cloud processing unit to the external intelligent refrigerator;

The cloud processing unit, configured to determine whether the speech templates stored by the cloud storage unit include an approximate template corresponding to the speech feature sequence received by the cloud interaction unit; if so, to determine, from the stored command words, the control command word corresponding to the approximate template; and to analyze the semantics of the control command word and obtain the semantic representation of the control command word, where the semantic representation includes the domain, intent and slot corresponding to the control command word.

In a fifth aspect, an embodiment of the present invention provides an intelligent refrigerator control system, including: at least one intelligent refrigerator according to the third aspect and the cloud server according to the fourth aspect.

In the embodiments of the present invention, after receiving a voice signal carrying a voice instruction, the intelligent refrigerator does not need to perform speech recognition. It only needs to preprocess the voice signal, that is, convert the voice signal into a digital signal and extract the speech feature sequence of the voice instruction from the digital signal, and then send the speech feature sequence to the external cloud server for processing. After receiving the semantic representation sent by the external cloud server, the refrigerator can perform the operation corresponding to the voice instruction according to the semantic representation. Because the intelligent refrigerator does not need to store offline command words locally, this not only reduces the CPU and memory usage of the intelligent refrigerator, but also avoids the situation in which limited local storage forces a single, fixed command word for each function, so that a voice instruction that is not among the offline command words cannot be recognized. The user experience is thereby improved.

Brief description of the drawings

To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flowchart of an intelligent refrigerator control method provided by an embodiment of the present invention;

Fig. 2 is a flowchart of another intelligent refrigerator control method provided by an embodiment of the present invention;

Fig. 3 is a structural schematic diagram of an intelligent refrigerator provided by an embodiment of the present invention;

Fig. 4 is a structural schematic diagram of a cloud server provided by an embodiment of the present invention;

Fig. 5 is a structural schematic diagram of an intelligent refrigerator control system provided by an embodiment of the present invention;

Fig. 6 is a structural schematic diagram of another intelligent refrigerator control system provided by an embodiment of the present invention.

Detailed description of the embodiments

To make the objectives, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present invention.

As shown in Fig. 1, an embodiment of the present invention provides an intelligent refrigerator control method, applied to an intelligent refrigerator, including:

Step 101: Receive a voice signal that carries a voice instruction input by a user;

Step 102: Convert the voice signal into a digital signal;

Step 103: Extract a speech feature sequence from the digital signal;

Step 104: Send the speech feature sequence to an external cloud server;

Step 105: Receive and parse the semantic representation that the external cloud server sends according to the speech feature sequence;

Step 106: Perform an operation according to the parsed semantic representation.
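The ordering of steps 101 to 106 can be sketched as a small orchestration function. Every stage is passed in as a callable because the source does not fix how each stage is implemented; the function name `handle_voice_instruction` and all parameter names are hypothetical.

```python
from typing import Any, Callable, Dict, List, Sequence


def handle_voice_instruction(
    voice_signal: Sequence[float],                                  # step 101 input
    to_digital: Callable[[Sequence[float]], List[int]],
    extract_features: Callable[[List[int]], List[List[float]]],
    cloud_request: Callable[[List[List[float]]], str],
    parse_reply: Callable[[str], Dict[str, Any]],
    execute: Callable[[Dict[str, Any]], Any],
) -> Any:
    """Run the refrigerator-side flow; no recognition happens locally."""
    digital = to_digital(voice_signal)        # step 102: convert to a digital signal
    features = extract_features(digital)      # step 103: speech feature sequence
    reply = cloud_request(features)           # steps 104-105: send, then receive
    semantics = parse_reply(reply)            # step 105: parse the semantic representation
    return execute(semantics)                 # step 106: perform the operation
```

Keeping the stages injectable makes it possible to test the flow with stubs and to swap the transport or feature extractor without touching the control logic.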

In an embodiment of the present invention, after the voice signal is converted into a digital signal and before the speech feature sequence is extracted from the digital signal, the method further includes:

In this embodiment, the intelligent refrigerator frames the digital signal according to a preset frame length and in time order, which keeps the data volume of the digital signal from becoming so large that processing becomes difficult. Determining the starting point and ending point of the voice instruction not only reduces the amount of data collected for speech recognition and saves processing time, but also excludes interference from silent or noisy segments, improving speech recognition performance.

In an embodiment of the present invention, detecting, among the divided frames, the start digital signal that marks the starting point of the voice instruction and the end digital signal that marks the ending point of the voice instruction includes:

In this embodiment, by determining the short-time energy value and the zero-crossing rate value of each frame of digital signal, the intelligent refrigerator combines the strengths of both measures: short-time energy is used to detect voiced sounds, whose energy in the digital signal is comparatively large, and the zero-crossing rate is used to distinguish unvoiced sounds from noise. The starting point and ending point of the voice instruction are thus determined, and the speech feature sequence is extracted from the frames between the starting point and the ending point. This reduces the amount of data collected and saves processing time, while also excluding interference from silent or noisy segments and improving speech recognition performance.

In an embodiment of the present invention, determining the short-time energy value and the zero-crossing rate value of each frame of digital signal includes:

First formula:

Second formula:

Where sgn[ ] is the sign function, namely:

In this embodiment, by entering the corresponding input value m into the first formula and the second formula above, the intelligent refrigerator can determine the short-time energy value and the zero-crossing rate value of each frame of digital signal. From these values it can determine the start and end points of the voice instruction, and from those points it can select the required digital signal from among the frames.

To improve the accuracy of speech recognition, in an embodiment of the present invention, after the voice signal is converted into a digital signal, the method further includes:

Enhancing the high-frequency component of the digital signal;

Then,

In this embodiment, the intelligent refrigerator enhances the high-frequency component of the digital signal, that is, it pre-processes the digital signal to remove spectral tilt so that the spectrum of the digital signal becomes flat. This compensates for the high-frequency part of the signal that is suppressed by the vocal system, helps improve the signal-to-noise ratio of the digital signal, and removes the effects of glottal excitation and of lip and nasal radiation.

In an embodiment of the present invention, receiving and parsing the semantic representation that the external cloud server sends according to the speech feature sequence includes:

In this embodiment, after receiving the JSON data packet sent by the external cloud server, the intelligent refrigerator parses the JSON data packet and obtains from it the semantic representation corresponding to the voice instruction, so that it can perform the operation according to the semantic representation. Having the external cloud server package the semantic representation as a JSON data packet makes the semantic representation easier to transmit and easier for the intelligent refrigerator to read and parse.

In an embodiment of the present invention, after the voice signal carrying the voice instruction input by the user is received and before the voice signal is converted into a digital signal, the method further includes:

Then,

In this embodiment, after receiving the voice signal, the intelligent refrigerator uses a microphone array to perform noise reduction and echo cancellation and to filter out background noise. This removes interference from the voice signal, so that the voice command in the voice signal is recognized more accurately.

As shown in Fig. 2, an embodiment of the present invention provides an intelligent refrigerator control method, applied to a cloud server, including:

Step 201: Store in advance at least one command word and at least one speech template, where the speech template includes at least one speech feature parameter corresponding to the at least one command word;

Step 202: Receive a speech feature sequence sent by an external intelligent refrigerator;

Step 203: Determine whether the stored speech templates include an approximate template corresponding to the speech feature sequence;

Step 204: If so, determine, from the stored command words, the control command word corresponding to the approximate template;

Step 205: Analyze the semantics of the control command word and obtain the semantic representation of the control command word, where the semantic representation includes the domain, intent and slot corresponding to the control command word;

Step 206: Send the semantic representation to the external intelligent refrigerator.

In this embodiment, because the cloud server stores in advance at least one command word and the speech template corresponding to the at least one command word, the external intelligent refrigerator does not need to perform speech recognition after receiving a voice signal. Recognition is performed by the cloud server instead, which reduces CPU and memory consumption on the external intelligent refrigerator. After receiving the speech feature sequence sent by the external intelligent refrigerator, the cloud server determines, from the stored speech templates, the approximate template corresponding to the speech feature sequence, determines from the approximate template the control command word corresponding to the speech feature sequence, and then sends the semantic representation of the control command word to the external intelligent refrigerator, which performs the operation according to the semantic representation. In summary, having the cloud server process the digital signal avoids the situation in which the limited local storage of the external intelligent refrigerator forces a single, fixed command word for each function, so that a voice instruction that is not among the offline command words cannot be recognized. The user experience is thereby improved.

In an embodiment of the present invention, determining whether the stored speech templates include an approximate template corresponding to the speech feature sequence includes:

In this embodiment, when determining the approximate template corresponding to the speech feature sequence, the cloud server detects the similarity between the speech feature sequence and each speech template, determines the approximate template with the highest similarity to the speech feature sequence, and then determines the control command word corresponding to the approximate template, which yields the recognition result.

In an embodiment of the present invention, sending the semantic representation to the external intelligent refrigerator includes:

Encapsulating the semantic representation as a JSON data packet;

Sending the JSON data packet to the external intelligent refrigerator.

In this embodiment, after obtaining the semantic representation of the control command word, the cloud server does not send the semantic representation directly to the external intelligent refrigerator. It first packages the semantic representation as a JSON data packet and returns that packet to the intelligent refrigerator, which makes the semantic representation easier to transmit and easier for the external intelligent refrigerator to read and parse.

As shown in Fig. 3, an embodiment of the present invention provides an intelligent refrigerator, including:

A communication processing unit 301, configured to receive a voice signal that carries a voice instruction input by a user; to send the speech feature sequence extracted by the feature extraction unit 303 to an external cloud server; and to receive and parse the semantic representation that the external cloud server sends according to the speech feature sequence extracted by the feature extraction unit;

A signal processing unit 302, configured to convert the voice signal received by the communication processing unit 301 into a digital signal;

The feature extraction unit 303, configured to extract the speech feature sequence from the digital signal converted by the signal processing unit 302;

A control unit 304, configured to perform an operation according to the semantic representation parsed by the communication processing unit 301.

In this embodiment, after the communication processing unit receives a voice signal carrying a voice instruction, no speech recognition is needed on the refrigerator. The voice signal only needs to be preprocessed: the signal processing unit converts the voice signal into a digital signal, and the feature extraction unit extracts the speech feature sequence of the voice instruction from the digital signal. The communication processing unit then sends the speech feature sequence to the external cloud server for processing. After the communication processing unit receives the semantic representation sent by the external cloud server, the control unit performs the operation corresponding to the voice instruction according to the semantic representation. Because the intelligent refrigerator does not need to store offline command words locally, this not only reduces the CPU and memory usage of the intelligent refrigerator, but also avoids the situation in which limited local storage forces a single, fixed command word for each function, so that a voice instruction that is not among the offline command words cannot be recognized. The user experience is thereby improved.

In an embodiment of the present invention, the signal processing unit is further configured to divide the digital signal into at least two frames of digital signal according to a preset frame length and in time order, and to detect, among the divided frames, the start digital signal that marks the starting point of the voice instruction and the end digital signal that marks the ending point of the voice instruction;

The feature extraction unit is configured to extract, in the time order, feature parameters from each frame from the start digital signal through the end digital signal, and to form the speech feature sequence from them.

In an embodiment of the present invention, the signal processing unit is configured to determine the short-time energy value and the zero-crossing rate value of each frame of digital signal;

In an embodiment of the present invention, the signal processing unit is configured to determine the short-time energy value of each frame of digital signal using the following first formula:

First formula:

Second formula:

Where sgn[ ] is the sign function, namely:

In an embodiment of the present invention, the signal processing unit is further configured to enhance the high-frequency component of the digital signal;

The feature extraction unit is configured to extract the speech feature sequence from the digital signal whose high-frequency component has been enhanced.

In an embodiment of the present invention, the communication processing unit is configured to receive a JavaScript Object Notation (JSON) data packet sent by the external cloud server according to the speech feature sequence, and to parse the JSON data packet to obtain the semantic representation in the JSON data packet.

In an embodiment of the present invention, the signal processing unit is further configured to remove interference signals from the voice signal, where the interference signals include a noise signal and/or an echo signal, and to convert the voice signal from which the interference signals have been removed into a digital signal.

As shown in Fig. 4, an embodiment of the present invention provides a cloud server, including:

A cloud storage unit 401, configured to store in advance at least one command word and the speech template corresponding to the at least one command word, where the speech template includes the speech feature parameters corresponding to the command word;

A cloud interaction unit 402, configured to receive the speech feature sequence sent by an external intelligent refrigerator, and to send the semantic representation obtained by the cloud processing unit 403 to the external intelligent refrigerator;

The cloud processing unit 403, configured to determine whether the speech templates stored by the cloud storage unit 401 include an approximate template corresponding to the speech feature sequence received by the cloud interaction unit 402; if so, to determine, from the stored command words, the control command word corresponding to the approximate template; and to analyze the semantics of the control command word and obtain the semantic representation of the control command word, where the semantic representation includes the domain, intent and slot corresponding to the control command word.

In this embodiment, because the cloud storage unit stores in advance at least one command word and the speech template corresponding to the at least one command word, the external intelligent refrigerator does not need to perform speech recognition after receiving a voice signal. Recognition is performed by the cloud processing unit instead, which reduces CPU and memory consumption on the external intelligent refrigerator. After the cloud interaction unit receives the speech feature sequence sent by the external intelligent refrigerator, the cloud processing unit determines, from the speech templates stored by the cloud storage unit, the approximate template corresponding to the speech feature sequence, and determines from the approximate template the control command word corresponding to the speech feature sequence. The cloud interaction unit then sends the semantic representation of the control command word to the external intelligent refrigerator, which performs the operation according to the semantic representation. In summary, having the cloud server process the voice signal avoids the situation in which the limited local storage of the external intelligent refrigerator forces a single, fixed command word for each function, so that a voice instruction that is not among the offline command words cannot be recognized. The user experience is thereby improved.

As shown in Fig. 5, an embodiment of the present invention provides an intelligent refrigerator control system, including: at least one intelligent refrigerator 501 as described for Fig. 3 and the cloud server 502 as described for Fig. 4.

In this embodiment, because the cloud server stores in advance at least one command word and the speech template corresponding to the at least one command word, the intelligent refrigerator does not need to perform speech recognition after receiving a voice signal. Recognition is performed by the cloud server instead, which reduces CPU and memory consumption on the intelligent refrigerator. After obtaining the intent of the voice instruction, that is, after obtaining the corresponding semantic representation, the cloud server sends the semantic representation to the intelligent refrigerator, and the intelligent refrigerator carries out the corresponding operation flow according to the semantic representation. Because the intelligent refrigerator does not need to store offline command words locally, this not only reduces the CPU and memory usage of the intelligent refrigerator, but also avoids the situation in which limited local storage forces a single, fixed command word for each function, so that a voice instruction that is not among the offline command words cannot be recognized. The user experience is thereby improved.

To illustrate the technical solution and advantages of the present invention more clearly, the intelligent refrigerator control system provided by the embodiment of the present invention is described in detail below, taking as an example: the phonetic instruction "search Liu lustily water"; the command words pre-stored by the cloud server, namely "search", "Liu", "lustily water", "braised aubergines" and "way"; sound template A, which holds the speech characteristic parameters of the command words "search", "Liu" and "lustily water"; and sound template B, which holds the speech characteristic parameters of the command words "search", "braised aubergines" and "way". As shown in Fig. 6, the method may specifically include the following steps:

Step 601: the intelligent refrigerator receives a voice signal, input by the user, carrying the phonetic instruction "search Liu lustily water".

Specifically, a user who needs to interact with the intelligent refrigerator only has to speak the phonetic instruction to be executed; the intelligent refrigerator receives the voice signal carrying that phonetic instruction.

For example, the intelligent refrigerator receives a voice signal, input by the user, carrying the phonetic instruction "search Liu lustily water".

Step 602: the intelligent refrigerator removes the interference signal from the voice signal, wherein the interference signal includes a noise signal and/or an echo signal.

Specifically, after the voice signal is received it must be processed: a microphone array is used for noise reduction and echo cancellation, and a software function filters out background noise, which improves the accuracy of speech recognition.

For example, the noise signal and the echo signal are removed using the microphone array, and the background noise in the voice signal is filtered out using a preset software function.

Step 603: the intelligent refrigerator converts the voice signal, from which the interference signal has been removed, into a digital signal.

Specifically, after noise reduction and echo cancellation by the microphone array and filtering of background noise by the software function, the intelligent refrigerator performs analog-to-digital conversion, that is, converts the voice signal into a digital signal to facilitate digital processing.

For example, the intelligent refrigerator converts the voice signal into a digital signal.

Step 604: the intelligent refrigerator divides the digital signal into at least two frames of digital signal according to a preset frame length and in time order.

Specifically, after converting the voice signal into the digital signal, the intelligent refrigerator does not send the digital signal directly to the cloud server. It first pre-processes the digital signal, that is, performs framing to divide the digital signal into at least two frames, so that the amount of information in the digital signal does not become excessive and make processing more difficult.

For example, with a preset frame length of 20 ms, a digital signal with a duration of 2 s is divided in time order into 100 frames of digital signal.
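The framing in step 604 can be sketched as follows. This is an illustration only, not the implementation of the embodiment: the function name `split_into_frames`, the 16 kHz sample rate, the use of non-overlapping frames and the dropping of a trailing partial frame are all assumptions, since the description specifies only a preset frame length and time order.

```python
def split_into_frames(samples, sample_rate_hz, frame_ms):
    """Split samples into consecutive, non-overlapping frames in time order."""
    frame_len = sample_rate_hz * frame_ms // 1000  # samples per frame
    if frame_len <= 0:
        raise ValueError("frame length must be at least one sample")
    # Assumption of this sketch: a trailing partial frame is dropped.
    n_frames = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_frames)]


# A 2 s digital signal at an assumed 16 kHz sample rate, framed at 20 ms.
signal = [0.0] * (2 * 16000)
frames = split_into_frames(signal, sample_rate_hz=16000, frame_ms=20)
print(len(frames), len(frames[0]))
```

With these assumed values the 2 s signal yields 100 frames of 320 samples each, matching the frame count in the example above.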

Step 605: from the divided frames of digital signal, the intelligent refrigerator detects the starting digital signal, which is the starting point of the phonetic instruction, and the terminating digital signal, which is the terminating point of the phonetic instruction.

Specifically, after framing the digital signal, the intelligent refrigerator performs endpoint detection, that is, detects the starting point and terminating point of the phonetic instruction. This reduces the amount of data collected for speech recognition, saves processing time, excludes interference from silent segments or noise segments, and improves speech recognition performance.

That is:

the short-time energy value and zero-crossing rate value of each frame of digital signal are determined, and the following steps are executed:

S0: The intelligent refrigerator determines whether the short-time energy value of the current digital signal is greater than or equal to a preset first energy threshold; if so, S1 is executed; otherwise, S5 is executed.

S1: The intelligent refrigerator determines whether the zero-crossing rate value of the current digital signal is greater than or equal to a preset first zero-crossing rate threshold; if so, the current digital signal is determined to be the starting digital signal, that is, the starting point of the phonetic instruction, and S2 and S3 are executed; otherwise, S2 and S0 are executed.

S2: The intelligent refrigerator takes, in time order, the frame of digital signal following the current digital signal as the current digital signal.

S3: The intelligent refrigerator determines whether the short-time energy value of the current digital signal is less than a preset second energy threshold; if so, S4 is executed; otherwise, S2 and S3 are executed.

S4: The intelligent refrigerator determines whether the zero-crossing rate value of the current digital signal is less than a preset second zero-crossing rate threshold; if so, the current digital signal is determined to be the terminating digital signal, that is, the terminating point of the phonetic instruction; otherwise, S2 and S3 are executed.

For example, the short-time energy value and zero-crossing rate value of each of the 100 frames of digital signal are calculated;

when the short-time energy value of the 10th frame of digital signal is greater than or equal to the preset first energy threshold, and its zero-crossing rate value is greater than or equal to the preset first zero-crossing rate threshold, the 10th frame of digital signal is determined to be the starting point of the phonetic instruction;

when the short-time energy value of the 90th frame of digital signal is less than the preset second energy threshold, and its zero-crossing rate value is less than the preset second zero-crossing rate threshold, the 90th frame of digital signal is determined to be the terminating point of the phonetic instruction.
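The endpoint detection of S0 to S4 can be sketched as follows. This is a hedged illustration under stated assumptions: the short-time energy is taken as the sum of squared samples in a frame and the zero-crossing rate as half the summed absolute sign differences of adjacent samples, which are common textbook definitions and not necessarily the formulas of the embodiment; the function names and threshold values are hypothetical; and, because step S5 is not defined in the text, a frame that fails the S0 test is simply skipped.

```python
def sign(x):
    """Sign function: 1 for x >= 0, otherwise -1."""
    return 1 if x >= 0 else -1


def short_time_energy(frame):
    """Sum of squared samples in one frame (assumed definition)."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Half the summed absolute sign differences of adjacent samples (assumed definition)."""
    return 0.5 * sum(abs(sign(frame[m]) - sign(frame[m - 1]))
                     for m in range(1, len(frame)))


def detect_endpoints(frames, e1, z1, e2, z2):
    """Return (start_index, end_index) of the instruction, or None if no start is found."""
    i = 0
    # S0/S1: advance frame by frame (S2) until both starting thresholds are met.
    while i < len(frames):
        if short_time_energy(frames[i]) >= e1 and zero_crossing_rate(frames[i]) >= z1:
            break
        i += 1
    else:
        return None
    start = i
    # S2, then S3/S4: advance until a frame falls below both ending thresholds.
    i += 1
    while i < len(frames):
        if short_time_energy(frames[i]) < e2 and zero_crossing_rate(frames[i]) < z2:
            return start, i
        i += 1
    # Assumption: if no terminating frame is found, use the last frame.
    return start, len(frames) - 1


silence = [0.0, 0.0, 0.0, 0.0]
voiced = [1.0, -1.0, 1.0, -1.0]
frames = [silence] * 2 + [voiced] * 3 + [silence] * 2
endpoints = detect_endpoints(frames, e1=1.0, z1=1.0, e2=1.0, z2=1.0)
print(endpoints)
```

In this toy input, frames 2 to 4 (zero-based) are voiced, so the starting frame is index 2 and the terminating frame is the first silent frame after it, index 5.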

Step 606: the intelligent refrigerator enhances the high-frequency component in each frame of digital signal from the starting digital signal to the terminating digital signal.

Specifically, after detecting the starting and terminating points of the phonetic instruction, the intelligent refrigerator also performs pre-emphasis, that is, enhances the high-frequency portion of the voice in the digital signal so that the spectrum of the voice signal becomes flatter, which in turn improves the accuracy of speech recognition.

For example, the high-frequency component of each frame of digital signal from the 10th frame to the 90th frame is enhanced.
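The pre-emphasis of step 606 can be sketched as follows. The description does not name a specific filter; this sketch assumes a first-order high-pass filter y[n] = x[n] - alpha * x[n-1], a common choice for pre-emphasis, with an assumed coefficient alpha = 0.97. The function name `pre_emphasize` is hypothetical.

```python
def pre_emphasize(frame, alpha=0.97):
    """Apply y[n] = x[n] - alpha * x[n-1]; the first sample is passed through."""
    if not frame:
        return []
    out = [frame[0]]
    for n in range(1, len(frame)):
        out.append(frame[n] - alpha * frame[n - 1])
    return out


# A constant (low-frequency) input is strongly attenuated after the first sample,
# while a rapidly alternating (high-frequency) input is amplified.
flat = pre_emphasize([1.0, 1.0, 1.0])
alternating = pre_emphasize([1.0, -1.0, 1.0])
print(flat, alternating)
```

The contrast between the two outputs shows the intended effect: slowly varying content is suppressed relative to rapidly varying content, which flattens a spectrum that otherwise tilts toward low frequencies.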

Step 607: the intelligent refrigerator extracts, in time order, characteristic parameters from each frame of digital signal after the high-frequency component is enhanced, forms the phonetic feature sequence, and sends the phonetic feature sequence to the cloud server.

Specifically, after the voice signal has been pre-processed, that is, after noise reduction and echo cancellation by the microphone array, filtering of background noise, analog-to-digital conversion, framing, endpoint detection and pre-emphasis, a phonetic feature sequence that varies over time can be extracted from the digital signal. Information useful for recognition is thus extracted from the voice signal and irrelevant redundant information is removed. The phonetic feature sequence is then sent to the cloud server, which performs speech recognition.

For example, characteristic parameters are extracted, in time order, from each frame of digital signal from the 10th frame to the 90th frame after the high-frequency component is enhanced, the phonetic feature sequence is formed, and the phonetic feature sequence is sent to the cloud server.
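Forming the phonetic feature sequence from the detected frame range can be sketched as follows. The description does not state which characteristic parameters are extracted, so `frame_features` below uses placeholder parameters (log energy and a count of zero crossings) purely for illustration; both function names are hypothetical, and a practical system would likely use richer spectral features.

```python
import math


def frame_features(frame):
    """Placeholder characteristic parameters for one frame (an assumption of this sketch)."""
    energy = sum(x * x for x in frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    # A small constant avoids log(0) for an all-zero frame.
    return [math.log(energy + 1e-10), float(crossings)]


def build_feature_sequence(frames, start, end):
    """Extract features, in time order, from frames start..end inclusive."""
    return [frame_features(f) for f in frames[start:end + 1]]


# 100 frames; the 10th to 90th frames are zero-based indices 9 to 89.
frames = [[1.0, -1.0, 1.0, -1.0]] * 100
sequence = build_feature_sequence(frames, start=9, end=89)
print(len(sequence))
```

Only the frames between the starting and terminating digital signals contribute, so the sequence sent to the cloud server has one feature vector per retained frame, 81 in this case.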

Step 608: the cloud server receives the phonetic feature sequence sent by the intelligent refrigerator.

Specifically, after receiving the phonetic feature sequence sent by the intelligent refrigerator, the cloud server can perform speech recognition.

Step 609: the cloud server detects the similarity between the phonetic feature sequence and each of sound template A and sound template B, and determines the sound template with the highest similarity to the phonetic feature sequence as the approximate template.

Specifically, the cloud server executes:

D0: The cloud server determines the unprocessed set including sound template A and sound template B, and takes the first sound template in the preset template sequence as the current sound template.

D1: The cloud server detects the similarity between the phonetic feature sequence and the current sound template, and deletes the current sound template from the unprocessed set.

D2: The cloud server judges whether the number of sound templates in the unprocessed set is 0; if so, D3 is executed; otherwise, sound template B in the template sequence is taken as the current sound template, and the process returns to D1.

D3: The cloud server determines the sound template with the highest similarity to the phonetic feature sequence as the approximate template, and executes the step of determining, from the stored command words, the control command word corresponding to the approximate template.

In summary, by detecting the similarity between the phonetic feature sequence and each sound template, the cloud server can determine, according to the similarity, the sound template corresponding to the phonetic feature sequence as the approximate template, and can then complete speech recognition according to the approximate template.

For example, the preset template sequence is sound template A, sound template B;

the cloud server determines the unprocessed set including sound template A and sound template B, and takes sound template A as the current sound template according to the template sequence;

the cloud server detects that the similarity between the phonetic feature sequence and sound template A is 99%;

the cloud server detects that the similarity between the phonetic feature sequence and sound template B is 0;

according to the similarities, sound template A is determined to be the approximate template.
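The loop of D0 to D3 can be sketched as follows. The description does not specify how similarity is measured, so this sketch assumes cosine similarity over flat feature vectors of equal length; a practical system would also need to align sequences of different lengths. The names `cosine_similarity` and `find_approximate_template` and the toy vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def find_approximate_template(features, templates, template_order):
    """Return (name, similarity) of the most similar template, visiting them in order."""
    unprocessed = list(template_order)              # D0: build the unprocessed set
    best_name, best_score = None, float("-inf")
    while unprocessed:                              # D2: stop when the set is empty
        current = unprocessed.pop(0)                # D1: take and delete the current template
        score = cosine_similarity(features, templates[current])
        if score > best_score:
            best_name, best_score = current, score
    return best_name, best_score                    # D3: highest similarity wins


templates = {"A": [1.0, 0.0, 1.0], "B": [0.0, 1.0, 0.0]}
name, score = find_approximate_template([0.9, 0.1, 1.0], templates, ["A", "B"])
print(name, round(score, 3))
```

Every template is compared exactly once and removed from the unprocessed set as it is handled, so the loop ends when the set is empty, as in D2.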

Step 610: the cloud server determines, from the stored command words, the control command word corresponding to the approximate template.

Specifically, after determining the approximate template, the cloud server can determine the command words corresponding to the approximate template and thereby determine the control command word, so that speech recognition is completed according to the control command word.

For example, the command words corresponding to sound template A are "search", "Liu" and "lustily water";

therefore, the control command word is "search Liu lustily water".

Step 611: the cloud server analyzes the semantics of the control command word and obtains the semantic representation of the control command word, wherein the semantic representation includes the domain, intent and word slot corresponding to the control command word.

Specifically, after determining the control command word, the cloud server performs semantic understanding, that is, determines the domain, intent and word slot through syntactic, semantic and pragmatic analysis, so as to obtain the semantic representation of the control command word.

For example, the cloud server determines that, for the control command word "search Liu lustily water", the domain is "music", the intent is "search music", and the word slots are "Liu" and "lustily water".

Step 612: the cloud server encapsulates the semantic representation as a JSON data packet and sends the JSON data packet to the intelligent refrigerator.

Specifically, after the semantic representation of the control command word is determined, it can be encapsulated as a JSON data packet to facilitate data transmission as well as reading and parsing by the intelligent refrigerator.

For example, the cloud server encapsulates the domain "music", the intent "search music" and the word slots "Liu" and "lustily water" as a JSON data packet, and sends the JSON data packet to the intelligent refrigerator.
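The encapsulation of step 612 can be sketched as follows. The description states only that the semantic representation is encapsulated as a JSON data packet; the key names `domain`, `intent` and `slots` and the function name `encapsulate` are assumptions made for illustration.

```python
import json


def encapsulate(domain, intent, slots):
    """Serialize a semantic representation as a JSON string (key names are assumed)."""
    representation = {"domain": domain, "intent": intent, "slots": slots}
    # ensure_ascii=False keeps non-ASCII command words readable in the packet.
    return json.dumps(representation, ensure_ascii=False)


packet = encapsulate("music", "search music", ["Liu", "lustily water"])
print(packet)
```

A flat object with one list-valued field is enough here because the example carries a single intent with two word slots.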

Step 613: the intelligent refrigerator receives the JSON data packet sent by the cloud server, and parses it to obtain the semantic representation in the JSON data packet.

Specifically, after receiving the JSON data packet sent by the cloud server, the intelligent refrigerator can determine the intent of the user's phonetic instruction by parsing the packet, that is, by obtaining the semantic representation in the JSON data packet.

For example, the intelligent refrigerator receives the JSON data packet sent by the cloud server, and parses it to obtain the semantic representation in which the domain is "music", the intent is "search music", and the word slots are "Liu" and "lustily water".

Step 614: the intelligent refrigerator executes an operation according to the obtained semantic representation.

Specifically, after obtaining the semantic representation in the JSON data packet, the intelligent refrigerator can execute the corresponding operation flow according to the semantic representation.

For example, the intelligent refrigerator executes the search for Liu's "lustily water" according to the semantic representation.
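Parsing the JSON data packet and dispatching to an operation on the refrigerator side can be sketched as follows. The key names `domain`, `intent` and `slots`, the handler table `HANDLERS` and the function `search_music` are all hypothetical; the description states only that the refrigerator parses the packet and executes the corresponding operation.

```python
import json


def search_music(slots):
    """Hypothetical handler: report the search that would be performed."""
    return "searching music for: " + " ".join(slots)


# Hypothetical dispatch table keyed by (domain, intent).
HANDLERS = {("music", "search music"): search_music}


def execute(packet):
    """Parse a JSON data packet and run the handler matching its domain and intent."""
    representation = json.loads(packet)
    handler = HANDLERS.get((representation["domain"], representation["intent"]))
    if handler is None:
        return "unsupported instruction"
    return handler(representation["slots"])


result = execute('{"domain": "music", "intent": "search music", '
                 '"slots": ["Liu", "lustily water"]}')
print(result)
```

Keying the table on the domain and intent together lets new operations be added without changing the parsing code.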

The embodiments of the present invention have at least the following advantages:

1. In an embodiment of the present invention, the intelligent refrigerator does not need to perform speech recognition after receiving the voice signal carrying the phonetic instruction. It only needs to pre-process the voice signal, that is, convert the voice signal into a digital signal and extract the phonetic feature sequence of the phonetic instruction from the digital signal, and then send the phonetic feature sequence to the external cloud server for processing. After receiving the semantic representation sent by the external cloud server, it can execute the operation corresponding to the phonetic instruction according to the semantic representation. Because the intelligent refrigerator does not need to store each offline command word locally, its CPU and memory usage is reduced, and the situation is avoided in which limited local storage restricts each function to a single fixed command word, so that a phonetic instruction outside the offline command words cannot be recognized. User experience is thereby improved.

2. In an embodiment of the present invention, the intelligent refrigerator frames the digital signal according to a preset frame length and in time order, which prevents the data volume of the digital signal from becoming excessive and making processing more difficult. Determining the starting point and terminating point of the phonetic instruction not only reduces the amount of data collected for speech recognition and saves processing time, but also excludes interference from silent segments or noise segments and improves speech recognition performance.

3. In an embodiment of the present invention, by determining the short-time energy value and zero-crossing rate value of each frame of digital signal, the intelligent refrigerator combines the advantages of both measures: short-time energy is used to detect voiced sounds, which have a relatively large energy spectrum in the digital signal, and the zero-crossing rate is used to distinguish unvoiced sounds from noise. The starting point and terminating point of the phonetic instruction are then determined, and the phonetic feature sequence is extracted from the frames of digital signal between the starting point and the terminating point. This reduces the amount of data collected and saves processing time while also excluding interference from silent segments or noise segments and improving speech recognition performance.

4. In an embodiment of the present invention, the intelligent refrigerator pre-processes the digital signal by enhancing its high-frequency component, so that the spectrum of the digital signal becomes flatter and the spectral tilt is removed. This compensates for the high-frequency portion of the digital signal that is suppressed by the vocal system, helps to improve the signal-to-noise ratio of the digital signal, and removes the influence of glottal excitation and mouth-and-nose radiation.

5. In an embodiment of the present invention, after receiving the voice signal, the intelligent refrigerator uses the microphone array to perform noise reduction and echo cancellation and filters out background noise, thereby removing the interference in the voice signal so that the phonetic instruction in the voice signal is recognized more accurately.

6. In an embodiment of the present invention, the cloud server stores in advance at least one command word and the sound template corresponding to each command word, so the external intelligent refrigerator does not need to perform speech recognition after receiving the voice signal; recognition is performed by the cloud server instead, which reduces CPU and memory consumption on the external intelligent refrigerator. After receiving the phonetic feature sequence sent by the external intelligent refrigerator, the cloud server determines, from the stored sound templates, the approximate template corresponding to the phonetic feature sequence, determines from the approximate template the control command word corresponding to the phonetic feature sequence, and then sends the semantic representation of the control command word to the external intelligent refrigerator, which executes an operation according to the semantic representation. In summary, because the digital signal is processed by the cloud server, the situation is avoided in which limited local storage on the external intelligent refrigerator restricts each function to a single fixed command word, so that a phonetic instruction outside the offline command words cannot be recognized. User experience is thereby improved.

7. In an embodiment of the present invention, after obtaining the semantic representation of the control command word, the cloud server does not send the semantic representation directly to the external intelligent refrigerator. It first encapsulates the semantic representation as a JSON data packet and then returns the packet to the intelligent refrigerator, which facilitates transmission of the semantic representation as well as reading and parsing by the external intelligent refrigerator.

It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the statement "including a ..." does not exclude the existence of other identical elements in the process, method, article or device that includes the element.

Finally, it should be noted that the foregoing describes only preferred embodiments of the present invention, is intended only to illustrate the technical solution of the present invention, and is not intended to limit the scope of protection of the present invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention falls within the scope of protection of the present invention.

Claims

1. An intelligent refrigerator control method, characterized in that it is applied to an intelligent refrigerator and comprises:

receiving a voice signal, input by a user, carrying a phonetic instruction;

converting the voice signal into a digital signal;

extracting a phonetic feature sequence from the digital signal;

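sending the phonetic feature sequence to an external cloud server;

receiving and parsing a semantic representation sent by the external cloud server according to the phonetic feature sequence;

executing an operation according to the parsed semantic representation.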

2. The method according to claim 1, characterized in that,

after converting the voice signal into the digital signal and before extracting the phonetic feature sequence from the digital signal, the method further comprises:

detecting, from the divided frames of digital signal, a starting digital signal that is the starting point of the phonetic instruction and a terminating digital signal that is the terminating point of the phonetic instruction;

extracting, in the time order, characteristic parameters successively from each frame of digital signal from the starting digital signal to the terminating digital signal, and forming the phonetic feature sequence.

3. The method according to claim 2, characterized in that,

detecting, from the divided frames of digital signal, the starting digital signal that is the starting point of the phonetic instruction and the terminating digital signal that is the terminating point of the phonetic instruction comprises:

S0: determining whether the short-time energy value of the current digital signal is greater than or equal to a preset first energy threshold; if so, executing S1; otherwise, executing S5;

S1: determining whether the zero-crossing rate value of the current digital signal is greater than or equal to a preset first zero-crossing rate threshold; if so, determining the current digital signal as the starting digital signal that is the starting point of the phonetic instruction, and executing S2 and S3; otherwise, executing S2 and S0;

S2: taking, in the time order, the frame of digital signal following the current digital signal as the current digital signal;

S3: determining whether the short-time energy value of the current digital signal is less than a preset second energy threshold; if so, executing S4; otherwise, executing S2 and S3;

S4: determining whether the zero-crossing rate value of the current digital signal is less than a preset second zero-crossing rate threshold; if so, determining the current digital signal as the terminating digital signal that is the terminating point of the phonetic instruction; otherwise, executing S2 and S3.

4. The method according to claim 3, characterized in that,

determining the short-time energy value and zero-crossing rate value of each frame of digital signal comprises:

a first formula:

wherein X_i(m) denotes the i-th frame of digital signal at input value m, E_i denotes the short-time energy value of X_i(m), and N denotes the frame length;

a second formula:

wherein sgn[ ] is the sign function, namely:

wherein X_i(m) denotes the i-th frame of digital signal at input value m, L_i denotes the zero-crossing rate value of X_i(m), and N denotes the frame length.

5. The method according to any one of claims 1 to 4, characterized in that,

after converting the voice signal into the digital signal, the method further comprises:

enhancing the high-frequency component in the digital signal;

then,

extracting the phonetic feature sequence from the digital signal after the high-frequency component is enhanced;

and/or

receiving and parsing the semantic representation sent by the external cloud server according to the phonetic feature sequence comprises:

receiving a JavaScript Object Notation (JSON) data packet sent by the external cloud server according to the phonetic feature sequence;

parsing the JSON data packet to obtain the semantic representation in the JSON data packet;

and/or

after receiving the voice signal, input by the user, carrying the phonetic instruction, and before converting the voice signal into the digital signal, the method further comprises:

removing an interference signal from the voice signal, wherein the interference signal includes a noise signal and/or an echo signal;

Then,

6. An intelligent refrigerator control method, characterized in that it is applied to a cloud server and comprises:

storing in advance at least one command word and at least one sound template, wherein the sound template includes at least one speech characteristic parameter corresponding to at least one of the command words;

analyzing the semantics of the control command word and obtaining the semantic representation of the control command word, wherein the semantic representation includes the domain, intent and word slot corresponding to the control command word;

sending the semantic representation to the external intelligent refrigerator.

7. The method according to claim 6, characterized in that,

judging whether the stored sound templates include an approximate template corresponding to the phonetic feature sequence comprises:

D0: determining an unprocessed set, and taking the first sound template in a preset template sequence as the current sound template, wherein the unprocessed set includes the at least one sound template;

D1: detecting the similarity between the phonetic feature sequence and the current sound template, deleting the current sound template from the unprocessed set, and executing D2;

D2: judging whether the number of sound templates in the unprocessed set is 0; if so, executing D3; otherwise, taking the sound template following the current sound template in the template sequence as the current sound template, and returning to D1;

D3: determining the sound template with the highest similarity to the phonetic feature sequence as the approximate template, and executing the step of determining, from the stored command words, the control command word corresponding to the approximate template;

and/or

sending the semantic representation to the external intelligent refrigerator comprises:

encapsulating the semantic representation as a JSON data packet;

sending the JSON data packet to the external intelligent refrigerator.

8. An intelligent refrigerator, characterized by comprising:

a communication processing unit, configured to receive a voice signal, input by a user, carrying a phonetic instruction; to send the phonetic feature sequence extracted by the feature extraction unit to an external cloud server; and to receive and parse the semantic representation sent by the external cloud server according to the phonetic feature sequence extracted by the feature extraction unit;

the feature extraction unit, configured to extract the phonetic feature sequence from the digital signal converted by the signal processing unit;

9. A cloud server, characterized by comprising:

a cloud storage unit, configured to store in advance at least one command word and the sound template corresponding to the at least one command word, wherein the sound template includes the speech characteristic parameters corresponding to the command word;

a cloud interactive unit, configured to receive the phonetic feature sequence sent by an external intelligent refrigerator, and to send the semantic representation obtained by the cloud processing unit to the external intelligent refrigerator;

the cloud processing unit, configured to judge whether the sound templates stored by the cloud storage unit include an approximate template corresponding to the phonetic feature sequence received by the cloud interactive unit; if so, to determine, from the stored command words, the control command word corresponding to the approximate template; and to analyze the semantics of the control command word and obtain the semantic representation of the control command word, wherein the semantic representation includes the domain, intent and word slot corresponding to the control command word.

10. An intelligent refrigerator control system, characterized by comprising: at least one intelligent refrigerator according to claim 8 and the cloud server according to claim 9.