CN104536978A - Voice data identifying method and device - Google Patents

Voice data identifying method and device

Info

Publication number
CN104536978A
Authority
CN
China
Prior art keywords
speech data to be recognized
control instruction
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201410736576.4A
Other languages
Chinese (zh)
Inventor
丁小燕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chery Automobile Co Ltd
Original Assignee
Chery Automobile Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chery Automobile Co Ltd filed Critical Chery Automobile Co Ltd
Priority to CN201410736576.4A
Publication of CN104536978A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/26 - Speech to text systems
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 - Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/40 - Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F 16/48 - Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/28 - Constructional details of speech recognition systems
    • G10L 15/30 - Distributed recognition, e.g. in client-server systems, for mobile phones or network applications

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Theoretical Computer Science (AREA)
  • Library & Information Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The invention discloses a voice data identifying method and device, and belongs to the technical field of vehicle-mounted voice recognition. The method includes: receiving speech data to be recognized that is input by a user, sending the speech data to be recognized to a speech recognition server, and recording the sending time of the speech data to be recognized; dividing the speech data to be recognized into multiple data segments of a preset duration according to its sampling time, matching the speech feature information of each data segment against pre-stored reference information, determining first reference information that matches first speech feature information in the speech data to be recognized, and obtaining, based on the determined first reference information, a first control instruction corresponding to the speech data to be recognized; and, if no recognition message sent by the speech recognition server is received, determining the first control instruction as the recognition result of the speech data to be recognized. The method and device can improve the flexibility of speech data recognition.

Description

Method and apparatus for recognizing speech data
Technical field
The present invention relates to the technical field of vehicle-mounted voice recognition, and in particular to a method and apparatus for recognizing speech data.
Background technology
With the rapid development of automotive electronic technology, in-vehicle entertainment functions are becoming increasingly rich, and the operations for controlling them are becoming increasingly complex. Manually operating and controlling each entertainment function distracts the driver while driving and endangers driving safety. This driving safety problem can be alleviated to some extent by speech recognition technology.
The commonly used speech recognition technology is based on local instruction words: multiple instruction words are preset locally. When the driver needs to start a certain function of the vehicle, the driver inputs the speech data of the corresponding instruction word to a speech device. When the speech device receives the speech data of the instruction word, it converts the speech data into text information and compares the text information with the locally stored instruction words. If the locally stored instruction words include the text information, the instruction word corresponding to the speech data is determined as the recognition result, and the recognition result can then be output and responded to.
In the process of implementing the present invention, the inventor found that the prior art has at least the following problem:
Since the speech recognition technology based on local instruction words can only recognize speech data of preset instruction words, the driver needs to memorize a large number of instruction words. If the driver inputs speech data that is not an instruction word, no recognition result can be obtained by the above method, so the flexibility of speech data recognition is poor.
Summary of the invention
In order to solve the problems of the prior art, the embodiments of the present invention provide a method and apparatus for recognizing speech data. The technical solutions are as follows:
According to a first aspect, a method for recognizing speech data is provided, the method comprising:
receiving speech data to be recognized that is input by a user, sending the speech data to be recognized to a speech recognition server, and recording the sending time of the speech data to be recognized;
dividing the speech data to be recognized into multiple data segments of a preset duration according to the sampling time of the speech data to be recognized, matching the obtained speech feature information of each data segment against pre-stored reference information, determining first reference information that matches first speech feature information in the speech data to be recognized, and obtaining, based on the determined first reference information, a first control instruction corresponding to the speech data to be recognized;
if, within a preset duration from the sending time, no recognition message carrying a second control instruction is received from the speech recognition server, determining the first control instruction as the recognition result of the speech data to be recognized; and if, within the preset duration from the sending time, a recognition message carrying a second control instruction is received from the speech recognition server, determining the second control instruction as the recognition result of the speech data to be recognized.
Optionally, the method further comprises:
obtaining the confidence of the first control instruction according to the matching degree between the first speech feature information and the first reference information;
wherein determining the first control instruction as the recognition result of the speech data to be recognized if, within the preset duration from the sending time, no recognition message carrying a second control instruction is received from the speech recognition server comprises:
if, within the preset duration from the sending time, no recognition message carrying a second control instruction is received from the speech recognition server and the confidence of the first control instruction is not less than a preset confidence threshold, determining the first control instruction as the recognition result of the speech data to be recognized.
Optionally, the method further comprises:
if, within the preset duration from the sending time, no recognition message carrying a second control instruction is received from the speech recognition server but the confidence of the first control instruction is less than the preset confidence threshold, sending a prompt message indicating that recognition of the speech data to be recognized has failed.
Optionally, the receiving speech data to be recognized that is input by a user comprises:
when a speech input request is received, receiving speech data input by the user, and when the duration after the user stops inputting reaches a preset receiving duration threshold, determining the speech data input before the user stopped inputting as the speech data to be recognized.
Optionally, the method further comprises:
sending a prompt message of the determined recognition result of the speech data to be recognized.
According to a second aspect, an apparatus for recognizing speech data is provided, the apparatus comprising:
a transceiver module, configured to receive speech data to be recognized that is input by a user, send the speech data to be recognized to a speech recognition server, and record the sending time of the speech data to be recognized;
a first acquisition module, configured to divide the speech data to be recognized into multiple data segments of a preset duration according to the sampling time of the speech data to be recognized, match the obtained speech feature information of each data segment against pre-stored reference information, determine first reference information that matches first speech feature information in the speech data to be recognized, and obtain, based on the determined first reference information, a first control instruction corresponding to the speech data to be recognized;
a determination module, configured to: if, within a preset duration from the sending time, no recognition message carrying a second control instruction is received from the speech recognition server, determine the first control instruction as the recognition result of the speech data to be recognized; and if, within the preset duration from the sending time, a recognition message carrying a second control instruction is received from the speech recognition server, determine the second control instruction as the recognition result of the speech data to be recognized.
Optionally, the apparatus further comprises a second acquisition module, configured to:
obtain the confidence of the first control instruction according to the matching degree between the first speech feature information and the first reference information;
and the determination module is configured to:
if, within the preset duration from the sending time, no recognition message carrying a second control instruction is received from the speech recognition server and the confidence of the first control instruction is not less than a preset confidence threshold, determine the first control instruction as the recognition result of the speech data to be recognized.
Optionally, the apparatus further comprises a first prompting module, configured to:
if, within the preset duration from the sending time, no recognition message carrying a second control instruction is received from the speech recognition server but the confidence of the first control instruction is less than the preset confidence threshold, send a prompt message indicating that recognition of the speech data to be recognized has failed.
Optionally, the transceiver module is configured to:
when a speech input request is received, receive speech data input by the user, and when the duration after the user stops inputting reaches a preset receiving duration threshold, determine the speech data input before the user stopped inputting as the speech data to be recognized.
Optionally, the apparatus further comprises a second prompting module, configured to:
send a prompt message of the determined recognition result of the speech data to be recognized.
The beneficial effects brought by the technical solutions provided by the embodiments of the present invention are as follows:
In the embodiments of the present invention, speech data to be recognized that is input by a user is received, the speech data to be recognized is sent to a speech recognition server, and the sending time of the speech data to be recognized is recorded; the speech data to be recognized is divided into multiple data segments of a preset duration according to its sampling time, the obtained speech feature information of each data segment is matched against pre-stored reference information, first reference information that matches first speech feature information in the speech data to be recognized is determined, and a first control instruction corresponding to the speech data to be recognized is obtained based on the determined first reference information; if, within a preset duration from the sending time, no recognition message carrying a second control instruction is received from the speech recognition server, the first control instruction is determined as the recognition result of the speech data to be recognized, and if, within the preset duration from the sending time, a recognition message carrying a second control instruction is received from the speech recognition server, the second control instruction is determined as the recognition result of the speech data to be recognized. In this way, the local recognition approach and the recognition approach of the speech recognition server can be combined, the recognition result of each approach is obtained separately, and one of the results is selected as the recognition result of the speech data to be recognized, without requiring the user to memorize a large number of instruction words, thereby improving the flexibility of speech data recognition.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Apparently, the accompanying drawings in the following description show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from these accompanying drawings without creative effort.
Fig. 1 is a flowchart of a method for recognizing speech data according to an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a system according to an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of an apparatus for recognizing speech data according to an embodiment of the present invention;
Fig. 4 is a schematic structural diagram of a speech recognition device according to an embodiment of the present invention.
Detailed description of the embodiments
To make the objectives, technical solutions, and advantages of the present invention clearer, the embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Embodiment one
An embodiment of the present invention provides a method for recognizing speech data. As shown in Fig. 1, the processing flow of the method may include the following steps:
Step 101: receive speech data to be recognized that is input by a user, send the speech data to be recognized to a speech recognition server, and record the sending time of the speech data to be recognized.
Step 102: divide the speech data to be recognized into multiple data segments of a preset duration according to the sampling time of the speech data to be recognized, match the obtained speech feature information of each data segment against pre-stored reference information, determine first reference information that matches first speech feature information in the speech data to be recognized, and obtain, based on the determined first reference information, a first control instruction corresponding to the speech data to be recognized.
Step 103: if, within a preset duration from the sending time, no recognition message carrying a second control instruction is received from the speech recognition server, determine the first control instruction as the recognition result of the speech data to be recognized; if, within the preset duration from the sending time, a recognition message carrying a second control instruction is received from the speech recognition server, determine the second control instruction as the recognition result of the speech data to be recognized.
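For illustration only, the following is a minimal sketch of how steps 101 to 103 could be combined on an in-vehicle device. The function names, the preset timeout value, the injected `local_recognizer` callable, and the queue-based `server.results` interface are assumptions of this example and not part of the claimed implementation.

```python
import queue
import time

PRESET_TIMEOUT_S = 2.0  # assumed value for the preset duration counted from the sending time


def recognize(speech_data, server, local_recognizer):
    """Combine local matching (step 102) with server-side recognition (steps 101 and 103)."""
    # Step 101: forward the speech data to the server and record the sending time.
    server.send(speech_data)
    sending_time = time.time()

    # Step 102: obtain the first control instruction from the local matcher.
    first_instruction = local_recognizer(speech_data)

    # Step 103: wait, up to the preset duration from the sending time, for the
    # recognition message carrying the second control instruction.
    remaining = PRESET_TIMEOUT_S - (time.time() - sending_time)
    try:
        second_instruction = server.results.get(timeout=max(remaining, 0))
        return second_instruction   # server result received in time
    except queue.Empty:
        return first_instruction    # no server reply: fall back to the local result
```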
In the embodiment of the present invention, speech data to be recognized that is input by a user is received, the speech data to be recognized is sent to a speech recognition server, and the sending time of the speech data to be recognized is recorded; the speech data to be recognized is divided into multiple data segments of a preset duration according to its sampling time, the obtained speech feature information of each data segment is matched against pre-stored reference information, first reference information that matches first speech feature information in the speech data to be recognized is determined, and a first control instruction corresponding to the speech data to be recognized is obtained based on the determined first reference information; if, within a preset duration from the sending time, no recognition message carrying a second control instruction is received from the speech recognition server, the first control instruction is determined as the recognition result of the speech data to be recognized, and if, within the preset duration from the sending time, a recognition message carrying a second control instruction is received from the speech recognition server, the second control instruction is determined as the recognition result of the speech data to be recognized. In this way, the local recognition approach and the recognition approach of the speech recognition server can be combined, the recognition result of each approach is obtained separately, and one of the results is selected as the recognition result of the speech data to be recognized, without requiring the user to memorize a large number of instruction words, thereby improving the flexibility of speech data recognition.
Embodiment two
An embodiment of the present invention provides a method for recognizing speech data. The method may be implemented by a speech recognition device, which may be any device having a speech recognition function.
The processing flow shown in Fig. 1 is described in detail below with reference to specific embodiments, and the content may be as follows:
Step 101: receive speech data to be recognized that is input by a user, send the speech data to be recognized to a speech recognition server, and record the sending time of the speech data to be recognized.
In implementation, speech recognition and processing technology has received wide attention as a human-machine interface in information technology, and its application in electronic products makes people's lives more convenient. Through voice commands, people can make a controlled system or device perform the operations corresponding to the voice instructions. Speech recognition can be applied in many fields; for example, when speech recognition technology is applied on a vehicle platform, it makes driving simpler and more flexible, safer and more comfortable. In the embodiments of the present invention, the solution is described in detail by taking speech recognition on a vehicle platform as an example; other application fields are similar and are not repeated here.
With the development of the automobile industry and the popularization of automobiles, people have higher requirements for the safety and convenience of automobiles. As a result, more and more functions are added to automobiles, which become increasingly intelligent, and vehicle-mounted voice has become an important part of the in-vehicle system: the various functions of the in-vehicle system can be controlled by the user's voice. Specifically, as shown in Fig. 2, a device for recognizing speech data may be installed in the automobile, and a speech recognition button may be provided on the device. When the user needs to perform a certain operation, for example starting the navigator for navigation, the user may click the speech recognition button; the device then generates a speech input request and turns on its microphone. After the microphone is turned on successfully, a prompt message may be played through the loudspeaker to prompt the user to input speech data. The user may then speak to the device; the device receives the speech, which is an analog signal, and converts the analog signal into a digital signal through the microphone. After finishing the input, the user may click the button again. If the device cannot receive the speech data, for example because the user's voice is too low, the device may send a prompt message indicating that receiving the speech data has failed. If the device does receive speech data, the received speech data may be determined as the speech data to be recognized. In order to make the recognition result of the speech data to be recognized more accurate, the device may send the speech data to be recognized to a speech recognition server through its own wireless communication component, so that the speech recognition server recognizes the speech data to be recognized when receiving it. When sending the speech data to be recognized to the speech recognition server, the device may record the sending time of the speech data to be recognized.
Optionally, the processing of receiving the speech data to be recognized input by the user may be implemented in various ways. An optional way is provided below and may specifically include the following content: when a speech input request is received, receiving speech data input by the user, and when the duration after the user stops inputting reaches a preset receiving duration threshold, determining the speech data input before the user stopped inputting as the speech data to be recognized.
In implementation, the user may click the speech recognition button, and the device generates a speech input request and turns on its microphone according to the speech input request. After the microphone is turned on successfully, a pre-stored voice prompt may be played to prompt the user to input speech data, and the user may then input speech data to the device. In order to determine the end time of the user's speech input, a duration threshold (that is, a receiving duration threshold) may be preset; when the duration after the user stops inputting reaches the receiving duration threshold, the speech data input before the user stopped inputting may be determined as the speech data to be recognized.
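As a rough illustration of this end-of-input detection, the sketch below buffers audio frames and closes the utterance once silence has lasted for the receiving duration threshold. The frame length, the energy-based `is_silence` test, and the threshold value are assumptions of this example only; a real device would typically use a proper voice activity detector.

```python
RECEIVING_DURATION_THRESHOLD_S = 0.8   # assumed preset receiving duration threshold
FRAME_S = 0.02                          # assumed 20 ms audio frames


def is_silence(frame, energy_threshold=1e-3):
    """Crude energy-based silence test (placeholder for a real voice activity detector)."""
    return sum(sample * sample for sample in frame) / len(frame) < energy_threshold


def capture_utterance(frames):
    """Collect frames until silence has lasted longer than the receiving duration threshold."""
    captured, silent_time = [], 0.0
    for frame in frames:                   # frames: iterable of lists of float samples
        if is_silence(frame):
            silent_time += FRAME_S
            if captured and silent_time >= RECEIVING_DURATION_THRESHOLD_S:
                break                      # the user has stopped inputting
        else:
            silent_time = 0.0
            captured.append(frame)
    return captured                        # the speech data to be recognized
```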
Step 102: divide the speech data to be recognized into multiple data segments of a preset duration according to the sampling time of the speech data to be recognized, match the obtained speech feature information of each data segment against pre-stored reference information, determine first reference information that matches first speech feature information in the speech data to be recognized, and obtain, based on the determined first reference information, a first control instruction corresponding to the speech data to be recognized.
The first speech feature information may be any piece of speech feature information, and the first reference information may be any piece of reference information.
In implementation, the above device may pre-process the speech data to be recognized, for example by sampling it (the sampling frequency may be 10 kHz or 16 kHz, etc.), performing anti-aliasing filtering, and removing the effects of glottal excitation and noise. The device may then perform feature extraction on the processed speech data to be recognized. The purpose of feature extraction is to extract, from the waveform of the speech data, one or more groups of parameters that describe the characteristics of the speech data, such as the average energy, zero-crossing rate, formants, cepstrum, and linear prediction coefficients, for subsequent speech training and recognition; the choice of parameters directly affects the recognition rate of the speech recognition device. The detailed process may be as follows: a speech signal can usually be regarded as short-term stationary; for example, within a preset time period (such as 10-20 ms), its spectral characteristics and some physical characteristic parameters can be regarded as approximately constant, so the analysis and processing methods for stationary processes can be used to process the speech data to be recognized. Specifically, the speech data to be recognized is divided into multiple data segments of a preset duration, and endpoint detection may be performed on each data segment; endpoint detection means determining the starting point and end point of speech within a piece of data containing speech. Multiple pieces of reference information may be pre-stored in the device, where the reference information is obtained by training on a large amount of speech data processed in the above way. The device may obtain the speech feature information of each data segment and match it against the pre-stored reference information to obtain the reference information that matches the speech feature information. If the first speech feature information in the speech data to be recognized matches the first reference information, the device may perform semantic understanding based on the first reference information, so as to obtain the recognition result of the device for the speech data to be recognized, and the device may generate a corresponding control instruction (that is, the first control instruction) according to the recognition result.
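A minimal sketch of this segmentation and matching step is given below, assuming a 16 kHz sampling rate, a toy feature vector of average energy and zero-crossing rate, and a nearest-neighbour match against the pre-stored reference entries; the real device could use any of the feature sets and matching methods mentioned above.

```python
import math

SAMPLE_RATE = 16000          # assumed sampling frequency
SEGMENT_S = 0.02             # assumed preset segment duration (20 ms)


def split_into_segments(samples, segment_s=SEGMENT_S, rate=SAMPLE_RATE):
    """Divide the speech data to be recognized into data segments of a preset duration."""
    n = int(segment_s * rate)
    return [samples[i:i + n] for i in range(0, len(samples) - n + 1, n)]


def features(segment):
    """Toy speech feature information: average energy and zero-crossing rate."""
    energy = sum(s * s for s in segment) / len(segment)
    zcr = sum(1 for a, b in zip(segment, segment[1:]) if a * b < 0) / len(segment)
    return (energy, zcr)


def best_match(feature_vec, reference_info):
    """Return the control instruction and distance of the closest pre-stored reference entry.

    reference_info is assumed to be a list of dicts like
    {"features": (energy, zcr), "instruction": "open_navigation"}.
    """
    ref = min(reference_info, key=lambda r: math.dist(feature_vec, r["features"]))
    return ref["instruction"], math.dist(feature_vec, ref["features"])
```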
The device may also recognize the speech data to be recognized in other ways, for example based on an acoustic channel model and phonetic knowledge, or using an artificial neural network. These methods can be handled in the manner of the prior art and are not repeated here.
The speech recognition server may recognize the speech data to be recognized by the above method. Because the amount of reference information pre-stored in the speech recognition server is much larger than the amount of reference information stored in the device, the recognition result obtained by the speech recognition server for the speech data to be recognized is usually more accurate. Its specific processing is the same as the related content above and is not repeated here.
Optionally, because the recognition result obtained when the device recognizes the speech data to be recognized may be inaccurate, the accuracy of the recognition result can be described in some way. The corresponding processing can be implemented in various ways; an optional way is provided below and may include the following content: obtaining the confidence of the first control instruction according to the matching degree between the first speech feature information and the first reference information.
In implementation, because the speech data may be affected by noise and other factors, there is a difference between the speech feature information of the speech data and the reference information. After the device determines the first reference information corresponding to the first speech feature information, it may calculate the matching degree between the first speech feature information and the first reference information. The matching degree may be the proportion of features shared by the first reference information and the first speech feature information among the features of the first speech feature information, and the device may use this proportion as the confidence of the first control instruction.
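The following is a sketch of that confidence measure under the assumption that each piece of feature information is represented as a set of discrete feature labels; with continuous feature vectors the proportion would have to be replaced by some other similarity score.

```python
def confidence(first_feature_labels, first_reference_labels):
    """Proportion of the input's feature labels that also appear in the reference entry."""
    first_feature_labels = set(first_feature_labels)
    if not first_feature_labels:
        return 0.0
    shared = first_feature_labels & set(first_reference_labels)
    return len(shared) / len(first_feature_labels)


# Example: 3 of the 4 feature labels match, so the confidence is 0.75.
print(confidence({"f1", "f2", "f3", "f4"}, {"f1", "f2", "f3", "f9"}))
```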
Step 103: if, within a preset duration from the sending time, no recognition message carrying a second control instruction is received from the speech recognition server, determine the first control instruction as the recognition result of the speech data to be recognized; if, within the preset duration from the sending time, a recognition message carrying a second control instruction is received from the speech recognition server, determine the second control instruction as the recognition result of the speech data to be recognized.
In implementation, the device may often be unable to connect to the speech recognition server, so the speech recognition server cannot send its recognition result to the device in time. In order to feed the recognition result of the speech data to be recognized back to the user in time, that is, to execute the corresponding control instruction in time, a certain duration starting from the sending time may be preset. If the device does not receive, within the preset duration, a recognition message carrying a second control instruction sent by the speech recognition server, the device may determine the first control instruction as the recognition result of the speech data to be recognized; if such a recognition message is received within the preset duration, the device may determine the second control instruction as the recognition result of the speech data to be recognized. In addition, if a recognition message carrying a second control instruction is received within the preset duration, the device may also select, through some recognition result selection method, one control instruction from the first control instruction and the second control instruction as the recognition result of the speech data to be recognized.
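Where both instructions are available within the preset duration, one possible selection rule, given here purely as an assumption for illustration and not specified by the embodiment, is to prefer the server's second control instruction unless the local result is markedly more confident:

```python
def select_result(first_instruction, first_confidence,
                  second_instruction, second_confidence=None, margin=0.2):
    """Pick between the local and server control instructions once both may have arrived."""
    if second_instruction is None:
        return first_instruction          # server timed out: use the local result
    if second_confidence is None:
        return second_instruction         # server reports no score: trust the server
    if first_confidence >= second_confidence + margin:
        return first_instruction          # local result is clearly more confident
    return second_instruction
```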
Optionally, for the case in step 103 where the first control instruction is determined as the recognition result of the speech data to be recognized, the device may further use the confidence to judge whether the first control instruction can be used as the recognition result of the speech data to be recognized. This may specifically include the following content: if, within the preset duration from the sending time, no recognition message carrying a second control instruction is received from the speech recognition server, and the confidence of the first control instruction is not less than a preset confidence threshold, determining the first control instruction as the recognition result of the speech data to be recognized.
In implementation, a confidence threshold for the control instruction determined by the device may be preset in the device. If, within the preset duration from the sending time, no recognition message carrying a second control instruction is received from the speech recognition server, the device may obtain the confidence of the first control instruction and compare it with the confidence threshold. If the confidence of the first control instruction is greater than or equal to the preset confidence threshold, the first control instruction is determined as the recognition result of the speech data to be recognized.
Optionally, if, within the preset duration from the sending time, no recognition message carrying a second control instruction is received from the speech recognition server, but the confidence of the first control instruction is less than the preset confidence threshold, a prompt message indicating that recognition of the speech data to be recognized has failed is sent.
In implementation, if, within the preset duration from the sending time, no recognition message carrying a second control instruction is received from the speech recognition server, the device may obtain the confidence of the first control instruction and compare it with the confidence threshold. If the confidence of the first control instruction is less than the preset confidence threshold, the device may play a preset voice to prompt the user that the device has failed to recognize the speech data to be recognized.
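Combining the two optional checks above, a sketch of the local fallback branch might look as follows; the threshold value and the `play_prompt` callback are assumptions of this example.

```python
CONFIDENCE_THRESHOLD = 0.6   # assumed preset confidence threshold


def local_fallback(first_instruction, first_confidence, play_prompt):
    """Handle the case where no server recognition message arrived within the preset duration."""
    if first_confidence >= CONFIDENCE_THRESHOLD:
        return first_instruction                       # accept the local result
    play_prompt("Recognition of the speech data failed, please try again.")
    return None                                        # no recognition result
```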
Optionally, after the device determines the recognition result of the speech data to be recognized, it may send a prompt message to the user. The processing may include the following content: sending a prompt message of the determined recognition result of the speech data to be recognized.
In implementation, when the device determines the recognition result of the speech data to be recognized, it may play a pre-stored voice through the loudspeaker to prompt the user with the recognition result. The user may then judge whether the recognition result is correct. If it is correct, the user may input speech data for confirmation to the device, and when the device receives this speech data, it may execute the corresponding control instruction. If it is wrong, the user may input corresponding speech data to the device, and when the device receives this speech data, it may stop executing the corresponding control instruction and send a prompt message to prompt the user to re-input the speech data to be recognized.
In the embodiment of the present invention, speech data to be recognized that is input by a user is received, the speech data to be recognized is sent to a speech recognition server, and the sending time of the speech data to be recognized is recorded; the speech data to be recognized is divided into multiple data segments of a preset duration according to its sampling time, the obtained speech feature information of each data segment is matched against pre-stored reference information, first reference information that matches first speech feature information in the speech data to be recognized is determined, and a first control instruction corresponding to the speech data to be recognized is obtained based on the determined first reference information; if, within a preset duration from the sending time, no recognition message carrying a second control instruction is received from the speech recognition server, the first control instruction is determined as the recognition result of the speech data to be recognized, and if, within the preset duration from the sending time, a recognition message carrying a second control instruction is received from the speech recognition server, the second control instruction is determined as the recognition result of the speech data to be recognized. In this way, the local recognition approach and the recognition approach of the speech recognition server can be combined, the recognition result of each approach is obtained separately, and one of the results is selected as the recognition result of the speech data to be recognized, without requiring the user to memorize a large number of instruction words, thereby improving the flexibility of speech data recognition.
Embodiment three
Based on the same technical concept, an embodiment of the present invention further provides an apparatus for recognizing speech data. As shown in Fig. 3, the apparatus comprises:
a transceiver module 310, configured to receive speech data to be recognized that is input by a user, send the speech data to be recognized to a speech recognition server, and record the sending time of the speech data to be recognized;
a first acquisition module 320, configured to divide the speech data to be recognized into multiple data segments of a preset duration according to the sampling time of the speech data to be recognized, match the obtained speech feature information of each data segment against pre-stored reference information, determine first reference information that matches first speech feature information in the speech data to be recognized, and obtain, based on the determined first reference information, a first control instruction corresponding to the speech data to be recognized;
a determination module 330, configured to: if, within a preset duration from the sending time, no recognition message carrying a second control instruction is received from the speech recognition server, determine the first control instruction as the recognition result of the speech data to be recognized; and if, within the preset duration from the sending time, a recognition message carrying a second control instruction is received from the speech recognition server, determine the second control instruction as the recognition result of the speech data to be recognized.
Optionally, the apparatus further comprises a second acquisition module, configured to:
obtain the confidence of the first control instruction according to the matching degree between the first speech feature information and the first reference information;
and the determination module 330 is configured to:
if, within the preset duration from the sending time, no recognition message carrying a second control instruction is received from the speech recognition server and the confidence of the first control instruction is not less than a preset confidence threshold, determine the first control instruction as the recognition result of the speech data to be recognized.
Optionally, the apparatus further comprises a first prompting module, configured to:
if, within the preset duration from the sending time, no recognition message carrying a second control instruction is received from the speech recognition server but the confidence of the first control instruction is less than the preset confidence threshold, send a prompt message indicating that recognition of the speech data to be recognized has failed.
Optionally, the transceiver module 310 is configured to:
when a speech input request is received, receive speech data input by the user, and when the duration after the user stops inputting reaches a preset receiving duration threshold, determine the speech data input before the user stopped inputting as the speech data to be recognized.
Optionally, the apparatus further comprises a second prompting module, configured to:
send a prompt message of the determined recognition result of the speech data to be recognized.
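Purely to illustrate the module split of this embodiment, the sketch below groups the behaviour described above into transceiver, first acquisition, and determination classes; the class and method names, and the injected `server` and `local_recognizer` dependencies, are assumptions of this example and not the names used by the apparatus.

```python
import time


class TransceiverModule:
    """Sends the speech data to be recognized to the server and records the sending time."""
    def __init__(self, server):
        self.server = server

    def send(self, speech_data):
        self.server.send(speech_data)
        return time.time()                 # the recorded sending time


class FirstAcquisitionModule:
    """Obtains the first control instruction from the pre-stored reference information."""
    def __init__(self, local_recognizer):
        self.local_recognizer = local_recognizer

    def first_instruction(self, speech_data):
        return self.local_recognizer(speech_data)


class DeterminationModule:
    """Chooses between the first (local) and second (server) control instruction."""
    @staticmethod
    def decide(first_instruction, second_instruction):
        return second_instruction if second_instruction is not None else first_instruction
```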
In the embodiment of the present invention, speech data to be recognized that is input by a user is received, the speech data to be recognized is sent to a speech recognition server, and the sending time of the speech data to be recognized is recorded; the speech data to be recognized is divided into multiple data segments of a preset duration according to its sampling time, the obtained speech feature information of each data segment is matched against pre-stored reference information, first reference information that matches first speech feature information in the speech data to be recognized is determined, and a first control instruction corresponding to the speech data to be recognized is obtained based on the determined first reference information; if, within a preset duration from the sending time, no recognition message carrying a second control instruction is received from the speech recognition server, the first control instruction is determined as the recognition result of the speech data to be recognized, and if, within the preset duration from the sending time, a recognition message carrying a second control instruction is received from the speech recognition server, the second control instruction is determined as the recognition result of the speech data to be recognized. In this way, the local recognition approach and the recognition approach of the speech recognition server can be combined, the recognition result of each approach is obtained separately, and one of the results is selected as the recognition result of the speech data to be recognized, without requiring the user to memorize a large number of instruction words, thereby improving the flexibility of speech data recognition.
It should be noted that when the apparatus for recognizing speech data provided in the above embodiment recognizes speech data, the division of the above functional modules is merely used as an example for description. In practical applications, the above functions may be allocated to different functional modules as required, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus for recognizing speech data provided in the above embodiment belongs to the same concept as the method embodiment for recognizing speech data; for its specific implementation process, refer to the method embodiment, and details are not repeated here.
Embodiment four
Fig. 4 is a schematic structural diagram of a speech recognition device provided by an embodiment of the present invention. Referring to Fig. 4, the speech recognition device may be used to implement the method for recognizing speech data provided in the above embodiments. The speech recognition device may be a mobile phone, a tablet computer, a wearable mobile device (such as a smart watch), or the like. Specifically:
The speech recognition device 700 may include a communication unit 110, a memory 120 including one or more computer-readable storage media, an input unit 130, a display unit 140, a sensor 150, an audio circuit 160, a WiFi (wireless fidelity) module 170, a processor 180 including one or more processing cores, a power supply 190, and other components. Those skilled in the art can understand that the speech recognition device structure shown in the figure does not constitute a limitation on the speech recognition device, which may include more or fewer components than shown, combine some components, or have a different component arrangement. Wherein:
The communication unit 110 may be used to receive and send messages, or to receive and send signals during a call. The communication unit 110 may be a network communication device such as an RF (radio frequency) circuit, a router, or a modem. In particular, when the communication unit 110 is an RF circuit, it receives downlink information from a base station and then hands it over to one or more processors 180 for processing, and sends uplink data to the base station. Generally, the RF circuit serving as the communication unit includes but is not limited to an antenna, at least one amplifier, a tuner, one or more oscillators, a subscriber identity module (SIM) card, a transceiver, a coupler, an LNA (low noise amplifier), a duplexer, and the like. In addition, the communication unit 110 may also communicate with networks and other devices through wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to GSM (Global System for Mobile Communications), GPRS (General Packet Radio Service), CDMA (Code Division Multiple Access), WCDMA (Wideband Code Division Multiple Access), LTE (Long Term Evolution), e-mail, SMS (Short Messaging Service), and the like. The memory 120 may be used to store software programs and modules, and the processor 180 executes various functional applications and data processing by running the software programs and modules stored in the memory 120. The memory 120 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required for at least one function (such as a sound playing function, an image playing function, etc.), and the like, and the data storage area may store data created according to the use of the speech recognition device 700 (such as audio data, a phone book, etc.). In addition, the memory 120 may include a high-speed random access memory and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another solid-state storage component. Correspondingly, the memory 120 may also include a memory controller to provide the processor 180 and the input unit 130 with access to the memory 120.
The input unit 130 may be used to receive input digital or character information and to generate keyboard, mouse, joystick, optical, or trackball signal inputs related to user settings and function control. Preferably, the input unit 130 may include a touch-sensitive surface 131 and other input devices 132. The touch-sensitive surface 131, also called a touch display screen or a touchpad, can collect touch operations of the user on or near it (such as operations performed by the user on or near the touch-sensitive surface 131 with a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connection apparatus according to a preset program. Optionally, the touch-sensitive surface 131 may include two parts: a touch detection apparatus and a touch controller. The touch detection apparatus detects the touch position of the user, detects the signal produced by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection apparatus, converts it into contact coordinates, sends them to the processor 180, and can receive and execute commands sent by the processor 180. In addition, the touch-sensitive surface 131 may be implemented in various types such as resistive, capacitive, infrared, and surface acoustic wave. Besides the touch-sensitive surface 131, the input unit 130 may also include other input devices 132. Preferably, the other input devices 132 may include but are not limited to one or more of a physical keyboard, function keys (such as volume control keys, switch keys, etc.), a trackball, a mouse, a joystick, and the like.
The display unit 140 may be used to display information input by the user or provided to the user, and the various graphical user interfaces of the speech recognition device 700; these graphical user interfaces may be composed of graphics, text, icons, video, and any combination thereof. The display unit 140 may include a display panel 141, which may optionally be configured in the form of an LCD (liquid crystal display), an OLED (organic light-emitting diode), or the like. Further, the touch-sensitive surface 131 may cover the display panel 141. When the touch-sensitive surface 131 detects a touch operation on or near it, it transmits the operation to the processor 180 to determine the type of the touch event, and the processor 180 then provides a corresponding visual output on the display panel 141 according to the type of the touch event. Although in the figure the touch-sensitive surface 131 and the display panel 141 are shown as two independent components implementing the input and output functions, in some embodiments the touch-sensitive surface 131 and the display panel 141 may be integrated to implement the input and output functions.
The speech recognition device 700 may also include at least one sensor 150, such as an optical sensor, a motion sensor, and other sensors. Preferably, the optical sensor may include an ambient light sensor and a proximity sensor, where the ambient light sensor may adjust the brightness of the display panel 141 according to the brightness of the ambient light, and the proximity sensor may turn off the display panel 141 and/or the backlight when the speech recognition device 700 is moved close to the ear. As one kind of motion sensor, a gravity acceleration sensor can detect the magnitude of acceleration in all directions (generally three axes), can detect the magnitude and direction of gravity when stationary, and can be used for applications that recognize the attitude of the mobile phone (such as landscape/portrait switching, related games, magnetometer attitude calibration) and vibration-recognition-related functions (such as a pedometer, tapping). Other sensors such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor may also be configured on the speech recognition device 700 and are not repeated here.
The audio circuit 160, a loudspeaker 161, and a microphone 162 may provide an audio interface between the user and the speech recognition device 700. The audio circuit 160 may transmit the electrical signal converted from the received audio data to the loudspeaker 161, and the loudspeaker 161 converts it into a sound signal for output; on the other hand, the microphone 162 converts the collected sound signal into an electrical signal, which the audio circuit 160 receives and converts into audio data; after the audio data is output to the processor 180 for processing, it is sent through the RF circuit 110 to, for example, another speech recognition device, or the audio data is output to the memory 120 for further processing. The audio circuit 160 may also include an earphone jack to provide communication between a peripheral earphone and the speech recognition device 700.
In order to implement wireless communication, the speech recognition device may be configured with a wireless communication unit 170, which may be a WiFi module. WiFi is a short-range wireless transmission technology. Through the wireless communication unit 170, the speech recognition device 700 can help the user send and receive e-mails, browse web pages, access streaming media, and so on, providing the user with wireless broadband Internet access. Although the wireless communication unit 170 is shown in the figure, it can be understood that it is not an essential component of the speech recognition device 700 and can be omitted as required without changing the essence of the invention.
The processor 180 is the control center of the speech recognition device 700. It connects all parts of the whole mobile phone through various interfaces and lines, and executes the various functions of the speech recognition device 700 and processes data by running or executing the software programs and/or modules stored in the memory 120 and calling the data stored in the memory 120, so as to perform overall monitoring of the mobile phone. Optionally, the processor 180 may include one or more processing cores; preferably, the processor 180 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, the user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It can be understood that the above modem processor may also not be integrated into the processor 180.
The speech recognition device 700 also includes a power supply 190 (such as a battery) that supplies power to all components. Preferably, the power supply may be logically connected to the processor 180 through a power management system, so that functions such as charging management, discharging management, and power consumption management are implemented through the power management system. The power supply 190 may also include one or more DC or AC power supplies, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and other components.
Although not shown, the speech recognition device 700 may also include a camera, a Bluetooth module, and the like, which are not repeated here. Specifically, in this embodiment, the display unit of the speech recognition device is a touch-screen display, and the speech recognition device also includes a memory and one or more programs, where the one or more programs are stored in the memory and are configured to be executed by the one or more processors, and the one or more programs contain instructions for performing the following operations:
receiving speech data to be recognized that is input by a user, sending the speech data to be recognized to a speech recognition server, and recording the sending time of the speech data to be recognized;
dividing the speech data to be recognized into multiple data segments of a preset duration according to the sampling time of the speech data to be recognized, matching the obtained speech feature information of each data segment against pre-stored reference information, determining first reference information that matches first speech feature information in the speech data to be recognized, and obtaining, based on the determined first reference information, a first control instruction corresponding to the speech data to be recognized;
if, within a preset duration from the sending time, no recognition message carrying a second control instruction is received from the speech recognition server, determining the first control instruction as the recognition result of the speech data to be recognized; and if, within the preset duration from the sending time, a recognition message carrying a second control instruction is received from the speech recognition server, determining the second control instruction as the recognition result of the speech data to be recognized.
Optionally, the method further comprises:
obtaining the confidence of the first control instruction according to the matching degree between the first speech feature information and the first reference information;
wherein determining the first control instruction as the recognition result of the speech data to be recognized if, within the preset duration from the sending time, no recognition message carrying a second control instruction is received from the speech recognition server comprises:
if, within the preset duration from the sending time, no recognition message carrying a second control instruction is received from the speech recognition server and the confidence of the first control instruction is not less than a preset confidence threshold, determining the first control instruction as the recognition result of the speech data to be recognized.
Optionally, the method further comprises:
if, within the preset duration from the sending time, no recognition message carrying a second control instruction is received from the speech recognition server but the confidence of the first control instruction is less than the preset confidence threshold, sending a prompt message indicating that recognition of the speech data to be recognized has failed.
Optionally, the receiving speech data to be recognized that is input by a user comprises:
when a speech input request is received, receiving speech data input by the user, and when the duration after the user stops inputting reaches a preset receiving duration threshold, determining the speech data input before the user stopped inputting as the speech data to be recognized.
Optionally, the method further comprises:
sending a prompt message of the determined recognition result of the speech data to be recognized.
In the embodiment of the present invention, speech data to be recognized that is input by a user is received, the speech data to be recognized is sent to a speech recognition server, and the sending time of the speech data to be recognized is recorded; the speech data to be recognized is divided into multiple data segments of a preset duration according to its sampling time, the obtained speech feature information of each data segment is matched against pre-stored reference information, first reference information that matches first speech feature information in the speech data to be recognized is determined, and a first control instruction corresponding to the speech data to be recognized is obtained based on the determined first reference information; if, within a preset duration from the sending time, no recognition message carrying a second control instruction is received from the speech recognition server, the first control instruction is determined as the recognition result of the speech data to be recognized, and if, within the preset duration from the sending time, a recognition message carrying a second control instruction is received from the speech recognition server, the second control instruction is determined as the recognition result of the speech data to be recognized. In this way, the local recognition approach and the recognition approach of the speech recognition server can be combined, the recognition result of each approach is obtained separately, and one of the results is selected as the recognition result of the speech data to be recognized, without requiring the user to memorize a large number of instruction words, thereby improving the flexibility of speech data recognition.
One of ordinary skill in the art will appreciate that all or part of the steps for implementing the above embodiments may be completed by hardware, or by a program instructing relevant hardware; the program may be stored in a computer-readable storage medium, and the storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like.
The foregoing descriptions are merely preferred embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, improvement, and the like made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (10)

1. A method for identifying speech data, characterized in that the method comprises:
receiving speech data to be identified input by a user, sending the speech data to be identified to a speech recognition server, and recording the transmission time of the speech data to be identified;
dividing the speech data to be identified into multiple data segments of a preset duration according to the sampling time of the speech data to be identified, matching the voice feature information obtained for each data segment against prestored reference information, determining first reference information that matches first voice feature information in the speech data to be identified, and obtaining, based on the determined first reference information, a first control instruction corresponding to the speech data to be identified;
if, within a preset duration from the transmission time, no identification message carrying a second control instruction is received from the speech recognition server, determining the first control instruction as the recognition result of the speech data to be identified; if, within the preset duration from the transmission time, an identification message carrying the second control instruction is received from the speech recognition server, determining the second control instruction as the recognition result of the speech data to be identified.
2. The method according to claim 1, characterized in that the method further comprises:
obtaining a confidence level of the first control instruction according to the matching degree between the first voice feature information and the first reference information;
wherein the step of determining the first control instruction as the recognition result of the speech data to be identified if, within the preset duration from the transmission time, no identification message carrying the second control instruction is received from the speech recognition server comprises:
if, within the preset duration from the transmission time, no identification message carrying the second control instruction is received from the speech recognition server, and the confidence level of the first control instruction is not less than a preset confidence threshold, determining the first control instruction as the recognition result of the speech data to be identified.
3. The method according to claim 2, characterized in that the method further comprises:
if, within the preset duration from the transmission time, no identification message carrying the second control instruction is received from the speech recognition server, but the confidence level of the first control instruction is less than the preset confidence threshold, sending a prompt message indicating that recognition of the speech data to be identified has failed.
4. The method according to claim 1, characterized in that receiving the speech data to be identified input by the user comprises:
upon receiving a voice input request, receiving the speech data input by the user; when the duration after the user stops inputting reaches a preset reception duration threshold, determining the speech data input before the user stopped inputting as the speech data to be identified.
5. The method according to claim 1, characterized in that the method further comprises:
sending a prompt message indicating the determined recognition result of the speech data to be identified.
6. A device for identifying speech data, characterized in that the device comprises:
a transceiver module, configured to receive speech data to be identified input by a user, send the speech data to be identified to a speech recognition server, and record the transmission time of the speech data to be identified;
a first acquisition module, configured to divide the speech data to be identified into multiple data segments of a preset duration according to the sampling time of the speech data to be identified, match the voice feature information obtained for each data segment against prestored reference information, determine first reference information that matches first voice feature information in the speech data to be identified, and obtain, based on the determined first reference information, a first control instruction corresponding to the speech data to be identified;
a determination module, configured to: if, within a preset duration from the transmission time, no identification message carrying a second control instruction is received from the speech recognition server, determine the first control instruction as the recognition result of the speech data to be identified; and if, within the preset duration from the transmission time, an identification message carrying the second control instruction is received from the speech recognition server, determine the second control instruction as the recognition result of the speech data to be identified.
7. The device according to claim 6, characterized in that the device further comprises a second acquisition module, configured to:
obtain a confidence level of the first control instruction according to the matching degree between the first voice feature information and the first reference information;
and the determination module is configured to:
if, within the preset duration from the transmission time, no identification message carrying the second control instruction is received from the speech recognition server, and the confidence level of the first control instruction is not less than a preset confidence threshold, determine the first control instruction as the recognition result of the speech data to be identified.
8. The device according to claim 7, characterized in that the device further comprises a first prompting module, configured to:
if, within the preset duration from the transmission time, no identification message carrying the second control instruction is received from the speech recognition server, but the confidence level of the first control instruction is less than the preset confidence threshold, send a prompt message indicating that recognition of the speech data to be identified has failed.
9. The device according to claim 6, characterized in that the transceiver module is configured to:
upon receiving a voice input request, receive the speech data input by the user, and when the duration after the user stops inputting reaches a preset reception duration threshold, determine the speech data input before the user stopped inputting as the speech data to be identified.
10. The device according to claim 6, characterized in that the device further comprises a second prompting module, configured to:
send a prompt message indicating the determined recognition result of the speech data to be identified.
CN201410736576.4A 2014-12-05 2014-12-05 Voice data identifying method and device Pending CN104536978A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201410736576.4A CN104536978A (en) 2014-12-05 2014-12-05 Voice data identifying method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201410736576.4A CN104536978A (en) 2014-12-05 2014-12-05 Voice data identifying method and device

Publications (1)

Publication Number Publication Date
CN104536978A true CN104536978A (en) 2015-04-22

Family

ID=52852506

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201410736576.4A Pending CN104536978A (en) 2014-12-05 2014-12-05 Voice data identifying method and device

Country Status (1)

Country Link
CN (1) CN104536978A (en)



Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103137129A (en) * 2011-12-02 2013-06-05 联发科技股份有限公司 Voice recognition method and electronic device
CN103187060A (en) * 2011-12-28 2013-07-03 上海博泰悦臻电子设备制造有限公司 Vehicle-mounted speech processing device
CN102708865A (en) * 2012-04-25 2012-10-03 北京车音网科技有限公司 Method, device and system for voice recognition
CN103440867A (en) * 2013-08-02 2013-12-11 安徽科大讯飞信息科技股份有限公司 Method and system for recognizing voice

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
谭建豪 et al., 《数据挖掘技术》 (Data Mining Technology), 31 January 2009, 北京:中国水利水电出版社 (Beijing: China Water & Power Press) *
赵士滨, 《多媒体技术应用》 (Multimedia Technology Applications), 31 October 2009 *

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104873062A (en) * 2015-05-29 2015-09-02 王旭昂 Water cup with voice control function
CN105187710A (en) * 2015-07-28 2015-12-23 广东欧珀移动通信有限公司 Photography controlling method, intelligent watch, photography terminal and system
CN106469556A (en) * 2015-08-20 2017-03-01 现代自动车株式会社 Speech recognition equipment, the vehicle with speech recognition equipment, control method for vehicles
CN105355195A (en) * 2015-09-25 2016-02-24 小米科技有限责任公司 Audio frequency recognition method and audio frequency recognition device
CN106531151B (en) * 2016-11-16 2019-10-11 北京云知声信息技术有限公司 Audio recognition method and device
CN106531151A (en) * 2016-11-16 2017-03-22 北京云知声信息技术有限公司 Voice recognition method and voice recognition device
CN107595010A (en) * 2017-09-07 2018-01-19 太仓埃特奥数据科技有限公司 A kind of mobile Internet big data analyzes sample message display platform
CN108172219A (en) * 2017-11-14 2018-06-15 珠海格力电器股份有限公司 method and device for recognizing voice
CN108419108A (en) * 2018-03-06 2018-08-17 深圳创维数字技术有限公司 Sound control method, device, remote controler and computer storage media
CN110473570A (en) * 2018-05-09 2019-11-19 广达电脑股份有限公司 Integrated voice identification system and method
CN108848012A (en) * 2018-06-22 2018-11-20 广州钱柜软件科技有限公司 A kind of home entertainment device intelligence control system
CN109658931A (en) * 2018-12-19 2019-04-19 平安科技(深圳)有限公司 Voice interactive method, device, computer equipment and storage medium
CN109658931B (en) * 2018-12-19 2024-05-10 平安科技(深圳)有限公司 Voice interaction method, device, computer equipment and storage medium
CN112329457A (en) * 2019-07-17 2021-02-05 北京声智科技有限公司 Input voice recognition method and related equipment
CN110534102A (en) * 2019-09-19 2019-12-03 北京声智科技有限公司 A kind of voice awakening method, device, equipment and medium
CN110534109A (en) * 2019-09-25 2019-12-03 深圳追一科技有限公司 Audio recognition method, device, electronic equipment and storage medium
CN111081248A (en) * 2019-12-27 2020-04-28 安徽仁昊智能科技有限公司 Artificial intelligence speech recognition device
CN111524529A (en) * 2020-04-15 2020-08-11 广州极飞科技有限公司 Audio data processing method, device and system, electronic equipment and storage medium
CN111524529B (en) * 2020-04-15 2023-11-24 广州极飞科技股份有限公司 Audio data processing method, device and system, electronic equipment and storage medium
CN112349337A (en) * 2020-11-03 2021-02-09 中科创达软件股份有限公司 Vehicle-mounted machine detection method, system, electronic equipment and storage medium
CN112349337B (en) * 2020-11-03 2023-06-30 中科创达软件股份有限公司 Vehicle-mounted device detection method, system, electronic equipment and storage medium
CN112382292A (en) * 2020-12-11 2021-02-19 北京百度网讯科技有限公司 Voice-based control method and device
CN113053363A (en) * 2021-05-12 2021-06-29 京东数字科技控股股份有限公司 Speech recognition method, speech recognition apparatus, and computer-readable storage medium
CN113053363B (en) * 2021-05-12 2024-03-01 京东科技控股股份有限公司 Speech recognition method, speech recognition apparatus, and computer-readable storage medium
CN117893244A (en) * 2024-03-15 2024-04-16 中国海洋大学 Comprehensive management and control system for seaweed hydrothermal carbonization application based on machine learning
CN117893244B (en) * 2024-03-15 2024-06-04 中国海洋大学 Comprehensive management and control system for seaweed hydrothermal carbonization application based on machine learning

Similar Documents

Publication Publication Date Title
CN104536978A (en) Voice data identifying method and device
CN104123937B (en) Remind method to set up, device and system
CN104133652B (en) A kind of audio play control method, and terminal
CN103400508A (en) Method, device and terminal for outputting guidance information of parking places
CN108447472A (en) Voice awakening method and device
CN105005909A (en) Method and device for predicting lost users
CN104636047A (en) Method and device for operating objects in list and touch screen terminal
CN104135728B (en) Method for connecting network and device
CN106170034B (en) A kind of sound effect treatment method and mobile terminal
CN104461597A (en) Starting control method and device for application program
CN104764458A (en) Method and device for outputting navigation route information
CN106331370A (en) Data transmission method and terminal device
CN106453830A (en) Falling detection method and device
CN107436758A (en) The method for information display and mobile terminal of a kind of mobile terminal
CN103365419A (en) Method and device for triggering alarm clock control command
CN106940997A (en) A kind of method and apparatus that voice signal is sent to speech recognition system
CN104239343A (en) User input information processing method and device
CN105526944B (en) Information cuing method and device
CN103745133A (en) Information processing method and terminal
CN103744574A (en) Method and device for turning off alarm clock of mobile terminal and mobile terminal
CN105049591A (en) Method and device for processing incoming call
CN103945241A (en) Streaming data statistical method, system and related device
CN104898936A (en) Page turning method and mobile device
CN103366104A (en) Method and device for controlling accessing of application
CN106126171B (en) A kind of sound effect treatment method and mobile terminal

Legal Events

Date Code Title Description
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20150422