CN111415656A - Voice semantic recognition method and device and vehicle

Voice semantic recognition method and device and vehicle

Info

Publication number: CN111415656A (application CN201910009490.4A)
Authority: CN (China)
Prior art keywords: user, information, voice, speech, voice information
Legal status: granted; active
Other languages: Chinese (zh)
Other versions: CN111415656B (granted publication)
Inventor: 刘磊
Current and original assignee: Shanghai Qinggan Intelligent Technology Co Ltd
Application filed: 2019-01-04, by Shanghai Qinggan Intelligent Technology Co Ltd
Priority date: 2019-01-04
Publication dates: CN111415656A published 2020-07-14; CN111415656B granted 2024-04-30

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/005 - Language recognition
    • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 15/222 - Barge in, i.e. overridable guidance for interrupting prompts
    • G10L 2015/223 - Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • User Interface Of Digital Computer (AREA)
  • Machine Translation (AREA)

Abstract

The application relates to a speech semantic recognition method comprising: judging in real time whether voice information of a user is received; when voice information is received, judging whether it conforms to a preset phrasing; if so, performing the corresponding response operation according to the voice information; if not, parsing the voice information, obtaining the keywords in it, obtaining the user's target intention from the keywords and/or combinations of the keywords, and obtaining and displaying at least one piece of input demonstration information matching the user's target intention and the preset phrasings. The application also relates to a speech semantic recognition device and a vehicle. The method brings voice interaction technology to in-vehicle head units, reduces the user's manual operation through speech recognition, and provides voice guidance when the user has not yet mastered the voice commands, offering more appropriate help, accelerating the user's mastery of the voice functions, and improving the user experience.

Description

Voice semantic recognition method and device and vehicle
Technical Field
The application relates to the technical field of voice recognition, in particular to a voice semantic recognition method, a voice semantic recognition device and a vehicle.
Background
Speech recognition is the technology by which a machine correctly recognizes human speech and converts its vocabulary content into corresponding computer-readable text or commands. With continuing technological progress, the application field of speech recognition keeps broadening. Compared with other input modes such as keyboard input, speech recognition better matches users' daily habits, and it has therefore become one of the most important human-computer interaction technologies.
However, existing voice functions are not as intelligent as a real person: specific phrasings and usage methods must be learned before the voice functions can be used well, users are unwilling to spend time and energy reading manuals, and even users who do read them find many phrasings hard to remember.
To address these shortcomings of the prior art, the application provides a speech semantic recognition method, a speech semantic recognition device, and a vehicle.
Disclosure of Invention
The application aims to provide a speech semantic recognition method and device and a vehicle that introduce voice interaction technology into in-vehicle head units, reduce the user's manual operation through speech recognition, and provide voice guidance when the user has not yet mastered the voice commands, offering more appropriate help while accelerating the user's mastery of the voice functions and improving the user experience.
To solve the above technical problem, the application provides a speech semantic recognition method including the following steps: judging in real time whether voice information of a user is received; when voice information is received, judging whether it conforms to a preset phrasing; if so, performing the corresponding response operation according to the voice information; if not, parsing the voice information, obtaining the keywords in it, obtaining the user's target intention from the keywords and/or combinations of the keywords, and obtaining and displaying at least one piece of input demonstration information matching the user's target intention and the preset phrasings.
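Read as pseudocode, the claimed method is a three-way dispatch. The sketch below is a minimal, hypothetical Python rendering; the function names, the prefix-based phrasing match, and the toy keyword filter are illustrative assumptions, not the patent's prescribed implementation.
```python
# A minimal, hypothetical sketch of the claimed control flow; all names and
# the toy matching logic are illustrative assumptions only.
PRESET_PHRASINGS = [
    "please help me navigate to",
    "turn on the air conditioner",
    "turn on the radio",
]

def matches_preset(utterance: str) -> bool:
    """Voice information 'conforms to a preset phrasing' if it starts with one."""
    return any(utterance.startswith(p) for p in PRESET_PHRASINGS)

def handle(utterance: str | None) -> str:
    if utterance is None:                      # no voice information received
        return "do nothing"
    if matches_preset(utterance):              # conforms: respond directly
        return f"respond to: {utterance!r}"
    # Otherwise: parse, extract keywords, infer intent, show demonstrations.
    keywords = [w for w in utterance.split() if len(w) > 2]
    return f"display input demonstrations for keywords {keywords}"

print(handle("turn on the radio"))
print(handle("navigate me to turn on the air conditioner and the radio"))
```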
In one embodiment, the step of parsing the voice information, obtaining keywords in the voice information, and obtaining the user's target intention according to the keywords and/or combinations of the keywords comprises: converting the received voice information into at least one piece of text information; segmenting the text information using lexicon-based segmentation; identifying the keywords from the segmented text; and obtaining the user's target intention from the keywords and/or combinations of the keywords.
In one embodiment, the step of converting the received voice information into at least one piece of text information includes performing feature recognition on the voice information to obtain the user's voice features, the voice features at least including regional feature data of the user's location; judging, from the user's voice features, the official language type of the region corresponding to the language type the user uses; and converting the voice information into at least one piece of text information matching that official language type.
In one embodiment, the step of converting the received voice information into at least one piece of text information is followed by error correction of the at least one piece of text information through near-synonym matching and common-homophone replacement.
In one embodiment, lexicon-based segmentation segments the text information with the aid of a Chinese dictionary database, a historical-behavior lexicon, and a popular-search lexicon.
In one embodiment, the step of obtaining and presenting at least one piece of input demonstration information matching the user's target intention and the preset phrasings comprises classifying the input demonstration information according to preset rules.
In one embodiment, the step of obtaining and displaying at least one piece of input demonstration information matching the user's target intention and the preset phrasings comprises weighted scoring of the input demonstration information according to its degree of match with the user's target intention and the preset phrasings, then obtaining and displaying the input demonstration information ranked in the top n, where n is a positive integer greater than or equal to 1.
To solve the above technical problem, the application further provides a speech semantic recognition apparatus comprising a memory and a processor, the memory storing executable program code and the processor being configured to call the executable program code in the memory to perform the following steps: judging in real time whether voice information of a user is received; when voice information is received, judging whether it conforms to a preset phrasing; if so, performing the corresponding response operation according to the voice information; if not, parsing the voice information, obtaining the keywords in it, obtaining the user's target intention from the keywords and/or combinations of the keywords, and obtaining and displaying at least one piece of input demonstration information matching the user's target intention and the preset phrasings.
In one embodiment, the processor is further configured to convert the received voice information into at least one piece of text information; segment the text information using lexicon-based segmentation; identify the keywords from the segmented text; and obtain the user's target intention from the keywords and/or combinations of the keywords.
To solve the above technical problem, the application further provides a vehicle equipped with the above speech semantic recognition device, the vehicle being an unmanned vehicle, a manually driven vehicle, or an intelligent vehicle that switches freely between unmanned and manual driving.
The speech semantic recognition method and device and the vehicle bring voice interaction technology to in-vehicle head units, reduce the user's manual operation through speech recognition, and provide voice guidance when the user has not yet mastered the voice commands, offering more appropriate help while accelerating the user's mastery of the voice functions and improving the user experience.
The foregoing is only an overview of the technical solutions of the application. So that the technical means of the application may be understood more clearly and implemented according to this description, and so that the above and other objects, features, and advantages of the application become more apparent, preferred embodiments are described in detail below with reference to the accompanying drawings.
Drawings
Fig. 1 is a schematic flow chart of a speech semantic recognition method according to an embodiment of the present application.
Fig. 2 is a flowchart illustrating step S15 in the speech semantic recognition method shown in fig. 1 according to an embodiment.
Fig. 3 is a flowchart illustrating step S16 in the speech semantic recognition method shown in fig. 1 according to an embodiment.
Fig. 4 is a schematic structural diagram of a speech semantic recognition apparatus according to an embodiment of the present application.
Detailed Description
To further explain the technical means the application takes to achieve its intended purpose and the effects of those means, the application is described in detail below with reference to the accompanying drawings and preferred embodiments.
While the application is described in terms of specific embodiments for achieving the intended objects, the invention is not limited to the disclosed embodiments and is to be accorded the widest scope consistent with the principles and novel features defined by the appended claims.
Fig. 1 is a schematic flow chart of a speech semantic recognition method according to a first embodiment of the application. As shown in fig. 1, the speech semantic recognition method includes the following steps.
Step S11: judge in real time whether voice information of the user is received.
Specifically, the user's voice information may be received through a microphone or another voice input device.
If no voice information of the user is received, step S12 is executed: do not process. If voice information of the user is received, step S13 is executed: judge whether it conforms to a preset phrasing.
If a preset phrasing is conformed to, step S14 is executed: perform the corresponding response operation according to the voice information.
Specifically, a preset phrasing is a phrasing mastered in advance through machine language learning; that is, when voice information conforming to a preset phrasing is received, the corresponding response operation can be performed directly, without further processing. For example, the preset phrasings of this embodiment may be "please help me navigate to XXX", "turn on the air conditioner", "turn on the radio", and so on.
If no preset phrasing is matched, for example "navigate me to turn on the air conditioner and also turn on the radio", "play a song for me to listen to", or "it's mealtime, find a parking lot to park, I want to go eat", the received information cannot be recognized and no corresponding response operation can be performed, so step S15 is executed: parse the voice information, obtain the keywords in it, and obtain the user's target intention from the keywords and/or combinations of the keywords.
Specifically, in an embodiment, to make operation easy, the user needs neither to train phrasings in advance nor to use fixed phrasings: the method can directly recognize and process ordinary natural language, parsing the received voice information, obtaining the keywords in it, and then obtaining the user's target intention from the keywords and/or combinations of the keywords.
Specifically, in one embodiment, step S15 may convert the voice information into plain text information, obtain the keywords of the voice information by segmenting the plain text, and obtain the user's target intention from the keywords and/or combinations of the keywords. In another embodiment, the user's target intention may instead be obtained by extracting voice feature information from the voice information, generating a recognition result of the voice information from the voice feature information and a preset acoustic model, and then applying a preset algorithm to that recognition result.
Specifically, the user's target intention may include the function to be used, such as the navigation function or control of devices on the vehicle, for example in-vehicle multimedia, windows, and lighting. The user's target intention may also include the destination to reach, the songs to listen to, the person to talk to, and so on.
Step S16: obtain and display at least one piece of input demonstration information matching the user's target intention and the preset phrasings.
Specifically, in this embodiment, the input demonstration information may be a preset phrasing mastered in advance through machine language learning, or information generated from the combination of the user's target intention and a preset phrasing. For example, for the navigation function the preset phrasing is "please help me navigate to XXX"; when the user's target intention includes a destination to reach, for example "Tiananmen Square", the generated input demonstration information may include "please help me navigate to Tiananmen Square".
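A minimal sketch of this generation step, assuming a simple slot-filling template per intent (the template texts and slot names are illustrative, not taken from the patent):
```python
# Hypothetical sketch: generate input demonstration information by filling the
# inferred slot values of the target intention into a preset phrasing template.
PRESET_TEMPLATES = {
    "navigation": "please help me navigate to {destination}",
    "multimedia": "turn on the radio",
}

def build_demonstration(intent: str, slots: dict[str, str]) -> str:
    # str.format ignores unused keyword arguments, so slot-free templates work too.
    return PRESET_TEMPLATES[intent].format(**slots)

print(build_demonstration("navigation", {"destination": "Tiananmen Square"}))
# -> please help me navigate to Tiananmen Square
```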
Specifically, in this embodiment, the generated input demonstration information is acquired and presented, and at the same time broadcast by voice.
Specifically, in this embodiment, the input demonstration information may be classified by function, such as a multimedia playback function, a navigation function, and the like.
Fig. 2 is a flowchart illustrating an embodiment of step S15 in the speech semantic recognition method shown in fig. 1. As shown in fig. 2, the step of parsing the voice information, obtaining the keywords in it, and obtaining the user's target intention from the keywords and/or combinations of the keywords may specifically include the following process.
Step S21: perform feature recognition on the received voice information to obtain the user's voice features.
Specifically, the user's voice features at least include regional feature data of the user's location.
Specifically, the user's regional feature refers to the user's current location or native region and can be determined from the language type the user uses. Language types may include different languages and dialects, such as English, Japanese, Korean, Arabic, Cantonese, and Sichuan dialect. Specifically, semantic analysis can be performed on the received voice information to obtain the language type it belongs to, and the regional feature data of the user can be derived from that language type.
Specifically, in this embodiment, semantic analysis of the voice information yields the specific content of the speech. The vocabulary, semantics, and so on of this content are then compared against a pre-established language vocabulary database that contains lexicons for the different language types. The corresponding language type can thus be matched from the vocabulary in the user's voice information, and the user's regional feature data further predicted. For example, if the user uses Portuguese, the user may come from, or be located in, a Portuguese-speaking country; if the user uses Cantonese, the user may come from, or be located in, Guangdong, Hong Kong, and similar regions.
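A minimal sketch of this vocabulary-overlap comparison, assuming toy per-language lexicons and a toy region table in place of the pre-established language vocabulary database:
```python
# Hypothetical sketch: match the language type by vocabulary overlap against
# per-language lexicons, then look up the predicted region. The tiny lexicons
# and the region table are illustrative assumptions only.
LEXICONS = {
    "Cantonese": {"唔该", "係", "几多"},
    "Mandarin":  {"谢谢", "是", "多少"},
}
REGIONS = {"Cantonese": "Guangdong / Hong Kong", "Mandarin": "northern China"}

def guess_language(tokens: set[str]) -> str:
    # Pick the language type whose lexicon overlaps the utterance most.
    return max(LEXICONS, key=lambda lang: len(tokens & LEXICONS[lang]))

lang = guess_language({"唔该", "几多", "钱"})
print(lang, "->", REGIONS[lang])   # Cantonese -> Guangdong / Hong Kong
```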
Step S22: judge, from the user's voice features, the official language type of the region corresponding to the language type the user uses.
Specifically, in this embodiment, the official language type of the corresponding region can be determined from the regional feature data of the user's location. For example, if the user's regional feature data corresponds to Sichuan, the language type the user uses is Sichuan dialect and the corresponding official language is Mandarin.
Specifically, in another embodiment, the user may also trigger a language button and select the language type of the speech information to be recognized. The language type may be, but is not limited to, Chinese (Mandarin and local dialects such as Cantonese, Northeastern dialect, and Sichuanese), English, French, German, Korean, and so on; the corresponding official language type is then obtained after processing.
Step S23: convert the voice information into at least one piece of text information matching the official language type.
Specifically, in this embodiment, to improve the reliability of voice information recognition, words and phrases related to the voice information may be acquired through big-data learning and composed into several pieces of text information. In another embodiment, the user's voice information may instead be converted directly into one piece of plain text information.
Specifically, to guard against errors introduced when converting the voice information into text information, in an embodiment the step of converting the received voice information into at least one piece of text information is followed by error correction processing of the at least one piece of text information through near-synonym matching and common-homophone replacement.
Specifically, in this embodiment, error correction first proceeds by near-synonym matching; common homophones are then used to judge whether a valid phrase exists and, if so, the corrective replacement is performed. For example, in "I want to eat XX food, please help me recommend a restaurant nearby", converting the voice information to text may render "food" as one of its homophones; after error correction the erroneous word is replaced with the correct "food".
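A minimal sketch of the common-homophone replacement half of this step, assuming a toy homophone table and lexicon (near-synonym matching is omitted for brevity):
```python
# Hypothetical sketch of common-homophone replacement: a segmented word that is
# not in the lexicon but is a known homophone of a lexicon word gets replaced.
# The tiny tables are illustrative assumptions, not the patent's dictionaries.
HOMOPHONES = {"每时": "美食"}        # a "meishi" mis-transcription -> "fine food"
LEXICON = {"我", "想", "吃", "美食", "餐厅", "附近"}

def correct(words: list[str]) -> list[str]:
    # Keep in-lexicon words; replace out-of-lexicon words that have a known
    # homophone correction; leave everything else untouched.
    return [w if w in LEXICON else HOMOPHONES.get(w, w) for w in words]

print(correct(["我", "想", "吃", "每时"]))   # -> ['我', '想', '吃', '美食']
```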
Step S24: segment the text information.
Specifically, in this embodiment, segmentation is lexicon-based: the text information is segmented with the aid of a Chinese dictionary database, a historical-behavior lexicon, and a popular-search lexicon.
Specifically, the accuracy of segmentation depends on the algorithm and the lexicon, and different languages need different segmentation techniques because they are composed differently: English takes the word as its unit, with words separated by spaces, whereas Chinese takes the character as its unit, with adjacent characters joining to form words. In another embodiment, rule-based segmentation and the dictionary-based segmentation algorithm MMSEG (a word identification system for Mandarin Chinese text based on two variants of the maximum matching algorithm) can be used, thereby segmenting both English and Chinese.
Specifically, in this embodiment, the segmentation principle is to split the keywords with the fewest possible segmentations. Segmentation reduces recognition complexity and improves recognition efficiency.
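As one concrete stand-in for dictionary-based segmentation, the sketch below uses plain forward maximum matching over a toy lexicon, which naturally favors fewer, longer segments; MMSEG itself adds further ambiguity-resolution rules that this hypothetical sketch omits:
```python
# Forward-maximum-matching segmentation over a toy lexicon; the lexicon and
# maximum word length are illustrative assumptions.
LEXICON = {"打开", "空调", "收音机", "导航", "天安门", "广场"}
MAX_WORD_LEN = 4

def segment(text: str) -> list[str]:
    words, i = [], 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for j in range(min(len(text), i + MAX_WORD_LEN), i, -1):
            if text[i:j] in LEXICON or j == i + 1:
                words.append(text[i:j])
                i = j
                break
    return words

print(segment("打开空调导航天安门广场"))
# -> ['打开', '空调', '导航', '天安门', '广场']
```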
Step S25: obtain the keywords from the segmented text.
Specifically, in this embodiment, the keywords are identified from the segmented text, and text that cannot be identified is matched against a pre-established lexicon of words the user has used. In another embodiment, text that cannot be identified may instead be discarded.
Step S26: obtain the user's target intention from the keywords and/or combinations of the keywords.
Specifically, in this embodiment, the user's target intention is obtained from the keywords and/or combinations of the keywords, and the operation the user may want to perform is inferred, so that guidance and help can be provided.
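A minimal sketch of mapping keywords and keyword combinations to target intentions, assuming a simple rule table (the rules are illustrative only):
```python
# Hypothetical sketch: a keyword combination maps to an intent when all of its
# required keywords are present. The rule table is an illustrative assumption.
INTENT_RULES = [
    ({"navigate"}, "navigation"),
    ({"air", "conditioner"}, "climate control"),
    ({"radio"}, "multimedia"),
]

def infer_intents(keywords: set[str]) -> list[str]:
    return [intent for required, intent in INTENT_RULES if required <= keywords]

# A compound utterance can yield several target intentions at once:
print(infer_intents({"navigate", "radio", "air", "conditioner"}))
# -> ['navigation', 'climate control', 'multimedia']
```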
Fig. 3 is a flowchart illustrating an embodiment of step S16 in the speech semantic recognition method shown in fig. 1. As shown in fig. 3, the step of obtaining and displaying at least one piece of input demonstration information matching the user's target intention and the preset phrasings specifically includes the following steps.
Step S31: classify the input demonstration information according to preset rules.
Specifically, in this embodiment, the preset rules may classify by function, such as the vehicle's navigation function, the in-vehicle multimedia playback function, and the like.
Specifically, as machine language learning continues, the volume of input demonstration information grows ever larger; classifying it according to preset rules improves the response rate, so that the user obtains the input demonstration information faster and the user experience improves.
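A minimal sketch of such classification, assuming the demonstrations are bucketed by function category so a query scans only one bucket:
```python
# Hypothetical sketch: pre-index demonstrations by function category so lookup
# at query time touches one bucket instead of the whole corpus. The demo data
# and category names are illustrative assumptions.
from collections import defaultdict

DEMOS = [
    ("navigation", "please help me navigate to Tiananmen Square"),
    ("multimedia", "turn on the radio"),
    ("multimedia", "play my favourite playlist"),
]

index: dict[str, list[str]] = defaultdict(list)
for category, text in DEMOS:
    index[category].append(text)

print(index["multimedia"])   # only the multimedia bucket is scanned
```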
Step S32: perform weighted scoring of the input demonstration information according to its degree of match with the user's target intention and the preset phrasings, then obtain and display the top-n ranked input demonstration information.
Specifically, in this embodiment, the terminal displays the n pieces of input demonstration information that best match the user's target intention and the preset phrasings. In other embodiments, the terminal may display the matching input demonstration information that the user has historically used most often.
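A minimal sketch of the weighted scoring and top-n selection, with assumed weights and a token-overlap score standing in for the matching degree:
```python
# Hypothetical weighted scoring: combine overlap with the target-intention
# keywords and with the preset phrasing under assumed weights, then keep the
# top n. Weights and the token-overlap score are illustrative assumptions.
def score(demo: str, intent_kw: set[str], preset_kw: set[str],
          w_intent: float = 0.7, w_preset: float = 0.3) -> float:
    tokens = set(demo.lower().split())
    return w_intent * len(tokens & intent_kw) + w_preset * len(tokens & preset_kw)

demos = ["please help me navigate to Tiananmen Square", "turn on the radio"]
intent_kw = {"navigate", "tiananmen"}
preset_kw = {"please", "help", "me"}

n = 1  # n is a positive integer >= 1
top_n = sorted(demos, key=lambda d: score(d, intent_kw, preset_kw), reverse=True)[:n]
print(top_n)   # -> ['please help me navigate to Tiananmen Square']
```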
Fig. 4 is a schematic structural diagram of an embodiment of the speech semantic recognition apparatus of the application. As shown in fig. 4, the speech semantic recognition apparatus 40 of this embodiment includes a memory 401 and a processor 402. The memory 401 stores executable program code; the processor 402 is configured to call the executable program code in the memory 401 to perform the following steps: judging in real time whether voice information of a user is received; when voice information is received, judging whether it conforms to a preset phrasing; if so, performing the corresponding response operation according to the voice information; if not, parsing the voice information, obtaining the keywords in it, obtaining the user's target intention from the keywords and/or combinations of the keywords, and obtaining and displaying at least one piece of input demonstration information matching the user's target intention and the preset phrasings.
In one embodiment, the processor 402 is further configured to convert the received voice information into at least one piece of text information; segment the text information using lexicon-based segmentation; identify the keywords from the segmented text; and obtain the user's target intention from the keywords and/or combinations of the keywords.
The application also provides a vehicle equipped with the above speech semantic recognition device; the vehicle is an unmanned vehicle, a manually driven vehicle, or an intelligent vehicle that switches freely between unmanned and manual driving.
The speech semantic recognition method and device and the vehicle bring voice interaction technology to in-vehicle head units, reduce the user's manual operation through speech recognition, and provide voice guidance when the user has not yet mastered the voice commands, offering more appropriate help while accelerating the user's mastery of the voice functions and improving the user experience.
Although the application has been described with reference to preferred embodiments, various changes, substitutions, and alterations can be made without departing from the spirit and scope of the application, and all such changes, substitutions, and alterations are to be understood as covered by the following claims.

Claims (10)

1. A speech semantic recognition method, characterized in that the speech semantic recognition method comprises:
judging whether voice information of a user is received in real time;
when the voice information is received, judging whether the voice information conforms to a preset phrasing;
if yes, performing corresponding response operation according to the voice information;
if not, parsing the voice information, acquiring keywords in the voice information, acquiring a user target intention according to the keywords and/or the combination of the keywords, and acquiring and displaying at least one piece of input demonstration information matching the user target intention and the preset phrasing.
2. The speech semantic recognition method according to claim 1, wherein the step of parsing the voice information, obtaining keywords in the voice information, and obtaining the user target intention according to the keywords and/or the combination of the keywords comprises:
converting the received voice information into at least one piece of text information;
performing word segmentation on the text information, wherein lexicon-based word segmentation is adopted;
recognizing the keywords according to the segmented text;
and acquiring the target intention of the user according to the keywords and/or the combination of the keywords.
3. The speech semantic recognition method of claim 2, wherein the step of converting the received voice information into at least one piece of text information comprises:
performing feature recognition on the voice information to acquire voice features of the user, wherein the voice features of the user at least comprise regional feature data of the user;
judging the official language type of the region corresponding to the language type used by the user according to the voice characteristics of the user;
converting the voice information into the at least one piece of text information matching the official language type.
4. The speech semantic recognition method of claim 2, wherein the step of converting the received voice information into at least one piece of text information is followed by:
carrying out error correction processing on the at least one piece of text information through near-synonym matching and common homophone replacement.
5. The speech semantic recognition method of claim 2, wherein the lexicon-based word segmentation segments the text information with the aid of a Chinese dictionary database, a historical-behavior lexicon, and a popular-search lexicon.
6. The speech semantic recognition method of claim 1, wherein the step of obtaining and presenting at least one piece of input demonstration information matching the user target intention and the preset phrasing is preceded by the step of:
and classifying the input demonstration information according to a preset rule to improve the response rate.
7. The speech semantic recognition method of claim 1, wherein the step of obtaining and presenting at least one piece of input demonstration information matching the user target intention and the preset phrasing comprises:
carrying out weighted scoring on the input demonstration information according to the degree of match with the user target intention and the preset phrasing, and acquiring and displaying the input demonstration information ranked in the top n, wherein n is a positive integer greater than or equal to 1.
8. A speech semantic recognition device is characterized by comprising a memory and a processor,
the memory is used for storing executable program codes;
the processor is configured to call the executable program code in the memory to perform the steps of:
judging whether voice information of a user is received in real time;
when voice information is received, judging whether the voice information conforms to a preset phrasing;
if so, making a corresponding response operation according to the voice information;
if not, parsing the voice information, acquiring keywords in the voice information, acquiring a user target intention according to the keywords and/or the combination of the keywords, and acquiring and displaying at least one piece of input demonstration information matching the user target intention and the preset phrasing.
9. The speech semantic recognition apparatus of claim 8, wherein the processor is further configured to convert the received voice information into at least one piece of text information; perform lexicon-based word segmentation on the text information; identify keywords from the segmented text; and acquire the user target intention according to the keywords and/or the combination of the keywords.
10. A vehicle equipped with the speech semantic recognition device according to claim 9, characterized in that the vehicle is an unmanned vehicle, a human-driven vehicle, or an intelligent vehicle that switches freely between unmanned and human-driven modes.
CN201910009490.4A, filed 2019-01-04 (priority 2019-01-04): Speech semantic recognition method, device and vehicle. Status: Active. Granted as CN111415656B.

Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201910009490.4A | 2019-01-04 | 2019-01-04 | Speech semantic recognition method, device and vehicle

Publications (2)

Publication Number | Publication Date
CN111415656A (this application) | 2020-07-14
CN111415656B (grant) | 2024-04-30

Family

ID: 71494055

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
CN201910009490.4A | Speech semantic recognition method, device and vehicle (Active; granted as CN111415656B) | 2019-01-04 | 2019-01-04

Country Status (1)

CN: CN111415656B


Patent Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080052062A1 (en) * 2003-10-28 2008-02-28 Joey Stanford System and Method for Transcribing Audio Files of Various Languages
US20120035915A1 (en) * 2009-04-30 2012-02-09 Tasuku Kitade Language model creation device, language model creation method, and computer-readable storage medium
US20110087491A1 (en) * 2009-10-14 2011-04-14 Andreas Wittenstein Method and system for efficient management of speech transcribers
US20120109649A1 (en) * 2010-11-01 2012-05-03 General Motors Llc Speech dialect classification for automatic speech recognition
CN103578472A (en) * 2012-08-10 2014-02-12 海尔集团公司 Method and device for controlling electrical equipment
CN105047198A (en) * 2015-08-24 2015-11-11 百度在线网络技术(北京)有限公司 Voice error correction processing method and apparatus
CN105206266A (en) * 2015-09-01 2015-12-30 重庆长安汽车股份有限公司 Vehicle-mounted voice control system and method based on user intention guess
CN106847276A (en) * 2015-12-30 2017-06-13 昶洧新能源汽车发展有限公司 A kind of speech control system with accent recognition
CN105654954A (en) * 2016-04-06 2016-06-08 普强信息技术(北京)有限公司 Cloud voice recognition system and method
CN107155121A (en) * 2017-04-26 2017-09-12 海信集团有限公司 The display methods and device of Voice command text
CN108053823A (en) * 2017-11-28 2018-05-18 广西职业技术学院 A kind of speech recognition system and method
CN108121528A (en) * 2017-12-06 2018-06-05 深圳市欧瑞博科技有限公司 Sound control method, device, server and computer readable storage medium
CN108447473A (en) * 2018-03-06 2018-08-24 深圳市沃特沃德股份有限公司 Voice translation method and device
CN108877791A (en) * 2018-05-23 2018-11-23 百度在线网络技术(北京)有限公司 Voice interactive method, device, server, terminal and medium based on view

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112017663A (en) * 2020-08-14 2020-12-01 博泰车联网(南京)有限公司 Voice generalization method and device and computer storage medium
CN112017663B (en) * 2020-08-14 2024-04-30 博泰车联网(南京)有限公司 Voice generalization method and device and computer storage medium
CN112102840A (en) * 2020-09-09 2020-12-18 中移(杭州)信息技术有限公司 Semantic recognition method, device, terminal and storage medium
CN112102840B (en) * 2020-09-09 2024-05-03 中移(杭州)信息技术有限公司 Semantic recognition method, semantic recognition device, terminal and storage medium
CN112346697A (en) * 2020-09-14 2021-02-09 北京沃东天骏信息技术有限公司 Method, device and storage medium for controlling equipment
CN112896189A (en) * 2021-02-26 2021-06-04 江西江铃集团新能源汽车有限公司 Automatic driving vehicle control method and device, readable storage medium and vehicle-mounted terminal
CN113205817A (en) * 2021-07-06 2021-08-03 明品云(北京)数据科技有限公司 Speech semantic recognition method, system, device and medium
CN114842847A (en) * 2022-04-27 2022-08-02 中国第一汽车股份有限公司 Vehicle-mounted voice control method and device
CN115457959A (en) * 2022-11-08 2022-12-09 广州小鹏汽车科技有限公司 Voice interaction method, server and computer readable storage medium
CN115457959B (en) * 2022-11-08 2023-02-10 广州小鹏汽车科技有限公司 Voice interaction method, server and computer readable storage medium
CN117292688A (en) * 2023-11-24 2023-12-26 深圳市华南英才科技有限公司 Control method based on intelligent voice mouse and intelligent voice mouse
CN117292688B (en) * 2023-11-24 2024-02-06 深圳市华南英才科技有限公司 Control method based on intelligent voice mouse and intelligent voice mouse

Also Published As

Publication number Publication date
CN111415656B (en) 2024-04-30

Similar Documents

Publication Publication Date Title
CN111415656A (en) Voice semantic recognition method and device and vehicle
CN107016994B (en) Voice recognition method and device
CN108304372B (en) Entity extraction method and device, computer equipment and storage medium
JP4666648B2 (en) Voice response system, voice response program
US8606581B1 (en) Multi-pass speech recognition
US5787230A (en) System and method of intelligent Mandarin speech input for Chinese computers
WO2017071182A1 (en) Voice wakeup method, apparatus and system
KR100679042B1 (en) Method and apparatus for speech recognition, and navigation system using for the same
US20110093261A1 (en) System and method for voice recognition
CN109637537B (en) Method for automatically acquiring annotated data to optimize user-defined awakening model
WO2020123227A1 (en) Speech processing system
CN109801628B (en) Corpus collection method, apparatus and system
JP5703491B2 (en) Language model / speech recognition dictionary creation device and information processing device using language model / speech recognition dictionary created thereby
WO2005077098B1 (en) Handwriting and voice input with automatic correction
CN114328881A (en) Short text matching-based voice question-answering method and system
CN112927679A (en) Method for adding punctuation marks in voice recognition and voice recognition device
CN112015872A (en) Question recognition method and device
CN112669842A (en) Man-machine conversation control method, device, computer equipment and storage medium
Kurzekar et al. Continuous speech recognition system: A review
CN110852075A (en) Voice transcription method and device for automatically adding punctuation marks and readable storage medium
CN115240655A (en) Chinese voice recognition system and method based on deep learning
US11783824B1 (en) Cross-assistant command processing
CN116052655A (en) Audio processing method, device, electronic equipment and readable storage medium
CN115132178B (en) Semantic endpoint detection system based on deep learning
CN104424942A (en) Method for improving character speed input accuracy

Legal Events

Code | Description
PB01 | Publication
SE01 | Entry into force of request for substantive examination
GR01 | Patent grant