CN109003611A - Method, apparatus, device and medium for vehicle voice control


Info

Publication number
CN109003611A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811150983.1A
Other languages
Chinese (zh)
Other versions
CN109003611B (en)
Inventor
张佳雄
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Apollo Zhilian Beijing Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201811150983.1A
Publication of CN109003611A
Application granted
Publication of CN109003611B
Legal status: Active


Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • B: PERFORMING OPERATIONS; TRANSPORTING
    • B60: VEHICLES IN GENERAL
    • B60R: VEHICLES, VEHICLE FITTINGS, OR VEHICLE PARTS, NOT OTHERWISE PROVIDED FOR
    • B60R16/00: Electric or fluid circuits specially adapted for vehicles and not otherwise provided for; Arrangement of elements of electric or fluid circuits specially adapted for vehicles and not otherwise provided for
    • B60R16/02: Electric or fluid circuits specially adapted for vehicles and not otherwise provided for; Arrangement of elements of electric or fluid circuits specially adapted for vehicles and not otherwise provided for electric constitutive elements
    • B60R16/037: Electric or fluid circuits specially adapted for vehicles and not otherwise provided for; Arrangement of elements of electric or fluid circuits specially adapted for vehicles and not otherwise provided for electric constitutive elements for occupant comfort, e.g. for automatic adjustment of appliances according to personal settings, e.g. seats, mirrors, steering wheel
    • B60R16/0373: Voice control
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/26: Speech to text systems
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223: Execution procedure of a spoken command

Abstract

Embodiments of the present disclosure relate to a method, apparatus, device and computer-readable storage medium for vehicle voice control. The method includes: obtaining text generated by a vehicle recognizing voice input by a user; dividing the text into multiple text parts based on identity information of the user; generating an instruction set by determining one or more vehicle-executable instructions associated with each text part; and causing the vehicle to execute at least some of the instructions in the instruction set. The technical solution of the present disclosure can improve the efficiency and accuracy of speech recognition in in-vehicle scenarios, thereby improving the user's voice interaction experience.

Description

Method, apparatus, device and medium for vehicle voice control
Technical field
The present disclosure relates generally to the field of information processing, and more particularly to a method, apparatus, device and computer-readable storage medium for vehicle voice control.
Background
Currently, in connected-vehicle scenarios, as speech recognition and echo cancellation technologies mature, users operate their vehicles by voice more and more often. Voice interaction has also evolved from single-round to multi-round interaction, which makes the interaction process smoother. However, the number of instructions a user can issue in a single voice interaction is still limited to one, so speech recognition cannot be used effectively. Moreover, several operations that a user performs frequently cannot be completed simply and conveniently. In addition, it is difficult for users to invoke the individual applications of the in-vehicle system by voice. These shortcomings degrade the user's voice interaction experience.
Summary
According to example embodiments of the present disclosure, a scheme for vehicle voice control is provided.
In a first aspect of the present disclosure, a method for vehicle voice control is provided. The method includes obtaining text generated by a vehicle recognizing voice input by a user. The method further includes dividing the text into multiple text parts based on identity information of the user. The method further includes generating an instruction set by determining one or more vehicle-executable instructions associated with each text part. The method further includes causing the vehicle to execute at least some of the instructions in the instruction set.
In a second aspect of the present disclosure, an apparatus for vehicle voice control is provided. The apparatus includes an obtaining module configured to obtain text generated by a vehicle recognizing voice input by a user. The apparatus further includes a dividing module configured to divide the text into multiple text parts based on identity information of the user. The apparatus further includes a generating module configured to generate an instruction set by determining one or more vehicle-executable instructions associated with each text part. The apparatus further includes an executing module configured to cause the vehicle to execute at least some of the instructions in the instruction set.
In a third aspect of the present disclosure, an electronic device is provided. The electronic device includes one or more processors, and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method according to the first aspect of the present disclosure.
In a fourth aspect of the present disclosure, a computer-readable storage medium is provided, on which a computer program is stored; the program, when executed by a processor, implements the method according to the first aspect of the present disclosure.
It should be understood that the content described in this Summary is not intended to identify key or essential features of the embodiments of the present disclosure, nor is it intended to limit the scope of the present disclosure. Other features of the present disclosure will become readily understood from the following description.
Brief description of the drawings
The above and other features, advantages and aspects of the embodiments of the present disclosure will become more apparent with reference to the following detailed description taken in conjunction with the accompanying drawings. In the drawings, the same or similar reference numerals denote the same or similar elements, in which:
Fig. 1 shows a schematic diagram of an example environment in which multiple embodiments of the present disclosure can be implemented;
Fig. 2 shows a schematic flowchart of a process or method for vehicle voice control according to some embodiments of the present disclosure;
Fig. 3 shows a schematic block diagram of an apparatus for vehicle voice control according to some embodiments of the present disclosure; and
Fig. 4 shows a schematic block diagram of a computing device capable of implementing multiple embodiments of the present disclosure.
Detailed description
Embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although certain embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be implemented in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of protection of the present disclosure.
In the description of the embodiments of the present disclosure, the term "includes" and similar terms should be understood as open-ended inclusion, that is, "including but not limited to". The term "based on" should be understood as "based at least in part on". The term "one embodiment" or "the embodiment" should be understood as "at least one embodiment". The terms "first", "second" and the like may refer to different or the same objects. Other explicit and implicit definitions may also be included below.
As mentioned above, in current connected-vehicle scenarios, users cannot use speech recognition effectively, cannot directly invoke multiple operations, and cannot invoke the individual applications of the in-vehicle system, all of which degrades the user's voice interaction experience.
Embodiments of the present disclosure propose a scheme for vehicle voice control. In this scheme, text generated by a vehicle recognizing voice input by a user is obtained; the text is divided into multiple text parts based on identity information of the user; an instruction set is generated by determining one or more vehicle-executable instructions associated with each text part; and the vehicle is caused to execute at least some of the instructions in the instruction set. In this way, the multiple instructions that the user intends to execute can be identified based on the user's identity information, which improves the efficiency and accuracy of instruction recognition and significantly improves the user's voice interaction experience.
Embodiments of the present disclosure are described in detail below with reference to the accompanying drawings.
Fig. 1 shows a schematic diagram of an example environment 100 in which multiple embodiments of the present disclosure can be implemented. As shown, example environment 100 includes vehicle 110, user 120 and computing device 130. Vehicle 110 may be any entity capable of moving, such as a motor vehicle or a non-motor vehicle. Although vehicle 110 is used as the example herein, it should be understood that the vehicle may also be replaced by an entity that does not necessarily move, such as a household appliance (a television, air conditioner, refrigerator, microwave oven and so on).
Vehicle 110 includes in-vehicle computing device 112, voice collection device 114 and storage device 116. In-vehicle computing device 112 may be any suitable centralized or distributed computing device, including but not limited to a personal computer, server, client, handheld or laptop device, multiprocessor, microprocessor, set-top box, programmable consumer electronics, network PC, minicomputer, mainframe computer system, distributed cloud, and combinations thereof.
Voice collection device 114 may be any collection device capable of collecting voice from user 120. Examples of voice collection device 114 include, but are not limited to, an in-vehicle microphone and an in-vehicle camera with a microphone. In addition, storage device 116 may be any storage device for storing data related to vehicle 110.
In certain embodiments, voice collection device 114 may collect voice from user 120 and provide the collected voice to in-vehicle computing device 112. In-vehicle computing device 112 may convert the collected voice into text and identify one or more vehicle-executable instructions involved in the text. A vehicle-executable instruction may operate on the individual applications of the in-vehicle system. For example, vehicle-executable instructions may indicate "open navigation", "open music" and so on, so that the navigation application, the music application and so on of the in-vehicle system can be opened.
In certain embodiments, storage device 116 may store wake-up sentences. A wake-up sentence is usually not itself a vehicle-executable instruction, but is associated with vehicle-executable instructions. In-vehicle computing device 112 may obtain a wake-up sentence from storage device 116 and compare it with the text. When the wake-up sentence matches the text, in-vehicle computing device 112 may obtain the instruction set (which may also be called a workflow) corresponding to the wake-up sentence. For example, in-vehicle computing device 112 may obtain the instruction set from computing device 130. Alternatively, in-vehicle computing device 112 may obtain the instruction set from storage device 116. In-vehicle computing device 112 may then execute the instruction set.
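The wake-up sentence lookup described above can be sketched as a simple table lookup. This is only an illustration under stated assumptions: the table `WORKFLOWS`, the function `handle_text` and the optional `send_to_remote` hook are hypothetical names, and exact string equality stands in for whatever matching the in-vehicle computing device actually performs.

```python
# Hypothetical local table: wake-up sentence -> workflow (instruction set).
WORKFLOWS = {
    "I will go home": ["open navigation", "open music"],
}


def handle_text(text, workflows=WORKFLOWS, send_to_remote=None):
    """Return the stored workflow on an exact match, else defer to a remote device.

    send_to_remote is an assumed callable standing in for the link to a
    remote computing device; when it is absent, no instructions are returned.
    """
    key = text.strip()
    if key in workflows:
        # The text matches a wake-up sentence: return a copy of its workflow.
        return list(workflows[key])
    if send_to_remote is None:
        return []
    return send_to_remote(text)
```

A text that merely contains a wake-up sentence, such as "I will go home, and call wife", does not match here, which is the case that motivates handing the text to a more capable device.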
Computing device 130 may be remote from or local to vehicle 110. Computing device 130 may be any suitable centralized or distributed computing device, including but not limited to a personal computer, server, client, handheld or laptop device, multiprocessor, microprocessor, set-top box, programmable consumer electronics, network PC, minicomputer, mainframe computer system, distributed cloud, and combinations thereof. Computing device 130 may communicate with vehicle 110, in particular with in-vehicle computing device 112 therein, for example over wired and/or wireless connections.
Conversely, when the wake-up sentence and the text do not match, in-vehicle computing device 112 may send the text to computing device 130. In certain embodiments, computing device 130 may apply multi-layer processing to the text, for example two or three layers of processing. Specifically, in the first layer of processing, computing device 130 may divide the text into multiple text parts based on the identity information of user 120. The identity information of user 120 may indicate wake-up sentences associated with user 120, conjunctions specific to user 120, and so on. In addition, in the first layer of processing, computing device 130 may also use general conjunctions to divide the text into multiple text parts. General conjunctions are words commonly used to divide text, such as "and".
In the second layer of processing, computing device 130 may apply grammatical analysis to at least one of the text parts obtained by the first layer of processing, to obtain one or more text parts. In certain embodiments, in addition to at least one of the text parts obtained by the first layer, computing device 130 may also apply grammatical analysis to the text itself.
Further, computing device 130 may determine the instructions corresponding to the one or more text parts obtained by division. In certain embodiments, computing device 130 may determine only the instructions corresponding to the text parts obtained by the second layer of processing. Alternatively, computing device 130 may determine the instructions corresponding to the text parts obtained by both the first and the second layer of processing.
In certain embodiments, computing device 130 may convert a text part into a machine semantic expression by deep semantic analysis, and determine the instruction corresponding to the text part based on the machine semantic expression.
Deep semantic analysis finds the corresponding semantic roles for each predicate of a sentence and converts the sentence into a machine semantic expression, for example a predicate logic expression (such as a lambda calculus expression) or a dependency-based compositional semantic representation. An example sentence, given in Chinese and English, and its corresponding first-order predicate logic expression are shown below:
Chinese (translated): List all the rivers in the state of Colorado
English: Name all the rivers in Colorado
Semantic expression: answer(river(loc_2(stateid('colorado'))))
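To make the nesting of such an expression explicit, the sketch below reads it into a tree of (predicate, arguments) pairs. It relies on the fact that this particular expression happens to have the syntax of a nested Python function call, so the standard `ast` module can parse it; this is a convenience for illustration, not the analysis method of the disclosure, and the helper names are hypothetical.

```python
import ast


def expression_to_tree(expression):
    """Parse a nested functional expression into (name, [arguments]) tuples."""
    root = ast.parse(expression, mode="eval").body
    return _convert(root)


def _convert(node):
    # A call such as river(...) becomes ("river", [converted arguments]).
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
        return (node.func.id, [_convert(arg) for arg in node.args])
    # A quoted literal such as 'colorado' becomes its plain value.
    if isinstance(node, ast.Constant):
        return node.value
    raise ValueError("unsupported expression element")


def predicates(tree):
    """List predicate names from outermost to innermost (first argument only)."""
    names = []
    while isinstance(tree, tuple):
        name, args = tree
        names.append(name)
        tree = args[0] if args else None
    return names
```

Read from the inside out, the tree says: take the state identified as 'colorado', find what is located in it, keep the rivers, and return them as the answer.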
Methods of deep semantic analysis include, but are not limited to, knowledge-base (or database) based semantic analysis, supervised semantic analysis, and semi-supervised or unsupervised semantic analysis. In knowledge-base based semantic analysis, the knowledge base records a series of facts in forms such as triples. For a given sentence, semantic analysis converts the sentence into a series of tuples defined in the knowledge base, which together form an entity-relationship graph.
Supervised semantic analysis requires a manually annotated semantic analysis corpus, in which the semantic expression of each sentence has been annotated by hand.
In semi-supervised or unsupervised semantic analysis, no manually annotated semantic analysis corpus is needed; unsupervised semantic analysis uses only the entity names, relation names and the like in the knowledge base, and does not use the facts recorded there. Although it does not use a manually annotated corpus, unsupervised semantic analysis usually employs the expectation maximization (EM) algorithm. In each iteration of the algorithm, the sentences are semantically analyzed, and the sentences with high confidence, together with their semantic analysis results, are selected as a self-training data set.
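The selection step of one such iteration (keeping only high-confidence sentences and their analyses as self-training data) might look roughly as follows. This sketch covers only that selection step, not the EM algorithm itself; the `parse` callable, the threshold value and `toy_parse` are assumptions introduced for illustration.

```python
def select_self_training_data(sentences, parse, threshold=0.9):
    """Keep (sentence, analysis) pairs whose confidence reaches the threshold.

    parse is an assumed callable returning (analysis, confidence) for a sentence.
    """
    selected = []
    for sentence in sentences:
        analysis, confidence = parse(sentence)
        if confidence >= threshold:
            selected.append((sentence, analysis))
    return selected


def toy_parse(sentence):
    # Hypothetical stand-in analyzer: shorter sentences get higher confidence.
    words = sentence.split()
    return ("parsed:" + words[0], 1.0 / len(words))
```

In a full self-training loop the selected pairs would be used to update the analyzer before the next iteration; that update is omitted here.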
In addition to the first and second layers of processing described above, in certain embodiments computing device 130 may also perform a third layer of processing, which provides auxiliary parsing for text parts that the first two layers could not parse. In the third layer of processing, computing device 130 may apply regional analysis, specific to a certain region, to those text parts obtained by division that are not associated with any instruction executable by vehicle 110. In certain embodiments, computing device 130 may apply regional analysis to those text parts obtained by the second layer of processing that are not associated with any instruction executable by vehicle 110. Alternatively, computing device 130 may apply regional analysis to those text parts obtained by both the first and the second layer of processing that are not associated with any instruction executable by vehicle 110.
For example, computing device 130 may obtain, for instance via a global positioning system (GPS) mounted on vehicle 110, the geographic location of user 120 at the time the voice is input. Based on that geographic location, computing device 130 may apply regional analysis to those text parts obtained by division that are not associated with any instruction executable by vehicle 110. Regional analysis may include, but is not limited to, dialect analysis.
Regional analysis may use the region where the user is located to read a word-sense knowledge base specific to that region, a semantic analysis knowledge base, and a dedicated manually annotated semantic analysis corpus, in order to divide the text parts. The word-sense knowledge base may include predicates and nouns peculiar to the region, and can be used to identify unrecognized words and phrases and to disambiguate word senses. The semantic analysis knowledge base may include semantic expressions peculiar to the region. The dedicated manually annotated semantic analysis corpus may consist of time-sensitive semantic expressions generated from the region's news and trending topics.
Then, computing device 130 may determine the instructions corresponding to the one or more text parts obtained by the third layer of processing. In certain embodiments, computing device 130 may, by deep semantic analysis as described above, convert a text part obtained by the third layer of processing into a machine semantic expression, and determine the instruction corresponding to the text part based on the machine semantic expression.
In certain embodiments, computing device 130 may also remove duplicate instructions from the instructions corresponding to the text parts obtained by the first, second and third layers of processing. Duplicates may be removed when instructions are determined for each layer, or once instructions have been determined for all layers.
In certain embodiments, computing device 130 may determine an available instruction set, so that only those determined instructions that belong to the available instruction set are executed by vehicle 110. For example, the available instruction set supported by vehicle 110 may be configured. In this case, computing device 130 may obtain the identifier of vehicle 110 and determine the available instruction set corresponding to that identifier.
In this way, the voice input by the user can be processed at multiple levels based on the identity information of user 120, general conjunctions, grammatical analysis and regional analysis, in order to determine the multiple instructions that user 120 intends to execute. This greatly improves the efficiency and accuracy of instruction recognition and significantly improves the user's voice interaction experience.
Fig. 2 shows a schematic flowchart of a process or method 200 for vehicle voice control according to some embodiments of the present disclosure. For example, method 200 may be performed at computing device 130 shown in Fig. 1 or at another suitable system. For example, method 200 may be performed by in-vehicle computing device 112 in vehicle 110 or by a computing device associated with it. In addition, method 200 may include additional steps not shown and/or may omit steps shown; the scope of the present disclosure is not limited in this respect.
At 210, computing device 130 obtains text generated by vehicle 110 recognizing voice input by user 120. For example, the voice input by user 120, and the text generated from it, may be "I will go home, and call wife". In certain embodiments, computing device 130 may obtain the text when vehicle 110 cannot determine the one or more vehicle-executable instructions. For example, as described above, storage device 116 of vehicle 110 may store wake-up sentences. A wake-up sentence is not itself a vehicle-executable instruction but is associated with vehicle-executable instructions. For example, the wake-up sentence may be "I will go home", associated with the instructions "open navigation" and "open music".
Because the text ("I will go home, and call wife") does not match the stored wake-up sentence ("I will go home"), vehicle 110 cannot recognize the text, and therefore cannot execute the instructions "open navigation", "open music" and "call wife" that user 120 intends to execute. In this case, vehicle 110 sends the text to computing device 130 so that computing device 130 can recognize it. In this way, speech recognition can be performed more accurately even when the computing capability of vehicle 110 is limited, which saves the computing resources required by vehicle 110.
At 220, computing device 130 divides the text into multiple text parts based on the identity information of user 120. In certain embodiments, computing device 130 may identify wake-up sentences associated with user 120 in the text and, when a wake-up sentence is recognized, mark off each wake-up sentence from the text as a text part (also referred to as a "first text part"). A wake-up sentence may be a system default or may be set by user 120. Although a wake-up sentence is described as being associated with a single user, it may also be associated with multiple users or with all users. For example, a system-default wake-up sentence may apply to all users.
Assume that the text is "I will go home, after that call wife" and the wake-up sentence is "I will go home". In this case, computing device 130 may identify the wake-up sentence "I will go home" in the text and mark it off from the text as a first text part. In this way, user 120 can easily perform an operation involving multiple instructions through a wake-up sentence, which improves the efficiency of voice interaction.
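Marking off a wake-up sentence as its own text part can be sketched with plain substring operations. The function name and the example strings below are illustrative English renderings, not taken from an actual implementation, and exact substring matching is an assumption.

```python
def split_on_wake_sentences(text, wake_sentences):
    """Split text so that each wake-up sentence found becomes its own part."""
    parts = [text]
    for wake in wake_sentences:
        next_parts = []
        for part in parts:
            if part in wake_sentences:
                # Already marked off as a first text part; leave it untouched.
                next_parts.append(part)
                continue
            before, found, after = part.partition(wake)
            if not found:
                next_parts.append(part)
                continue
            # Keep the non-empty pieces in their original order.
            pieces = (before.strip(" ,"), wake, after.strip(" ,"))
            next_parts.extend(piece for piece in pieces if piece)
        parts = next_parts
    return parts
```

The remainder left after the wake-up sentence is removed is what the later division steps (conjunctions, grammatical analysis) operate on.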
In certain embodiments, computing device 130 may identify a conjunction specific to user 120 in the text and divide the text based on that conjunction. A conjunction specific to user 120 may be preset by user 120, or may be learned by computing device 130 from the user's historical voice input. For example, assume that the conjunction specific to user 120 is "after that". In this case, computing device 130 may identify the conjunction "after that" in the text "I will go home, after that call wife" and, based on it, divide the text into "I will go home" and "call wife". In this way, the text can be divided according to user settings or user habits, which improves the user experience.
Further, in certain embodiments, computing device 130 may also use general conjunctions to divide the text into multiple text parts. General conjunctions are words commonly used to divide text, such as "and". For example, computing device 130 may identify the general conjunction "and" in the text "I will go home and call wife" and, based on it, divide the text into "I will go home" and "call wife".
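Division by user-specific and general conjunctions can be sketched with a regular-expression split. The function name, the default list of general conjunctions and the English example strings are assumptions for illustration; whole-word matching is used so that "and" does not split a word that merely contains it.

```python
import re

GENERAL_CONJUNCTIONS = ["and"]  # assumed default list, for illustration only


def split_on_conjunctions(text, user_conjunctions=(), general_conjunctions=GENERAL_CONJUNCTIONS):
    """Divide text at user-specific and general conjunctions."""
    conjunctions = set(user_conjunctions) | set(general_conjunctions)
    if not conjunctions:
        return [text.strip()]
    # Try longer conjunctions first so multi-word ones are matched whole.
    ordered = sorted(conjunctions, key=len, reverse=True)
    pattern = r"\b(?:" + "|".join(re.escape(c) for c in ordered) + r")\b"
    pieces = (piece.strip(" ,") for piece in re.split(pattern, text))
    return [piece for piece in pieces if piece]
```

For instance, `split_on_conjunctions("I will go home, after that call wife", user_conjunctions=["after that"])` yields the two parts "I will go home" and "call wife".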
Then, computing device 130 may apply grammatical analysis to at least one of the text parts obtained by division, to obtain one or more text parts (also referred to as "second text parts"). Assume that the text is "I will go home play weather call wife". Computing device 130 may identify the wake-up sentence "I will go home" and divide the text into two text parts, "I will go home" and "play weather call wife". In this case, computing device 130 may apply grammatical analysis to one of the two text parts, "play weather call wife".
For example, computing device 130 may divide the text part "play weather call wife" based on predicate-object grammatical analysis. Because "play" and "call" are predicates and "weather" and "wife" are objects, computing device 130 may divide the text part "play weather call wife" into two second text parts, "play weather" and "call wife". In this way, the text can be further divided based on grammar, which increases the accuracy of speech recognition.
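A very rough stand-in for predicate-object division is to start a new text part at each known predicate. The small predicate lexicon below is hypothetical; a real system would use an actual grammatical analyzer rather than a word list.

```python
PREDICATES = {"play", "call", "open"}  # hypothetical predicate lexicon


def split_predicate_object(text, predicates=PREDICATES):
    """Start a new text part at every predicate word."""
    parts, current = [], []
    for word in text.split():
        # A predicate closes the part collected so far and opens a new one.
        if word.lower() in predicates and current:
            parts.append(" ".join(current))
            current = []
        current.append(word)
    if current:
        parts.append(" ".join(current))
    return parts
```

Applied to "play weather call wife", this produces the two second text parts "play weather" and "call wife".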
Further, computing device 130 may determine the instructions corresponding to the one or more text parts obtained by division. In certain embodiments, computing device 130 may convert a text part into a machine semantic expression by deep semantic analysis, and determine the instruction corresponding to the text part based on the machine semantic expression. For example, computing device 130 may determine that the instructions corresponding to the text parts "I will go home", "play weather" and "call wife" are "open navigation", "open music", "play weather" and "call wife".
Sometimes, however, computing device 130 may be unable to determine the instructions corresponding to some of the text parts obtained by division; this may be caused by the dialect of the particular region where user 120 is located. In this case, in some embodiments, computing device 130 may also obtain the geographic location of user 120 at the time the voice is input and, based on that geographic location, apply dialect analysis to those of the one or more second text parts that are not associated with any vehicle-executable instruction.
For example, if vehicle 110 is located in Chongqing when user 120 inputs the voice, computing device 130 may, based on that geographic location, apply dialect analysis for the Chongqing dialect to the second text parts that are not associated with any vehicle-executable instruction. Assuming that the text part "call women" is not associated with any vehicle-executable instruction, computing device 130 may, through dialect analysis, interpret "call women" as "call wife".
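Dialect analysis keyed by geographic region can be illustrated as a word-level lookup. The lexicon content below simply mirrors the translated example from the text and is not real dialect data; the table and function names are hypothetical.

```python
# Hypothetical region-keyed lexicons mapping dialect words to standard words.
DIALECT_LEXICONS = {
    "Chongqing": {"women": "wife"},
}


def normalize_dialect(text_part, region, lexicons=DIALECT_LEXICONS):
    """Replace region-specific words with their standard equivalents."""
    lexicon = lexicons.get(region, {})
    # Words without an entry for this region are left unchanged.
    return " ".join(lexicon.get(word, word) for word in text_part.split())
```

The normalized text part can then be passed through the same instruction lookup as any other text part.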
Then, computing device 130 may determine the instructions corresponding to the one or more text parts obtained by dialect analysis. In certain embodiments, as described above, computing device 130 may, by deep semantic analysis, convert a text part obtained by dialect analysis into a machine semantic expression, and determine the instruction corresponding to the text part based on the machine semantic expression. In this way, regional extension can be carried out for the region where user 120 is located, which improves the accuracy and efficiency of speech recognition.
At 230, computing device 130 may generate an instruction set by determining one or more vehicle-executable instructions associated with each text part. As described above, in certain embodiments, computing device 130 may convert each divided text part into a machine semantic expression by deep semantic analysis, and determine the instruction corresponding to the text part based on the machine semantic expression. The determined instructions may form an instruction set. For example, the determined instructions may form an ordered instruction set that follows the order in which the corresponding text parts appear in the text.
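Ordering the instructions by where their text parts appear in the text can be sketched as below. The mapping from text parts to instructions is assumed to have been produced already (for example by the semantic analysis described above); the function name and the example mapping are illustrative.

```python
def build_instruction_set(text, part_to_instructions):
    """Order instructions by the position of their text part in the text."""
    found = []
    for part, instructions in part_to_instructions.items():
        position = text.find(part)
        if position >= 0:
            found.append((position, instructions))
    # Earlier text parts contribute their instructions first.
    found.sort(key=lambda item: item[0])
    ordered = []
    for _, instructions in found:
        ordered.extend(instructions)
    return ordered
```

Text parts that do not occur in the text are ignored, so the result reflects only what the user actually said.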
In addition, in certain embodiments, calculating equipment 130 can be by removing the executable instruction of one or more vehicles In duplicate instruction, generate described instruction set.Assuming that text is " I will go home and open music ", 130 base of equipment is calculated Instruction instruction " opening navigation ", " opening music " and " opening music " associated with the text can be determined in the above method. It is clearly, there are two duplicate instructions " opening music ".In the case, calculating equipment 130 can be by removing a repetition Instruction " open music ", generate the only instruction set comprising " opening a navigation " instruction and " opening a music " instruction. In this way, it is possible to avoid duplicate operation is executed, to improve user experience.
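The ordering and de-duplication at 230 can be sketched as follows. This is only an illustration under the assumption that instructions are represented as plain strings; the function name `build_instruction_set` is hypothetical.

```python
from typing import Iterable, List


def build_instruction_set(instructions: Iterable[str]) -> List[str]:
    """Keep the first occurrence of each instruction, preserving the
    order in which the corresponding text portions appeared."""
    seen = set()
    ordered = []
    for instruction in instructions:
        if instruction not in seen:
            seen.add(instruction)
            ordered.append(instruction)
    return ordered


# Instructions from the example in the text above.
determined = ["open navigation", "open music", "open music"]
print(build_instruction_set(determined))
```

Keeping the first occurrence rather than the last preserves the position of each instruction relative to the original text.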
At 240, the computing device 130 may cause the vehicle 110 to execute at least a part of the instructions in the instruction set. In some embodiments, the set of available instructions supported by the vehicle 110 may be configured. In this case, the computing device 130 may obtain an identifier of the vehicle 110 and determine the available instruction set corresponding to that identifier, so that the vehicle 110 executes at least a part of the instructions in the instruction set that belong to the available instruction set.
For example, the user 120 or the manufacturer of the vehicle 110 may configure the available instruction set of the vehicle 110 so that it does not include "open music". In this case, even if the computing device 130 determines the instruction "open music", it will not cause the vehicle 110 to execute that instruction. In this way, the operations that the vehicle 110 is permitted to perform can be configured, improving the safety and flexibility of the in-vehicle system.
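The filtering at 240 can be sketched as a lookup keyed by the vehicle identifier. The identifier `"vehicle-110"`, the table `AVAILABLE_INSTRUCTIONS` and the function `executable_subset` are hypothetical names chosen for illustration. Treating an unconfigured vehicle as unrestricted is also an assumption of this sketch; the disclosure does not say what happens when no available instruction set is configured.

```python
from typing import Dict, List, Set

# Hypothetical configuration: vehicle identifier -> available instructions.
# Here "open music" has deliberately been left out for vehicle-110.
AVAILABLE_INSTRUCTIONS: Dict[str, Set[str]] = {
    "vehicle-110": {"open navigation"},
}


def executable_subset(vehicle_id: str, instruction_set: List[str]) -> List[str]:
    """Return the instructions the vehicle is allowed to execute,
    keeping the original order of the instruction set."""
    allowed = AVAILABLE_INSTRUCTIONS.get(vehicle_id)
    if allowed is None:
        # Assumption: a vehicle with no configuration is unrestricted.
        return list(instruction_set)
    return [instruction for instruction in instruction_set if instruction in allowed]


print(executable_subset("vehicle-110", ["open navigation", "open music"]))
```

A stricter deployment might prefer to return an empty list for unknown identifiers, so that nothing is executed until an available instruction set has been configured.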
In this way, the voice input by the user can be processed at multiple levels based on the identity information of the user 120, general conjunctions, syntactic analysis and dialect analysis; after the instructions are determined, duplicate instructions are removed and the vehicle 110 executes only the operations it is permitted to execute. This not only improves the efficiency and accuracy of instruction recognition but also improves the safety and flexibility of the in-vehicle system, thereby significantly improving the user's voice interaction experience.
Fig. 3 shows a schematic block diagram of an apparatus 300 for vehicle voice control according to some embodiments of the present disclosure. With reference to the description of Fig. 1 and Fig. 2, the apparatus 300 shown in Fig. 3 includes: an obtaining module 310, configured to obtain text generated by the vehicle recognizing voice input by a user; a division module 320, configured to divide the text into a plurality of text portions based on identity information of the user; a generation module 330, configured to generate an instruction set by determining one or more vehicle-executable instructions associated with each text portion; and an execution module 340, configured to cause the vehicle to execute at least a part of the instructions in the instruction set.
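The four-module structure of Fig. 3 can be pictured as a simple pipeline. The sketch below is an assumption about how the modules might be composed in software; the class `VoiceControlApparatus`, its field names and the toy callables passed to it are hypothetical and stand in for the real recognition, division and execution logic.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class VoiceControlApparatus:
    """Hypothetical composition of the modules 310 to 340."""
    obtain: Callable[[], str]                    # obtaining module 310
    divide: Callable[[str, str], List[str]]      # division module 320
    generate: Callable[[List[str]], List[str]]   # generation module 330
    execute: Callable[[List[str]], List[str]]    # execution module 340

    def run(self, user_id: str) -> List[str]:
        text = self.obtain()
        portions = self.divide(text, user_id)      # uses the user's identity
        instruction_set = self.generate(portions)
        return self.execute(instruction_set)       # instructions actually run


# Toy stand-ins for each module, for illustration only.
apparatus = VoiceControlApparatus(
    obtain=lambda: "open navigation and open music",
    divide=lambda text, user_id: [p.strip() for p in text.split(" and ")],
    generate=lambda portions: list(dict.fromkeys(portions)),
    execute=lambda instructions: instructions,
)
print(apparatus.run("user-120"))
```

Passing each module in as a callable keeps the stages independently replaceable, which matches the description of the modules as separately configured units.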
In an embodiment of the present disclosure, the obtaining module 310 includes: a text obtaining module, configured to obtain the text in response to the vehicle being unable to determine the one or more vehicle-executable instructions.
In an embodiment of the present disclosure, the division module 320 includes: a wake-up sentence identification module, configured to identify, in the text, a wake-up sentence associated with the user, the wake-up sentence not being one of the one or more vehicle-executable instructions but being associated with the one or more vehicle-executable instructions; and a wake-up sentence division module, configured to, in response to the wake-up sentence being identified, divide each wake-up sentence out of the text as a first text portion.
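Dividing wake-up sentences out of the text can be sketched with a regular-expression split. This is a minimal illustration that assumes wake-up sentences are stored per user as literal strings; the function `divide_by_wake_sentences` and the sample wake-up sentence "go home" are hypothetical choices for this sketch.

```python
import re
from typing import List, Set, Tuple


def divide_by_wake_sentences(text: str, wake_sentences: Set[str]) -> Tuple[List[str], List[str]]:
    """Split the text into first text portions (the user's wake-up
    sentences) and the remaining portions, both in order of appearance."""
    stripped = text.strip()
    if not wake_sentences:
        return [], ([stripped] if stripped else [])

    # Longest sentences first so that overlapping candidates match greedily.
    alternatives = sorted(wake_sentences, key=len, reverse=True)
    pattern = "(" + "|".join(re.escape(s) for s in alternatives) + ")"

    first_portions: List[str] = []
    remaining: List[str] = []
    # The capturing group makes re.split keep the matched wake-up sentences.
    for piece in re.split(pattern, text):
        piece = piece.strip()
        if not piece:
            continue
        if piece in wake_sentences:
            first_portions.append(piece)
        else:
            remaining.append(piece)
    return first_portions, remaining


first, rest = divide_by_wake_sentences("I will go home and open music", {"go home"})
print(first, rest)
```

The remaining portions would then be passed on to the later division steps, such as conjunction-based division and syntactic analysis.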
In an embodiment of the present disclosure, the division module 320 further includes: a conjunction identification module, configured to identify, in the text, a conjunction specific to the user; and a conjunction division module, configured to divide the text based on the conjunction.
In an embodiment of the present disclosure, the division module 320 further includes: a syntax analysis module, configured to apply syntactic analysis to at least one of the plurality of text portions obtained by the division, to obtain one or more second text portions.
In an embodiment of the present disclosure, the division module 320 further includes: a position obtaining module, configured to obtain the geographic location at which the user was located when inputting the voice; and a dialect analysis module, configured to apply, based on the geographic location, dialect analysis to a second text portion, among the one or more second text portions, that is not associated with the one or more vehicle-executable instructions.
In an embodiment of the present disclosure, the generation module 330 includes: an instruction set generation module, configured to generate the instruction set by removing duplicate instructions from the one or more vehicle-executable instructions.
In an embodiment of the present disclosure, the execution module 340 includes: an identifier obtaining module, configured to obtain an identifier of the vehicle; a determination module, configured to determine an available instruction set corresponding to the identifier; and an instruction execution module, configured to cause the vehicle to execute at least a part of the instructions in the instruction set that belong to the available instruction set.
In an embodiment of the present disclosure, the apparatus 300 further includes: a wake-up sentence generation module, configured to generate a wake-up sentence indicating the at least a part of the instructions.
Fig. 4 shows a schematic block diagram of an example device 400 that can be used to implement embodiments of the present disclosure. As shown, the device 400 includes a central processing unit (CPU) 401, which can perform various appropriate actions and processes according to computer program instructions stored in a read-only memory (ROM) 402 or computer program instructions loaded from a storage unit 408 into a random access memory (RAM) 403. The RAM 403 can also store various programs and data required for the operation of the device 400. The CPU 401, the ROM 402 and the RAM 403 are connected to one another via a bus 404. An input/output (I/O) interface 405 is also connected to the bus 404.
A number of components in the device 400 are connected to the I/O interface 405, including: an input unit 406, such as a keyboard or a mouse; an output unit 407, such as various types of displays and loudspeakers; a storage unit 408, such as a magnetic disk or an optical disc; and a communication unit 409, such as a network card, a modem or a wireless communication transceiver. The communication unit 409 allows the device 400 to exchange information/data with other devices via a computer network such as the Internet and/or various telecommunication networks.
The processing unit 401 performs the methods and processes described above, such as the process 200. For example, in some embodiments, the process 200 may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 408. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 400 via the ROM 402 and/or the communication unit 409. When the computer program is loaded into the RAM 403 and executed by the CPU 401, one or more steps of the process 200 described above may be performed. Alternatively, in other embodiments, the CPU 401 may be configured to perform the process 200 in any other suitable manner (for example, by means of firmware).
The functions described herein may be performed, at least in part, by one or more hardware logic components. For example, and without limitation, exemplary types of hardware logic components that may be used include: field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), and so on.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, a special-purpose computer or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on a machine, partly on a machine, as a stand-alone software package partly on a machine and partly on a remote machine, or entirely on a remote machine or server.
In the context of the present disclosure, a machine-readable medium may be a tangible medium that may contain or store a program for use by or in connection with an instruction execution system, apparatus or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared or semiconductor system, apparatus or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In addition, although the operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order, or that all illustrated operations be performed, to achieve desired results. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, although several specific implementation details are contained in the above discussion, these should not be construed as limitations on the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single implementation. Conversely, various features that are described in the context of a single implementation can also be implemented in multiple implementations separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it should be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are merely example forms of implementing the claims.

Claims (20)

1. A method for vehicle voice control, comprising:
obtaining text generated by a vehicle recognizing voice input by a user;
dividing the text into a plurality of text portions based on identity information of the user;
generating an instruction set by determining one or more vehicle-executable instructions associated with each text portion; and
causing the vehicle to execute at least a part of the instructions in the instruction set.
2. The method according to claim 1, wherein obtaining the text comprises:
obtaining the text in response to the vehicle being unable to determine the one or more vehicle-executable instructions.
3. The method according to claim 1, wherein dividing the text comprises:
identifying, in the text, a wake-up sentence associated with the user, the wake-up sentence not being one of the one or more vehicle-executable instructions but being associated with the one or more vehicle-executable instructions; and
in response to identifying the wake-up sentence, dividing each wake-up sentence out of the text as a first text portion.
4. The method according to claim 1, wherein dividing the text comprises:
identifying, in the text, a conjunction specific to the user; and
dividing the text based on the conjunction.
5. The method according to claim 1, wherein dividing the text comprises:
applying syntactic analysis to at least one of the plurality of text portions obtained by the division, to obtain one or more second text portions.
6. The method according to claim 5, wherein dividing the text comprises:
obtaining the geographic location at which the user was located when inputting the voice; and
applying, based on the geographic location, dialect analysis to a second text portion, among the one or more second text portions, that is not associated with the one or more vehicle-executable instructions.
7. The method according to claim 1, wherein generating the instruction set comprises:
generating the instruction set by removing duplicate instructions from the one or more vehicle-executable instructions.
8. The method according to claim 1, wherein causing the vehicle to execute at least a part of the instructions in the instruction set comprises:
obtaining an identifier of the vehicle;
determining an available instruction set corresponding to the identifier; and
causing the vehicle to execute at least a part of the instructions in the instruction set that belong to the available instruction set.
9. The method according to claim 1, further comprising:
generating a wake-up sentence indicating the at least a part of the instructions.
10. An apparatus for vehicle voice control, comprising:
an obtaining module, configured to obtain text generated by a vehicle recognizing voice input by a user;
a division module, configured to divide the text into a plurality of text portions based on identity information of the user;
a generation module, configured to generate an instruction set by determining one or more vehicle-executable instructions associated with each text portion; and
an execution module, configured to cause the vehicle to execute at least a part of the instructions in the instruction set.
11. The apparatus according to claim 10, wherein the obtaining module comprises:
a text obtaining module, configured to obtain the text in response to the vehicle being unable to determine the one or more vehicle-executable instructions.
12. The apparatus according to claim 10, wherein the division module comprises:
a wake-up sentence identification module, configured to identify, in the text, a wake-up sentence associated with the user, the wake-up sentence not being one of the one or more vehicle-executable instructions but being associated with the one or more vehicle-executable instructions; and
a wake-up sentence division module, configured to, in response to the wake-up sentence being identified, divide each wake-up sentence out of the text as a first text portion.
13. The apparatus according to claim 10, wherein the division module comprises:
a conjunction identification module, configured to identify, in the text, a conjunction specific to the user; and
a conjunction division module, configured to divide the text based on the conjunction.
14. The apparatus according to claim 10, wherein the division module comprises:
a syntax analysis module, configured to apply syntactic analysis to at least one of the plurality of text portions obtained by the division, to obtain one or more second text portions.
15. The apparatus according to claim 14, wherein the division module comprises:
a position obtaining module, configured to obtain the geographic location at which the user was located when inputting the voice; and
a dialect analysis module, configured to apply, based on the geographic location, dialect analysis to a second text portion, among the one or more second text portions, that is not associated with the one or more vehicle-executable instructions.
16. The apparatus according to claim 10, wherein the generation module comprises:
an instruction set generation module, configured to generate the instruction set by removing duplicate instructions from the one or more vehicle-executable instructions.
17. The apparatus according to claim 10, wherein the execution module comprises:
an identifier obtaining module, configured to obtain an identifier of the vehicle;
a determination module, configured to determine an available instruction set corresponding to the identifier; and
an instruction execution module, configured to cause the vehicle to execute at least a part of the instructions in the instruction set that belong to the available instruction set.
18. The apparatus according to claim 10, further comprising:
a wake-up sentence generation module, configured to generate a wake-up sentence indicating the at least a part of the instructions.
19. An electronic device, comprising:
one or more processors; and
a storage apparatus for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1 to 9.
20. A computer-readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the method according to any one of claims 1 to 9.
CN201811150983.1A 2018-09-29 2018-09-29 Method, apparatus, device and medium for vehicle voice control Active CN109003611B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811150983.1A CN109003611B (en) 2018-09-29 2018-09-29 Method, apparatus, device and medium for vehicle voice control

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811150983.1A CN109003611B (en) 2018-09-29 2018-09-29 Method, apparatus, device and medium for vehicle voice control

Publications (2)

Publication Number Publication Date
CN109003611A (en) 2018-12-14
CN109003611B (en) 2022-05-27

Family

ID=64589614

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811150983.1A Active CN109003611B (en) 2018-09-29 2018-09-29 Method, apparatus, device and medium for vehicle voice control

Country Status (1)

Country Link
CN (1) CN109003611B (en)

Also Published As

Publication number Publication date
CN109003611B (en) 2022-05-27

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20211018

Address after: 100176 101, floor 1, building 1, yard 7, Ruihe West 2nd Road, Beijing Economic and Technological Development Zone, Daxing District, Beijing

Applicant after: Apollo Zhilian (Beijing) Technology Co.,Ltd.

Address before: 100080 No.10, Shangdi 10th Street, Haidian District, Beijing

Applicant before: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) Co.,Ltd.

GR01 Patent grant