WO2023019475A1 - Virtual personal assistant displaying method and apparatus, device, medium, and product - Google Patents

Virtual personal assistant displaying method and apparatus, device, medium, and product

Publication number
WO2023019475A1
Authority
WO
WIPO (PCT)
Prior art keywords
vpa
vehicle
information
display
interaction instruction
Prior art date
Application number
PCT/CN2021/113299
Other languages
French (fr)
Chinese (zh)
Inventor
陈真
Original Assignee
阿波罗智联(北京)科技有限公司
Application filed by 阿波罗智联(北京)科技有限公司 filed Critical 阿波罗智联(北京)科技有限公司
Priority to PCT/CN2021/113299 priority Critical patent/WO2023019475A1/en
Publication of WO2023019475A1 publication Critical patent/WO2023019475A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/451 Execution arrangements for user interfaces
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue

Definitions

  • the present disclosure relates to the field of data processing technology, especially to the field of Internet of Vehicles technology, and in particular to a display method, apparatus, device, medium, and product for a virtual personal assistant.
  • VPA Virtual Personal Assistant
  • the present disclosure provides a display method, device, equipment, medium and product of a virtual personal assistant.
  • a method for displaying a virtual personal assistant including:
  • the target interaction instruction is a voice interaction instruction issued by a passenger in the vehicle;
  • the VPA display information including the VPA image
  • the display information of the VPA is output through the display screen in the vehicle.
  • the output of the VPA display information through the display screen in the vehicle includes:
  • the designated seat area is the seat area where the passenger who issued the target interaction instruction is located.
  • the manner of determining the target interaction instruction includes:
  • the priority of each voice interaction instruction is determined based on the seat area where the passenger who issues the voice interaction instruction is located.
  • the acquisition of the display information of the virtual personal assistant VPA that matches the target attribute information includes:
  • the VPA display information with the highest display frequency is selected.
  • the acquisition of the display information of the virtual personal assistant VPA that matches the target attribute information includes:
  • the VPA display information matching the target attribute information is obtained from the preset corresponding relationship between each identity attribute information and the VPA display information.
  • the method before identifying the identity attribute information of the passenger based on the sound features represented by the target interaction instruction, the method further includes:
  • the VPA display information also includes a first guide; after the VPA display information is output through the display screen in the vehicle, it also includes:
  • the multimedia information is the information displayed to passengers by the head unit of the vehicle through the display screen after the first guide is output through the display screen in the vehicle;
  • the second guide is output through a display screen in the vehicle.
  • identifying the identity attribute information of the passenger based on the sound features represented by the target interaction instruction includes:
  • a virtual personal assistant display system including: a server and a vehicle-mounted terminal of a vehicle;
  • the vehicle-mounted terminal is used to obtain a target interaction instruction, determine the sound characteristics represented by the target interaction instruction, and send the sound characteristics to the server, and the target interaction instruction is a voice interaction instruction issued by a passenger in the vehicle;
  • the server is configured to receive the sound features represented by the target interaction instruction, identify the identity attribute information of the passenger based on those sound features, use the identity attribute information of the passenger as the target attribute information, obtain the virtual personal assistant (VPA) display information matched with the target attribute information, the VPA display information including the VPA image, and send the VPA display information to the vehicle-mounted terminal;
  • the vehicle-mounted terminal is further configured to receive the VPA display information, and output the VPA display information through a display screen in the vehicle.
  • the vehicle-mounted terminal is specifically used for:
  • the designated seat area is the seat area where the passenger who issued the target interaction instruction is located.
  • the vehicle-mounted terminal is also used for:
  • the priority of each voice interaction instruction is determined based on the seat area where the passenger who issues the voice interaction instruction is located.
  • the vehicle-mounted terminal is also used for:
  • the VPA display information with the highest display frequency is selected.
  • the server is specifically used for:
  • the VPA display information matching the target attribute information is obtained from the preset corresponding relationship between each identity attribute information and the VPA display information.
  • the server is also used for:
  • the VPA display information also includes a first guide
  • the server is further configured to obtain display status information for multimedia information after the VPA display information is output through the display screen in the vehicle, where the multimedia information is the information displayed to passengers by the head unit of the vehicle through the display screen after the first guide is output through the display screen in the vehicle;
  • the server is further configured to determine a second guide that matches the user behavior data of the passenger and send the second guide to the vehicle-mounted terminal if the display state information indicates that the multimedia information display is completed;
  • the vehicle-mounted terminal is further configured to receive the second guide, and output the second guide through a display screen in the vehicle.
  • the server is specifically used for:
  • a virtual personal assistant display device including:
  • An acquisition module configured to acquire the sound features represented by the target interaction instruction; wherein, the target interaction instruction is a voice interaction instruction issued by a passenger in the vehicle;
  • An identification module configured to identify the passenger's identity attribute information based on the sound features represented by the target interaction instruction, and use the passenger's identity attribute information as the target attribute information;
  • the obtaining module is also used to obtain the virtual personal assistant VPA display information matching the target attribute information, and the VPA display information includes a VPA image;
  • the output module is used to output the display information of the VPA through the display screen in the vehicle.
  • the output module is specifically used for:
  • the designated seat area is the seat area where the passenger who issued the target interaction instruction is located.
  • the device further includes a determination module, the determination module is configured to:
  • the priority of each voice interaction instruction is determined based on the seat area where the passenger who issues the voice interaction instruction is located.
  • the acquiring module is specifically used for:
  • the VPA display information with the highest display frequency is selected.
  • the acquiring module is specifically used for:
  • the VPA display information matching the target attribute information is obtained from the preset corresponding relationship between each identity attribute information and the VPA display information.
  • the device may also include a search module and an execution module;
  • the search module is used to, before the identity attribute information of the passenger is identified based on the sound features represented by the target interaction instruction, search the correspondence between sound features and user behavior data for the user behavior data corresponding to the sound features represented by the target interaction instruction;
  • the execution module is used to determine the VPA display information matching the found user behavior data if found, and execute the step of outputting the VPA display information through the display screen in the vehicle;
  • the execution module is further configured to execute the step of identifying the identity attribute information of the passenger based on the voice features represented by the target interaction instruction if not found.
  • the VPA presentation information also includes a first guide; the device may also include: a determination module;
  • the acquisition module is also used to acquire display status information for multimedia information after the VPA display information is output through the display screen in the vehicle, where the multimedia information is the information displayed to passengers by the head unit of the vehicle through the display screen after the first guide is output through the display screen in the vehicle;
  • a determining module configured to determine a second guide that matches the user behavior data of the passenger if the display state information indicates that the display of the multimedia information is completed;
  • the output module is further configured to output the second guide language through the display screen in the vehicle.
  • the identification module is specifically used for:
  • an electronic device including:
  • the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor, so that the at least one processor can execute the method described in the first aspect above.
  • a non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause the computer to execute the method described in the first aspect above.
  • a computer program product including a computer program, which implements the method described in the first aspect when executed by a processor.
  • the embodiment of the present disclosure can obtain the sound features represented by the target interaction instruction, identify the passenger's identity attribute information based on the sound features, use the passenger's identity attribute information as the target attribute information, and output, through the display screen in the vehicle, the VPA display information matched with the target attribute information, where the VPA display information includes a VPA image.
  • It can be seen that the embodiment of the present disclosure determines the VPA image displayed for the passenger based on the identity attribute information of the passenger, so different VPA images can be displayed to different passengers.
  • FIG. 1 is a flowchart of a display method of a virtual personal assistant provided by an embodiment of the present disclosure
  • Fig. 2 is a flow chart of another display method of a virtual personal assistant provided by an embodiment of the present disclosure
  • Fig. 3 is a flow chart of a method for outputting a second guide provided by an embodiment of the present disclosure
  • Fig. 4 is a signaling diagram of a presentation process of a virtual personal assistant provided by an embodiment of the present disclosure
  • Fig. 5 is an exemplary schematic diagram of a presentation process of a virtual personal assistant provided by an embodiment of the present disclosure
  • Fig. 6 is a schematic structural diagram of a display system of a virtual personal assistant provided by an embodiment of the present disclosure
  • Fig. 7 is a schematic structural diagram of a virtual personal assistant display device provided by an embodiment of the present disclosure.
  • FIG. 8 is a block diagram of an electronic device used to implement the presentation method of a virtual personal assistant according to an embodiment of the present disclosure.
  • the VPA image displayed on the in-vehicle screen is fixed, so the display mode is monotonous and inflexible.
  • the embodiment of the present disclosure provides a method for displaying a virtual personal assistant, which is applied to an electronic device.
  • the electronic device may be a vehicle-mounted terminal in the vehicle, or a server corresponding to the vehicle-mounted terminal, which is reasonable.
  • the vehicle-mounted terminal may be a vehicle-mounted brain (also rendered as a vehicle super-brain or driving brain), an in-vehicle infotainment (IVI) system, a vehicle-mounted infotainment host (Display Head Unit, DHU), or any other type of vehicle-mounted terminal, which is not limited in this embodiment of the present disclosure.
  • IVI vehicle-mounted infotainment host
  • DHU Display Head Unit
  • the method includes the following steps:
  • the target interaction instruction is a voice interaction instruction issued by a passenger in the vehicle.
  • the sending out of the target interaction instruction means that the passenger and the vehicle-mounted terminal start to conduct voice interaction.
  • a predetermined wake-up word spoken by a passenger may be a target interaction instruction.
  • the embodiment of the present disclosure does not limit the specific content of the predetermined wake-up word.
  • the vehicle-mounted terminal can directly collect the target interaction instruction through the sound collection device in the vehicle when executing S101, and extract the sound features represented by the target interaction instruction.
  • the server may receive the voice feature represented by the target interaction instruction sent by the vehicle-mounted terminal when executing S101.
  • after the vehicle-mounted terminal collects the target interaction instruction, it can extract the sound features represented by the target interaction instruction and then send them to the server, so that the server obtains the sound features represented by the target interaction instruction.
  • voice features represented by the target interaction instruction may also be called voiceprint features, and the voice features include formants.
  • the present disclosure does not limit the manner of extracting voice features, and any implementation manner capable of extracting voice features from voice instructions can be applied to the present disclosure.
  • the collection, storage, use, processing, transmission, provision, and disclosure of passengers' voice interaction instructions involved are all in compliance with relevant laws and regulations, and do not violate public order and good customs.
  • the passenger's identity attribute information is used to represent the passenger's natural attributes.
  • the identity attribute information may include gender and/or age.
  • the virtual personal assistant image displayed to the passenger is determined by gender and/or age.
  • the electronic device may input the sound features represented by the target interaction instruction into a pre-trained neural network model, and then obtain identity attribute information output by the neural network model.
  • the neural network model is trained through a training set, and the training set includes a plurality of sample sound features and a training label corresponding to each sample sound feature.
  • the training label includes the identity attribute information of the person to whom the sample voice feature belongs.
  • the embodiment of the present disclosure utilizes a neural network model to identify identity attribute information of passengers through sound features, so that the judgment of identity attribute information of passengers is more accurate and robust.
  • the identity attribute information of passengers can also be identified through a specified voice recognition algorithm.
  • the collection, storage, use, processing, transmission, provision, and disclosure of passenger target attribute information and training sets are all in compliance with relevant laws and regulations, and do not violate public order and good customs.
  • VPA display information matching the target attribute information.
  • the VPA display information includes a VPA image.
  • the VPA image is used to represent the appearance of the VPA.
  • the VPA image may include an image composed of the VPA's facial features, figure, hairstyle, and clothing.
  • the VPA display information may also include a first guide and/or a text-to-speech (Text To Speech, TTS) voice.
  • TTS Text To Speech
  • the electronic device may display the first guide language while displaying the image of the VPA.
  • the first guide is used to guide passengers to interact with the vehicle terminal.
  • examples of the first guide include "Try to start the unmanned driving mode" and "Let's watch a cartoon together".
  • the TTS voice is the voice in which the first guide is played.
  • the TTS pronunciation includes any of an elderly voice, an adult male voice, an adult female voice, a child voice, and the like.
  • for example: when the target attribute information includes 10 years old and male, the corresponding VPA display information includes a cartoon image, a child voice, and a children's guide; when the target attribute information includes 20 years old and male, the corresponding VPA display information includes a male image, an adult male voice, and a male guide; when the target attribute information includes 20 years old and female, the corresponding VPA display information includes a female image, an adult female voice, and a female guide; when the target attribute information includes 60 years old and male, the corresponding VPA display information includes an elderly image, an elderly voice, and an elderly guide.
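The four example pairings above can be sketched as a simple lookup. This is only an illustration, not the disclosed implementation: the `VpaDisplayInfo` type, the function name, and the age boundaries `CHILD_MAX_AGE` and `ELDERLY_MIN_AGE` are assumptions, since the text gives only the example ages 10, 20, and 60.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VpaDisplayInfo:
    image: str      # appearance of the VPA
    tts_voice: str  # voice used to play the first guide
    guide: str      # first guide shown with the image


# Hypothetical age boundaries; the source only gives example ages (10, 20, 60).
CHILD_MAX_AGE = 12
ELDERLY_MIN_AGE = 60


def match_display_info(age: int, gender: str) -> VpaDisplayInfo:
    """Return VPA display information for the given identity attributes."""
    if age <= CHILD_MAX_AGE:
        return VpaDisplayInfo("cartoon image", "child voice", "children's guide")
    if age >= ELDERLY_MIN_AGE:
        return VpaDisplayInfo("elderly image", "elderly voice", "elderly guide")
    if gender == "female":
        return VpaDisplayInfo("female image", "adult female voice", "female guide")
    return VpaDisplayInfo("male image", "adult male voice", "male guide")
```

In this sketch age is checked before gender, which mirrors the examples: the 10-year-old and 60-year-old cases are described by age group only.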
  • the vehicle-mounted terminal when executing S104, can directly output the display information of the VPA through the display screen in the vehicle, that is, display the image of the VPA that matches the passenger.
  • the server can send the VPA display information to the vehicle-mounted terminal when executing S104, so that the vehicle-mounted terminal, after receiving the VPA display information, outputs it through the display screen in the vehicle, that is, displays the VPA image that matches the passenger.
  • if the vehicle has only one display screen, that is, the display screen at the driver's seat, then, in one implementation, no matter which seat the passenger who issued the target interaction instruction occupies, the VPA display information is output through the display screen at the driver's seat.
  • the embodiment of the present disclosure can obtain the sound features represented by the target interaction instruction, identify the passenger's identity attribute information based on the sound features, use the passenger's identity attribute information as the target attribute information, and output, through the display screen in the vehicle, the VPA display information matched with the target attribute information, where the VPA display information includes a VPA image.
  • It can be seen that the embodiment of the present disclosure determines the VPA image displayed for the passenger based on the identity attribute information of the passenger, so different VPA images can be displayed to different passengers.
  • because the embodiments of the present disclosure can display different VPA images to different passengers, they can display the VPA image more flexibly while meeting the personalized needs of users.
  • the method for determining the target interaction instruction in S101 includes: if multiple voice interaction instructions are collected in the vehicle at the same time, the instruction with the highest priority is selected as the target interaction instruction based on the priority of each voice interaction instruction, where the priority of each voice interaction instruction is determined based on the seat area where the passenger who issues it is located.
  • when multiple voice interaction instructions are collected in the vehicle at the same time, the electronic device can select the instruction with the highest priority as the target interaction instruction based on the priority of each voice interaction instruction, and then obtain the sound features of the target interaction instruction.
  • the vehicle-mounted terminal selects the instruction with the highest priority as the target interaction instruction based on the priority of each voice interaction instruction, extracts the sound features of the target interaction instruction, and sends them to the server, so that the sound features acquired by the server are those of the voice interaction instruction with the highest priority.
  • multiple sound collection devices such as in-vehicle microphones, may be installed in the vehicle where the vehicle-mounted terminal is located. Different sound collection devices can collect the voices of passengers in different seating areas.
  • the priorities of the seat areas are in order from high to low: the main driving area, the co-pilot area, and the rear row area.
  • for example, the vehicle-mounted terminal simultaneously collects interactive command 1 issued by the passenger in the main driving area and interactive command 2 issued by the passenger in the co-pilot area. Since the priority of interactive command 1 is higher than that of interactive command 2, interactive command 1 is taken as the target interaction instruction.
  • the source area of the voice interaction instruction that is, the identification manner of the seat area where the passenger who issued the voice interaction instruction is located.
  • multiple sound collection devices are installed at different positions in the vehicle, and each seat in the vehicle is set as an independent sound zone. By judging the source direction of the sound signals received by the sound collection devices at multiple locations, the seat from which the sound signal was sent is determined, which gives the source area of the voice interaction instruction, that is, the seat area where the passenger who issued the voice interaction instruction is located.
  • the embodiment of the present disclosure can select the command with the highest priority according to the priority of the voice interaction commands, so as to prioritize the interaction needs of the driver and improve the safety of vehicle driving.
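The seat-based priority rule described above (main driving area first, then co-pilot area, then rear row) can be sketched as follows. The area names, the `VoiceInstruction` type, and the function name are hypothetical labels chosen for illustration.

```python
from dataclasses import dataclass
from typing import Sequence

# Lower number = higher priority, following the order given in the text:
# main driving area, co-pilot area, rear row area. Area names are hypothetical.
SEAT_PRIORITY = {"driver": 0, "front_passenger": 1, "rear": 2}


@dataclass(frozen=True)
class VoiceInstruction:
    text: str
    seat_area: str  # source area of the instruction


def select_target_instruction(
    instructions: Sequence[VoiceInstruction],
) -> VoiceInstruction:
    """Pick the simultaneously collected instruction with the highest priority."""
    if not instructions:
        raise ValueError("no voice interaction instruction was collected")
    return min(instructions, key=lambda ins: SEAT_PRIORITY[ins.seat_area])
```

With interactive command 1 from the driver and interactive command 2 from the front passenger, the sketch returns command 1, matching the example in the text.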
  • the vehicle where the vehicle-mounted terminal is located may include multiple display screens.
  • outputting the VPA display information through the display screen in the vehicle in S104 above can be implemented as: outputting the VPA display information through the display screen of the designated seat area in the vehicle.
  • the designated seat area is the seat area where the passenger who issued the target interaction instruction is located.
  • the electronic device can determine the designated seat area before outputting the VPA display information, and then output the VPA display information on the display screen of the designated seat area in the vehicle.
  • while obtaining the sound features represented by the target interaction instruction, the electronic device can also obtain from the vehicle-mounted terminal the area identification of the source area of the target interaction instruction, that is, the area identification of the designated seat area. After the electronic device obtains the VPA display information, it can send the VPA display information together with the previously obtained area identification to the vehicle-mounted terminal, so that the vehicle-mounted terminal outputs the VPA display information on the display screen of the designated seat area in the vehicle.
  • the electronic device can also only feed back the VPA display information to the vehicle-mounted terminal, and the vehicle-mounted terminal outputs the VPA display information on the display screen in the designated seat area in the vehicle by itself, which is also reasonable.
  • the embodiments of the present disclosure can be applied to multi-screen interaction scenarios, so that for passengers in different seating areas, different VPA display information can be determined according to the identity attribute information of the passengers, so as to improve the use interest and interaction fun of passengers.
  • the embodiments of the present disclosure can be applied not only to single-screen interaction scenarios, but also to multi-screen interaction scenarios, so that the application range of the embodiments of the present disclosure is wider.
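The single-screen and multi-screen cases described above can be sketched as one routing function. This is an illustration under assumptions: displays are modeled as a mapping from a hypothetical seat-area name to a screen identifier, and the behavior for an area without a screen (a `KeyError`) is a choice of this sketch, not something the text specifies.

```python
from typing import Mapping


def choose_display(displays: Mapping[str, str], source_area: str) -> str:
    """Return the screen that should output the VPA display information.

    displays maps a seat-area name to a screen identifier (both hypothetical).
    """
    if not displays:
        raise ValueError("the vehicle has no display screen")
    if len(displays) == 1:
        # Single-screen vehicle: always use the only screen,
        # regardless of which seat issued the instruction.
        return next(iter(displays.values()))
    # Multi-screen vehicle: use the screen of the designated seat area.
    return displays[source_area]
```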
  • before the above S102, the method further includes: S105, searching the correspondence between sound features and user behavior data for the user behavior data corresponding to the sound features represented by the target interaction instruction. If found, determine the VPA display information matching the found user behavior data, and execute S104; if not found, execute S102.
  • the user behavior data is used to represent the user's historical operation behavior.
  • user behavior data includes: VPA display information set by the user history, videos watched by the user, news watched by the user and/or music played by the user, etc.
  • the electronic device may use the VPA display information set by the user history as the VPA display information matching the user behavior data.
  • the electronic device determines the preference type of the passenger according to the user behavior data, and then determines the VPA display information corresponding to the preference type.
  • the passenger's user behavior data includes: cartoons and nursery rhymes
  • the passenger's favorite type is animation
  • the VPA image and the first guide language corresponding to the animation are determined.
  • each VPA image corresponds to a preference type
  • each first guide language corresponds to a preference type.
  • the collection, storage, use, processing, transmission, provision, and disclosure of the passenger's user behavior data involved are all in compliance with relevant laws and regulations, and do not violate public order and good customs.
  • the embodiment of the present disclosure can determine the VPA display information according to the user behavior data. Since user behavior data reflects the user's interests better than identity attribute information does, determining the VPA display information from user behavior data, when such data has been collected in advance, better matches the user's interests.
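The S105 branch described above (use stored user behavior data when it exists, otherwise fall back to identity recognition) can be sketched as below. All parameter names are hypothetical, and treating a sound feature as an exact dictionary key is a simplifying assumption; a real system would need similarity matching between voiceprints.

```python
from typing import Callable, Hashable, Mapping, Sequence


def resolve_display_info(
    sound_feature: Hashable,
    behavior_by_feature: Mapping[Hashable, Sequence[str]],
    info_from_behavior: Callable[[Sequence[str]], str],
    identify_attributes: Callable[[Hashable], dict],
    info_from_attributes: Callable[[dict], str],
) -> str:
    """Return VPA display information for one target interaction instruction."""
    # S105: look up user behavior data for this sound feature.
    behavior = behavior_by_feature.get(sound_feature)
    if behavior is not None:
        # Found: match display information to the behavior data (then S104).
        return info_from_behavior(behavior)
    # Not found: S102 (identify attributes), then S103 (match by attributes).
    return info_from_attributes(identify_attributes(sound_feature))
```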
  • the above step of S103 acquiring VPA presentation information matching the target attribute information may include the following three implementations:
  • Method 1 Obtain the VPA display information matching the target attribute information from the preset corresponding relationship between each identity attribute information and the VPA display information.
  • the server may pre-collect user behavior data of passengers with different identity attribute information, and then, for each passenger, determine the VPA display information that matches the passenger's user behavior data as the VPA display information corresponding to that passenger's identity attribute information.
  • the pre-collected identity attribute information of passenger 1 includes age 10, and user behavior data includes cartoon A and cartoon B.
  • Passenger 2's identity attribute information includes age 5, and user behavior data includes cartoon C and cartoon B.
  • Passenger 3’s identity attribute information includes age 5, and user behavior data includes nursery rhyme A and cartoon A.
  • cartoon A, cartoon B, cartoon C, and nursery rhyme A are all of the animation type. Since passenger 1, passenger 2, and passenger 3 all belong to the children's age group, the children's age group is associated with the VPA image of the animation type.
  • if the age of passenger 4 is later identified as 7 years old, then, since 7 years old belongs to the children's age group, the VPA image of the animation type is determined.
  • the embodiments of the present disclosure can collect user behavior data of various passengers in advance, so as to obtain the preferences of passengers of different ages and genders, and thus can determine the VPA display information that the passenger may like based on the identity attribute information of the passenger.
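The worked example for Method 1 (passengers 1 to 3 and the later passenger 4) can be sketched as building a correspondence from age group to preferred content type. The `CONTENT_TYPE` table, the `age_group` boundary of 12, and the function names are assumptions made for illustration; the text does not say how age groups are delimited or how content is typed.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical content-type table covering the items named in the example.
CONTENT_TYPE = {
    "cartoon A": "animation",
    "cartoon B": "animation",
    "cartoon C": "animation",
    "nursery rhyme A": "animation",
}


def age_group(age: int) -> str:
    """Map an age to an age group; the boundary of 12 is an assumption."""
    return "child" if age <= 12 else "adult"


def build_correspondence(
    passengers: Iterable[Tuple[int, List[str]]],
) -> Dict[str, str]:
    """Map each age group to the content type its passengers consumed most."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for age, items in passengers:
        for item in items:
            counts[age_group(age)][CONTENT_TYPE[item]] += 1
    return {group: c.most_common(1)[0][0] for group, c in counts.items()}
```

The resulting mapping can then be queried with the age group of a newly identified passenger, such as the 7-year-old passenger 4 in the example.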
  • Method 2: Find the historical VPA display information corresponding to the sound feature represented by the target interaction instruction, and then select the VPA display information with the highest display frequency from the historical VPA display information.
  • The historical VPA display information is the VPA display information that has been displayed to the passenger who issued the target interaction instruction.
  • The vehicle-mounted terminal can record the sound feature of each passenger who has used the voice interaction function, together with the corresponding historical VPA display information. Therefore, when the sound feature represented by the target interaction instruction is obtained, the historical VPA display information corresponding to that sound feature is looked up, and the VPA display information with the highest display frequency is selected from it.
  • The VPA display information with the highest display frequency is the most likely to be liked by the passenger, so selecting it can better meet user preferences.
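A minimal sketch of Method 2, assuming the vehicle-mounted terminal keeps a per-passenger log keyed by some identifier derived from the sound feature. The `history` structure, the key `"voiceprint-001"` and the VPA labels are hypothetical.

```python
from collections import Counter

# Hypothetical on-device log: sound-feature ID -> VPA display info shown so far.
history = {
    "voiceprint-001": ["cartoon_vpa", "young_female_vpa", "cartoon_vpa"],
}

def most_frequent_vpa(voiceprint_id, log):
    """Return the VPA display info shown most often to this passenger, or None."""
    shown = log.get(voiceprint_id)
    if not shown:
        return None
    return Counter(shown).most_common(1)[0][0]
```

Returning `None` for an unknown passenger leaves it to the caller to fall back to another method.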
  • Method 3: When the vehicle-mounted terminal is connected to the server, the vehicle-mounted terminal sends the sound feature to the server, and the server determines the VPA display information using Method 1.
  • When the vehicle-mounted terminal is disconnected from the server, that is, when the vehicle-mounted terminal is offline, the vehicle-mounted terminal determines the VPA display information using Method 2.
  • The vehicle-mounted terminal may also send the sound feature to the server to obtain the VPA display information, and execute Method 2 when the VPA display information is not obtained from the server.
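The server-first behavior of Method 3 can be sketched as a simple fallback. The two callables stand in for the server request (Method 1) and the local history lookup (Method 2); their names and the use of `ConnectionError`/`TimeoutError` to signal an offline or timed-out request are assumptions made for illustration.

```python
from typing import Callable, Optional

def resolve_vpa(sound_feature: str,
                query_server: Callable[[str], Optional[str]],
                query_local_history: Callable[[str], Optional[str]]) -> Optional[str]:
    """Try the server first; fall back to local history if nothing is obtained."""
    try:
        result = query_server(sound_feature)
    except (ConnectionError, TimeoutError):
        # Offline or timed out: treat as "not obtained from the server".
        result = None
    if result is not None:
        return result
    return query_local_history(sound_feature)
```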
  • The electronic device may further display a second guide language to passengers, including the following steps:
  • The multimedia information is the information that the vehicle's head unit displays to passengers through the display screen after the first guide language has been output through the display screen in the vehicle.
  • The multimedia information includes text, pictures, audio and/or video, etc.
  • The multimedia information refers to the information displayed by the vehicle-mounted terminal during the interaction between passengers and the vehicle-mounted terminal.
  • The display status information includes: not displayed, display completed, partially displayed, etc.
  • For a series of cartoons, the display status information includes the number of the episode currently being played. If the current episode is the last episode, the display of the series is complete; if it is not the last episode, the series is partially displayed.
  • For an album, the display status information includes the serial number of the song currently being played. If the serial number is the last one, the display of the album's songs is complete; otherwise, the album's songs are partially displayed.
  • Since the vehicle-machine system has displayed multimedia information to the passenger before S301, the passenger has corresponding user behavior data, and the second guide language that matches the passenger's user behavior data can be determined.
  • For example, if the type of user preference is entertainment, a second guide language of the entertainment type is determined.
  • The vehicle-mounted terminal can directly output the second guide language through a display screen in the vehicle.
  • The server can also send the second guide language to the vehicle-mounted terminal, so that the vehicle-mounted terminal outputs the second guide language through the display screen in the vehicle.
  • The embodiment of the present disclosure can obtain the passenger's user behavior data after the passenger has interacted with the vehicle-mounted terminal for a period of time, and recommend a second guide language that the passenger is more likely to like based on that data, so as to guide the passenger to interact further with the vehicle-mounted terminal and thereby increase the passenger's interest in interacting.
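The display status check and the choice of a second guide language can be sketched as below. The 1-based index convention, the status strings, and the guide text in `SECOND_GUIDES` are illustrative assumptions; the disclosure only states that reaching the last episode or song means the display is complete.

```python
from typing import Optional

def display_status(current_index: int, total: int) -> str:
    """current_index is 1-based; 0 means nothing has been played yet."""
    if current_index <= 0:
        return "not displayed"
    if current_index >= total:
        return "completed"
    return "partially displayed"

# Hypothetical guide texts keyed by the passenger's preference type.
SECOND_GUIDES = {"entertainment": "Would you like to watch another cartoon?"}

def second_guide(status: str, preference_type: str) -> Optional[str]:
    """Pick a second guide language only once the multimedia display is complete."""
    if status != "completed":
        return None
    return SECOND_GUIDES.get(preference_type)
```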
  • The vehicle-mounted terminal obtains a target interaction instruction issued by a passenger in a seat area, and obtains the sound feature represented by the target interaction instruction.
  • The vehicle-mounted terminal sends the sound feature to the server.
  • The server receives the sound feature, identifies the passenger's identity attribute information based on the sound feature represented by the target interaction instruction, and takes the passenger's identity attribute information as the target attribute information.
  • The server obtains the VPA display information matching the target attribute information.
  • The VPA display information includes the VPA image, the first guide language and the TTS pronunciation.
  • The server sends the VPA display information to the vehicle-mounted terminal.
  • The vehicle-mounted terminal receives the VPA display information, and outputs it through the display screen of that seat area in the vehicle.
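The message flow between the vehicle-mounted terminal and the server can be sketched with two small classes. Everything here is hypothetical scaffolding: the class and method names, the `identify` callable standing in for the recognition step, and the dictionary used as the screen state are assumptions, not the disclosure's implementation.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class VpaDisplayInfo:
    image: str
    first_guide: str
    tts_voice: str

class Server:
    def __init__(self, identify, catalog):
        self.identify = identify  # sound feature -> identity attribute (stub)
        self.catalog = catalog    # identity attribute -> VpaDisplayInfo

    def handle(self, sound_feature):
        """Identify the passenger's attributes and return matching VPA display info."""
        target_attribute = self.identify(sound_feature)
        return self.catalog[target_attribute]

class VehicleTerminal:
    def __init__(self, server):
        self.server = server
        self.screens = {}  # seat area -> VPA display info currently shown there

    def on_instruction(self, seat_area, sound_feature):
        """Send the sound feature to the server and show the result on that seat's screen."""
        info = self.server.handle(sound_feature)
        self.screens[seat_area] = info
        return info
```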
  • The vehicle-mounted terminal can collect the target interaction instruction issued by the passenger during the voice interaction process, and then obtain the sound feature of the target interaction instruction.
  • The server uses the sound feature to recognize the age and gender of the passenger, and the VPA generation switching system determines the VPA display information matching the passenger's age and gender.
  • The VPA display information includes the VPA image, the first guide language and the TTS pronunciation; each item of VPA display information corresponds to a gender (male or female) and an age group (such as elderly, middle-aged, young or infant). Based on the age and gender of the passenger, it is possible to determine the VPA display information that users of this age and gender may like.
  • When the passenger is elderly, the VPA is displayed as an elderly image on the vehicle display screen, and the first guide language for the elderly is played using the elderly TTS pronunciation.
  • When the passenger is middle-aged and female, the VPA is displayed as a middle-aged female image on the vehicle display screen, and the first guide language for middle-aged females is played using the middle-aged female TTS pronunciation.
  • When the passenger is middle-aged and male, the VPA is displayed as a middle-aged male image on the vehicle display screen, and the first guide language for middle-aged males is played using the middle-aged male TTS pronunciation.
  • When the passenger is young and female, the VPA is displayed as a young female image on the vehicle display screen, and the first guide language for young females is played using the young female TTS pronunciation.
  • When the passenger is young and male, the VPA is displayed as a young male image on the vehicle display screen, and the first guide language for young males is played using the young male TTS pronunciation.
  • When the passenger is an infant, the VPA is displayed as an infant image on the vehicle display screen, and the first guide language for infants is played using the infant TTS pronunciation.
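The six cases above amount to a lookup table keyed by age group and gender. A minimal sketch, assuming that the elderly and infant cases are not split by gender (the listed cases do not mention gender for them); the table contents and names such as `VPA_TABLE` and `lookup_vpa` are hypothetical.

```python
from typing import NamedTuple, Optional

class VpaDisplayInfo(NamedTuple):
    image: str
    first_guide: str
    tts_voice: str

def _entry(label: str) -> VpaDisplayInfo:
    """Build placeholder display info for one audience label."""
    return VpaDisplayInfo(f"{label} image", f"{label} first guide language", f"{label} TTS")

# Key: (age group, gender); gender None means the entry applies to any gender.
VPA_TABLE = {
    ("elderly", None): _entry("elderly"),
    ("middle-aged", "female"): _entry("middle-aged female"),
    ("middle-aged", "male"): _entry("middle-aged male"),
    ("young", "female"): _entry("young female"),
    ("young", "male"): _entry("young male"),
    ("infant", None): _entry("infant"),
}

def lookup_vpa(age_group: str, gender: Optional[str]) -> Optional[VpaDisplayInfo]:
    """Prefer a gender-specific entry, then fall back to a gender-neutral one."""
    return VPA_TABLE.get((age_group, gender)) or VPA_TABLE.get((age_group, None))
```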
  • The embodiment of the present disclosure also provides a display system of a virtual personal assistant, as shown in FIG. 6, including: a server 601 and a vehicle-mounted terminal 602 of the vehicle;
  • the vehicle-mounted terminal 602 is used to obtain the target interaction instruction, determine the sound feature represented by the target interaction instruction, and send the sound feature to the server 601, where the target interaction instruction is a voice interaction instruction issued by a passenger in the vehicle;
  • the server 601 is configured to receive the sound feature represented by the target interaction instruction, identify the passenger's identity attribute information based on that sound feature, use the passenger's identity attribute information as the target attribute information, obtain the virtual personal assistant (VPA) display information matching the target attribute information, where the VPA display information includes the VPA image, and send the VPA display information to the vehicle-mounted terminal 602;
  • the vehicle-mounted terminal 602 is also used to receive the VPA display information and output it through the display screen in the vehicle.
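  • The vehicle-mounted terminal 602 is specifically used for:
  • outputting the VPA display information through the display screen of a designated seat area in the vehicle;
  • the designated seat area is the seat area where the passenger who issued the target interaction instruction is located.
  • The vehicle-mounted terminal 602 is also used for:
  • if multiple voice interaction instructions are collected in the vehicle at the same time, selecting the instruction with the highest priority as the target interaction instruction based on the priority of each voice interaction instruction;
  • the priority of each voice interaction instruction is determined based on the seat area where the passenger who issues the voice interaction instruction is located.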
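Selecting the target interaction instruction by seat-area priority can be sketched as follows. The disclosure does not say which seat areas rank higher, so the ordering in `SEAT_PRIORITY` is purely an assumption, as are the seat-area names.

```python
# Hypothetical ranking (assumption): a lower number means a higher priority.
SEAT_PRIORITY = {"driver": 0, "front_passenger": 1, "rear_left": 2, "rear_right": 3}

def pick_target_instruction(instructions):
    """instructions: list of (seat_area, instruction_text) collected at the same time.

    Returns the entry from the highest-priority seat area, or None if the list is empty.
    Unknown seat areas rank after all known ones.
    """
    if not instructions:
        return None
    lowest = len(SEAT_PRIORITY)
    return min(instructions, key=lambda item: SEAT_PRIORITY.get(item[0], lowest))
```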
  • The vehicle-mounted terminal 602 is also used for:
  • if it fails to obtain the VPA display information from the server, searching for the historical VPA display information corresponding to the sound feature represented by the target interaction instruction, where the historical VPA display information is the VPA display information that has been displayed to the passenger who issued the target interaction instruction;
  • selecting the VPA display information with the highest display frequency from the historical VPA display information.
  • If the vehicle-mounted terminal 602 does not receive the VPA display information sent by the server within the timeout period after sending the sound feature to the server 601, the acquisition of the VPA display information from the server is considered to have failed.
  • the server 601 is specifically used for:
  • the VPA display information matching the target attribute information is obtained from the preset corresponding relationship between each identity attribute information and the VPA display information.
  • the server 601 is also used for:
  • The VPA display information also includes the first guide language.
  • The server 601 is also used to obtain display status information for multimedia information after the VPA display information has been output through the display screen in the vehicle, where the multimedia information is the information that the vehicle's head unit displays to passengers through the display screen after the first guide language has been output through the display screen in the vehicle;
  • the server 601 is also used to determine, if the display status information indicates that the display of the multimedia information is complete, the second guide language that matches the passenger's user behavior data, and to send the second guide language to the vehicle-mounted terminal;
  • the vehicle-mounted terminal 602 is also used to receive the second guide language and output it through the display screen in the vehicle.
  • the server 601 is specifically used for:
  • the embodiment of the present disclosure provides a virtual personal assistant display device, as shown in FIG. 7 , including: an acquisition module 701, an identification module 702 and an output module 703.
  • the acquisition module 701 is configured to acquire the sound features represented by the target interaction instruction; wherein, the target interaction instruction is a voice interaction instruction issued by a passenger in the vehicle;
  • the identification module 702 is used to identify the passenger's identity attribute information based on the sound characteristics represented by the target interaction instruction, and use the passenger's identity attribute information as the target attribute information;
  • the obtaining module 701 is also used to obtain the virtual personal assistant VPA display information matching the target attribute information, and the VPA display information includes the VPA image;
  • the output module 703 is configured to output the VPA display information through the display screen in the vehicle.
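  • The output module 703 is specifically used for:
  • outputting the VPA display information through the display screen of a designated seat area in the vehicle;
  • the designated seat area is the seat area where the passenger who issued the target interaction instruction is located.
  • The device further includes a determination module, which is used for:
  • if multiple voice interaction instructions are collected in the vehicle at the same time, selecting the instruction with the highest priority as the target interaction instruction based on the priority of each voice interaction instruction;
  • the priority of each voice interaction instruction is determined based on the seat area where the passenger who issues the voice interaction instruction is located.
  • The acquisition module 701 is specifically used for:
  • searching for the historical VPA display information corresponding to the sound feature represented by the target interaction instruction, where the historical VPA display information is the VPA display information that has been displayed to the passenger who issued the target interaction instruction;
  • selecting the VPA display information with the highest display frequency from the historical VPA display information.
  • The acquisition module 701 is specifically used for:
  • obtaining the VPA display information matching the target attribute information from the preset correspondence between each item of identity attribute information and the VPA display information.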
  • the device may also include a search module and an execution module;
  • the search module is used to search, before the passenger's identity attribute information is identified based on the sound feature represented by the target interaction instruction, for the user behavior data corresponding to that sound feature in the correspondence between sound features and user behavior data;
  • the execution module is used to determine, if the user behavior data is found, the VPA display information matching the found user behavior data, and to execute the step of outputting the VPA display information through the display screen in the vehicle;
  • the execution module is further configured to execute, if the user behavior data is not found, the step of identifying the passenger's identity attribute information based on the sound feature represented by the target interaction instruction.
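The search-then-identify branching of the search and execution modules can be sketched as a single function. All parameter names are hypothetical: `behavior_by_feature` stands for the stored correspondence between sound features and user behavior data, and the three callables stand for steps the disclosure describes elsewhere.

```python
def choose_vpa(sound_feature, behavior_by_feature,
               vpa_for_behavior, identify, vpa_for_attribute):
    """Use stored user behavior data if present; otherwise identify attributes first."""
    behavior = behavior_by_feature.get(sound_feature)
    if behavior is not None:
        # Found: match VPA display info directly against the behavior data.
        return vpa_for_behavior(behavior)
    # Not found: identify identity attribute information, then match on it.
    return vpa_for_attribute(identify(sound_feature))
```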
  • The VPA display information also includes a first guide language; the device may also include a determination module;
  • the acquisition module is also used to obtain the display status information for multimedia information after the VPA display information has been output through the display screen in the vehicle, where the multimedia information is the information that the vehicle's head unit displays to passengers through the display screen after the first guide language has been output through the display screen in the vehicle;
  • the determination module is configured to determine the second guide language that matches the passenger's user behavior data if the display status information indicates that the display of the multimedia information is complete;
  • the output module is also used to output the second guide language through the display screen in the vehicle.
  • the identification module is specifically used for:
  • the present disclosure also provides an electronic device, a readable storage medium, and a computer program product.
  • FIG. 8 shows a schematic block diagram of an example electronic device 800 that may be used to implement embodiments of the present disclosure.
  • Electronic device is intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other suitable computers.
  • Electronic devices may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smart phones, wearable devices, and other similar computing devices.
  • the components shown herein, their connections and relationships, and their functions, are by way of example only, and are not intended to limit implementations of the disclosure described and/or claimed herein.
  • The electronic device 800 includes a computing unit 801, which can perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 802 or a computer program loaded from a storage unit 808 into a random access memory (RAM) 803. The RAM 803 can also store various programs and data necessary for the operation of the electronic device 800.
  • the computing unit 801, ROM 802, and RAM 803 are connected to each other through a bus 804.
  • An input/output (I/O) interface 805 is also connected to the bus 804 .
  • Multiple components of the electronic device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard or a mouse; an output unit 807, such as various types of displays and speakers; a storage unit 808, such as a magnetic disk or an optical disk; and a communication unit 809, such as a network card, a modem, or a wireless communication transceiver.
  • the communication unit 809 allows the electronic device 800 to exchange information/data with other devices through a computer network such as the Internet and/or various telecommunication networks.
  • The computing unit 801 may be any of various general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, central processing units (CPUs), graphics processing units (GPUs), various dedicated artificial intelligence (AI) computing chips, various computing units that run machine learning model algorithms, digital signal processors (DSPs), and any suitable processor, controller, microcontroller, etc.
  • The computing unit 801 executes the various methods and processes described above, such as the display method of the virtual personal assistant. For example, in some embodiments, the display method of the virtual personal assistant can be implemented as a computer software program that is tangibly contained in a machine-readable medium, such as the storage unit 808.
  • Part or all of the computer program can be loaded and/or installed on the electronic device 800 via the ROM 802 and/or the communication unit 809.
  • When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the display method of the virtual personal assistant described above can be performed.
  • the computing unit 801 may be configured in any other appropriate way (for example, by means of firmware) to execute the virtual personal assistant presentation method.
  • Various implementations of the systems and techniques described above can be implemented in digital electronic circuit systems, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof.
  • The programmable processor can be a special-purpose or general-purpose programmable processor, and can receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
  • Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or another programmable data processing device, so that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented.
  • The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
  • a machine-readable medium may be a tangible medium that may contain or store a program for use by or in conjunction with an instruction execution system, apparatus, or device.
  • a machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
  • a machine-readable medium may include, but is not limited to, electronic, magnetic, optical, electromagnetic, infrared, or semiconductor systems, apparatus, or devices, or any suitable combination of the foregoing.
  • More specific examples of machine-readable storage media would include one or more wire-based electrical connections, portable computer discs, hard drives, random access memory (RAM), read only memory (ROM), erasable programmable read only memory (EPROM or flash memory), optical fiber, compact disk read only memory (CD-ROM), optical storage, magnetic storage, or any suitable combination of the foregoing.
  • The systems and techniques described herein can be implemented on a computer having a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and pointing device (e.g., a mouse or a trackball) through which the user can provide input to the computer.
  • Other kinds of devices can also be used to provide interaction with the user; for example, the feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user can be received in any form (including acoustic input, speech input, or tactile input).
  • The systems and techniques described herein can be implemented in a computing system that includes back-end components (e.g., a data server), or a computing system that includes middleware components (e.g., an application server), or a computing system that includes front-end components (e.g., a user computer having a graphical user interface or web browser through which a user can interact with embodiments of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, or front-end components.
  • The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Network (LAN), Wide Area Network (WAN) and the Internet.
  • a computer system may include clients and servers.
  • Clients and servers are generally remote from each other and typically interact through a communication network.
  • the relationship of client and server arises by computer programs running on the respective computers and having a client-server relationship to each other.
  • the server can be a cloud server, a server of a distributed system, or a server combined with a blockchain.
  • steps may be reordered, added or deleted using the various forms of flow shown above.
  • each step described in the present disclosure may be executed in parallel, sequentially, or in a different order, as long as the desired result of the technical solution disclosed in the present disclosure can be achieved, no limitation is imposed herein.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Acoustics & Sound (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The present disclosure relates to the technical field of data processing, and in particular to the technical field of Internet of Vehicles, and provides a virtual personal assistant (VPA) displaying method and apparatus, a device, a medium, and a product. A specific implementation solution is: obtaining a sound feature represented by a target interaction instruction, wherein the target interaction instruction is a voice interaction instruction issued by a passenger in a vehicle; then identifying identity attribute information of the passenger on the basis of the sound feature represented by the target interaction instruction, and using the identity attribute information of the passenger as target attribute information; then obtaining VPA displaying information matching the target attribute information, wherein the VPA displaying information comprises a VPA image; and then outputting the VPA displaying information by means of a display screen in the vehicle.

Description

Display method, apparatus, device, medium, and product of a virtual personal assistant
Technical Field
The present disclosure relates to the technical field of data processing, in particular to the technical field of the Internet of Vehicles, and specifically to a display method, apparatus, device, medium, and product of a virtual personal assistant.
Background
When the in-vehicle head unit interacts with passengers, a virtual personal assistant (VPA) is displayed on the head unit's screen, and the VPA guides passengers to interact, for example by guiding them to play videos or view news.
Summary
The present disclosure provides a display method, apparatus, device, medium, and product of a virtual personal assistant.
According to a first aspect of the present disclosure, a method for displaying a virtual personal assistant is provided, including:
acquiring the sound feature represented by a target interaction instruction, where the target interaction instruction is a voice interaction instruction issued by a passenger in a vehicle;
identifying the passenger's identity attribute information based on the sound feature represented by the target interaction instruction, and using the passenger's identity attribute information as target attribute information;
obtaining virtual personal assistant (VPA) display information matching the target attribute information, where the VPA display information includes a VPA image;
outputting the VPA display information through a display screen in the vehicle.
Optionally, outputting the VPA display information through the display screen in the vehicle includes:
outputting the VPA display information through the display screen of a designated seat area in the vehicle;
where the designated seat area is the seat area where the passenger who issued the target interaction instruction is located.
Optionally, the target interaction instruction is determined as follows:
if multiple voice interaction instructions are collected in the vehicle at the same time, the instruction with the highest priority is selected as the target interaction instruction based on the priority of each voice interaction instruction;
where the priority of each voice interaction instruction is determined based on the seat area where the passenger who issued that voice interaction instruction is located.
Optionally, obtaining the virtual personal assistant (VPA) display information matching the target attribute information includes:
searching for the historical VPA display information corresponding to the sound feature represented by the target interaction instruction, where the historical VPA display information is the VPA display information that has been displayed to the passenger who issued the target interaction instruction;
selecting the VPA display information with the highest display frequency from the historical VPA display information.
Optionally, obtaining the virtual personal assistant (VPA) display information matching the target attribute information includes:
obtaining the VPA display information matching the target attribute information from the preset correspondence between each item of identity attribute information and the VPA display information.
Optionally, before the passenger's identity attribute information is identified based on the sound feature represented by the target interaction instruction, the method further includes:
searching for the user behavior data corresponding to the sound feature represented by the target interaction instruction in the correspondence between sound features and user behavior data;
if it is found, determining the VPA display information that matches the found user behavior data, and executing the step of outputting the VPA display information through the display screen in the vehicle;
if it is not found, executing the step of identifying the passenger's identity attribute information based on the sound feature represented by the target interaction instruction.
Optionally, the VPA display information also includes a first guide language; after the VPA display information is output through the display screen in the vehicle, the method further includes:
acquiring display status information for multimedia information, where the multimedia information is the information that the vehicle's head unit displays to the passenger through the display screen after the first guide language has been output through the display screen in the vehicle;
if the display status information indicates that the display of the multimedia information is complete, determining a second guide language that matches the passenger's user behavior data;
outputting the second guide language through the display screen in the vehicle.
Optionally, identifying the passenger's identity attribute information based on the sound feature represented by the target interaction instruction includes:
inputting the sound feature represented by the target interaction instruction into a pre-trained neural network model, and obtaining the identity attribute information output by the neural network model.
According to a second aspect of the present disclosure, a display system of a virtual personal assistant is provided, including: a server and a vehicle-mounted terminal of a vehicle;
the vehicle-mounted terminal is used to obtain a target interaction instruction, determine the sound feature represented by the target interaction instruction, and send the sound feature to the server, where the target interaction instruction is a voice interaction instruction issued by a passenger in the vehicle;
the server is configured to receive the sound feature represented by the target interaction instruction, identify the passenger's identity attribute information based on that sound feature, use the passenger's identity attribute information as target attribute information, obtain virtual personal assistant (VPA) display information matching the target attribute information, where the VPA display information includes a VPA image, and send the VPA display information to the vehicle-mounted terminal;
the vehicle-mounted terminal is further configured to receive the VPA display information and output it through a display screen in the vehicle.
Optionally, the vehicle-mounted terminal is specifically configured to:
output the VPA display information through the display screen of a designated seat area in the vehicle,
where the designated seat area is the seat area in which the passenger who issued the target interaction instruction is located.
Optionally, the vehicle-mounted terminal is further configured to:
if multiple voice interaction instructions are collected in the vehicle at the same time, select the instruction with the highest priority as the target interaction instruction, based on the priority of each voice interaction instruction,
where the priority of each voice interaction instruction is determined by the seat area in which the passenger who issued that instruction is located.
Optionally, the vehicle-mounted terminal is further configured to:
if acquiring the VPA display information from the server fails, look up the historical VPA display information corresponding to the voice features represented by the target interaction instruction, where the historical VPA display information is VPA display information that has previously been shown to the passenger who issued the target interaction instruction; and
select, from the historical VPA display information, the VPA display information that has been displayed most frequently.
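The fallback described above can be sketched as follows. This is only an illustrative sketch: the function name `pick_fallback_vpa` and the representation of the history as a list of string identifiers (one entry per past display, already filtered to the passenger's voice features) are assumptions, not part of the disclosure.

```python
from collections import Counter
from typing import Optional, Sequence


def pick_fallback_vpa(history: Sequence[str]) -> Optional[str]:
    """Return the VPA display info shown most often to this passenger.

    `history` is a hypothetical list with one identifier per past display,
    already restricted to the voice features of the target instruction.
    """
    if not history:
        return None  # nothing has been shown to this passenger before
    # most_common(1) yields [(identifier, count)] for the most frequent entry.
    return Counter(history).most_common(1)[0][0]
```

For instance, a history of `["cartoon", "adult_male", "cartoon"]` would select `"cartoon"`, while an empty history yields no fallback.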
Optionally, the server is specifically configured to:
acquire the VPA display information that matches the target attribute information from a preset correspondence between items of identity attribute information and VPA display information.
Optionally, the server is further configured to:
before identifying the passenger's identity attribute information based on the voice features represented by the target interaction instruction, search a correspondence between voice features and user behavior data for the user behavior data corresponding to the voice features represented by the target interaction instruction;
if such data is found, determine the VPA display information that matches the found user behavior data, and perform the step of outputting the VPA display information through the display screen in the vehicle; and
if no such data is found, perform the step of identifying the passenger's identity attribute information based on the voice features represented by the target interaction instruction.
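The lookup-then-identify order described above can be sketched as follows. All names are hypothetical, and the voice features are reduced to a hashable key for brevity; a real system would more likely match voiceprints by similarity than by exact lookup.

```python
from typing import Callable, Hashable, Mapping, TypeVar

B = TypeVar("B")  # user behavior data
A = TypeVar("A")  # identity attribute information
V = TypeVar("V")  # VPA display information


def resolve_vpa(
    voice_key: Hashable,
    behavior_by_voice: Mapping[Hashable, B],
    vpa_for_behavior: Callable[[B], V],
    identify_attributes: Callable[[Hashable], A],
    vpa_for_attributes: Callable[[A], V],
) -> V:
    """Prefer known user behavior data; otherwise fall back to identity attributes."""
    behavior = behavior_by_voice.get(voice_key)
    if behavior is not None:
        # Found: choose the VPA display info that matches the behavior data.
        return vpa_for_behavior(behavior)
    # Not found: identify the identity attributes from the voice features instead.
    return vpa_for_attributes(identify_attributes(voice_key))
```

Passing the two matching strategies in as callables keeps the sketch neutral about how either match is actually computed.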
Optionally, the VPA display information further includes a first guide phrase.
The server is further configured to, after the VPA display information has been output through the display screen in the vehicle, acquire display status information for multimedia information, where the multimedia information is the information that the vehicle's head unit shows to the passenger through the display screen after the first guide phrase has been output through the display screen in the vehicle.
The server is further configured to, if the display status information indicates that the multimedia information has finished displaying, determine a second guide phrase that matches the passenger's user behavior data, and send the second guide phrase to the vehicle-mounted terminal.
The vehicle-mounted terminal is further configured to receive the second guide phrase and output it through the display screen in the vehicle.
Optionally, the server is specifically configured to:
input the voice features represented by the target interaction instruction into a pre-trained neural network model, and obtain the identity attribute information output by the neural network model.
According to a third aspect of the present disclosure, a display apparatus for a virtual personal assistant is provided, including:
an acquisition module, configured to acquire the voice features represented by a target interaction instruction, where the target interaction instruction is a voice interaction instruction issued by a passenger in a vehicle;
an identification module, configured to identify the passenger's identity attribute information based on the voice features represented by the target interaction instruction, and to take the passenger's identity attribute information as target attribute information;
the acquisition module being further configured to acquire virtual personal assistant (VPA) display information that matches the target attribute information, where the VPA display information includes a VPA image; and
an output module, configured to output the VPA display information through a display screen in the vehicle.
Optionally, the output module is specifically configured to:
output the VPA display information through the display screen of a designated seat area in the vehicle,
where the designated seat area is the seat area in which the passenger who issued the target interaction instruction is located.
Optionally, the apparatus further includes a determination module configured to:
if multiple voice interaction instructions are collected in the vehicle at the same time, select the instruction with the highest priority as the target interaction instruction, based on the priority of each voice interaction instruction,
where the priority of each voice interaction instruction is determined by the seat area in which the passenger who issued that instruction is located.
Optionally, the acquisition module is specifically configured to:
look up the historical VPA display information corresponding to the voice features represented by the target interaction instruction, where the historical VPA display information is VPA display information that has previously been shown to the passenger who issued the target interaction instruction; and
select, from the historical VPA display information, the VPA display information that has been displayed most frequently.
Optionally, the acquisition module is specifically configured to:
acquire the VPA display information that matches the target attribute information from a preset correspondence between items of identity attribute information and VPA display information.
Optionally, the apparatus may further include a search module and an execution module.
The search module is configured to, before the passenger's identity attribute information is identified based on the voice features represented by the target interaction instruction, search a correspondence between voice features and user behavior data for the user behavior data corresponding to the voice features represented by the target interaction instruction.
The execution module is configured to, if such data is found, determine the VPA display information that matches the found user behavior data, and perform the step of outputting the VPA display information through the display screen in the vehicle.
The execution module is further configured to, if no such data is found, perform the step of identifying the passenger's identity attribute information based on the voice features represented by the target interaction instruction.
Optionally, the VPA display information further includes a first guide phrase, and the apparatus may further include a determination module.
The acquisition module is further configured to, after the VPA display information has been output through the display screen in the vehicle, acquire display status information for multimedia information, where the multimedia information is the information that the vehicle's head unit shows to the passenger through the display screen after the first guide phrase has been output through the display screen in the vehicle.
The determination module is configured to, if the display status information indicates that the multimedia information has finished displaying, determine a second guide phrase that matches the passenger's user behavior data.
The output module is further configured to output the second guide phrase through the display screen in the vehicle.
Optionally, the identification module is specifically configured to:
input the voice features represented by the target interaction instruction into a pre-trained neural network model, and obtain the identity attribute information output by the neural network model.
According to a fourth aspect of the present disclosure, an electronic device is provided, including:
at least one processor; and
a memory communicatively connected to the at least one processor, where
the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the method of the first aspect.
According to a fifth aspect of the present disclosure, a non-transitory computer-readable storage medium storing computer instructions is provided, where the computer instructions cause a computer to perform the method of the first aspect.
According to a sixth aspect of the present disclosure, a computer program product is provided, including a computer program that, when executed by a processor, implements the method of the first aspect.
The embodiments of the present disclosure acquire the voice features represented by the target interaction instruction, identify the passenger's identity attribute information based on those voice features, take the passenger's identity attribute information as the target attribute information, and output, through the display screen in the vehicle, the VPA display information that matches the target attribute information, where the VPA display information includes a VPA image. Because the VPA image shown to a passenger is determined by that passenger's identity attribute information, different VPA images can be shown to different passengers.
It should be understood that the content described in this section is not intended to identify key or important features of the embodiments of the present disclosure, nor to limit the scope of the present disclosure. Other features of the present disclosure will become easy to understand from the following description.
Description of the Drawings
The drawings are provided for a better understanding of the solution and do not limit the present disclosure. In the drawings:
Fig. 1 is a flowchart of a display method for a virtual personal assistant provided by an embodiment of the present disclosure;
Fig. 2 is a flowchart of another display method for a virtual personal assistant provided by an embodiment of the present disclosure;
Fig. 3 is a flowchart of a method for outputting a second guide phrase provided by an embodiment of the present disclosure;
Fig. 4 is a signaling diagram of a display process of a virtual personal assistant provided by an embodiment of the present disclosure;
Fig. 5 is an exemplary schematic diagram of a display process of a virtual personal assistant provided by an embodiment of the present disclosure;
Fig. 6 is a schematic structural diagram of a display system for a virtual personal assistant provided by an embodiment of the present disclosure;
Fig. 7 is a schematic structural diagram of a display apparatus for a virtual personal assistant provided by an embodiment of the present disclosure;
Fig. 8 is a block diagram of an electronic device used to implement the display method for a virtual personal assistant according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Those of ordinary skill in the art will therefore recognize that various changes and modifications can be made to the embodiments described here without departing from the scope and spirit of the present disclosure. Likewise, descriptions of well-known functions and structures are omitted below for clarity and conciseness.
Today's vehicles support voice interaction. When the vehicle-mounted terminal interacts with a passenger, it shows a virtual personal assistant (VPA) on the head unit screen and uses the VPA to guide the passenger through the interaction, for example to play a video or read the news.
However, the VPA image shown on the head unit screen is fixed, so the presentation is uniform and inflexible.
To make the presentation of the VPA image more flexible, the embodiments of the present disclosure provide a display method for a virtual personal assistant, applied to an electronic device. In a specific application, the electronic device may be the vehicle-mounted terminal in the vehicle or a server corresponding to the vehicle-mounted terminal; both are reasonable.
In addition, by way of example, the vehicle-mounted terminal may be an in-vehicle brain, an automotive super brain, or a driving brain; it may also be a head unit, an Infotainment Head Unit (IHU), an In-Vehicle Infotainment (IVI) system, a Display Head Unit (DHU), or any other type of vehicle-mounted terminal, which is not limited in the embodiments of the present disclosure.
As shown in Fig. 1, the method includes the following steps:
S101: acquire the voice features represented by a target interaction instruction.
The target interaction instruction is a voice interaction instruction issued by a passenger in the vehicle.
Issuing the target interaction instruction indicates that the passenger is starting a voice interaction with the vehicle-mounted terminal. For example, when the passenger speaks a predetermined wake-up word, that wake-up word may serve as the target interaction instruction. The embodiments of the present disclosure do not limit the specific content of the predetermined wake-up word.
It can be understood that when the electronic device is the vehicle-mounted terminal, the terminal may, in S101, collect the target interaction instruction directly through a sound collection device in the vehicle and extract the voice features represented by the target interaction instruction.
When the electronic device is the server, the server may, in S101, receive the voice features represented by the target interaction instruction from the vehicle-mounted terminal. In this case, after collecting the target interaction instruction, the vehicle-mounted terminal extracts the voice features it represents and sends them to the server, so that the server obtains the voice features represented by the target interaction instruction.
In addition, the voice features represented by the target interaction instruction may also be called voiceprint features, and they include formants. The present disclosure does not limit how the voice features are extracted; any implementation capable of extracting voice features from a voice instruction can be applied to the present disclosure.
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, and disclosure of passengers' voice interaction instructions all comply with relevant laws and regulations and do not violate public order and good morals.
S102: identify the passenger's identity attribute information based on the voice features represented by the target interaction instruction, and take the passenger's identity attribute information as target attribute information.
Optionally, the passenger's identity attribute information represents the passenger's natural attributes. Because people of different genders and ages usually prefer different forms of presentation, while people of the same gender or age usually have similar preferences, the identity attribute information may, by way of example, include gender and/or age. The VPA image to be shown to the passenger is then determined from the gender and/or age.
In one implementation, the electronic device may input the voice features represented by the target interaction instruction into a pre-trained neural network model and then obtain the identity attribute information output by the neural network model.
The neural network model is trained on a training set that includes multiple sample voice features and a training label for each sample voice feature. Each training label includes the identity attribute information of the person to whom the sample voice feature belongs.
By using a neural network model to identify the passenger's identity attribute information from voice features, the embodiments of the present disclosure make the determination of that information more accurate and more robust.
It should be emphasized that the above implementation, which identifies the passenger's identity attribute information with a neural network model, is only an example and should not be construed as limiting the present disclosure. For example, the passenger's identity attribute information may instead be identified by a designated voice recognition algorithm.
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, and disclosure of passengers' target attribute information and of the training set all comply with relevant laws and regulations and do not violate public order and good morals.
S103: acquire VPA display information that matches the target attribute information. The VPA display information includes a VPA image.
The VPA image represents the appearance of the VPA. For example, the VPA image may be composed of the VPA's facial features, figure, hairstyle, clothing, and the like.
Optionally, the VPA display information may further include a first guide phrase and/or a text-to-speech (TTS) voice.
The electronic device may display the first guide phrase while displaying the VPA image. The first guide phrase guides the passenger to interact with the vehicle-mounted terminal. Examples of first guide phrases are "Try starting the driverless mode" and "Let's watch a cartoon together".
The TTS voice is the voice in which the first guide phrase is played. For example, the TTS voice may be any one of an elderly voice, an adult male voice, an adult female voice, a child's voice, and so on.
By way of example: when the target attribute information is 10 years old and male, the corresponding VPA display information includes a cartoon image, a child's voice, and a child-oriented guide phrase; when it is 20 years old and male, the corresponding VPA display information includes a male image, an adult male voice, and a male-oriented guide phrase; when it is 20 years old and female, the corresponding VPA display information includes a female image, an adult female voice, and a female-oriented guide phrase; when it is 60 years old and male, the corresponding VPA display information includes an elderly image, an elderly voice, and an elderly-oriented guide phrase.
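One way to picture the matching in S103 is a preset lookup table keyed by age band and gender. The sketch below encodes only the four examples given above; the age thresholds (under 18 as "child", 60 and over as "senior"), the type name `VpaDisplayInfo`, and the function names are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class VpaDisplayInfo:
    image: str        # VPA image
    tts_voice: str    # voice used to play the guide phrase
    guide_style: str  # style of the first guide phrase


def age_band(age: int) -> str:
    # Hypothetical thresholds; the disclosure does not define age bands.
    if age < 18:
        return "child"
    if age < 60:
        return "adult"
    return "senior"


# Preset correspondence, limited to the four examples in the description.
PRESET: Dict[Tuple[str, str], VpaDisplayInfo] = {
    ("child", "male"): VpaDisplayInfo("cartoon image", "child voice", "child guide"),
    ("adult", "male"): VpaDisplayInfo("male image", "adult male voice", "male guide"),
    ("adult", "female"): VpaDisplayInfo("female image", "adult female voice", "female guide"),
    ("senior", "male"): VpaDisplayInfo("elderly image", "elderly voice", "elderly guide"),
}


def match_vpa(age: int, gender: str) -> Optional[VpaDisplayInfo]:
    """Return the preset VPA display info, or None if no entry is defined."""
    return PRESET.get((age_band(age), gender))
```

Combinations that the description does not mention, such as a 10-year-old female passenger, deliberately return `None` here rather than guessing at a mapping.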
S104: output the VPA display information through a display screen in the vehicle.
When the electronic device is the vehicle-mounted terminal, the terminal may, in S104, output the VPA display information directly through the display screen in the vehicle, that is, show the VPA image that matches the passenger.
When the electronic device is the server, the server may, in S104, send the VPA display information to the vehicle-mounted terminal, so that the terminal, after receiving it, outputs the VPA display information through the display screen in the vehicle, that is, shows the VPA image that matches the passenger.
It should be noted that if the vehicle has only one display screen, namely the one at the driver's seat, then in one implementation the VPA display information is output through that display screen regardless of which seat the passenger who issued the target interaction instruction occupies.
The embodiments of the present disclosure acquire the voice features represented by the target interaction instruction, identify the passenger's identity attribute information based on those voice features, take the passenger's identity attribute information as the target attribute information, and output, through the display screen in the vehicle, the VPA display information that matches the target attribute information, where the VPA display information includes a VPA image. Because the VPA image shown to a passenger is determined by that passenger's identity attribute information, different VPA images can be shown to different passengers.
Moreover, because different VPA images can be shown to different passengers, the embodiments of the present disclosure present the VPA image more flexibly while also meeting users' individual needs.
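Steps S101 to S104 can be read as a simple pipeline. The following sketch only shows the order of the steps; each stage is passed in as a callable, and all names (`show_vpa`, `extract_features`, and so on) are hypothetical placeholders rather than interfaces defined by the disclosure.

```python
from typing import Callable, Sequence, TypeVar

A = TypeVar("A")  # identity attribute information
D = TypeVar("D")  # VPA display information


def show_vpa(
    instruction_audio: bytes,
    extract_features: Callable[[bytes], Sequence[float]],
    identify_attributes: Callable[[Sequence[float]], A],
    match_display_info: Callable[[A], D],
    output_to_screen: Callable[[D], None],
) -> D:
    """Run S101 to S104 in order and return what was displayed."""
    features = extract_features(instruction_audio)        # S101: voice features
    target_attributes = identify_attributes(features)     # S102: identity attributes
    display_info = match_display_info(target_attributes)  # S103: matching VPA info
    output_to_screen(display_info)                        # S104: show on the screen
    return display_info
```

When the electronic device is the server, `extract_features` would correspond to receiving the features from the vehicle-mounted terminal and `output_to_screen` to sending the display information back to it.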
Optionally, in another embodiment of the present disclosure, regardless of whether the electronic device is the vehicle-mounted terminal or the server, the target interaction instruction in S101 is determined as follows: if multiple voice interaction instructions are collected in the vehicle at the same time, the instruction with the highest priority is selected as the target interaction instruction, based on the priority of each voice interaction instruction. The priority of each voice interaction instruction is determined by the seat area in which the passenger who issued that instruction is located.
That is, if the electronic device is the vehicle-mounted terminal, then when multiple voice interaction instructions are collected in the vehicle at the same time, the electronic device selects the instruction with the highest priority as the target interaction instruction and then acquires its voice features. If the electronic device is the server, then when multiple voice interaction instructions are collected in the vehicle at the same time, the vehicle-mounted terminal selects the instruction with the highest priority as the target interaction instruction, acquires its voice features, and sends them to the server, so that the voice features the server obtains are those of the highest-priority voice interaction instruction.
Optionally, multiple sound collection devices, such as in-vehicle microphones, may be installed in the vehicle in which the vehicle-mounted terminal is located. Different sound collection devices collect the speech of passengers in different seat areas.
For example, suppose the seat areas are prioritized, from high to low, as the driver area, the front passenger area, and the rear area. If the vehicle-mounted terminal simultaneously collects interaction instruction 1 from a passenger in the driver area and interaction instruction 2 from a passenger in the front passenger area, interaction instruction 1 has the higher priority and is therefore taken as the target interaction instruction.
It can be understood that if only one interaction instruction is collected at a given moment, that instruction is taken as the target interaction instruction.
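The selection rule above can be sketched in a few lines. The zone names, the `VoiceInstruction` type, and the numeric ranks are illustrative assumptions; the ordering itself (driver area, then front passenger area, then rear area) follows the example in the description.

```python
from dataclasses import dataclass
from typing import Optional, Sequence

# Smaller rank means higher priority, following the example ordering above.
ZONE_PRIORITY = {"driver": 0, "front_passenger": 1, "rear": 2}


@dataclass(frozen=True)
class VoiceInstruction:
    zone: str      # seat area of the passenger who issued the instruction
    audio_id: str  # hypothetical handle to the captured audio


def select_target_instruction(
    instructions: Sequence[VoiceInstruction],
) -> Optional[VoiceInstruction]:
    """Pick the instruction from the highest-priority seat area."""
    if not instructions:
        return None
    # With a single instruction, min() simply returns that instruction.
    return min(instructions, key=lambda ins: ZONE_PRIORITY[ins.zone])
```

With interaction instruction 1 from the driver area and interaction instruction 2 from the front passenger area, the function returns instruction 1, matching the example.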
There are several possible ways to identify the source area of a voice interaction instruction, that is, the seat area in which the passenger who issued the instruction is located.
Besides distinguishing the source area by the position of the sound collection device, as described above, in another implementation multiple sound collection devices are installed at different positions in the vehicle and each seat is configured as an independent sound zone. By determining the direction from which the sound signals received by the devices at the various positions originate, the seat from which the sound signal was emitted is determined, and thus the source area of the voice interaction instruction, that is, the seat area of the passenger who issued it, is known.
When several passengers in the vehicle want to interact by voice at the same time, the embodiments of the present disclosure select the instruction with the highest priority according to the priorities of the voice interaction instructions, so that the driver's interaction needs are served first and driving safety is improved.
Optionally, in another embodiment of the present disclosure, the vehicle in which the vehicle-mounted terminal is located may include multiple display screens.
In that case, outputting the VPA display information through a display screen in the vehicle in S104 may be implemented as: outputting the VPA display information through the display screen of a designated seat area in the vehicle, where the designated seat area is the seat area of the passenger who issued the target interaction instruction.
For how to identify the seat area of the passenger who issued the target interaction instruction, see the corresponding content in the embodiments above; it is not repeated here. It can be understood that if the electronic device is the vehicle-mounted terminal, it can determine the designated seat area before outputting the VPA display information and then output the VPA display information on the display screen of that seat area. If the electronic device is a server, then while obtaining the voice feature represented by the target interaction instruction it can also obtain from the vehicle-mounted terminal the area identifier of the instruction's source area, that is, the area identifier of the designated seat area. After obtaining the VPA display information, the server can send it to the vehicle-mounted terminal together with the previously obtained area identifier, so that the vehicle-mounted terminal outputs the VPA display information on the display screen of the designated seat area. Alternatively, the server may return only the VPA display information, and the vehicle-mounted terminal itself outputs it on the display screen of the designated seat area; this is also reasonable.
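A minimal sketch of the screen routing described above, covering both server variants (the area identifier echoed back by the server, or determined locally by the terminal). The display identifiers, the `DISPLAYS` table, and the function name are hypothetical.

```python
# Hypothetical mapping from a seat-area identifier to a display identifier.
DISPLAYS = {
    "driver": "screen_0",
    "front_passenger": "screen_1",
    "rear": "screen_2",
}


def route_vpa_output(vpa_info, server_area_id=None, local_area_id=None):
    """Choose the display for the VPA display information.

    server_area_id: area identifier returned by the server, if it sent one.
    local_area_id:  area identifier the terminal determined by itself.
    """
    area_id = server_area_id if server_area_id is not None else local_area_id
    if area_id not in DISPLAYS:
        raise ValueError(f"unknown seat area: {area_id!r}")
    return DISPLAYS[area_id], vpa_info


screen, info = route_vpa_output({"image": "avatar"}, server_area_id="rear")
```

Preferring the server-supplied identifier over the local one is a design choice of this sketch; the source treats both variants as equally acceptable.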
This embodiment can be applied to multi-screen interaction scenarios: for passengers in different seat areas, different VPA display information is determined according to each passenger's identity attribute information, which increases passengers' interest in using and interacting with the system.
Because the embodiments of the present disclosure apply to both single-screen and multi-screen interaction scenarios, their range of application is broader.
Optionally, in another embodiment of the present disclosure, as shown in FIG. 2, the method further includes, before S102: S105, searching a correspondence between voice features and user behavior data for the user behavior data corresponding to the voice feature represented by the target interaction instruction. If such data is found, the VPA display information matching it is determined and S104 is executed; if it is not found, S102 is executed.
In this embodiment, user behavior data represents the user's historical operations. For example, it includes VPA display information the user has previously set, videos the user has watched, news the user has viewed, and/or music the user has played.
In one implementation, if the user behavior data contains VPA display information previously set by the user, the electronic device may use that VPA display information as the VPA display information matching the user behavior data.
If the user behavior data does not contain VPA display information previously set by the user, the electronic device determines the passenger's preference type from the user behavior data and then determines the VPA display information corresponding to that preference type.
For example, if the passenger's user behavior data includes cartoons and nursery rhymes, the passenger's preference type is determined to be animation, and the VPA image and first guide phrase corresponding to animation are then determined. Each VPA image corresponds to one preference type, and each first guide phrase corresponds to one preference type.
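The S105 branch and the preference-type rule can be sketched as below. Several points are assumptions made only for illustration: a string `voice_id` stands in for the voice feature (a real system would match feature vectors), the catalogues `CONTENT_TYPE` and `VPA_BY_TYPE` are invented, the preference type is taken to be the most frequent content type, and the sketch also falls back to S102 when no VPA can be derived from the data that was found.

```python
from collections import Counter

# Hypothetical catalogue: content item -> preference type.
CONTENT_TYPE = {
    "cartoon": "animation",
    "nursery rhyme": "animation",
    "news": "current_affairs",
}

# Hypothetical catalogue: preference type -> VPA display information.
VPA_BY_TYPE = {
    "animation": {"image": "cartoon_avatar", "first_guide": "Want to watch a cartoon?"},
    "current_affairs": {"image": "anchor_avatar", "first_guide": "Shall I read the headlines?"},
}


def vpa_from_behavior(behavior):
    """behavior: dict with an optional 'vpa_setting' and a 'history' list."""
    if behavior.get("vpa_setting"):
        return behavior["vpa_setting"]          # a user-set VPA wins
    types = Counter(
        CONTENT_TYPE[item] for item in behavior.get("history", []) if item in CONTENT_TYPE
    )
    if not types:
        return None
    preferred = types.most_common(1)[0][0]      # assumed rule: most frequent type
    return VPA_BY_TYPE.get(preferred)


def resolve_vpa(voice_id, behavior_store, identify_by_attributes):
    """S105: use behavior data when present, otherwise continue with S102."""
    behavior = behavior_store.get(voice_id)
    if behavior is not None:
        vpa = vpa_from_behavior(behavior)
        if vpa is not None:
            return vpa
    return identify_by_attributes(voice_id)     # stands in for S102 and S103


store = {
    "v1": {"history": ["cartoon", "nursery rhyme"]},
    "v2": {"vpa_setting": {"image": "custom"}},
}
```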
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, and disclosure of passengers' user behavior data all comply with relevant laws and regulations and do not violate public order and good morals.
This embodiment can determine the VPA display information from user behavior data. Because user behavior data reflects the user's interests better than identity attribute information does, when user behavior data has been collected in advance, the VPA display information determined from it matches the user's interests more closely.
Optionally, in another embodiment of the present disclosure, the step in S103 of obtaining the VPA display information matching the target attribute information may be implemented in any of the following three ways:
Method 1: obtain the VPA display information matching the target attribute information from a preset correspondence between identity attribute information and VPA display information.
In one implementation, the server may collect in advance the user behavior data of passengers with different identity attribute information and then, for each passenger, determine the VPA display information matching that passenger's user behavior data and use it as the VPA display information corresponding to the passenger's identity attribute information.
The way to determine the VPA display information matching user behavior data is described above and is not repeated here.
For example, the data collected in advance shows that passenger 1 is 10 years old with user behavior data including cartoon A and cartoon B; passenger 2 is 5 years old with user behavior data including cartoon C and cartoon B; and passenger 3 is 5 years old with user behavior data including nursery rhyme A and cartoon A. Suppose cartoon A, cartoon B, cartoon C, and nursery rhyme A are all of the animation type. Because passengers 1, 2, and 3 all fall within the child age group, the child age group is mapped to a VPA image of the animation type. When passenger 4 is later found to be 7 years old, an animation-type VPA image is selected, because 7 also falls within the child age group.
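The worked example above can be reproduced with a short sketch. The age boundary in `age_group` and the rule "map each age group to its most frequent content type" are assumptions; the source only states that ages 5, 7, and 10 belong to the child age group.

```python
from collections import Counter, defaultdict


def age_group(age):
    # Hypothetical boundary, chosen only so that 5, 7 and 10 count as "child".
    return "child" if age <= 12 else "adult"


def build_mapping(records, content_type):
    """records: list of (age, [content items]).

    Returns a dict: age group -> most frequent content type in that group.
    """
    counts = defaultdict(Counter)
    for age, items in records:
        for item in items:
            counts[age_group(age)][content_type[item]] += 1
    return {group: c.most_common(1)[0][0] for group, c in counts.items()}


content_type = {
    "cartoon A": "animation",
    "cartoon B": "animation",
    "cartoon C": "animation",
    "nursery rhyme A": "animation",
}
records = [
    (10, ["cartoon A", "cartoon B"]),        # passenger 1
    (5, ["cartoon C", "cartoon B"]),         # passenger 2
    (5, ["nursery rhyme A", "cartoon A"]),   # passenger 3
]
mapping = build_mapping(records, content_type)
vpa_type_for_passenger_4 = mapping[age_group(7)]
```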
This embodiment can collect user behavior data from many kinds of passengers in advance and thereby learn the preferences of passengers of different ages and genders, so the VPA display information a passenger is likely to prefer can be determined from that passenger's identity attribute information.
Method 2: look up the historical VPA display information corresponding to the voice feature represented by the target interaction instruction, then select from it the VPA display information that has been displayed most frequently.
Here, the historical VPA display information is the VPA display information previously displayed to the passenger who issued the target interaction instruction.
In one implementation, the vehicle-mounted terminal may record the voice feature of every passenger who has used the voice interaction function, together with the corresponding historical VPA display information. When the voice feature represented by the target interaction instruction is obtained, the terminal looks up the historical VPA display information for that voice feature and selects the entry with the highest display frequency.
Among the historical VPA display information, the entry displayed most often is the one the passenger is most likely to prefer, so selecting it better matches the user's preferences.
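Method 2 reduces to a frequency count over the passenger's display history. A minimal sketch, assuming the history is stored as a list of hypothetical VPA identifiers in the order they were shown:

```python
from collections import Counter


def most_shown_vpa(history):
    """history: VPA identifiers previously shown to this passenger."""
    if not history:
        return None
    return Counter(history).most_common(1)[0][0]


choice = most_shown_vpa(["cartoon_avatar", "anchor_avatar", "cartoon_avatar"])
```

The source does not say how ties are broken; here `Counter.most_common` keeps the entry that was encountered first among equally frequent ones.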
Method 3: when the vehicle-mounted terminal has a communication connection with the server, the terminal sends the voice feature to the server, and the server determines the VPA display information using Method 1.
When the vehicle-mounted terminal's connection to the server is lost, that is, when the terminal is offline, the terminal determines the VPA display information using Method 2.
Specifically, the vehicle-mounted terminal may send the voice feature to the server to obtain the VPA display information and fall back to Method 2 if no VPA display information is obtained from the server.
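The online and offline split of Method 3 can be sketched as a try-then-fall-back wrapper. The callables `query_server` and `offline_lookup` are hypothetical stand-ins for Method 1 on the server and Method 2 on the terminal; treating `ConnectionError` and `TimeoutError` as "offline" is an assumption of this sketch.

```python
def get_vpa(voice_id, query_server, offline_lookup):
    """Try the server first (Method 1); fall back to local history (Method 2)."""
    try:
        vpa = query_server(voice_id)
    except (ConnectionError, TimeoutError):
        vpa = None                      # offline, or no reply arrived
    return vpa if vpa is not None else offline_lookup(voice_id)


def unreachable_server(voice_id):
    # Simulates a terminal that has lost its connection to the server.
    raise ConnectionError("terminal is offline")


online = get_vpa("v1", lambda v: "server_vpa", lambda v: "local_vpa")
offline = get_vpa("v1", unreachable_server, lambda v: "local_vpa")
```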
In another embodiment of the present disclosure, as shown in FIG. 3, after S104 the electronic device may further present a second guide phrase to the passenger, through the following steps:
S301: obtain display status information for multimedia information.
The multimedia information is the information that the vehicle's head unit presents to the passenger through the display screen after the first guide phrase has been output through the display screen in the vehicle.
Optionally, the multimedia information includes text, pictures, audio, and/or video.
The multimedia information is information that the passenger, while interacting with the vehicle-mounted terminal, instructs the terminal to present.
The display status information includes: not displayed, display completed, and partially displayed. For example, when the multimedia information is a cartoon series, the display status information includes the episode currently being played. If that episode is the last one, the series has been displayed completely; otherwise, the series has been partially displayed.
As another example, when the multimedia information is the songs of an album, the display status includes the track number of the song currently being played. If that track is the last one, the album has been displayed completely; otherwise, the album has been partially displayed.
S302: if the display status information indicates that display of the multimedia information is complete, determine a second guide phrase matching the passenger's user behavior data.
Because the head-unit system has already presented multimedia information to the passenger before S301, user behavior data exists for that passenger, and a second guide phrase matching it can be determined.
The second guide phrase can be determined in the same way as the VPA display information matching user behavior data, described above; this is not repeated here.
For example, if the user behavior data indicates that the user's preferred type is entertainment, a second guide phrase of the entertainment type is determined.
S303: output the second guide phrase through the display screen in the vehicle.
In one implementation, when the electronic device is the vehicle-mounted terminal, the terminal can output the second guide phrase directly through the display screen in the vehicle.
When the electronic device is a server, the server can send the second guide phrase to the head-unit system so that the vehicle-mounted terminal outputs it through the display screen in the vehicle.
After the passenger has interacted with the vehicle-mounted terminal for some time, this embodiment can obtain the passenger's user behavior data and, based on it, recommend a second guide phrase the passenger is more likely to enjoy. This guides the user toward further interaction with the terminal and increases the passenger's interest in interacting.
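Steps S301 to S303 can be sketched as a status check followed by a lookup. The status labels, the 1-based position convention, and the `SECOND_GUIDE` table are assumptions introduced for the example.

```python
def display_status(current_index, total):
    """Map a 1-based position in a series or album to a display status (S301)."""
    if current_index <= 0:
        return "not_displayed"
    return "completed" if current_index >= total else "partial"


# Hypothetical second guide phrases keyed by preference type.
SECOND_GUIDE = {"entertainment": "Shall I play another show?"}


def second_guide(current_index, total, preference_type):
    """S302: only offer a second guide phrase once display is complete."""
    if display_status(current_index, total) != "completed":
        return None
    return SECOND_GUIDE.get(preference_type)


phrase = second_guide(10, 10, "entertainment")   # last episode reached
```

The returned phrase would then be passed to whatever routine drives the display screen (S303), which is outside the scope of this sketch.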
Referring to FIG. 4, the overall flow of the virtual personal assistant display method provided by the embodiments of the present disclosure is described below in a specific application scenario:
S401: the vehicle-mounted terminal obtains a target interaction instruction issued by a passenger in one seat area and obtains the voice feature represented by the target interaction instruction.
S402: the vehicle-mounted terminal sends the voice feature to the server.
S403: the server receives the voice feature, identifies the passenger's identity attribute information based on the voice feature represented by the target interaction instruction, and takes the passenger's identity attribute information as the target attribute information.
S404: the server obtains the VPA display information matching the target attribute information. The VPA display information includes a VPA image, a first guide phrase, and a TTS voice.
S405: the server sends the VPA display information to the vehicle-mounted terminal.
S406: the vehicle-mounted terminal receives the VPA display information and outputs it through the display screen of that seat area in the vehicle.
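The S401 to S406 exchange can be sketched with two small classes. Everything here is a stand-in: a direct method call replaces the network link, `classify` replaces the identity recognition of S403, and `extract_feature` replaces real voice feature extraction.

```python
class Server:
    """Stands in for the server side of S403 to S405."""

    def __init__(self, classify, vpa_table):
        self.classify = classify      # voice feature -> identity attributes (S403)
        self.vpa_table = vpa_table    # identity attributes -> VPA display info (S404)

    def handle(self, voice_feature):
        attrs = self.classify(voice_feature)
        return self.vpa_table[attrs]  # S405: returned to the terminal


class Terminal:
    """Stands in for the vehicle-mounted terminal side of S401, S402 and S406."""

    def __init__(self, server, extract_feature):
        self.server = server
        self.extract_feature = extract_feature
        self.screens = {}             # seat area -> what that screen currently shows

    def on_instruction(self, audio, seat_area):
        feature = self.extract_feature(audio)   # S401
        vpa = self.server.handle(feature)       # S402 (direct call replaces the network)
        self.screens[seat_area] = vpa           # S406: output on that seat area's screen
        return vpa


table = {
    ("young", "female"): {
        "image": "young_female",
        "first_guide": "Hi there!",
        "tts_voice": "young_female",
    }
}
server = Server(classify=lambda feature: ("young", "female"), vpa_table=table)
terminal = Terminal(server, extract_feature=lambda audio: audio.lower())
shown = terminal.on_instruction("PLAY MUSIC", "rear")
```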
Referring to FIG. 5, in the virtual personal assistant display method provided by the embodiments of the present disclosure, the vehicle-mounted terminal captures the target interaction instruction issued by a passenger during voice interaction and then obtains the voice feature of that instruction. The server uses the voice feature to identify the passenger's age and gender, and the VPA generation and switching system determines the VPA display information matching that age and gender. The VPA display information includes a VPA image, a first guide phrase, and a TTS voice; each item of VPA display information corresponds to one gender (male or female) and one age group (for example, elderly, middle-aged, young, or infant). From a passenger's age and gender, the VPA display information that users of that age and gender are likely to prefer can be determined.
When the passenger's age falls in the elderly group, the VPA is shown on the in-vehicle display screen as an elderly figure, and the first guide phrase for the elderly is played in an elderly TTS voice.
When the passenger is middle-aged and female, the VPA is shown on the in-vehicle display screen as a middle-aged female figure, and the first guide phrase for middle-aged women is played in a middle-aged female TTS voice.
When the passenger is middle-aged and male, the VPA is shown on the in-vehicle display screen as a middle-aged male figure, and the first guide phrase for middle-aged men is played in a middle-aged male TTS voice.
When the passenger is young and female, the VPA is shown on the in-vehicle display screen as a young female figure, and the first guide phrase for young women is played in a young female TTS voice.
When the passenger is young and male, the VPA is shown on the in-vehicle display screen as a young male figure, and the first guide phrase for young men is played in a young male TTS voice.
When the passenger's age falls in the infant group, the VPA is shown on the in-vehicle display screen as an infant figure, and the first guide phrase for infants is played in an infant TTS voice.
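The six cases above form a small lookup keyed by age group and, for the middle-aged and young groups only, by gender. A sketch with hypothetical key and asset names:

```python
def vpa_key(age_group, gender):
    """Elderly and infant VPAs are chosen by age alone, as in the cases above."""
    if age_group in ("elderly", "infant"):
        return age_group
    return f"{age_group}_{gender}"


def vpa_display_info(age_group, gender):
    key = vpa_key(age_group, gender)
    # Hypothetical asset names; image, guide phrase and TTS voice switch together.
    return {
        "image": f"{key}_image",
        "first_guide": f"{key}_guide",
        "tts_voice": f"{key}_tts",
    }


info = vpa_display_info("middle_aged", "female")
```

Keeping the three fields under one key ensures the image, the guide phrase, and the TTS voice always change together.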
Based on the same concept and corresponding to the method embodiments above, an embodiment of the present disclosure further provides a display system for a virtual personal assistant which, as shown in FIG. 6, includes a server 601 and a vehicle-mounted terminal 602 of a vehicle;
the vehicle-mounted terminal 602 is configured to obtain a target interaction instruction, determine the voice feature represented by the target interaction instruction, and send the voice feature to the server, where the target interaction instruction is a voice interaction instruction issued by a passenger in the vehicle;
the server 601 is configured to receive the voice feature represented by the target interaction instruction, identify the passenger's identity attribute information based on that voice feature, take the passenger's identity attribute information as target attribute information, obtain virtual personal assistant (VPA) display information matching the target attribute information, where the VPA display information includes a VPA image, and send the VPA display information to the vehicle-mounted terminal;
the vehicle-mounted terminal 602 is further configured to receive the VPA display information and output it through a display screen in the vehicle.
Optionally, the vehicle-mounted terminal 602 is specifically configured to:
output the VPA display information through the display screen of a designated seat area in the vehicle;
where the designated seat area is the seat area of the passenger who issued the target interaction instruction.
Optionally, the vehicle-mounted terminal 602 is further configured to:
if multiple voice interaction instructions are captured in the vehicle at the same time, select the instruction with the highest priority as the target interaction instruction, based on the priority of each voice interaction instruction;
where the priority of each voice interaction instruction is determined by the seat area of the passenger who issued it.
Optionally, the vehicle-mounted terminal 602 is further configured to:
if obtaining the VPA display information from the server fails, look up the historical VPA display information corresponding to the voice feature represented by the target interaction instruction, where the historical VPA display information is the VPA display information previously displayed to the passenger who issued the target interaction instruction;
select, from the historical VPA display information, the VPA display information with the highest display frequency.
In this embodiment, if the vehicle-mounted terminal 602 does not receive VPA display information from the server 601 within a timeout period after sending the voice feature to the server, obtaining the VPA display information from the server is considered to have failed.
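The timeout rule can be sketched with Python's standard `concurrent.futures` module. The function names and the timeout values are illustrative assumptions; the source does not specify a timeout duration or a transport.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


def fetch_vpa(request, timeout_s):
    """Run request() and return its result, or None if it exceeds timeout_s."""
    pool = ThreadPoolExecutor(max_workers=1)
    future = pool.submit(request)
    try:
        return future.result(timeout=timeout_s)
    except FutureTimeout:
        return None                  # treated as "failed to obtain VPA display information"
    finally:
        pool.shutdown(wait=False)    # do not block on a late reply


def slow_server():
    time.sleep(0.3)                  # simulates a reply that arrives too late
    return {"image": "late"}


def fast_server():
    return {"image": "on_time"}


late = fetch_vpa(slow_server, timeout_s=0.05)
on_time = fetch_vpa(fast_server, timeout_s=1.0)
```

A `None` result is the point at which the terminal would switch to the history-based selection described above.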
Optionally, the server 601 is specifically configured to:
obtain the VPA display information matching the target attribute information from a preset correspondence between identity attribute information and VPA display information.
Optionally, the server 601 is further configured to:
before identifying the passenger's identity attribute information based on the voice feature represented by the target interaction instruction, search a correspondence between voice features and user behavior data for the user behavior data corresponding to the voice feature represented by the target interaction instruction;
if such data is found, determine the VPA display information matching it and execute the step of outputting the VPA display information through the display screen in the vehicle;
if such data is not found, execute the step of identifying the passenger's identity attribute information based on the voice feature represented by the target interaction instruction.
Optionally, the VPA display information further includes a first guide phrase;
the server 601 is further configured to obtain, after the VPA display information has been output through the display screen in the vehicle, display status information for multimedia information, where the multimedia information is the information that the vehicle's head unit presents to the passenger through the display screen after the first guide phrase has been output through the display screen in the vehicle;
the server 601 is further configured to, if the display status information indicates that display of the multimedia information is complete, determine a second guide phrase matching the passenger's user behavior data and send the second guide phrase to the vehicle-mounted terminal;
the vehicle-mounted terminal 602 is further configured to receive the second guide phrase and output it through the display screen in the vehicle.
Optionally, the server 601 is specifically configured to:
input the voice feature represented by the target interaction instruction into a pre-trained neural network model and obtain the identity attribute information output by the neural network model.
Based on the same concept and corresponding to the method embodiments above, an embodiment of the present disclosure provides a display apparatus for a virtual personal assistant which, as shown in FIG. 7, includes an obtaining module 701, an identification module 702, and an output module 703.
The obtaining module 701 is configured to obtain the voice feature represented by a target interaction instruction, where the target interaction instruction is a voice interaction instruction issued by a passenger in the vehicle;
the identification module 702 is configured to identify the passenger's identity attribute information based on the voice feature represented by the target interaction instruction and take the passenger's identity attribute information as target attribute information;
the obtaining module 701 is further configured to obtain virtual personal assistant (VPA) display information matching the target attribute information, where the VPA display information includes a VPA image;
the output module 703 is configured to output the VPA display information through a display screen in the vehicle.
Optionally, the output module is specifically configured to:
output the VPA display information through the display screen of a designated seat area in the vehicle;
where the designated seat area is the seat area of the passenger who issued the target interaction instruction.
Optionally, the apparatus further includes a determination module configured to:
if multiple voice interaction instructions are captured in the vehicle at the same time, select the instruction with the highest priority as the target interaction instruction, based on the priority of each voice interaction instruction;
where the priority of each voice interaction instruction is determined by the seat area of the passenger who issued it.
Optionally, the obtaining module is specifically configured to:
look up the historical VPA display information corresponding to the voice feature represented by the target interaction instruction, where the historical VPA display information is the VPA display information previously displayed to the passenger who issued the target interaction instruction;
select, from the historical VPA display information, the VPA display information with the highest display frequency.
Optionally, the obtaining module is specifically configured to:
obtain the VPA display information matching the target attribute information from a preset correspondence between identity attribute information and VPA display information.
Optionally, the apparatus may further include a search module and an execution module;
the search module is configured to, before the passenger's identity attribute information is identified based on the voice feature represented by the target interaction instruction, search a correspondence between voice features and user behavior data for the user behavior data corresponding to the voice feature represented by the target interaction instruction;
the execution module is configured to, if such data is found, determine the VPA display information matching it and execute the step of outputting the VPA display information through the display screen in the vehicle;
the execution module is further configured to, if such data is not found, execute the step of identifying the passenger's identity attribute information based on the voice feature represented by the target interaction instruction.
Optionally, the VPA display information further includes a first guide phrase; the apparatus may further include a determination module;
the obtaining module is further configured to obtain, after the VPA display information has been output through the display screen in the vehicle, display status information for multimedia information, where the multimedia information is the information that the vehicle's head unit presents to the passenger through the display screen after the first guide phrase has been output through the display screen in the vehicle;
the determination module is configured to, if the display status information indicates that display of the multimedia information is complete, determine a second guide phrase matching the passenger's user behavior data;
the output module is further configured to output the second guide phrase through the display screen in the vehicle.
Optionally, the identification module is specifically configured to:
input the voice feature represented by the target interaction instruction into a pre-trained neural network model and obtain the identity attribute information output by the neural network model.
In the technical solutions of the present disclosure, the collection, storage, use, processing, transmission, provision, and disclosure of users' personal information all comply with the relevant laws and regulations and do not violate public order or good morals.
According to embodiments of the present disclosure, the present disclosure further provides an electronic device, a readable storage medium, and a computer program product.
FIG. 8 shows a schematic block diagram of an example electronic device 800 that may be used to implement embodiments of the present disclosure. The electronic device is intended to represent various forms of digital computers, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions are examples only and are not intended to limit the implementations of the present disclosure described and/or claimed herein.
As shown in FIG. 8, the electronic device 800 includes a computing unit 801, which can perform various appropriate actions and processing according to a computer program stored in a read-only memory (ROM) 802 or a computer program loaded from a storage unit 808 into a random access memory (RAM) 803. The RAM 803 may also store various programs and data required for the operation of the electronic device 800. The computing unit 801, the ROM 802, and the RAM 803 are connected to one another through a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
Multiple components of the electronic device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard or a mouse; an output unit 807, such as various types of displays and speakers; a storage unit 808, such as a magnetic disk or an optical disc; and a communication unit 809, such as a network card, a modem, or a wireless communication transceiver. The communication unit 809 allows the electronic device 800 to exchange information/data with other devices through a computer network such as the Internet and/or various telecommunication networks.
The computing unit 801 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units that run machine learning model algorithms, a digital signal processor (DSP), and any appropriate processor, controller, or microcontroller. The computing unit 801 performs the methods and processing described above, such as the method for displaying a virtual personal assistant. For example, in some embodiments, the method for displaying a virtual personal assistant may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of the computer program may be loaded and/or installed onto the electronic device 800 via the ROM 802 and/or the communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the method for displaying a virtual personal assistant described above may be performed. Alternatively, in other embodiments, the computing unit 801 may be configured to perform the method for displaying a virtual personal assistant in any other appropriate manner (for example, by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, field programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These implementations may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor. The programmable processor may be a special-purpose or general-purpose programmable processor, and may receive data and instructions from a storage system, at least one input device, and at least one output device, and transmit data and instructions to the storage system, the at least one input device, and the at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general-purpose computer, a special-purpose computer, or another programmable data processing apparatus, so that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be carried out. The program code may execute entirely on a machine, partly on a machine, as a stand-alone software package partly on a machine and partly on a remote machine, or entirely on a remote machine or server.
In the context of the present disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by, or in conjunction with, an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium include an electrical connection based on one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide interaction with a user, the systems and techniques described herein may be implemented on a computer having a display device (for example, a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to the user, and a keyboard and a pointing device (for example, a mouse or a trackball) through which the user can provide input to the computer. Other kinds of devices may also be used to provide interaction with the user; for example, the feedback provided to the user may be any form of sensory feedback (for example, visual, auditory, or tactile feedback), and input from the user may be received in any form (including acoustic, speech, or tactile input).
The systems and techniques described herein may be implemented in a computing system that includes a back-end component (for example, a data server), or a computing system that includes a middleware component (for example, an application server), or a computing system that includes a front-end component (for example, a user computer with a graphical user interface or a web browser through which a user can interact with implementations of the systems and techniques described herein), or a computing system that includes any combination of such back-end, middleware, or front-end components. The components of the system may be interconnected by any form or medium of digital data communication (for example, a communication network). Examples of communication networks include a local area network (LAN), a wide area network (WAN), and the Internet.
A computer system may include clients and servers. A client and a server are generally remote from each other and typically interact through a communication network. The client-server relationship arises from computer programs that run on the respective computers and have a client-server relationship with each other. The server may be a cloud server, a server of a distributed system, or a server combined with a blockchain.
It should be understood that steps may be reordered, added, or deleted using the various forms of flow shown above. For example, the steps described in the present disclosure may be performed in parallel, sequentially, or in a different order, as long as the desired result of the technical solutions disclosed herein can be achieved; no limitation is imposed in this respect.
The specific implementations described above do not limit the scope of protection of the present disclosure. Those skilled in the art should understand that various modifications, combinations, sub-combinations, and substitutions may be made depending on design requirements and other factors. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present disclosure shall fall within the scope of protection of the present disclosure.

Claims (20)

  1. A method for displaying a virtual personal assistant, comprising:
    acquiring voice features represented by a target interaction instruction, wherein the target interaction instruction is a voice interaction instruction issued by a passenger in a vehicle;
    identifying identity attribute information of the passenger based on the voice features represented by the target interaction instruction, and taking the identity attribute information of the passenger as target attribute information;
    acquiring virtual personal assistant (VPA) display information matching the target attribute information, the VPA display information including a VPA image; and
    outputting the VPA display information through a display screen in the vehicle.
  2. The method according to claim 1, wherein outputting the VPA display information through the display screen in the vehicle comprises:
    outputting the VPA display information through a display screen of a designated seat area in the vehicle;
    wherein the designated seat area is the seat area in which the passenger who issued the target interaction instruction is located.
  3. The method according to claim 1, wherein the target interaction instruction is determined by:
    if multiple voice interaction instructions are collected in the vehicle at the same time, selecting, based on the priority of each voice interaction instruction, the instruction with the highest priority as the target interaction instruction;
    wherein the priority of each voice interaction instruction is determined based on the seat area in which the passenger who issued that voice interaction instruction is located.
  4. The method according to any one of claims 1-3, wherein acquiring the virtual personal assistant (VPA) display information matching the target attribute information comprises:
    looking up historical VPA display information corresponding to the voice features represented by the target interaction instruction, wherein the historical VPA display information is VPA display information that has previously been displayed to the passenger who issued the target interaction instruction; and
    selecting, from the historical VPA display information, the VPA display information with the highest display frequency.
  5. The method according to claim 1, wherein acquiring the virtual personal assistant (VPA) display information matching the target attribute information comprises:
    acquiring the VPA display information matching the target attribute information from a preset correspondence between items of identity attribute information and VPA display information.
  6. The method according to claim 1, wherein before identifying the identity attribute information of the passenger based on the voice features represented by the target interaction instruction, the method further comprises:
    looking up, in a correspondence between voice features and user behavior data, the user behavior data corresponding to the voice features represented by the target interaction instruction;
    if such user behavior data is found, determining the VPA display information matching the found user behavior data, and performing the step of outputting the VPA display information through the display screen in the vehicle; and
    if no such user behavior data is found, performing the step of identifying the identity attribute information of the passenger based on the voice features represented by the target interaction instruction.
  7. The method according to claim 1, wherein the VPA display information further includes a first guide phrase, and after outputting the VPA display information through the display screen in the vehicle, the method further comprises:
    acquiring display status information for multimedia information, wherein the multimedia information is the information that the head unit of the vehicle presents to the passenger through the display screen after the first guide phrase is output through the display screen in the vehicle;
    if the display status information indicates that display of the multimedia information is complete, determining a second guide phrase matching the user behavior data of the passenger; and
    outputting the second guide phrase through the display screen in the vehicle.
  8. The method according to any one of claims 1 and 5-7, wherein identifying the identity attribute information of the passenger based on the voice features represented by the target interaction instruction comprises:
    inputting the voice features represented by the target interaction instruction into a pre-trained neural network model, and obtaining the identity attribute information output by the neural network model.
  9. A system for displaying a virtual personal assistant, comprising a server and a vehicle-mounted terminal of a vehicle, wherein:
    the vehicle-mounted terminal is configured to acquire a target interaction instruction, determine voice features represented by the target interaction instruction, and send the voice features to the server, the target interaction instruction being a voice interaction instruction issued by a passenger in the vehicle;
    the server is configured to receive the voice features represented by the target interaction instruction, identify identity attribute information of the passenger based on those voice features, take the identity attribute information of the passenger as target attribute information, acquire virtual personal assistant (VPA) display information matching the target attribute information, the VPA display information including a VPA image, and send the VPA display information to the vehicle-mounted terminal; and
    the vehicle-mounted terminal is further configured to receive the VPA display information and output the VPA display information through a display screen in the vehicle.
  10. The system according to claim 9, wherein the vehicle-mounted terminal is specifically configured to:
    output the VPA display information through a display screen of a designated seat area in the vehicle;
    wherein the designated seat area is the seat area in which the passenger who issued the target interaction instruction is located.
  11. The system according to claim 9, wherein the vehicle-mounted terminal is further configured to:
    if multiple voice interaction instructions are collected in the vehicle at the same time, select, based on the priority of each voice interaction instruction, the instruction with the highest priority as the target interaction instruction;
    wherein the priority of each voice interaction instruction is determined based on the seat area in which the passenger who issued that voice interaction instruction is located.
  12. The system according to any one of claims 9-11, wherein the vehicle-mounted terminal is further configured to:
    if acquiring the VPA display information from the server fails, look up historical VPA display information corresponding to the voice features represented by the target interaction instruction, wherein the historical VPA display information is VPA display information that has previously been displayed to the passenger who issued the target interaction instruction; and
    select, from the historical VPA display information, the VPA display information with the highest display frequency.
  13. The system according to claim 9, wherein the server is specifically configured to:
    acquire the VPA display information matching the target attribute information from a preset correspondence between items of identity attribute information and VPA display information.
  14. The system according to claim 9, wherein the server is further configured to:
    before identifying the identity attribute information of the passenger based on the voice features represented by the target interaction instruction, look up, in a correspondence between voice features and user behavior data, the user behavior data corresponding to the voice features represented by the target interaction instruction;
    if such user behavior data is found, determine the VPA display information matching the found user behavior data, and perform the step of outputting the VPA display information through the display screen in the vehicle; and
    if no such user behavior data is found, perform the step of identifying the identity attribute information of the passenger based on the voice features represented by the target interaction instruction.
  15. The system according to claim 9, wherein the VPA display information further includes a first guide phrase;
    the server is further configured to, after the VPA display information is output through the display screen in the vehicle, acquire display status information for multimedia information, wherein the multimedia information is the information that the head unit of the vehicle presents to the passenger through the display screen after the first guide phrase is output through the display screen in the vehicle;
    the server is further configured to, if the display status information indicates that display of the multimedia information is complete, determine a second guide phrase matching the user behavior data of the passenger, and send the second guide phrase to the vehicle-mounted terminal; and
    the vehicle-mounted terminal is further configured to receive the second guide phrase and output the second guide phrase through the display screen in the vehicle.
  16. The system according to any one of claims 9 and 13-15, wherein the server is specifically configured to:
    input the voice features represented by the target interaction instruction into a pre-trained neural network model, and obtain the identity attribute information output by the neural network model.
  17. An apparatus for displaying a virtual personal assistant, comprising:
    an acquisition module, configured to acquire voice features represented by a target interaction instruction, wherein the target interaction instruction is a voice interaction instruction issued by a passenger in a vehicle;
    an identification module, configured to identify identity attribute information of the passenger based on the voice features represented by the target interaction instruction, and take the identity attribute information of the passenger as target attribute information;
    the acquisition module being further configured to acquire virtual personal assistant (VPA) display information matching the target attribute information, the VPA display information including a VPA image; and
    an output module, configured to output the VPA display information through a display screen in the vehicle.
  18. An electronic device, comprising:
    at least one processor; and
    a memory communicatively connected to the at least one processor; wherein
    the memory stores instructions executable by the at least one processor, and the instructions, when executed by the at least one processor, enable the at least one processor to perform the method according to any one of claims 1-8.
  19. A non-transitory computer-readable storage medium storing computer instructions, wherein the computer instructions are used to cause a computer to perform the method according to any one of claims 1-8.
  20. A computer program product, comprising a computer program which, when executed by a processor, implements the steps of the method according to any one of claims 1-8.
PCT/CN2021/113299 2021-08-18 2021-08-18 Virtual personal assistant displaying method and apparatus, device, medium, and product WO2023019475A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2021/113299 WO2023019475A1 (en) 2021-08-18 2021-08-18 Virtual personal assistant displaying method and apparatus, device, medium, and product

Publications (1)

Publication Number Publication Date
WO2023019475A1 true WO2023019475A1 (en) 2023-02-23

Family

ID=85239326

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/113299 WO2023019475A1 (en) 2021-08-18 2021-08-18 Virtual personal assistant displaying method and apparatus, device, medium, and product

Country Status (1)

Country Link
WO (1) WO2023019475A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110171372A (en) * 2019-05-27 2019-08-27 广州小鹏汽车科技有限公司 Interface display method, device and the vehicle of car-mounted terminal
CN110427472A (en) * 2019-08-02 2019-11-08 深圳追一科技有限公司 The matched method, apparatus of intelligent customer service, terminal device and storage medium
CN111381673A (en) * 2018-12-28 2020-07-07 哈曼国际工业有限公司 Bidirectional vehicle-mounted virtual personal assistant
US20200339142A1 (en) * 2019-02-28 2020-10-29 Google Llc Modalities for authorizing access when operating an automated assistant enabled vehicle
CN112959998A (en) * 2021-03-19 2021-06-15 恒大新能源汽车投资控股集团有限公司 Vehicle-mounted human-computer interaction method and device, vehicle and electronic equipment

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21953722

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE