CN113139070A - Interaction method and device for in-vehicle user, computer equipment and storage medium - Google Patents

Interaction method and device for in-vehicle user, computer equipment and storage medium

Info

Publication number
CN113139070A
CN113139070A (application CN202110308796.7A)
Authority
CN
China
Prior art keywords
information
vehicle
audio
video information
target audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110308796.7A
Other languages
Chinese (zh)
Inventor
陈少吉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hechuang Automotive Technology Co Ltd
Original Assignee
Hechuang Automotive Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hechuang Automotive Technology Co Ltd filed Critical Hechuang Automotive Technology Co Ltd
Priority to CN202110308796.7A priority Critical patent/CN113139070A/en
Publication of CN113139070A publication Critical patent/CN113139070A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/40Information retrieval; Database structures therefor; File system structures therefor of multimedia data, e.g. slideshows comprising image and additional audio data
    • G06F16/43Querying
    • G06F16/435Filtering based on additional data, e.g. user or group profiles
    • G06F16/436Filtering based on additional data, e.g. user or group profiles using biological or physiological data of a human being, e.g. blood pressure, facial expression, gestures
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/63Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Biophysics (AREA)
  • Child & Adolescent Psychology (AREA)
  • Molecular Biology (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • User Interface Of Digital Computer (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Physiology (AREA)
  • Hospice & Palliative Care (AREA)
  • Psychiatry (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Life Sciences & Earth Sciences (AREA)

Abstract

The application provides an interaction method and apparatus for an in-vehicle user, a computer device, and a storage medium. The method comprises the following steps: acquiring voice information of a rear-row user in a vehicle; performing emotion recognition on the voice information and obtaining the emotional state of the rear-row user from the recognition result; determining target audio/video information corresponding to the emotional state from a preconfigured audio/video information library; and sending the target audio/video information to an in-vehicle playback device for playing. By recognizing emotion from voice and pushing related audio/video content to soothe rear-row passengers, the scheme improves the efficiency of soothing rear-row users, spares front-row users from devoting substantial attention to the rear row, and thereby improves the safety of the driving process.

Description

Interaction method and device for in-vehicle user, computer equipment and storage medium
Technical Field
The present application relates to the field of intelligent control of automobiles, and in particular, to an in-automobile user interaction method, apparatus, computer device, and storage medium.
Background
Automobiles are increasingly common in everyday life. For safety, drivers often seat children in the rear row, watching them through the rear-view mirror and talking with them to judge their current state, so that the children can be soothed promptly when something is abnormal.
In the prior art, a driver usually has to spend time and energy watching the behavior of a child in the rear row, which easily distracts the driver while driving and affects driving safety.
Disclosure of Invention
Based on this, it is necessary to provide an in-vehicle user interaction method, apparatus, computer device, and storage medium that solve the prior-art technical problem that attending to rear-row children affects driving safety.
A method of interaction by a user in a vehicle, the method comprising: acquiring voice information of rear row users in the vehicle;
performing emotion recognition on the voice information, and obtaining the emotion state of the back-row user based on the emotion recognition result;
determining target audio-video information corresponding to the emotion state from a pre-configured audio-video information base;
and sending the target audio and video information to vehicle-mounted playing equipment so that the vehicle-mounted playing equipment plays the audio and video information.
In one embodiment, the target audio and video information comprises target audio information and projection information corresponding to the target audio information, and the vehicle-mounted playing device comprises a vehicle-mounted audio terminal and a projection terminal; the sending the target audio and video information to a vehicle-mounted playing device so as to enable the vehicle-mounted device to play the target audio and video information comprises the following steps:
and sending the target audio information to a vehicle-mounted audio terminal, and sending the projection information to a projection terminal.
In one embodiment, after the sending the target audio information to the car audio terminal and the sending the projection information to the projection terminal, the method further includes:
acquiring a content tag corresponding to the target audio information;
determining an in-vehicle environment adjustment parameter corresponding to the target audio information according to the content tag;
and adjusting the corresponding vehicle-mounted equipment according to the in-vehicle environment adjustment parameters.
In one embodiment, the determining, from a preconfigured audiovisual information library, target audiovisual information corresponding to the emotional state includes:
if the emotional state meets a preset interaction condition, playing preset interaction information through the vehicle-mounted robot;
and acquiring a control instruction of a front row user in the vehicle for the preset interactive information, and determining the target audio and video information corresponding to the emotional state according to the control instruction.
In one embodiment, after performing emotion recognition on the voice information and obtaining an emotional state of the rear user based on a result of the emotion recognition, the method further includes:
if the emotional state meets a preset interaction condition, acquiring video information of the back-row users through vehicle-mounted camera equipment;
and sending the video information to a front row display module.
In one embodiment, the performing emotion recognition on the voice information, and obtaining an emotional state of the rear row user based on a result of the emotion recognition includes:
acquiring text content in the voice information;
and inputting the text content into a preset emotion recognition model, and obtaining the emotion state of the back-row user according to the output result of the emotion recognition model.
In one embodiment, the performing content emotion recognition on the voice information to obtain an emotional state of the user further includes:
and detecting suspected child crying information of the rear row users, and determining the emotional states of the rear row users according to the detection result of the suspected child crying information.
An in-vehicle user interaction device, the device comprising:
the voice information acquisition module is used for acquiring the voice information of the rear row users in the vehicle;
the emotion state recognition module is used for carrying out emotion recognition on the voice information and obtaining the emotion state of the back-row user based on the emotion recognition result;
the target information acquisition module is used for determining target audio and video information corresponding to the emotion state from a pre-configured audio and video information database;
and the target information sending module is used for sending the target audio and video information to vehicle-mounted playing equipment so as to enable the vehicle-mounted equipment to play the audio and video information.
A computer device comprising a memory storing a computer program and a processor implementing the steps of the in-vehicle user interaction method of any of the above embodiments when the processor executes the computer program.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method of interaction by a user in a vehicle according to any of the preceding embodiments.
According to the in-vehicle user interaction method, apparatus, computer device, and storage medium, voice information of a rear-row user in the vehicle is acquired, emotion recognition is performed on the voice information, the emotional state of the rear-row user is obtained from the recognition result, target audio/video information corresponding to the emotional state is determined from a preconfigured audio/video information library, and the audio/video information is sent to the in-vehicle playback device for playing. Because emotion recognition is based on voice information and related audio/video content is pushed to soothe rear-row passengers, the soothing efficiency for rear-row users is improved, front-row users need not devote substantial attention to the rear row, and the safety of the driving process is improved.
Drawings
FIG. 1 is a diagram of an exemplary environment in which a method for user interaction in a vehicle may be implemented;
FIG. 2 is a schematic flow chart diagram illustrating a method of user interaction in a vehicle in one embodiment;
FIG. 3 is a schematic view of a projection area in one embodiment;
FIG. 4 is a schematic diagram of an in-vehicle device in one embodiment;
FIG. 5 is a flowchart illustrating a method for user interaction in a vehicle according to another embodiment;
FIG. 6 is a block diagram of an exemplary in-vehicle user interaction device;
FIG. 7 is a diagram illustrating an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The interaction method for the in-vehicle user can be applied to the application environment shown in fig. 1. The radio device 102 may communicate with the vehicle terminal 104 through a network, and the vehicle terminal 104 communicates with the in-vehicle playback device 106 through the network. The radio device 102 acquires voice information of users in the vehicle and sends it to the vehicle terminal 104; after the vehicle terminal 104 performs emotion recognition on the voice information, it pushes corresponding audio/video information to the in-vehicle playback device 106 for playing. The in-vehicle playback device may be, but is not limited to, an audio player, a video player, a display screen, a projector, or a vehicle-mounted robot.
In one embodiment, as shown in fig. 2, an interaction method for a user in a vehicle is provided, which is described by taking the application of the method to the vehicle terminal of fig. 1 as an example, and includes the following steps:
step S201, voice information of the rear row users in the vehicle is obtained.
In the present disclosure, the voice information refers to sound made by a user in the rear row of the vehicle, which may be as short as a sentence or a single syllable. For example, it may be interactive, such as speech directed at another rear-row user or at the vehicle-mounted robot, or non-interactive, such as crying uttered by the rear-row user.
In a specific implementation, the vehicle terminal 104 may obtain the voice information of the rear row users in the vehicle through the radio device 102.
And step S202, performing emotion recognition on the voice information, and obtaining the emotion state of the back-row user based on the emotion recognition result.
In the present disclosure, emotion recognition refers to automatically recognizing an individual's emotional state from physiological or non-physiological signals; the user's emotional state can be recognized from the user's voice, conversation content, and the like. The result of emotion recognition may be the probability that the rear-row user belongs to a given emotional state.
The classification of the emotional state can be determined according to the theoretical classification of the emotion and the training condition of the classification sample of the emotional state, such as happy, frightened, sad, etc.
In a specific implementation, the vehicle terminal 104 may perform emotion recognition on the acquired voice information of the rear-row user, obtain an emotional-state recognition result corresponding to the voice information, and thereby determine the emotional state of the rear-row user.
For example, vehicle terminal 104 may input the voice information into a pre-trained emotional state recognition model, obtain probabilities that the user belongs to each emotional state classification, and determine an emotional state with the highest probability as the emotional state of the rear user.
For another example, the vehicle terminal 104 may obtain voice information of the rear-row user for a certain time period, recognize emotion of the voice information in the time period to determine a change trend of an emotional state of the rear-row user in the certain time period, and determine the emotional state of the rear-row user according to the change trend.
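The two selection strategies above (highest single-utterance probability, or the dominant state over a recent time window) can be sketched in a few lines of Python; the state labels and scores below are invented for illustration and are not taken from the patent:

```python
from collections import Counter

# Hypothetical sketch of how vehicle terminal 104 might pick an emotional state.
def classify_emotion(probabilities):
    """Return the emotional state with the highest predicted probability."""
    return max(probabilities, key=probabilities.get)

def trend_emotion(recent_states):
    """Return the most frequent state over a recent window of utterances."""
    return Counter(recent_states).most_common(1)[0][0]

scores = {"happy": 0.1, "calm": 0.2, "crying": 0.7}
print(classify_emotion(scores))                     # crying
print(trend_emotion(["calm", "crying", "crying"]))  # crying
```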
And step S203, determining target audio and video information corresponding to the emotional state from a pre-configured audio and video information base.
In the present disclosure, the audio/video information base may include audio/video information configured locally by the vehicle terminal 104, or audio/video information acquired by the vehicle terminal 104 from other databases. Vehicle terminal 104 may establish an association relationship between the audio-video information and the emotional state, for example, by means of a classification tag, an association configuration table, and the like.
In the present disclosure, the target audio/video information refers to information corresponding to an emotional state of a back-row user, and may include audio information and video information. The target audiovisual information may comprise a plurality of pieces of audiovisual information.
In a specific implementation, the vehicle terminal 104 may select, according to the emotional state and according to the association relationship between the audio/video information and the emotional state, the target audio/video information corresponding to the emotional state from the audio/video information library.
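The association relationship between emotional states and library entries could be as simple as a lookup table; the sketch below is an assumption for illustration, with invented entry names and states:

```python
# Illustrative audio/video information library keyed by emotional state.
AV_LIBRARY = {
    "crying": [{"audio": "lullaby_01", "projection": "slow_moving_beams"}],
    "sad":    [{"audio": "animated_story_03", "projection": "story_scenes"}],
    "happy":  [{"audio": "childrens_song_02", "projection": "goldfish"}],
}

def select_target_av(emotional_state):
    """Return the target audio/video entries associated with the state."""
    return AV_LIBRARY.get(emotional_state, [])
```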
And step S204, sending the target audio and video information to the vehicle-mounted playing equipment so that the vehicle-mounted playing equipment plays the audio and video information.
In this disclosure, the vehicle-mounted playing device 106 may be configured to play audio and video information, such as playing audio and playing video, or may play audio and video simultaneously through different vehicle-mounted playing devices.
In a specific implementation, the vehicle-mounted terminal 104 may send part or all of the audio and video in the target audio and video information to the vehicle-mounted playing device 106.
According to this in-vehicle user interaction method, voice information of the rear-row user in the vehicle is acquired, emotion recognition is performed on it, the emotional state of the rear-row user is obtained from the recognition result, target audio/video information corresponding to the emotional state is determined from a preconfigured audio/video information library, and the audio/video information is sent to the in-vehicle playback device for playing. Because emotion recognition is based on voice information and related audio/video content is pushed to soothe rear-row passengers, the soothing efficiency for rear-row users is improved, front-row users need not devote substantial attention to the rear row, and driving safety is improved.
In one embodiment, the target audio/video information may include target audio information and projection information corresponding to the target audio information, and the vehicle-mounted playback device includes a vehicle-mounted audio terminal and a projection terminal. In this case, the step S204 of sending the target audio/video information to the vehicle-mounted playback device so that it plays the target audio/video information includes:
and sending the target audio information to the vehicle-mounted audio terminal, and sending the projection information to the projection terminal.
In the present disclosure, the vehicle-mounted audio terminal may be a display screen, a vehicle-mounted robot, or other audio playing devices.
In the present disclosure, as shown in fig. 3, the projection area of the projection terminal may be the rear seat surface and the backs of the driver and front-passenger seats, and is used for displaying the projection information. An association relationship can be established between the target audio information and the projection information, so that different projection content is configured for different target audio information.
For example, the projected content may match the audio content: when the audio being played is an animated story, an interactive scene from the story may be projected. The projected content may instead merely relate to the audio content: when the audio is a story set in a particular scene, a corresponding backdrop such as a forest, outer space, or an animal theater may be shown. Finally, the projected content may be independent of the audio content, such as abstract light-and-shadow interactions: slowly moving beams, goldfish, ants, growing plants, and the like.
For another example, when the emotional state of the user is a sad state, it may be determined that the corresponding target audio information is played in the background on the vehicle-mounted screen, and the projection content corresponding to the target audio information is played through projection, so as to sooth the emotion of the back-row user.
In some cases, the projection information associated with audio information may be preset according to the preferences of the rear-row user, to improve the efficiency of soothing that user's mood.
According to the scheme of the embodiment, the sound and visual scenes matched with the emotional states are provided for the rear users by sending the target audio and the corresponding projection information to the vehicle-mounted playing device 106, so that timely response to the emotional states of the rear users is achieved.
In one embodiment, after the target audio information is sent to the vehicle-mounted audio terminal and the projection information is sent to the projection terminal in step S204, the method further includes:
acquiring a content tag corresponding to target audio information; determining an in-vehicle environment adjustment parameter corresponding to the target audio information according to the content tag; and adjusting the corresponding vehicle-mounted equipment according to the in-vehicle environment adjustment parameters.
In the disclosure, a content tag is a content classification of the audio/video information, such as soothing or energetic, and different content tags can be matched with different in-vehicle temperatures, ventilation levels, seat positions, and the like.
In the present disclosure, as shown in fig. 4, the in-vehicle environment adjustment parameters may include an air conditioner air speed, a ventilation degree of the rear seat, a fragrance, and the like, and according to the in-vehicle environment adjustment parameters, the ventilation, the seat, and the fragrance in the vehicle are adjusted to improve a comfort degree of the rear user in a scene of playing audio and video.
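A content-tag-to-environment mapping of this kind might be sketched as follows; the tag names and parameter values are assumptions for illustration, not values from the patent:

```python
# Hypothetical presets mapping a content tag to in-vehicle environment
# adjustment parameters (air-conditioner fan speed, seat ventilation, fragrance).
ENV_PRESETS = {
    "soothing":  {"ac_fan_speed": 1, "seat_ventilation": "low",  "fragrance": "lavender"},
    "energetic": {"ac_fan_speed": 3, "seat_ventilation": "high", "fragrance": "citrus"},
}

def env_params_for(content_tag):
    """Look up adjustment parameters, defaulting to the soothing preset."""
    return ENV_PRESETS.get(content_tag, ENV_PRESETS["soothing"])
```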
In one embodiment, the step of determining the target audio-video information corresponding to the emotional state from the pre-configured audio-video information library in step S203 includes:
if the emotional state meets the preset interaction condition, playing preset interaction information through the vehicle-mounted robot; and acquiring a control instruction of a front row user in the vehicle for preset interactive information, and determining target audio and video information corresponding to the emotional state according to the control instruction.
In the present disclosure, the preset interaction condition means that the emotional state of the rear user is a negative state, such as an angry, sad or crying state.
In this disclosure, the preset interactive information refers to options corresponding to the audio/video information that can be played. For example, the vehicle-mounted robot may broadcast: "It sounds like the child needs soothing. You can tell me which number to play: 1. children's songs; 2. fairy tales; 3. bedtime stories; 4. introductory Chinese classics; 5. children's English."
When the vehicle-mounted robot broadcasts the voice, the vehicle-mounted robot can turn to the front passenger and perform directional voice recognition so as to improve the efficiency and accuracy of obtaining the front user instruction.
In the present disclosure, the control instruction refers to an instruction given by a front-row user for interactive information played by the in-vehicle robot, and the control instruction may include directional selection, for example: selecting the 1 st; default options may also be included, such as: good/play immediately; it may also be a cancellation recommendation, for example: not needed/cancelled.
If the front-row user's control instruction falls outside the range of preset answers, the vehicle-mounted robot may repeat the voice broadcast and remind the front-row user to choose only from the broadcast options, or prompt that no matching option was found.
In a specific implementation, the vehicle terminal 104 may play the preset interactive information through the vehicle-mounted robot, receive the front-row user's control instruction for that information through the vehicle-mounted robot, and determine the target audio/video information corresponding to the emotional state from the audio/video information library according to the control instruction.
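Interpreting the front-row user's reply (directional selection, default option, cancellation, or an out-of-range answer that triggers a re-broadcast) could be sketched like this; the option names and trigger phrases are illustrative assumptions, not the patent's wording:

```python
# Hypothetical options matching the robot's broadcast, and a simple parser.
OPTIONS = ["children's songs", "fairy tales", "bedtime stories",
           "Chinese classics", "children's English"]

def parse_control_instruction(utterance):
    """Classify a reply as select / default / cancel / reprompt."""
    text = utterance.strip().lower()
    if text in {"no", "cancel", "not needed"}:
        return ("cancel", None)
    if text in {"ok", "good", "play immediately"}:
        return ("default", OPTIONS[0])       # agreement falls back to the first option
    for i, name in enumerate(OPTIONS, start=1):
        if f"number {i}" in text or name.lower() in text:
            return ("select", name)
    return ("reprompt", None)                # outside preset answers: broadcast again
```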
According to the scheme of this embodiment, the target audio/video information corresponding to the emotional state is determined through interaction with the front-row user, which makes it easy to capture the real-time in-vehicle situation and needs, and improves how well the selected target audio/video information fits them.
In one embodiment, after emotion recognition is performed on the voice information in step S202 and the emotional state of the rear-row user is obtained from the recognition result, the method further includes:
if the emotional state meets the preset interaction condition, acquiring video information of the back-row users through the vehicle-mounted camera equipment; and sending the video information to a front row display module.
In the present disclosure, the preset interaction condition means that the emotional state of the rear user is a negative state, such as an angry, sad or crying state.
In the present disclosure, the vehicle-mounted image pickup apparatus may be disposed at a position where a state of a rear user can be photographed, and may be turned on by a control instruction of the vehicle terminal 104.
In the disclosure, the front display module may be a display screen located in the front of the vehicle or other external display devices, and may be used to display video information captured by the vehicle-mounted camera device.
In a specific implementation, when the vehicle terminal 104 determines that the rear-row user is in a negative state, it can start the vehicle-mounted camera device to film the rear-row user, obtain the corresponding video information, and send it to the front-row display module, so that the front-row user can see the current state of the rear-row user without turning around, improving the safety of the driving process.
It should be noted that the front display module may have an audio playing function, and the priority of the front display module for displaying video and playing audio may be configured, for example, the target audio may be played in the background, and the video information of the rear user is displayed in full screen.
According to the scheme of this embodiment, camera footage of the rear-row user is acquired and shown on the front-row display module, so that when the rear-row user is in a negative state, the front-row user can learn the rear-row user's current state in time without being distracted or looking back, improving the safety of the driving process.
In one embodiment, the step of performing emotion recognition on the voice information in step S202, and obtaining an emotional state of the rear user based on a result of the emotion recognition, includes:
acquiring text content in the voice information, inputting the text content into a preset emotion recognition model, and obtaining the emotion state of the back-row user according to the output result of the emotion recognition model.
In this embodiment, the text content in the voice message refers to the text information obtained by converting the voice message.
In the present disclosure, the preset emotion recognition model may be a model trained on text converted from historical speech samples, which predicts the emotion contained in voice information from its text transcript. For example, HiGRU (a hierarchical gated recurrent unit model for conversational emotion recognition) or VADER (a lexicon- and rule-based text sentiment analyzer) may be used.
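As a deliberately crude stand-in for such a model, the sketch below scores a transcript by keyword votes, in the spirit of VADER's lexicon approach; the lexicon, labels, and tie-breaking are all invented for illustration and are no substitute for a trained model:

```python
# Toy lexicon mapping keywords to emotional states; purely illustrative.
LEXICON = {"scared": "frightened", "afraid": "frightened",
           "sad": "sad", "cry": "sad",
           "yay": "happy", "fun": "happy"}

def recognize_text_emotion(text_content):
    """Vote over lexicon hits; return 'neutral' when nothing matches."""
    votes = {}
    for word in text_content.lower().split():
        label = LEXICON.get(word.strip(".,!?"))
        if label:
            votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get) if votes else "neutral"
```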
According to the scheme of the embodiment, emotion recognition is carried out on the text content in the voice information through the emotion recognition model, the emotion state is obtained based on the voice information of the user, and the emotion state obtaining efficiency is improved.
In one embodiment, the step S202 of performing emotion recognition on the voice information to obtain the emotional state of the user further includes:
and detecting suspected child crying information of the back row users, and determining the emotional states of the back row users according to the detection result of the suspected child crying information.
In this embodiment, when the rear-row user is an infant or a child, the emotional state of the rear-row user may be determined by crying detection.
In a specific implementation, the vehicle terminal 104 may capture the rear-row user's sound through the radio device 102; if suspected child-crying information is detected, it can be analyzed further to estimate the probability that a child is actually crying, and thereby determine that the emotional state of the rear-row user is a crying state.
The vehicle terminal 104 may obtain target audio and video information corresponding to the crying state of the child, play the audio information on the vehicle-mounted playing device, and play the projection information corresponding to the audio information through the projection device. The vehicle terminal can also acquire video information of the rear row users shot by the vehicle-mounted camera equipment according to the crying information of the children and send the video information to the front row display module.
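The two-stage crying check and the actions it triggers can be sketched as follows; the confidence threshold and action names are assumptions introduced for illustration:

```python
CRY_THRESHOLD = 0.8  # hypothetical confidence cutoff for confirmed crying

def handle_suspected_crying(cry_probability):
    """Confirm suspected crying and list the follow-up actions described above."""
    if cry_probability >= CRY_THRESHOLD:
        return {"state": "crying",
                "actions": ["play_target_audio", "play_projection", "show_rear_video"]}
    return {"state": "uncertain", "actions": []}
```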
According to the scheme of this embodiment, the crying state of a child is determined by detecting child-crying information, so that when the rear-row user is an infant or a child, that user can be soothed in time and the front-row user can learn the infant's or child's state in time, which improves interaction efficiency and also the safety of the driving process.
In one embodiment, as shown in fig. 5, there is provided an in-vehicle user interaction method, including:
step S501, the vehicle terminal obtains the voice information of the back row users in the vehicle through the radio equipment.
Step S502, the vehicle terminal acquires text content in the voice message; and inputting the text content into a preset emotion recognition model, and obtaining the emotion state of the back-row user according to the output result of the emotion recognition model.
Step S503, if the emotional state meets the preset interaction condition, acquiring video information of the back-row users through the vehicle-mounted camera equipment; and sending the video information to a front row display module.
And step S504, the vehicle terminal determines target audio information corresponding to the emotional state and projection information corresponding to the target audio information from a pre-configured audio and video information base.
Step S505, the vehicle terminal sends the target audio information to the vehicle-mounted audio terminal, and sends the projection information to the projection terminal, so that the vehicle-mounted audio terminal plays the target audio information, and the projection terminal plays the projection information.
Step S506, the vehicle terminal acquires a content label corresponding to the target audio information; determining an in-vehicle environment adjustment parameter corresponding to the target audio information according to the content tag; and adjusting the corresponding vehicle-mounted equipment according to the in-vehicle environment adjustment parameters.
According to this embodiment, the voice information of the rear-row users in the vehicle is obtained, emotion recognition is performed on the voice information through the emotion recognition model, and the emotional states of the rear-row users are obtained from the recognition results. The video information of the rear-row users is sent to the front-row display module; the target audio information corresponding to the emotional state and the corresponding projection information are determined from the pre-configured audio and video information base and sent to the vehicle-mounted playing devices for playback; and the vehicle-mounted devices are adjusted according to the in-vehicle environment adjustment parameters corresponding to the audio information. This improves the efficiency of pacifying the rear-row users, prevents the front-row users from being distracted by attending to the rear-row situation, and improves the safety of the driving process.
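Steps S501-S506 can be summarized in a short sketch. Everything here is hypothetical scaffolding: the library contents, tag names, environment parameters, and the action-list interface stand in for the real radio, playback, projection, and lighting/air-conditioning interfaces, which the application does not specify.

```python
# Pre-configured audio/video information base (step S504); entries are illustrative.
AV_LIBRARY = {
    "crying": {"audio": "lullaby.mp3", "projection": "starry_sky.mp4",
               "tags": ["soothing"]},
}

# Content tag -> in-vehicle environment adjustment parameters (step S506).
ENV_PARAMS_BY_TAG = {
    "soothing": {"ambient_light": "warm_dim", "hvac_fan": "low"},
}

def handle_rear_row_voice(voice_text, recognize_emotion):
    """Run steps S502-S506 for one utterance and return the device
    actions the vehicle terminal would issue."""
    actions = []
    state = recognize_emotion(voice_text)            # S502: emotion model
    entry = AV_LIBRARY.get(state)
    if entry is None:                                # interaction condition not met
        return actions
    actions.append(("display", "show_rear_camera"))  # S503: video to front display
    actions.append(("audio", entry["audio"]))        # S505: audio terminal
    actions.append(("projector", entry["projection"]))
    for tag in entry["tags"]:                        # S506: adjust environment
        actions.append(("environment", ENV_PARAMS_BY_TAG.get(tag, {})))
    return actions
```

A caller would plug in the real emotion recognition model for `recognize_emotion` and dispatch each returned action to the corresponding on-board device.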
It should be understood that although the steps in the flowcharts of fig. 2-5 are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise herein, the steps are not strictly limited in order and may be performed in other orders. Moreover, at least some of the steps in fig. 2-5 may include multiple sub-steps or stages, which are not necessarily completed at the same time but may be performed at different times, and are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
The interaction method of the in-vehicle user in the embodiment of the application can be performed in an in-vehicle child care mode. When the in-vehicle child care mode is turned on, the vehicle terminal 104 may acquire video information of a rear-row child.
The child care mode may be initiated in the following manner:
in the active mode, a front-row user may start the child care mode through a shortcut button on the steering wheel, and the vehicle terminal controls the rear-row camera in the vehicle to be turned on; alternatively, the front-row user may instruct the vehicle-mounted robot by voice command to start the child care mode, and the vehicle terminal controls the rear-row camera in the vehicle to be turned on.
In the passive mode, if a crying state is detected through the infant crying detection technology, the child care mode is started automatically; likewise, if a preset emotional state is detected through the child emotion detection technology, the child care mode is started automatically.
It should be understood that the vehicle terminal may be provided with separate switches for the infant crying detection mode, the child emotion detection mode, and the child care mode. When one of these modes is turned off in the system, that mode cannot be activated in either the active or the passive manner.
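The gating between the active/passive triggers and the per-mode switches could look like the following sketch; the trigger names and settings keys are invented for illustration and are not defined by this application.

```python
def try_start_care_mode(settings, trigger):
    """Return True if the child care mode may be started for the given
    trigger. 'steering_button' and 'voice_command' are active triggers;
    'crying_detected' and 'emotion_detected' are passive triggers that
    additionally require their detection mode to be switched on."""
    required = {
        "steering_button": ["child_care_mode"],
        "voice_command": ["child_care_mode"],
        "crying_detected": ["infant_crying_detection", "child_care_mode"],
        "emotion_detected": ["child_emotion_detection", "child_care_mode"],
    }
    needed = required.get(trigger)
    if needed is None:
        return False
    # A mode switched off in the system cannot be activated actively or passively.
    return all(settings.get(switch, False) for switch in needed)
```

On success, the vehicle terminal would then turn on the rear-row camera as described above.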
In one embodiment, as shown in fig. 6, there is provided an in-vehicle user interaction device, the device 600 comprising:
the voice information acquisition module 601 is used for acquiring voice information of rear row users in the vehicle;
an emotion state recognition module 602, configured to perform emotion recognition on the voice information, and obtain an emotional state of the rear-row user based on the emotion recognition result;
the target information acquisition module 603 is configured to determine target audio/video information corresponding to the emotional state from a pre-configured audio/video information library;
and the target information sending module 604 is configured to send the target audio/video information to the vehicle-mounted playing device, so that the vehicle-mounted device plays the audio/video information.
In one embodiment, the target audio and video information comprises target audio information and projection information corresponding to the target audio information, and the vehicle-mounted playing device comprises a vehicle-mounted audio terminal and a projection terminal; the target information sending module 604 includes: and the information sending unit is used for sending the target audio information to the vehicle-mounted audio terminal and sending the projection information to the projection terminal.
In an embodiment, the apparatus 600 further includes a module configured to: acquire a content tag corresponding to the target audio information; determine an in-vehicle environment adjustment parameter corresponding to the target audio information according to the content tag; and adjust the corresponding vehicle-mounted equipment according to the in-vehicle environment adjustment parameter.
In one embodiment, the target information obtaining module 603 further includes: the target information acquisition unit is used for playing preset interactive information through the vehicle-mounted robot if the emotional state meets a preset interactive condition; and acquiring a control instruction of a front row user in the vehicle for preset interactive information, and determining target audio and video information corresponding to the emotional state according to the control instruction.
In an embodiment, the apparatus 600 further includes: the video information acquisition module is used for acquiring video information of the back-row users through the vehicle-mounted camera equipment if the emotional state meets the preset interaction condition; and sending the video information to a front row display module.
In one embodiment, emotional state recognition module 602, includes: the model identification unit is used for acquiring text contents in the voice information; and inputting the text content into a preset emotion recognition model, and obtaining the emotion state of the back-row user according to the output result of the emotion recognition model.
In one embodiment, emotional state recognition module 602, includes: and the crying detection unit is used for detecting suspected child crying information of the back row users and determining the emotional states of the back row users according to the detection result of the suspected child crying information.
For the specific definition of the interaction device of the in-vehicle user, reference may be made to the above definition of the interaction method of the in-vehicle user, which is not repeated here. Each module in the above interaction device of the in-vehicle user may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in hardware form in, or be independent of, a processor in the computer device, or may be stored in software form in a memory of the computer device, so that the processor can invoke and execute the operations corresponding to the modules.
The interaction method for the in-vehicle user can be applied to computer equipment, the computer equipment can be a vehicle terminal, and the internal structure diagram of the computer equipment can be shown in fig. 7. The computer device includes a processor, a memory, and a network interface connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing model data and audiovisual information data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a method of interaction for a user in a vehicle.
Those skilled in the art will appreciate that the structure shown in fig. 7 is merely a block diagram of part of the structure related to the solution of the present application and does not limit the computer device to which the solution of the present application is applied; a particular computer device may include more or fewer components than shown in the figure, combine certain components, or have a different arrangement of components.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the above-described method embodiments when executing the computer program.
In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored which, when being executed by a processor, carries out the steps of the above-mentioned method embodiments.
It will be understood by those skilled in the art that all or part of the processes of the methods in the above embodiments may be implemented by a computer program instructing relevant hardware; the computer program may be stored in a non-volatile computer-readable storage medium, and when executed, may include the processes of the above method embodiments. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory may include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM may take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above embodiments express only several implementations of the present application, and their description is specific and detailed, but should not therefore be construed as limiting the scope of the invention patent. It should be noted that, for those of ordinary skill in the art, several variations and improvements can be made without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (10)

1. A method of interaction by a user in a vehicle, the method comprising:
acquiring voice information of rear row users in the vehicle;
performing emotion recognition on the voice information, and obtaining the emotion state of the back-row user based on the emotion recognition result;
determining target audio-video information corresponding to the emotion state from a pre-configured audio-video information base;
and sending the target audio and video information to vehicle-mounted playing equipment so that the vehicle-mounted playing equipment plays the audio and video information.
2. The method according to claim 1, wherein the target audio and video information comprises target audio information and projection information corresponding to the target audio information, and the vehicle-mounted playing device comprises a vehicle-mounted audio terminal and a projection terminal; and the sending the target audio and video information to a vehicle-mounted playing device so that the vehicle-mounted playing device plays the target audio and video information comprises:
and sending the target audio information to a vehicle-mounted audio terminal, and sending the projection information to a projection terminal.
3. The method of claim 2, wherein after the transmitting the target audio information to a car audio terminal and the transmitting the projection information to a projection terminal, further comprising:
acquiring a content tag corresponding to the target audio information;
determining an in-vehicle environment adjustment parameter corresponding to the target audio information according to the content tag;
and adjusting the corresponding vehicle-mounted equipment according to the in-vehicle environment adjustment parameters.
4. The method according to claim 1, wherein the determining the target audio-visual information corresponding to the emotional state from a pre-configured audio-visual information library comprises:
if the emotional state meets a preset interaction condition, playing preset interaction information through the vehicle-mounted robot;
and acquiring a control instruction of a front row user in the vehicle for the preset interactive information, and determining the target audio and video information corresponding to the emotional state according to the control instruction.
5. The method according to claim 1, wherein after performing emotion recognition on the speech information and obtaining an emotional state of the rear row user based on a result of the emotion recognition, the method further comprises:
if the emotional state meets a preset interaction condition, acquiring video information of the back-row users through vehicle-mounted camera equipment;
and sending the video information to a front row display module.
6. The method according to any one of claims 1 to 5, wherein performing emotion recognition on the voice information, and obtaining an emotional state of the rear row user based on a result of the emotion recognition comprises:
acquiring text content in the voice information;
and inputting the text content into a preset emotion recognition model, and obtaining the emotion state of the back-row user according to the output result of the emotion recognition model.
7. The method according to any one of claims 1 to 5, wherein the performing emotion recognition on the voice information and obtaining the emotional state of the rear row user based on the result of the emotion recognition comprises:
and detecting suspected child crying information of the rear row users, and determining the emotional states of the rear row users according to the detection result of the suspected child crying information.
8. An in-vehicle user interaction device, the device comprising:
the voice information acquisition module is used for acquiring the voice information of the rear row users in the vehicle;
the emotion state recognition module is used for carrying out emotion recognition on the voice information and obtaining the emotion state of the back-row user based on the emotion recognition result;
the target information acquisition module is used for determining target audio and video information corresponding to the emotion state from a pre-configured audio and video information database;
and the target information sending module is used for sending the target audio and video information to vehicle-mounted playing equipment so as to enable the vehicle-mounted playing equipment to play the target audio and video information.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
CN202110308796.7A 2021-03-23 2021-03-23 Interaction method and device for in-vehicle user, computer equipment and storage medium Pending CN113139070A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110308796.7A CN113139070A (en) 2021-03-23 2021-03-23 Interaction method and device for in-vehicle user, computer equipment and storage medium


Publications (1)

Publication Number Publication Date
CN113139070A true CN113139070A (en) 2021-07-20

Family

ID=76811589

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110308796.7A Pending CN113139070A (en) 2021-03-23 2021-03-23 Interaction method and device for in-vehicle user, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113139070A (en)


Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104836956A (en) * 2015-05-09 2015-08-12 陈包容 Processing method and device for cellphone video
CN110704017A (en) * 2019-10-15 2020-01-17 北京小米移动软件有限公司 Vehicle-mounted sound control method and device, terminal equipment and storage medium


Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114954332A (en) * 2021-08-16 2022-08-30 长城汽车股份有限公司 Vehicle control method and device, storage medium and vehicle
CN113923408A (en) * 2021-09-29 2022-01-11 岚图汽车科技有限公司 Back row detection and interaction system and method
CN113921045A (en) * 2021-10-22 2022-01-11 北京雷石天地电子技术有限公司 Vehicle-mounted music playing method and device, computer equipment and storage medium
CN113921045B (en) * 2021-10-22 2023-04-21 北京雷石天地电子技术有限公司 Vehicle-mounted music playing method and device, computer equipment and storage medium
CN114268818A (en) * 2022-01-24 2022-04-01 珠海格力电器股份有限公司 Control method and device for story playing and voice assistant
CN114268818B (en) * 2022-01-24 2023-02-17 珠海格力电器股份有限公司 Control method and device for story playing, storage medium and computing equipment


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication
Application publication date: 20210720