CN115953996A - Method and device for generating natural language based on in-vehicle user information - Google Patents

Publication number
CN115953996A
CN115953996A CN202211543220.XA CN202211543220A CN115953996A CN 115953996 A CN115953996 A CN 115953996A CN 202211543220 A CN202211543220 A CN 202211543220A CN 115953996 A CN115953996 A CN 115953996A
Authority
CN
China
Prior art keywords
information
preset
vehicle
personnel
played
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211543220.XA
Other languages
Chinese (zh)
Inventor
李龙飞
刘杰
张炜玮
林孟超
陈彩可
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
FAW Group Corp
Original Assignee
FAW Group Corp

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a method and a device for generating natural language based on in-vehicle user information. The method comprises the following steps: acquiring voice information of people in the vehicle; acquiring basic information of people in the vehicle; acquiring slot position information to be played according to the voice information of the people in the vehicle; obtaining template information to be played according to the basic information of the people in the vehicle and the voice information of the people in the vehicle; and generating natural language information to be played according to the template information to be played and the slot position information to be played. Because the template information to be played is selected according to the basic information of the people in the vehicle, different natural language information to be played is generated for different in-vehicle personnel basic information, which makes voice interaction more user-friendly.

Description

Method and device for generating natural language based on in-vehicle user information
Technical Field
The application relates to the technical field of vehicle human-computer interaction, in particular to a method for generating a natural language based on in-vehicle user information and a device for generating the natural language based on in-vehicle user information.
Background
At present, natural language generation generally takes only the initiator of the voice interaction into account. In a vehicle, however, several users often share one cabin at the same time. The prior art does not consider the specific circumstances of multiple users during voice interaction, and therefore cannot generate natural language tailored to the different users.
Accordingly, a solution is desired to solve or at least mitigate the above-mentioned deficiencies of the prior art.
Disclosure of Invention
The present invention is directed to a method for generating a natural language based on in-vehicle user information to solve at least one of the above problems.
The invention provides the following scheme:
according to an aspect of the present invention, there is provided a method of generating a natural language based on in-vehicle user information, the method of generating a natural language based on in-vehicle user information including:
acquiring voice information of people in the vehicle;
acquiring basic information of people in the vehicle;
acquiring the information of the slot position to be played according to the voice information of personnel in the vehicle;
obtaining template information to be played according to the basic information of the people in the vehicle and the voice information of the people in the vehicle;
and generating natural language information to be played according to the template information to be played and the slot position information to be played.
Optionally, the obtaining of the slot position information to be played according to the in-vehicle person voice information includes:
analyzing the voice information of the personnel in the vehicle to obtain semantic information;
judging whether to generate natural language information to be played according to the semantic information; if so,
acquiring the slot position information to be played according to the semantic information.
Optionally, the acquiring of the basic information of the people in the vehicle includes:
acquiring pressure information transmitted by pressure sensors on various seats in the vehicle;
and acquiring the number of people in the vehicle according to the pressure information.
Optionally, the acquiring of the basic information of the people in the vehicle includes:
acquiring in-vehicle image information shot by an in-vehicle camera device;
and identifying the image information so as to acquire the basic information of the people in the vehicle.
Optionally, the basic information of the person in the vehicle includes information of the number of persons, information of a face image of the person, and information of an age of the person.
Optionally, the obtaining of the template information to be played according to the basic information of the people in the vehicle and the voice information of the people in the vehicle includes:
acquiring a preset template database, wherein the preset template database comprises at least two preset templates and preset personnel conditions, and one preset template corresponds to one preset personnel condition;
judging whether the acquired personnel number information, personnel face image information and personnel age information meet a preset personnel condition in the preset template database; if so,
acquiring the preset template corresponding to the met preset personnel condition as the template information to be played.
Optionally, the method for generating a natural language based on the in-vehicle user information further includes:
acquiring a preset face database, wherein the preset face database comprises at least one preset face information;
the following operations are performed for each person face image information:
respectively carrying out similarity calculation on the acquired personnel face image information and each piece of preset face information so as to acquire a similarity value;
judging whether any similarity value is larger than a preset threshold value; if so,
judging whether the number of pieces of face feature information whose similarity value is larger than the preset threshold value exceeds one; if not,
acquiring a preset special voice library, wherein the preset special voice library comprises at least one preset special voice type and preset face information, and one preset special voice type corresponds to one piece of preset face information;
acquiring a preset special voice type corresponding to preset face information with the similarity value larger than a preset threshold value;
and broadcasting the natural language information to be played through the preset special voice type.
Optionally, the method for generating a natural language based on in-vehicle user information further includes:
judging whether the number of pieces of face feature information whose similarity value is larger than the preset threshold value exceeds one; if so,
acquiring a personnel relation map, wherein the personnel relation map comprises at least two pieces of personnel name information and preset face information, a priority relation exists between one piece of personnel name information and at least one other piece of personnel name information, and one piece of personnel name information corresponds to one piece of preset face information;
acquiring preset face information respectively corresponding to the face feature information of which the similarity value is greater than a preset threshold value in each face feature information;
respectively acquiring personnel name information corresponding to each preset face information;
judging whether a priority relation exists among the acquired pieces of personnel name information; if so,
acquiring the preset special voice type corresponding to the preset face information that corresponds to the name information of the person with the higher priority;
and broadcasting the natural language information to be played through the preset special voice type.
Optionally, before the broadcasting the natural language information to be played through the preset special voice type, the method for generating a natural language based on in-vehicle user information further includes:
acquiring a sleep recognition classifier;
acquiring image information of each face of the person;
extracting characteristic information in the face image information of each person;
inputting each feature information into the sleep recognition classifier respectively so as to obtain a classification label, wherein the classification label comprises a sleep label;
when one classification label is a sleep label, acquiring volume information of broadcast voice of a current system;
judging whether the volume information exceeds a preset volume threshold value; if so,
reducing the volume to below the preset volume threshold value and broadcasting the natural language information to be played.
The application also provides a device for generating natural language based on the in-vehicle user information, which comprises:
the in-vehicle personnel voice information acquisition module is used for acquiring in-vehicle personnel voice information;
the in-vehicle personnel basic information acquisition module is used for acquiring basic information of in-vehicle personnel;
the to-be-played slot position information acquisition module is used for acquiring slot position information to be played according to voice information of people in the vehicle;
the template information to be played acquiring module is used for acquiring template information to be played according to the basic information of the people in the vehicle and the voice information of the people in the vehicle;
and the natural language information to be played generating module is used for generating natural language information to be played according to the template information to be played and the slot position information to be played.
Because the method for generating natural language based on in-vehicle user information selects the template information to be played according to the basic information of the people in the vehicle, different natural language information to be played is generated for different in-vehicle personnel basic information, which makes voice interaction more user-friendly.
Drawings
Fig. 1 is a flowchart of a method for generating a natural language based on in-vehicle user information according to one or more embodiments of the present invention.
Fig. 2 is a block diagram of an electronic device according to a method for generating a natural language based on in-vehicle user information according to one or more embodiments of the present invention.
Fig. 3 is a schematic diagram of template information to be played in the method for generating a natural language based on in-vehicle user information shown in fig. 1.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
Fig. 1 is a flowchart of a method for generating a natural language based on in-vehicle user information according to one or more embodiments of the present invention.
The method for generating the natural language based on the in-vehicle user information as shown in fig. 1 includes:
Step 1: acquiring voice information of people in the vehicle;
Step 2: acquiring basic information of people in the vehicle;
Step 3: acquiring the information of the slot position to be played according to the voice information of personnel in the vehicle;
Step 4: obtaining template information to be played according to the basic information of the people in the vehicle and the voice information of the people in the vehicle;
Step 5: generating natural language information to be played according to the template information to be played and the slot position information to be played.
Because the method for generating natural language based on in-vehicle user information selects the template information to be played according to the basic information of the people in the vehicle, different natural language information to be played is generated for different in-vehicle personnel basic information, which makes voice interaction more user-friendly.
In this embodiment, obtaining the slot information to be played according to the vehicle interior personnel voice information includes:
analyzing voice information of people in the vehicle to acquire semantic information;
judging whether to generate natural language information to be played according to the semantic information; if so,
acquiring slot position information to be played according to the semantic information.
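The sub-steps above can be sketched as follows. This is a minimal illustration only, not the applicant's implementation: the `SemanticInfo` structure, the intent names in `INTENTS_NEEDING_REPLY`, the function `get_slots_to_play` and the song title are all hypothetical, and speech recognition and semantic parsing are assumed to have already produced the semantic information.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Hypothetical set of intents for which a spoken reply should be generated.
INTENTS_NEEDING_REPLY = {"PlayMusic", "Navigate"}


@dataclass
class SemanticInfo:
    """Assumed result of parsing an occupant's utterance."""
    intent: str
    slots: Dict[str, str] = field(default_factory=dict)


def get_slots_to_play(semantic: SemanticInfo) -> Optional[Dict[str, str]]:
    """Return the slot information to be played, or None if no reply is needed."""
    if semantic.intent not in INTENTS_NEEDING_REPLY:
        return None
    # Copy the slots so later processing cannot mutate the parsed result.
    return dict(semantic.slots)


# Placeholder utterance: a request to play a song.
parsed = SemanticInfo(intent="PlayMusic", slots={"SongName": "Example Song"})
slots = get_slots_to_play(parsed)
```

Returning `None` when no reply is needed keeps the "judge first, then extract" order of the sub-steps explicit for the caller.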
In one embodiment, the obtaining of the basic information of the people in the vehicle comprises:
acquiring pressure information transmitted by pressure sensors on various seats in the vehicle;
and acquiring the number of people in the vehicle according to the pressure information.
The pressure sensors indicate which seats are occupied. In this embodiment, in-vehicle camera devices may also be provided; there may be several of them, each arranged to capture images of the person in one seat. In this way, once a seat is detected as occupied, the corresponding camera device is started to acquire image information of the person in that seat.
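A minimal sketch of this occupancy check is given below. The seat names, the pressure readings, the threshold `OCCUPIED_THRESHOLD_N` and the camera naming scheme are assumptions made for illustration; the application does not specify them.

```python
from typing import Dict, List

# Hypothetical threshold (in newtons) above which a seat counts as occupied.
OCCUPIED_THRESHOLD_N = 100.0


def occupied_seats(pressures: Dict[str, float],
                   threshold: float = OCCUPIED_THRESHOLD_N) -> List[str]:
    """Return the seats whose pressure reading indicates an occupant."""
    return [seat for seat, value in pressures.items() if value >= threshold]


# Illustrative readings from the pressure sensors on each seat.
readings = {
    "driver": 650.0,
    "front_passenger": 540.0,
    "rear_left": 0.0,
    "rear_right": 260.0,
}

seats = occupied_seats(readings)
num_people = len(seats)
# Only the cameras facing occupied seats need to be started.
cameras_to_start = [f"camera_{seat}" for seat in seats]
```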
In this embodiment, acquiring the basic information of the person in the vehicle includes:
acquiring in-vehicle image information shot by an in-vehicle camera device;
and identifying the image information so as to obtain the basic information of people in the vehicle.
In this embodiment, seat occupancy is not detected by pressure sensors; instead, every camera device is started directly to capture image information in the vehicle, and the basic information of the people in the vehicle is obtained by image recognition.
In this embodiment, the basic information of the people in the vehicle includes information of the number of people, information of the image of the face of the person, and information of the age of the person.
In this embodiment, the personnel number information may be obtained by recognizing each image captured by the camera devices. For example, a face image classifier may determine whether each image contains a face; if so, the number of people is given by the number of faces detected.
After the face image information of each person is obtained, features can be extracted from it and input into a pre-trained age classifier to obtain the age information of each person.
In this embodiment, obtaining the template information to be played according to the in-vehicle person basic information and the in-vehicle person voice information includes:
acquiring a preset template database, wherein the preset template database comprises at least two preset templates and preset personnel conditions, and one preset template corresponds to one preset personnel condition;
judging whether the acquired personnel number information, personnel face image information and personnel age information meet a preset personnel condition in the preset template database; if so,
acquiring the preset template corresponding to the met preset personnel condition as the template information to be played.
In this embodiment, the method for generating a natural language based on in-vehicle user information further includes:
acquiring a preset face database, wherein the preset face database comprises at least one preset face information;
the following operations are performed for each person face image information:
respectively carrying out similarity calculation on the acquired personnel face image information and each piece of preset face information so as to acquire a similarity value;
judging whether any similarity value is larger than a preset threshold value; if so,
judging whether the number of pieces of face feature information whose similarity value is larger than the preset threshold value exceeds one; if not,
acquiring a preset special voice library, wherein the preset special voice library comprises at least one preset special voice type and preset face information, and one preset special voice type corresponds to one piece of preset face information;
acquiring a preset special voice type corresponding to preset face information with the similarity value larger than a preset threshold value;
and broadcasting the natural language information to be played through the preset special voice type.
In this way, the basic information of each person in the vehicle is taken into account on the one hand, and the needs of certain special persons on the other. For example, a child who frequently rides in the vehicle may like a particular voice type, such as the voice of Doraemon; in that case, the natural language information to be played is broadcast in the preset special voice type (for example, the Doraemon voice).
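The matching flow for the single-match case can be sketched as below. The application does not state how similarity is computed, so cosine similarity over feature vectors is used here purely as an assumption; the threshold value, the face identifiers, the feature vectors and the voice-type name are likewise hypothetical.

```python
import math
from typing import Dict, List, Optional, Sequence

SIMILARITY_THRESHOLD = 0.8  # hypothetical preset threshold


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two feature vectors (assumed similarity measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def matched_preset_faces(occupant_features: List[Sequence[float]],
                         preset_faces: Dict[str, Sequence[float]],
                         threshold: float = SIMILARITY_THRESHOLD) -> List[str]:
    """Return the IDs of preset faces matched by at least one occupant."""
    return [
        face_id
        for face_id, preset in preset_faces.items()
        if any(cosine_similarity(f, preset) > threshold for f in occupant_features)
    ]


def select_voice_type(matched: List[str],
                      voice_library: Dict[str, str]) -> Optional[str]:
    """Return the special voice type when exactly one registered face matched."""
    if len(matched) == 1:
        return voice_library.get(matched[0])
    return None  # no match, or several matches that need a priority decision


preset_faces = {"child_01": [0.9, 0.1, 0.4]}      # preset face database
voice_library = {"child_01": "cartoon_voice"}      # preset special voice library
occupants = [[0.1, 0.9, 0.2], [0.88, 0.12, 0.41]]  # features of the people detected

matched = matched_preset_faces(occupants, preset_faces)
voice = select_voice_type(matched, voice_library)
```

When `select_voice_type` returns `None`, the caller would fall back to the default voice or, if several registered faces matched, to a priority decision among them.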
In this embodiment, the method for generating a natural language based on in-vehicle user information further includes:
judging whether the number of pieces of face feature information whose similarity value is larger than the preset threshold value exceeds one; if so,
acquiring a personnel relation map, wherein the personnel relation map comprises at least two pieces of personnel name information and preset face information, a priority relation exists between one piece of personnel name information and at least one other piece of personnel name information, and one piece of personnel name information corresponds to one piece of preset face information;
acquiring preset face information corresponding to face feature information with similarity values larger than a preset threshold value in the face feature information;
respectively acquiring the name information of the personnel corresponding to each piece of preset face information;
judging whether a priority relation exists among the acquired pieces of personnel name information; if so,
acquiring the preset special voice type corresponding to the preset face information that corresponds to the name information of the person with the higher priority;
and broadcasting the natural language information to be played through the preset special voice type.
In some cases there may be several special persons, and the preset special voice type to be used is then determined according to the relationship between them. For example, when a family of three is in the vehicle, the child usually takes precedence, so the child is given the higher priority. It can be understood that the priority relations can be configured as required.
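One possible way to represent the personnel relation map and resolve the priority is sketched below. The data layout (a mapping from each name to the set of names it outranks), the names, the face identifiers and the voice types are assumptions for illustration and are not taken from the application.

```python
from typing import Dict, List, Optional, Set


def highest_priority_person(names: List[str],
                            outranks: Dict[str, Set[str]]) -> Optional[str]:
    """Return the person who outranks every other matched person, if any."""
    for candidate in names:
        others = [n for n in names if n != candidate]
        if all(other in outranks.get(candidate, set()) for other in others):
            return candidate
    return None  # no priority relation covers all matched persons


# Hypothetical relation map: the child has priority over both parents.
outranks = {"child": {"father", "mother"}}
name_to_face = {"child": "face_child", "father": "face_father", "mother": "face_mother"}
voice_library = {"face_child": "cartoon_voice", "face_father": "news_anchor_voice"}

matched_names = ["father", "child"]  # names of the registered persons detected
winner = highest_priority_person(matched_names, outranks)
voice = voice_library.get(name_to_face[winner]) if winner else None
```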
In this embodiment, before the broadcasting the to-be-played natural language information by the preset special voice type, the method for generating a natural language based on in-vehicle user information further includes:
acquiring a sleep recognition classifier;
acquiring image information of each face of the person;
extracting characteristic information in the face image information of each person;
inputting each feature information into the sleep recognition classifier respectively so as to obtain a classification label, wherein the classification label comprises a sleep label;
when one classification label is a sleep label, acquiring volume information of broadcast voice of a current system;
judging whether the volume information exceeds a preset volume threshold value; if so,
reducing the volume to below the preset volume threshold value and broadcasting the natural language information to be played.
In some cases the broadcast could wake a sleeping child, so it should be played as quietly as possible.
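The volume check can be sketched as follows. The sleep recognition classifier is represented by a stand-in function, and the volume scale, `VOLUME_THRESHOLD` and `QUIET_VOLUME` are hypothetical values chosen for illustration.

```python
from typing import Callable, Iterable, Sequence

VOLUME_THRESHOLD = 30  # hypothetical preset volume threshold (0 to 100 scale)
QUIET_VOLUME = 20      # hypothetical reduced volume, must be below the threshold


def playback_volume(face_features: Iterable[Sequence[float]],
                    classify: Callable[[Sequence[float]], str],
                    current_volume: int,
                    threshold: int = VOLUME_THRESHOLD,
                    quiet_volume: int = QUIET_VOLUME) -> int:
    """Return the volume to broadcast at, lowered if any occupant is asleep."""
    anyone_asleep = any(classify(f) == "sleep" for f in face_features)
    if anyone_asleep and current_volume > threshold:
        return quiet_volume
    return current_volume


def stub_sleep_classifier(features: Sequence[float]) -> str:
    """Stand-in for a trained classifier: treats a low first feature as closed eyes."""
    return "sleep" if features[0] < 0.2 else "awake"


volume = playback_volume([[0.9], [0.1]], stub_sleep_classifier, current_volume=60)
```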
The present application is described in further detail below by way of examples, it being understood that the examples do not constitute any limitation to the present application.
In this example, music to be played is taken as a scene for example, and it can be understood that the present application may also be applied to other interactive scenes, such as navigation, and the like, which is not described herein again.
In the scene needing to play music, the method for generating the natural language based on the in-vehicle user information comprises the following steps:
Step 1: acquiring voice information of people in the vehicle; in this embodiment, the voice information of the people in the vehicle is: "Play 'Ode to the Motherland'".
Step 2: acquiring basic information of people in the vehicle; in this embodiment, there are 3 people in the vehicle, and the basic information obtained by image recognition is as follows: the driver's seat is occupied by a male adult (18 to 30 years old), the front passenger seat by a female adult (18 to 30 years old), and the rear seat by a male child (6 to 10 years old). It is understood that the age group can be obtained by an age classifier, and the details are not repeated here.
Step 3: acquiring slot position information to be played according to the voice information of the people in the vehicle; specifically, analyzing the voice information of the people in the vehicle so as to acquire semantic information;
judging whether to generate natural language information to be played according to the semantic information; if so,
acquiring the slot position information to be played according to the semantic information. In this embodiment, the semantic information is a request to play "Ode to the Motherland", and the slot position information to be played is "Ode to the Motherland".
Step 4: obtaining template information to be played according to the basic information of the people in the vehicle and the voice information of the people in the vehicle, which in this embodiment includes:
acquiring a preset template database, wherein the preset template database comprises at least two preset templates and preset personnel conditions, and one preset template corresponds to one preset personnel condition;
judging whether the acquired personnel number information, personnel face image information and personnel age information meet a preset personnel condition in the preset template database; if so,
acquiring the preset template corresponding to the met preset personnel condition as the template information to be played.
Referring to fig. 3, in this embodiment, assume that the preset personnel condition is that the user age characteristic is "child"; the template information to be played is then: "Children, let us listen to".
It is understood that the preset personnel conditions may be set in many ways. As shown in fig. 3, if there are several people in the vehicle, template 4 in fig. 3 may be used; other conditions may have other corresponding templates. When several preset personnel conditions are met at once, the template may be selected by priority: for example, the template for children may be given the highest priority and the template for multiple people the second highest, so that the matching template with the highest priority is the one obtained.
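The priority-based selection described above can be sketched as below. The template wording, the priority numbers and the `Occupant` and `PresetTemplate` structures are hypothetical; the actual templates are those of fig. 3, which this sketch does not reproduce.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Occupant:
    seat: str
    age_group: str  # for example "adult" or "child"


@dataclass
class PresetTemplate:
    text: str
    condition: Callable[[List[Occupant]], bool]  # preset personnel condition
    priority: int                                # lower number = higher priority


# Hypothetical preset template database.
TEMPLATES = [
    PresetTemplate("Children, let us listen to {SongName}.",
                   lambda occ: any(o.age_group == "child" for o in occ), 1),
    PresetTemplate("Everyone, let us listen to {SongName}.",
                   lambda occ: len(occ) > 1, 2),
    PresetTemplate("Now playing {SongName}.",
                   lambda occ: True, 3),
]


def select_template(occupants: List[Occupant],
                    templates: List[PresetTemplate] = TEMPLATES) -> Optional[PresetTemplate]:
    """Return the matching template with the highest priority, if any."""
    matching = [t for t in templates if t.condition(occupants)]
    return min(matching, key=lambda t: t.priority) if matching else None


family = [Occupant("driver", "adult"),
          Occupant("front_passenger", "adult"),
          Occupant("rear", "child")]
chosen = select_template(family)
```

Here both the "child" and the "multiple people" conditions hold for the family of three, and the child template is chosen because it carries the higher priority.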
And 5: generating slot position information to be played according to the template information to be played and the slot position information to be played, wherein the generated slot position information to be played is as follows: the children are enabled to listen and the slot position information to be played is singed and combined in the motherland, so that the natural language information to be played is generated: children, let us listen to a song to sing the country.
In the present embodiment, the to-be-played natural language information includes template information (TTSID (PlayMusic)) and slot information (SongName, singer) to be played. The content included in the natural language information to be played belongs to the prior art, and is not described herein again.
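Combining a template with slot information can be illustrated with simple string substitution. This is only a sketch under assumptions: the lookup table `TEMPLATES_BY_TTSID`, the template wording and the slot values are placeholders, and only the names TTSID, PlayMusic, SongName and Singer come from the text above.

```python
from typing import Dict

# Hypothetical lookup from a TTSID to the selected template text.
TEMPLATES_BY_TTSID: Dict[str, str] = {
    "PlayMusic": "Children, let us listen to {SongName}.",
}


def generate_natural_language(tts_id: str, slots: Dict[str, str]) -> str:
    """Fill the template identified by tts_id with the slot values."""
    template = TEMPLATES_BY_TTSID[tts_id]
    # str.format ignores slots the template does not use (here: Singer).
    return template.format(**slots)


reply = generate_natural_language(
    "PlayMusic",
    {"SongName": "Example Song", "Singer": "Example Singer"},
)
```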
After the natural language information to be played is generated, the voice type used for broadcasting must be decided. This is determined from the faces of the people in the vehicle; specifically, a preset face database is acquired, wherein the preset face database comprises at least one piece of preset face information;
the following operations are performed for each person face image information:
respectively carrying out similarity calculation on the acquired personnel face image information and each preset face information so as to acquire a similarity value;
judging whether any similarity value is larger than a preset threshold value; if so,
judging whether the number of pieces of face feature information whose similarity value is larger than the preset threshold value exceeds one; if not,
acquiring a preset special voice library, wherein the preset special voice library comprises at least one preset special voice type and preset face information, and one preset special voice type corresponds to one piece of preset face information;
acquiring a preset special voice type corresponding to preset face information with the similarity value larger than a preset threshold value;
and broadcasting the natural language information to be played through the preset special voice type.
For example, with the three people above, if the face image information of the child matches preset face information in the preset face database, the child has been registered in the preset face database and has a preset special voice type, for example the Doraemon voice; the natural language information to be played is then broadcast in that preset special voice type.
In this embodiment, the broadcast also takes into account whether other passengers are sleeping; for example, if the woman among the three people is asleep, the playback volume should be reduced.
The application also provides a device for generating natural language based on the in-vehicle user information, which comprises an in-vehicle personnel voice information acquisition module, an in-vehicle personnel basic information acquisition module, a slot position information acquisition module to be played, a template information acquisition module to be played and a natural language information generation module to be played, wherein,
the in-vehicle personnel voice information acquisition module is used for acquiring in-vehicle personnel voice information; the in-vehicle personnel basic information acquisition module is used for acquiring basic information of in-vehicle personnel; the to-be-played slot position information acquisition module is used for acquiring the to-be-played slot position information according to the voice information of personnel in the vehicle; the template information to be played acquisition module is used for acquiring template information to be played according to the basic information of the people in the vehicle and the voice information of the people in the vehicle; and the to-be-played natural language information generating module is used for generating to-be-played natural language information according to the to-be-played template information and the to-be-played slot position information.
Fig. 2 is a block diagram of an electronic device according to one or more embodiments of the present invention.
As shown in fig. 2, the present application also discloses an electronic device comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with one another through the communication bus; the memory stores a computer program that, when executed by the processor, causes the processor to perform the steps of the method of generating natural language based on in-vehicle user information.
The present application further provides a computer-readable storage medium storing a computer program executable by an electronic device, which, when run on the electronic device, causes the electronic device to perform the steps of the method of generating natural language based on in-vehicle user information.
The communication bus mentioned in the electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown, but this does not mean that there is only one bus or one type of bus.
The electronic device includes a hardware layer, an operating system layer running on top of the hardware layer, and an application layer running on top of the operating system. The hardware layer includes hardware such as a Central Processing Unit (CPU), a Memory Management Unit (MMU), and a memory. The operating system may be any one or more computer operating systems that control the electronic device through processes, such as a Linux operating system, a Unix operating system, an Android operating system, an iOS operating system, or a Windows operating system. In the embodiment of the present invention, the electronic device may be a handheld device such as a smart phone or a tablet computer, or an electronic device such as a desktop computer or a portable computer, which is not particularly limited in the embodiment of the present invention.
In the embodiment of the present invention, the control of the electronic device may be executed by the electronic device itself, or by a functional module in the electronic device that is capable of calling and executing a program. The electronic device may obtain the firmware corresponding to the storage medium. This firmware is provided by a vendor, and the firmware corresponding to different storage media may be the same or different, which is not limited herein. After the electronic device acquires the firmware corresponding to the storage medium, it may write the firmware into the storage medium, that is, burn the firmware into the storage medium. The process of burning the firmware into the storage medium can be implemented using the prior art and is not described in the embodiment of the present invention.
The electronic device may further acquire a reset command corresponding to the storage medium, where the reset command corresponding to the storage medium is provided by a vendor, and the reset commands corresponding to different storage media may be the same or different, and are not limited herein.
At this time, the storage medium of the electronic device is a storage medium in which the corresponding firmware is written, and the electronic device may respond to the reset command corresponding to the storage medium in which the corresponding firmware is written, so that the electronic device resets the storage medium in which the corresponding firmware is written according to the reset command corresponding to the storage medium. The process of resetting the storage medium according to the reset command can be implemented by the prior art, and is not described in detail in the embodiment of the present invention.
For convenience of description, the above devices are described as being functionally divided into various units and modules. Of course, when the present application is implemented, the functions of the units and modules may be implemented in one or more pieces of software and/or hardware.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
For simplicity of explanation, the method embodiments are described as a series of acts or combinations of acts, but those skilled in the art will appreciate that the embodiments are not limited by the order of the acts described, as some steps may occur in other orders or concurrently in accordance with the embodiments. Further, those skilled in the art will appreciate that the embodiments described in the specification are preferred embodiments and that the acts involved are not necessarily all required to implement the invention.
From the above description of the embodiments, it is clear to those skilled in the art that the present application can be implemented by software plus a necessary general-purpose hardware platform. Based on such understanding, the technical solutions of the present application may be embodied, in essence or in part, in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes several instructions for enabling a computer device (which may be a personal computer, a server, a network device, etc.) to execute the method described in the embodiments or in some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for generating natural language based on in-vehicle user information is characterized by comprising the following steps:
acquiring voice information of people in the vehicle;
acquiring basic information of people in the vehicle;
acquiring the information of the slot position to be played according to the voice information of personnel in the vehicle;
obtaining template information to be played according to the basic information of the people in the vehicle and the voice information of the people in the vehicle;
and generating natural language information to be played according to the template information to be played and the slot position information to be played.
2. The method for generating the natural language based on the in-vehicle user information according to claim 1, wherein the obtaining the slot information to be played according to the in-vehicle personnel voice information comprises:
analyzing the voice information of the people in the vehicle to acquire semantic information;
judging, according to the semantic information, whether to generate natural language information to be played; and if so,
acquiring the slot position information to be played according to the semantic information.
3. The method for generating the natural language based on the in-vehicle user information according to claim 2, wherein the acquiring of the basic information of the in-vehicle person comprises:
acquiring pressure information transmitted by pressure sensors on various seats in the vehicle;
and acquiring the number of people in the vehicle according to the pressure information.
4. The method for generating the natural language based on the in-vehicle user information according to claim 2, wherein the obtaining of the basic information of the in-vehicle person comprises:
acquiring in-vehicle image information shot by an in-vehicle camera device;
and identifying the image information so as to obtain the basic information of people in the vehicle.
5. The method of generating the natural language based on the in-vehicle user information according to claim 4, wherein the in-vehicle person basic information includes person number information, person face image information, and person age information.
6. The method for generating the natural language based on the in-vehicle user information according to claim 5, wherein the obtaining the template information to be played according to the in-vehicle personnel basic information and the in-vehicle personnel voice information comprises:
acquiring a preset template database, wherein the preset template database comprises at least two preset templates and preset personnel conditions, and one preset template corresponds to one preset personnel condition;
judging whether the acquired personnel number information, personnel face image information and personnel age information meet a preset personnel condition in the preset template database; and if so,
acquiring the preset template corresponding to the met preset personnel condition as the template information to be played.
7. The method of generating natural language based on in-vehicle user information according to claim 6, wherein the method of generating natural language based on in-vehicle user information further comprises:
acquiring a preset face database, wherein the preset face database comprises at least one preset face information;
the following operations are performed for each person face image information:
respectively carrying out similarity calculation on the acquired personnel face image information and each piece of preset face information so as to acquire a similarity value;
judging whether any similarity value is greater than a preset threshold; and if so,
judging whether the number of pieces of face feature information whose similarity value is greater than the preset threshold exceeds one; and if not,
acquiring a preset special voice library, wherein the preset special voice library comprises at least one preset special voice type and preset face information, and each preset special voice type corresponds to one piece of preset face information;
acquiring a preset special voice type corresponding to preset face information with the similarity value larger than a preset threshold value;
and broadcasting the natural language information to be played through the preset special voice type.
8. The method of generating natural language based on in-vehicle user information according to claim 7, wherein the method of generating natural language based on in-vehicle user information further comprises:
judging whether the number of pieces of face feature information whose similarity value is greater than the preset threshold exceeds one; and if so,
acquiring a personnel relation map, wherein the personnel relation map comprises at least two pieces of personnel name information and preset face information, a priority relation exists between one piece of personnel name information and at least one other piece of personnel name information, and each piece of personnel name information corresponds to one piece of preset face information;
acquiring preset face information respectively corresponding to the face feature information of which the similarity value is greater than a preset threshold value in each face feature information;
respectively acquiring the name information of the personnel corresponding to each piece of preset face information;
judging whether a priority relation exists among the acquired pieces of personnel name information; and if so,
acquiring the preset special voice type corresponding to the preset face information that corresponds to the personnel name information with the higher priority;
and broadcasting the natural language information to be played through the preset special voice type.
9. The method of claim 8, wherein before the broadcasting of the natural language information to be played through the preset special voice type, the method of generating the natural language based on the in-vehicle user information further comprises:
acquiring a sleep recognition classifier;
acquiring the face image information of each person;
extracting characteristic information in the face image information of each person;
inputting each feature information into the sleep recognition classifier respectively so as to obtain a classification label, wherein the classification label comprises a sleep label;
when one classification label is a sleep label, acquiring volume information of broadcast voice of a current system;
judging whether the volume information exceeds a preset volume threshold; and if so,
reducing the volume to below the preset volume threshold and broadcasting the natural language information to be played.
10. An apparatus for generating a natural language based on in-vehicle user information, the apparatus comprising:
the in-vehicle personnel voice information acquisition module is used for acquiring in-vehicle personnel voice information;
the in-vehicle personnel basic information acquisition module is used for acquiring basic information of in-vehicle personnel;
the to-be-played slot position information acquisition module is used for acquiring the to-be-played slot position information according to the in-vehicle personnel voice information;
the template information to be played acquiring module is used for acquiring template information to be played according to the basic information of the people in the vehicle and the voice information of the people in the vehicle;
and the to-be-played natural language information generation module is used for generating to-be-played natural language information according to the to-be-played template information and the to-be-played slot position information.
CN202211543220.XA 2022-12-02 2022-12-02 Method and device for generating natural language based on in-vehicle user information Pending CN115953996A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211543220.XA CN115953996A (en) 2022-12-02 2022-12-02 Method and device for generating natural language based on in-vehicle user information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211543220.XA CN115953996A (en) 2022-12-02 2022-12-02 Method and device for generating natural language based on in-vehicle user information

Publications (1)

Publication Number Publication Date
CN115953996A true CN115953996A (en) 2023-04-11

Family

ID=87289878

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211543220.XA Pending CN115953996A (en) 2022-12-02 2022-12-02 Method and device for generating natural language based on in-vehicle user information

Country Status (1)

Country Link
CN (1) CN115953996A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117672180A (en) * 2023-12-08 2024-03-08 广州凯迪云信息科技有限公司 Voice communication control method and system for digital robot

Similar Documents

Publication Publication Date Title
CN107766787B (en) Face attribute identification method, device, terminal and storage medium
CN110047487B (en) Wake-up method and device for vehicle-mounted voice equipment, vehicle and machine-readable medium
WO2021135685A1 (en) Identity authentication method and device
WO2018149209A1 (en) Voice recognition method, electronic device, and computer storage medium
CN106933465A (en) A kind of content display method and intelligence desktop terminal based on intelligence desktop
CN108874356A (en) voice broadcast method, device, mobile terminal and storage medium
CN108831477B (en) Voice recognition method, device, equipment and storage medium
EP3617946A1 (en) Context acquisition method and device based on voice interaction
CN110232340B (en) Method and device for establishing video classification model and video classification
US20190066695A1 (en) Voiceprint registration method, server and storage medium
CN109119079A (en) voice input processing method and device
CN110619897A (en) Conference summary generation method and vehicle-mounted recording system
CN111684459A (en) Identity authentication method, terminal equipment and storage medium
US9940326B2 (en) System and method for speech to speech translation using cores of a natural liquid architecture system
WO2021218432A1 (en) Method and apparatus for interpreting picture book, electronic device and smart robot
CN111723653B (en) Method and device for reading drawing book based on artificial intelligence
WO2023197648A1 (en) Screenshot processing method and apparatus, electronic device, and computer readable medium
CN115953996A (en) Method and device for generating natural language based on in-vehicle user information
CN111063006A (en) Image-based literary work generation method, device, equipment and storage medium
CN111506183A (en) Intelligent terminal and user interaction method
US20230109852A1 (en) Data processing method and apparatus, device, and medium
CN115798470A (en) Intelligent voice interaction method, device and equipment for vehicle and storage medium
CN107748642A (en) Adjust method, apparatus, storage medium and the electronic equipment of picture
CN110428668B (en) Data extraction method and device, computer system and readable storage medium
CN113535308A (en) Language adjusting method, language adjusting device, electronic equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination