CN116634236A - Virtual idol interaction method, electronic device, storage medium and program product - Google Patents
Virtual idol interaction method, electronic device, storage medium and program product
- Publication number
- CN116634236A (application CN202310580142.9A)
- Authority
- CN
- China
- Prior art keywords
- scene
- interaction
- virtual
- image
- user
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/478—Supplemental services, e.g. displaying phone caller identification, shopping application
- H04N21/4788—Supplemental services, e.g. displaying phone caller identification, shopping application communicating with other users, e.g. chatting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T13/00—Animation
- G06T13/20—3D [Three Dimensional] animation
- G06T13/40—3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/40—Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
- H04N21/47—End-user applications
- H04N21/488—Data services, e.g. news ticker
- H04N21/4882—Data services, e.g. news ticker for displaying messages, e.g. warnings, reminders
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- User Interface Of Digital Computer (AREA)
Abstract
The application provides a virtual idol interaction method, an electronic device, a storage medium and a program product. The method comprises the following steps: receiving an access request from a terminal device and establishing a communication connection between the terminal device and a target server; acquiring a target interaction scene corresponding to a user; obtaining a background image according to the target interaction scene; acquiring a foreground image containing one or more virtual idol images; generating an interactive scene image from the background image and the foreground image; and displaying the interactive scene image on the terminal device. According to the application, by establishing a communication connection between the terminal device and the target server, the server is used to provide the virtual idol interaction function, sparing the user the trouble of downloading and installing an application. In addition, the generation process of the interaction scene is optimized, the degree of personalization is improved, the communication connection is strengthened, the method is widely applicable to various scenes, and the efficiency and user experience of virtual idol interaction are improved.
Description
Technical Field
The application relates to the technical field of virtual humans and artificial intelligence, and in particular to a virtual idol interaction method, an electronic device, a computer-readable storage medium and a computer program product.
Background
A virtual idol is a virtual character with an independent image and personality, created with computer graphics and artificial intelligence technology. In recent years, with the rapid development of internet technology, computer graphics and artificial intelligence, virtual idols have been widely used in fields such as entertainment, games and advertising. A virtual idol can interact with the user, bringing a more immersive experience.
In the prior art, virtual idol interaction generally requires the user to install a corresponding application on a computer or mobile terminal and then interact with the virtual idol through that application. This approach requires the user to download and install the application themselves, which is inconvenient.
Based on this, the present application provides a virtual idol interaction method, an electronic device, a computer readable storage medium and a computer program product to improve the prior art.
Disclosure of Invention
The application aims to provide a virtual idol interaction method, an electronic device, a computer-readable storage medium and a computer program product. By establishing a communication connection between a terminal device and a target server, the server is used to provide the virtual idol interaction function, sparing the user the trouble of downloading and installing an application.
The application adopts the following technical scheme:
in a first aspect, the present application provides a virtual idol interaction method, the method comprising:
receiving an access request from a terminal device, and establishing communication connection between the terminal device and a target server, wherein the target server is used for providing a virtual idol interaction function;
acquiring a target interaction scene corresponding to a user;
obtaining a background image according to the target interaction scene;
acquiring a foreground image containing one or more virtual idol images;
generating an interactive scene image according to the background image and the foreground image;
and displaying the interaction scene image by using the terminal equipment so that the user interacts with one or more virtual idols in the target interaction scene.
The beneficial effects of this technical solution are as follows: by establishing a communication connection between the terminal device and the target server, the server is used to provide the virtual idol interaction function, sparing the user the trouble of downloading and installing an application. In addition, the generation process of the interaction scene is optimized, the degree of personalization is improved, the communication connection is strengthened, the method is widely applicable to various scenes, and the efficiency and user experience of virtual idol interaction are improved.
Specifically, by establishing communication connection between the terminal equipment and the target server, the virtual idol interaction method provided by the application can realize real-time transmission and processing of data in the interaction process, thereby being beneficial to ensuring the accuracy and the real-time performance of the data in the interaction process and further improving the user experience; according to the method, the background image is acquired according to the target interaction scene corresponding to the user, the foreground image containing one or more virtual idol images is acquired, and the interaction scene images are generated by combining the images; the virtual idol interaction method provided by the application allows the user to select or customize the target interaction scene, so that the personalized requirement of the user is met, the user can interact with one or more virtual idols in a specific scene according to the preference and the requirement of the user, and the immersive experience is enhanced; the virtual idol interaction method provided by the application is suitable for a plurality of fields such as entertainment, games, advertisements and the like, has higher universality, and can provide more vivid and rich interaction experience for users in various application scenes.
In some possible implementations, before acquiring the target interaction scene corresponding to the user, the method further includes:
and playing a preset interactive video by using the terminal equipment, wherein in the preset interactive video, one or more virtual idol images prompt the user to determine the target interactive scene.
The beneficial effects of this technical solution are as follows: playing the preset interactive video on the terminal device, so that one or more virtual idol images prompt the user to determine the target interaction scene, can enhance the user's engagement, and the user can interact with the virtual idol images while watching the preset interactive video, so that the user can more intuitively understand and select a favorite interaction scene; in the preset interactive video, the virtual idol can intuitively display different target interaction scenes to help the user choose quickly, which makes selecting an interaction scene more convenient and improves the overall user experience; in the preset interactive video, the virtual idol can prompt the user to determine the target interaction scene in different ways, such as through voice, text or actions, and this diversified prompting not only makes the interaction process more vivid and interesting, but also helps meet the needs and preferences of different users; through the prompts and guidance of the virtual idol in the preset interactive video, the user can understand the characteristics and style of the virtual idol more deeply, which helps improve the user's sense of identification with and closeness to the virtual idol, thereby strengthening the emotional connection between the user and the virtual idol.
In some possible implementations, the obtaining the target interaction scenario corresponding to the user includes:
displaying a plurality of interaction scenes by using the terminal equipment, and taking the selected interaction scene as the target interaction scene when a selection operation aiming at one interaction scene is received; or,
and receiving first input information of a user by using the terminal equipment, and acquiring a target interaction scene corresponding to the user according to the first input information, wherein the first input information is text information, voice information or image information.
The beneficial effects of this technical solution are as follows: the target interaction scene can be acquired in two ways, one being to display a plurality of interaction scenes on the terminal device for the user to choose from, the other being to receive the user's first input information (text, voice or image information) and determine the target interaction scene from it; such diversified selection channels make it more convenient for the user to pick a suitable interaction scene and improve the user experience; the user is allowed to select or customize the target interaction scene according to their own preferences and needs, and by receiving the user's first input information (text, voice or image information), the user's needs and interests can be better understood, so that an interaction scene matching the user's personalized needs is provided; the user can determine the target interaction scene through a selection operation or by entering information, which lets the user participate more actively in choosing the interaction scene, improves the interactivity between the user and the virtual idol, and enhances the user's immersive experience; providing multiple ways of acquiring the interaction scene accommodates different users' operating habits and needs: users who prefer visual selection can choose among the interaction scenes displayed on the terminal device, while users who prefer customization can determine the target interaction scene through input information, and this flexibility helps improve the overall user experience.
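For illustration only, handling the three forms of first input information could look like the following Python sketch; the helpers speech_to_text and image_to_labels are hypothetical stand-ins for real recognition services, not components defined by the application:

```python
# Hypothetical sketch: normalize the user's first input information
# (text, voice or image) into a textual scene query.

def speech_to_text(audio):
    return "a beach at sunset"         # placeholder recognition result

def image_to_labels(image):
    return ["beach", "sunset"]         # placeholder recognition result

def normalize_first_input(kind, payload):
    if kind == "text":
        return payload
    if kind == "voice":
        return speech_to_text(payload)
    if kind == "image":
        return " ".join(image_to_labels(payload))
    raise ValueError(f"unsupported input kind: {kind}")

print(normalize_first_input("voice", b"...audio bytes..."))
```

The normalized query can then be matched against preset interaction scenes or parsed into scene configuration information, as described next.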
In some possible implementations, the obtaining, according to the first input information, the target interaction scenario corresponding to the user includes:
according to the first input information, one of a plurality of interaction scenes is determined to be the target interaction scene; or,
acquiring scene configuration information according to the first input information, and configuring the target interaction scene according to the scene configuration information, wherein the scene configuration information comprises at least one of the following: scene type, scene topic, scene time, scene role, and scene item.
The beneficial effects of this technical solution are as follows: according to the user's first input information, one interaction scene can be determined from a plurality of (preset) interaction scenes as the target interaction scene, or scene configuration information can be acquired from the input information and used to configure the target interaction scene, so that the user's needs are understood more accurately and an interaction scene closer to the user's expectations is provided, improving the user experience; acquiring scene configuration information from the user's first input information enables personalized customization of the target interaction scene, and since the scene configuration information covers elements such as scene type, scene topic, scene time, scene roles and scene items, the user can customize the scene in depth according to their own preferences and needs, making the interaction scene more personalized and attractive; the user can determine the target interaction scene by entering text, voice or image information, which is simple and convenient, meets the user's needs in different situations, and improves the operating experience; determining the target interaction scene from a plurality of interaction scenes according to the user's first input information, or generating it from scene configuration information, broadens the range of interaction scenes available to the user, can meet the needs of different users, and enhances the universality and practicality of the virtual idol interaction method.
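As a minimal sketch of what such scene configuration information might look like in code (field names are illustrative assumptions; the application only requires that at least one of these elements be present):

```python
# Hypothetical representation of the scene configuration information.
from dataclasses import dataclass, field
from typing import Optional, List

@dataclass
class SceneConfig:
    scene_type: Optional[str] = None          # e.g. "virtual concert"
    scene_topic: Optional[str] = None         # e.g. "New Year party"
    scene_time: Optional[str] = None          # e.g. "evening"
    scene_roles: List[str] = field(default_factory=list)  # e.g. ["host"]
    scene_items: List[str] = field(default_factory=list)  # e.g. ["stage"]

# e.g. parsed from the user's first input information
# ("I want an evening concert with a stage and a microphone"):
config = SceneConfig(scene_type="virtual concert", scene_time="evening",
                     scene_items=["stage", "microphone"])
print(config)
```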
In some possible implementations, the acquiring a background image according to the target interaction scene includes:
and acquiring a preset image corresponding to the target interaction scene as the background image.
The beneficial effects of this technical solution are as follows: acquiring the preset image corresponding to the target interaction scene as the background image allows the required background image to be produced quickly, saving the time and computing resources needed to generate a background image and improving the running efficiency of the virtual idol interaction method; because the preset image is specifically designed for the target interaction scene, it has high image quality and specificity, so using it as the background image helps ensure that the generated interaction scene image has high quality and a good visual effect, improving the user experience; acquiring the preset image corresponding to the target interaction scene as the background image simplifies the operation flow and makes virtual idol interaction easier for the user, who does not need to manually create or select a background image — selecting the target interaction scene is enough to obtain the corresponding background image, which improves convenience; using preset images as background images also facilitates later updates and expansion, since developers can at any time add new preset images or replace old ones for an existing target interaction scene, enriching the content of the interaction scenes and improving the adaptability and extensibility of the virtual idol interaction method.
In some possible implementations, the terminal device is an augmented reality device worn by the user;
the obtaining the target interaction scene corresponding to the user comprises the following steps:
displaying a plurality of augmented reality scenes by using the terminal equipment;
when a selection operation for one of the augmented reality scenes is received, the selected augmented reality scene is taken as the target interactive scene.
The beneficial effects of this technical solution are as follows: since the terminal device is an augmented reality device worn by the user, the user can interact with the virtual idol in the real environment, and compared with traditional display devices, the augmented reality device brings the user a more immersive experience, improving the realism and enjoyment of virtual idol interaction; displaying a plurality of augmented reality scenes on the augmented reality device for the user to choose from simplifies the selection of the target interaction scene, since the user only needs to make a selection in a highly immersive and engaging augmented reality environment, which improves convenience; the user can interact with the virtual idol through the augmented reality device in the real environment, which enhances user engagement, lets the user take part in virtual idol interaction with more involvement, and improves the user experience; the augmented reality device is highly adaptable and can be applied in many scenarios such as entertainment, education and business, and using it for virtual idol interaction can meet user needs in different scenarios, enhancing the universality and practicality of the virtual idol interaction method.
In some possible implementations, the acquiring a background image according to the target interaction scene includes:
when the selected augmented reality scene is a virtual reality scene, acquiring a preset image corresponding to the target interaction scene as the background image;
when the selected augmented reality scene is an augmented reality scene or a mixed reality scene, a real-time image of the current environment is acquired by using a camera of the terminal device as the background image.
The beneficial effects of this technical solution are as follows: acquiring the background image in different ways depending on the type of the selected augmented reality scene (virtual reality, augmented reality or mixed reality) lets the virtual idol interaction method flexibly adapt to various types of interaction scenes, improving its universality and practicality; in an augmented reality or mixed reality scene, using the camera of the terminal device to capture a real-time image of the current environment as the background image enables seamless fusion of the virtual idol image with the real environment, enhancing the realism of virtual idol interaction and improving the user experience; acquiring background images in different ways in different types of augmented reality scenes provides users with rich and varied interaction experiences, and the user can choose a virtual reality, augmented reality or mixed reality scene for virtual idol interaction according to their own preferences and needs; in a virtual reality scene, acquiring a preset (3D) image as the background image saves the time and computing resources needed to generate one, while in an augmented reality or mixed reality scene, using the environment image captured in real time as the background image achieves a higher degree of fusion between the real and the virtual, improving the interaction quality.
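For illustration only, the branching just described can be sketched as follows; load_preset_image and capture_camera_frame are hypothetical helpers, not functions defined by the application:

```python
# Sketch: a preset image for VR scenes, a live camera frame for AR/MR scenes.

def load_preset_image(scene_name):
    return f"preset:{scene_name}"          # stand-in for loading a stored image

def capture_camera_frame():
    return "camera_frame"                  # stand-in for a real-time camera image

def get_background(scene_kind, scene_name):
    if scene_kind == "VR":
        return load_preset_image(scene_name)
    if scene_kind in ("AR", "MR"):
        return capture_camera_frame()
    raise ValueError(f"unknown scene kind: {scene_kind}")

print(get_background("VR", "ktv_room"))    # preset:ktv_room
print(get_background("MR", "ktv_room"))    # camera_frame
```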
In some possible implementations, the acquiring a foreground image including one or more virtual idols includes:
acquiring face driving data and action driving data of each virtual idol image, and acquiring a target position of each virtual idol image in each frame of the background image;
and generating each frame of the foreground image corresponding to each frame of the background image and containing one or more virtual idol images, according to the face driving data and the action driving data of each virtual idol image and the target position of each virtual idol image in each frame of the background image.
The beneficial effects of this technical solution are as follows: by acquiring the face driving data and the action driving data of each virtual idol and generating the corresponding foreground image for each frame of the background image from this driving data, the virtual idol can interact dynamically according to the user's operations or preset actions, improving the enjoyment and realism of the virtual idol interaction method; acquiring the face driving data and the action driving data separately allows the mouth shape, expression and actions of the virtual idol to be displayed in more detail, improving the idol's expressiveness and realism and the user experience; corresponding foreground images can be generated from the face driving data and the action driving data of different virtual idol images, so that the method adapts to a variety of virtual idol images, enhancing its universality and practicality; acquiring the target position of each virtual idol in each frame of the background image precisely controls the idol's position in the interaction scene, which helps realize more natural and fluent virtual idol interaction and improves the user experience.
In some possible implementations, the acquiring of the face driving data and the action driving data of each virtual idol image includes:
acquiring an interactive text and an interactive action of each virtual idol;
for each of the virtual idol images, performing the following processing:
driving the mouth shape and expression of the virtual idol according to the interactive text of the virtual idol to obtain face driving data of the virtual idol;
and driving the motion of the virtual idol according to the interaction motion of the virtual idol so as to obtain motion driving data of the virtual idol.
The beneficial effects of this technical solution are as follows: driving the mouth shape, expression and actions of the virtual idol from its interactive text and interactive action respectively makes the idol's performance during interaction more natural and realistic, improving the user experience of the virtual idol interaction method; allowing interactive text and interactive actions to be set for each virtual idol lets different virtual idols display unique interaction characteristics, which helps provide users with richer and more diverse interaction experiences and strengthens the user's sense of closeness to the virtual idols; acquiring the interactive text and interactive action of each virtual idol allows the idol's interactive performance to be adjusted flexibly according to user needs and scene settings, so that the method adapts better to different usage scenarios, enhancing its practicality; generating the face driving data and the action driving data from the interactive text and the interactive action respectively gives the virtual idol a finer performance during interaction, improving the interaction quality of the virtual idol interaction method and further improving the user experience.
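An illustrative sketch only: deriving per-frame face driving data from an idol's interactive text and action driving data from its interactive action. The one-viseme-per-character mapping below is a placeholder assumption, not the concrete driving algorithm of the application:

```python
# Placeholder derivation of face and action driving data.

def face_driving_data(interactive_text, expression="smile"):
    # placeholder: one mouth-shape key ("viseme") per character of text
    return [{"viseme": ch, "expression": expression} for ch in interactive_text]

def action_driving_data(interactive_action):
    # placeholder: a named skeletal-animation clip per interactive action
    return {"clip": interactive_action, "loop": False}

face = face_driving_data("hello there")
action = action_driving_data("wave")
print(face[:2], action)
```

In practice a text-to-speech stage with phoneme timing would typically drive the mouth shapes; the sketch only shows where the two kinds of driving data come from.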
In some possible implementations, the obtaining the interactive text and the interactive action of each virtual idol image includes:
receiving second input information of the user by using the terminal equipment, wherein the second input information is text information, voice information or image information;
and according to the second input information, acquiring the interactive text and the interactive action of each virtual idol image so as to respond to the second input information.
The beneficial effects of this technical solution are as follows: receiving the user's second input information with the terminal device and deriving each virtual idol's interactive text and interactive action from it realizes real-time interaction with the user and enhances the immediacy and relevance of the interaction; supporting multiple input modes such as text, voice and image information lets the user choose a suitable input mode to interact with the virtual idol according to their own preferences and the needs of the situation, improving the usability and user experience of the virtual idol interaction method; allowing the user to trigger the virtual idol's interactive text and interactive actions through the second input information realizes personalized interaction between the user and the virtual idol, strengthening the user's closeness to and immersion in the virtual idol and improving the user experience; determining the virtual idol's interactive text and interactive actions from the user's second input information lets the virtual idol respond appropriately to the user's needs and instructions, improving the intelligence of the virtual idol interaction method and further improving the user experience.
In some possible implementations, the obtaining, according to the second input information, the interactive text and the interactive action of each virtual idol image includes:
according to the second input information, acquiring an interactive text and an interactive action of the first virtual idol;
and according to the second input information and the interactive texts and interactive actions of the first to the k-th virtual idol images, acquiring the interactive text and the interactive action of the (k+1)-th virtual idol image, in response to the second input information or to the other virtual idol images, wherein k is a positive integer smaller than N, N is the number of virtual idol images in the foreground image, and N is an integer larger than 1.
The beneficial effects of this technical solution are as follows: determining the interactive text and interactive action of the virtual idols one by one allows the role and position of each virtual idol in the interaction scene to be fully considered, achieving a richer and finer interaction effect; acquiring the interactive text and interactive action of the (k+1)-th virtual idol according to the second input information and the interactive texts and interactive actions of the first k virtual idols gives the interactions between the virtual idols consistency and coordination, improving the overall interaction quality and user experience of the virtual idol interaction method; deriving each virtual idol's interactive text and interactive action from the second input information and from the other virtual idols' interactive texts and interactive actions makes it possible to respond flexibly to users' diverse needs and to scene changes, enhancing the practicality and adaptability of the method; obtaining the (k+1)-th virtual idol's interactive text and interactive action from the second input information and the preceding idols' interactive texts and interactive actions enables dynamic adjustment and response of the virtual idols during interaction, improving the dynamic performance and user experience of the virtual idol interaction method.
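A sketch of this sequential generation, where the (k+1)-th idol's reply is conditioned on the user's second input information and on the replies of idols 1..k; generate_reply is a hypothetical stand-in for a dialogue-model call, not an API of the application:

```python
# Sequential multi-idol reply generation (illustrative only).

def generate_reply(second_input, history, idol_id):
    return {"idol": idol_id,
            "text": f"idol {idol_id} responds to '{second_input}' "
                    f"after {len(history)} earlier replies",
            "action": "nod"}

def all_replies(second_input, n_idols):
    history = []
    for k in range(n_idols):                    # k = 0 .. N-1
        history.append(generate_reply(second_input, history, k + 1))
    return history

for reply in all_replies("shall we sing together?", n_idols=3):
    print(reply)
```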
In some possible implementations, the generating an interactive scene image from the background image and the foreground image includes:
synthesizing each frame of the background image and the corresponding foreground image to generate each frame of the interactive scene image; in each frame of the interactive scene image, each virtual idol image appears at its corresponding target position in the background image of the current frame, and interacts with the user through the mouth shape and expression corresponding to its interactive text and the motion corresponding to its interactive action.
The beneficial effects of this technical solution are as follows: synthesizing each frame of the background image with its corresponding foreground image generates each frame of the interaction scene image in real time, ensuring the real-time performance and continuity of the virtual idol interaction method and providing users with a smoother, more natural interaction experience; in each frame of the interaction scene image, the virtual idol interacts with the user through the mouth shape and expression corresponding to the interactive text and the motion corresponding to the interactive action, which fully reflects the idol's personality and makes the interaction process more realistic, vivid and interesting; when generating the interaction scene image, the virtual idol image is placed at its corresponding target position in the background image, so the virtual idol can flexibly adapt to various scenes and provide users with richer and more diverse interaction experiences; synthesizing the virtual idol image with the background image achieves fusion between the virtual idol and the background, enhancing the visual effect and realism of the interaction scene and thus the user's interactive experience.
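One common way to realize this per-frame synthesis is alpha blending, sketched below with NumPy; this is an assumption about the compositing step, not the only possible implementation:

```python
# Per-frame synthesis via alpha blending: the foreground (RGBA, with the
# idols already drawn at their target positions) is composited over the
# background frame.
import numpy as np

def compose_frame(background_rgb, foreground_rgba):
    alpha = foreground_rgba[..., 3:4].astype(np.float32) / 255.0
    blended = (foreground_rgba[..., :3] * alpha
               + background_rgb * (1.0 - alpha))
    return blended.astype(np.uint8)

bg = np.zeros((720, 1280, 3), dtype=np.uint8)      # one background frame
fg = np.zeros((720, 1280, 4), dtype=np.uint8)      # idols with alpha channel
fg[100:300, 200:400] = (255, 200, 180, 255)        # an idol at its target position
frame = compose_frame(bg, fg)
print(frame.shape)                                 # (720, 1280, 3)
```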
In a second aspect, the present application provides an electronic device comprising a memory storing a computer program and at least one processor configured to implement the following steps when executing the computer program:
receiving an access request from a terminal device, and establishing communication connection between the terminal device and a target server, wherein the target server is used for providing a virtual idol interaction function;
acquiring a target interaction scene corresponding to a user;
obtaining a background image according to the target interaction scene;
acquiring a foreground image containing one or more virtual idol images;
generating an interactive scene image according to the background image and the foreground image;
and displaying the interaction scene image by using the terminal equipment so that the user interacts with one or more virtual idols in the target interaction scene.
In a third aspect, the present application provides a computer-readable storage medium storing a computer program which, when executed by at least one processor, performs the steps of any of the methods or performs the functions of any of the electronic devices described above.
In a fourth aspect, the application provides a computer program product comprising a computer program which, when executed by at least one processor, performs the steps of any of the methods or performs the functions of any of the electronic devices described above.
Drawings
The application is further described below with reference to the drawings and the detailed description.
Fig. 1 is a schematic flow chart of a virtual idol interaction method provided by an embodiment of the application.
Fig. 2 is a schematic flow chart of a virtual idol interaction process based on an augmented reality device according to an embodiment of the present application.
Fig. 3 is a block diagram of an electronic device according to an embodiment of the present application.
Fig. 4 is a schematic structural diagram of a computer program product according to an embodiment of the present application.
Detailed Description
The technical solution of the present application will be described below with reference to the drawings and the specific embodiments of the present application. It should be noted that, where there is no conflict, new embodiments may be formed by any combination of the embodiments or technical features described below.
In embodiments of the application, words such as "exemplary" or "such as" are used to mean serving as an example, instance, or illustration. Any implementation or design described as "exemplary" or "e.g." in the examples of this application should not be construed as preferred or advantageous over other implementations or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present related concepts in a concrete fashion.
The first, second, etc. descriptions in the embodiments of the present application are only used for illustration and distinction of description objects, and no order division is used, nor does it represent a particular limitation on the number in the embodiments of the present application, nor should it constitute any limitation on the embodiments of the present application.
The technical field and related terms of the embodiments of the present application are briefly described below.
Virtual objects include virtual humans, virtual animals, virtual cartoon figures, and the like. A virtual human is a personified image constructed with CG technology and operated in code form, with multiple interaction modes such as language communication, expressions and action display. Virtual human technology has developed rapidly in the field of artificial intelligence and has been applied in many technical fields such as video, media, games, finance, travel, education and medical treatment; besides virtual hosts, virtual anchors, virtual training lecturers, virtual customer service agents, virtual lawyers, virtual teachers, virtual idols, virtual doctors, virtual lecturers, virtual assistants and the like, a video can also be generated with one click from text or audio. Among virtual humans, service-oriented virtual humans mainly replace real people in providing services and daily companionship; they are the virtualization of service roles in reality, and their industrial value lies mainly in reducing the costs of existing service industries and improving cost efficiency in the existing market.
Artificial Intelligence (AI) is the theory, method, technique and application system that uses a digital computer, or a machine controlled by a digital computer, to simulate, extend and expand human intelligence, perceive the environment, acquire knowledge and use knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can react in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines, so that the machines have the capabilities of perception, reasoning and decision-making. Artificial intelligence is a comprehensive discipline covering a wide range of fields, including both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, machine learning/deep learning, autonomous driving, intelligent transportation and other directions.
Machine Learning (ML) is a multi-field interdisciplinary subject involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory and other disciplines. A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E. Machine learning specializes in studying how computers simulate or implement human learning behavior to acquire new knowledge or skills and reorganize existing knowledge structures to continually improve their own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied throughout all areas of artificial intelligence.
Deep learning is a special kind of machine learning that represents the world using a nested hierarchy of concepts, with each concept defined in relation to simpler concepts and more abstract representations computed in terms of less abstract ones, thereby achieving great power and flexibility. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning and learning from instruction.
Extended Reality (XR) refers to a digital environment combining the real and the virtual, created with modern computer-centered high technology, together with a new form of human-computer interaction; it immerses the experiencer in a seamless transition between the virtual world and the real world, and is an umbrella term for technologies such as AR, VR and MR.
VR (Virtual Reality) refers to a fully virtual, 360-degree environment with no blind angle that a person sees through a worn device (e.g., VR glasses), making the person feel as if they were somewhere else, such as inside a video or a game.
AR (Augmented Reality) refers to superimposing virtual elements onto the real world through wearable devices (e.g., Google Glass) or non-wearable devices (e.g., camera-equipped phones and tablets). For example, AR can improve navigation: compared with a 2D map, augmented reality can superimpose directions onto the road the driver sees through the windshield, so the driver can steer accurately following the simulated arrows.
MR (Mixed Reality) refers to the seamless fusion of the real world with rendered graphics, creating an environment in which users can interact directly with both the digital and the physical world. MR blends real and virtual things together and presents them on the same display. The user can experience an MR environment through a head-mounted display, a mobile phone or a tablet computer, and interact with digital items by moving them or placing them into the physical world.
NPC is an abbreviation of the English "Non-Player Character", referring to characters in games, virtual worlds and the like that are controlled by computer programs rather than operated by real players. NPCs can play various roles, such as merchants, quest givers, teammates, enemies, priests, kings, guards and soldiers, adding vivid scenes and interactions to a game.
The virtual object interaction application is used for providing virtual object interaction functions. Virtual objects can simulate human communication and behavior and interact with users. Such software (referred to as virtual human interactive applications) is typically driven by artificial intelligence and natural language processing techniques and is capable of interacting with a user by means of text, speech, images, forms, etc.
A virtual idol is a virtual character with an independent image and personality, created with computer graphics and artificial intelligence technology. In recent years, with the rapid development of internet technology, computer graphics and artificial intelligence, virtual idols have been widely used in fields such as entertainment, games and advertising. A virtual idol can interact with the user, bringing a more immersive experience.
In the prior art, virtual idol interaction generally requires the user to install a corresponding application on a computer or mobile terminal and then interact with the virtual idol through that application; this requires the user to download and install the application themselves, which is inconvenient. The generation process of the virtual idol interaction scene is complex and needs a large amount of computing resources and time, causing problems such as delay in the interaction between the user and the virtual idol and affecting the user experience. Moreover, existing virtual idol interaction methods often cannot flexibly meet users' personalized needs when acquiring background and foreground images; for example, a user may wish to interact with a virtual idol in a specific scene, but existing products cannot provide enough scene choices.
Based on this, the present application provides a virtual idol interaction method, an electronic device, a computer readable storage medium and a computer program product to improve the prior art.
The solution provided by the embodiments of the application involves technologies such as virtual humans, interaction design, artificial intelligence, 3D modeling and cloud computing, and is described in detail through the following embodiments. The following description of the embodiments is not intended as a limitation on the preferred embodiments.
(virtual idol interaction method)
Referring to fig. 1, fig. 1 is a schematic flow chart of a virtual idol interaction method provided by an embodiment of the application.
The embodiment of the application provides a virtual idol interaction method, which comprises the following steps:
step S101: receiving an access request from a terminal device, and establishing communication connection between the terminal device and a target server, wherein the target server is used for providing a virtual idol interaction function;
step S102: acquiring a target interaction scene corresponding to a user;
step S103: obtaining a background image according to the target interaction scene;
step S104: acquiring a foreground image containing one or more virtual idol images;
step S105: generating an interactive scene image according to the background image and the foreground image;
step S106: and displaying the interaction scene image by using the terminal equipment so that the user interacts with one or more virtual idols in the target interaction scene.
It should be noted that the order of step S103 and step S104 may be interchanged as long as the method can still be realized; the present application is not limited in this respect.
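For illustration only, the flow of steps S101 to S106 can be sketched in Python; every name below (establish_connection, get_target_scene, and so on) is a hypothetical stand-in for the step it annotates, not an API defined by the application:

```python
# Minimal runnable sketch of steps S101-S106.

def establish_connection(access_request):           # S101
    return {"user": access_request["user"]}

def get_target_scene(user):                          # S102
    return user.get("preferred_scene", "ktv_room")

def get_background(scene):                           # S103
    return f"preset_background:{scene}"

def get_foreground(scene):                           # S104
    return ["idol_1", "idol_2"]                      # one or more virtual idols

def compose(background, foreground):                 # S105
    return {"background": background, "idols": foreground}

def display(frame):                                  # S106
    print("terminal shows:", frame)

conn = establish_connection({"user": {"preferred_scene": "ktv_room"}})
scene = get_target_scene(conn["user"])
display(compose(get_background(scene), get_foreground(scene)))
```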
The virtual idol interaction method can run on an electronic device. The electronic device and the terminal device (used by the user) may be separate, or they may be integrated. When they are separate, the electronic device may be a computer, a server (including a cloud server) or another device with computing capability. The embodiment of the application does not limit the terminal device; it may be, for example, a smart terminal device with a display screen, a microphone and a speaker, such as a mobile phone, a tablet computer, a notebook computer, a desktop computer, a smart wearable device or an augmented reality device, or it may be a workstation or console with a display screen, a microphone and a speaker. The display screen may be a touch or non-touch display screen.
The embodiment of the application does not limit the augmented reality device: it may be, for example, glasses or a helmet, or a system including glasses or a helmet. Such a system may include, for example, the glasses or helmet together with various sensors, handheld controllers or other controls, used to sense the user's position and posture in order to determine the rendered picture and to receive the user's control operations. Augmented reality devices may use a variety of sensors to track the user's actions and environment. Common sensors include, for example, the following:
accelerometer and gyroscope: for tracking the motion and direction of an augmented reality device.
Magnetometer: for detecting the earth's magnetic field to determine the orientation of the augmented reality device.
- Camera: for capturing real-world images for Augmented Reality (AR) and Mixed Reality (MR) applications.
A proximity sensor: for detecting a distance between the augmented reality device and the object.
Optical sensor: for detecting light intensity and color.
The display function of the augmented reality device is realized by a built-in display screen. For Virtual Reality (VR) helmets, two small displays, one for each eye, are typically used to create a stereoscopic image, which may be Liquid Crystal Displays (LCDs) or Organic Light Emitting Diode (OLED) displays. For Augmented Reality (AR) glasses and Mixed Reality (MR) helmets, a transparent display screen may be used to project the virtual image into the user's field of view, which may be implemented using a variety of techniques, such as projection, waveguide optics, and reflective optics.
In the embodiment of the application, the virtual idol comprises one or more of a virtual human, a virtual animal and a virtual cartoon figure. As one example, the virtual idol is a virtual human "JING" (Chinese name meaning "mirror"). By way of example, the names of virtual idol interactive applications may be "race star", "future idol", "exclusive lover", "star picking 365", "one good brother" and so on.
In the embodiment of the application, the user refers to an audience of the virtual idol.
The embodiment of the application does not limit the interaction scenes provided by the virtual idol; the virtual idol can provide various interaction scenes, such as voice interaction, text chat, image interaction, virtual game interaction, interactive voice fiction, cartoon-making interaction, virtual concerts, virtual variety shows, virtual sports competition interaction, taking group photos, virtual dance interaction, virtual reality games, online performance interaction, gesture control interaction, somatosensory game interaction, personalized customization, virtual travel interaction, online live-streaming interaction, social network interaction and the like.
In each interaction scene, a user can interact with one or more virtual idols, so that different users can conveniently select a proper number of virtual idols to interact with the users according to own needs and preferences, the interestingness of the interaction process is improved, and the competitiveness of products (namely virtual idol interaction application) is improved.
The one or more virtual idol images contained in the foreground image can be one or more virtual idol images selected by a user, or one or more virtual idol images corresponding to the current interaction scene.
The target server may run one or more applications (e.g., virtual idol interactive applications) for providing the virtual idol's interactive functions. These applications may be written in one or more programming languages, such as Java, Python or Node.js, and may use various frameworks and libraries to implement functions such as natural language processing, speech-to-text and image recognition. To improve the availability and performance of the target server, a load balancer may be used to distribute requests across multiple servers, and a failover mechanism may be used to automatically switch to a standby server in the event of a server failure. The electronic device and the target server may be separate, or they may be integrated.
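An illustrative sketch in plain Python of the round-robin load balancing and failover idea just mentioned; the server addresses and the is_alive health check are made-up placeholders, not part of the application:

```python
# Round-robin server selection with failover (illustrative only).
import itertools

SERVERS = ["srv-a:8080", "srv-b:8080", "srv-c:8080"]
_rotation = itertools.cycle(SERVERS)

def pick_server(is_alive):
    """Return the next healthy server, skipping failed ones (failover)."""
    for _ in range(len(SERVERS)):
        server = next(_rotation)
        if is_alive(server):
            return server
    raise RuntimeError("no healthy server available")

# e.g. srv-b is currently down:
print(pick_server(lambda addr: addr != "srv-b:8080"))
```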
The preset interactive videos of a virtual idol may be stored in a video library corresponding to that virtual idol; the application does not limit the preset interactive videos. The video library corresponding to the virtual idol needs to be stored on one or more servers, which may use cloud storage or their own storage to hold the videos. The video library stores one or more preset interactive videos; in each preset interactive video, the virtual idol may guide the user to determine a target interaction scene by selection or configuration, express greetings and self-introduction, show their talents (such as singing and dancing), share daily life (such as eating, traveling and shopping), promote their own (or a merchant's) products or brands, take part in special holidays or events (such as a New Year's Eve gala or Valentine's Day), release special messages or behind-the-scenes content, and so on.
In this way, by establishing a communication connection between the terminal device and the target server, the server is used to provide the virtual idol interaction function, sparing the user the trouble of downloading and installing an application. In addition, the generation process of the interaction scene is optimized, the degree of personalization is improved, the communication connection is strengthened, the method is widely applicable to various scenes, and the efficiency and user experience of virtual idol interaction are improved.
Specifically, by establishing communication connection between the terminal equipment and the target server, the virtual idol interaction method provided by the application can realize real-time transmission and processing of data in the interaction process, thereby being beneficial to ensuring the accuracy and the real-time performance of the data in the interaction process and further improving the user experience; according to the method, the background image is acquired according to the target interaction scene corresponding to the user, the foreground image containing one or more virtual idol images is acquired, and the interaction scene images are generated by combining the images; the virtual idol interaction method provided by the application allows the user to select or customize the target interaction scene, so that the personalized requirement of the user is met, the user can interact with one or more virtual idols in a specific scene according to the preference and the requirement of the user, and the immersive experience is enhanced; the virtual idol interaction method provided by the application is suitable for a plurality of fields such as entertainment, games, advertisements and the like, has higher universality, and can provide more vivid and rich interaction experience for users in various application scenes.
For example, company a has introduced a virtual idol interaction application that a user can interact with through a smartphone or virtual reality device. The following is a typical example of an interaction scenario:
and the user opens the virtual idol interactive application and sends an access request to the target server. And after receiving the access request, the target server establishes communication connection with the terminal equipment of the user.
The user browses and selects a target interactive scene. For example, the user selects an indoor KTV box scene.
And the target server acquires the background image according to the selected target interaction scene. For example, the background image is a KTV box room image.
The target server obtains a foreground image containing one or more virtual idols. For example, in the KTV box scene, one virtual idol is singing a song on the stage in the box, and another virtual idol sits on the sofa.
And the target server synthesizes the background image and the foreground image to generate an interactive scene image. In this interactive scenario, two virtual idols will appear on the stage and sofa, respectively, in the KTV box background.
And finally, the terminal device of the user displays the interactive scene image. The user may interact with the virtual idols in the target interaction scene, such as requesting songs, singing or talking with the virtual idols, clinking glasses in a toast, etc.
Through the virtual idol interaction method, a user can obtain immersive experience like being in the scene, and perform interesting and vivid interaction with the virtual idol. Meanwhile, the method can be flexibly suitable for various scenes, so that users can enjoy the fun of interaction with the virtual idol in different occasions.
In some embodiments, before acquiring the target interaction scene corresponding to the user, the method further includes:
and playing a preset interactive video by using the terminal equipment, wherein in the preset interactive video, one or more virtual idol images prompt the user to determine the target interactive scene.
Therefore, the terminal equipment is used for playing the preset interactive video, so that one or more virtual idol images prompt a user to determine a target interactive scene, the participation of the user can be enhanced, and the user can interact with the virtual idol images in the process of watching the preset interactive video, so that the user can more intuitively know and select favorite interactive scenes; in the preset interactive video, the virtual idol can intuitively display different target interactive scenes to help a user to quickly select, so that the user can more conveniently select the interactive scenes, and the overall user experience is improved; in the preset interactive video, the virtual idol can prompt the user to determine a target interactive scene in different modes, such as through voice, characters or actions, and the like, and the diversified prompting mode not only enables the interactive process to be more vivid and interesting, but also is beneficial to meeting the requirements and the favorites of different users; through the prompt and the guidance of the virtual idol in the preset interactive video, the user can know the characteristics and the styles of the virtual idol more deeply, which is helpful for improving the sense of identity and intimacy of the user to the virtual idol, thereby enhancing the emotional connection between the user and the virtual idol.
For example, the user opens a virtual idol interactive application. When the application is started, the terminal equipment plays a preset interactive video. In the preset interactive video, several virtual idols appear in different scenes, such as beach, forest, city street, etc.
In the preset interactive video, the virtual idol invites the user to select a target interactive scene. Each virtual idol presents and describes scene characteristics to the user in the respective scene to assist the user in making selections. Of course, there may be a scenario in which 2 or more virtual idols are presented together.
The user watches the preset interactive video, and selects a target interactive scene according to the prompt of the virtual idol. For example, the user selects a beach scene.
The user's selection is sent to the target server, which then performs subsequent processing, such as obtaining background images, foreground images, and interactive content of the virtual idol image, etc., according to the selected target interactive scene.
And the terminal equipment of the user displays the interactive scene image of the selected beach scene. In this scenario, the user may interact with the virtual idol, e.g., swim with the virtual idol, sun-shine, or conduct a conversation, etc.
In some embodiments, the obtaining the target interaction scenario corresponding to the user includes:
displaying a plurality of interaction scenes by using the terminal equipment, and taking the selected interaction scene as the target interaction scene when a selection operation aiming at one interaction scene is received; or,
and receiving first input information of a user by using the terminal equipment, and acquiring a target interaction scene corresponding to the user according to the first input information, wherein the first input information is text information, voice information or image information.
Therefore, a user can acquire target interaction scenes in two ways, one way is to display a plurality of interaction scenes on terminal equipment for selection by the user, the other way is to receive first input information (text information, voice information or image information) of the user, and the target interaction scenes are determined according to the first input information, so that the user can select appropriate interaction scenes more conveniently by the diversified selection ways, and user experience is improved; the user is allowed to select or customize the target interaction scene according to own preference and demand, and the demand and interest of the user can be better known by receiving the first input information (text information, voice information or image information) of the user, so that the interaction scene which meets the personalized demand of the user is provided; the user can determine the target interaction scene through selecting operation or inputting information, and the mode enables the user to participate more actively in the selection process of the interaction scene, so that the interactivity between the user and the virtual idol is improved, and the immersive experience of the user is enhanced; the method and the device provide multiple interactive scene acquisition modes, can meet the operation habits and requirements of different users, can select among multiple interactive scenes displayed on the terminal equipment for users who like visual selection, and can determine target interactive scenes through input information for users who like self definition, and the flexibility is beneficial to improving overall user experience.
As one example, a user launches the virtual idol interactive application. The application displays a plurality of interaction scenes on the terminal device for the user to select, for example: beach, forest, city street, etc. The user may select a favorite scene directly on the terminal device, e.g., by clicking on the beach scene on the screen. The application will take the selected scene as the target interaction scene.
As another example, a user may select a scene by providing first input information, such as: text information (e.g., input "i want to go to the beach"), voice information (e.g., say "take me to go to the beach"), or image information (e.g., upload a photo of the beach). The application determines a target interaction scene corresponding to the user according to the first input information provided by the user. The user's terminal device presents an interactive scene image of the selected target interactive scene (e.g., beach).
In some embodiments, the obtaining, according to the first input information, the target interaction scene corresponding to the user includes:
extracting first semantic information from the first input information by using a semantic extraction model corresponding to the first input information;
and acquiring a target interaction scene corresponding to the user according to the first semantic information.
In some embodiments, when the input information is text information, the semantic extraction model corresponding to the input information is a pre-trained language model based on deep learning; when the input information is voice information, the semantic extraction model corresponding to the input information comprises a voice-to-text model based on deep learning and a pre-training language model based on deep learning; when the input information is image information, the semantic extraction model corresponding to the input information is a semantic segmentation model based on deep learning. Wherein the input information includes first input information, second input information, and the like.
Therefore, aiming at different types of input information (text, voice and image), a corresponding semantic extraction model is adopted, so that the virtual idol interaction application can process various types of input, and the universality and adaptability of the virtual idol interaction application are enhanced; the pre-training language model, the voice-to-text model and the semantic segmentation model based on deep learning have higher accuracy, and semantic information in input information can be effectively extracted; the deep learning model has strong processing capacity, can process a large amount of complex data, and improves the response speed and processing efficiency of the virtual idol interactive application; the model based on deep learning can be continuously optimized and updated according to new data, so that the virtual idol interactive application has good adaptability and expansibility when processing the continuously changing user demands; the method can process various types of input information, can be widely applied to various virtual idol interaction scenes, and meets the requirements of different users; by supporting multiple input information types, the interactivity between the virtual idol and the user is enhanced, so that the interaction process is more vivid and interesting.
The embodiment of the application uses a model based on deep learning to extract semantic features of input information. For different types of input information, different deep learning models are adopted for semantic extraction, including a pre-training language model, a voice-to-text model, a semantic segmentation model and the like.
For text information, pre-trained language models, such as BERT, GPT, etc., are employed to extract semantic features of the input text. The models can learn the structure and semantic information of the language by pre-training a large amount of texts, so that the semantic information of the input texts can be effectively extracted.
For voice information, a deep-learning-based speech-to-text model, such as a CTC- or Transformer-based model, is adopted to convert the voice information into text information, and a pre-trained language model is then adopted to extract semantic features of the text information. Thus, semantic information related to the input information can be extracted from the voice information.
For image information, a deep-learning-based semantic segmentation model, such as UNet, DeepLab, and the like, is adopted to segment the image and extract the semantic information corresponding to each pixel. Thus, semantic information related to the input information can be accurately extracted from the image.
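As a rough sketch of this modality-dependent dispatch, the following uses the Hugging Face transformers pipelines as stand-ins for the model families named above; the checkpoint choices and return types are assumptions of the sketch, not a prescription of the embodiment.

```python
# Sketch of modality-dependent semantic extraction: a pre-trained
# language model for text, speech-to-text followed by the language
# model for audio, and semantic segmentation for images.
from transformers import pipeline

text_encoder = pipeline("feature-extraction", model="bert-base-uncased")
speech_to_text = pipeline("automatic-speech-recognition")  # CTC/Transformer-style ASR
segmenter = pipeline("image-segmentation")                 # UNet/DeepLab-style model

def extract_semantics(first_input, modality: str):
    if modality == "text":
        return text_encoder(first_input)               # token-level semantic features
    if modality == "voice":
        transcript = speech_to_text(first_input)["text"]
        return text_encoder(transcript)                # features of the transcript
    if modality == "image":
        return segmenter(first_input)                  # per-region semantic labels
    raise ValueError(f"unsupported modality: {modality}")
```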
The method has the advantages that semantic features of input information can be extracted more accurately by using a semantic extraction model based on deep learning, so that accuracy and efficiency of semantic understanding are improved; the pre-training language model can improve the natural language processing capacity, including text classification, emotion analysis, machine translation and other aspects; by adopting different deep learning models, the multi-mode information can be processed, including text information, voice information, image information and the like, so that more comprehensive information processing is realized.
In some embodiments, the obtaining, according to the first input information, the target interaction scene corresponding to the user includes:
according to the first input information, one of a plurality of interaction scenes is determined to be the target interaction scene; or,
acquiring scene configuration information according to the first input information, and configuring the target interaction scene according to the scene configuration information, wherein the scene configuration information comprises at least one of the following: scene type, scene topic, scene time, scene role, and scene item.
The scene type is not limited in the embodiment of the present application, and may include, for example, a forest, beach, amusement park, city street, shopping mall, stadium, concert hall, museum, gallery, school, office, meeting room, restaurant, cafe, park, garden, mountain, canyon, river, lake, ocean, coral reef, glacier, snow mountain, desert, oasis, tropical rain forest, jungle, castle, palace, ancient site, historic site, bamboo forest, etc.
The scene theme is not limited by the embodiment of the application, and may include, for example, sweet romance, suspense, ancient-style, martial arts, horror, science fiction, campus, daily life, urban, history, adventure, fairy tale, war, magic, western, xianxia, environmental protection, literature, music, sports, food, cooking, and the like.
The embodiment of the application does not limit the scene time. The scene time can include the season, the time period, the weather, and the like; the season can be spring, summer, autumn or winter; the time period can be dawn, sunrise, early morning, midday, afternoon, evening, dusk, daytime, night, and the like; and the weather can be sunny, cloudy, overcast, rainy, snowy, thunderstorm, typhoon, sandstorm, haze, hail, tornado, and the like.
The embodiment of the application does not limit the scene roles, which can include one or more virtual idols and their corresponding apparel (namely clothing, shoes and accessories), one or more NPCs and their corresponding apparel, one or more pets, and the like.
The embodiment of the application does not limit scene articles, and can comprise articles such as furniture, plants, electrical appliances, stationery, tableware, food, ornaments, tools and the like.
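For illustration, the scene configuration information could be carried in a small structure such as the following; the field names and defaults are assumptions of this sketch.

```python
# Possible in-memory representation of the scene configuration
# information; the fields mirror the five elements listed above.
from dataclasses import dataclass, field
from typing import Optional, List

@dataclass
class SceneConfig:
    scene_type: Optional[str] = None        # e.g. "beach", "forest"
    scene_theme: Optional[str] = None       # e.g. "romantic", "adventure"
    scene_time: Optional[str] = None        # season / time period / weather
    scene_roles: List[str] = field(default_factory=list)  # idols, NPCs, pets
    scene_items: List[str] = field(default_factory=list)  # furniture, plants, ...

def configure_scene(cfg: SceneConfig) -> dict:
    # Unspecified fields fall back to assumed scene defaults.
    return {
        "type": cfg.scene_type or "default",
        "theme": cfg.scene_theme or "daily",
        "time": cfg.scene_time or "daytime",
        "roles": cfg.scene_roles,
        "items": cfg.scene_items,
    }
```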
Therefore, according to the first input information of the user, one interaction scene can be determined from a plurality of (preset) interaction scenes to serve as a target interaction scene, or scene configuration information is acquired according to the input information and the target interaction scene is generated, so that the requirements of the user can be known more accurately, the interaction scene which is more in line with the requirements of the user is provided, and the user experience is improved; according to the first input information of the user, the scene configuration information is obtained, personalized customization of the target interactive scene can be achieved, the scene configuration information comprises various elements such as scene types, scene subjects, scene time, scene roles, scene objects and the like, the user can carry out deep customization according to own favorites and demands, and the interactive scene is more personalized and attractive; the user can determine the target interaction scene by inputting text information, voice information or image information, the operation mode is simple and convenient, the use requirements of the user under different scenes are met, and the user operation experience is improved; the target interaction scene is determined from the plurality of interaction scenes according to the first input information of the user or generated according to the scene configuration information, so that the range of the interaction scene which can be selected by the user is enlarged, the requirements of different users can be met, and the universality and the practicability of the virtual idol interaction method are enhanced.
As one example, a user provides first input information to select a scene, such as: text information (e.g., input "i want to go to the beach"), voice information (e.g., say "take me to go to the beach"), or image information (e.g., upload a photo of the beach). The application determines one of the interactive scenes directly from the plurality of interactive scenes in the scene library as a target interactive scene. For example, the user inputs "I want to go to the beach" and the application will directly select the beach as the target interaction scene.
As another example, a user provides first input information to select a scene; the application obtains scene configuration information based on the first input information provided by the user, and then determines the target interaction scene based on the scene configuration information. The scene configuration information may include the scene type (e.g., outdoor, indoor), scene theme (e.g., romantic, adventure), scene time (e.g., day, night), scene roles (e.g., a virtual idol and its apparel, an NPC and its apparel), and scene items (e.g., furniture, plants, etc.), and the like. For example, the user inputs "I want to dance with a virtual idol wearing an evening dress at night, accompanied by a Cupid playing the organ", and the application generates a target interaction scene with these specific settings according to the configuration information.
In some embodiments, the acquiring a background image from the target interaction scene includes:
and acquiring a preset image corresponding to the target interaction scene as the background image.
Therefore, the required background image can be quickly generated by acquiring the preset image corresponding to the target interaction scene as the background image, which saves the time and computing resources required for generating the background image and improves the operating efficiency of the virtual idol interaction method; because the preset image is specially designed for the target interaction scene, it has higher image quality and specificity, and using it as the background image helps ensure that the generated interaction scene image has higher quality and visual effect, thereby improving the user experience; acquiring the preset image corresponding to the target interaction scene as the background image simplifies the operation flow, so that the user can more easily perform virtual idol interaction: the user does not need to manually create or select a background image, and can obtain the corresponding background image simply by selecting the target interaction scene, which improves operational convenience; the manner of using the preset image as the background image also facilitates later updating and expansion, and a developer can add new preset images or replace old preset images for an existing target interaction scene at any time, thereby enriching the content of the interaction scene and improving the adaptability and expandability of the virtual idol interaction method.
As one example, a user launches a virtual idol interactive application and selects a target interactive scene, such as a beach. The application obtains a preset image corresponding to the beach scene, such as an exquisite beach illustration or a realistic beach photograph, as a background image. And the terminal equipment of the user displays the background image of the selected target interaction scene. In the interactive scene, a user can see a beautiful beach which comprises elements such as blue sky, white cloud, beach, sea wave and the like.
Referring to fig. 2, fig. 2 is a schematic flow chart of a virtual idol interaction process based on an augmented reality device according to an embodiment of the present application.
In some embodiments, the terminal device is an augmented reality device worn by the user;
the obtaining the target interaction scene corresponding to the user comprises the following steps:
displaying a plurality of augmented reality scenes by using the terminal equipment;
when a selection operation for one of the augmented reality scenes is received, the selected augmented reality scene is taken as the target interactive scene.
Therefore, since the terminal device is an augmented reality device worn by the user, the user can interact with the virtual idol in the real environment; compared with a traditional display device, the augmented reality device can bring a more immersive experience to the user, improving the realism and interest of the virtual idol interaction; displaying a plurality of augmented reality scenes with the augmented reality device for the user to select simplifies the selection process of the target interaction scene, and the user can determine the target interaction scene simply by making a selection in an immersive and engaging augmented reality environment, which improves operational convenience; the user can interact with the virtual idol through the augmented reality device in the real environment, which enhances the degree of user participation, allows the user to engage more deeply in the virtual idol interaction, and improves the user experience; the augmented reality device has very high adaptability and can be applied to many scenes such as entertainment, education, and business, and using the augmented reality device for virtual idol interaction can meet user demands in different scenes, enhancing the universality and practicability of the virtual idol interaction method.
Taking a virtual idol interaction application based on an augmented reality device as an example, a user interacts with the virtual idol through smart glasses or other similar augmented reality devices. The following is a typical example of an interaction procedure:
the user wears the augmented reality device and starts the virtual idol interactive application.
The application shows a number of selectable augmented reality scenes in the field of view of the user, such as a home living room, a cafe, a park, a bamboo forest, a palace, a desert, a grassland, a snow mountain, etc. These scenes combine virtual elements with the user's real environment, making the interaction more immersive and natural.
The user selects one of the augmented reality scenes according to his own preference. For example, the user has selected a cafe scene.
The application receives the selection operation of the user on the cafe scene, and takes the selected cafe scene as a target interaction scene.
In the target interaction scene, the user carries out interaction with the virtual idol. The application places the virtual idol in the user-selected cafe scene; e.g., the virtual idol may sit opposite the user, drink coffee with the user, etc.
In some embodiments, the acquiring a background image from the target interaction scene includes:
When the selected augmented reality scene is a virtual reality scene, acquiring a preset image corresponding to the target interaction scene as the background image;
when the selected augmented reality scene is an augmented reality scene or a mixed reality scene, a real-time image of the current environment is acquired by using a camera of the terminal device as the background image.
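A minimal sketch of this two-branch acquisition, assuming OpenCV for camera capture and a hypothetical preset-image table, might look as follows.

```python
# Sketch: preset image for VR scenes, live camera frame for AR/MR scenes.
import cv2

PRESET_BACKGROUNDS = {"virtual_city": "assets/virtual_city.png"}  # hypothetical paths

def get_background(scene_name: str, scene_kind: str):
    if scene_kind == "vr":
        # Virtual reality: load the preset image for the target scene.
        return cv2.imread(PRESET_BACKGROUNDS[scene_name])
    if scene_kind in ("ar", "mr"):
        # Augmented / mixed reality: capture a real-time environment frame.
        capture = cv2.VideoCapture(0)
        ok, frame = capture.read()
        capture.release()
        if not ok:
            raise RuntimeError("camera capture failed")
        return frame
    raise ValueError(f"unknown scene kind: {scene_kind}")
```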
Therefore, the background image is acquired in different modes according to the type of the selected augmented reality scene (virtual reality, augmented reality or mixed reality), so that the virtual idol interaction method can flexibly adapt to various types of interaction scenes, improving its universality and practicability; in an augmented reality or mixed reality scene, the camera of the terminal device is used to collect a real-time image of the current environment as the background image, so that seamless fusion of the virtual idol and the real environment can be realized, enhancing the realism of the virtual idol interaction and improving the user experience; acquiring the background image in different modes in different types of augmented reality scenes provides users with rich and varied interaction experiences, and users can select virtual reality, augmented reality or mixed reality scenes for virtual idol interaction according to their own preferences and needs; in a virtual reality scene, acquiring a preset (3D) image as the background image saves the time and calculation resources required for generating the background image, while in an augmented reality or mixed reality scene, using an environment image acquired in real time as the background image achieves a higher degree of fusion between the real and the virtual, improving the interaction quality.
In the above embodiments, the virtual idol interaction process involves different types of augmented reality scenes, such as virtual reality, augmented reality, and mixed reality. The following is a typical example of an interaction procedure:
the user wears an augmented reality device (e.g., virtual reality helmet, smart glasses, etc.) and launches a virtual idol interactive application.
The application shows multiple augmented reality scenes in the field of view of the user, including virtual reality scenes (e.g., game environments, a virtual city, a magic castle, a pirate ship, a snow mountain peak, the seabed, a tropical rain forest, etc.), augmented reality scenes (e.g., a home living room, a cafe, etc.), and mixed reality scenes (e.g., a virtual pet in a park, or a workplace or supermarket where real and virtual elements are combined, etc.).
The user selects one of the augmented reality scenes according to his own preference.
When the user selects the virtual reality scene, the application acquires a preset 3D image corresponding to the target interaction scene as a background image. For example, the user selects a virtual city in a virtual reality scene, and the application will acquire a preset 3D image of the virtual city as a background. When a user selects other virtual reality scenes such as a magic castle, a pirate ship, a peak of a snow mountain, the seabed, a tropical rain forest and the like, the application takes a corresponding preset 3D image as a background.
If the user selects an augmented reality scene or a mixed reality scene, the application will use the camera of the terminal device to acquire a real-time image of the current environment as a background image. For example, if the user selects the home living room currently in as the augmented reality scene, the application will capture the picture of the home living room in real time and take it as the background image.
In the selected scene, the user carries out interaction with the virtual idol. The application places the virtual idol in the scene selected by the user and fuses it with the real environment or the virtual environment, so that the user can perform natural and immersive interaction with the virtual idol in the selected scene.
In some embodiments, the acquiring a foreground image containing one or more virtual idol images includes:
acquiring face driving data and action driving data of each virtual idol image, and acquiring a target position of each virtual idol image in each frame of the background image;
and generating each frame of the foreground image, corresponding to each frame of the background image and containing one or more virtual idol images, according to the face driving data and the action driving data of each virtual idol image and the target position of each virtual idol image in each frame of the background image.
Therefore, face driving data and action driving data of each virtual idol are obtained, corresponding foreground images are generated in each frame of background images according to the driving data, so that the virtual idol can dynamically interact according to the operation or the preset action of a user, and the interestingness and the sense of reality of the virtual idol interaction method are improved; the face driving data and the action driving data of the virtual idol are respectively obtained, so that the mouth shape, the expression and the action of the virtual idol can be displayed more carefully, the expressive force and the sense of reality of the virtual idol are improved, and the user experience is improved; corresponding foreground images can be generated according to the face driving data and the action driving data of different virtual idol images, so that the virtual idol image interaction method can adapt to various virtual idol images, and the universality and the practicability of the virtual idol image interaction method are enhanced; the target position of each virtual idol in each frame of background image is obtained, the position of the virtual idol in the interaction scene is accurately controlled, more natural and smooth virtual idol interaction is facilitated, and user experience is improved.
In the above embodiments, the virtual idol interaction process involves acquiring a foreground image containing one or more virtual idols. The following is a typical example of an interaction procedure:
And the user starts the virtual idol interaction application, selects an interaction scene and acquires a background image corresponding to the scene.
The application obtains information of a plurality of virtual idols, including face-driven data and action-driven data. The face driving data may include driving data of expression, mouth shape, etc., and the motion driving data may include driving data of posture, motion, etc.
The application determines for each virtual idol a target position in each frame of the background image, which positions may be preset or calculated in real time from the user interaction data (e.g. real-time images).
Based on the face drive data, the motion drive data, and the target location of each virtual idol in each frame of the background image, the application generates each frame of the foreground image containing one or more virtual idols corresponding to each frame of the background image, e.g., a 3D model of the virtual idol may be rendered onto the 2D image, or the foreground image may be synthesized using other methods.
After each frame of the foreground image is generated, the application synthesizes the foreground image and the background image to generate a complete interactive scene image; this process involves techniques such as transparency blending, occlusion handling, and lighting processing.
The generated interactive scene image is displayed on the terminal equipment, so that a user can interact with the virtual idol. The virtual idol will change expression and motion according to the face-driven data and the motion-driven data to achieve a natural and interesting interactive experience. By the method, the application can provide a dynamic and real virtual idol interaction experience for the user, and meanwhile, good performance and picture quality are maintained.
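The per-frame foreground generation described above can be sketched as the loop below; the renderer interface (empty_layer, render, paste) is an assumption standing in for whatever engine rasterizes the rigged idol model.

```python
# Sketch of per-frame foreground generation: for frame i, each idol is
# posed from its face/action driving data and drawn at its target
# position on a transparent layer matching the background frame.
def generate_foreground_frames(background_frames, idols, renderer):
    foreground_frames = []
    for i, bg in enumerate(background_frames):
        layer = renderer.empty_layer(bg.shape)      # transparent RGBA layer
        for idol in idols:
            face = idol.face_driving[i]             # mouth shape / expression
            pose = idol.action_driving[i]           # body pose for this frame
            x, y = idol.target_positions[i]         # placement in this frame
            sprite = renderer.render(idol.model, pose, face)
            layer = renderer.paste(layer, sprite, (x, y))
        foreground_frames.append(layer)
    return foreground_frames
```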
In some embodiments, the acquiring the face driving data and the action driving data of each virtual idol image includes:
acquiring an interactive text and an interactive action of each virtual idol;
for each of the virtual idol images, performing the following processing:
driving the mouth shape and expression of the virtual idol according to the interactive text of the virtual idol to obtain face driving data of the virtual idol;
and driving the motion of the virtual idol according to the interaction motion of the virtual idol so as to obtain motion driving data of the virtual idol.
In the embodiment of the application, the interaction action can be, for example, smiling, waving, dancing, blowing a kiss, hugging, shaking hands, cheering someone on, shouting, greeting, clapping, making a finger heart, making a funny face, nodding, fist bumping, striking a pose, kneading the shoulders, patting the back, and the like.
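At its simplest, such a named interaction action can be resolved to motion driving data through a lookup table of animation clips, as in the sketch below; the clip paths are hypothetical.

```python
# Hypothetical mapping from named interaction actions to animation clips.
ACTION_CLIPS = {
    "smile": "clips/smile.anim",
    "wave": "clips/wave.anim",
    "dance": "clips/dance.anim",
    "hug": "clips/hug.anim",
    "finger_heart": "clips/finger_heart.anim",
}

def get_action_driving(interaction_action: str) -> str:
    # Fall back to an idle clip for actions without a dedicated animation.
    return ACTION_CLIPS.get(interaction_action, "clips/idle.anim")
```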
Therefore, the mouth shape, the expression and the action of the virtual idol are driven according to the interactive text and the interactive action of the virtual idol, so that the performance of the virtual idol in the interaction process is more natural and real, and the user experience of the virtual idol interaction method is improved; the method has the advantages that the interactive text and the interactive action are allowed to be set for each virtual idol, so that unique interactive characteristics can be displayed among different virtual idols, the method is beneficial to providing more abundant and diverse interactive experience for users, and the sense of closeness of the users to the virtual idols is enhanced; the interactive text and the interactive action of each virtual idol are acquired, and the interactive performance of the virtual idol can be flexibly adjusted according to the requirements of users and scene setting, so that the virtual idol interaction method can be better adapted to different use scenes, and the practicability of the virtual idol interaction method is enhanced; the face driving data and the action driving data of the virtual idol are respectively generated from the interactive text and the interactive action, so that the virtual idol has finer performance in the interaction process, the interaction quality of the virtual idol interaction method is improved, and the user experience is further improved.
In the above embodiments, the virtual idol interaction process involves acquiring face-driven data and action-driven data for each virtual idol. The following is a typical example of an interaction procedure:
The user interacts with the virtual idol and sends text, speech, or a use gesture to the application.
The application program receives the input information of the user and acquires the interactive text and the interactive action of each virtual idol image in order to respond to the user's input. For example, the user says: "Hello", and the application obtains the interactive text of the virtual idol, e.g., "Hello there!", and the interactive action, e.g., a greeting gesture.
For each virtual idol, the application performs the following processing:
a. The application program drives the mouth shape and the expression of the virtual idol according to the interactive text of the virtual idol to obtain the face driving data. For example, based on the interactive text "Hello there!", the application matches the mouth shape and expression of the virtual idol to the pronunciation of "Hello there!".
b. The application program drives the actions of the virtual idol according to the interactive action of the virtual idol to obtain the action driving data. For example, the application will cause the virtual idol to perform a greeting gesture.
The application program processes and integrates the face driving data and the action driving data, and, combined with the target position of the virtual idol in the foreground image, obtains the foreground image, so that in the foreground image the virtual idol can change its expression and actions according to the user's input, realizing natural interaction with the user.
In some embodiments, the obtaining the interactive text and the interactive action of each virtual idol image includes:
receiving second input information of the user by using the terminal equipment, wherein the second input information is text information, voice information or image information;
and according to the second input information, acquiring the interactive text and the interactive action of each virtual idol image so as to respond to the second input information.
As an example, when the user inputs voice information (i.e., the second input information) whose content is "I worked really hard today", the interactive text of the virtual idol is, for example, "You have worked hard; have a good rest" or "You must be so tired; rest early tonight". The interactive action of the virtual idol is, for example, kneading the user's shoulders, patting, or hugging, to express concern and comfort for the user.
Therefore, the terminal device is used to receive the second input information of the user, and the interactive text and the interactive action of each virtual idol are obtained according to this information, realizing real-time interaction with the user and enhancing the real-time performance and pertinence of the interaction; multiple input modes, such as text information, voice information and image information, are supported, so that the user can select a suitable input mode to interact with the virtual idol according to personal preference and scene requirements, improving the usability and user experience of the virtual idol interaction method; the user is allowed to trigger the interactive text and the interactive action of the virtual idol through the second input information, realizing personalized interaction between the user and the virtual idol, enhancing the user's sense of closeness to and immersion in the virtual idol, and improving the user experience; determining the interactive text and the interactive action of the virtual idol according to the second input information of the user enables the virtual idol to react appropriately to the user's demands and instructions, which improves the degree of intelligence of the virtual idol interaction method and further improves the user experience.
In the above embodiment, the virtual idol interaction process involves acquiring the interaction text and interaction action of each virtual idol according to the second input information (e.g., text, voice or image information) of the user. The following is a typical example of an interaction procedure:
the user interacts with the virtual idol through the terminal device, sending a second input information, which may be text information (e.g., words entered by the user), speech information (e.g., words spoken by the user), or image information (e.g., user gestures or expressions, or user hand-drawn works).
The application program receives second input information of the user and determines corresponding interactive text and interactive actions for each virtual idol according to the second input information. For example:
a. If the user sends the text message "let's dance", the application program automatically generates the interactive text of the virtual idol, such as "Sure, let's dance together", and triggers the corresponding interactive action (such as a dance action) according to the interactive text.
b. If the user sends a voice message "tell me a joke", the application will automatically generate the interactive text of the virtual idol, e.g., a joke, and trigger the corresponding interactive action (e.g., telling the joke).
c. If the user sends image information, such as a gesture or an expression, the application program analyzes the image information and generates the corresponding interactive text and interactive action for the virtual idol according to the analysis result (for example, if the user waves, the virtual idol waves back).
By the method, the virtual idol can generate the interactive text and the interactive action according to the real-time input of the user, and the natural and vivid interactive experience with the user is realized.
In some embodiments, the obtaining, according to the second input information, the interactive text and the interactive action of each virtual idol includes:
according to the second input information, acquiring an interactive text and an interactive action of the first virtual idol;
and according to the second input information and the interactive texts and interactive actions of the first virtual idol to the kth virtual idol, acquiring the interactive text and the interactive action of the (k+1)th virtual idol, so as to respond to the second input information or to other virtual idols, wherein k is a positive integer smaller than N, N is the number of virtual idols in the foreground image, and N is an integer larger than 1.
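A sketch of this sequential scheme follows: each idol's reply is generated from the user's input plus the replies of all earlier idols, so the idols appear to converse with one another. The dialogue_model interface is an assumption of the sketch.

```python
# Sketch of the k -> k+1 response scheme: idol k+1 is conditioned on the
# second input information and the turns of idols 1..k.
def generate_idol_turns(user_input, idols, dialogue_model):
    context = [("user", user_input)]
    turns = []
    for idol in idols:                        # idols 1..N, in order
        reply, action = dialogue_model.respond(idol.persona, context)
        turns.append((idol.name, reply, action))
        context.append((idol.name, reply))    # visible to the next idol
    return turns
```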
Therefore, by determining the interactive texts and interactive actions of the virtual idols one by one, the role and the position of each virtual idol in the interaction scene can be fully considered, realizing a richer and finer interactive effect; acquiring the interactive text and the interactive action of the (k+1)th virtual idol according to the second input information and the interactive texts and interactive actions of the first k virtual idols gives the interactions between the virtual idols consistency and synergy, improving the overall interaction quality and user experience of the virtual idol interaction method; acquiring the interactive text and the interactive action of each virtual idol according to the second input information and the interactive texts and interactive actions of the other virtual idols makes it possible to respond flexibly to the diversified demands of users and to scene changes, enhancing the practicability and adaptability of the virtual idol interaction method; and acquiring the interactive text and the interactive action of the (k+1)th virtual idol according to the second input information and the interactive texts and interactive actions of the preceding virtual idols enables dynamic adjustment and response of the virtual idols during the interaction, improving the dynamic performance and user experience of the virtual idol interaction method.
In the above embodiment, the virtual idol interaction process involves acquiring the interaction text and the interaction action of each virtual idol according to the second input information of the user, and considering the interactions of other virtual idols. The following is a typical example of an interaction procedure:
and the user interacts with the virtual idol through the terminal equipment to send second input information. The application program receives second input information of the user and determines corresponding interactive text and interactive actions for the first virtual idol according to the second input information.
The application program then determines the interactive text and the interactive action of the (k+1)th virtual idol according to the second input information of the user and the interactive texts and interactive actions of the first k virtual idols, so that the virtual idols can also interact with each other, and not just with the user.
For example, assume that there are three virtual idols in the scene: A, B and C. The user sends a text message: "How do you like this new song?" The application first determines the interactive text and interactive action for virtual idol A, for example, A says: "I think this song is very good." The application then determines the interactive text and interactive action for virtual idol B based on the user's input and A's response, for example, B says: "I also like this song; the melody is beautiful." Finally, the application determines the interactive text and interactive action for virtual idol C according to the second input information of the user and the responses of A and B, for example, C says: "This song really is good; I like it too."
In this way, virtual idols can interact with each other, and a more natural and vivid interaction experience is generated according to the input of a user.
In some embodiments, the generating an interactive scene image from the background image and the foreground image comprises:
synthesizing each frame of the background image and its corresponding foreground image to generate each frame of the interactive scene image; in each frame of the interactive scene image, each virtual idol image appears at its corresponding target position in the background image of the current frame, and interacts with the user through the mouth shape and expression corresponding to its interactive text and the action corresponding to its interactive action.
Therefore, each frame of the background image and its corresponding foreground image are synthesized to generate each frame of the interactive scene image in real time, which ensures the real-time performance and continuity of the virtual idol interaction method and provides the user with a smoother, more natural interaction experience; in each frame of the interactive scene image, the virtual idol interacts with the user through the mouth shape and expression corresponding to the interactive text and the action corresponding to the interactive action, which fully reflects the personality characteristics of the virtual idol and makes the interaction process more real, vivid and interesting; when the interactive scene image is generated, the virtual idol image is placed at its corresponding target position in the background image, so that the virtual idol can flexibly adapt to various scenes, providing the user with a richer and more diverse interaction experience; and synthesizing the virtual idol and the background image realizes fusion between the virtual idol and the background, enhancing the visual effect and realism of the interaction scene and thereby improving the user's interaction experience.
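One common way to realize this per-frame synthesis step is straight alpha compositing, sketched below with NumPy; the occlusion and lighting passes noted above would follow in a full pipeline.

```python
# Per-frame "over" compositing of an RGBA foreground onto an RGB background.
import numpy as np

def composite(background_rgb: np.ndarray, foreground_rgba: np.ndarray) -> np.ndarray:
    # foreground_rgba: HxWx4 with alpha in [0, 255]; background_rgb: HxWx3.
    alpha = foreground_rgba[..., 3:4].astype(np.float32) / 255.0
    fg = foreground_rgba[..., :3].astype(np.float32)
    bg = background_rgb.astype(np.float32)
    out = alpha * fg + (1.0 - alpha) * bg   # standard alpha "over" operator
    return out.astype(np.uint8)
```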
In the above embodiments, the virtual idol interaction process involves synthesizing the background image and the foreground image to generate the interactive scene image. The following is a typical example of the scene generation process:
first, the application will acquire a background image for each frame. The background image can be a preset static image or an environment image acquired in real time.
The application then obtains a foreground image corresponding to each frame of background image. The foreground image contains one or more virtual idols, each having its own target location, interactive text, and interactive actions.
The application then synthesizes each frame of the background image with its corresponding foreground image. During composition, each virtual idol image appears at its corresponding target position in the background image.
Finally, the virtual idol interacts with the user according to the self interaction text and interaction action, including generating corresponding mouth shapes and expressions through the interaction text and generating corresponding actions according to the interaction action.
For example, the user is interacting with two virtual idols. The background image is an indoor environment, and virtual idols a and B are located on the left and right sides of the background image, respectively. In each frame of interactive scene image, virtual idols A and B interact with a user according to own interactive text and interactive actions. A is answering the user's question and therefore has a corresponding mouth shape and expression, while B is making an explanatory gesture and therefore has a corresponding action. In this way, the application program generates a scene with interactivity by synthesizing the background image and the foreground image, so that the user can perform immersive interaction with the virtual idol.
For another example, in an interactive scenario set up to simulate romance with virtual idols, assuming that there are 3 virtual idols A, B and C, they will interact differently according to the user's input or the settings in the scenario, for example:
The user selects A and interacts with A in the home living room: A acts cute for the user, showing a lovely expression or putting on a small performance to try to attract the user's attention.
The user selects B and interacts with B in the cafe: B actively converses with the user in an attempt to establish a deeper connection with the user, e.g., sharing his own stories, listening to the user's thoughts, etc.
The user selects C and interacts with C on the city street: C shows off his appearance, outfit, etc., or tries to do something interesting with the user, e.g., traveling together, watching a movie, etc.
Each virtual idol has its own personality and characteristics, and hopes to win the user's affection and attention through its distinctive interaction style and its interactions with the user.
For another example, suppose the user Lan determines that the target interaction scene is a summer beach, and Lan chooses to interact with three virtual idols: a summer-fresh boy, a fashion trendsetter, and a sunny guy. The following is a possible interaction process:
The summer-fresh boy leans against a tree on the beach, holding a book. When Lan comes close, he smiles sweetly at her and starts a pleasant chat. He asks Lan her name and what beach activities she likes, and suggests that the two of them surf the waves or pick up shells together. During the interaction, the summer-fresh boy exhibits calm and warmth, trying to attract Lan's attention.
The fashion trendsetter, wearing fashionable swimwear and sunglasses, plays volleyball on the beach volleyball court. When Lan comes close, he tries to impress her with his ball skills. He invites Lan to play together and praises her technique. During the interaction, the fashion trendsetter exhibits confidence and sunniness, trying to attract Lan's attention with his vitality and charm.
The sunny guy walks along the beach, looking relaxed and free. When Lan comes close, he goes over to greet her. He asks whether Lan likes this beach and whether she is enjoying her vacation. During the interaction, the sunny guy shows enthusiasm and friendliness, trying to attract Lan's attention through his personal charm.
In a specific application scenario, the embodiment of the application further provides a virtual idol interaction method, which comprises the following steps:
Receiving an access request from a terminal device, and establishing communication connection between the terminal device and a target server, wherein the target server is used for providing a virtual idol interaction function;
using the terminal device to play a preset interactive video, wherein in the preset interactive video, one or more virtual idol images prompt the user to determine the target interaction scene;
displaying a plurality of interaction scenes by using the terminal equipment, and taking the selected interaction scene as the target interaction scene when a selection operation aiming at one interaction scene is received; or, receiving first input information of a user by using the terminal equipment, and determining one of a plurality of interaction scenes as the target interaction scene according to the first input information; or, acquiring scene configuration information according to the first input information, and configuring the target interaction scene according to the scene configuration information, wherein the first input information is text information, voice information or image information, and the scene configuration information comprises at least one of the following: scene type, scene topic, scene time, scene role and scene item;
Acquiring a preset image corresponding to the target interaction scene as the background image;
acquiring face driving data and action driving data of each virtual idol image, and acquiring a target position of each virtual idol image in each frame of the background image;
generating each frame of the foreground image containing one or more virtual idol images corresponding to each frame of the background image according to the face driving data and the action driving data of each virtual idol image and the target position of each virtual idol image in each frame of the background image;
synthesizing the background image and the corresponding foreground image of each frame to generate the interactive scene image of each frame; in each frame of the interactive scene image, each virtual idol image appears at its corresponding target position in the background image of the current frame and interacts with the user through the mouth shape and expression corresponding to its interactive text and the action corresponding to its interactive action;
and displaying the interaction scene image by using the terminal equipment so that the user interacts with one or more virtual idols in the target interaction scene.
In another specific application scenario, the embodiment of the application further provides a virtual idol interaction method, which comprises the following steps:
receiving an access request from a terminal device, and establishing communication connection between the terminal device and a target server, wherein the target server is used for providing a virtual idol interaction function; the terminal device is an augmented reality device worn by the user;
using the terminal device to play a preset interactive video, wherein in the preset interactive video, one or more virtual idol images prompt the user to determine the target interaction scene;
displaying a plurality of augmented reality scenes by using the terminal equipment;
when a selection operation for one of the augmented reality scenes is received, taking the selected augmented reality scene as the target interaction scene;
when the selected augmented reality scene is a virtual reality scene, acquiring a preset image corresponding to the target interaction scene as the background image;
when the selected augmented reality scene is an augmented reality scene or a mixed reality scene, acquiring a real-time image of the current environment by using a camera of the terminal equipment as the background image;
Acquiring face driving data and action driving data of each virtual idol image, and acquiring a target position of each virtual idol image in each frame of the background image;
generating each frame of the foreground image containing one or more virtual idol images corresponding to each frame of the background image according to the face driving data and the action driving data of each virtual idol image and the target position of each virtual idol image in each frame of the background image;
synthesizing the background image and the corresponding foreground image of each frame to generate the interactive scene image of each frame; in each frame of the interactive scene image, each virtual idol image appears at its corresponding target position in the background image of the current frame and interacts with the user through the mouth shape and expression corresponding to its interactive text and the action corresponding to its interactive action;
and displaying the interaction scene image by using the terminal equipment and synchronously playing corresponding voice synthesis data so as to enable the user to interact with one or more virtual idol images in the target interaction scene.
The step of obtaining the face driving data and the action driving data of each virtual idol image comprises the following steps:
receiving second input information of the user by using the terminal device, wherein the second input information is text information, voice information or image information;
according to the second input information, acquiring an interactive text and an interactive action of the first virtual idol image, so as to respond to the second input information;
according to the second input information and the interactive texts and interactive actions of the first virtual idol image through the k-th virtual idol image, acquiring the interactive text and the interactive action of the (k+1)-th virtual idol image, so as to respond to the second input information or to the other virtual idol images, wherein k is a positive integer smaller than N, N is the number of virtual idol images in the foreground image, and N is an integer larger than 1;
for each of the virtual idol images, performing the following processing:
performing voice synthesis according to the interactive text of the virtual idol to obtain voice data;
driving the mouth shape and expression of the virtual idol according to the voice data of the virtual idol to obtain the face driving data of the virtual idol;
and driving the motion of the virtual idol according to the interaction motion of the virtual idol so as to obtain motion driving data of the virtual idol.
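The three driving steps above form a simple chain per idol: the interaction text is synthesized into voice data, the voice data drives the mouth shape and expression, and the interaction action drives the body motion. Below is a sketch under the assumption that the TTS, lip-sync, and motion models are supplied as callables; no specific models or libraries are implied.

```python
def drive_idol(interaction_text, interaction_action,
               synthesize_speech, drive_face, drive_action):
    """One idol's driving chain: text -> voice data, voice data ->
    face driving data (mouth shape and expression), interaction
    action -> action driving data."""
    voice_data = synthesize_speech(interaction_text)
    face_driving_data = drive_face(voice_data)
    action_driving_data = drive_action(interaction_action)
    return voice_data, face_driving_data, action_driving_data

def drive_all_idols(idol_turns, synthesize_speech, drive_face, drive_action):
    """idol_turns: list of (interaction_text, interaction_action), ordered
    so that idol k+1's turn may already depend on idols 1..k, as in the
    steps above."""
    return [drive_idol(text, action, synthesize_speech, drive_face, drive_action)
            for text, action in idol_turns]
```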
In some embodiments, the manner of acquiring the interaction action of the first virtual idol image may be semantic matching, text driving, or voice driving; specifically, the interaction action corresponding to the semantic information of the second input information may be acquired, or the interaction action corresponding to the interactive text of the first virtual idol image may be acquired, or the interaction action corresponding to the voice data of the first virtual idol image may be acquired.
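As one hedged illustration of the semantic-matching option, the sketch below picks the action whose description is most similar to the input text. The bag-of-words embedding is a deliberately toy stand-in; an actual system would use a trained sentence (or speech) embedding model, and the action names here are hypothetical.

```python
import numpy as np

def embed(text, vocab):
    """Toy bag-of-words embedding, used only to make the idea concrete."""
    v = np.zeros(len(vocab), dtype=np.float32)
    for token in text.lower().split():
        if token in vocab:
            v[vocab[token]] += 1.0
    norm = np.linalg.norm(v)
    return v / norm if norm else v

def match_action(query, action_bank, vocab):
    """Semantic matching: return the action whose description is most
    similar (by cosine) to the query, which may be the second input
    information or an idol's interactive text or voice transcript."""
    q = embed(query, vocab)
    return max(action_bank, key=lambda name: float(embed(action_bank[name], vocab) @ q))

# Illustrative action bank and vocabulary (hypothetical labels).
actions = {"wave": "wave hello greeting", "bow": "bow politely"}
tokens = {t for desc in actions.values() for t in desc.split()} | {"to", "me"}
vocab = {w: i for i, w in enumerate(sorted(tokens))}
print(match_action("wave hello to me", actions, vocab))  # -> wave
```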
(electronic device)
The embodiment of the application also provides an electronic device, the specific embodiments and technical effects of which are consistent with those described in the method embodiments; repeated details are not described again.
The electronic device comprises a memory storing a computer program and at least one processor configured to implement the following steps when executing the computer program:
receiving an access request from a terminal device, and establishing communication connection between the terminal device and a target server, wherein the target server is used for providing a virtual idol interaction function;
acquiring a target interaction scene corresponding to a user;
obtaining a background image according to the target interaction scene;
acquiring a foreground image containing one or more virtual idol images;
generating an interactive scene image according to the background image and the foreground image;
and displaying the interaction scene image by using the terminal device so that the user interacts with the one or more virtual idol images in the target interaction scene.
In some embodiments, before acquiring the target interaction scene corresponding to the user, the at least one processor is configured to execute the computer program to further implement the steps of:
and playing a preset interactive video by using the terminal device, wherein in the preset interactive video, one or more virtual idol images prompt the user to determine the target interaction scene.
In some embodiments, the at least one processor is configured to acquire the target interaction scene corresponding to the user when executing the computer program in the following manner:
displaying a plurality of interaction scenes by using the terminal device, and taking the selected interaction scene as the target interaction scene when a selection operation for one of the interaction scenes is received; or,
and receiving first input information of the user by using the terminal device, and acquiring the target interaction scene corresponding to the user according to the first input information, wherein the first input information is text information, voice information or image information.
In some embodiments, the at least one processor is configured to acquire the target interaction scene corresponding to the user according to the first input information when executing the computer program in the following manner:
according to the first input information, one of a plurality of interaction scenes is determined to be the target interaction scene; or,
acquiring scene configuration information according to the first input information, and configuring the target interaction scene according to the scene configuration information, wherein the scene configuration information comprises at least one of the following: scene type, scene topic, scene time, scene role, and scene item.
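A minimal sketch of how such scene configuration information might be represented and overlaid on a scene's defaults follows; the field names mirror the list above, while the types and the merge rule are assumptions for illustration only.

```python
from dataclasses import dataclass, field

@dataclass
class SceneConfig:
    """Configuration fields named above; any subset may be present,
    so every field defaults to empty."""
    scene_type: str = ""         # e.g. "concert" (illustrative value)
    scene_topic: str = ""
    scene_time: str = ""         # e.g. "night"
    scene_roles: list = field(default_factory=list)
    scene_items: list = field(default_factory=list)

def configure_scene(defaults: SceneConfig, parsed: SceneConfig) -> SceneConfig:
    """Overlay fields extracted from the first input information onto
    the defaults of the chosen scene template."""
    return SceneConfig(
        scene_type=parsed.scene_type or defaults.scene_type,
        scene_topic=parsed.scene_topic or defaults.scene_topic,
        scene_time=parsed.scene_time or defaults.scene_time,
        scene_roles=parsed.scene_roles or defaults.scene_roles,
        scene_items=parsed.scene_items or defaults.scene_items,
    )
```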
In some embodiments, the at least one processor is configured to obtain a background image from the target interaction scene when executing the computer program in the following manner:
and acquiring a preset image corresponding to the target interaction scene as the background image.
In some embodiments, the terminal device is an augmented reality device worn by the user;
the at least one processor is configured to acquire the target interaction scene corresponding to the user when executing the computer program in the following manner:
displaying a plurality of augmented reality scenes by using the terminal device;
when a selection operation for one of the augmented reality scenes is received, the selected augmented reality scene is taken as the target interaction scene.
In some embodiments, the at least one processor is configured to obtain a background image from the target interaction scene when executing the computer program in the following manner:
when the selected augmented reality scene is a virtual reality scene, acquiring a preset image corresponding to the target interaction scene as the background image;
when the selected augmented reality scene is an augmented reality scene or a mixed reality scene, a real-time image of the current environment is acquired by using a camera of the terminal device as the background image.
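The branch above reduces to a small selection rule: preset imagery for fully virtual scenes, the live camera feed for augmented or mixed reality scenes. A sketch follows, with the camera capture supplied as an assumed callable.

```python
from enum import Enum, auto

class XRSceneKind(Enum):
    VR = auto()   # fully virtual: use a preset background image
    AR = auto()   # augmented reality: use the live camera feed
    MR = auto()   # mixed reality: likewise the live camera feed

def get_background_frame(kind, preset_image, capture_frame):
    """Select the background source per the branch above; capture_frame
    is an assumed callable returning the headset camera's current frame."""
    if kind is XRSceneKind.VR:
        return preset_image
    return capture_frame()
```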
In some embodiments, the at least one processor is configured to acquire the foreground image containing one or more virtual idol images when executing the computer program in the following manner:
acquiring face driving data and action driving data of each virtual idol image, and acquiring a target position of each virtual idol image in each frame of the background image;
and generating each frame of the foreground image which corresponds to each frame of the background image and contains one or more virtual idol images according to the face driving data and the action driving data of each virtual idol image and the target position of each virtual idol image in each frame of the background image.
In some embodiments, the at least one processor is configured to acquire the face driving data and the action driving data of each virtual idol image when executing the computer program in the following manner:
acquiring an interactive text and an interactive action of each virtual idol;
for each of the virtual idol images, performing the following processing:
driving the mouth shape and expression of the virtual idol according to the interactive text of the virtual idol to obtain face driving data of the virtual idol;
and driving the motion of the virtual idol according to the interaction motion of the virtual idol so as to obtain motion driving data of the virtual idol.
In some embodiments, the at least one processor is configured to obtain the interactive text and the interactive action for each of the virtual idol images when executing the computer program in the following manner:
receiving second input information of the user by using the terminal device, wherein the second input information is text information, voice information or image information;
and according to the second input information, acquiring the interactive text and the interactive action of each virtual idol image so as to respond to the second input information.
In some embodiments, the at least one processor is configured to generate an interactive scene image from the background image and the foreground image when executing the computer program in the following manner:
synthesizing the background image and the corresponding foreground image of each frame to generate the interactive scene image of each frame; in each frame of the interaction scene image, each virtual idol image appears at its target position in the background image of the current frame, and interacts with the user through the mouth shape and expression corresponding to its interaction text and the action corresponding to its interaction action.
Referring to fig. 3, fig. 3 is a block diagram of an electronic device 10 according to an embodiment of the present application.
The electronic device 10 may, for example, comprise at least one memory 11, at least one processor 12, and a bus 13 connecting different system components.
The memory 11 may include computer-readable media in the form of volatile memory, such as a random access memory (RAM) 111 and/or a cache memory 112, and may further include a read-only memory (ROM) 113.
The memory 11 also stores a computer program executable by the processor 12 to cause the processor 12 to implement the steps of any of the methods described above.
The memory 11 may also include a utility 114 having at least one program module 115; such program modules 115 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
Accordingly, the processor 12 may execute the computer programs described above, and may also execute the utility 114.
The processor 12 may employ one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), programmable logic devices (PLDs), complex programmable logic devices (CPLDs), field-programmable gate arrays (FPGAs), or other electronic components.
The bus 13 may be one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor bus, or a local bus using any of a variety of bus architectures.
The electronic device 10 may also communicate with one or more external devices (e.g., a keyboard, a pointing device, or a Bluetooth device), with one or more devices that enable a user to interact with the electronic device 10, and/or with any device (e.g., a router or a modem) that enables the electronic device 10 to communicate with one or more other computing devices. Such communication may occur via the input/output interface 14. Also, the electronic device 10 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through the network adapter 15. The network adapter 15 may communicate with other modules of the electronic device 10 via the bus 13. It should be appreciated that, although not shown, other hardware and/or software modules may be used in conjunction with the electronic device 10 in actual applications, including but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, and data backup storage platforms.
(computer-readable storage medium)
The embodiment of the application also provides a computer-readable storage medium, the specific embodiments and technical effects of which are consistent with those described in the method embodiments; repeated details are not described again.
The computer readable storage medium stores a computer program which, when executed by at least one processor, performs the steps of any of the methods or performs the functions of any of the electronic devices described above.
The computer readable medium may be a computer readable signal medium or a computer readable storage medium. In embodiments of the present application, a computer readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic signals, optical signals, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium, other than a computer readable storage medium, that can transmit, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing. Program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the C programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
(computer program product)
The embodiment of the application also provides a computer program product, the specific embodiments and technical effects of which are consistent with those described in the method embodiments; repeated details are not described again.
The present application provides a computer program product comprising a computer program which, when executed by at least one processor, performs the steps of any of the methods or performs the functions of any of the electronic devices described above.
Referring to fig. 4, fig. 4 is a schematic structural diagram of a computer program product according to an embodiment of the present application.
The computer program product is configured to implement the steps of any of the methods described above or the functions of any of the electronic devices described above. The computer program product may employ a portable compact disc read-only memory (CD-ROM) that comprises program code, and may run on a terminal device such as a personal computer. However, the computer program product of the present application is not limited thereto; the computer program product may employ any combination of one or more computer readable media.
The present application has been described above in terms of its purpose, performance, advancement, and novelty, so as to meet the functional and practical requirements emphasized by the patent statutes. The description and drawings are, however, not limited to the preferred embodiments of the application; accordingly, all equivalents and modifications of the constructions, apparatuses, and features of the present application shall fall within its scope of protection.
Claims (14)
1. A virtual idol interaction method, characterized in that the method comprises:
receiving an access request from a terminal device, and establishing communication connection between the terminal device and a target server, wherein the target server is used for providing a virtual idol interaction function;
acquiring a target interaction scene corresponding to a user;
obtaining a background image according to the target interaction scene;
acquiring a foreground image containing one or more virtual idol images;
generating an interactive scene image according to the background image and the foreground image;
and displaying the interaction scene image by using the terminal device so that the user interacts with the one or more virtual idol images in the target interaction scene.
2. The virtual idol interaction method of claim 1, wherein prior to obtaining the target interaction scene corresponding to the user, the method further comprises:
and playing a preset interactive video by using the terminal device, wherein in the preset interactive video, one or more virtual idol images prompt the user to determine the target interaction scene.
3. The virtual idol interaction method of claim 1, wherein the obtaining the target interaction scene corresponding to the user comprises:
displaying a plurality of interaction scenes by using the terminal device, and taking the selected interaction scene as the target interaction scene when a selection operation for one of the interaction scenes is received; or,
and receiving first input information of the user by using the terminal device, and acquiring the target interaction scene corresponding to the user according to the first input information, wherein the first input information is text information, voice information or image information.
4. The virtual idol interaction method of claim 3, wherein the obtaining the target interaction scene corresponding to the user according to the first input information comprises:
according to the first input information, one of a plurality of interaction scenes is determined to be the target interaction scene; or,
acquiring scene configuration information according to the first input information, and configuring the target interaction scene according to the scene configuration information, wherein the scene configuration information comprises at least one of the following: scene type, scene topic, scene time, scene role, and scene item.
5. The virtual idol interaction method of claim 4, wherein the obtaining a background image from the target interaction scene comprises:
and acquiring a preset image corresponding to the target interaction scene as the background image.
6. The virtual idol interaction method of claim 1, wherein the terminal device is an augmented reality device worn by the user;
the obtaining the target interaction scene corresponding to the user comprises the following steps:
displaying a plurality of augmented reality scenes by using the terminal device;
when a selection operation for one of the augmented reality scenes is received, the selected augmented reality scene is taken as the target interaction scene.
7. The virtual idol interaction method of claim 6, wherein the obtaining a background image from the target interaction scene comprises:
when the selected augmented reality scene is a virtual reality scene, acquiring a preset image corresponding to the target interaction scene as the background image;
when the selected augmented reality scene is an augmented reality scene or a mixed reality scene, a real-time image of the current environment is acquired by using a camera of the terminal device as the background image.
8. The virtual idol interaction method of claim 5 or 7, wherein the acquiring the foreground image containing one or more virtual idol images comprises:
acquiring face driving data and action driving data of each virtual idol image, and acquiring a target position of each virtual idol image in each frame of the background image;
and generating each frame of the foreground image which corresponds to each frame of the background image and contains the one or more virtual idol images according to the face driving data and the action driving data of each virtual idol image and the target position of each virtual idol image in each frame of the background image.
9. The virtual idol interaction method of claim 8, wherein the obtaining face driving data and action driving data of each virtual idol comprises:
acquiring an interactive text and an interactive action of each virtual idol;
for each of the virtual idol images, performing the following processing:
driving the mouth shape and expression of the virtual idol according to the interactive text of the virtual idol to obtain face driving data of the virtual idol;
and driving the motion of the virtual idol according to the interaction motion of the virtual idol so as to obtain motion driving data of the virtual idol.
10. The virtual idol interaction method of claim 9, wherein the obtaining the interaction text and the interaction action of each virtual idol comprises:
receiving second input information of the user by using the terminal device, wherein the second input information is text information, voice information or image information;
and according to the second input information, acquiring the interactive text and the interactive action of each virtual idol image so as to respond to the second input information.
11. The virtual idol interaction method of claim 9, wherein the generating an interaction scene image from the background image and the foreground image comprises:
synthesizing the background image and the corresponding foreground image of each frame to generate the interactive scene image of each frame; in each frame of the interaction scene image, each virtual idol image appears at its target position in the background image of the current frame, and interacts with the user through the mouth shape and expression corresponding to its interaction text and the action corresponding to its interaction action.
12. An electronic device comprising a memory and at least one processor, the memory storing a computer program, the at least one processor being configured to implement the following steps when executing the computer program:
receiving an access request from a terminal device, and establishing communication connection between the terminal device and a target server, wherein the target server is used for providing a virtual idol interaction function;
acquiring a target interaction scene corresponding to a user;
obtaining a background image according to the target interaction scene;
acquiring a foreground image containing one or more virtual idol images;
generating an interactive scene image according to the background image and the foreground image;
and displaying the interaction scene image by using the terminal device so that the user interacts with the one or more virtual idol images in the target interaction scene.
13. A computer-readable storage medium, characterized in that it stores a computer program which, when executed by at least one processor, implements the steps of the method of any one of claims 1-11 or the functions of the electronic device of claim 12.
14. A computer program product, characterized in that it comprises a computer program which, when executed by at least one processor, implements the steps of the method of any one of claims 1-11 or the functions of the electronic device of claim 12.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310580142.9A CN116634236A (en) | 2023-05-22 | 2023-05-22 | Virtual idol interaction method, electronic device, storage medium and program product |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310580142.9A CN116634236A (en) | 2023-05-22 | 2023-05-22 | Virtual idol interaction method, electronic device, storage medium and program product |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116634236A true CN116634236A (en) | 2023-08-22 |
Family
ID=87636001
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310580142.9A Pending CN116634236A (en) | 2023-05-22 | 2023-05-22 | Virtual idol interaction method, electronic device, storage medium and program product |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116634236A (en) |
- 2023-05-22: application CN202310580142.9A filed in China; published as CN116634236A (status: pending)
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8963916B2 (en) | Coherent presentation of multiple reality and interaction models | |
KR102270699B1 (en) | System and method for augmented and virtual reality | |
US20130249947A1 (en) | Communication using augmented reality | |
US20130238778A1 (en) | Self-architecting/self-adaptive model | |
CN114787759A (en) | Communication support program, communication support method, communication support system, terminal device, and non-language expression program | |
Harbison | Performing image | |
US20230298290A1 (en) | Social interaction method and apparatus, device, storage medium, and program product | |
CN116954437A (en) | Information interaction processing method, device, equipment and computer storage medium | |
CN116668733A (en) | Virtual anchor live broadcast system and method and related device | |
Qadri et al. | Virtual tourism using Samsung gear VR headset | |
CN116634236A (en) | Virtual idol interaction method, electronic device, storage medium and program product | |
Zhang et al. | A study of integration application based on 5G/8K/AI/VR for the activation of intangible cultural heritage | |
Tornatzky et al. | An Artistic Approach to Virtual Reality | |
Moon et al. | Mixed-reality art as shared experience for cross-device users: Materialize, understand, and explore | |
Ma et al. | Embodied Cognition Guides Virtual-Real Interaction Design to Help Yicheng Flower Drum Intangible Cultural Heritage Dissemination | |
Pavlik | Experiencing cinematic VR: where theory and practice converge in the Tribeca film festival cinema360 | |
Liu et al. | Science museum mixed reality digital media exhibitions for children | |
Xin et al. | AR Interaction Design Mode of Multi-user and Multi-character in Theme Parks | |
US20240273613A1 (en) | Browsing-based augmented reality try-on experience | |
Conroy | Trans Cyborg Theatre: Digital Technology & Media in Performance | |
Li | Research on the Forms of Expression and Application of Interactive Narrative Exhibition Design of Museums in Guangdong | |
Delioglanis | Locative Media and Narrative in North American Literature and Culture | |
Guy et al. | LO: TECH: POP: CULT: Screendance Remixed | |
Velaphi | The South African voice in South African animation: a critical examination, via the case study approach, of the South African animation industry and its commitment to representing a local identity. | |
Yang | RMB City: A Garden of Heterotopias Beyond the Spectacle of Political Dichotomy |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |