CN115686194A - Method, system and device for real-time visualization and interaction of virtual images

Info

Publication number: CN115686194A
Application number: CN202211105697.XA
Authority: CN (China)
Prior art keywords: virtual, terminal, user, protocol, real
Legal status: Pending
Other languages: Chinese (zh)
Inventor: 张俊卿
Current assignee: Individual
Original assignee: Individual
Priority/filing date: 2022-09-09
Publication date: 2023-02-03
Classification: Processing Or Creating Images (AREA)

Abstract

The invention provides a method, a system and a terminal for real-time visualization of and interaction with virtual images, relating to the technical field of terminal-based virtual reality. The invention also discloses a virtual image display method and a virtual image display device.

Description

Method, system and device for real-time visualization and interaction of virtual images
Technical Field
The invention relates to the technical field of virtual reality, and in particular to a method, a system and a device for real-time visualization of and interaction with virtual images.
Background
The Metaverse is a virtual world, created and linked by technological means, that maps onto and interacts with the real world; it is a digital living space carrying a novel social system.
Mixed reality (MR), which encompasses both augmented reality and augmented virtuality, refers to a new visualization environment created by merging the real and virtual worlds. In this environment, physical and digital objects coexist and interact in real time. Such a system typically has three main features: 1. it combines the virtual with the real; 2. it performs registration in a virtual three-dimensional environment (3D registration); 3. it runs in real time and supports interaction.
Current virtual technologies are interactive only within fixed environments, so their range of application is limited. Present-day holographic projection usually requires a whole series of devices. For example, the invention patent with publication number CN114632324A, entitled "An immersive space virtual setup system and method", discloses a complete set of equipment for a virtual stage and virtual projection, including a virtual stage, scene production, multi-screen combination, thermal imaging, display modules and other units. Such equipment is not only complex and costly but also far from widespread, and its single application scenario cannot meet the diverse scene requirements of today's users.
Disclosure of Invention
In order to overcome the problems in the related art at least to some extent, the present application provides a method, a system and a device for real-time visualization and interaction of virtual images, so as to solve technical problems in the prior art such as the single application scenario of virtual equipment, the inability to meet diverse scene requirements, complex equipment and high manufacturing cost.
In order to achieve this purpose, the present application adopts the following technical solutions:
In a first aspect,
the present application provides a system for real-time visualization and interaction of virtual images, comprising:
a motion capture unit, used for capturing the user's real image, motion, expression and mouth shape through a camera device;
a display unit, used for transmitting the virtual image to the user's visual system;
a sound processing unit, used for converting sound in real time and presenting the collected sound to the user in multiple layers after processing;
a storage unit, used for storing the acquired virtual image data;
a data delivery unit, used for outputting the virtual image data to the corresponding environment;
and a perception unit, used for enabling the user to perceive environmental elements in the virtual environment.
Further, the system is applied to terminal devices, and the terminal devices comprise: a portable protocol terminal that can be carried on the person, a virtual content creation terminal and a cloud data storage platform;
the portable protocol terminal is used for viewing and receiving virtual images, sound information and other data output by protocol-related devices, and is also used for carrying the user's own avatar and voice-change audio matching data for output;
the protocol terminal is used for storing and outputting the virtual image, sound matching, action and joint matching, environment, atmosphere and other virtual data of the corresponding subject;
the virtual content creation terminal is used by the user to create the virtual output content of various subjects, such as the user himself or a building;
the cloud data storage platform is a cloud-based server used for storing the various data related to users and devices, for use by the devices the users employ.
Furthermore, the portable protocol terminal takes the form of glasses, lenses or contact lenses;
the virtual image output by the portable protocol terminal comprises: a person, an animal, an object or an environment;
the visual result received by a user wearing the terminal device is the real world combined with virtual content, and what can be viewed depends on whether the other party's protocol device is switched on and whether the other party agrees to display their avatar.
Furthermore, the protocol terminal may take any form, such as glasses, a brooch, a tie clip, a belt or a pendant, while a virtual output device applied to a built environment may take the form of a polygonal three-dimensional terminal;
the protocol terminal also provides output, read and input functions.
Further, the virtual content creation terminal is operating software based on Linux, macOS, Windows or other operating systems;
creating the virtual content of a person comprises: character, portrait, skin tone, hairstyle, height, sound matching, body joint motion matching, clothing or accessories;
creating the virtual content of an environment comprises: the virtual style of the building's overall structure, the virtual style of the building facade, the virtual style of the indoor structure, the virtual style of indoor decoration and furnishings, or the association of lighting and atmosphere display equipment;
the virtual content can be created from scratch in the terminal, or the output of third-party software can be imported for secondary editing and matching; the corresponding third-party software includes but is not limited to: Unreal Engine's MetaHuman, Maya, C4D, 3ds Max, Dimension and Rhino.
Further, the virtual content creation terminal works together with atmosphere display equipment to present a corresponding atmosphere, including but not limited to simulating cold air or rain with spray or water, and simulating natural wind with an airflow device;
the data created by the virtual content creation terminal can be stored simultaneously on the cloud data storage platform and the corresponding protocol terminal device, so that, depending on the state of the protocol terminal, a reading device can conveniently read the corresponding data from either the cloud data storage platform or the protocol terminal device.
In a second aspect,
the application provides a method for real-time visualization and interaction of virtual images, which comprises the following steps:
when users carrying the portable protocol terminals meet, the devices establish communication, provided that the portable protocol terminals running the same protocol are all switched on;
each user's system acquires the other party's virtual image and sound information from the terminal that user wears;
the virtual image and virtual sound information are the result of matching the user's real motion and vocalization with the three-dimensional avatar model and voice-change profile created in a terminal on which the creation environment is installed;
the creation and presentation of sound carry coordinates and directions, so that a user carrying the portable protocol terminal device who is inside a virtual environment can perceive sound from different directions according to those coordinates;
after acquiring the virtual image of the opposite user, the portable protocol terminal captures the real image of the opposite user with the activated motion capture unit, synchronously captures the opposite user's motion, expression and mouth shape, and renders the captured motion onto the virtual image in real time; at the same time, the opposite user's real vocalization is turned into corresponding virtual sound information through voice-change matching, so that the virtual image and the virtual sound track the real image synchronously;
when a user carrying the portable protocol terminal enters a room or a street equipped with protocol terminals, where several protocol terminals have been placed at selected corner nodes of the room, or at nodes and multiple angles of the street and buildings, the user's visual system receives the virtually constructed virtual image, and the auditory system receives the virtual audio information output by the protocol terminals;
the protocol terminal is used for creating the virtual environment and supports networking of multiple devices.
Further, motion capture can be implemented with a human body capture algorithm: by establishing a matching relationship between the user's facial expressions and body movements and the avatar, the avatar can be driven, quickly and synchronously, to perform the same movements and expressions as the user, based on that matching relationship.
Further, the virtual image and sound information are the three-dimensional avatar model and voice-change matching created by the user in a terminal on which the creation environment is installed; specifically, the user creates an avatar with a virtual subject creation program, and can define its character, shape, hairstyle, skin color, accessories, expressions and voice as needed.
In a third aspect,
the application provides a terminal device, including:
the system for real-time visualization and interaction of the virtual images is provided.
By adopting the above technical solutions, the present application has at least the following beneficial effects:
In practical applications, elements such as virtual images, sounds, environments and weather are designed in the terminal and stored to the data output device, and these devices can be used independently or combined into a network. When used independently, a device can take a fairly portable form, store the carrier's avatar and voice-change matching data, and be carried on the person; when the device is switched on, protocol reading devices within range receive the data it sends and then, through a device similar to AR glasses, view the avatar projected by the carrier's device. When the data output devices are used in combination, they are deployed at the edge nodes of the environment range to be rendered, and the corresponding virtual environment information is stored on them. With this technical solution, the virtual image is projected in real time after data transmission and compilation, so that real-time interaction between users and with the environment becomes possible; the system of a user carrying the display device hides that user's real image by default, the environment creation system can set whether the real image of the environment is hidden, and a user carrying the display device can also choose between them through settings.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a system for real-time visualization and interaction of virtual imagery in accordance with an exemplary embodiment;
FIG. 2 is a reference diagram illustrating the placement of a virtual image in a room environment in accordance with an exemplary embodiment;
FIG. 3 is a diagram illustrating a reference view of a virtual image being dropped in a street environment, in accordance with an illustrative embodiment;
In the figures, reference numerals 1, 2 and 3 denote atmosphere presentation devices.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be described in detail below. It should be apparent that the described embodiments are only a few embodiments of the present application, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the examples given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to fig. 1, fig. 1 is a schematic diagram of a system for real-time visualization and interaction of virtual images according to an exemplary embodiment. As shown in fig. 1, the system includes:
a motion capture unit, used for capturing the user's real image, motion, expression and mouth shape through a camera device, a sensing device or other devices;
a display unit, used for transmitting the virtual images and virtual sounds to the user's visual and auditory systems;
a sound processing unit, used for converting sound in real time and presenting the collected sound to the user in multiple layers after processing;
a storage unit, used for storing the acquired virtual image data;
a data delivery unit, used for outputting the virtual image data to the corresponding environment;
and a perception unit, used for enabling the user to perceive environmental elements in the virtual environment.
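Purely as an illustration of how these units could be composed in software, the following Python sketch wires the six units above into one system object. All class, method and field names are hypothetical; the patent specifies only the units' responsibilities, not an implementation.

```python
from dataclasses import dataclass, field

@dataclass
class CapturedFrame:
    """One sample of a user's real image (hypothetical structure)."""
    image: bytes
    motion: dict        # joint name -> rotation
    expression: dict    # blendshape name -> weight
    mouth_shape: dict   # viseme name -> weight

class MotionCaptureUnit:
    """Captures the real image, motion, expression and mouth shape."""
    def capture(self, camera) -> CapturedFrame:
        return camera.read_frame()  # camera device, sensor, etc.

class SoundProcessingUnit:
    """Converts sound in real time; layering happens at presentation."""
    def convert(self, raw_audio: bytes) -> bytes:
        return raw_audio  # placeholder for a voice-change pipeline

class StorageUnit:
    """Stores acquired virtual image data keyed by subject ID."""
    def __init__(self) -> None:
        self.avatars: dict[str, bytes] = {}

class DataDeliveryUnit:
    """Outputs virtual image data to the corresponding environment."""
    def deliver(self, avatar_data: bytes, environment_id: str) -> None:
        ...

class DisplayUnit:
    """Transmits virtual images and sounds to the wearer."""
    def show(self, avatar_frame: CapturedFrame, audio: bytes) -> None:
        ...

class PerceptionUnit:
    """Lets the user perceive environmental elements (wind, mist, ...)."""
    def trigger(self, element: str) -> None:
        ...

@dataclass
class VirtualImageSystem:
    capture: MotionCaptureUnit = field(default_factory=MotionCaptureUnit)
    sound: SoundProcessingUnit = field(default_factory=SoundProcessingUnit)
    storage: StorageUnit = field(default_factory=StorageUnit)
    delivery: DataDeliveryUnit = field(default_factory=DataDeliveryUnit)
    display: DisplayUnit = field(default_factory=DisplayUnit)
    perception: PerceptionUnit = field(default_factory=PerceptionUnit)
```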
In one embodiment, the system is applied to terminal devices, and the terminal devices comprise: a portable protocol terminal that can be carried on the person, a virtual content creation terminal and a cloud data storage platform;
the portable protocol terminal is used for viewing and receiving virtual images, sound information and other data output by protocol-related devices, and is also used for carrying the user's own avatar and voice-change audio matching data for output;
the protocol terminal is used for storing and outputting the virtual image, sound matching, action and joint matching, environment, atmosphere and other virtual data of the corresponding subject;
the virtual content creation terminal is used by the user to create the virtual output content of various subjects, such as the user himself or a building;
the cloud data storage platform is a cloud-based server used for storing the various data related to users and devices, for use by the devices the users employ.
Specifically, the portable protocol terminal generally takes the form of glasses or lenses, and in the future may be a highly integrated contact lens or a miniature near-eye projection system. Its main function is to view and receive virtual images, sound information and other data output by protocol-related devices, including: people, objects, environments and so on. At the same time, it can also carry the user's own avatar and related data for output. The visual result received when wearing the device is the combination of the real world with virtual content, and what can be viewed depends on whether the other party's protocol device is switched on and whether the other party agrees to display their avatar.
The protocol terminal may take any form. A form carried on the person can be very small, such as glasses, a brooch, a tie clip, a belt or a pendant, while a virtual output device applied to a built environment may be quite bulky, such as a camera or another large device. Devices of this kind are mainly used for storing, outputting and controlling the virtual data of the corresponding subjects: virtual images, sound matching, action and joint matching, environment, atmosphere and so on.
The virtual content creation terminal is operating software based on Linux, macOS, Windows or other operating systems, with which the user creates the virtual output content of various subjects such as the user himself or a building. Creating the virtual content of a person comprises: character, portrait, skin tone, hairstyle, height, sound matching, body joint motion matching, clothing, accessories and the like. Creating the virtual content of an environment comprises: the virtual style of the building's overall structure, the virtual style of the building facade, the virtual style of the indoor structure, the virtual style of indoor decoration and furnishings, the association of lighting and atmosphere display equipment, and the like. The virtual content can be created from scratch in the terminal, or the output of third-party software can be imported for secondary editing and matching. The corresponding third-party software may be, but is not limited to, 3D packages such as Unreal Engine's MetaHuman, Maya, C4D, 3ds Max, Dimension and Rhino.
The cloud data storage platform is a cloud-based server for storing the various data related to users and devices, for use by the devices the users employ.
By adopting the above technical solution, the virtual environment can be combined with the real scene, so that virtual images are visualized and interactive in real time; the solution applies to a wide variety of scenarios and meets users' needs.
In one embodiment, the virtual content creation terminal works together with atmosphere display equipment to present a corresponding atmosphere, including but not limited to simulating cold air or rain with spray or water, and simulating natural wind with an airflow device;
the data created by the virtual content creation terminal can be stored simultaneously on the cloud data storage platform and the corresponding protocol terminal device, so that, depending on the state of the protocol terminal device, a reading device can conveniently read the corresponding data from either the cloud data storage platform or the protocol terminal device.
In one embodiment, a method for real-time visualization and interaction of virtual images includes:
when users carrying the portable protocol terminals meet, the devices establish communication, provided that the portable protocol terminals running the same protocol are all switched on;
each user's system acquires the other party's virtual image and sound information from the terminal that user wears;
the virtual image and virtual sound information are the result of matching the user's real motion and vocalization with the three-dimensional avatar model and voice-change profile created in a terminal on which the creation environment is installed;
the creation and presentation of sound carry coordinates and directions, so that a user carrying the portable protocol terminal device who is inside a virtual environment can perceive sound from different directions according to those coordinates;
after acquiring the virtual image of the opposite user, the portable protocol terminal captures the real image of the opposite user with the activated motion capture unit, synchronously captures the opposite user's motion, expression and mouth shape, and renders the captured motion onto the virtual image in real time; at the same time, the opposite user's real vocalization is turned into corresponding virtual sound information through voice-change matching, so that the virtual image and the virtual sound track the real image synchronously;
when a user carrying the portable protocol terminal enters a room or a street equipped with protocol terminals, where several protocol terminals have been placed at selected corner nodes of the room, or at nodes and multiple angles of the street and buildings, the user's visual system receives the virtually constructed virtual image, and at the same time the auditory system receives the virtual audio information output by the protocol terminals;
the protocol terminal is used for creating the virtual environment and supports networking of multiple devices.
Specifically, when users carrying portable protocol terminals meet, the devices establish communication provided that the terminals running the same protocol are all switched on, and each user's system acquires the other party's virtual image and sound information from the terminal that user wears. The virtual image and sound information are the three-dimensional avatar model created by the user in a terminal on which the creation environment is installed, matched with a voice-change profile; for example, a virtual subject creation program is used to create the user's own avatar, whose character, shape, hairstyle, skin color, accessories, expressions, voice and so on can be defined as needed. A minimal sketch of this meeting handshake is given below.
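The sketch assumes a hypothetical UDP broadcast discovery scheme; the protocol identifier, port and message layout are all invented here, since the patent does not specify a wire format:

```python
import json
import socket

PROTOCOL_ID = "avatar/1.0"   # hypothetical protocol identifier
DISCOVERY_PORT = 50505       # hypothetical port

def announce(user_id: str) -> None:
    """A switched-on terminal announces itself so peers in range can hear it."""
    msg = json.dumps({"protocol": PROTOCOL_ID, "user": user_id}).encode()
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(msg, ("255.255.255.255", DISCOVERY_PORT))

def on_announcement(payload: bytes, my_profile: dict) -> dict | None:
    """Reply with avatar data only if the peer runs the same protocol and
    the owner has consented to display their avatar."""
    peer = json.loads(payload)
    if peer.get("protocol") != PROTOCOL_ID:
        return None   # different protocol: no communication is established
    if not my_profile.get("consent_to_display", False):
        return None   # owner declined to show their avatar
    return {"avatar_model": my_profile["model"],     # 3D avatar model
            "voice_profile": my_profile["voice"]}    # voice-change matching
```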
After acquiring the virtual image of the opposite user, the portable protocol terminal captures the real image of the opposite user with the activated motion capture unit, synchronously capturing the opposite user's motion, expression, mouth shape and so on; the captured motion, expression and mouth shape are rendered onto the virtual image in real time, so that the virtual image tracks the real image synchronously.
Referring to fig. 2 and 3, fig. 2 is a reference diagram of a virtual image deployed in a room environment, and fig. 3 is a reference diagram of a virtual image deployed in a street environment.
As shown in fig. 2, when a user carrying a portable protocol terminal enters a room equipped with protocol devices, protocol devices placed at each corner node in the room project the real image of the room as a virtual image. The room is further provided with atmosphere presentation devices 1, 2 and 3 for simulating an ambient atmosphere, such as rain, spray or natural wind.
As shown in fig. 3, when a user carrying a portable protocol terminal enters a street equipped with protocol devices, protocol terminals placed at each node of the street and buildings and at multiple angles, such as corners along the street and the tops of buildings, project the real image of the street as a virtual image; a user who enters the street carrying a device with the protocol can then see the virtual image joined to the real image of the street.
The virtual scene development in this part can be realized with Unreal Engine. The engine has various built-in mesh and editing tools, such as Animation Blueprints for quickly creating and controlling complex motion behavior, and plugins such as Live Link that enable real-time data streaming into the engine from outside: character animation, cameras, lights and other data streamed from DCC tools such as Maya or MotionBuilder, or from motion capture or performance capture systems, including the ARKit face tracking system, which captures facial performances with a smart terminal. Live Link is designed to be extensible through Unreal plugins, enabling third parties to add support for new sources. As a portrait creation tool, Unreal Engine's MetaHuman, currently one of the better virtual portrait creation tools, can be used; of course, a portrait can also be created in 3D software such as Maya or C4D and then imported into the terminal device of the present application for expression and action association.
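Live Link itself is configured inside Unreal Engine; purely to illustrate the kind of per-frame data stream involved, here is a hypothetical capture-side sender. The JSON layout is invented for this sketch and is not the Live Link wire format:

```python
import json
import socket
import time

def stream_capture(get_pose, host: str = "127.0.0.1", port: int = 54321,
                   fps: float = 60.0) -> None:
    """Push one pose sample per frame to an engine-side receiver.

    get_pose() is assumed to return a dict such as
    {"joints": {...}, "blendshapes": {...}} from the capture system.
    """
    interval = 1.0 / fps
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        while True:
            sample = {"t": time.time(), **get_pose()}
            s.sendto(json.dumps(sample).encode(), (host, port))
            time.sleep(interval)
```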
The ARKit face tracking system mentioned above provides techniques for modifying the image of an object in video, for example to correct lens distortion or beautify a face. These techniques include extracting and verifying features of an object from source video frames, tracking those features over time, estimating the object's pose, modifying a 3D model of the object based on the features, and rendering the modified video frames from the modified 3D model and the modified intrinsic and extrinsic camera matrices. These techniques can be applied in real time to the 3D modeling and rendering stages of a holographic projection process.
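ARKit's internals are not public in this form; as a sketch only, the per-frame loop just described can be organized as below, with the detector, model fitter and renderer injected as placeholders:

```python
from dataclasses import dataclass

@dataclass
class Camera:
    intrinsics: list  # 3x3 camera matrix (placeholder)
    extrinsics: list  # 4x4 world-to-camera transform (placeholder)

def smooth(prev, new, alpha: float = 0.5):
    """Track features over time with simple exponential smoothing."""
    if prev is None:
        return new
    return [(1 - alpha) * p + alpha * n for p, n in zip(prev, new)]

def face_pipeline(frames, detect, fit_model, render, camera: Camera):
    """Extract/verify features, track them, fit the 3D model, render."""
    tracked = None
    for frame in frames:
        landmarks = detect(frame)          # extract features from the frame
        if landmarks is None:
            continue                       # verification failed: skip frame
        tracked = smooth(tracked, landmarks)
        mesh = fit_model(tracked)          # modify 3D model to fit features
        yield render(mesh, camera.intrinsics, camera.extrinsics)
```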
When a user enters a street, the protocol terminal can capture people on the street and scan other protocol devices within range through the protocol. When a corresponding user ID can be tracked by the protocol terminal, the three-dimensional avatar and other data of that user can be obtained through the protocol terminal or the cloud service platform. Through the ARKit face tracking system, the user can also receive, face to face, the facial expressions, body movements and voice-changed audio of other users carrying protocol devices.
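A sketch of that ID-to-avatar resolution step, under the assumption (consistent with the embodiments above) that the data may live either on the subject's protocol terminal or on the cloud platform; the `terminal` and `cloud` interfaces are hypothetical:

```python
def fetch_avatar(user_id: str, terminal, cloud) -> dict | None:
    """Resolve a tracked user ID to avatar data, preferring the subject's
    own protocol terminal and falling back to the cloud platform."""
    if terminal is not None and terminal.is_online():
        data = terminal.read_avatar(user_id)   # read directly from the device
        if data is not None:
            return data
    return cloud.get_avatar(user_id)           # cloud service lookup
```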
The creation of the virtual image of a street may be processed differently by the protocol terminal for a sunlit daytime street and a lamplit night street. The differences may include: creating realistic indoor and outdoor lighting effects while maintaining real-time performance, using lighting tools including atmospheric and sky environments, volumetric fog, volumetric lightmaps, precomputed lighting scenarios and the like.
In the street scene there are sound simulation devices for sounds such as human voices and traffic. In this case, while simulating the virtual image, its possible sounds must also be simulated, including real-time synthesis of human voices, physical audio propagation modeling, concurrent multi-layer sounds and so on, so that sound is delivered in sync with the virtual image and the combined effect of image and scene is more realistic.
In this scenario, the sound simulation apparatus of the protocol terminal may be integrated in the terminal and provide real-time conversion of sound; alternatively, it may be only a processing unit that, after a certain degree of processing, presents the picked-up sound to the user in multiple layers.
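As an illustration of directional, multi-layer playback (a simplified stand-in, not the patent's actual audio engine), the sketch below mixes concurrent positioned sources into one stereo frame using inverse-distance attenuation and simple panning:

```python
import math

def mix_layers(sources, listener_pos, listener_dir=(1.0, 0.0)):
    """Mix concurrent sound layers into one stereo frame.

    Each source is ((x, y) position, sample in [-1, 1]); attenuation is
    inverse-distance and panning follows the direction to the source.
    """
    left = right = 0.0
    for (sx, sy), sample in sources:
        dx, dy = sx - listener_pos[0], sy - listener_pos[1]
        dist = math.hypot(dx, dy)
        gain = 1.0 / (1.0 + dist)      # inverse-distance attenuation
        # Signed angle between listener facing and direction to the source
        angle = math.atan2(dy, dx) - math.atan2(listener_dir[1], listener_dir[0])
        pan = math.sin(angle)          # -1 and +1 weight the two channels
        left += sample * gain * (1.0 - pan) / 2.0
        right += sample * gain * (1.0 + pan) / 2.0
    # Clamp to the valid sample range after summing all layers
    return max(-1.0, min(1.0, left)), max(-1.0, min(1.0, right))
```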
In one embodiment, motion capture may use a human body capture algorithm. By establishing a matching relationship between the user's facial expressions and body movements and the avatar, the avatar is driven, quickly and synchronously, to perform the same movements and expressions as the user, based on that matching relationship.
Specifically, during motion capture the human body capture algorithm accurately captures the user's facial expressions and body movements; with the matching relationship established between them and the avatar, the avatar is driven quickly and in sync whenever the user moves or changes expression. The synchronously driven avatar is transmitted to the opposite user's visual system through the display unit of the portable terminal, so that that visual system receives the user's avatar.
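A minimal sketch of such a matching relationship, assuming the capture algorithm outputs named blendshape weights and joint rotations and the avatar exposes corresponding controls; every name in the two mapping tables is hypothetical:

```python
# Matching relationship: capture-side channel name -> avatar-side control name.
FACE_MAP = {"jawOpen": "avatar_mouth_open", "browUp": "avatar_brow_raise"}
BODY_MAP = {"left_elbow": "avatar_l_elbow", "right_knee": "avatar_r_knee"}

def drive_avatar(capture_frame: dict, avatar) -> None:
    """Apply one captured frame to the avatar through the matching tables,
    so the avatar repeats the user's expression and motion in sync."""
    for channel, weight in capture_frame.get("blendshapes", {}).items():
        target = FACE_MAP.get(channel)
        if target is not None:
            avatar.set_blendshape(target, weight)
    for joint, rotation in capture_frame.get("joints", {}).items():
        target = BODY_MAP.get(joint)
        if target is not None:
            avatar.set_joint_rotation(target, rotation)
```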
In one embodiment, the virtual image and sound information are the three-dimensional avatar model and voice-change matching created by the user in a terminal on which the creation environment is installed; specifically, the avatar is created with a virtual subject creation program or another tool, and its character, shape, hairstyle, skin color, accessories, expressions and voice can be defined as needed.
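As an illustration only, the created avatar and its voice-change matching could be stored as a record like the following; the field names and defaults are invented to mirror the attributes listed above:

```python
from dataclasses import dataclass, field

@dataclass
class AvatarProfile:
    """Three-dimensional avatar model plus voice-change matching, as created
    in the virtual content creation terminal (hypothetical schema)."""
    character: str = "human"
    shape: str = "default"
    hairstyle: str = "short"
    skin_color: str = "#e0b89a"
    accessories: list[str] = field(default_factory=list)
    expressions: dict[str, float] = field(default_factory=dict)  # blendshapes
    voice_profile: str = "neutral"   # voice-change matching preset
    model_uri: str = ""              # where the 3D model asset is stored
```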
The method, system and terminal device for real-time visualization and interaction of virtual images capture the user's real image, motion, expression, mouth shape and so on, transmit the virtual image to the user's visual system, convert sound in real time, present the collected sound to the user in multiple layers after processing, store the acquired virtual image data and output it to the corresponding environment. The devices adopted by the invention can be used independently or in combination, can read the corresponding virtual environment data and visually inspect the corresponding content of the virtual environment, so that virtual images enable real-time interaction between people, or between people and objects; the solution can be applied in many scenarios, meets users' varied needs, and facilitates the popularization of holographic projection and other virtual image technologies.
The specific manner in which each unit performs operations has been described in detail in the above embodiments of the method, and will not be described in detail herein.
It is understood that the same or similar parts in the above embodiments may be mutually referred to, and the same or similar contents in other embodiments may be referred to for the contents which are not described in detail in some embodiments.
It should be noted that, in the description of the present application, the terms "first", "second", etc. are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. In addition, in the description of the present application, the meaning of "plurality" means at least two unless otherwise specified.
It will be understood that when an element is referred to as being "secured to" or "disposed on" another element, it can be directly on the other element or intervening elements may also be present; when an element is referred to as being "connected" to another element, it can be directly connected to the other element or intervening elements may also be present, and further, as used herein, connected may include wirelessly connected; the term "and/or" is used to include any and all combinations of one or more of the associated listed items.
Any process or method description in the flowcharts, or otherwise described herein, may be understood as representing a module, segment or portion of code that includes one or more executable instructions for implementing specific logical functions or steps of the process; and the scope of the preferred embodiments of the present application includes other implementations, in which functions may be executed out of the order shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those skilled in the art of the embodiments of the present application.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried out in the method of implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and the program, when executed, includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims (10)

1. A system for real-time visualization and interaction of virtual images, the system comprising:
a motion capture unit, used for capturing the user's real image, motion, expression and mouth shape through a camera device;
a display unit, used for transmitting the virtual images and virtual sounds to the user's visual and auditory systems;
a sound processing unit, used for converting sound in real time and presenting the collected sound to the user in multiple layers after processing;
a storage unit, used for storing the acquired virtual image data;
a data delivery unit, used for outputting the virtual image data to the corresponding environment;
and a perception unit, used for enabling the user to perceive environmental elements in the virtual environment.
2. The system according to claim 1, wherein the system is applied to terminal devices, and the terminal devices comprise: a portable protocol terminal that can be carried on the person, a virtual content creation terminal and a cloud data storage platform;
the portable protocol terminal is used for viewing and receiving virtual images, sound information and other data output by protocol-related devices, and is also used for carrying the user's own avatar and voice-change audio matching data for output;
the protocol terminal is used for storing and outputting the virtual image, sound matching, action and joint matching, environment, atmosphere and other virtual data of the corresponding subject;
the virtual content creation terminal is used by the user to create the virtual output content of various subjects, such as the user himself or a building;
the cloud data storage platform is a cloud-based server used for storing the various data related to users and devices, for use by the devices the users employ.
3. The system for real-time visualization and interaction of virtual images according to claim 2, wherein the portable protocol terminal takes the form of glasses, lenses or contact lenses;
the virtual image output by the portable protocol terminal comprises: a person, an animal, an object or an environment;
the visual result received by a user wearing the terminal device is the real world combined with virtual content, and what can be viewed depends on whether the other party's protocol device is switched on and whether the other party agrees to display their avatar.
4. The system for real-time visualization and interaction of virtual images according to claim 2, wherein the protocol terminal may take any form, such as glasses, a brooch, a tie clip, a belt or a pendant, while a virtual output device applied to a built environment may take the form of a polygonal three-dimensional terminal;
the protocol terminal also provides output, read and input functions.
5. The system for real-time visualization and interaction of virtual images according to claim 2, wherein the virtual content creation terminal is operating software based on Linux, macOS, Windows or other operating systems;
creating the virtual content of a person comprises: character, portrait, skin tone, hairstyle, height, sound matching, body joint motion matching, clothing or accessories;
creating the virtual content of an environment comprises: the virtual style of the building's overall structure, the virtual style of the building facade, the virtual style of the indoor structure, the virtual style of indoor decoration and furnishings, or the association of lighting and atmosphere display equipment;
the virtual content can be created from scratch in the terminal, or the output of third-party software can be imported for secondary editing and matching; the corresponding third-party software includes but is not limited to: Unreal Engine's MetaHuman, Maya, C4D, 3ds Max, Dimension and Rhino.
6. The system for real-time visualization and interaction of virtual images according to claim 5, wherein the virtual content creation terminal works together with atmosphere display equipment to present a corresponding atmosphere, including but not limited to simulating cold air or rain with spray or water, and simulating natural wind with an airflow device;
the data created by the virtual content creation terminal can be stored simultaneously on the cloud data storage platform and the corresponding protocol terminal device, so that, depending on the state of the protocol terminal, a reading device can conveniently read the corresponding data from either the cloud data storage platform or the protocol terminal device.
7. A method for real-time visualization and interaction of virtual images, applied to the system of any one of claims 1 to 6, comprising:
when users carrying the portable protocol terminals meet, the devices establish communication, provided that the portable protocol terminals running the same protocol are all switched on;
each user's system acquires the other party's virtual image and sound information from the terminal that user wears;
the virtual image and sound information are the three-dimensional avatar model and voice-change profile created by the user in a terminal on which the creation environment is installed;
the creation and presentation of sound carry coordinates and directions, so that a user carrying the portable protocol terminal device who is inside a virtual environment can perceive sound from different directions according to those coordinates;
after acquiring the virtual image of the opposite user, the portable protocol terminal captures the real image of the opposite user with the activated motion capture unit, synchronously captures the opposite user's motion, expression and mouth shape, and renders the captured motion onto the virtual image in real time, so that the virtual image tracks the real image synchronously;
when a user carrying the portable protocol terminal enters a room or a street equipped with protocol terminals, where several protocol terminals have been placed at selected corner nodes of the room, or at nodes and multiple angles of the street and buildings, the user's visual system receives the virtually constructed virtual image, and at the same time the auditory system receives the virtual audio information output by the protocol terminals;
the protocol terminal is used for creating the virtual environment and supports networking of multiple devices.
8. The method according to claim 7, wherein the motion capture is performed with a human body capture algorithm, and by establishing a matching relationship between the user's facial expressions and body movements and the avatar, the avatar is driven, quickly and synchronously, to perform the same movements and expressions as the user, based on that matching relationship.
9. The method according to claim 7, wherein the virtual image and sound information are the three-dimensional avatar model and voice-change matching created by the user in a terminal on which the creation environment is installed, specifically comprising: the avatar is created with a virtual subject creation program, and its character, shape, hairstyle, skin color, accessories, expressions and voice can be defined as needed.
10. A terminal device, comprising:
the system for real-time visualization and interaction of virtual images according to any one of claims 1 to 6.
CN202211105697.XA 2022-09-09 Method, system and device for real-time visualization and interaction of virtual images — Pending — CN115686194A (en)

Priority Applications (1)

Application Number: CN202211105697.XA · Priority/Filing Date: 2022-09-09 · Title: Method, system and device for real-time visualization and interaction of virtual images

Publications (1)

Publication Number: CN115686194A · Publication Date: 2023-02-03

Family ID: 85062023

Country Status (1)

CN: CN115686194A (en)


Legal Events

Code: PB01 — Publication
Code: SE01 — Entry into force of request for substantive examination