CN113176827A - AR interaction method and system based on expressions, electronic device and storage medium - Google Patents

AR interaction method and system based on expressions, electronic device and storage medium

Info

Publication number
CN113176827A
CN113176827A (application CN202110571684.0A)
Authority
CN
China
Prior art keywords
person
expression
real
interaction
data
Prior art date
Legal status
Granted
Application number
CN202110571684.0A
Other languages
Chinese (zh)
Other versions
CN113176827B (en)
Inventor
李佳佳
夏宇寰
张军鹏
郑子霞
魏谢敏
魏伟波
张鹏飞
李雯蔚
宋天滋
于沁宁
Current Assignee
Qingdao University
Original Assignee
Qingdao University
Application filed by Qingdao University
Priority to CN202110571684.0A
Publication of CN113176827A
Application granted
Publication of CN113176827B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01: Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/011: Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/174: Facial expression recognition

Abstract

The invention provides an expression-based AR interaction method and system, an electronic device and a storage medium. The method collects real-time image data of a physical object and of a first person in a real scene, together with sound data of the first person and real-time image data of the scene's environment; it then generates an AR picture containing virtual images of the physical object, the first person and the environment, superimposes expression elements on the virtual image of the physical object, and generates an agent avatar in the AR picture, which is displayed on a screen. Based on the virtual image of the physical object with its superimposed expression elements, the agent avatar interacts with the first person in the real scene according to the first person's sound data; the interaction includes the agent avatar holding a dialogue with the first person according to a preset corpus. The invention addresses the problem that existing VR-based or AR-based interaction systems fail to accommodate children with autism.

Description

AR interaction method and system based on expressions, electronic device and storage medium
Technical Field
The invention belongs to the technical field of augmented reality, and particularly relates to an expression-based AR interaction method and system, an electronic device and a storage medium.
Background
Emotional intervention for children with autism can currently be delivered through virtual reality (VR), but existing VR products require the child to wear a virtual-reality helmet or other wearable device: the interaction space is limited, mis-operation is easy, and children, especially autistic children, dislike wearing devices on their bodies. Existing augmented reality (AR) technology likewise requires an AR helmet or a handheld digital device to achieve a stereoscopic imaging effect, so the equipment cost is high, efficiency suffers because the hands are not free, and recognition accuracy is poor. In addition, the development needs of autistic children as a user group are ignored: the operation is complex and the workflow tedious, making such systems unsuitable for children, especially children with autism.
Disclosure of Invention
The embodiments of the present application provide an expression-based AR interaction method and system, an electronic device and a storage medium, aiming at least to solve the problem that existing VR-based or AR-based interaction systems fail to accommodate children with autism.
In a first aspect, an embodiment of the present application provides an expression-based AR interaction method, comprising: a reality data acquisition step of acquiring real-time image data of a physical object and of a first person in a real scene, while simultaneously acquiring sound data of the first person and real-time image data of the environment of the real scene; an AR picture generation step of generating, from the real-time image data of the physical object, the first person and the environment, an AR picture comprising virtual images of the physical object, the first person and the environment, superimposing an expression element on the virtual image of the physical object in the AR picture, and generating an agent avatar in the AR picture, the AR picture being displayed on a screen; and an AR intelligent interaction step in which the agent avatar, based on the virtual image of the physical object with the superimposed expression element, interacts with the first person in the real scene according to the sound data and the real-time image data of the first person, the interaction including the agent avatar holding a dialogue with the first person according to a preset corpus.
Preferably, the method further comprises an intervention step: if the preset corpus cannot support the dialogue between the agent avatar and the first person, a second person intervenes in the interaction.
Preferably, the reality data acquisition step further comprises: an expression training step of training an expression recognition and classification model on a facial expression data set using a CNN; and an expression classification step of processing the acquired real-time image data of the first person through an OpenCV interface, extracting the first person's facial expression data, and feeding it to the expression recognition and classification model for classification.
Preferably, the surface of the physical object is covered with an identification image, which comprises a two-dimensional figure with patterns and colors and is used for acquiring the real-time image data of the physical object.
In a second aspect, an embodiment of the present application provides an expression-based AR interaction system, applicable to the above expression-based AR interaction method, comprising: a reality data acquisition module that acquires real-time image data of a physical object and of a first person in a real scene, while simultaneously acquiring sound data of the first person and real-time image data of the environment of the real scene; an AR picture generation module that generates, from the real-time image data of the physical object, the first person and the environment, an AR picture comprising virtual images of the physical object, the first person and the environment, superimposes an expression element on the virtual image of the physical object in the AR picture, and generates an agent avatar in the AR picture, the AR picture being displayed on a screen; and an AR intelligent interaction module through which the agent avatar, based on the virtual image of the physical object with the superimposed expression element, interacts with the first person in the real scene according to the sound data and the real-time image data of the first person, the interaction including the agent avatar holding a dialogue with the first person according to a preset corpus.
In some of these embodiments, the system further comprises an operation intervention module: if the preset corpus cannot support the dialogue between the agent avatar and the first person, a second person intervenes in the interaction.
In some of these embodiments, the reality data acquisition module further comprises: an expression training unit that trains an expression recognition and classification model on a facial expression data set using a CNN; and an expression classification unit that processes the acquired real-time image data of the first person through an OpenCV interface, extracts the first person's facial expression data, and feeds it to the expression recognition and classification model for classification.
In some embodiments, the surface of the physical object is covered with an identification image, which comprises a two-dimensional figure with patterns and colors and is used for acquiring the real-time image data of the physical object.
In a third aspect, an embodiment of the present application provides an electronic device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor, when executing the computer program, implements the expression-based AR interaction method according to the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor implements the expression-based AR interaction method as described in the first aspect above.
Compared with the prior art, the present application captures the people and the environment in the real scene and displays them directly on a screen; in addition, physical objects designed for interaction are captured, and expression elements are superimposed on them when the AR picture is generated. Together these elements create an environment suited to AR interaction for children with autism. The facial expressions of the people can also be captured and classified, and interactive games are designed so that autistic children can carry out expression-based AR interaction.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:
FIG. 1 is a flowchart of an expression-based AR interaction method of the present invention;
FIG. 2 is a flowchart illustrating the substeps of step S1 in FIG. 1;
FIG. 3 is a block diagram of an expression-based AR interaction system of the present invention;
FIG. 4 is a block diagram of an electronic device of the present invention;
FIG. 5 is an effect diagram of a physical object according to an embodiment of the present application;
FIG. 6 is an effect diagram of the agent avatar according to an embodiment of the present application;
FIG. 7 is a diagram illustrating an interaction effect according to an embodiment of the present application;
FIG. 8 is a diagram illustrating another interaction effect according to an embodiment of the present application;
in the above figures:
1. a reality data acquisition module; 2. an AR picture generation module; 3. an AR intelligent interaction module; 4. an operation intervention module; 11. an expression training unit; 12. an expression classification unit; 60. a bus; 61. a processor; 62. a memory; 63. a communication interface.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described and illustrated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments provided in the present application without any inventive step are within the scope of protection of the present application.
It is obvious that the drawings in the following description are only examples or embodiments of the present application, and that it is also possible for a person skilled in the art to apply the present application to other similar contexts on the basis of these drawings without inventive effort. Moreover, it should be appreciated that in the development of any such actual implementation, as in any engineering or design project, numerous implementation-specific decisions must be made to achieve the developers' specific goals, such as compliance with system-related and business-related constraints, which may vary from one implementation to another.
Reference in the specification to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the specification. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of ordinary skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments without conflict.
Unless defined otherwise, technical or scientific terms used herein shall have the ordinary meaning understood by those of ordinary skill in the art to which this application belongs. "A", "an", "the" and similar words in this application do not denote a limitation of quantity and may refer to the singular or the plural. The terms "include", "comprise", "have" and any variations thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, article or apparatus that comprises a list of steps or modules (units) is not limited to the listed steps or units but may include other steps or units not expressly listed or inherent to such process, method, article or apparatus.
Embodiments of the invention are described in detail below with reference to the accompanying drawings:
Fig. 1 is a flowchart of the expression-based AR interaction method of the present invention. Referring to fig. 1, the method includes the following steps:
s1: the method comprises the steps of collecting real-time image data of a real object and a first person in a real scene, and simultaneously collecting sound data of the first person and real-time image data of the environment of the real scene. Optionally, the surface of the object is covered with an identification image, and the identification image includes a two-dimensional graph with patterns and colors, and is used for acquiring the real-time image data of the object.
In implementation, a physical object is designed as the tangible external prop of the augmented reality system; it is sized so that the first person can hold it in the hand, and optionally building blocks are used as the physical objects. A two-dimensional figure recognizable by a computer vision system is attached to the physical object, and the complexity of its pattern and color is designed so as to improve the accuracy of computer recognition.
In a specific implementation, a real-time image of the physical object and a real-time image of the first person are captured by a camera, and sound data of the first person is collected at the same time; optionally, the real-time image of the first person includes the first person's facial expression. In addition, real-time image data of the current real environment is collected by the camera.
Optionally, fig. 2 is a flowchart of the sub-steps of step S1 in fig. 1; referring to fig. 2:
s11: training an expression recognition classification model according to a human face expression data set through a CNN neural network;
s12: and identifying the acquired real-time image data of the first person through an OpenCV interface, extracting facial expression data of the first person, and inputting the facial expression data into the expression identification classification model for classification.
In a specific implementation, the FER2013 facial expression data set is used and a CNN is trained on it; after the real-time image data of the first person is collected, an OpenCV interface is called to detect the face, and the facial expression is passed to the trained model for classification.
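As an illustration of this pipeline, the following is a minimal sketch assuming the FER2013 CSV release, a small Keras CNN, and OpenCV's Haar cascade face detector; the file path, the network architecture, and the seven-class label set are assumptions of the sketch, not details published in this embodiment.

```python
# Minimal sketch of the expression-recognition pipeline described above.
# Assumptions (not specified in the embodiment): the FER2013 CSV release,
# a small Keras CNN, and OpenCV's Haar cascade for face detection.
import cv2
import numpy as np
import pandas as pd
import tensorflow as tf

EMOTIONS = ["angry", "disgust", "fear", "happy", "sad", "surprise", "neutral"]

def load_fer2013(csv_path="fer2013.csv"):
    """Parse the FER2013 CSV into 48x48 grayscale images and integer labels."""
    df = pd.read_csv(csv_path)
    x = np.stack([np.array(p.split(), dtype="float32").reshape(48, 48, 1) / 255.0
                  for p in df["pixels"]])
    y = df["emotion"].to_numpy()
    return x, y

def build_model(num_classes=7):
    """Small CNN classifier; the architecture here is an illustrative guess."""
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(48, 48, 1)),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Conv2D(64, 3, activation="relu"),
        tf.keras.layers.MaxPooling2D(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])

def classify_expressions(frame_bgr, model, face_cascade):
    """Detect faces in a camera frame via OpenCV and classify each expression."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    faces = face_cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    labels = []
    for (x, y, w, h) in faces:
        roi = cv2.resize(gray[y:y + h, x:x + w], (48, 48)).astype("float32") / 255.0
        probs = model.predict(roi.reshape(1, 48, 48, 1), verbose=0)[0]
        labels.append(EMOTIONS[int(np.argmax(probs))])
    return labels

if __name__ == "__main__":
    images, targets = load_fer2013()
    model = build_model()
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    model.fit(images, targets, epochs=10, batch_size=64, validation_split=0.1)
    cascade = cv2.CascadeClassifier(
        cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
    ok, frame = cv2.VideoCapture(0).read()
    if ok:
        print(classify_expressions(frame, model, cascade))
```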
Please continue to refer to fig. 1:
s2: according to the real object, the first person and the real-time image data of the environment, an AR picture comprising the real object, the first person and the virtual imaging of the environment is generated, an expression element is superposed on the virtual imaging of the real object in the AR picture, an intelligent object image is generated in the AR picture, and the AR picture is displayed through a screen.
In a specific implementation, the real-time image data acquired in step S1 is mirrored onto a display screen, the physical objects are recognized, and an expression element is superimposed on each physical object by augmented reality, that is, an expression pattern is overlaid on the physical object and moves along with it. Fig. 5 is an effect diagram of a physical object according to an embodiment of the present application. Referring to fig. 5, a building block in the real scene serves as the physical object, a two-dimensional figure containing patterns and colors is attached to it, and an expression element is superimposed on the block by augmented reality for interaction with the first person.
In addition, a virtual agent avatar is designed with three-dimensional design software and superimposed in the picture by augmented reality. Optionally, the virtual agent is a character. Fig. 6 is an effect diagram of the agent avatar according to an embodiment of the present application; referring to fig. 6, the agent avatar is designed and rendered in three dimensions.
In an implementation, the actions and speech of the avatar are controlled by the game engine Unity.
In a specific implementation, a wake-up password is preset, and the first person activates the agent avatar by speaking the wake-up password.
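A wake-up check of this kind can be as small as a keyword test on the transcribed utterance; the sketch below assumes a hypothetical transcribe() speech-to-text function and an illustrative wake password, neither of which is specified in this embodiment.

```python
# Minimal sketch of the wake-up password check described above.
# `transcribe()` is a hypothetical stand-in for the system's speech-to-text;
# the wake password itself is an illustrative assumption.
WAKE_PASSWORDS = {"hello little helper"}

def agent_awakened(transcript: str) -> bool:
    """Return True when the first person's utterance contains a wake password."""
    text = transcript.strip().lower()
    return any(password in text for password in WAKE_PASSWORDS)

# Typical use: the agent avatar stays idle until agent_awakened(transcribe(audio))
# returns True, after which the interaction rules below take over.
```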
S3: based on the virtual image of the physical object with the superimposed expression element, the agent avatar interacts with the first person in the real scene according to the sound data and the real-time image data of the first person; the interaction includes the agent avatar holding a dialogue with the first person according to a preset corpus.
In a specific implementation, the Vuforia AR engine interface is used to locate the target physical object, so that the first person can control a virtual expression on the screen by manipulating the physical object.
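The sketch below illustrates this kind of marker tracking in Python, using OpenCV's ArUco module purely as a stand-in for the Vuforia engine named above; the marker dictionary and the naive sprite overlay are assumptions of the sketch, not the embodiment's actual rendering path inside Unity.

```python
# Sketch of tracking the tagged physical block so the superimposed expression
# follows it. The embodiment uses the Vuforia engine inside Unity; OpenCV's
# ArUco module is used here only as an illustrative stand-in.
import cv2

ARUCO_DICT = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)

def locate_markers(frame_bgr):
    """Return marker id -> pixel centre for every tagged block in the frame."""
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    # Note: OpenCV >= 4.7 moves this call onto cv2.aruco.ArucoDetector.
    corners, ids, _ = cv2.aruco.detectMarkers(gray, ARUCO_DICT)
    centres = {}
    if ids is not None:
        for marker_id, quad in zip(ids.flatten(), corners):
            centres[int(marker_id)] = quad[0].mean(axis=0)  # (x, y) centre
    return centres

def overlay_expression(frame_bgr, centre, sprite_bgr):
    """Paste an expression sprite at the block's centre so it moves with it."""
    h, w = sprite_bgr.shape[:2]
    x, y = int(centre[0] - w / 2), int(centre[1] - h / 2)
    frame_bgr[y:y + h, x:x + w] = sprite_bgr  # no bounds or alpha handling here
    return frame_bgr
```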
Fig. 7 is an interaction effect diagram according to an embodiment of the present application. Referring to fig. 7, a mirror image of the real scene is shown in the picture, a virtual agent is generated by augmented reality, and real-time image data of the building blocks in the real scene is collected and displayed in the picture with expression elements superimposed by augmented reality. As shown in fig. 7, the embodiment provides a first interaction rule: the first person completes a memory game through dialogue with the virtual agent; the augmented reality engine scans the building blocks serving as physical objects and generates virtual expressions. In the picture, the virtual expressions automatically change angle and position so that the first person has to guess and find the designated virtual expression. Optionally, the first person finds the building block corresponding to the virtual expression in reality through voice or motion.
Fig. 8 is another interaction effect diagram of an embodiment of the present application. Referring to fig. 8, a mirror image of the real scene is shown in the picture, a virtual agent is generated by augmented reality, and real-time image data of the building blocks in the real scene is collected and displayed with expression elements superimposed by augmented reality; in addition, a virtual whiteboard presenting a two-dimensional cartoon is generated in the augmented reality picture. As shown in fig. 8, the embodiment provides a second interaction rule: the virtual agent asks the first person a question, and the first person must answer which expression the character in the cartoon should show at that moment. Optionally, the first person inputs the answer by lifting a building block, and the virtual agent judges whether it is correct and gives a prompt. A social scene is preset as the content of the cartoon.
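A possible shape for the answer check in this second rule is sketched below; the block-id to expression mapping and the cartoon scenes are illustrative assumptions, since the embodiment does not publish them.

```python
# Sketch of the answer check for the second interaction rule: the virtual agent
# asks which expression the cartoon character should show, and the first person
# answers by lifting the matching block. Mapping and scenes are assumptions.
BLOCK_EXPRESSIONS = {0: "happy", 1: "sad", 2: "angry", 3: "surprise"}
CARTOON_SCENES = [
    {"prompt": "Your friend shares a toy with you. How do you feel?", "answer": "happy"},
    {"prompt": "Your tower of blocks falls down. How do you feel?", "answer": "sad"},
]

def check_answer(scene_index: int, lifted_block_id: int) -> bool:
    """Compare the lifted block's expression with the scene's expected answer."""
    expected = CARTOON_SCENES[scene_index]["answer"]
    return BLOCK_EXPRESSIONS.get(lifted_block_id) == expected
```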
The embodiment of the present application also provides a third interaction rule: the image of the first person is displayed directly in the augmented reality picture, i.e. the first person appears on the screen, and makes a specified expression as requested by the virtual agent; the first person's facial expression is captured and fed into the expression classification model for detection, and optionally the duration for which the expression is held is measured so that the degree of completion can be displayed visually.
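Timing how long the requested expression is held could look like the sketch below, which reuses the frame-classification helper from the expression-recognition sketch above; the 3-second threshold and 60-second timeout are assumptions.

```python
# Sketch of measuring how long the first person holds the requested expression
# (third interaction rule). Thresholds are illustrative assumptions.
import time

def hold_expression(target, classify, get_frame,
                    required_seconds=3.0, timeout_seconds=60.0):
    """Return True once `target` is held continuously for the required time."""
    start = time.monotonic()
    held_since = None
    while time.monotonic() - start < timeout_seconds:
        labels = classify(get_frame())  # e.g. classify_expressions() on a camera frame
        if target in labels:
            if held_since is None:
                held_since = time.monotonic()
            elif time.monotonic() - held_since >= required_seconds:
                return True
        else:
            held_since = None  # expression broken, restart the timer
    return False
```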
In a specific implementation, the dialogue between the virtual agent and the first person is supported by an artificial intelligence corpus, and the corresponding reply sentence is retrieved according to keywords in the first person's utterance.
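A keyword lookup of this kind can be sketched as below; the corpus entries are illustrative assumptions, and an utterance with no matching keyword falls through to the human operator of step S4.

```python
# Sketch of keyword-based reply retrieval from a preset corpus, as described
# above. The entries are illustrative; the embodiment's corpus is not published.
CORPUS = {
    "hello": "Hi! Shall we play the expression game together?",
    "happy": "Great! Can you find the block with the happy face?",
    "tired": "Let's take a short break and then keep playing.",
}

def reply_for(utterance: str):
    """Return the first reply whose keyword appears in the utterance, else None."""
    text = utterance.lower()
    for keyword, reply in CORPUS.items():
        if keyword in text:
            return reply
    return None  # no match: signal that the second person should intervene (S4)
```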
S4: if the preset corpus cannot support the dialogue between the agent avatar and the first person, a second person intervenes in the interaction.
In a specific implementation, a separate operator console is provided and operated by a second person other than the first person; if the agent avatar cannot complete the conversation with the first person using the existing corpus, the second person intervenes and controls the agent's dialogue. Optionally, the second person can also control the interaction progress and handle other emergencies that no preset rule covers.
In a specific implementation, the second person controls the virtual agent's dialogue, the interaction progress, and the triggering of subsequent interaction events over the UDP network protocol.
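One way such an operator console could talk to the AR application over UDP is sketched below; the port number and the simple "command:argument" wire format are assumptions, since the embodiment only states that UDP is used.

```python
# Sketch of the operator (second person) console sending control messages to
# the AR application over UDP. Host, port, and wire format are assumptions.
import socket

AR_APP_ADDR = ("127.0.0.1", 5005)  # assumed address of the Unity-side listener

def send_command(command: str, argument: str = "") -> None:
    """Send one control message, e.g. dialogue text, stage advance, or event."""
    payload = f"{command}:{argument}\n".encode("utf-8")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, AR_APP_ADDR)

# Example operator actions when the preset corpus cannot carry the conversation:
# send_command("say", "Let's look for the happy face together.")
# send_command("next_stage")                  # advance the interaction progress
# send_command("trigger_event", "cartoon_2")  # trigger a follow-up interaction event
```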
Fig. 3 is a block diagram of the expression-based AR interaction system according to the present invention. Referring to fig. 3, the system includes:
Reality data acquisition module 1: collects real-time image data of a physical object and of a first person in a real scene, while simultaneously collecting sound data of the first person and real-time image data of the environment of the real scene. Optionally, the surface of the physical object is covered with an identification image comprising a two-dimensional figure with patterns and colors, which is used for acquiring the real-time image data of the physical object.
In implementation, a physical object is designed as the tangible external prop of the augmented reality system; it is sized so that the first person can hold it in the hand, and optionally building blocks are used as the physical objects. A two-dimensional figure recognizable by a computer vision system is attached to the physical object, and the complexity of its pattern and color is designed so as to improve the accuracy of computer recognition.
In a specific implementation, a real-time image of the physical object and a real-time image of the first person are captured by a camera, and sound data of the first person is collected at the same time; optionally, the real-time image of the first person includes the first person's facial expression. In addition, real-time image data of the current real environment is collected by the camera.
Optionally, the reality data acquisition module 1 further includes:
Expression training unit 11: trains an expression recognition and classification model on a facial expression data set using a CNN;
Expression classification unit 12: processes the acquired real-time image data of the first person through an OpenCV interface, extracts the first person's facial expression data, and feeds it to the expression recognition and classification model for classification.
In a specific implementation, the FER2013 facial expression data set is used and a CNN is trained on it; after the real-time image data of the first person is collected, an OpenCV interface is called to detect the face, and the facial expression is passed to the trained model for classification.
AR picture generation module 2: from the real-time image data of the physical object, the first person and the environment, generates an AR picture comprising virtual images of the physical object, the first person and the environment, superimposes an expression element on the virtual image of the physical object in the AR picture, and generates an agent avatar in the AR picture; the AR picture is displayed on a screen.
In a specific implementation, the real-time image data acquired by the reality data acquisition module 1 is mirrored onto a display screen, the physical objects are recognized, and an expression element is superimposed on each physical object by augmented reality, that is, an expression pattern is overlaid on the physical object and moves along with it. Fig. 5 is an effect diagram of a physical object according to an embodiment of the present application. Referring to fig. 5, a building block in the real scene serves as the physical object, a two-dimensional figure containing patterns and colors is attached to it, and an expression element is superimposed on the block by augmented reality for interaction with the first person.
In addition, a virtual agent avatar is designed with three-dimensional design software and superimposed in the picture by augmented reality. Optionally, the virtual agent is a character. Fig. 6 is an effect diagram of the agent avatar according to an embodiment of the present application; referring to fig. 6, the agent avatar is designed and rendered in three dimensions.
In implementation, the actions and speech of the agent avatar are controlled by the Unity game engine.
In a specific implementation, a wake-up password is preset, and the first person activates the agent avatar by speaking the wake-up password.
AR intelligent interaction module 3: based on the virtual image of the physical object with the superimposed expression element, the agent avatar interacts with the first person in the real scene according to the sound data and the real-time image data of the first person; the interaction includes the agent avatar holding a dialogue with the first person according to a preset corpus.
In a specific implementation, the Vuforia AR engine interface is used to locate the target physical object, so that the first person can control a virtual expression on the screen by manipulating the physical object.
Fig. 7 is an interaction effect diagram according to an embodiment of the present application. Referring to fig. 7, a mirror image of the real scene is shown in the picture, a virtual agent is generated by augmented reality, and real-time image data of the building blocks in the real scene is collected and displayed in the picture with expression elements superimposed by augmented reality. As shown in fig. 7, the embodiment provides a first interaction rule: the first person completes a memory game through dialogue with the virtual agent; the augmented reality engine scans the building blocks serving as physical objects and generates virtual expressions. In the picture, the virtual expressions automatically change angle and position so that the first person has to guess and find the designated virtual expression. Optionally, the first person finds the building block corresponding to the virtual expression in reality through voice or motion.
Fig. 8 is another interaction effect diagram of an embodiment of the present application. Referring to fig. 8, a mirror image of the real scene is shown in the picture, a virtual agent is generated by augmented reality, and real-time image data of the building blocks in the real scene is collected and displayed with expression elements superimposed by augmented reality; in addition, a virtual whiteboard presenting a two-dimensional cartoon is generated in the augmented reality picture. As shown in fig. 8, the embodiment provides a second interaction rule: the virtual agent asks the first person a question, and the first person must answer which expression the character in the cartoon should show at that moment. Optionally, the first person inputs the answer by lifting a building block, and the virtual agent judges whether it is correct and gives a prompt. A social scene is preset as the content of the cartoon.
The embodiment of the present application also provides a third interaction rule: the image of the first person is displayed directly in the augmented reality picture, i.e. the first person appears on the screen, and makes a specified expression as requested by the virtual agent; the first person's facial expression is captured and fed into the expression classification model for detection, and optionally the duration for which the expression is held is measured so that the degree of completion can be displayed visually.
In a specific implementation, the dialogue between the virtual agent and the first person is supported by an artificial intelligence corpus, and the corresponding reply sentence is retrieved according to keywords in the first person's utterance.
Operation intervention module 4: if the preset corpus cannot support the dialogue between the agent avatar and the first person, a second person intervenes in the interaction.
In a specific implementation, a separate operator console is provided and operated by a second person other than the first person; if the agent avatar cannot complete the conversation with the first person using the existing corpus, the second person intervenes and controls the agent's dialogue. Optionally, the second person can also control the interaction progress and handle other emergencies that no preset rule covers.
In a specific implementation, the second person controls the virtual agent's dialogue, the interaction progress, and the triggering of subsequent interaction events over the UDP network protocol.
In addition, the expression-based AR interaction method described in conjunction with fig. 1 and 2 may be implemented by an electronic device. Fig. 4 is a block diagram of an electronic device of the present invention.
The electronic device may comprise a processor 61 and a memory 62 in which computer program instructions are stored.
Specifically, the processor 61 may include a Central Processing Unit (CPU) or an Application Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits implementing the embodiments of the present application.
Memory 62 may include mass storage for data or instructions. By way of example and not limitation, memory 62 may include a hard disk drive (HDD), a floppy disk drive, a solid state drive (SSD), flash memory, an optical disk, a magneto-optical disk, magnetic tape, a Universal Serial Bus (USB) drive, or a combination of two or more of these. Memory 62 may include removable or non-removable (or fixed) media, where appropriate, and may be internal or external to the data processing apparatus, where appropriate. In a particular embodiment, the memory 62 is non-volatile memory. In particular embodiments, memory 62 includes read-only memory (ROM) and random access memory (RAM). The ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), or flash memory, or a combination of two or more of these, where appropriate. The RAM may be static random-access memory (SRAM) or dynamic random-access memory (DRAM), where the DRAM may be fast page mode DRAM (FPM DRAM), extended data output DRAM (EDO DRAM), synchronous DRAM (SDRAM), and the like.
The memory 62 may be used to store or cache various data files that need to be processed and/or used for communication, as well as possible computer program instructions executed by the processor 61.
The processor 61 implements any of the expression-based AR interaction methods in the above embodiments by reading and executing computer program instructions stored in the memory 62.
In some of these embodiments, the electronic device may also include a communication interface 63 and a bus 60. As shown in fig. 4, the processor 61, the memory 62, and the communication interface 63 are connected via a bus 60 to complete communication therebetween.
The communication interface 63 is used for data communication with other components such as external devices, image/data acquisition equipment, databases, external storage, image/data processing workstations, and the like.
The bus 60 comprises hardware, software, or both, and couples the components of the electronic device to one another. Bus 60 includes, but is not limited to, at least one of the following: a data bus, an address bus, a control bus, an expansion bus, and a local bus. By way of example and not limitation, bus 60 may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front-Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association local bus (VLB), another suitable bus, or a combination of two or more of these. Bus 60 may include one or more buses, where appropriate. Although specific buses are described and shown in the embodiments of the application, any suitable buses or interconnects are contemplated by the application.
The electronic device may perform the expression-based AR interaction method in the embodiments of the present application.
In addition, in combination with the expression-based AR interaction method in the foregoing embodiments, embodiments of the present application may provide a computer-readable storage medium to implement. The computer readable storage medium having stored thereon computer program instructions; the computer program instructions, when executed by a processor, implement any of the expression-based AR interaction methods of the embodiments described above.
The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is more specific and detailed, but not construed as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. An expression-based AR interaction method, characterized by comprising:
a reality data acquisition step of acquiring real-time image data of a physical object and of a first person in a real scene, while simultaneously acquiring sound data of the first person and real-time image data of the environment of the real scene;
an AR picture generation step of generating, from the real-time image data of the physical object, the first person and the environment, an AR picture comprising virtual images of the physical object, the first person and the environment, superimposing an expression element on the virtual image of the physical object in the AR picture, and generating an agent avatar in the AR picture, the AR picture being displayed on a screen;
an AR intelligent interaction step in which the agent avatar, based on the virtual image of the physical object with the superimposed expression element, interacts with the first person in the real scene according to the sound data of the first person, the interaction comprising the agent avatar holding a dialogue with the first person according to a preset corpus.
2. The expression-based AR interaction method of claim 1, wherein the method further comprises:
and an intervention step, namely if the preset corpus can not support the dialog between the intelligent agent image and the first person, the interaction is intervened through a second person.
3. The expression-based AR interaction method of claim 1, wherein the reality data acquisition step further comprises:
an expression training step of training an expression recognition and classification model on a facial expression data set using a CNN;
an expression classification step of processing the acquired real-time image data of the first person through an OpenCV interface, extracting the first person's facial expression data, and feeding it to the expression recognition and classification model for classification.
4. The expression-based AR interaction method of claim 1, wherein the surface of the physical object is covered with an identification image, the identification image comprising a two-dimensional figure with patterns and colors, through which the real-time image data of the physical object is acquired.
5. An expression-based AR interaction system, characterized by comprising:
a reality data acquisition module that acquires real-time image data of a physical object and of a first person in a real scene, while simultaneously acquiring sound data of the first person and real-time image data of the environment of the real scene;
an AR picture generation module that generates, from the real-time image data of the physical object, the first person and the environment, an AR picture comprising virtual images of the physical object, the first person and the environment, superimposes an expression element on the virtual image of the physical object in the AR picture, and generates an agent avatar in the AR picture, the AR picture being displayed on a screen;
an AR intelligent interaction module through which the agent avatar, based on the virtual image of the physical object with the superimposed expression element, interacts with the first person in the real scene according to the sound data of the first person, the interaction comprising the agent avatar holding a dialogue with the first person according to a preset corpus.
6. The expression-based AR interaction system of claim 5, wherein the system further comprises:
and operating an intervention module, and if the preset corpus can not support the dialog between the intelligent body image and the first person, intervening the interaction through a second person.
7. The expression-based AR interaction system of claim 5, wherein the reality data acquisition module further comprises:
an expression training unit that trains an expression recognition and classification model on a facial expression data set using a CNN;
an expression classification unit that processes the acquired real-time image data of the first person through an OpenCV interface, extracts the first person's facial expression data, and feeds it to the expression recognition and classification model for classification.
8. The expression-based AR interaction system of claim 5, wherein the surface of the physical object is covered with an identification image, the identification image comprising a two-dimensional figure with patterns and colors, through which the real-time image data of the physical object is acquired.
9. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, wherein the processor, when executing the computer program, implements the expression-based AR interaction method of any one of claims 1 to 4.
10. A computer-readable storage medium on which a computer program is stored, the program, when executed by a processor, implementing the expression-based AR interaction method of any of claims 1 to 4.
CN202110571684.0A 2021-05-25 2021-05-25 AR interaction method and system based on expressions, electronic device and storage medium Active CN113176827B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110571684.0A CN113176827B (en) 2021-05-25 2021-05-25 AR interaction method and system based on expressions, electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110571684.0A CN113176827B (en) 2021-05-25 2021-05-25 AR interaction method and system based on expressions, electronic device and storage medium

Publications (2)

Publication Number Publication Date
CN113176827A true CN113176827A (en) 2021-07-27
CN113176827B CN113176827B (en) 2022-10-28

Family

ID=76928211

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110571684.0A Active CN113176827B (en) 2021-05-25 2021-05-25 AR interaction method and system based on expressions, electronic device and storage medium

Country Status (1)

Country Link
CN (1) CN113176827B (en)


Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109841217A (en) * 2019-01-18 2019-06-04 苏州意能通信息技术有限公司 A kind of AR interactive system and method based on speech recognition
CN209821887U (en) * 2019-03-26 2019-12-20 广东虚拟现实科技有限公司 Marker substance
US20210118237A1 (en) * 2019-10-15 2021-04-22 Beijing Sensetime Technology Development Co., Ltd. Augmented reality scene image processing method and apparatus, electronic device and storage medium
CN112053449A (en) * 2020-09-09 2020-12-08 脸萌有限公司 Augmented reality-based display method, device and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116643675A (en) * 2023-07-27 2023-08-25 苏州创捷传媒展览股份有限公司 Intelligent interaction system based on AI virtual character
CN116643675B (en) * 2023-07-27 2023-10-03 苏州创捷传媒展览股份有限公司 Intelligent interaction system based on AI virtual character

Also Published As

Publication number Publication date
CN113176827B (en) 2022-10-28

Similar Documents

Publication Publication Date Title
EP3885965B1 (en) Image recognition method based on micro facial expressions, apparatus and related device
TWI751161B (en) Terminal equipment, smart phone, authentication method and system based on face recognition
Olszewski et al. High-fidelity facial and speech animation for VR HMDs
US11736756B2 (en) Producing realistic body movement using body images
WO2019173108A1 (en) Electronic messaging utilizing animatable 3d models
WO2020150686A1 (en) Systems and methods for face reenactment
CN110418095B (en) Virtual scene processing method and device, electronic equipment and storage medium
CN110956691B (en) Three-dimensional face reconstruction method, device, equipment and storage medium
KR102148151B1 (en) Intelligent chat based on digital communication network
WO2022252866A1 (en) Interaction processing method and apparatus, terminal and medium
WO2020024692A1 (en) Man-machine interaction method and apparatus
CN106127828A (en) The processing method of a kind of augmented reality, device and mobile terminal
CN111108508B (en) Face emotion recognition method, intelligent device and computer readable storage medium
EP4315266A1 (en) Interactive augmented reality content including facial synthesis
KR20230113370A (en) face animation compositing
CN111009028A (en) Expression simulation system and method of virtual face model
US20190302880A1 (en) Device for influencing virtual objects of augmented reality
CN113362263A (en) Method, apparatus, medium, and program product for changing the image of a virtual idol
KR20200092207A (en) Electronic device and method for providing graphic object corresponding to emotion information thereof
CN113176827B (en) AR interaction method and system based on expressions, electronic device and storage medium
CN112714337A (en) Video processing method and device, electronic equipment and storage medium
CN110084306B (en) Method and apparatus for generating dynamic image
CN112149599A (en) Expression tracking method and device, storage medium and electronic equipment
CN111597926A (en) Image processing method and device, electronic device and storage medium
KR102345729B1 (en) Method and apparatus for generating video

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant