WO2020147598A1 - Model action method and apparatus, speaker having screen, electronic device, and storage medium - Google Patents


Info

Publication number
WO2020147598A1
Authority
WO
WIPO (PCT)
Prior art keywords: face, image, facial, preset, action
Application number
PCT/CN2020/070375
Other languages
French (fr)
Chinese (zh)
Inventor
冯瑞丰
Original Assignee
北京字节跳动网络技术有限公司
Application filed by 北京字节跳动网络技术有限公司
Publication of WO2020147598A1


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 13/00: Animation
    • G06T 13/20: 3D [Three Dimensional] animation
    • G06T 13/40: 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06V: IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 40/00: Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V 40/10: Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V 40/16: Human faces, e.g. facial parts, sketches or expressions
    • G06V 40/168: Feature extraction; Face representation

Definitions

  • The embodiments of the present disclosure relate to image processing technology, for example to a model action method and apparatus, a speaker with a screen, an electronic device, and a storage medium.
  • In related human-machine interaction methods, interaction stays in two-dimensional space: for example, interaction through voice, or on-screen interaction realized by detecting the moving distance, moving speed, and moving direction of the limbs in a plane.
  • Interaction in two-dimensional space cannot simulate a real character, so the interaction effect is poor.
  • The present disclosure provides a model action method, an apparatus, a speaker with a screen, an electronic device, and a storage medium, to solve the problem that a model action method in two-dimensional space cannot simulate a real character image and therefore has a poor model action effect.
  • An embodiment of the present disclosure provides a model action method, including: acquiring two or more consecutive face images; determining, according to the face changes in the two or more face images, facial action feature parameters corresponding to the face changes; and generating a model action instruction according to the facial action feature parameters, so that a preset three-dimensional (3 Dimensions, 3D) image with facial feature data executes the facial action corresponding to the model action instruction.
  • An embodiment of the present disclosure also provides a model action apparatus, including:
  • a face image acquisition module, configured to acquire two or more consecutive face images;
  • a facial action feature parameter determination module, configured to determine, according to the face changes in the two or more face images, the facial action feature parameters corresponding to the face changes; and
  • a preset 3D image action execution module, configured to generate a model action instruction according to the facial action feature parameters, so that a preset 3D image with facial feature data executes the facial action corresponding to the model action instruction.
  • An embodiment of the present disclosure further provides a speaker with a screen, including a main body, a controller located in the main body, and at least two cameras located on the main body; the distance between the at least two cameras is greater than a distance threshold, and the controller is provided with the model action apparatus described in any embodiment of the present disclosure.
  • An embodiment of the present disclosure also provides an electronic device, including:
  • one or more processors; and
  • a memory configured to store one or more programs,
  • where, when the one or more programs are executed by the one or more processors, the one or more processors implement the model action method described in any embodiment of the present disclosure.
  • An embodiment of the present disclosure also provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the model action method described in any embodiment of the present disclosure is implemented.
  • FIG. 1 is a flowchart of a model action method provided by Embodiment 1 of the present disclosure
  • FIG. 2 is a flowchart of a model action method provided by Embodiment 2 of the present disclosure
  • FIG. 3 is a flowchart of a model action method provided by Embodiment 3 of the present disclosure.
  • FIG. 4 is a schematic structural diagram of a model action apparatus provided by Embodiment 4 of the present disclosure.
  • FIG. 5 is a schematic structural diagram of a speaker with a screen provided by Embodiment 5 of the present disclosure
  • FIG. 6 is a schematic diagram of the control structure of a model action interface of a speaker with a screen provided in Embodiment 5 of the present disclosure
  • FIG. 7 is a schematic structural diagram of an electronic device provided by Embodiment 6 of the present disclosure.
  • Each embodiment provides optional features and examples at the same time; multiple features recorded in the embodiments can be combined to form multiple alternative solutions, and each numbered embodiment should not be regarded as a single technical solution.
  • Figure 1 is a flow chart of a model action method provided in Embodiment 1 of the present disclosure. This embodiment is applicable to the situation of human face action interaction.
  • The method can be executed by a model action apparatus, which can be implemented in hardware and/or software and integrated into an electronic device such as a mobile phone, tablet, or computer. The method includes the following steps:
  • Two or more consecutive face images are acquired through at least one camera, with a preset time interval between successive images.
  • A face image can be acquired by the camera at a preset time interval, or face video data can be recorded by the camera and face images can be intercepted from the face video data at the preset time interval.
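Intercepting face images from recorded video at a preset time interval amounts to picking evenly spaced frame indices. A minimal sketch, assuming a fixed-rate video; the function name and parameters are illustrative, not from the patent:

```python
def frame_indices(fps: float, interval_s: float, duration_s: float) -> list:
    """Indices of the frames to extract from face video data when face
    images are intercepted at a preset time interval (hypothetical helper)."""
    step = max(1, round(fps * interval_s))  # frames between two sampled images
    total = int(fps * duration_s)           # frames available in the video
    return list(range(0, total, step))
```

For example, sampling a 2-second, 30 fps clip at a 0.5 s interval selects every 15th frame.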
  • The face changes in the face images can be detected.
  • The face change can be determined by comparing the pixel changes at corresponding positions in the two face images.
  • Alternatively, the face change can be determined from the change of a fixed part of the face.
  • A fixed part can be, for example, the eyes or the chin.
  • The position of the eyes can change between the two images; according to the time interval between the two images, the movement speed of the eyes can be determined.
  • From the movement trajectory of the eyes, combined with the movement time information, the movement speed information can be obtained, which determines the facial action feature parameters.
  • The facial action feature parameters include at least one of the following: moving speed, moving direction, and moving distance. Facial movements can be shaking up and down, shaking left and right, or shaking in circles.
  • The facial action feature parameter corresponding to a facial movement is at least one of the moving speed, moving direction, and moving distance.
  • In one example, the facial action feature parameters include the moving speed, moving direction, and moving distance.
  • The facial action feature parameters can be used to reproduce the facial movement, achieving the effect of interacting with the user.
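A minimal sketch of how the three parameters could be computed from the position of a fixed part (e.g. an eye corner) in two consecutive images taken a known time apart; the function and its pixel/degree conventions are illustrative assumptions, not from the patent:

```python
import math

def motion_features(p0, p1, dt):
    """Moving distance, direction and speed of a tracked facial part between
    two consecutive face images captured dt seconds apart (illustrative)."""
    dx, dy = p1[0] - p0[0], p1[1] - p0[1]
    distance = math.hypot(dx, dy)                 # moving distance in pixels
    direction = math.degrees(math.atan2(dy, dx))  # moving direction in the image plane
    speed = distance / dt if dt > 0 else 0.0      # moving speed in pixels per second
    return distance, direction, speed
```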
  • the preset 3D image with facial feature data is: a cartoon 3D image, a professional 3D image, or a gender 3D image.
  • the cartoon 3D image can be, for example, an animal image such as a kitten, a puppy, or a monkey, or an animated image such as Peppa Pig or Cherry Ball.
  • the professional 3D image may be, for example, an image of a doctor, a teacher, a firefighter, or a policeman.
  • the gender 3D image can be, for example, a man or a woman.
  • the gender 3D image can also be combined with age information to set images such as boy, girl, adult man, adult woman, elderly man, or elderly woman.
  • The determining, based on the face changes in the two or more face images, of the facial action feature parameters corresponding to the face changes includes: determining the moving speed, moving direction, and moving distance of a preset part of the face according to the position changes of the preset part in the two or more face images, so as to determine the facial action feature parameters.
  • The preset part may be, for example, the eyes, the chin, or the cheeks; the preset part is not limited here.
  • The generating of the model action instruction according to the facial action feature parameters, so that the preset 3D image with facial feature data executes the facial action corresponding to the model action instruction, includes: generating the model action instruction according to the moving speed, moving direction, and moving distance, so that the preset 3D image with facial feature data executes the facial action corresponding to the model action instruction.
  • The model action instruction includes the moving speed, moving direction, and moving distance of the face.
  • The preset 3D image with facial feature data simulates the facial action according to the model action instruction.
  • The preset 3D image with facial feature data thereby realizes facial action interaction with the user.
  • The model action instruction includes the facial moving speed, moving direction, and moving distance for the preset 3D image with facial feature data.
  • The facial moving speed, moving direction, and moving distance of the preset 3D image with facial feature data can be the same as the moving speed, moving direction, and moving distance of the user's face, or they can be set according to preset rules.
  • A preset rule can be, for example: the moving speed of the user's face is v, and the facial moving speed of the preset 3D image with facial feature data is 2v; the moving direction of the user's face is left and right, and the moving direction of the preset 3D image with facial feature data is left and right; the moving distance of the user's face is d, and the facial moving distance of the preset 3D image with facial feature data is 2d.
  • A preset rule can also be, for example: the moving speed of the user's face is v, and the moving speed of the preset 3D image with facial feature data is v/2; the moving direction of the user's face is left and right, and the moving direction of the preset 3D image with facial feature data is right and left; the moving distance of the user's face is d, and the facial moving distance of the preset 3D image with facial feature data is d/2.
  • The preset rules can be set arbitrarily to make the interaction more interesting.
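The doubling/halving rules described above can be expressed as a simple scaling of the instruction parameters. In this hedged sketch, the function, the gain defaults, and the mirroring flag are illustrative assumptions rather than the patent's specification:

```python
def apply_preset_rule(speed, direction_deg, distance,
                      speed_gain=2.0, dist_gain=2.0, mirror=False):
    """Map the user's facial motion onto the preset 3D image per a preset
    rule, e.g. speed v -> 2v and distance d -> 2d, optionally reversing
    the left/right moving direction (illustrative sketch)."""
    out_direction = (180.0 - direction_deg) % 360.0 if mirror else direction_deg
    return speed * speed_gain, out_direction, distance * dist_gain
```

Passing gains of 0.5 with `mirror=True` reproduces the second example rule (v/2, reversed direction, d/2).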
  • With the model action method of this embodiment of the present disclosure, two or more consecutive face images are acquired; the facial action feature parameters corresponding to the face changes are determined according to the face changes in the two or more face images; and a model action instruction is generated according to the facial action feature parameters, so that the preset 3D image with facial feature data executes the facial action corresponding to the model action instruction. This realizes the simulation of facial actions through a preset 3D image with facial feature data, improves the effect of facial action simulation, enhances its realism, and improves the interactive experience.
  • FIG. 2 is a schematic flowchart of a model action method provided in Embodiment 2 of the disclosure. This embodiment is described on the basis of the optional solutions in the foregoing embodiment. It includes the following:
  • S220 According to the face changes in the two or more face images, determine the facial motion feature parameters corresponding to the face changes.
  • Enriching the facial feature data of the face images can achieve the effect of improving the accuracy of the construction of the 3D face model.
  • Through a facial feature extraction algorithm, the facial feature data of the face image is extracted.
  • The facial feature data mainly represents the eyes, eyebrows, nose, mouth, ears, and face contour; a face can be uniquely represented by its facial feature data.
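One possible in-memory shape for such facial feature data, grouping landmark points under the part names listed above; the function itself is hypothetical, not defined by the patent:

```python
def pack_face_feature_data(landmarks):
    """Group extracted landmark points into the facial feature data the
    text describes: eyes, eyebrows, nose, mouth, ears and face contour.
    Missing groups default to empty lists (hypothetical helper)."""
    groups = ("eyes", "eyebrows", "nose", "mouth", "ears", "contour")
    return {g: list(landmarks.get(g, [])) for g in groups}
```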
  • According to the facial feature data, a 3D face model corresponding to the face image is constructed.
  • The 3D model used may be a general face model or a three-dimensional morphable model.
  • S250 Apply the 3D face model to a preset 3D image, and obtain a preset 3D image with the face feature data.
  • The preset 3D image with facial feature data has the same facial feature data as the user, achieving the purpose of simulating the user's appearance, so the preset 3D image can realize role-playing for the user.
  • For example, if the preset 3D image is a doctor, after applying the user's 3D face model to the preset 3D image, a doctor with the user's facial data is formed, that is, a doctor who looks the same as the user; the user thus gets to act as a doctor, which improves entertainment.
  • the preset 3D image with facial feature data is a 3D image with user facial feature data.
  • Model action instructions are generated based on the facial action feature parameters, so that the preset 3D image with facial feature data executes the corresponding facial action according to the instructions.
  • This realizes facial action interaction between the user and a preset 3D image that has the same facial feature data as the user, which adds interest and enhances the interactive experience.
  • In the technical solution of this embodiment, at least one face image at a preset angle is acquired, and the facial feature data of the at least one face image is extracted; a 3D face model corresponding to the at least one face image is constructed according to the facial feature data; and the 3D face model is applied to the preset 3D image to obtain a preset 3D image with the facial feature data. This realizes facial action interaction with an avatar that has the user's facial features, improving entertainment and enhancing the interactive experience.
  • This embodiment does not limit the execution order of steps S210-S220 relative to steps S230-S250: they can be performed in the order of this embodiment; steps S230, S240, and S250 can be performed first, followed by steps S210 and S220; or steps S210 and S220 can be executed synchronously with steps S230, S240, and S250.
  • FIG. 3 is a schematic flowchart of a model action method provided in Embodiment 3 of the disclosure. This embodiment is described on the basis of the optional solutions in the foregoing embodiment. It includes the following:
  • S320 Determine the moving speed, the moving direction and the moving distance of the preset part according to the position changes of the preset parts of the face in the two or more face images to determine the facial motion feature parameters.
  • The preset parts in the face image include, for example, the eyes, mouth, nose, eyebrows, chin, forehead, and cheeks.
  • For example, the position change of the mouth in the face images is detected.
  • From the position change, the moving direction and moving distance of the mouth can be determined; through the time information of the face images, the time period corresponding to the mouth movement can be determined, and then the moving speed of the mouth can be determined.
  • The moving speed, moving direction, and moving distance of the mouth serve as the moving speed, moving direction, and moving distance of the face.
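Over more than two images, the same idea extends to a trajectory: distance accumulates along the path, direction is taken from the net displacement, and speed divides by the elapsed time. A hedged sketch in which the function name and conventions are assumptions, not the patent's method:

```python
import math

def mouth_motion(positions, dt):
    """Derive moving speed, moving direction and moving distance of the mouth
    from its (x, y) position in each of two or more consecutive face images
    captured dt seconds apart (illustrative sketch)."""
    distance = 0.0
    for (x0, y0), (x1, y1) in zip(positions, positions[1:]):
        distance += math.hypot(x1 - x0, y1 - y0)    # path length across images
    dx = positions[-1][0] - positions[0][0]
    dy = positions[-1][1] - positions[0][1]
    direction = math.degrees(math.atan2(dy, dx))    # net moving direction
    speed = distance / (dt * (len(positions) - 1))  # average moving speed
    return {"speed": speed, "direction": direction, "distance": distance}
```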
  • According to the facial action feature parameters, a model action instruction is generated.
  • The model action instruction contains the facial moving speed, moving direction, and moving distance for the preset 3D image with facial feature data.
  • The facial moving speed, moving direction, and moving distance of the preset 3D image with facial feature data can be the same as or different from those of the user's face, and can be set according to preset rules; for the preset rules, refer to the above embodiments.
  • In the technical solution of this embodiment, two or more consecutive face images are acquired, and the moving speed, moving direction, and moving distance of a preset part of the face are determined according to the position changes of the preset part in the two or more face images, so as to determine the facial action feature parameters; this improves the accuracy of determining the moving speed, moving direction, and moving distance of the face. A model action instruction is generated according to the moving speed, moving direction, and moving distance, so that the preset 3D image with facial feature data executes the facial action corresponding to the model action instruction. This realizes the simulation of facial actions through a preset 3D image with facial feature data, improves the effect of facial action simulation, enhances its realism, and improves the interactive experience.
  • FIG. 4 is a schematic structural diagram of a model action device provided in the fourth embodiment of the disclosure.
  • the model action device includes: a face image acquisition module 410, a face action feature parameter determination module 420, and a preset 3D image action execution module 430, each of which is described below.
  • the face image acquisition module 410 is configured to acquire two or more consecutive face images.
  • the facial motion characteristic parameter determination module 420 is configured to determine the facial motion characteristic parameter corresponding to the facial change according to the facial changes in the two or more facial images.
  • The preset 3D image action execution module 430 is configured to generate a model action instruction according to the facial action feature parameters, so that the preset 3D image with facial feature data executes the facial action corresponding to the model action instruction.
  • the model action device provided in this embodiment can simulate a face action through a preset 3D image with face feature data, improve the effect of face action simulation, enhance the reality of face action simulation, and improve the interactive experience.
  • The apparatus further includes a preset 3D image acquisition module, configured to: before the model action instruction is generated according to the facial action feature parameters and the preset 3D image with facial feature data executes the corresponding facial action, acquire at least one face image at a preset angle and extract the facial feature data of the at least one face image; construct, according to the facial feature data, a 3D face model corresponding to the at least one face image; and apply the 3D face model to an initial preset 3D image to obtain a preset 3D image with the facial feature data.
  • the facial motion characteristic parameters include at least one of the following parameters: moving speed, moving direction, and moving distance.
  • The facial action feature parameter determination module is configured to: determine the moving speed, moving direction, and moving distance of a preset part of the face according to the position changes of the preset part in the two or more face images, so as to determine the facial action feature parameters.
  • The preset 3D image action execution module is configured to: generate a model action instruction according to the moving speed, moving direction, and moving distance, so that the preset 3D image with facial feature data executes the facial action corresponding to the model action instruction.
  • the preset 3D image with facial feature data is: a cartoon 3D image, a professional 3D image, or a gender 3D image.
  • the model action device provided by the present disclosure can execute the model action method provided by any embodiment of the present disclosure, and has the corresponding functional modules and effects for executing the model action method.
  • Fig. 5 is a schematic structural diagram of a speaker with a screen provided in the fifth embodiment of the disclosure.
  • The speaker with a screen includes: a main body 51, a controller 52 located in the main body 51, and at least two cameras 53 located on the main body 51; the distance between the at least two cameras 53 is greater than a distance threshold, and the controller 52 is provided with a model action apparatus as provided in any embodiment of the present disclosure.
  • the distance between at least two cameras is greater than the distance threshold.
  • For example, one camera can be placed on the upper part of the main body of the speaker and the other camera on the lower part, with the distance between them greater than the distance threshold. This makes it convenient to capture face images from multiple angles and directions, enriching the angle information in the face images and thus the facial feature data, so as to improve the accuracy of the 3D face model construction.
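The patent only requires the two cameras' separation to exceed a threshold so that multi-angle face data can be captured. As a hedged aside, a wide baseline also enables classical stereo depth estimation; the relation below is the standard pinhole-stereo formula, and applying it here is an illustrative assumption, not a claim of the patent:

```python
def stereo_depth(focal_px, baseline_m, disparity_px):
    """Pinhole-stereo depth relation Z = f * B / d, where f is the focal
    length in pixels, B the camera baseline in meters, and d the disparity
    in pixels (illustrative; not part of the patent's claims)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px
```

A larger baseline yields larger disparity for the same face, i.e. finer depth resolution, which is consistent with preferring well-separated cameras.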
  • Figure 6 is a schematic diagram of the control structure of a model action interface of a speaker with a screen provided by an embodiment of the disclosure.
  • The camera obtains the user's face image data, the facial feature data is extracted, a 3D face model is built, and the 3D face model is applied to a preset 3D image to obtain a preset 3D image with the facial feature data, so that the preset 3D image has the user's facial feature data. Then, when the user long-presses control 3, the camera captures the user's facial movements, and the preset 3D image with facial feature data performs facial actions according to the moving speed, moving direction, and moving distance of the user's face, in accordance with the conversion rules set by the user, so as to realize facial action interaction with the user.
  • a speaker with a screen can be applied to a point-to-read scene.
  • The image data of the point-to-read material is obtained through a camera of the speaker with a screen.
  • the image text content is obtained, the text content is converted into voice data, and the reading is realized through the speaker.
  • The screen of the speaker can display the preset 3D image with facial feature data, and clicking the preset 3D image realizes interesting point-reading. For example, if the preset 3D image with facial feature data is a teacher, point-reading by the teacher simulates real teaching, which makes learning more fun and improves learning efficiency.
  • The user can also interact with the preset 3D image with facial feature data displayed on the speaker's screen by using the model action method of the above embodiments, improving the interactive experience.
  • the speaker with a screen provided in this embodiment can simulate facial motions through a preset 3D image with facial feature data, improve the effect of facial motion simulation, enhance the authenticity of facial motion simulation, and improve interactive experience.
  • FIG. 7 shows a schematic structural diagram of an electronic device 600 suitable for implementing embodiments of the present disclosure.
  • The electronic devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (PDA), tablet computers (PAD), portable multimedia players (PMP), and car navigation terminals, and fixed terminals such as digital televisions (TV) and desktop computers.
  • the electronic device shown in FIG. 7 is only an example, and should not bring any limitation to the function and scope of use of the embodiments of the present disclosure.
  • The electronic device 600 may include a processing device (e.g., a central processing unit, a graphics processor) 601, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 608 into a random access memory (RAM) 603.
  • the RAM 603 also stores various programs and data required for the operation of the electronic device 600.
  • the processing device 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604.
  • An input/output (Input/Output, I/O) interface 605 is also connected to the bus 604.
  • The following devices can be connected to the I/O interface 605: input devices 606 such as a touch screen, touch panel, keyboard, mouse, camera, microphone, accelerometer, and gyroscope; output devices 607 such as a liquid crystal display (LCD), speakers, and vibrators; storage devices 608 such as magnetic tapes and hard disks; and a communication device 609.
  • the communication device 609 may allow the electronic device 600 to perform wireless or wired communication with other devices to exchange data.
  • Although FIG. 7 shows an electronic device 600 having multiple devices, it is not required to implement or have all the devices shown; more or fewer devices may alternatively be implemented or provided.
  • the process described above with reference to the flowchart may be implemented as a computer software program.
  • the embodiments of the present disclosure include a computer program product, which includes a computer program carried on a computer-readable medium, and the computer program contains program code for executing the method shown in the flowchart.
  • the computer program may be downloaded and installed from the network through the communication device 609, or from the storage device 608, or from the ROM 602.
  • When the computer program is executed by the processing device 601, the above-mentioned functions defined in the methods of the embodiments of the present disclosure are executed.
  • the aforementioned computer-readable medium of the present disclosure may be a computer-readable signal medium or a computer-readable storage medium or any combination of the two.
  • The computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above.
  • Examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, RAM, ROM, erasable programmable read-only memory (EPROM or flash memory), optical fiber, compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • the computer-readable storage medium may be any tangible medium that contains or stores a program, and the program may be used by or in combination with an instruction execution system, apparatus, or device.
  • the computer-readable signal medium may include a data signal that is propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. This propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • The computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; the computer-readable signal medium may send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device.
  • the program code contained on the computer-readable medium may be transmitted using any appropriate medium, including but not limited to: electric wires, optical cables, radio frequency (RF), etc., or any suitable combination of the foregoing.
  • the above-mentioned computer-readable medium may be included in the above-mentioned electronic device; or it may exist alone without being assembled into the electronic device.
  • The above-mentioned computer-readable medium carries one or more programs; when the one or more programs are executed by the electronic device, the electronic device: acquires two or more consecutive face images; determines, according to the face changes in the two or more face images, the facial action feature parameters corresponding to the face changes; and generates a model action instruction according to the facial action feature parameters, so that a preset 3D image with facial feature data executes the facial action corresponding to the model action instruction.
  • the computer program code for performing the operations of the present disclosure can be written in one or more programming languages or a combination thereof.
  • The above programming languages include object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code can be executed entirely on the user's computer, partly on the user's computer, executed as an independent software package, partly on the user's computer and partly executed on a remote computer, or entirely executed on the remote computer or server.
  • The remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it can be connected to an external computer (for example, via the Internet using an Internet service provider).
  • Each block in the flowchart or block diagram may represent a module, program segment, or part of code that contains one or more executable instructions for implementing the specified logic functions.
  • the functions marked in the block may also occur in a different order from the order marked in the drawings. For example, two blocks shown in succession can actually be executed substantially in parallel, or they can sometimes be executed in the reverse order, depending on the functions involved.
  • each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • the units described in the embodiments of the present disclosure may be implemented in software or hardware. In some cases, the name of a unit does not constitute a limitation on the unit itself.
  • the preset 3D image action execution module can also be described as an "action execution module".
  • the seventh embodiment of the present disclosure also provides a computer-readable storage medium on which a computer program is stored.
  • when the computer program is executed by a processor, the model action method provided in the embodiments of the present disclosure is implemented, and the method includes: acquiring two or more consecutive face images; determining, according to the face changes in the two or more face images, the facial action feature parameters corresponding to the face changes; and generating a model action instruction according to the facial action feature parameters, so that a preset 3D avatar with facial feature data executes the facial action corresponding to the model action instruction according to the model action instruction.
  • the computer-readable storage medium provided by the embodiments of the present disclosure is not limited to implementing the above method operations when the computer program stored thereon is executed; it can also implement related operations of the model action method provided in any embodiment of the present disclosure.
  • the present disclosure can be implemented by software and necessary general-purpose hardware, and of course, it can also be implemented by hardware. Based on this understanding, the technical solution of the present disclosure can be embodied in the form of a software product.
  • the computer software product can be stored in a computer-readable storage medium, such as a computer floppy disk, ROM, RAM, flash memory (FLASH), hard disk, or optical disk, and includes multiple instructions to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments of the present disclosure.
  • the included units and modules are divided only according to functional logic, but the division is not limited thereto as long as the corresponding functions can be realized; in addition, the names of the functional units are only for ease of mutual distinction and are not used to limit the protection scope of the present disclosure.


Abstract

A model action method and apparatus, a speaker having a screen, an electronic device, and a storage medium. Said method comprises: acquiring two or more continuous facial images (S110); determining, according to a facial change in the two or more facial images, facial action feature parameters corresponding to the facial change (S120); and generating a model action instruction according to the facial action feature parameters, so as to enable a pre-set 3D image having facial feature data to execute, according to the model action instruction, a facial action corresponding to the model action instruction (S130).

Description

Model action method and apparatus, speaker with screen, electronic device, and storage medium

This application claims priority to Chinese patent application No. 201910037303.3, filed with the Chinese Patent Office on January 15, 2019, the entire content of which is incorporated herein by reference.

Technical field

The embodiments of the present disclosure relate to image processing technology, and for example to a model action method and apparatus, a speaker with a screen, an electronic device, and a storage medium.

Background

With the development of electronic products, good interaction with an electronic product can meet users' needs and improve their experience of using the product.

Existing human-machine interaction methods remain in two-dimensional space: interaction is carried out by voice, or, in on-screen interaction, by detecting the moving distance, moving speed, and moving direction of the limbs on a plane. Interaction in two-dimensional space cannot simulate a real human figure, and the interaction effect is poor.
Summary

The present disclosure provides a model action method and apparatus, a speaker with a screen, an electronic device, and a storage medium, to solve the problem that a model action method in two-dimensional space cannot simulate a real human figure and the model action effect is poor.

An embodiment of the present disclosure provides a model action method, including:

acquiring two or more consecutive face images;

determining, according to the face changes in the two or more face images, the facial action feature parameters corresponding to the face changes; and

generating a model action instruction according to the facial action feature parameters, so that a preset three-dimensional (3 Dimensions, 3D) image with facial feature data executes the face action corresponding to the model action instruction according to the model action instruction.

An embodiment of the present disclosure further provides a model action apparatus, including:

a face image acquisition module, configured to acquire two or more consecutive face images;

a facial action feature parameter determination module, configured to determine, according to the face changes in the two or more face images, the facial action feature parameters corresponding to the face changes; and

a preset 3D image action execution module, configured to generate a model action instruction according to the facial action feature parameters, so that a preset 3D image with facial feature data executes the face action corresponding to the model action instruction according to the model action instruction.

An embodiment of the present disclosure further provides a speaker with a screen, including a main body, a controller located in the main body, and at least two cameras located on the main body; the distance between the at least two cameras is greater than a distance threshold, and the model action apparatus provided in any embodiment of the present disclosure is arranged in the controller.

An embodiment of the present disclosure further provides an electronic device, including:

one or more processors; and

a memory configured to store one or more programs;

where, when the one or more programs are executed by the one or more processors, the one or more processors implement the model action method described in any embodiment of the present disclosure.

An embodiment of the present disclosure further provides a computer-readable storage medium on which a computer program is stored, where the program, when executed by a processor, implements the model action method described in any embodiment of the present disclosure.
Brief description of the drawings

FIG. 1 is a flowchart of a model action method provided by Embodiment 1 of the present disclosure;

FIG. 2 is a flowchart of a model action method provided by Embodiment 2 of the present disclosure;

FIG. 3 is a flowchart of a model action method provided by Embodiment 3 of the present disclosure;

FIG. 4 is a schematic structural diagram of a model action apparatus provided by Embodiment 4 of the present disclosure;

FIG. 5 is a schematic structural diagram of a speaker with a screen provided by Embodiment 5 of the present disclosure;

FIG. 6 is a schematic diagram of the control structure of a model action interface of a speaker with a screen provided by Embodiment 5 of the present disclosure;

FIG. 7 is a schematic structural diagram of an electronic device provided by Embodiment 6 of the present disclosure.
Detailed description

The present disclosure will be described below with reference to the drawings and embodiments. The specific embodiments described here are only used to explain the present disclosure, not to limit it. For ease of description, only the parts of the structure related to the present disclosure, rather than the whole structure, are shown in the drawings.

In the following embodiments, each embodiment provides optional features and examples at the same time. The multiple features recorded in the embodiments can be combined to form multiple alternative solutions, and each numbered embodiment should not be regarded as only one technical solution.
Embodiment 1

FIG. 1 is a flowchart of a model action method provided by Embodiment 1 of the present disclosure. This embodiment is applicable to face action interaction. The method can be executed by a model action apparatus, which can be composed of hardware and/or software and can generally be integrated in an electronic device such as a mobile phone, a tablet, or a computer. The method includes the following steps.

S110. Acquire two or more consecutive face images.

Two or more consecutive face images are acquired through at least one camera, and the time interval between the images is preset. Optionally, one face image can be acquired by the camera every preset time interval; alternatively, face video data can be recorded by the camera and face images extracted from the face video data at the preset time interval.
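The second acquisition option above (recording face video data and extracting face images at a preset time interval) can be sketched as follows. The frame rate, interval, and frame representation are illustrative assumptions, not specified by the disclosure.

```python
def sample_face_images(video_frames, fps, interval_s):
    """Extract face images from recorded face video data at a preset
    time interval (illustrative sketch).

    video_frames: list of frames in capture order.
    fps: frames per second of the recording (assumed known).
    interval_s: preset time interval between sampled images, in seconds.
    """
    step = max(1, round(fps * interval_s))  # frames between two samples
    return video_frames[::step]

# A 1-second recording at 30 fps, sampled every 0.2 s, yields 5 face images
samples = sample_face_images(list(range(30)), fps=30, interval_s=0.2)
```

In a real system the frames would come from a camera capture API; here plain integers stand in for image buffers.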
S120. Determine, according to the face changes in the two or more face images, the facial action feature parameters corresponding to the face changes.

Face changes can be detected from two or more face images. In one embodiment, when detecting face changes in the face images, the face change is determined by comparing pixel changes at corresponding positions in two face images. Illustratively, the face change is determined from the change of one fixed part of the face, such as the eyes or the chin. Taking the eyes as an example, the change of the face is determined from the change of the eyes between the two images. The change of the eyes between the two images may be a position change, and the moving speed of the eyes can be confirmed from the interval time between the two images. From the position changes of the eyes across multiple images, the movement trajectory of the eyes can be obtained, and combined with the timing information of the movement, the movement speed information can be obtained; from this, the facial action feature parameters can be determined.

Optionally, the facial action feature parameters include at least one of the following: moving speed, moving direction, and moving distance. The face action may be shaking up and down, shaking left and right, or shaking in circles. The facial action feature parameter corresponding to a face action is at least one of the moving speed, the moving direction, and the moving distance. Optionally, the facial action feature parameters include the moving speed, the moving direction, and the moving distance. The face action can be reproduced from the facial action feature parameters, achieving the effect of interacting with the user.
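As a minimal sketch of how the three feature parameters could be derived from the position of one fixed facial part (e.g. the eyes) in two consecutive images, assuming pixel coordinates and a known capture interval (the disclosure does not prescribe a formula):

```python
import math

def motion_features(p1, p2, dt):
    """Moving distance, direction and speed of a facial part between two
    consecutive face images. p1, p2: (x, y) pixel positions of the part;
    dt: preset time interval between the two images, in seconds."""
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    distance = math.hypot(dx, dy)                 # moving distance (pixels)
    direction = math.degrees(math.atan2(dy, dx))  # moving direction (angle)
    speed = distance / dt                         # moving speed (pixels/s)
    return distance, direction, speed

# Eyes move from (100, 120) to (130, 160) between frames 0.5 s apart
dist, ang, v = motion_features((100, 120), (130, 160), 0.5)  # dist=50.0, v=100.0
```

Representing direction as an angle is one choice; a system could equally store a coarse label such as "left-right" or "up-down".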
S130. Generate a model action instruction according to the facial action feature parameters, so that a preset 3D image with facial feature data executes the face action corresponding to the model action instruction according to the model action instruction.

In this embodiment, the preset 3D image with facial feature data is a cartoon 3D image, a professional 3D image, or a gender 3D image. The cartoon 3D image can be, for example, an animal image such as a kitten, a puppy, or a monkey, or an animated character such as Peppa Pig or Chibi Maruko-chan. The professional 3D image can be, for example, a doctor, a teacher, a firefighter, or a police officer. The gender 3D image can be, for example, a man or a woman; age information can also be combined with the gender 3D image to set images such as a boy, a girl, an adult man, an adult woman, an elderly man, or an elderly woman.

In the above solution, optionally, determining the facial action feature parameters corresponding to the face changes according to the face changes in the two or more face images includes: determining the moving speed, the moving direction, and the moving distance of a preset part of the face according to the position changes of the preset part in the two or more face images, so as to determine the facial action feature parameters. The preset part can be, for example, the eyes, the chin, or the cheeks; the preset part is not limited here.
In the above solution, optionally, generating the model action instruction according to the facial action feature parameters so that the preset 3D image with facial feature data executes the face action corresponding to the model action instruction includes: generating the model action instruction according to the moving speed, the moving direction, and the moving distance, so that the preset 3D image with facial feature data executes the face action corresponding to the model action instruction according to the model action instruction. The model action instruction includes the moving speed, the moving direction, and the moving distance of the face; the preset 3D image with facial feature data simulates the face action according to the model action instruction, and face action interaction with the user is realized through the preset 3D image with facial feature data. The facial moving speed, moving direction, and moving distance of the preset 3D image contained in the model action instruction can be the same as those of the user's face, or can be set according to a preset rule.

For example, one preset rule can be: when the moving speed of the user's face is v, the facial moving speed of the preset 3D image with facial feature data is 2v; when the moving direction of the user's face is left-right, the facial moving direction of the preset 3D image is left-right; when the moving distance of the user's face is d, the facial moving distance of the preset 3D image is 2d. Another preset rule can be: when the moving speed of the user's face is v, the facial moving speed of the preset 3D image is v/2; when the moving direction of the user's face is left-right, the facial moving direction of the preset 3D image is right-left; when the moving distance of the user's face is d, the facial moving distance of the preset 3D image is d/2. The preset rules can be set arbitrarily to make the interaction more interesting.
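The two example preset rules above can be expressed as simple mappings from the user's face motion parameters to the avatar's. The rule names and the motion representation below are illustrative assumptions, not terms from the disclosure.

```python
def apply_preset_rule(speed, direction, distance, rule="identical"):
    """Map the user's face motion onto the preset 3D image's face motion.
    direction is kept as a signed value (e.g. +1 for left-to-right) so that
    negating it models the left-right -> right-left rule; this representation
    is an assumption for illustration."""
    if rule == "double":            # avatar moves at 2v over 2d, same direction
        return 2 * speed, direction, 2 * distance
    if rule == "half_reversed":     # avatar moves at v/2 over d/2, reversed
        return speed / 2, -direction, distance / 2
    return speed, direction, distance  # identical: reproduce the motion exactly

doubled  = apply_preset_rule(4.0, 1, 10.0, rule="double")         # (8.0, 1, 20.0)
reversed_ = apply_preset_rule(4.0, 1, 10.0, rule="half_reversed")  # (2.0, -1, 5.0)
```

The generated model action instruction would then carry the mapped parameters to the rendering side.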
The model action method provided by the embodiments of the present disclosure acquires two or more consecutive face images; determines, according to the face changes in the two or more face images, the facial action feature parameters corresponding to the face changes; and generates a model action instruction according to the facial action feature parameters, so that a preset 3D image with facial feature data executes the face action corresponding to the model action instruction according to the model action instruction. In this way, face actions can be simulated through the preset 3D image with facial feature data, which improves the effect and realism of face action simulation and enhances the interactive experience.
Embodiment 2

FIG. 2 is a flowchart of a model action method provided by Embodiment 2 of the present disclosure. This embodiment is described on the basis of the optional solutions in the above embodiment and includes the following steps.

S210. Acquire two or more consecutive face images.

S220. Determine, according to the face changes in the two or more face images, the facial action feature parameters corresponding to the face changes.

S230. Acquire a face image at at least one preset angle, and extract the facial feature data of the face image at the at least one preset angle.

Face images at at least one preset angle are acquired through at least two cameras, and the distance between the at least two cameras is greater than a distance threshold. This facilitates capturing face images from multiple angles and directions, enriches the angle information of the acquired face images, and thereby enriches the facial feature data of the face images, so as to improve the accuracy of 3D face model construction. The facial feature data of the face image is extracted through a facial feature extraction algorithm. The facial feature data mainly represents the eyes, eyebrows, nose, mouth, ears, and face contour, and a face can be uniquely represented by its facial feature data.
S240. Construct, according to the facial feature data, a 3D face model corresponding to the face image at the at least one preset angle.

According to the acquired facial feature data, a 3D face model corresponding to the face image is constructed. When constructing the 3D face model, the 3D model used may be a generic face model or a 3D morphable model.

S250. Apply the 3D face model to a preset 3D image to obtain a preset 3D image with the facial feature data.

After the constructed 3D face model is applied to the preset 3D image, the preset 3D image with facial feature data has the same facial feature data as the user, achieving the purpose of simulating the user's appearance. Moreover, the preset 3D image enables role-playing by the user: if the preset 3D image is a doctor, then after the user's 3D face model is applied to the preset 3D image, a doctor with the user's face data is formed, i.e., a doctor who looks the same as the user is obtained, so that the user plays the role of a doctor, which improves entertainment.

S260. Generate a model action instruction according to the facial action feature parameters, so that the preset 3D image with facial feature data executes the face action corresponding to the model action instruction according to the model action instruction.

The preset 3D image with facial feature data is a 3D image with the user's facial feature data. A model action instruction is generated according to the facial action feature parameters, and the preset 3D image with facial feature data executes the corresponding face action according to the model action instruction. In this way, the user carries out face action interaction with a preset 3D image that has the same facial feature data as the user, which makes the interaction more interesting and enhances the interactive experience.

In the technical solution of this embodiment, a face image at at least one preset angle is acquired, and the facial feature data of the face image at the at least one preset angle is extracted; a 3D face model corresponding to the face image at the at least one preset angle is constructed according to the facial feature data; and the 3D face model is applied to a preset 3D image to obtain a preset 3D image with the facial feature data. In this way, face action interaction with a virtual image having the user's facial features is realized, which improves entertainment and enhances the interactive experience.

This embodiment does not limit the execution order of steps S210 and S220 relative to steps S230, S240, and S250: they can be executed in the order given in this embodiment; steps S230, S240, and S250 can be executed first, followed by steps S210 and S220; or steps S210 and S220 can be executed in parallel with steps S230, S240, and S250.
Embodiment 3

FIG. 3 is a flowchart of a model action method provided by Embodiment 3 of the present disclosure. This embodiment is described on the basis of the optional solutions in the above embodiment and includes the following steps.

S310. Acquire two or more consecutive face images.

S320. Determine the moving speed, the moving direction, and the moving distance of a preset part of the face according to the position changes of the preset part in the two or more face images, so as to determine the facial action feature parameters.

The preset parts in a face image include at least: the eyes, mouth, nose, eyebrows, chin, forehead, and cheeks. In this embodiment, taking the mouth as the preset part of the face image as an example, the position change of the mouth in the face images is detected. Among the multiple acquired face images, the moving direction and the moving distance of the mouth can be determined by detecting the position of the mouth in each face image; the time period corresponding to the movement of the mouth can be determined from the time information of the face images, and then the moving speed of the mouth can be determined. The moving speed, the moving direction, and the moving distance of the mouth are the moving speed, the moving direction, and the moving distance of the face.
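Extending the two-image case to a sequence, the per-step motion of the preset part (here: the mouth) can be computed from its detected positions and the capture timestamps. The position and timestamp representation below is an illustrative assumption; the detection step itself is outside this sketch.

```python
import math

def track_preset_part(positions, timestamps):
    """Per-step moving distance, direction and speed of a preset facial
    part (e.g. the mouth) across consecutive face images.
    positions: list of (x, y) detections; timestamps: capture times (s)."""
    steps = []
    for (x1, y1), (x2, y2), t1, t2 in zip(
            positions, positions[1:], timestamps, timestamps[1:]):
        dx, dy = x2 - x1, y2 - y1
        dist = math.hypot(dx, dy)
        steps.append({
            "distance": dist,
            "direction": math.degrees(math.atan2(dy, dx)),
            "speed": dist / (t2 - t1),
        })
    return steps

# Mouth detected in three consecutive face images captured 0.5 s apart
track = track_preset_part([(50, 50), (53, 54), (53, 54)], [0.0, 0.5, 1.0])
```

The resulting step list approximates the movement trajectory mentioned above; a zero-distance step indicates the part held still between two images.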
S330. Generate a model action instruction according to the moving speed, the moving direction, and the moving distance, so that a preset 3D image with facial feature data executes the face action corresponding to the model action instruction according to the model action instruction.

A model action instruction is generated according to the moving speed, the moving direction, and the moving distance of the user's face. The model action instruction includes the facial moving speed, moving direction, and moving distance of the preset 3D image with facial feature data, which can be the same as or different from those of the user's face and can be set according to preset rules; for the preset rules, refer to the above embodiments.

In the technical solution of this embodiment, two or more consecutive face images are acquired, and the moving speed, the moving direction, and the moving distance of a preset part of the face are determined according to the position changes of the preset part in the two or more face images, so as to determine the facial action feature parameters, which improves the accuracy of determining the moving speed, the moving direction, and the moving distance of the face. A model action instruction is generated according to the moving speed, the moving direction, and the moving distance, so that a preset 3D image with facial feature data executes the face action corresponding to the model action instruction according to the model action instruction. In this way, face actions are simulated through the preset 3D image with facial feature data, which improves the effect and realism of face action simulation and enhances the interactive experience.
Embodiment 4

FIG. 4 is a schematic structural diagram of a model action apparatus provided by Embodiment 4 of the present disclosure. Referring to FIG. 4, the model action apparatus includes a face image acquisition module 410, a facial action feature parameter determination module 420, and a preset 3D image action execution module 430, each of which is described below.

The face image acquisition module 410 is configured to acquire two or more consecutive face images.

The facial action feature parameter determination module 420 is configured to determine, according to the face changes in the two or more face images, the facial action feature parameters corresponding to the face changes.

The preset 3D image action execution module 430 is configured to generate a model action instruction according to the facial action feature parameters, so that a preset 3D image with facial feature data executes the face action corresponding to the model action instruction according to the model action instruction.

The model action apparatus provided in this embodiment can simulate face actions through a preset 3D image with facial feature data, improving the effect and realism of face action simulation and enhancing the interactive experience.
上述方案中,可选的是,该装置还包括:预设3D形象获取模块,设置为在根据所述人脸动作特征参数生成模型动作指令,使具有人脸特征数据的预设3D形象根据所述模型动作指令执行与所述模型动作指令对应的人脸动作之前,获取至少一个预设角度的人脸图像,提取所述至少一个预设角度的人脸图像的人脸特征数据;根据所述人脸特征数据,构建所述至少一个预设角度的人脸图像对应的人脸3D模型;将所述人脸3D模型应用于初始预设3D形象,获得具有所述人脸特征数据的预设3D形象。In the above solution, optionally, the device further includes: a preset 3D image acquisition module, configured to generate model action instructions according to the facial motion feature parameters, so that the preset 3D image with facial feature data is Before the model action instruction executes the face action corresponding to the model action instruction, acquiring at least one face image with a preset angle, and extracting face feature data of the face image at the at least one preset angle; Face feature data, constructing a face 3D model corresponding to the face image at the at least one preset angle; applying the face 3D model to an initial preset 3D image to obtain a preset with the face feature data 3D image.
一实施例中,所述人脸动作特征参数包括如下参数中的至少之一:移动速度、移动方向、移动距离。In an embodiment, the facial motion characteristic parameters include at least one of the following parameters: moving speed, moving direction, and moving distance.
上述方案中,可选的是,所述人脸动作特征参数确定模块,是设置为:In the above solution, optionally, the facial motion feature parameter determination module is set to:
根据所述两张或两张以上的人脸图像中的人脸的预设部位的位置变化,确定所述预设部位的移动速度、移动方向和移动距离以确定所述人脸动作特征参数。According to the position changes of the preset parts of the face in the two or more face images, the moving speed, the moving direction and the moving distance of the preset parts are determined to determine the facial motion characteristic parameters.
In the above solution, optionally, the preset 3D avatar action execution module is configured to:
generate the model action instruction according to the movement speed, the movement direction, and the movement distance, so that the preset 3D avatar with facial feature data executes, according to the model action instruction, the facial action corresponding to the model action instruction.
In an embodiment, the preset 3D avatar with facial feature data is a cartoon 3D avatar, an occupation-specific 3D avatar, or a gender-specific 3D avatar.
The model action apparatus provided by the present disclosure can execute the model action method provided by any embodiment of the present disclosure, and has the functional modules and effects corresponding to that method.
Embodiment 5
FIG. 5 is a schematic structural diagram of a speaker with a screen provided in Embodiment 5 of the present disclosure. Referring to FIG. 5, the speaker with a screen includes a main body 51, a controller 52 located inside the main body 51, and at least two cameras 53 located on the main body 51. The distance between the at least two cameras 53 is greater than a distance threshold, and the controller 52 is provided with any model action apparatus provided by the embodiments of the present disclosure.
The distance between the at least two cameras is greater than a distance threshold. Taking two cameras as an example, one camera may be placed at an upper position of the speaker's main body and the other at a lower position. Keeping the inter-camera distance above the threshold makes it convenient to capture face images from multiple angles and directions, enriching the angle information and hence the facial feature data obtained from the face images, which improves the accuracy of the constructed 3D face model.
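One reason a wider camera baseline helps is the standard pinhole-stereo relation Z = f·B/d: at a fixed depth Z, a larger baseline B yields a larger disparity d, which is easier to measure reliably. The embodiment does not state this relation explicitly, and the focal length, baseline, and disparity values below are purely illustrative:

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Classic stereo depth estimate: Z = f * B / d, with focal length
    in pixels, baseline in metres, and disparity in pixels."""
    return focal_px * baseline_m / disparity_px

# A scene point seen by the upper and lower camera with 40 px disparity,
# for an assumed 800 px focal length and 0.15 m baseline:
print(depth_from_disparity(800, 0.15, 40))  # 3.0 (metres)
```

Doubling the baseline would double the disparity for the same point, so a per-pixel measurement error corrupts the depth estimate half as much.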
FIG. 6 is a schematic diagram of the control layout of a model action interface of a speaker with a screen provided by an embodiment of the present disclosure. After starting the model action interface, the user selects, through settings menu control 1, the type of the preset 3D avatar with facial feature data, as well as the conversion rules between the avatar's facial movement speed, direction, and distance and the movement speed, direction, and distance of the user's face. The avatar type can also be set quickly through control 2. The user first taps control 3 several times: the camera acquires the user's face image data, facial feature data is extracted, a 3D face model is constructed, and the 3D face model is applied to the preset 3D avatar so that the avatar carries the user's facial feature data. The user then long-presses control 3: the camera captures the user's facial actions, and the avatar performs facial actions according to the user's face movement speed, direction, and distance, following the user-defined conversion rules, thereby interacting with the user through facial actions.
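In the simplest case, the user-set conversion rule can be modeled as a scale factor applied to the measured motion parameters when the model action instruction is built. The dictionary format, the field names, and the `scale` parameter are all assumptions made for this illustration, not part of the patent:

```python
def make_action_instruction(distance, direction, speed, scale=1.0):
    """Build a toy model action instruction from the facial motion
    feature parameters. `scale` stands in for the user-defined
    conversion rule between user face motion and avatar motion."""
    return {
        "type": "face_move",
        "distance": distance * scale,  # avatar moves scale * user distance
        "direction": direction,        # direction passed through unchanged
        "speed": speed * scale,        # avatar speed scaled the same way
    }

# A rule of 0.5 makes the avatar move half as far and half as fast:
cmd = make_action_instruction(5.0, 53.1, 150.0, scale=0.5)
print(cmd["distance"], cmd["speed"])  # 2.5 75.0
```

A richer rule set could scale distance and speed independently, or remap directions, without changing this overall shape.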
Exemplarily, the speaker with a screen can be applied to a point-and-read scenario. The camera of the speaker captures image data of the reading material, for example a book placed on a desk. By analyzing the image data, the text content in the image is obtained and converted into speech data, which is played through the speaker to realize point-and-read. The screen of the speaker can display the preset 3D avatar with facial feature data and let the avatar present the reading, making point-and-read more engaging; for example, if the avatar is a teacher, having the teacher present the reading simulates real teaching, making learning more enjoyable and thus more efficient.
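The point-and-read flow described above (capture, text recognition, speech synthesis, playback) can be sketched as a small pipeline. Since the embodiment names no concrete OCR or TTS library, those steps are passed in as callables and replaced by trivial stand-ins in the example:

```python
def point_read(image, recognize_text, synthesize_speech, play):
    """Point-and-read pipeline sketch: the recognition, synthesis and
    playback steps are injected so any concrete engine could be used."""
    text = recognize_text(image)      # image data -> recognized text
    audio = synthesize_speech(text)   # text -> speech data
    play(audio)                       # play through the speaker
    return text

# Wiring with trivial stand-ins, just to show the data flow:
spoken = []
text = point_read("page.jpg",
                  recognize_text=lambda img: "hello",
                  synthesize_speech=lambda t: f"<audio:{t}>",
                  play=spoken.append)
print(text, spoken[0])  # hello <audio:hello>
```

In a real device the two injected steps would be an OCR engine and a TTS engine, and `play` would drive the speaker hardware.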
The user can also interact, through facial actions, with the preset 3D avatar displayed on the speaker's screen by using the model action method of the embodiments disclosed above, which improves the interactive experience.
The speaker with a screen provided in this embodiment can simulate facial actions through a preset 3D avatar with facial feature data, improving the effect and realism of facial action simulation and enhancing the interactive experience.
Embodiment 6
Reference is now made to FIG. 7, which shows a schematic structural diagram of an electronic device 600 suitable for implementing the embodiments of the present disclosure. Electronic devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, personal digital assistants (PDAs), tablet computers (PADs), portable multimedia players (PMPs), and vehicle-mounted terminals (e.g., vehicle navigation terminals), as well as fixed terminals such as digital televisions (TVs) and desktop computers. The electronic device shown in FIG. 7 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in FIG. 7, the electronic device 600 may include a processing device (e.g., a central processing unit or a graphics processor) 601, which can perform various appropriate actions and processing according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage device 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the electronic device 600. The processing device 601, the ROM 602, and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
Generally, the following devices may be connected to the I/O interface 605: input devices 606 including, for example, a touch screen, a touch pad, a keyboard, a mouse, a camera, a microphone, an accelerometer, and a gyroscope; output devices 607 including, for example, a liquid crystal display (LCD), a speaker, and a vibrator; storage devices 608 including, for example, a magnetic tape and a hard disk; and a communication device 609. The communication device 609 may allow the electronic device 600 to communicate wirelessly or by wire with other devices to exchange data. Although FIG. 7 shows an electronic device 600 with multiple devices, it is not required to implement or include all of the illustrated devices; more or fewer devices may be implemented or provided instead.
According to the embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, the embodiments of the present disclosure include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication device 609, installed from the storage device 608, or installed from the ROM 602. When the computer program is executed by the processing device 601, the above functions defined in the methods of the embodiments of the present disclosure are performed.
The computer-readable medium described above in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium, or any combination of the two. The computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. Examples of computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, RAM, ROM, an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program for use by or in combination with an instruction execution system, apparatus, or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the foregoing. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; it may send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained on a computer-readable medium may be transmitted using any appropriate medium, including but not limited to an electric wire, an optical cable, radio frequency (RF), or any suitable combination of the foregoing.
The above computer-readable medium may be included in the above electronic device, or it may exist alone without being assembled into the electronic device.
The above computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire two or more consecutive face images; determine, according to the face changes in the two or more face images, the facial action feature parameters corresponding to the face changes; and generate a model action instruction according to the facial action feature parameters, so that a preset 3D avatar with facial feature data executes, according to the model action instruction, the facial action corresponding to the model action instruction.
The computer program code for performing the operations of the present disclosure may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar languages. The program code may be executed entirely on the user's computer, partly on the user's computer, as an independent software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of systems, methods, and computer program products according to the embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations, the functions marked in the blocks may occur in an order different from that marked in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented in software or in hardware. In some cases, the name of a unit does not constitute a limitation on the unit itself; for example, the preset 3D avatar action execution module may also be described as an "action execution module".
Embodiment 7
Embodiment 7 of the present disclosure also provides a computer-readable storage medium on which a computer program is stored. When the program is executed by the model action apparatus, the model action method provided by the embodiments of the present disclosure is implemented. The method includes: acquiring two or more consecutive face images; determining, according to the face changes in the two or more face images, the facial action feature parameters corresponding to the face changes; and generating a model action instruction according to the facial action feature parameters, so that a preset 3D avatar with facial feature data executes the facial action corresponding to the model action instruction.
When executed, the computer program stored on the computer-readable storage medium provided by the embodiments of the present disclosure is not limited to implementing the method operations described above; it can also implement related operations of the model action method provided by any embodiment of the present disclosure.
From the above description of the embodiments, the present disclosure can be implemented by software plus necessary general-purpose hardware, and of course also by hardware alone. Based on this understanding, the technical solution of the present disclosure can be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a computer floppy disk, ROM, RAM, flash memory (FLASH), hard disk, or optical disc, and which includes multiple instructions to make a computer device (which may be a personal computer, a server, or a network device) execute the methods described in the embodiments of the present disclosure.
In the above embodiments of the model action apparatus, the included units and modules are divided only according to functional logic, but the division is not limited to the above as long as the corresponding functions can be realized; in addition, the names of the functional units are only for ease of mutual distinction and are not used to limit the protection scope of the present disclosure.

Claims (15)

  1. A model action method, comprising:
    acquiring two or more consecutive face images;
    determining, according to face changes in the two or more face images, facial action feature parameters corresponding to the face changes; and
    generating a model action instruction according to the facial action feature parameters, so that a preset three-dimensional (3D) avatar with facial feature data executes, according to the model action instruction, a facial action corresponding to the model action instruction.
  2. The method according to claim 1, further comprising, before generating the model action instruction according to the facial action feature parameters so that the preset 3D avatar with facial feature data executes the facial action corresponding to the model action instruction:
    acquiring a face image at at least one preset angle, and extracting facial feature data of the face image at the at least one preset angle;
    constructing, according to the facial feature data, a 3D face model corresponding to the face image at the at least one preset angle; and
    applying the 3D face model to a preset 3D avatar to obtain the preset 3D avatar with the facial feature data.
  3. The method according to claim 1, wherein the facial action feature parameters comprise at least one of the following: movement speed, movement direction, and movement distance.
  4. The method according to claim 3, wherein determining, according to the face changes in the two or more face images, the facial action feature parameters corresponding to the face changes comprises:
    determining, according to position changes of a preset part of the face in the two or more face images, the movement speed, movement direction, and movement distance of the preset part, thereby determining the facial action feature parameters.
  5. The method according to claim 4, wherein generating the model action instruction according to the facial action feature parameters so that the preset 3D avatar with facial feature data executes the facial action corresponding to the model action instruction comprises:
    generating the model action instruction according to the movement speed, the movement direction, and the movement distance, so that the preset 3D avatar with facial feature data executes the facial action corresponding to the model action instruction.
  6. The method according to claim 1, wherein the preset 3D avatar with facial feature data is a cartoon 3D avatar, an occupation-specific 3D avatar, or a gender-specific 3D avatar.
  7. A model action apparatus, comprising:
    a face image acquisition module, configured to acquire two or more consecutive face images;
    a facial action feature parameter determination module, configured to determine, according to face changes in the two or more face images, facial action feature parameters corresponding to the face changes; and
    a preset three-dimensional (3D) avatar action execution module, configured to generate a model action instruction according to the facial action feature parameters, so that a preset 3D avatar with facial feature data executes, according to the model action instruction, a facial action corresponding to the model action instruction.
  8. The apparatus according to claim 7, further comprising:
    a preset 3D avatar acquisition module, configured to, before the model action instruction is generated according to the facial action feature parameters and the preset 3D avatar with facial feature data executes the facial action corresponding to the model action instruction: acquire a face image at at least one preset angle and extract facial feature data of the face image at the at least one preset angle; construct, according to the facial feature data, a 3D face model corresponding to the face image at the at least one preset angle; and apply the 3D face model to a preset 3D avatar to obtain the preset 3D avatar with the facial feature data.
  9. The apparatus according to claim 7, wherein the facial action feature parameters comprise at least one of the following: movement speed, movement direction, and movement distance.
  10. The apparatus according to claim 9, wherein the facial action feature parameter determination module is configured to determine, according to position changes of a preset part of the face in the two or more face images, the movement speed, movement direction, and movement distance of the preset part, thereby determining the facial action feature parameters.
  11. The apparatus according to claim 10, wherein the preset 3D avatar action execution module is configured to generate the model action instruction according to the movement speed, the movement direction, and the movement distance, so that the preset 3D avatar with facial feature data executes the facial action corresponding to the model action instruction.
  12. The apparatus according to claim 7, wherein the preset 3D avatar with facial feature data is a cartoon 3D avatar, an occupation-specific 3D avatar, or a gender-specific 3D avatar.
  13. A speaker with a screen, comprising a main body, a controller located inside the main body, and at least two cameras located on the main body, wherein the distance between the at least two cameras is greater than a distance threshold, and the controller is provided with the model action apparatus according to any one of claims 7-12.
  14. An electronic device, comprising:
    one or more processors; and
    a memory configured to store one or more programs,
    wherein when the one or more programs are executed by the one or more processors, the one or more processors implement the model action method according to any one of claims 1-6.
  15. A computer-readable storage medium storing a computer program which, when executed by a processor, implements the model action method according to any one of claims 1-6.
PCT/CN2020/070375 2019-01-15 2020-01-06 Model action method and apparatus, speaker having screen, electronic device, and storage medium WO2020147598A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910037303.3 2019-01-15
CN201910037303.3A CN111435546A (en) 2019-01-15 2019-01-15 Model action method and device, sound box with screen, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2020147598A1 true WO2020147598A1 (en) 2020-07-23

Family

ID=71580067

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/070375 WO2020147598A1 (en) 2019-01-15 2020-01-06 Model action method and apparatus, speaker having screen, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN111435546A (en)
WO (1) WO2020147598A1 (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103105924A (en) * 2011-11-15 2013-05-15 中国科学院深圳先进技术研究院 Man-machine interaction method and device
CN104616347A (en) * 2015-01-05 2015-05-13 掌赢信息科技(上海)有限公司 Expression migration method, electronic equipment and system
US20150178988A1 (en) * 2012-05-22 2015-06-25 Telefonica, S.A. Method and a system for generating a realistic 3d reconstruction model for an object or being
CN106447785A (en) * 2016-09-30 2017-02-22 北京奇虎科技有限公司 Method for driving virtual character and device thereof
CN107479693A (en) * 2017-07-07 2017-12-15 大圣科技股份有限公司 Real-time hand recognition methods based on RGB information, storage medium, electronic equipment
CN108875633A (en) * 2018-06-19 2018-11-23 北京旷视科技有限公司 Expression detection and expression driving method, device and system and storage medium

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1991817A (en) * 2005-12-29 2007-07-04 罗兆鑫 E-mail auxiliary and words-to-voice system
CN102169642B (en) * 2011-04-06 2013-04-03 沈阳航空航天大学 Interactive virtual teacher system having intelligent error correction function
CN102222363B (en) * 2011-07-19 2012-10-03 杭州实时数码科技有限公司 Method for fast constructing high-accuracy personalized face model on basis of facial images
CN103413468A (en) * 2013-08-20 2013-11-27 苏州跨界软件科技有限公司 Parent-child educational method based on a virtual character
CN103414782A (en) * 2013-08-20 2013-11-27 苏州跨界软件科技有限公司 Parent-child system and method based on virtual character
CN105590486A (en) * 2014-10-21 2016-05-18 黄小曼 Machine vision-based pedestal-type finger reader, related system device and related method
WO2017000213A1 (en) * 2015-06-30 2017-01-05 北京旷视科技有限公司 Living-body detection method and device and computer program product
CN107333086A (en) * 2016-04-29 2017-11-07 掌赢信息科技(上海)有限公司 A kind of method and device that video communication is carried out in virtual scene
CN106023692A (en) * 2016-05-13 2016-10-12 广东博士早教科技有限公司 AR interest learning system and method based on entertainment interaction
CN108229239B (en) * 2016-12-09 2020-07-10 武汉斗鱼网络科技有限公司 Image processing method and device
CN106910247B (en) * 2017-03-20 2020-10-02 厦门黑镜科技有限公司 Method and apparatus for generating three-dimensional avatar model
CN107705355A (en) * 2017-09-08 2018-02-16 郭睿 A kind of 3D human body modeling methods and device based on plurality of pictures
CN107831902B (en) * 2017-11-23 2020-08-25 腾讯科技(上海)有限公司 Motion control method and device, storage medium and terminal
CN108090463B (en) * 2017-12-29 2021-10-26 腾讯科技(深圳)有限公司 Object control method, device, storage medium and computer equipment
CN108615256B (en) * 2018-03-29 2022-04-12 西南民族大学 Human face three-dimensional reconstruction method and device
CN108806360A (en) * 2018-05-31 2018-11-13 北京智能管家科技有限公司 Reading partner method, apparatus, equipment and storage medium
CN109118562A (en) * 2018-08-31 2019-01-01 百度在线网络技术(北京)有限公司 Explanation video creating method, device and the terminal of virtual image

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103105924A (en) * 2011-11-15 2013-05-15 中国科学院深圳先进技术研究院 Man-machine interaction method and device
US20150178988A1 (en) * 2012-05-22 2015-06-25 Telefonica, S.A. Method and a system for generating a realistic 3d reconstruction model for an object or being
CN104616347A (en) * 2015-01-05 2015-05-13 掌赢信息科技(上海)有限公司 Expression migration method, electronic equipment and system
CN106447785A (en) * 2016-09-30 2017-02-22 北京奇虎科技有限公司 Method for driving virtual character and device thereof
CN107479693A (en) * 2017-07-07 2017-12-15 大圣科技股份有限公司 Real-time hand recognition methods based on RGB information, storage medium, electronic equipment
CN108875633A (en) * 2018-06-19 2018-11-23 北京旷视科技有限公司 Expression detection and expression driving method, device, system and storage medium

Also Published As

Publication number Publication date
CN111435546A (en) 2020-07-21

Similar Documents

Publication Publication Date Title
US20210029305A1 (en) Method and apparatus for adding a video special effect, terminal device and storage medium
CN112379812B (en) Simulation 3D digital human interaction method and device, electronic equipment and storage medium
CN109462776B (en) Video special effect adding method and device, terminal equipment and storage medium
WO2020186935A1 (en) Virtual object displaying method and device, electronic apparatus, and computer-readable storage medium
WO2022068479A1 (en) Image processing method and apparatus, and electronic device and computer-readable storage medium
WO2020107908A1 (en) Multi-user video special effect adding method and apparatus, terminal device and storage medium
US10166477B2 (en) Image processing device, image processing method, and image processing program
CN109474850B (en) Motion pixel video special effect adding method and device, terminal equipment and storage medium
WO2022170958A1 (en) Augmented reality-based display method and device, storage medium, and program product
WO2022116751A1 (en) Interaction method and apparatus, and terminal, server and storage medium
US20230419582A1 (en) Virtual object display method and apparatus, electronic device, and medium
WO2022088928A1 (en) Elastic object rendering method and apparatus, device, and storage medium
WO2020186934A1 (en) Method, apparatus, and electronic device for generating animation containing dynamic background
CN109600559B (en) Video special effect adding method and device, terminal equipment and storage medium
US20230182028A1 (en) Game live broadcast interaction method and apparatus
WO2023116653A1 (en) Element display method and apparatus, and electronic device and storage medium
TW200541330A (en) Method and system for real-time interactive video
US11756251B2 (en) Facial animation control by automatic generation of facial action units using text and speech
JP2022500795A (en) Avatar animation
WO2023195909A2 (en) Determination method and apparatus for video with special effects, electronic device, and storage medium
WO2022012349A1 (en) Animation processing method and apparatus, electronic device, and storage medium
WO2023116562A1 (en) Image display method and apparatus, electronic device, and storage medium
WO2020147598A1 (en) Model action method and apparatus, speaker having screen, electronic device, and storage medium
EP4071725A1 (en) Augmented reality-based display method and device, storage medium, and program product
WO2022188145A1 (en) Method for interaction between display device and terminal device, and storage medium and electronic device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
Ref document number: 20741152
Country of ref document: EP
Kind code of ref document: A1
NENP Non-entry into the national phase
Ref country code: DE
32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established
Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 05.11.2021)
122 Ep: pct application non-entry in european phase
Ref document number: 20741152
Country of ref document: EP
Kind code of ref document: A1