CN115619914A - Human body posture synchronous animation display method and device and automobile - Google Patents

Human body posture synchronous animation display method and device and automobile

Info

Publication number
CN115619914A
Authority
CN
China
Prior art keywords
human
human body
skeleton
image frame
animation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211350222.7A
Other languages
Chinese (zh)
Inventor
陈光银
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing Changan Automobile Co Ltd
Original Assignee
Chongqing Changan Automobile Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing Changan Automobile Co Ltd filed Critical Chongqing Changan Automobile Co Ltd
Priority to CN202211350222.7A
Publication of CN115619914A

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00 Animation
    • G06T13/20 3D [Three Dimensional] animation
    • G06T13/40 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The disclosure relates to the technical field of intelligent automobiles and discloses a human body posture synchronous animation display method and device, and an automobile. The method comprises the following steps: acquiring video stream data captured by a camera; extracting person image frames from the video stream data, and recognizing the person image frames with a skeleton recognition algorithm to obtain two-dimensional coordinate data of human skeleton key points; converting the two-dimensional coordinate data of the human skeleton key points into three-dimensional coordinate data; and driving the animated figure displayed on a screen to move based on the three-dimensional coordinate data of the human skeleton key points, so that the animated figure is synchronized with the human body posture in the person image frames, wherein the animated figure comprises a three-dimensional animated character or a three-dimensional animated animal. The method and device enable the virtual animated figure on the screen to perform actions synchronized with the human posture, improving the degree of intelligence of human-computer interaction.

Description

Human body posture synchronous animation display method and device and automobile
Technical Field
The disclosure relates to the technical field of intelligent automobiles, and in particular to a human body posture synchronous animation display method and device, and an automobile.
Background
With the development of automobile intelligence, many current in-vehicle systems use voice assistants or intelligent assistants to realize human-vehicle interaction. To make such interaction feel more intelligent, some automobiles give the voice assistant or intelligent assistant a visible avatar that performs certain action changes when the user wakes the assistant. However, these action changes are usually preset, so the degree of intelligence of human-vehicle interaction is not high.
Disclosure of Invention
In view of the defects of the prior art, the present disclosure provides a human body posture synchronous animation display method and device, and an automobile, to solve the technical problem that the degree of intelligence of human-computer interaction in existing automobiles is not high.
According to an aspect of the embodiments of the present disclosure, a human body posture synchronous animation display method is provided, comprising: acquiring video stream data captured by a camera; extracting person image frames from the video stream data, and recognizing the person image frames with a skeleton recognition algorithm to obtain two-dimensional coordinate data of human skeleton key points; converting the two-dimensional coordinate data of the human skeleton key points into three-dimensional coordinate data; and driving the animated figure displayed on the screen to move based on the three-dimensional coordinate data of the human skeleton key points, so that the animated figure is synchronized with the human body posture in the person image frames, wherein the animated figure comprises a three-dimensional animated character or a three-dimensional animated animal.
In some optional embodiments, converting the two-dimensional coordinate data of the human skeleton key points into three-dimensional coordinate data comprises: acquiring two-dimensional coordinate data of human skeleton key points for a base person image frame and for real-time person image frames in the video stream data; for each pair of adjacent parent-child skeleton points in the real-time person image frame, calculating the distance from the child skeleton point to the straight line on which the corresponding adjacent parent-child skeleton points in the base person image frame lie, and taking this distance as the depth of the child skeleton point in three-dimensional space; determining the depth direction of the child skeleton points in the current real-time person image frame in three-dimensional space based on the continuous action trend of the human skeleton key points over consecutive real-time person image frames; and generating three-dimensional coordinate data of the human skeleton key points in the real-time person image frame based on the two-dimensional coordinate data of the child skeleton points in the real-time person image frame and their depth and depth direction in three-dimensional space.
In some optional embodiments, determining the depth direction of the child skeleton points in the current real-time person image frame in three-dimensional space based on the continuous action trend of the human skeleton key points over consecutive real-time person image frames comprises: acquiring in advance initial constraints on human body actions, wherein the initial constraints comprise various basic actions of the human body and, for each basic action, the depth direction in three-dimensional space of the corresponding human skeleton key points; estimating the target action of the human body based on the position-change tracks in two-dimensional space of the human skeleton key points over consecutive real-time person image frames; and determining the depth direction of the child skeleton points in the current real-time person image frame in three-dimensional space based on the initial constraint of the target action.
In some optional embodiments, the base person image frame comprises the first person image frame in the video stream data.
In some optional embodiments, generating the three-dimensional coordinate data of the human skeleton key points in the real-time person image frame comprises: representing the three-dimensional coordinate data of the human skeleton key points in the real-time person image frame in a human body feature animation file format.
In some optional embodiments, the camera comprises a 2D camera provided on the vehicle, and driving the animated figure displayed on the screen to move based on the three-dimensional coordinate data of the human skeleton key points comprises: driving the animated figure displayed on the vehicle screen to move based on the three-dimensional coordinate data of the human skeleton key points.
In some optional embodiments, after the two-dimensional coordinate data of the human skeleton key points are obtained, the method further comprises: judging whether the two-dimensional coordinate data of the human skeleton key points are abnormal, wherein an abnormality includes missing two-dimensional coordinate data of human skeleton key points; and if so, displaying on the vehicle screen a text prompt to adjust the distance between the target person and the camera, or controlling a loudspeaker of the vehicle to issue a voice prompt to adjust the distance between the target person and the camera.
In some optional embodiments, acquiring the video stream data captured by the camera further comprises: detecting whether the current gear of the vehicle is a preset gear, wherein the preset gear comprises a parking gear; if the vehicle is in the preset gear, turning on the camera to capture video and acquiring the video stream data captured by the camera; and if the vehicle is not in the preset gear, displaying pop-up information on the vehicle screen prompting that the vehicle be shifted to the preset gear.
According to another aspect of the embodiments of the present disclosure, there is provided a human body posture synchronous animation display device, comprising: a data acquisition module configured to acquire video stream data captured by a camera; an image recognition module configured to extract person image frames from the video stream data and recognize the person image frames with a skeleton recognition algorithm to obtain two-dimensional coordinate data of human skeleton key points; a coordinate conversion module configured to convert the two-dimensional coordinate data of the human skeleton key points into three-dimensional coordinate data; and a posture display module configured to drive the animated figure displayed on the screen to move based on the three-dimensional coordinate data of the human skeleton key points, so that the animated figure is synchronized with the human body posture in the person image frames, wherein the animated figure comprises a three-dimensional animated character or a three-dimensional animated animal.
According to yet another aspect of the embodiments of the present disclosure, there is provided an automobile, comprising: a camera; a display screen; a processor connected respectively with the camera and the display screen; and a memory for storing one or more programs which, when executed by the one or more processors, cause the automobile to perform the steps of the above method.
Beneficial effects of the present disclosure: video stream data captured by a camera is acquired; person image frames are extracted from the video stream data and recognized with a skeleton recognition algorithm to obtain two-dimensional coordinate data of human skeleton key points; the two-dimensional coordinate data of the human skeleton key points is converted into three-dimensional coordinate data; and the animated figure displayed on the screen is driven to move based on the three-dimensional coordinate data of the human skeleton key points, so that the animated figure is synchronized with the human body posture in the person image frames, wherein the animated figure comprises a three-dimensional animated character or a three-dimensional animated animal. The virtual animated figure on the screen can thus perform actions synchronized with the human body posture, improving the degree of intelligence of human-computer interaction.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present application and, together with the description, serve to explain the principles of the application. Evidently, the drawings described below show only some embodiments of the application; a person of ordinary skill in the art can derive other drawings from them without inventive effort. In the drawings:
FIG. 1 is a flowchart of a method for displaying human body posture synchronous animation according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of key skeletal points of a human body provided by an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of the depth calculation of some key skeleton points in a base person image frame and a real-time person image frame according to an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of a human body posture synchronous animation display device according to an embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of another human body posture synchronous animation display device according to an embodiment of the present disclosure;
FIG. 6 is a diagram of an actual application scenario according to an embodiment of the present disclosure.
Detailed Description
The embodiments of the present disclosure are described below with specific examples, and other advantages and effects of the present disclosure will be readily apparent to those skilled in the art from the disclosure in this specification. The disclosure may also be embodied or carried out in various other specific embodiments, and various modifications and changes may be made to the details in this description without departing from the spirit of the disclosure. It should be noted that, in the absence of conflict, the following embodiments and the features therein may be combined with one another.
It should be noted that the drawings provided in the following embodiments only illustrate the basic idea of the present disclosure; the components related to the present disclosure are shown schematically rather than drawn according to the actual number, shape and size of the components. In actual implementation, the type, number and proportion of the components may vary freely, and the component layout may be more complex.
In the following description, numerous details are set forth to provide a more thorough explanation of the embodiments of the present disclosure, however, it will be apparent to one skilled in the art that the embodiments of the present disclosure may be practiced without these specific details, and in other embodiments, well-known structures and devices are shown in block diagram form rather than in detail in order to avoid obscuring the embodiments of the present disclosure.
The terms "first," "second," and the like in the description and in the claims, and the above-described drawings of embodiments of the present disclosure, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It should be understood that the data so used may be interchanged under appropriate circumstances such that embodiments of the present disclosure described herein may be made. Furthermore, the terms "comprising" and "having," as well as any variations thereof, are intended to cover non-exclusive inclusions.
Referring to FIG. 1, a flowchart of the human body posture synchronous animation display method provided by this embodiment is shown. As shown in FIG. 1, the method includes:
s101, acquiring video stream data acquired by a camera;
s102, extracting a human image frame in video stream data, and identifying the human image frame by using a bone identification algorithm to obtain two-dimensional coordinate data of human bone key points;
s103, converting the two-dimensional coordinate data of the key points of the human skeleton into three-dimensional coordinate data;
and S104, driving the animation image displayed in the screen to move based on the three-dimensional coordinate data of the human skeleton key points, so that the animation image is synchronous with the human body posture in the image frame of the character, wherein the animation image comprises a three-dimensional animation character or a three-dimensional animation animal.
The camera in this embodiment is preferably a 2D camera, so the person image frames in the video stream data lack depth information.
A skeleton recognition algorithm, also called a human skeleton key point detection algorithm, is a class of open-source algorithms used in human posture estimation and human motion capture. Applying a skeleton recognition algorithm to a person image frame identifies the human skeleton information in the frame and generates coordinate information and a confidence value for each skeleton point for developers to use. For example, recognizing a person image frame with a skeleton recognition algorithm can yield the coordinates of 18 human skeleton key points; as shown in FIG. 2, these key points include the left toe, right toe, left knee, right knee, left waist, right waist, head, left hand tip, right hand tip, left elbow, right elbow, left shoulder, right shoulder, nose, left eye and right eye. Of course, in practice the number of human skeleton key points obtained by a skeleton recognition algorithm may be more or fewer than 18, and the embodiments of the present disclosure do not limit this.
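By way of illustration only, since the disclosure does not name a specific library, the 2D key point extraction step could be sketched as follows. MediaPipe Pose stands in here for the skeleton recognition algorithm; note that it returns 33 landmarks rather than the 18 key points listed above, and all names in the sketch are this example's own, not the patent's.

```python
import cv2
import mediapipe as mp

mp_pose = mp.solutions.pose

def extract_keypoints_2d(frame_bgr, pose):
    """Return (x_px, y_px, confidence) per landmark for one person image frame."""
    results = pose.process(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    if results.pose_landmarks is None:
        return None  # no person detected in this frame
    h, w = frame_bgr.shape[:2]
    return [(lm.x * w, lm.y * h, lm.visibility)
            for lm in results.pose_landmarks.landmark]

# Usage: read frames from the camera stream and collect 2D key points per frame.
cap = cv2.VideoCapture(0)
with mp_pose.Pose(static_image_mode=False) as pose:
    ok, frame = cap.read()
    if ok:
        keypoints = extract_keypoints_2d(frame, pose)
cap.release()
```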
The two-dimensional coordinate data of the key skeleton points of the human body can be the two-dimensional coordinates of each key skeleton point in a camera coordinate system.
In some embodiments, converting the two-dimensional coordinate data of the human skeleton key points into three-dimensional coordinate data comprises: acquiring two-dimensional coordinate data of human skeleton key points for a base person image frame and for real-time person image frames in the video stream data; for each pair of adjacent parent-child skeleton points in the real-time person image frame, calculating the distance from the child skeleton point to the straight line on which the corresponding adjacent parent-child skeleton points in the base person image frame lie, and taking this distance as the depth of the child skeleton point in three-dimensional space; determining the depth direction of the child skeleton points in the current real-time person image frame in three-dimensional space based on the continuous action trend of the human skeleton key points over consecutive real-time person image frames; and generating three-dimensional coordinate data of the human skeleton key points in the real-time person image frame based on the two-dimensional coordinate data of the child skeleton points in the real-time person image frame and their depth and depth direction in three-dimensional space.
After the base person image frame is selected, the human skeleton key points of the base person image frame are taken as a reference, and the human skeleton key points of each subsequent real-time person image frame are compared with the corresponding key points in the base person image frame to determine their depth information.
Referring to FIG. 3, FIG. 3 shows 3 of the human skeleton key points in corresponding person image frames, where A, B and C denote 3 key skeleton points in the base person image frame, and A1, B1 and C1 denote the corresponding 3 key skeleton points in a real-time person image frame. BC and AB are then both adjacent parent-child skeleton point pairs; for the adjacent pair BC, if C is the child skeleton point then B is the parent skeleton point, and similarly for the other pairs.
Specifically, when the depth of the key skeleton point C1 in three-dimensional space needs to be calculated, only the length of the line segment C1D needs to be computed, where D is the foot of the perpendicular from C1 to the line BC. The length of C1D can be computed in more than one way. For example, the angle ∠CBC1 can first be calculated from the coordinates of the key skeleton points B, C and C1, then the length of the line segment BC1, and finally, by trigonometry, C1D = BC1 × sin∠CBC1. Alternatively, the equation of the line BC can be calculated from the coordinates of the key skeleton points B and C, the coordinates of the intersection point D obtained from that line equation and the coordinates of C1, and the length of C1D computed from the coordinates of D and C1. Of course, other implementations may be used in practice, and the embodiments of the present disclosure are not limited in this respect.
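As a minimal numeric illustration (using yet another of the possible implementations: the perpendicular-distance formula via the 2D cross product), the coordinates below are invented for the example only.

```python
import numpy as np

def child_point_depth(b, c, c1):
    """Distance |C1D| from the real-time child point C1 to the line through
    the base-frame points B and C (all points are 2D pixel coordinates)."""
    b, c, c1 = (np.asarray(p, dtype=float) for p in (b, c, c1))
    d = c - b                              # direction vector of line BC
    e = c1 - b
    cross = d[0] * e[1] - d[1] * e[0]      # z-component of the 2D cross product
    return abs(cross) / np.linalg.norm(d)  # parallelogram area / base length

# Example: B=(0,0), C=(0,100), C1=(30,95) -> depth 30.0
print(child_point_depth((0, 0), (0, 100), (30, 95)))
```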
In some embodiments, determining the depth direction of the child skeleton points in the current real-time person image frame in three-dimensional space based on the continuous action trend of the human skeleton key points over consecutive real-time person image frames comprises: acquiring in advance initial constraints on human body actions, wherein the initial constraints comprise various basic actions of the human body and, for each basic action, the depth direction in three-dimensional space of the corresponding human skeleton key points; estimating the target action of the human body based on the position-change tracks in two-dimensional space of the human skeleton key points over consecutive real-time person image frames; and determining the depth direction of the child skeleton points in the current real-time person image frame in three-dimensional space based on the initial constraint of the target action.
Referring to FIG. 3, initial constraints on actions are established by taking the human skeleton key points in the base person image frame as a reference and combining them with human movement habits, and the direction of motion can then be predicted from the trend of the continuous action. For example, when a person walks forward, the hand and foot on the same side cannot both swing forward, so it is possible to constrain: when a person moves forward, if the right foot steps forward and the right hand swings obliquely backward, the depth directions of the key skeleton points on the right leg are specified as positive and those on the right arm as negative. Then, taking the coordinate positions of the human skeleton key points obtained from the base person image frame as a reference, the action of the human body is determined from the relative motion changes of each key skeleton point across the real-time person image frames, and the depth direction of each key skeleton point in three-dimensional space can be determined by combining this with the constraints.
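A hedged sketch of this direction determination follows; the action name, joint names and constraint table are hypothetical stand-ins for the pre-acquired initial constraints, and the action classifier is a placeholder.

```python
# Hypothetical initial constraints: for each basic action, the depth sign
# (+1/-1; the sign convention is arbitrary here) of selected key skeleton points.
INITIAL_CONSTRAINTS = {
    "walk_forward": {"right_hand_tip": -1, "right_elbow": -1,
                     "right_toe": +1, "right_knee": +1},
    # ... one entry per basic action of the human body
}

def estimate_target_action(tracks_2d):
    """Placeholder classifier: infer the target action from the 2D
    position-change tracks of the key skeleton points over consecutive
    real-time person image frames."""
    return "walk_forward"

def depth_direction(action, joint):
    """Sign (+1/-1) of a joint's depth under the estimated action."""
    return INITIAL_CONSTRAINTS.get(action, {}).get(joint, +1)

# Combine the magnitude (from the point-to-line distance) with the sign.
signed_depth = depth_direction("walk_forward", "right_hand_tip") * 30.0  # -30.0
```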
In some embodiments, generating the three-dimensional coordinate data of the human skeleton key points in the real-time person image frame comprises: representing the three-dimensional coordinate data of the human skeleton key points in the real-time person image frame in a human body feature animation file format.
Specifically, the human body feature animation file format refers to the BVH (Biovision Hierarchy) format. Because BVH data contains the skeleton and the rotation data of the limb joints of a character, using BVH data to drive a 3D animated figure keeps the figure consistent with the human motion track in the person image frames. It should be noted that converting three-dimensional coordinate data of human skeleton key points into BVH-format data and using BVH data to drive 3D skeleton motion are conventional techniques in the art, and are therefore not described again here.
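For orientation, a BVH file is plain text: a HIERARCHY section defines the joint tree and its channels, and a MOTION section lists per-frame channel values. The toy writer below emits a single-joint rig with invented values; a real implementation would define the full joint hierarchy and derive joint rotations from the 3D coordinates computed above.

```python
def write_minimal_bvh(path, frames, fps=30.0):
    """Write a toy BVH file: one root joint with 6 channels per frame.
    `frames` is a list of (tx, ty, tz, rz, rx, ry) tuples."""
    header = """HIERARCHY
ROOT Hips
{
  OFFSET 0.0 0.0 0.0
  CHANNELS 6 Xposition Yposition Zposition Zrotation Xrotation Yrotation
  End Site
  {
    OFFSET 0.0 10.0 0.0
  }
}
MOTION
"""
    with open(path, "w") as f:
        f.write(header)
        f.write(f"Frames: {len(frames)}\n")
        f.write(f"Frame Time: {1.0 / fps:.6f}\n")
        for fr in frames:
            f.write(" ".join(f"{v:.4f}" for v in fr) + "\n")

# Two invented frames: at rest, then moved 5 units in z and tilted 10 degrees.
write_minimal_bvh("pose.bvh", [(0, 0, 0, 0, 0, 0), (0, 0, 5, 0, 10, 0)])
```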
Furthermore, the method of FIG. 1 may be implemented in a motor vehicle. For example, in some embodiments the camera includes a 2D camera disposed on the vehicle, and driving the animated figure displayed on the screen to move based on the three-dimensional coordinate data of the human skeleton key points comprises: driving the animated figure displayed on the vehicle screen to move based on the three-dimensional coordinate data of the human skeleton key points. In this way, the animation effect of the animated figure moving synchronously with the human body posture can be shown on the vehicle screen. For example, the voice assistant or intelligent voice assistant in the vehicle can be given a virtual animated figure and made to move synchronously with the posture of a person outside the vehicle, improving the intelligence and entertainment value of human-computer interaction.
In some embodiments, after the two-dimensional coordinate data of the human skeleton key points are obtained, the method further comprises: judging whether the two-dimensional coordinate data of the human skeleton key points are abnormal, wherein an abnormality includes missing two-dimensional coordinate data of human skeleton key points; and if so, displaying on the vehicle screen a text prompt to adjust the distance between the target person and the camera, or controlling a loudspeaker of the vehicle to issue a voice prompt to adjust the distance between the target person and the camera.
Specifically, due to occlusion or other factors, some human skeleton key points obtained from a person image frame may be missing. In that case, to ensure that the key points are complete, a prompt can be issued asking the user to change position so that valid key point information can be obtained.
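A minimal sketch of such a completeness check is given below. The expected count of 18 matches the FIG. 2 example, while the confidence threshold and the screen/speaker interfaces are assumptions of this sketch.

```python
EXPECTED_POINTS = 18     # as in the FIG. 2 example; may differ in practice
MIN_CONFIDENCE = 0.3     # assumed reliability threshold

def keypoints_abnormal(keypoints):
    """True if key point data is missing entirely, incomplete, or unreliable.
    `keypoints` is a list of (x, y, confidence) triples or None."""
    if keypoints is None or len(keypoints) < EXPECTED_POINTS:
        return True
    return any(conf < MIN_CONFIDENCE for _, _, conf in keypoints)

def prompt_user(screen, speaker, use_voice=False):
    """Ask the target person to adjust their distance from the camera."""
    msg = "Please adjust your distance from the camera"
    if use_voice:
        speaker.say(msg)       # hypothetical vehicle loudspeaker interface
    else:
        screen.show_text(msg)  # hypothetical vehicle screen interface
```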
In some embodiments, acquiring the video stream data captured by the camera further comprises: detecting whether the current gear of the vehicle is a preset gear, wherein the preset gear comprises a parking gear; if the vehicle is in the preset gear, turning on the camera to capture video and acquiring the video stream data captured by the camera; and if the vehicle is not in the preset gear, displaying pop-up information on the vehicle screen prompting that the vehicle be shifted to the preset gear.
In the above method, whether the vehicle is in a safe state is identified by detecting the vehicle's gear, and the method can be started only in the safe state, which improves driving safety.
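The gear gating described above could be sketched as follows; `vehicle`, `camera` and `screen` are hypothetical interfaces to the car's systems, not a real vehicle API.

```python
PRESET_GEARS = {"P"}  # the preset gears; here only the parking gear

def try_start_capture(vehicle, camera, screen):
    """Start video capture only when the vehicle is in a preset gear."""
    if vehicle.current_gear() in PRESET_GEARS:
        camera.start()
        return camera.video_stream()  # hand the stream to the pipeline above
    screen.show_popup("Shift to Park to use posture-synchronized animation")
    return None
```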
All the above optional technical solutions may be combined arbitrarily to form optional embodiments of the present application, and are not described herein again.
The following are embodiments of the disclosed apparatus that may be used to perform embodiments of the disclosed methods. For details not disclosed in the embodiments of the apparatus of the present disclosure, refer to the embodiments of the method of the present disclosure.
FIG. 4 is a schematic diagram of a human body posture synchronous animation display device provided by an embodiment of the present disclosure. As shown in FIG. 4, the human body posture synchronous animation display device includes:
a data acquisition module 401 configured to acquire video stream data captured by a camera;
an image recognition module 402 configured to extract person image frames from the video stream data and recognize the person image frames with a skeleton recognition algorithm to obtain two-dimensional coordinate data of human skeleton key points;
a coordinate conversion module 403 configured to convert the two-dimensional coordinate data of the human skeleton key points into three-dimensional coordinate data;
and a posture display module 404 configured to drive the animated figure displayed on the screen to move based on the three-dimensional coordinate data of the human skeleton key points, so that the animated figure is synchronized with the human body posture in the person image frames, wherein the animated figure comprises a three-dimensional animated character or a three-dimensional animated animal.
According to the technical solution provided by this embodiment, video stream data captured by a camera is acquired; person image frames are extracted from the video stream data and recognized with a skeleton recognition algorithm to obtain two-dimensional coordinate data of human skeleton key points; the two-dimensional coordinate data of the human skeleton key points is converted into three-dimensional coordinate data; and the animated figure displayed on the screen is driven to move based on the three-dimensional coordinate data of the human skeleton key points, so that the animated figure is synchronized with the human body posture in the person image frames, wherein the animated figure comprises a three-dimensional animated character or a three-dimensional animated animal. The virtual animated figure on the screen can thus perform actions synchronized with the human body posture, improving the degree of intelligence of human-computer interaction.
In some embodiments, the coordinate conversion module 403 in FIG. 4 is configured to acquire two-dimensional coordinate data of human skeleton key points for a base person image frame and for real-time person image frames in the video stream data; for each pair of adjacent parent-child skeleton points in the real-time person image frame, calculate the distance from the child skeleton point to the straight line on which the corresponding adjacent parent-child skeleton points in the base person image frame lie, and take this distance as the depth of the child skeleton point in three-dimensional space; determine the depth direction of the child skeleton points in the current real-time person image frame in three-dimensional space based on the continuous action trend of the human skeleton key points over consecutive real-time person image frames; and generate three-dimensional coordinate data of the human skeleton key points in the real-time person image frame based on the two-dimensional coordinate data of the child skeleton points in the real-time person image frame and their depth and depth direction in three-dimensional space.
In some embodiments, the coordinate conversion module 403 in FIG. 4 is further configured to acquire in advance initial constraints on human body actions, wherein the initial constraints comprise various basic actions of the human body and, for each basic action, the depth direction in three-dimensional space of the corresponding human skeleton key points; estimate the target action of the human body based on the position-change tracks in two-dimensional space of the human skeleton key points over consecutive real-time person image frames; and determine the depth direction of the child skeleton points in the current real-time person image frame in three-dimensional space based on the initial constraint of the target action.
In some embodiments, the base person image frame comprises the first person image frame in the video stream data.
In some embodiments, the coordinate conversion module 403 in FIG. 4 also represents the three-dimensional coordinate data of the human skeleton key points in the real-time person image frame in a human body feature animation file format.
In some embodiments, the camera includes a 2D camera disposed on the vehicle, and the posture display module 404 in FIG. 4 drives the animated figure displayed on the vehicle screen to move based on the three-dimensional coordinate data of the human skeleton key points.
In some embodiments, referring to FIG. 5, the human body posture synchronous animation display device further includes:
the abnormality determining module 405 is configured to determine whether there is an abnormality in the two-dimensional coordinate data of the human skeleton key points after the two-dimensional coordinate data of the human skeleton key points are obtained, where the abnormality includes a vacancy in the two-dimensional coordinate data of the human skeleton key points;
a first alert module 406 configured to display a text prompt in the vehicle screen to adjust the distance between the target person and the camera if so; or controlling a loudspeaker of the vehicle to send out a language prompt for adjusting the distance between the target person and the camera.
In some embodiments, referring to FIG. 5, the human body posture synchronous animation display device further includes:
a gear detection module 407 configured to detect whether the current gear of the vehicle is a preset gear, wherein the preset gear includes a parking gear;
and a second reminder module 408 configured to display, if the vehicle is not in the preset gear, pop-up information on the vehicle screen prompting that the vehicle be shifted to the preset gear.
In this case, the data acquisition module 401 in FIG. 4 is configured to turn on the camera to capture video and acquire the video stream data captured by the camera if the vehicle is in the preset gear.
Referring to FIG. 6, this embodiment provides an actual application scenario. As shown in FIG. 6, the scenario includes an automobile 1 and a user 2, with the user 2 located outside the automobile.
The automobile 1 comprises at least a camera 11, a display screen 12 and a computing device 13. Specifically, the computing device 13 includes a processor 131 and a memory 132. There is at least one processor 131; in this embodiment, for example, there are two: a first processor used for intelligent-driving computation and a second processor used for running the relevant applications of the in-vehicle system. The memory 132 stores one or more programs which, when executed by the processors 131 (including the first and second processors in FIG. 6), cause the automobile 1 to implement the steps of the method of FIG. 1 described above.
In this embodiment, the memory may include random access memory (RAM), and may also include non-volatile memory, such as at least one disk memory.
The processor may be a general-purpose processor, including a central processing unit (CPU), a graphics processing unit (GPU), a network processor (NP), and the like; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The above description and the drawings sufficiently illustrate embodiments of the disclosure to enable those skilled in the art to practice them. Other embodiments may incorporate structural, logical, electrical, process and other changes. The examples merely typify possible variations. Individual components and functions are optional unless explicitly required, and the sequence of operations may vary. Portions and features of some embodiments may be included in or substituted for those of other embodiments. Furthermore, the words used in the specification are words of description only and are not limiting upon the claims. As used in the description of the embodiments and in the claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Similarly, the term "and/or" as used in this application is meant to encompass any and all possible combinations of one or more of the associated listed items. Furthermore, the terms "comprises", "comprising" and variations thereof, when used in this application, specify the presence of stated features, integers, steps, operations, elements and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof. Without further limitation, an element preceded by the phrase "comprising a …" does not exclude the presence of additional identical elements in the process, method or apparatus that comprises the element. In this document, each embodiment may be described with emphasis on its differences from other embodiments, and the same or similar parts of the embodiments may be referred to one another. For the methods, products and the like disclosed in the embodiments, where they correspond to the method sections disclosed herein, reference may be made to the description of the method sections.
Those of skill in the art would appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software may depend upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the disclosed embodiments. It can be clearly understood by those skilled in the art that, for convenience and simplicity of description, the specific working processes of the above-described systems, apparatuses, and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments disclosed herein, the disclosed methods and products (including but not limited to devices, automobiles, etc.) may be implemented in other ways. For example, the device embodiments described above are merely illustrative; the division into units may be merely a division by logical function, and an actual implementation may divide them differently. For example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through interfaces, devices or units, and may be electrical, mechanical or of another form. Units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to implement this embodiment. In addition, the functional units in the embodiments of the present disclosure may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. In the description corresponding to the flowcharts and block diagrams in the figures, operations or steps corresponding to different blocks may also occur in different orders than disclosed in the description, and sometimes there is no specific order between the different operations or steps. For example, two sequential operations or steps may in fact be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved. Each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

Claims (10)

1. A human body posture synchronous animation display method, characterized by comprising the following steps:
acquiring video stream data captured by a camera;
extracting person image frames from the video stream data, and recognizing the person image frames with a skeleton recognition algorithm to obtain two-dimensional coordinate data of human skeleton key points;
converting the two-dimensional coordinate data of the human skeleton key points into three-dimensional coordinate data;
and driving the animated figure displayed on the screen to move based on the three-dimensional coordinate data of the human skeleton key points, so that the animated figure is synchronized with the human body posture in the person image frames, wherein the animated figure comprises a three-dimensional animated character or a three-dimensional animated animal.
2. The human body posture synchronous animation display method according to claim 1, wherein the converting of the two-dimensional coordinate data of the human skeleton key points into three-dimensional coordinate data comprises:
acquiring two-dimensional coordinate data of human skeleton key points for a base person image frame and for real-time person image frames in the video stream data;
for each pair of adjacent parent-child skeleton points in the real-time person image frame, calculating the distance from the child skeleton point to the straight line on which the corresponding adjacent parent-child skeleton points in the base person image frame lie, and taking this distance as the depth of the child skeleton point in three-dimensional space;
determining the depth direction of the child skeleton points in the current real-time person image frame in three-dimensional space based on the continuous action trend of the human skeleton key points over consecutive real-time person image frames;
and generating three-dimensional coordinate data of the human skeleton key points in the real-time person image frame based on the two-dimensional coordinate data of the child skeleton points in the real-time person image frame and their depth and depth direction in three-dimensional space.
3. The human body posture synchronous animation display method according to claim 2, wherein the determining of the depth direction of the child skeleton points in the current real-time person image frame in three-dimensional space based on the continuous action trend of the human skeleton key points over consecutive real-time person image frames comprises:
acquiring in advance initial constraints on human body actions, wherein the initial constraints comprise various basic actions of the human body and, for each basic action, the depth direction in three-dimensional space of the corresponding human skeleton key points;
estimating the target action of the human body based on the position-change tracks in two-dimensional space of the human skeleton key points over consecutive real-time person image frames;
and determining the depth direction of the child skeleton points in the current real-time person image frame in three-dimensional space based on the initial constraint of the target action.
4. The human body posture synchronous animation display method according to claim 2, wherein the base person image frame comprises the first person image frame in the video stream data.
5. The human body posture synchronous animation display method according to claim 2, wherein the generating of the three-dimensional coordinate data of the human skeleton key points in the real-time person image frame comprises: representing the three-dimensional coordinate data of the human skeleton key points in the real-time person image frame in a human body feature animation file format.
6. The human body posture synchronous animation display method according to any one of claims 1 to 5, wherein the camera comprises a 2D camera arranged on a vehicle;
and the driving of the animated figure displayed on the screen to move based on the three-dimensional coordinate data of the human skeleton key points comprises: driving the animated figure displayed on the vehicle screen to move based on the three-dimensional coordinate data of the human skeleton key points.
7. The human body posture synchronous animation display method according to claim 6, further comprising, after the two-dimensional coordinate data of the human skeleton key points are obtained:
judging whether the two-dimensional coordinate data of the human skeleton key points are abnormal, wherein an abnormality includes missing two-dimensional coordinate data of human skeleton key points;
and if so, displaying on the vehicle screen a text prompt to adjust the distance between the target person and the camera, or controlling a loudspeaker of the vehicle to issue a voice prompt to adjust the distance between the target person and the camera.
8. The human body posture synchronous animation display method according to claim 5, wherein the acquiring of the video stream data captured by the camera further comprises:
detecting whether the current gear of the vehicle is a preset gear, wherein the preset gear comprises a parking gear;
if the vehicle is in the preset gear, turning on the camera to capture video and acquiring the video stream data captured by the camera;
and if the vehicle is not in the preset gear, displaying pop-up information on the vehicle screen prompting that the vehicle be shifted to the preset gear.
9. A human body posture synchronous animation display device, characterized by comprising:
a data acquisition module configured to acquire video stream data captured by a camera;
an image recognition module configured to extract person image frames from the video stream data and recognize the person image frames with a skeleton recognition algorithm to obtain two-dimensional coordinate data of human skeleton key points;
a coordinate conversion module configured to convert the two-dimensional coordinate data of the human skeleton key points into three-dimensional coordinate data;
and a posture display module configured to drive the animated figure displayed on the screen to move based on the three-dimensional coordinate data of the human skeleton key points, so that the animated figure is synchronized with the human body posture in the person image frames, wherein the animated figure comprises a three-dimensional animated character or a three-dimensional animated animal.
10. An automobile, characterized by comprising:
a camera;
a display screen;
a processor connected respectively with the camera and the display screen;
and a memory for storing one or more programs which, when executed by the one or more processors, cause the automobile to implement the method of any one of claims 1 to 8.
CN202211350222.7A 2022-10-31 2022-10-31 Human body posture synchronous animation display method and device and automobile Pending CN115619914A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211350222.7A CN115619914A (en) 2022-10-31 2022-10-31 Human body posture synchronous animation display method and device and automobile

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211350222.7A CN115619914A (en) 2022-10-31 2022-10-31 Human body posture synchronous animation display method and device and automobile

Publications (1)

Publication Number Publication Date
CN115619914A true CN115619914A (en) 2023-01-17

Family

ID=84875851

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211350222.7A Pending CN115619914A (en) 2022-10-31 2022-10-31 Human body posture synchronous animation display method and device and automobile

Country Status (1)

Country Link
CN (1) CN115619914A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116129016A (en) * 2023-04-17 2023-05-16 广州趣丸网络科技有限公司 Digital synchronization method, device and equipment for gesture movement and storage medium
CN116310012A (en) * 2023-05-25 2023-06-23 成都索贝数码科技股份有限公司 Video-based three-dimensional digital human gesture driving method, device and system
CN116310012B (en) * 2023-05-25 2023-07-25 成都索贝数码科技股份有限公司 Video-based three-dimensional digital human gesture driving method, device and system

Similar Documents

Publication Publication Date Title
CN111033512B (en) Motion control device for communicating with autonomous traveling vehicle based on simple two-dimensional planar image pickup device
CN115619914A (en) Human body posture synchronous animation display method and device and automobile
US10417775B2 (en) Method for implementing human skeleton tracking system based on depth data
CN108197547B (en) Face pose estimation method, device, terminal and storage medium
CN106846403B (en) Method and device for positioning hand in three-dimensional space and intelligent equipment
JP5848341B2 (en) Tracking by monocular 3D pose estimation and detection
US11308347B2 (en) Method of determining a similarity transformation between first and second coordinates of 3D features
WO2019245768A1 (en) System for predicting articulated object feature location
CN114981841A (en) End-to-end merging for Video Object Segmentation (VOS)
US20220148453A1 (en) Vision-based rehabilitation training system based on 3d human pose estimation using multi-view images
JP2011186576A (en) Operation recognition device
KR20060021001A (en) Implementation of marker-less augmented reality and mixed reality system using object detecting method
KR20230036543A (en) Method and apparatus for reconstructing 3d scene with monocular rgb image based on deep learning
JP6635848B2 (en) Three-dimensional video data generation device, three-dimensional video data generation program, and method therefor
CN116700471A (en) Method and system for enhancing user experience of virtual reality system
CN108108709B (en) Identification method and device and computer storage medium
CN113743254A (en) Sight estimation method, sight estimation device, electronic equipment and storage medium
CN113158766A (en) Pedestrian behavior recognition method facing unmanned driving and based on attitude estimation
KR101447958B1 (en) Method and apparatus for recognizing body point
CN107369209A (en) A kind of data processing method
KR20220093871A (en) Hand pose estimation method with a single infrared camera via domain transfer learning and the system thereof
Tran et al. A hand gesture recognition library for a 3D viewer supported by kinect's depth sensor
Liu et al. Deep learning for 3D human pose estimation and mesh recovery: A survey
US20240029379A1 (en) Image processing apparatus, image processing method, and computer-readable recording medium
Zhou et al. Tracking of Deformable Human Avatars through Fusion of Low-Dimensional 2D and 3D Kinematic Models

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination