CN114035683B - User capturing method, apparatus, device, storage medium and computer program product - Google Patents

User capturing method, apparatus, device, storage medium and computer program product

Info

Publication number
CN114035683B
CN114035683B · Application CN202111314304.1A
Authority
CN
China
Prior art keywords
model
gesture
capturing
user
feedback
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111314304.1A
Other languages
Chinese (zh)
Other versions
CN114035683A (en)
Inventor
林楠
李健龙
葛瀚丞
石磊
徐昭吉
翟忆蒙
郝志雄
张苗昌
张茜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Shanghai Xiaodu Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Shanghai Xiaodu Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd and Shanghai Xiaodu Technology Co Ltd
Priority to CN202111314304.1A
Publication of CN114035683A
Application granted
Publication of CN114035683B
Legal status: Active

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00 — Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/01 — Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F 3/017 — Gesture based interaction, e.g. based on a set of recognized hand gestures

Abstract

The disclosure provides a user capturing method, a user capturing apparatus, an electronic device, a computer-readable storage medium and a computer program product, and relates to artificial-intelligence fields such as smart home and gesture recognition. The method comprises the following steps: scanning whole-body skeletal key points of users entering the capture area to generate skeletal pose models; highlighting, in a preset display area of the smart mirror, the skeletal pose model corresponding to the user closest to the smart mirror as a suspected model; and, in response to not receiving error-capture feedback on the displayed suspected model within a preset time period, locking the suspected model as the effective model corresponding to the target user, and controlling the capture component to take the target user as the capture object and adjust the capture pose. The method not only shortens the rendering time of the pose model, but also helps the smart mirror capture the correct target user through the highlighted display of the suspected model.

Description

User capturing method, apparatus, device, storage medium and computer program product
Technical Field
The disclosure relates to the technical field of human-machine interaction, in particular to artificial-intelligence fields such as smart home and gesture recognition, and especially to a user capturing method, a user capturing apparatus, an electronic device, a computer-readable storage medium and a computer program product.
Background
With the continuing popularization of smart-home concepts and people's pursuit of an ever better life, services are, in addition to the widely known small-screen smart devices, increasingly combined with large-screen devices in homes and dedicated venues, producing a variety of intelligent large-screen devices, including smart mirrors.
Taking a smart fitness mirror installed in a fitness studio as an example, how to provide a good human-machine interaction experience for the user exercising in front of the mirror, and thereby give the user good fitness guidance, is a problem to be solved by those skilled in the art.
Disclosure of Invention
The embodiment of the disclosure provides a user capturing method, a user capturing device, electronic equipment, a computer readable storage medium and a computer program product.
In a first aspect, an embodiment of the present disclosure provides a user capturing method, comprising: scanning whole-body skeletal key points of users entering the capture area to generate skeletal pose models; highlighting, in a preset display area of the smart mirror, the skeletal pose model corresponding to the user closest to the smart mirror as a suspected model; and, in response to not receiving error-capture feedback on the displayed suspected model within a preset time period, locking the suspected model as the effective model corresponding to the target user, and controlling the capture component to take the target user as the capture object and adjust the capture pose.
In a second aspect, an embodiment of the present disclosure provides a user capturing apparatus, comprising: a skeletal pose model generating unit configured to perform whole-body skeletal key-point scanning on users entering the capture area and generate skeletal pose models; a suspected model highlighting unit configured to highlight, in a preset display area of the smart mirror, the skeletal pose model corresponding to the user closest to the smart mirror as a suspected model; and an effective model determination and capture pose adjustment unit configured to, in response to not receiving error-capture feedback on the displayed suspected model within a preset time period, lock the suspected model as the effective model corresponding to the target user, and control the capture component to take the target user as the capture object and adjust the capture pose.
In a third aspect, an embodiment of the present disclosure provides an electronic device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions, when executed, enabling the at least one processor to implement the user capturing method described in any implementation of the first aspect.
In a fourth aspect, an embodiment of the present disclosure provides a non-transitory computer-readable storage medium storing computer instructions which, when executed, enable a computer to implement the user capturing method described in any implementation of the first aspect.
In a fifth aspect, an embodiment of the present disclosure provides a computer program product comprising a computer program which, when executed by a processor, implements the user capturing method described in any implementation of the first aspect.
For a large-screen smart device such as the smart mirror, the technical solution provided by the disclosure expresses the user's pose as a skeletal pose model, which simplifies the rendering workload, shortens rendering time and thus improves real-time performance. Meanwhile, to prevent a non-target user from being erroneously captured as the target user under the nearest-distance rule, the selected model is additionally highlighted as a suspected model, and whether a capture error occurred is confirmed by whether error-capture feedback is received. When no such feedback arrives, the suspected model is locked as the effective model corresponding to the target user, the target user is taken as the capture object, and the capture component is guided to adjust its capture pose accordingly, which helps capture more comprehensive user pose information and improves the user experience of the smart mirror.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
Other features, objects and advantages of the present disclosure will become more apparent upon reading of the detailed description of non-limiting embodiments, made with reference to the following drawings:
FIG. 1 is an exemplary system architecture in which the present disclosure may be applied;
FIG. 2 is a flowchart of a user capture method provided by an embodiment of the present disclosure;
FIG. 3 is a flowchart of a method for processing voice-form error-capture feedback in the user capturing method provided by an embodiment of the present disclosure;
FIG. 4 is a flowchart of a pose comparison and feedback method provided by an embodiment of the present disclosure;
FIGS. 5-1 through 5-8 are schematic views of the full flow from a user starting to use the smart fitness mirror to formally entering a fitness course, according to embodiments of the present disclosure;
FIG. 6-1 is a schematic diagram of the smart fitness mirror prompting and correcting a user's erroneous action, according to embodiments of the present disclosure;
FIG. 6-2 is a schematic diagram of the smart fitness mirror's prompt when normal capture is impossible because the lens is blocked, according to embodiments of the present disclosure;
FIGS. 7-1 to 7-3 are schematic diagrams of the smart fitness mirror presenting different positive-incentive feedback according to the degree of pose consistency, according to embodiments of the present disclosure;
FIG. 8 is a block diagram of a user capture device provided in an embodiment of the present disclosure;
FIG. 9 is a schematic structural diagram of an electronic device adapted to perform the user capturing method according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness. It should be noted that, without conflict, the embodiments of the present disclosure and features of the embodiments may be combined with each other.
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision and disclosure of users' personal information all comply with the relevant laws and regulations and do not violate public order and good customs.
FIG. 1 illustrates an exemplary system architecture 100 to which embodiments of user capture methods, apparatus, electronic devices, and computer-readable storage media of the present disclosure may be applied.
As shown in fig. 1, a system architecture 100 may include a smart mirror 101 and a user 102 using the smart mirror 101.
The smart mirror 101 may interact with other terminal devices and servers through a network, so that those devices and servers can be used to provide more functions for the user 102, and the user 102 may interact with the smart mirror 101 in various ways, such as image display, voice interaction and gesture interaction. The smart mirror 101 may be equipped with various functional components, such as a camera assembly for capturing images, a three-dimensional scanning assembly for scanning object structures, a speaker for sound output, and a display screen for presenting images.
Various applications or programs may be installed on the smart mirror 101 to implement the above-described functions, such as news-type applications, voice-interactive-type applications, fitness-type applications, smart wear-type applications, and the like.
The smart mirror 101 may provide various services through built-in applications. Taking a fitness application that provides fitness services as an example, the smart mirror 101 may achieve the following when running that application: first, it scans whole-body skeletal key points of users entering the capture area to generate skeletal pose models; then it highlights, in a preset display area of the smart mirror, the skeletal pose model corresponding to the user closest to the mirror as a suspected model, displaying it to the user 102; and if no error-capture feedback on the displayed suspected model is received within a preset time period, it locks the suspected model as the effective model corresponding to the target user and controls the capture component to take the target user as the capture object and adjust its capture pose.
The user capturing method provided in the subsequent embodiments of the present disclosure is generally performed by the smart mirror 101, and accordingly, the user capturing device is also generally disposed in the smart mirror 101.
It should be understood that the shape, size and number of smart mirrors, and the positional relationship between users, in fig. 1 are merely illustrative; adaptations can be made according to implementation requirements.
Referring to fig. 2, fig. 2 is a flowchart of a user capturing method provided by an embodiment of the disclosure; flow 200 includes the following steps:
step 201: scanning key points of the whole body bones of users entering the capturing area to generate a bone posture model;
this step aims at generating a skeletal pose model by a subject of execution of the user capture method (e.g., smart mirror 101 shown in fig. 1) performing a whole-body skeletal keypoint scan of the user entering the capture region.
The capturing area is in different forms according to different types of capturing devices, and when the capturing devices are image capturing devices of camera types, the capturing area is in a shooting visual field corresponding to the camera; when the capture device is a point capture device of the scanner class, the capture area will appear as a scan range corresponding to the scanner.
Correspondingly, the whole-body bone key point scanning mode is also changed correspondingly according to the original materials captured by the capturing equipment, when the capturing equipment is a camera, the original materials captured by the capturing equipment are the whole-body diagram of a user, so that the whole-body bone key point scanning is to firstly identify bones from the whole-body diagram, and then determine which positions of which bones are to be bone key points; when the capturing device is a three-dimensional laser scanner or a three-dimensional structured light scanner, the original material scanned by the capturing device is a human-shaped point set or a human-shaped point cloud, discrete points need to be fitted into a continuous human-shaped outline first to determine skeleton key points, then bones in the human-shaped outline are determined, and finally, positions of the bones are determined to be the skeleton key points.
After the skeleton key points are scanned, the generated skeleton gesture model is a gesture model which is restored by the skeleton key points and corresponds to the actual gesture of the user, and because the user identity detail information (such as face, posture and the like) is not required to be rendered based on skeleton generation, the rendering operation amount is reduced, the rendering time is shortened, and the linkage between the rendering result seen by the user and the actual action of the user is improved.
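To make the data flow concrete, the following minimal Python sketch shows one way the scanned key points could be assembled into a renderable skeletal pose model. It is an illustration only, not the patent's implementation: the keypoint names, the `BONES` topology and the `build_models` helper are all assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical bone topology: pairs of keypoint names connected by a "bone"
# in the rendered matchstick-person model.
BONES: List[Tuple[str, str]] = [
    ("head", "neck"), ("neck", "l_shoulder"), ("neck", "r_shoulder"),
    ("l_shoulder", "l_elbow"), ("l_elbow", "l_wrist"),
    ("r_shoulder", "r_elbow"), ("r_elbow", "r_wrist"),
    ("neck", "pelvis"), ("pelvis", "l_knee"), ("l_knee", "l_ankle"),
    ("pelvis", "r_knee"), ("r_knee", "r_ankle"),
]

Keypoints = Dict[str, Tuple[float, float, float]]  # name -> (x, y, z)

@dataclass
class SkeletalPoseModel:
    user_id: int
    keypoints: Keypoints

    def bone_segments(self):
        """Line segments to draw; missing keypoints are simply skipped."""
        return [(self.keypoints[a], self.keypoints[b])
                for a, b in BONES
                if a in self.keypoints and b in self.keypoints]

def build_models(detections: Dict[int, Keypoints]) -> List[SkeletalPoseModel]:
    # One model per user detected in the capture area.
    return [SkeletalPoseModel(uid, kps) for uid, kps in detections.items()]
```

Rendering then reduces to drawing the short list of segments returned by `bone_segments`, which is what keeps the per-frame workload far below that of rendering a detailed, identity-bearing avatar.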
Step 202: highlighting, in a preset display area of the smart mirror, the skeletal pose model corresponding to the user closest to the smart mirror as a suspected model;
On the basis of step 201, this step aims at the execution subject selecting, among the skeletal pose models of all users present in the capture area, the model of the user closest to itself (the mirror body) as the suspected model, according to the preset nearest-distance rule, and drawing the target user's attention by highlighting the suspected model in the preset display area.
The smart mirror adopts the nearest-distance rule because, in most cases, multiple users will not be passing in front of the mirror at once, so the rule usually succeeds in locating the target user who is actually using the mirror. Moreover, the smart mirror of the present disclosure does not collect additional identity information to distinguish between users; it acts more as a device providing the same service to everyone in public places (e.g., fitness studios or gyms) without distinguishing users, as in the sketch below.
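A minimal sketch of the nearest-distance rule follows, assuming each detected user is summarised as a dictionary of keypoints whose z coordinate is the distance from the mirror in metres; the patent does not fix a particular distance measure, so the mean keypoint depth used here is an assumption.

```python
from typing import Dict, Tuple

Keypoints = Dict[str, Tuple[float, float, float]]  # name -> (x, y, z)

def pick_suspected_user(users: Dict[int, Keypoints]) -> int:
    """Return the id of the user whose mean keypoint depth is smallest,
    i.e. the user standing closest to the mirror body."""
    def mean_depth(kps: Keypoints) -> float:
        return sum(z for (_, _, z) in kps.values()) / len(kps)
    return min(users, key=lambda uid: mean_depth(users[uid]))
```

Ties and an empty capture area are left unhandled for brevity; a production system would also smooth the depth over several frames so the selection does not flicker between users.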
The present disclosure recognizes that the nearest-distance rule may, in some cases, erroneously lock onto a non-target user who is merely passing through the capture area. It therefore draws the target user's attention by highlighting the selected model as a suspected model in the preset display area, and lets the target user give feedback if a locking error is found, thereby providing a correction mechanism for erroneous locking.
In this step, the suspected model is highlighted in the preset display area for a period of time, giving the target user time to report a possible erroneous capture or locking; the specific duration can be set according to the actual situation, for example 5 seconds or 7 seconds.
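The confirmation window can be sketched as a simple poll-with-timeout loop. `poll_feedback` is a hypothetical callable standing in for whichever voice, gesture or touch channel delivers error-capture feedback, and the 5-second default mirrors the example duration above.

```python
import time
from typing import Callable, Optional

def wait_for_lock(poll_feedback: Callable[[], Optional[object]],
                  timeout_s: float = 5.0,
                  poll_interval_s: float = 0.1) -> bool:
    """Return True if the suspected model should be locked as the effective
    model, i.e. no error-capture feedback arrived within the window."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        if poll_feedback() is not None:
            return False  # the user objected: re-determine the suspected model
        time.sleep(poll_interval_s)
    return True  # silence within the window counts as confirmation
```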
The preset display area of the smart mirror is the area used in this disclosure to display users' skeletal pose models; the part of the mirror body most easily noticed by the user can be chosen as the preset display area, for example the middle area or the lower half of the mirror body. In addition to highlighting, other ways of visibly separating the currently locked suspected model from the other, unlocked skeletal pose models can be added or substituted according to the actual situation, such as outlining the edges, rendering in a contrasting colour, adding a selection frame, or adding an arrow-indicator effect. The preset distinguishing style can be tuned to the installation site of the smart mirror and the type of audience, as long as the suspected model is clearly distinguishable from the unlocked models; no specific limitation is imposed here.
Step 203: in response to not receiving error-capture feedback on the displayed suspected model within a preset time period, locking the suspected model as the effective model corresponding to the target user, and controlling the capture component to take the target user as the capture object and adjust the capture pose.
For the case where no error-capture feedback on the displayed suspected model is received within the preset time period, this step aims at the execution subject locking the suspected model as the effective model corresponding to the target user; that is, the suspected model is indeed the skeletal pose model of the target user. Since the target user has been successfully captured, the suspected model can be confirmed as the effective model, and the remaining candidate skeletal pose models can be discarded and hidden at this point.
After the effective model is determined, the execution subject controls the capture component to take the target user as the capture object and adjust its capture pose (such as the capture angle and capture area) as the target user's position changes, so as to better capture the target user's subsequent motion information.
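As one illustration of adjusting the capture pose, the sketch below implements a simple proportional controller that steers a hypothetical pan/tilt actuator toward the target user's skeleton centroid. The `set_pan_tilt` interface and the normalised image-coordinate convention are assumptions, since the patent leaves the actuator details open.

```python
from typing import Callable, Dict, Tuple

Keypoints = Dict[str, Tuple[float, float, float]]  # (x, y) in [0, 1], z = depth

def adjust_capture_pose(keypoints: Keypoints,
                        set_pan_tilt: Callable[[float, float], None],
                        frame_center: Tuple[float, float] = (0.5, 0.5),
                        gain: float = 0.3) -> None:
    """Nudge the capture component so the target user's skeleton centroid
    drifts toward the frame centre (proportional control, one step)."""
    xs = [x for (x, _, _) in keypoints.values()]
    ys = [y for (_, y, _) in keypoints.values()]
    cx, cy = sum(xs) / len(xs), sum(ys) / len(ys)
    # The centroid's offset from the frame centre drives the actuator.
    set_pan_tilt(gain * (cx - frame_center[0]), gain * (cy - frame_center[1]))
```

Called once per captured frame, this keeps the target user centred as he or she moves, which is what lets the component capture more comprehensive pose information.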
For the smart mirror, a large-screen smart device, the user capturing method provided by this embodiment of the disclosure expresses the user's pose as a skeletal pose model, which simplifies the rendering workload, shortens rendering time and thus improves real-time performance. Meanwhile, to prevent a non-target user from being erroneously captured as the target user under the nearest-distance rule, the selected model is additionally highlighted as a suspected model, and whether a capture error occurred is confirmed by whether error-capture feedback is received. When no such feedback arrives, the suspected model is locked as the effective model corresponding to the target user, the target user is taken as the capture object, and the capture component is guided to adjust its capture pose accordingly, which helps capture more comprehensive user pose information and improves the user experience of the smart mirror.
Unlike the case in step 203 of flow 200, another possibility is that error-capture feedback on the displayed suspected model is received within the preset time period. In that case, a re-capture indication can be extracted from the error-capture feedback according to its feedback form, and a new suspected model re-determined according to the re-capture indication.
Feedback forms may include voice feedback, gesture feedback, body-movement feedback, touch feedback and the like. It should be understood that different feedback forms often carry very different amounts of information, and the practical difficulty of conveying the same information also differs between forms. Therefore, when extracting a valid re-capture indication from the error-capture feedback, the feedback form must be taken into account, so that an accurate re-capture indication is extracted using the key-information extraction method matching that form and a new suspected model is then re-determined.
To aid understanding of how a re-capture indication is extracted from error-capture feedback and a new suspected model is determined, the present disclosure also provides, with reference to fig. 3, an implementation of a method for processing voice-form error-capture feedback. Flow 300 comprises the following steps:
Step 301: performing semantic recognition on the received voice feedback signal, and extracting action information and emotion-tendency information from the semantic recognition result;
This step aims at the execution subject performing semantic recognition on the received voice feedback signal to obtain a semantic recognition result that makes the user's intended meaning clear to it, then extracting action information and emotion-tendency information from that result. Action information is the instruction used to determine how a new suspected model should be re-determined, for example "select the left one" or "next"; emotion-tendency information is content in which the target user expresses emotion, for example "oh no, wrong one" or "right this time, great", from which it can be judged whether the suspected model currently selected by the smart mirror is the one the user wants.
Step 302: determining, according to the emotion-tendency information and the action information, the positional relation, relative to the suspected model, of the skeletal pose model indicating the target user;
Step 303: determining a new suspected model according to the current position of the suspected model and the positional relation.
Steps 302-303 aim at the execution subject determining, according to the emotion-tendency information and the action information, the positional relation of the skeletal pose model indicating the target user relative to the suspected model, and then determining a new suspected model from the suspected model's current position and that positional relation.
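A toy sketch of steps 301-303 follows, with keyword matching standing in for full semantic recognition and users assumed to be ordered by their horizontal position in front of the mirror; both are simplifications of what the patent describes, and every keyword below is an invented example.

```python
from typing import List, Optional, Tuple

def parse_voice_feedback(transcript: str) -> Tuple[Optional[str], str]:
    """Extract (action, sentiment) from a recognised utterance.
    action: how to move the selection relative to the current suspected model.
    sentiment: whether the user rejects or confirms the current choice."""
    t = transcript.lower()
    if "left" in t:
        action = "left"
    elif "right" in t or "next" in t:
        action = "right"
    else:
        action = None
    negative = any(w in t for w in ("wrong", "not me", "no"))
    return action, ("negative" if negative else "positive")

def redetermine_suspected(ordered_user_ids: List[int],
                          current_id: int, action: Optional[str]) -> int:
    """Shift the suspected model among users ordered left-to-right."""
    i = ordered_user_ids.index(current_id)
    if action == "left":
        i = max(0, i - 1)
    elif action == "right":
        i = min(len(ordered_user_ids) - 1, i + 1)
    return ordered_user_ids[i]
```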
Following the above example of processing voice-form error-capture feedback, other feedback forms can likewise convey action information and emotion-tendency information in their own ways, and a new suspected model can then be determined through similar processing and analysis; these are not enumerated here.
On the basis of any of the above embodiments, the present disclosure further provides, with reference to fig. 4, a practical usage flow in which, after the effective model corresponding to the target user has been successfully determined, the user follows the standard actions displayed by the smart mirror. Flow 400 comprises the following steps:
step 401: rendering the skeleton gesture of the effective model in real time according to the captured real-time action information of the target user;
the step aims to render the skeleton gesture of the effective model in real time by the execution body according to the captured real-time action information of the target user, so that the gesture of the skeleton gesture model obtained by real-time rendering is consistent with the real-time action of the target user. The bone gesture model at this time can be presented to the target user by the smart mirror, and the target user then specifies the actual actions that he/she is currently doing.
Step 402: comparing the skeleton gesture with the standard skeleton gesture at the same time point to obtain the gesture consistency degree;
based on step 401, this step aims to compare the bone pose actually presented by the target user with the standard bone pose at the same time point by the above-mentioned execution subject, so as to obtain a pose consistency degree, i.e. the pose consistency degree describes the pose consistency between the actual bone pose and the standard bone pose. The standard bone pose may be a bone pose presented by a bone pose model obtained by an exercise action demonstration person in the same way.
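The patent does not fix a consistency metric. One plausible sketch compares joint angles between the user's pose and the standard pose at the same time point and scores each joint by its angular deviation; the joint triples and the 45° tolerance below are illustrative assumptions.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float, float]

def _angle(a: Point, b: Point, c: Point) -> float:
    """Angle at joint b (degrees) between segments b->a and b->c, in 2D."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    norm = (math.hypot(*v1) * math.hypot(*v2)) or 1e-9
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

# Joints compared, each given as (parent, joint, child) keypoint names.
JOINT_TRIPLES = [
    ("l_shoulder", "l_elbow", "l_wrist"),
    ("r_shoulder", "r_elbow", "r_wrist"),
    ("pelvis", "l_knee", "l_ankle"),
    ("pelvis", "r_knee", "r_ankle"),
]

def pose_consistency(user_kps: Dict[str, Point],
                     standard_kps: Dict[str, Point],
                     tolerance_deg: float = 45.0) -> float:
    """Degree of pose consistency in [0, 1]; assumes all listed keypoints
    are present in both poses."""
    scores = []
    for a, b, c in JOINT_TRIPLES:
        deviation = abs(_angle(user_kps[a], user_kps[b], user_kps[c])
                        - _angle(standard_kps[a], standard_kps[b],
                                 standard_kps[c]))
        scores.append(max(0.0, 1.0 - deviation / tolerance_deg))
    return sum(scores) / len(scores)
```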
Step 403: presenting pose feedback corresponding to the degree of pose consistency.
Based on step 402, this step aims at the execution subject presenting pose feedback corresponding to the degree of pose consistency. Specifically, the degree of pose consistency can simply be divided into two categories, consistent and inconsistent, with positive-incentive feedback and error-action prompts given respectively.
For example, when the degree of pose consistency exceeds a preset degree, corresponding positive-incentive feedback is presented according to how far it exceeds that degree. If the preset degree is 80% consistency, different positive-incentive feedback is presented at actual consistency of 85% and of 95%: for example "Good" at 85% and "Excellent!" at 95%, to motivate the user to keep improving consistency with the standard actions.
Conversely, when the degree of pose consistency does not exceed the preset degree, the erroneous bones corresponding to the erroneous pose are determined and rendered differentially on the effective model to prompt the target user; differential rendering can use colour contrast, highlighting and other means. Further, corresponding pose-correction guidance can be generated for the differentially rendered effective model, so that the target user can quickly correct the erroneous pose toward the standard pose. The pose-correction guidance may be textual guidance presented beside the skeletal pose model, animated guidance embodied as a continuous action, or voice guidance leading the user through the pose.
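Tying the two branches together, the feedback step might be sketched as below. The tier wording, the 80% threshold and the per-joint deviation input are illustrative, drawn from the example above rather than from the claims.

```python
from typing import Dict, List, Tuple

def pose_feedback(consistency: float,
                  joint_deviation_deg: Dict[str, float],
                  threshold: float = 0.80,
                  tolerance_deg: float = 45.0) -> Tuple[str, str, List[str]]:
    """Map a pose-consistency degree to tiered feedback.
    Returns (kind, message, joints_to_highlight)."""
    if consistency >= threshold:
        if consistency >= 0.95:
            return "praise", "Excellent!", []
        if consistency >= 0.85:
            return "praise", "Great!", []
        return "praise", "Good", []
    # Below threshold: joints to render in a contrasting colour on the
    # effective model, as in the differential rendering described above.
    bad = [j for j, dev in joint_deviation_deg.items() if dev > tolerance_deg]
    return "correct", "adjust the highlighted limbs", bad
```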
To deepen understanding, the present disclosure also provides a concrete usage flow of a smart fitness mirror placed in a fitness scene to provide users with fitness-action guidance, for example:
as shown in fig. 5-1, the user first selects a desired exercise course on the smart fitness mirror;
as shown in fig. 5-2, the smart fitness mirror presents a prompt asking the target user to stand 1-1.5 meters from the mirror;
at this point the smart fitness mirror turns on its camera to scan and identify the body data of all users in the field of view: it captures human action poses, analyses and abstracts them into key skeletal-point information, connects the key skeletal points to generate a skeletal pose model, and outlines the model's human contour in matchstick-person form;
as shown in fig. 5-3, the smart fitness mirror identifies the user nearest the mirror based on visual nearest-human-skeletal-point detection and highlights that user's skeletal pose model in white;
as shown in figs. 5-4, 5-5 and 5-6, the mirror waits for error-capture feedback while counting down 3-2-1; if no error-capture feedback is received, the smart fitness mirror prompts the target user that locking is complete, as shown in fig. 5-7, and formally enters the subsequent fitness course after the transition screen of fig. 5-8.
After formally entering a fitness course, the user tries to imitate the standard fitness actions presented in the mirror, and the smart fitness mirror continuously monitors the locked user's real-time pose and renders the corresponding skeletal pose model from it, i.e., displays the matchstick-person animation.
Meanwhile, a round dance floor, for example, can be placed around the matchstick person and controlled to show special effects in combination with the matchstick person's state and the stage of the course; for example, when the user walks out of the motion-capture area in front of the mirror, the matchstick person turns into a red dotted outline and the stage shows a light-red effect.
When the user's actions differ obviously from the standard actions, this is reflected on the matchstick person by differentially rendering the erroneous limbs (see the red lower half of the white matchstick person in fig. 6-1); if the user leaves the motion-capture area in front of the smart fitness mirror, or the camera is blocked, a corresponding prompt can be issued (see the lens-blocked prompt in fig. 6-2).
When the user's actions come closer to the standard actions, the corresponding action score is presented quantitatively, and from low to high score ranges the stage shows corresponding multicolour special effects, for example prompting the user in turn with "Good", "Great", "Perfect" and the like (corresponding to figs. 7-1, 7-2 and 7-3 respectively).
Specifically, the action score may be calculated as follows:
The smart fitness mirror captures the body image in front of the screen, obtains human skeletal key points with a visual-recognition algorithm, and, using an action-matching algorithm, matches the user's skeletal key frames against the key action frames of the standard training actions in the course, returning the matching result in real time. The matching result is converted into prompts the user can easily understand, such as "bring the elbow closer to the torso". If the user's action pose matches the standard action to a relatively high degree, and the action response has a certain rhythm (concretely, the user's key data frames must arrive within a certain time window), a corresponding AI score is computed. During this period the user's motion amplitude is also recognised (a larger range of motion is taken to indicate a higher vitality value); the vitality value is converted into a corresponding course score, and the user's exercise score is recorded.
With further reference to FIG. 8, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of a user capturing apparatus, which corresponds to the method embodiment shown in FIG. 2 and is particularly applicable to a variety of electronic devices.
As shown in fig. 8, the user capturing apparatus 800 of this embodiment may include: a skeletal pose model generating unit 801, a suspected model highlighting unit 802, and an effective model determination and capture pose adjustment unit 803. The skeletal pose model generating unit 801 is configured to perform whole-body skeletal key-point scanning on users entering the capture area and generate skeletal pose models; the suspected model highlighting unit 802 is configured to highlight, in a preset display area of the smart mirror, the skeletal pose model corresponding to the user closest to the smart mirror as a suspected model; the effective model determination and capture pose adjustment unit 803 is configured to, in response to not receiving error-capture feedback on the displayed suspected model within a preset time period, lock the suspected model as the effective model corresponding to the target user, and control the capture component to take the target user as the capture object and adjust the capture pose.
In the user capturing apparatus 800 of this embodiment, the specific processing of the skeletal pose model generating unit 801, the suspected model highlighting unit 802, and the effective model determination and capture pose adjustment unit 803, and their technical effects, may refer to the descriptions of steps 201-203 in the embodiment corresponding to fig. 2, and are not repeated here.
In some optional implementations of this embodiment, the user capturing apparatus 800 may further include:
and the error capture feedback processing unit is configured to respond to the error capture feedback received in the preset time period and the displayed suspected model, extract a re-capture instruction from the error capture feedback according to the feedback form of the error capture feedback, and re-determine a new suspected model according to the re-capture instruction.
In some optional implementations of this embodiment, the error-capture feedback processing unit may include a re-capture indication extraction subunit configured to extract a re-capture indication from the error-capture feedback according to its feedback form; the re-capture indication extraction subunit may further be configured to:
in response to the feedback form of the error-capture feedback being voice, perform semantic recognition on the received voice feedback signal and extract action information and emotion-tendency information from the semantic recognition result;
determine, according to the emotion-tendency information and the action information, the positional relation, relative to the suspected model, of the skeletal pose model indicating the target user.
Correspondingly, the error-capture feedback processing unit may include a new-suspected-model re-determination subunit configured to re-determine a new suspected model according to the re-capture indication; this subunit may in turn be configured to:
determine a new suspected model according to the current position of the suspected model and the positional relation.
In some optional implementations of this embodiment, the user capturing apparatus 800 may further include:
a real-time rendering unit configured to render the skeletal pose of the effective model in real time according to the captured real-time action information of the target user;
a pose consistency comparison unit configured to compare the skeletal pose with the standard skeletal pose at the same time point to obtain a degree of pose consistency;
and a pose feedback unit configured to present pose feedback corresponding to the degree of pose consistency.
In some optional implementations of this embodiment, the pose feedback unit may further be configured to:
in response to the degree of pose consistency exceeding a preset degree, present corresponding positive-incentive feedback according to how far the degree of pose consistency exceeds the preset degree;
and in response to the degree of pose consistency not exceeding the preset degree, determine the erroneous bones corresponding to the erroneous pose and render the erroneous bones differentially on the effective model.
In some optional implementations of this embodiment, the user capturing apparatus 800 may further include:
and the gesture correction guidance generating unit is configured to generate corresponding gesture correction guidance for the effective model with the differentiated rendering.
This embodiment exists as an embodiment of the apparatus corresponding to the above-described method embodiment.
For the smart mirror, a large-screen smart device, the user capturing apparatus provided by this embodiment of the disclosure expresses the user's pose as a skeletal pose model, which simplifies the rendering workload, shortens rendering time and thus improves real-time performance. Meanwhile, to prevent a non-target user from being erroneously captured as the target user under the nearest-distance rule, the selected model is additionally highlighted as a suspected model, and whether a capture error occurred is confirmed by whether error-capture feedback is received. When no such feedback arrives, the suspected model is locked as the effective model corresponding to the target user, the target user is taken as the capture object, and the capture component is guided to adjust its capture pose accordingly, which helps capture more comprehensive user pose information and improves the user experience of the smart mirror.
According to an embodiment of the present disclosure, the present disclosure further provides an electronic device, comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor, the instructions, when executed, enabling the at least one processor to implement the user capturing method described in any of the embodiments above.
According to an embodiment of the present disclosure, there is also provided a readable storage medium storing computer instructions which, when executed, enable a computer to implement the user capturing method described in any of the embodiments above.
According to an embodiment of the present disclosure, the present disclosure also provides a computer program product which, when executed by a processor, implements the user capturing method described in any of the embodiments above.
Fig. 9 shows a schematic block diagram of an example electronic device 900 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 9, the apparatus 900 includes a computing unit 901 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 902 or a computer program loaded from a storage unit 908 into a Random Access Memory (RAM) 903. In the RAM 903, various programs and data required for the operation of the device 900 can also be stored. The computing unit 901, the ROM 902, and the RAM 903 are connected to each other by a bus 904. An input/output (I/O) interface 905 is also connected to the bus 904.
Various components in device 900 are connected to I/O interface 905, including: an input unit 906 such as a keyboard, a mouse, or the like; an output unit 907 such as various types of displays, speakers, and the like; a storage unit 908 such as a magnetic disk, an optical disk, or the like; and a communication unit 909 such as a network card, modem, wireless communication transceiver, or the like. The communication unit 909 allows the device 900 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunications networks.
The computing unit 901 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 901 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 901 performs the respective methods and processes described above, such as a user capturing method. For example, in some embodiments, the user capture method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 908. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 900 via the ROM 902 and/or the communication unit 909. When the computer program is loaded into RAM 903 and executed by the computing unit 901, one or more steps of the user capture method described above may be performed. Alternatively, in other embodiments, the computing unit 901 may be configured to perform the user capture method by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuit systems, field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), application-specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs that can be executed and/or interpreted on a programmable system including at least one programmable processor, which may be special-purpose or general-purpose, and which can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), and the internet.
The computer system may include a client and a server. The client and server are generally remote from each other and typically interact through a communication network; their relationship arises from computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, a host product in the cloud computing service system that overcomes the drawbacks of difficult management and weak service scalability found in traditional physical hosts and virtual private server (VPS) services.
For the smart mirror, a large-screen smart device, the technical solution provided by this embodiment of the disclosure expresses the user's pose as a skeletal pose model, which simplifies the rendering workload, shortens rendering time and thus improves real-time performance. Meanwhile, to prevent a non-target user from being erroneously captured as the target user under the nearest-distance rule, the selected model is additionally highlighted as a suspected model, and whether a capture error occurred is confirmed by whether error-capture feedback is received. When no such feedback arrives, the suspected model is locked as the effective model corresponding to the target user, the target user is taken as the capture object, and the capture component is guided to adjust its capture pose accordingly, which helps capture more comprehensive user pose information and improves the user experience of the smart mirror.
It should be appreciated that steps may be reordered, added or deleted using the various forms of flow shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, as long as the desired results of the technical solution of the present disclosure can be achieved; no limitation is imposed here.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (10)

1. A user capturing method, applied to a smart mirror, comprising:
scanning whole-body skeletal key points of users entering a capture area to generate skeletal pose models;
highlighting, in a preset display area of the smart mirror, the skeletal pose model corresponding to the user closest to the smart mirror as a suspected model;
in response to not receiving error-capture feedback on the displayed suspected model within a preset time period, locking the suspected model as an effective model corresponding to a target user, and controlling a capture component to take the target user as a capture object and adjust a capture pose; and
in response to receiving error-capture feedback on the displayed suspected model within the preset time period, a feedback form of the error-capture feedback being voice: performing semantic recognition on the received voice feedback signal, and extracting action information and emotion-tendency information from a semantic recognition result; determining, according to the emotion-tendency information and the action information, a positional relation, relative to the suspected model, of the skeletal pose model indicating the target user; and determining a new suspected model according to a current position of the suspected model and the positional relation.
2. The method of claim 1, further comprising:
rendering the skeletal pose of the effective model in real time according to captured real-time action information of the target user;
comparing the skeletal pose with a standard skeletal pose at a same time point to obtain a degree of pose consistency;
and presenting pose feedback corresponding to the degree of pose consistency.
3. The method of claim 2, wherein the presenting pose feedback corresponding to the degree of pose consistency comprises:
in response to the degree of pose consistency exceeding a preset degree, presenting corresponding positive-incentive feedback according to how far the degree of pose consistency exceeds the preset degree;
and in response to the degree of pose consistency not exceeding the preset degree, determining erroneous bones corresponding to an erroneous pose, and rendering the erroneous bones differentially on the effective model.
4. The method according to claim 3, further comprising:
generating corresponding pose-correction guidance for the differentially rendered effective model.
5. A user capturing apparatus, applied to a smart mirror, comprising:
a skeletal pose model generating unit configured to perform whole-body skeletal key-point scanning on users entering a capture area and generate skeletal pose models;
a suspected model highlighting unit configured to highlight, in a preset display area of the smart mirror, the skeletal pose model corresponding to the user closest to the smart mirror as a suspected model;
an effective model determination and capture pose adjustment unit configured to, in response to not receiving error-capture feedback on the displayed suspected model within a preset time period, lock the suspected model as an effective model corresponding to a target user, and control a capture component to take the target user as a capture object and adjust a capture pose; and
an error-capture feedback processing unit configured to, in response to receiving error-capture feedback on the displayed suspected model within the preset time period, a feedback form of the error-capture feedback being voice: perform semantic recognition on the received voice feedback signal and extract action information and emotion-tendency information from a semantic recognition result; determine, according to the emotion-tendency information and the action information, a positional relation, relative to the suspected model, of the skeletal pose model indicating the target user; and determine a new suspected model according to a current position of the suspected model and the positional relation.
6. The apparatus of claim 5, further comprising:
a real-time rendering unit configured to render the skeletal pose of the effective model in real time according to captured real-time action information of the target user;
a pose consistency comparison unit configured to compare the skeletal pose with a standard skeletal pose at a same time point to obtain a degree of pose consistency;
and a pose feedback unit configured to present pose feedback corresponding to the degree of pose consistency.
7. The apparatus of claim 6, wherein the pose feedback unit is further configured to:
in response to the degree of pose consistency exceeding a preset degree, present corresponding positive-incentive feedback according to how far the degree of pose consistency exceeds the preset degree;
and in response to the degree of pose consistency not exceeding the preset degree, determine erroneous bones corresponding to an erroneous pose and render the erroneous bones differentially on the effective model.
8. The apparatus of claim 7, further comprising:
a pose-correction guidance generating unit configured to generate corresponding pose-correction guidance for the differentially rendered effective model.
9. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the user capturing method of any one of claims 1-4.
10. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the user capturing method of any one of claims 1-4.
CN202111314304.1A (filed 2021-11-08) — User capturing method, apparatus, device, storage medium and computer program product — granted as CN114035683B (Active)

Priority Applications (1)

Application CN202111314304.1A — priority and filing date 2021-11-08 — User capturing method, apparatus, device, storage medium and computer program product (granted as CN114035683B)

Applications Claiming Priority (1)

Application CN202111314304.1A — priority and filing date 2021-11-08 — User capturing method, apparatus, device, storage medium and computer program product (granted as CN114035683B)

Publications (2)

Publication Number Publication Date
CN114035683A (en) 2022-02-11
CN114035683B (en) 2024-03-29

Family

ID=80143375

Family Applications (1)

Application CN202111314304.1A — status Active — granted as CN114035683B

Country Status (1)

Country Link
CN (1) CN114035683B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116153510B (en) * 2023-02-17 2024-04-16 河南翔宇医疗设备股份有限公司 Correction mirror control method, device, equipment, storage medium and intelligent correction mirror

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
MA41117A (en) * 2014-12-05 2017-10-10 Myfiziq Ltd IMAGING OF A BODY

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2011065578A (en) * 2009-09-18 2011-03-31 Konami Digital Entertainment Co Ltd Image generating device, image generating method and program
CN103916592A (en) * 2013-01-04 2014-07-09 三星电子株式会社 Apparatus and method for photographing portrait in portable terminal having camera
CN107050774A (en) * 2017-05-17 2017-08-18 上海电机学院 A kind of body-building action error correction system and method based on action collection
WO2019013736A1 (en) * 2017-07-10 2019-01-17 Siemens Mobility GmbH Calibration of model data based on user feedback
DE112019006278T5 (en) * 2018-12-18 2021-10-14 4D Health Science Llc FULLY INTERACTIVE, VIRTUAL SPORTS AND WELLNESS TRAINER IN REAL TIME AND PHYSIOTHERAPY SYSTEM
CN110427100A (en) * 2019-07-03 2019-11-08 武汉子序科技股份有限公司 A kind of movement posture capture system based on depth camera
CN113450438A (en) * 2020-03-24 2021-09-28 深圳市灼华互娱科技有限公司 Virtual character driving method and device based on motion capture and computer equipment
CN111639612A (en) * 2020-06-04 2020-09-08 浙江商汤科技开发有限公司 Posture correction method and device, electronic equipment and storage medium
CN111986775A (en) * 2020-08-03 2020-11-24 深圳追一科技有限公司 Body-building coach guiding method and device for digital person, electronic equipment and storage medium
CN111985393A (en) * 2020-08-18 2020-11-24 深圳市瓴鹰智能科技有限公司 Intelligent mirror for correcting motion posture and motion posture correcting method thereof
CN112016509A (en) * 2020-09-07 2020-12-01 中国银行股份有限公司 Personnel station position abnormity reminding method and device
CN112882575A (en) * 2021-02-24 2021-06-01 宜春职业技术学院(宜春市技术工人学校) Panoramic dance action modeling method and dance teaching auxiliary system
CN113420719A (en) * 2021-07-20 2021-09-21 北京百度网讯科技有限公司 Method and device for generating motion capture data, electronic equipment and storage medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Research on the structural principles of smart mirror systems and analysis of application development trends; Li Jianwen; Chen Xianlu; Yu Kai; Electronic World (Issue 11); pp. 11-12 *
Don't underestimate Xiaodu: its trendy brand "Tian Tian" adds a smart fitness mirror; Da Da, Zhou Sen; https://zhuanlan.zhihu.com/p/429752282; pp. 1-11 *
Virtual character control method based on skeletal information; Li Hongbo; Sun Boyuan; Li Shuangsheng; Journal of Chongqing University of Posts and Telecommunications (Natural Science Edition) (Issue 01); pp. 81-89 *

Also Published As

Publication number Publication date
CN114035683A (en) 2022-02-11

Similar Documents

Publication Publication Date Title
CN109345556A Neural network foreground separation for mixed reality
JP7334292B2 (en) Face biometric detection method, device, electronic device and storage medium
US11790614B2 (en) Inferring intent from pose and speech input
WO2022227393A1 (en) Image photographing method and apparatus, electronic device, and computer readable storage medium
JP2021526698A (en) Image generation methods and devices, electronic devices, and storage media
CN111783647A (en) Training method of face fusion model, face fusion method, device and equipment
CN109200576A Motion-sensing game method, apparatus, device and storage medium for robot projection
CN111709362B (en) Method, device, equipment and storage medium for determining important learning content
CN111860362A (en) Method and device for generating human face image correction model and correcting human face image
CN106125932A Method, apparatus and mobile terminal for recognizing a target object in augmented reality
US20210312167A1 (en) Server device, terminal device, and display method for controlling facial expressions of a virtual character
US20230066179A1 (en) Interactive fashion with music ar
CN114035683B (en) User capturing method, apparatus, device, storage medium and computer program product
KR102316165B1 (en) Apparatus and method for generating attack image of deep learning based face recognition system
WO2023039390A1 (en) Controlling ar games on fashion items
WO2023055825A1 (en) 3d upper garment tracking
WO2022111458A1 (en) Image capture method and apparatus, electronic device, and storage medium
CN111259183A (en) Image recognizing method and device, electronic equipment and medium
CN112949467B (en) Face detection method, device, electronic equipment and storage medium
CN112866577B (en) Image processing method and device, computer readable medium and electronic equipment
WO2023232103A1 (en) Film-watching interaction method and apparatus, and computer-readable storage medium
KR102511495B1 (en) Method for generating realistic content
CN115061577A (en) Hand projection interaction method, system and storage medium
CN114296627A (en) Content display method, device, equipment and storage medium
CN111461005B (en) Gesture recognition method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant