CN113076004B - Method and device for dynamically evaluating user data based on an immersive device


Info

Publication number: CN113076004B
Application number: CN202110389780.3A
Authority: CN (China)
Prior art keywords: user, data, determining, voice data, scene
Legal status: Active
Other languages: Chinese (zh)
Other versions: CN113076004A
Inventor: 杨昊鹏
Current Assignee: Beijing Yinxu Technology Co ltd
Original Assignee: Beijing Yinxu Technology Co ltd
Application filed by Beijing Yinxu Technology Co ltd
Priority to CN202110389780.3A
Publication of CN113076004A
Application granted
Publication of CN113076004B
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/011 Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
    • G06F3/017 Gesture based interaction, e.g. based on a set of recognized hand gestures
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06F2203/00 Indexing scheme relating to G06F3/00 - G06F3/048
    • G06F2203/01 Indexing scheme relating to G06F3/01
    • G06F2203/012 Walk-in-place systems for allowing a user to walk in a virtual environment while constraining him to a given position in the physical environment

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Multimedia (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The application discloses a method and a device for dynamically evaluating user data based on an immersive device. The method for dynamically evaluating user data based on the immersive device comprises the following steps: acquiring, based on the immersive device, head motion data, limb motion and gesture data, and voice data generated by a user in an interaction scene; analyzing the head motion data and determining first user behavior information expressed by the head motion data in the interaction scene; analyzing the limb motion and gesture data and determining second user behavior information expressed by the limb motion and gesture data in the interaction scene; and analyzing the voice data and determining user language information expressed by the voice data in the interaction scene.

Description

Method and device for dynamically evaluating user data based on an immersive device
Technical Field
The application relates to the technical field of immersive technology, and in particular to a method and a device for dynamically evaluating user data based on an immersive device.
Background
Immersive technology refers to technology that blurs the boundary between the physical world and the simulated world, thereby creating a sense of immersion. Immersive technology mixes the real and the virtual: a computer-generated simulated environment allows a user to enter a virtual space through a VR device and obtain a sense of immersion from it. Immersive technology is an emerging technology whose most distinctive feature, compared with other technologies, is that it gives the user the most intuitive experience, and it has enormous room for development in industries such as audio and video entertainment, medical treatment, and education. However, in the applications provided by current immersive devices, it has not yet been realized how to analyze, from multiple dimensions, the user data generated by a user in an interaction scene and how to determine, according to the analysis result, the visual information expressed by the user based on that data.
No effective solution has yet been proposed for the technical problem in the prior art that, for the user data generated by a user in an interaction scene, it has not been realized how to analyze the user data from multiple dimensions and determine, according to the analysis result, the visual information expressed by the user based on that data.
Disclosure of Invention
Embodiments of the present disclosure provide a method and a device for dynamically evaluating user data based on an immersive device, so as to at least solve the technical problem in the prior art that it has not yet been realized how to analyze, from multiple dimensions, the user data generated by a user in an interaction scene and how to determine, according to the analysis result, the visual information expressed by the user based on that data.
According to one aspect of the embodiments of the present disclosure, there is provided a method for dynamically evaluating user data based on an immersive device, comprising: acquiring, based on the immersive device, head motion data, limb motion and gesture data, and voice data generated by a user in an interaction scene; analyzing the head motion data, and determining first user behavior information expressed by the head motion data in the interaction scene; analyzing the limb motion and gesture data, and determining second user behavior information expressed by the limb motion and gesture data in the interaction scene; and analyzing the voice data, and determining user language information expressed by the voice data in the interaction scene.
According to another aspect of the embodiments of the present disclosure, there is also provided a storage medium including a stored program, wherein the method of any one of the above is performed by a processor when the program is run.
According to another aspect of the embodiments of the present disclosure, there is also provided an apparatus for dynamically evaluating user data based on an immersive device, comprising: a data acquisition module, configured to acquire, based on the immersive device, head motion data, limb motion and gesture data, and voice data generated by a user in an interaction scene; a first determining module, configured to analyze the head motion data and determine first user behavior information expressed by the head motion data in the interaction scene; a second determining module, configured to analyze the limb motion and gesture data and determine second user behavior information expressed by the limb motion and gesture data in the interaction scene; and a third determining module, configured to analyze the voice data and determine user language information expressed by the voice data in the interaction scene.
According to another aspect of the embodiments of the present disclosure, there is also provided an apparatus for dynamically evaluating user data based on an immersive device, comprising: a processor; and a memory, coupled to the processor and configured to provide the processor with instructions for the following processing steps: acquiring, based on the immersive device, head motion data, limb motion and gesture data, and voice data generated by a user in an interaction scene; analyzing the head motion data, and determining first user behavior information expressed by the head motion data in the interaction scene; analyzing the limb motion and gesture data, and determining second user behavior information expressed by the limb motion and gesture data in the interaction scene; and analyzing the voice data, and determining user language information expressed by the voice data in the interaction scene.
In the embodiments of the present disclosure, the computing device separately acquires the head motion data, the limb motion and gesture data, and the voice data generated by a user in an interaction scene, and, in combination with the user's current interaction scene, determines the user behavior information and language information that the user behavior data and voice data express in that scene. Based on the immersive device, the computing device performs a multidimensional analysis of the user data and determines the user behavior information and language information expressed by the user data (including the head motion data, the limb motion and gesture data, and the voice data) generated in the interaction scene, thereby achieving the purpose of dynamically evaluating the user data. This solves the technical problem in the prior art that it has not yet been realized how to analyze, from multiple dimensions, the user data generated by a user in an interaction scene and how to determine, according to the analysis result, the visual information expressed by the user through that data.
Drawings
The accompanying drawings, which are included to provide a further understanding of the disclosure, illustrate the present disclosure and, together with the description, serve to explain it. In the drawings:
FIG. 1 is a block diagram of a hardware architecture of a computing device for implementing a method according to embodiment 1 of the present disclosure;
FIG. 2 is a flow chart of a method for dynamically evaluating user data based on an immersive device in accordance with a first aspect of embodiment 1 of the present disclosure;
FIG. 3 is a schematic diagram of a VR device according to the first aspect of embodiment 1 of the present disclosure;
FIG. 4 is a schematic diagram of six head motion categories based on preset category definition rules according to the first aspect of embodiment 1 of the present disclosure;
FIG. 5a is a schematic diagram of limb behavior and gestures generated by a user in an interactive scenario according to the first aspect of embodiment 1 of the present disclosure;
FIG. 5b is another schematic diagram of limb behavior and gestures generated by a user in an interactive scenario according to the first aspect of embodiment 1 of the present disclosure;
FIG. 6 is a flow chart of acquiring voice data generated by a user in an interactive scenario according to the first aspect of embodiment 1 of the present disclosure;
FIG. 7 is a schematic illustration of various analysis dimensions and their corresponding analysis results related to analyzing speech data according to the first aspect of embodiment 1 of the present disclosure;
FIG. 8 is a schematic diagram of an apparatus for dynamically evaluating user data based on an immersive device according to embodiment 2 of the present disclosure; and
FIG. 9 is a schematic diagram of an apparatus for dynamically evaluating user data based on an immersive device according to embodiment 3 of the present disclosure.
Detailed Description
In order to better understand the technical solutions of the present disclosure, the technical solutions in the embodiments of the present disclosure will be described clearly and completely below with reference to the drawings in those embodiments. It is apparent that the described embodiments are merely some, rather than all, of the embodiments of the present disclosure. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in this disclosure without inventive effort shall fall within the scope of protection of the present disclosure.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the foregoing figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the disclosure described herein may be capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
Example 1
According to the present embodiment, there is provided an embodiment of a method for dynamically evaluating user data based on an immersive device. It should be noted that the steps illustrated in the flowchart of the drawings may be performed in a computer system such as a set of computer-executable instructions, and that, although a logical order is shown in the flowchart, in some cases the steps shown or described may be performed in an order different from the order shown herein.
The method embodiments provided by the present embodiment may be performed in a server or a similar computing device. FIG. 1 shows a block diagram of a hardware architecture of a computing device for implementing the method for dynamically evaluating user data based on an immersive device. As shown in FIG. 1, the computing device may include one or more processors (which may include, but are not limited to, processing means such as a microprocessor (MCU) or a programmable logic device (FPGA)), a memory for storing data, and a transmission means for communication functions. In addition, the computing device may further include: a display, an input/output interface (I/O interface), a Universal Serial Bus (USB) port (which may be included as one of the ports of the I/O interface), a network interface, a power supply, and/or a camera. It will be appreciated by those of ordinary skill in the art that the configuration shown in FIG. 1 is merely illustrative and does not limit the configuration of the electronic device described above. For example, the computing device may also include more or fewer components than shown in FIG. 1, or have a different configuration than shown in FIG. 1.
It should be noted that the one or more processors and/or other data processing circuits described above may be referred to herein generally as "data processing circuits". The data processing circuit may be embodied in whole or in part as software, hardware, firmware, or any combination thereof. Furthermore, the data processing circuit may be a single stand-alone processing module, or may be incorporated in whole or in part into any of the other elements in the computing device. As referred to in the embodiments of the present disclosure, the data processing circuit acts as a kind of processor control (for example, selection of a variable resistance termination path connected to an interface).
The memory may be used to store software programs and modules of application software, such as the program instructions/data storage device corresponding to the method for dynamically evaluating user data based on an immersive device in the embodiments of the disclosure. By running the software programs and modules stored in the memory, the processor executes various functional applications and data processing, that is, implements the above method for dynamically evaluating user data based on an immersive device. The memory may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory may further include memory remotely located with respect to the processor, and such remote memory may be connected to the computing device via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission means is used for receiving or transmitting data via a network. Specific examples of the network described above may include a wireless network provided by a communications provider of the computing device. In one example, the transmission means includes a network adapter (Network Interface Controller, NIC) that can be connected to other network devices via the base station to communicate with the Internet. In one example, the transmission device may be a Radio Frequency (RF) module, which is used to communicate with the internet wirelessly.
The display may be, for example, a touch screen type Liquid Crystal Display (LCD) that may enable a user to interact with a user interface of the computing device.
It should be noted herein that, in some alternative embodiments, the computing device shown in FIG. 1 may include hardware elements (including circuitry), software elements (including computer code stored on a computer-readable medium), or a combination of both hardware and software elements. It should also be noted that FIG. 1 is only one example of a specific implementation and is intended to illustrate the types of components that may be present in the computing device described above.
In the above operating environment, according to the first aspect of the present embodiment, there is provided a method for dynamically evaluating user data based on an immersive device. FIG. 2 shows a schematic flow chart of the method; referring to FIG. 2, the method includes:
S202: based on the immersion equipment, head action data, limb action and gesture data and voice data generated by a user under an interaction scene are respectively acquired;
S204: analyzing the head action data, and determining first user behavior information expressed by the head action data in an interactive scene;
s206: analyzing the limb actions and the gesture data, and determining second user behavior information expressed by the limb actions and the gesture data in an interactive scene; and
S208: and analyzing the voice data to determine the user language information expressed by the voice data in the interactive scene.
Specifically, the computing device acquires, based on the immersive device, the head motion data, the limb motion and gesture data, and the voice data generated by a user in an interaction scene. The immersive device includes an immersive head-mounted display, handles, an array microphone, and the like. In the interaction scene simulated by the immersive device, the computing device can acquire the head motion data generated by the user through the immersive head-mounted display, the limb motion and gesture data generated by the user through the handle device, and the voice data generated by the user through the array microphone. As shown in FIG. 3, immersive (VR) devices can be divided into 3DoF and 6DoF devices; compared with 3DoF, a 6DoF device additionally establishes the XYZ coordinate axes of the user's current position, and user positioning is mainly based on SLAM (Simultaneous Localization and Mapping) technology, which refers to the process of calculating the position of a moving object from sensor information while simultaneously constructing a map of the environment. The head motion data acquired by the computing device through the immersive head-mounted display are the spatial coordinates of the user's head at different time points, the limb motion and gesture data acquired through the handle device are the spatial coordinates of the user's limbs and gestures at different time points, and the voice data acquired through the array microphone are audio files corresponding to the sounds made by the user in the interaction scene (S202).
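For illustration only, the following Python sketch shows one possible in-memory representation of the three data streams described above (timed spatial coordinates for the head and hands, plus an audio file for the voice); the class and field names are assumptions of this sketch and are not part of the disclosure.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SpatialSample:
    """One sampled spatial coordinate (x, y, z) at timestamp t (seconds)."""
    t: float
    x: float
    y: float
    z: float


@dataclass
class InteractionCapture:
    """User data acquired through the immersive device in one interaction scene."""
    scene_category: str                                               # e.g. "two-party dialogue", "tennis match"
    head_samples: List[SpatialSample] = field(default_factory=list)   # from the head-mounted display
    hand_samples: List[SpatialSample] = field(default_factory=list)   # from the handle / hand tracking
    voice_audio_path: str = ""                                        # audio file recorded by the array microphone
```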
Further, since the same head motion data generated by the user in different interaction scenes may express different user behavior information, the computing device needs to analyze the head motion data to determine the user behavior information (corresponding to the first user behavior information) expressed by the head motion data in the interaction scene (S204). From the spatial coordinates of the user's head at different time points, the computing device can analyze the pattern of those coordinates and determine the user behavior exhibited by the head motion, and then, in combination with the scene category of the interaction scene in which the user is currently located, determine the user behavior information the user intends to express through that behavior in the interaction scene. For example, in an interaction scene of a two-party dialogue, the computing device analyzes the spatial coordinates of the user's head at different time points and finds that they rotate about the X axis; the computing device may then determine that the user behavior exhibited by the head motion is, for example, nodding, so that, in combination with the scene category of the current interaction scene (i.e., the two-party dialogue scene), it can determine that, based on the nodding behavior, the user behavior information the user intends to express in the dialogue scene is, for example, affirmation of or agreement with the current matter.
Further, since the same limb motion and gesture data generated by the user in different interaction scenes may express different user behavior information, the computing device also needs to analyze the limb motion and gesture data to determine the user behavior information (corresponding to the second user behavior information) expressed by the limb motion and gesture data in the interaction scene (S206). From the spatial coordinates of the user's limb motions and gestures at different time points, the computing device can analyze the pattern of those coordinates and determine the user behavior exhibited by the limb motions and gestures, and then, in combination with the scene category of the interaction scene in which the user is currently located, determine the user behavior information the user intends to express in that scene. For example, in an interaction scene of a tennis match, the computing device analyzes the spatial coordinates of the limb motions and gestures at different time points and finds that they trace a forward semicircular arc; the computing device may then determine that the user behavior exhibited by the limb motions and gestures is, for example, a forward swing of the arm, so that, in combination with the scene category of the current interaction scene (i.e., the tennis match scene), it can determine that the user behavior information the user intends to express in the tennis match scene is, for example, that the user is swinging the racket. As another example, in an interaction scene of outdoor activities, the computing device analyzes the spatial coordinates of the user's limbs at different time points and finds that they still trace a forward semicircular arc; the computing device may then determine that the user behavior exhibited by the limb motions is, again, a forward swing of the arm, so that, in combination with the scene category of the current interaction scene (i.e., the outdoor activity scene), it can determine that the user behavior information the user intends to express in the outdoor activity scene is, for example, that the user is playing with a butterfly.
Further, since the same voice data generated by the user in different interaction scenes may express different user language information, the computing device needs to analyze the voice data to determine the user language information expressed by the voice data in the interaction scene (S208). In the current interaction scene, the computing device can determine the user language information the user intends to express according to the user's voice data in combination with the scene category of the interaction scene in which the user is currently located. For example, in an interaction scene of watching a game, the computing device determines, based on the user's voice data such as "good" and in combination with the scene category of the current interaction scene (i.e., the game scene), that the user language information to be expressed is, for example, the user's praise of an athlete's performance in the game. As another example, in an interaction scene of a two-party dialogue, the computing device determines, based on the user's voice data such as "good" and in combination with the scene category of the current interaction scene (i.e., the two-party dialogue scene), that the user is, for example, agreeing to the other party's request.
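As an illustration of this scene-dependent interpretation, a hypothetical lookup table in Python might restate the examples above; the rule keys and wording are assumptions of this sketch, not part of the disclosed method.

```python
# Hypothetical mapping from (recognized behavior or utterance, scene category) to the
# information it expresses; the entries merely restate the examples given in the text.
INTERPRETATION_RULES = {
    ("nod", "two-party dialogue"): "user affirms or approves the current matter",
    ("forward arm swing", "tennis match"): "user is swinging the racket",
    ("forward arm swing", "outdoor activity"): "user is playing with a butterfly",
    ("good", "game spectating"): "user praises an athlete's performance",
    ("good", "two-party dialogue"): "user agrees to the other party's request",
}


def interpret(behavior: str, scene: str) -> str:
    """Return the user information expressed by a behavior in a given interaction scene."""
    return INTERPRETATION_RULES.get((behavior, scene), "unknown in this scene")
```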
As described in the background, immersive technology is an emerging technology whose most distinctive feature, compared with other technologies, is that it gives the user the most intuitive experience, and it has enormous room for development in industries such as audio and video entertainment, medical treatment, and education. However, it has not yet been realized how to analyze, from multiple dimensions, the user data generated by a user in an interaction scene and how to determine, according to the analysis result, the visual information expressed by the user based on that data.
In view of the above technical problem, according to the technical solution of this embodiment, the computing device separately acquires the head motion data, the limb motion and gesture data, and the voice data generated by the user in the interaction scene, and, in combination with the user's current interaction scene, determines the user behavior information and language information that the user behavior data and voice data express in that scene. Based on the immersive device, the computing device performs a multidimensional analysis of the user data and determines the user behavior information and language information expressed by the user data (including the head motion data, the limb motion and gesture data, and the voice data) generated in the interaction scene, thereby achieving the purpose of dynamically evaluating the user data. This solves the technical problem in the prior art that it has not yet been realized how to analyze, from multiple dimensions, the user data generated by a user in an interaction scene and how to determine, according to the analysis result, the visual information expressed by the user through that data.
Optionally, the operation of analyzing the head motion data and determining the first user behavior information expressed by the head motion data in the interaction scene includes: analyzing the head motion data according to preset category definition rules corresponding to the head motion data, and determining the head motion information represented by the head motion data; determining the time intervals and relative positions involved in calculating the user's head motion trajectory; determining the user's head motion trajectory according to the head motion information, the time intervals, and the relative positions; and analyzing the head motion trajectory based on the scene information of the interaction scene, and determining, according to the analysis result, the first user behavior information expressed by the head motion data in the interaction scene.
Specifically, the computing device analyzes the head motion data according to preset category definition rules corresponding to the head motion data, and determines the head motion information represented by the head motion data. As shown in FIG. 4, the computing device is pre-configured with six category definition rules corresponding to head motion data, namely: yaw about the Y axis (Y AXIS YAW), pitch about the X axis (X AXIS PITCH), roll about the Z axis (Z AXIS ROLL), up/down along the Y axis (Y AXIS UP/DOWN), left/right along the X axis (X AXIS LEFT/RIGHT), and forward/backward along the Z axis (Z AXIS FRONT/BACK). For example, if the category definition rule matching the user's head motion data is yaw about the Y axis, the computing device analyzes the head motion data according to that rule and determines that the head motion information represented by the data is a yaw motion. The computing device obtains the time points and positions recorded while the user's head is moving, determines the time interval t between adjacent time points and the relative position S between adjacent positions, and can calculate the head motion trajectory according to the formula v (vector velocity) = S (relative position) / t (time interval). The head motion trajectory is then analyzed according to the acquired head motion information, time intervals, and relative positions. For example, in a two-party dialogue scene, if the analyzed head motion trajectory keeps deflecting about the Y axis, the head motion is a head shake; in combination with the current interaction scene (the two-party dialogue scene), the first user behavior information expressed by the head-shaking motion is determined to be a denial of the other party's request. The computing device thus analyzes the head motion data through the category definition rules, and determines the user's head motion trajectory based on the head motion information together with the relevant time intervals and relative positions. It then analyzes the head motion trajectory based on the interaction scene, so as to determine the first user behavior information expressed by the head motion data in that scene. This improves the accuracy of the user behavior information obtained by analyzing the user's head motion data, makes it possible to accurately judge the meaning expressed by the user's head motion in the interaction scene, and achieves the technical effect of accurately evaluating the user's head motion data.
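A minimal Python sketch of the trajectory calculation described above, assuming the head samples are timed (t, x, y, z) positions: it computes the per-step vector velocity v = S / t and classifies only the translational categories, since the rotational categories (yaw/pitch/roll) would require the headset's orientation data, which is not modelled here.

```python
import numpy as np


def head_velocity_track(samples):
    """Per-step velocity vectors v = S / t from timed head positions.

    `samples` is a time-ordered list of (t, x, y, z) tuples, as produced by the
    head-mounted display; names and units are illustrative assumptions.
    """
    track = []
    for (t0, *p0), (t1, *p1) in zip(samples, samples[1:]):
        s = np.subtract(p1, p0)          # relative position S between adjacent samples
        dt = t1 - t0                     # time interval t between adjacent samples
        track.append(s / dt if dt > 0 else np.zeros(3))
    return track


def dominant_translation(track):
    """Classify a track against the translational category rules only
    (left/right along X, up/down along Y, front/back along Z)."""
    total = np.sum(np.abs(track), axis=0)
    return ["left/right (X)", "up/down (Y)", "front/back (Z)"][int(np.argmax(total))]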
Optionally, the operation of acquiring, based on the immersive device, the limb motion and gesture data generated by the user in the interaction scene includes: recording the current initial spatial coordinates and initial time of the user's hands according to the initial state of the immersive device; determining, according to the precision required by the preset interaction scene, the time interval at which the limb motion and gesture data generated by the user in the interaction scene are recorded; recording, at the determined time interval, the spatial coordinates of the limb motions and gestures generated by the user in the interaction scene within a preset time; and taking all the spatial coordinates recorded within the preset time as the limb motion and gesture data generated by the user in the interaction scene.
Specifically, referring to FIG. 5a and FIG. 5b, the computing device may capture the user behavior data either through an immersive peripheral such as a handle or through the user's real hands. If the computing device obtains the user behavior data through the user's hands, the hands need to be briefly modelled by the front depth camera of the immersive head-mounted display the first time they are used. If the computing device obtains the user behavior data through an immersive peripheral such as a handle, which relies mainly on Bluetooth or optical positioning, the computing device obtains the start signal transmitted to the handle by the user's limbs after the user activates the handle, and records the initial spatial coordinates and initial time of the user's hands in the interaction scene. Different precisions are preset manually for different interaction scenes; the precision may include three levels (low, medium, and high), and may even include lower or higher precisions as needed, without being limited thereto. Setting different precisions for an interaction scene means that the time intervals at which the immersive device records the user data also differ, for example: a time interval of 10 ms for high precision, 100 ms for medium precision, and 1000 ms for low precision. Assuming the precision of the user's current interaction scene is medium, the computing device determines from this precision that the time interval for recording the limb motion and gesture data generated by the user in the interaction scene is 100 ms, that is, the computing device records the spatial coordinates of one limb motion and gesture every 0.1 second. The computing device then takes the spatial coordinates obtained within a preset time (for example, 5 s) as the limb motion and gesture data generated by the user in the interaction scene. By continuously recording the spatial coordinates of the limb motions and gestures at the precision required by the interaction scene, the computing device collects limb motion and gesture data at the preset precision, so that the limb motion and gesture data generated by the user in the interaction scene can be obtained accurately and comprehensively.
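A minimal sketch of recording at the precision-dependent interval described above, assuming a hypothetical read_hand_pose() callback that returns the current (x, y, z) coordinate from the handle or hand-tracking camera; the interval values simply mirror the example figures in the text.

```python
import time

# Assumed mapping from scene precision to sampling interval, per the example values above.
PRECISION_INTERVAL_MS = {"low": 1000, "medium": 100, "high": 10}


def record_hand_track(read_hand_pose, precision="medium", duration_s=5.0):
    """Record (t, x, y, z) hand/handle coordinates at the precision-dependent interval.

    `read_hand_pose` is a hypothetical callback returning the current (x, y, z)
    coordinate of the user's hand; the returned samples form the limb motion and
    gesture data for the preset time window.
    """
    interval = PRECISION_INTERVAL_MS[precision] / 1000.0
    t_start = time.monotonic()
    samples = []
    while (now := time.monotonic()) - t_start < duration_s:
        samples.append((now - t_start, *read_hand_pose()))
        time.sleep(interval)
    return samples
```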
Optionally, the operation of analyzing the limb motion and gesture data and determining the second user behavior information expressed by the limb motion and gesture data in the interaction scene includes: connecting, in time order, all the spatial coordinates recorded within the preset time, from the initial state to the final state; calculating the acceleration and vector velocity corresponding to every two adjacent spatial coordinates among all the spatial coordinates obtained within the preset time; dividing the trajectory by assigning different colors to spatial coordinates with different values of acceleration and vector velocity, so as to obtain the limb and gesture trajectory corresponding to the user within the preset time; and analyzing the limb and gesture trajectory according to the scene information of the interaction scene, and determining, according to the analysis result, the second user behavior information expressed by the limb motion and gesture data in the interaction scene.
Specifically, the computing device connects, in time order, all the spatial coordinates recorded within the preset time, from the initial state to the final state. For example, the computing device connects the spatial coordinates (x0, y0, z0) of the user's initial state at the initial time (t0) with the next spatial coordinates (x1, y1, z1) corresponding to the next time (t1), and so on, up to the spatial coordinates (xn, yn, zn) of the final state at the end time (tn). The computing device then calculates the acceleration and vector velocity corresponding to every two adjacent coordinates among all the spatial coordinates recorded within the preset time, for example (x0, y0, z0) and (x1, y1, z1), (x1, y1, z1) and (x2, y2, z2), ..., (xn-1, yn-1, zn-1) and (xn, yn, zn). The spatial coordinates with different values of acceleration and vector velocity are divided into trajectory segments of different colors (for example, evolving gradually from red to green), so as to obtain the limb and gesture trajectory corresponding to the user within the preset time. The computing device analyzes the limb and gesture trajectory according to the scene information of the interaction scene, and determines, according to the analysis result, the second user behavior information expressed by the limb motion and gesture data in the interaction scene. For example, in an interaction scene of a tennis match, the user needs to hit a tennis ball displayed in front of him with the tennis racket in his hand. When the computing device wants the user to hit the ball to a specified position, it calculates, from data such as the spatial coordinates, acceleration, and vector velocity that make up the limb and gesture trajectory, the actual force in newtons applied to the ball and the vector direction at the moment of impact, so as to judge what limb motion the user currently needs to simulate and in what gesture direction the ball can be hit to the specified position.
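The per-segment velocity and acceleration calculation and the colour-coded trajectory can be sketched as follows; the direction of the red-to-green mapping and the normalisation are assumptions of this sketch.

```python
import numpy as np


def velocity_acceleration(samples):
    """Per-segment vector velocity and acceleration from timed (t, x, y, z) samples,
    connected in time order from the initial state to the final state."""
    t = np.array([s[0] for s in samples])
    p = np.array([s[1:] for s in samples], dtype=float)
    dt = np.diff(t)[:, None]
    v = np.diff(p, axis=0) / dt          # vector velocity of each adjacent coordinate pair
    a = np.diff(v, axis=0) / dt[1:]      # acceleration between adjacent velocity segments
    return v, a


def speed_to_color(v):
    """Map each segment's speed onto a red-to-green gradient for track visualisation,
    mirroring the colour-coded trajectory described above (colour direction assumed)."""
    speed = np.linalg.norm(v, axis=1)
    s = (speed - speed.min()) / (speed.max() - speed.min() + 1e-9)   # normalise to [0, 1]
    return [(1.0 - x, x, 0.0) for x in s]                            # (R, G, B): slow = red, fast = green
```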
In addition, this technical solution can also improve the precision of the spatial coordinates and simulate the limb motions and gestures the user needs, so that the computing device can present the user's behavior in front of the user in real time, which can be used in practical applications such as content creation. To this end, the computing device needs the user's preset conditions and standard numerical intervals; at the same time, computer vision can be used to learn from and simulate real scenes, and the data parameters can be tuned so that the final latency of the computing device is sufficiently low, thereby continuously improving accuracy.
Optionally, the operation of acquiring, based on the immersive device, the voice data generated by the user in the interaction scene includes: acquiring, based on an array microphone in the immersive device, the user sound data generated by the user in the interaction scene and the sound simulation data of the environmental noise in the interaction scene; and obtaining the voice data generated by the user in the interaction scene after performing noise reduction on the user sound data, based on the sound simulation data, through hardware and software respectively.
Specifically, referring to FIG. 6, the computing device obtains the user sound data generated by the user in the interaction scene and the sound simulation data of the environmental noise in the interaction scene through an array microphone, such as a dual-microphone or triple-microphone array, provided by the immersive device. Taking a triple-microphone array as an example, the two microphones closest to the user are used to acquire the user sound data, and the remaining microphone is used to acquire sound simulation data such as background sound and environmental noise. Based on the collected sound simulation data, the computing device performs noise reduction on the user sound data successively through hardware and software, and thus obtains clear and effective voice data generated by the user in the interaction scene. In this way, the computing device captures the user's original sound through the array microphone and, after noise reduction of the user sound data affected by environmental noise, collects clear and highly accurate voice data.
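The software-side noise-reduction step could, for example, be a simple spectral subtraction that uses the ambient microphone as the noise reference; this generic sketch is only one possible realisation under that assumption, not the pipeline actually used by the device.

```python
import numpy as np


def spectral_subtract(user_mix, ambient, frame=1024, hop=512, alpha=1.0):
    """Frame-by-frame spectral subtraction: remove the ambient microphone's magnitude
    spectrum from the user microphone's signal. Both inputs are assumed to be
    time-aligned mono signals of equal length sampled at the same rate."""
    user_mix = np.asarray(user_mix, dtype=float)
    ambient = np.asarray(ambient, dtype=float)
    out = np.zeros_like(user_mix)
    window = np.hanning(frame)
    for start in range(0, len(user_mix) - frame, hop):
        u = np.fft.rfft(user_mix[start:start + frame] * window)
        n = np.fft.rfft(ambient[start:start + frame] * window)
        mag = np.maximum(np.abs(u) - alpha * np.abs(n), 0.0)      # subtract noise magnitude
        cleaned = np.fft.irfft(mag * np.exp(1j * np.angle(u)), n=frame)
        out[start:start + frame] += cleaned * window              # overlap-add (not renormalised)
    return out
```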
Optionally, the operation of analyzing the voice data and determining the user language information expressed by the voice data in the interaction scene includes: analyzing the voice data and determining the average pronunciation score corresponding to the voice data in the pronunciation quality dimension; analyzing the voice data and determining the average pronunciation rate corresponding to the voice data in the language fluency dimension; analyzing the voice data and determining the language accuracy corresponding to the voice data in the term accuracy dimension; analyzing the voice data and determining the standard language rate corresponding to the voice data in the term standardization dimension; and taking the average pronunciation score, the average pronunciation rate, the language accuracy, and the standard language rate as the user language information expressed by the voice data in the interaction scene, wherein the user language information is used to indicate the user's current capabilities and indexes.
Specifically, referring to FIG. 7, after the computing device obtains the user's voice data, it analyzes the voice data and performs a multidimensional evaluation on it from four aspects: the pronunciation quality dimension, the language fluency dimension, the term accuracy dimension, and the term standardization dimension. The pronunciation quality dimension mainly measures and evaluates, in a follow-up reading/self-confidence reading mode, pronunciation accuracy, reading fluency, completeness, syllable pronunciation score, error type, and the like; by evaluating the voice data along this dimension, the computing device obtains the average pronunciation score. The language fluency dimension is mainly evaluated by disassembling, along multiple dimensions, a recording of the user speaking English in a free dialogue scene; by evaluating the voice data along this dimension, the computing device obtains the average pronunciation rate. The term accuracy dimension is mainly evaluated, under preset conditions with set question points, according to the correctness of the answers, listening comprehension, and the ability to organize language; by evaluating the voice data along this dimension, the computing device obtains the language accuracy. The term standardization dimension mainly concerns whether the language is used correctly in a specific context; by evaluating the voice data along this dimension, the computing device obtains the standard language rate. The computing device takes the average pronunciation score, average pronunciation rate, language accuracy, and standard language rate obtained from the evaluation as the user language information expressed by the voice data in the interaction scene, thereby indicating the user's current capabilities and indexes.
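A small sketch of how the four indicators might be gathered into the user language information; the field names and the per-utterance inputs are assumptions of this sketch rather than the patent's own data model.

```python
from dataclasses import dataclass


@dataclass
class UserLanguageInfo:
    """The four evaluation dimensions described above, reported as the user language
    information for the interaction scene (field names are assumptions)."""
    average_pronunciation_score: float   # pronunciation quality dimension
    average_pronunciation_rate: float    # language fluency dimension (e.g. words per minute)
    language_accuracy: float             # term accuracy dimension (share of correct answers)
    standard_language_rate: float        # term standardization dimension (share of context-appropriate utterances)


def summarize(scores, word_counts, durations_s, correct, total, appropriate, utterances):
    """Combine per-utterance measurements into the four aggregate indicators."""
    return UserLanguageInfo(
        average_pronunciation_score=sum(scores) / len(scores),
        average_pronunciation_rate=60.0 * sum(word_counts) / sum(durations_s),
        language_accuracy=correct / total,
        standard_language_rate=appropriate / utterances,
    )
```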
In this way, through the pronunciation quality dimension, the computing device accurately grasps detailed feedback on the user's actual pronunciation quality, so that the user can learn of his current weaknesses in time, and the computing device recommends the most suitable practice scheme for the weaknesses found by the evaluation so that they can be overcome. Through the language fluency dimension, the computing device suggests to the user how to effectively control the speed of speech when answering questions, and how to complete and describe the content within the effective time when practising under time limits, for example how to introduce oneself within 60 s at a moderate speed while conveying the content completely. Through the term accuracy dimension, the computing device exercises the user's listening and comprehension abilities and, by setting different practice modes, improves how the user organizes language, answers key questions, and so on. Through the term standardization dimension, the computing device strengthens the user's ability to speak with appropriate emotion in the appropriate scene; for example, if a pleasant sentence is used to open a free dialogue and the scene then turns sad, sad expressions should be used, ensuring that the user uses sentences effectively in the current scene. In summary, the computing device evaluates the user's voice data along multiple dimensions, obtains the evaluation result, and feeds it back to the user. Reasonable suggestions are thus provided to the user on the basis of the various data obtained from the evaluation, so that the user's various abilities are improved and the purpose of enhancing the user's capabilities is achieved.
Optionally, the method further comprises: displaying the first user behavior information, the second user behavior information, and the user language information to the user.
Specifically, the computing device analyzes the head motion data, the limb motion and gesture data, and the voice data generated by the user in the interaction scene to obtain the first user behavior information, the second user behavior information, and the user language information, and displays them to the user. In this way, the computing device displays each item of visual information to the user through multidimensional analysis of the user data.
In the embodiments of the present disclosure, the computing device separately acquires the head motion data, the limb motion and gesture data, and the voice data generated by the user in the interaction scene, and, in combination with the user's current interaction scene, determines the user behavior information and language information that the user behavior data and voice data express in that scene. Based on the immersive device, the computing device performs a multidimensional analysis of the user data and determines the user behavior information and language information expressed by the user data (including the head motion data, the limb motion and gesture data, and the voice data) generated in the interaction scene, thereby achieving the purpose of dynamically evaluating the user data. This solves the technical problem in the prior art that it has not yet been realized how to analyze, from multiple dimensions, the user data generated by a user in an interaction scene and how to determine, according to the analysis result, the visual information expressed by the user through that data.
In addition, based on the multidimensional analysis of the immersive device, the computing device makes algorithmic decisions on the integrated data, which are used to locate the various items of visual information represented by the current user behavior data, to evaluate the user, and to display the information to the user. Based on the voice data collected by the microphone array of the immersive device, the computing device can obtain a high-quality, high-precision sound information file after successive hardware and software noise reduction. The computing device analyzes the sound information to show the user's current capabilities and indexes, which can also be used for related core recommendations or for the personalized generation of corresponding scenario and interaction content.
Further, referring to FIG. 1, according to a second aspect of the present embodiment, there is provided a storage medium. The storage medium includes a stored program, wherein the method of any one of the above is performed by a processor when the program is run.
It should be noted that, for simplicity of description, the foregoing method embodiments are all described as a series of acts, but it should be understood by those skilled in the art that the present invention is not limited by the order of acts described, as some steps may be performed in other orders or concurrently in accordance with the present invention. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required for the present invention.
From the description of the above embodiments, it will be clear to a person skilled in the art that the method according to the above embodiments may be implemented by means of software plus the necessary general hardware platform, but of course also by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method according to the embodiments of the present invention.
Example 2
FIG. 8 shows an apparatus 800 for dynamically evaluating user data based on an immersive device according to the present embodiment, which apparatus 800 corresponds to the method according to the first aspect of embodiment 1. Referring to FIG. 8, the apparatus 800 includes: a data acquisition module 810, configured to acquire, based on the immersive device, head motion data, limb motion and gesture data, and voice data generated by a user in an interaction scene; a first determining module 820, configured to analyze the head motion data and determine first user behavior information expressed by the head motion data in the interaction scene; a second determining module 830, configured to analyze the limb motion and gesture data and determine second user behavior information expressed by the limb motion and gesture data in the interaction scene; and a third determining module 840, configured to analyze the voice data and determine user language information expressed by the voice data in the interaction scene.
Optionally, the first determining module 820 includes: the first analysis sub-module is used for analyzing the head motion data according to a preset category definition rule corresponding to the head motion data and determining head motion information represented by the head motion data; a first determination sub-module for determining a time interval and a relative position associated with calculating a head motion trajectory of a user; the second determining submodule is used for determining the head action track of the user according to the head action information, the time interval and the relative position; and the third determining submodule is used for analyzing the head action track based on the scene information of the interaction scene and determining the first user behavior information expressed by the head action data in the interaction scene according to the analysis result.
Optionally, the data acquisition module 810 includes: the first recording submodule is used for recording the initialization space coordinates and the initialization time of the current hands of the user according to the initial state of the immersion equipment; a fourth determining submodule, configured to determine, according to a precision required by a preset interaction scene, a time interval for recording limb actions and gesture data generated by a user in the interaction scene; the second recording submodule is used for recording the spatial coordinates of the limb actions and the gestures generated by the user in the interaction scene in preset time according to the determined time interval; and a fifth determining submodule, configured to use all the spatial coordinates recorded in the preset time as limb motion and gesture data generated by the user in the interaction scene.
Optionally, the second determining module 830 includes: the connection submodule is used for connecting all the space coordinates obtained by recording in the preset time from an initial state to a final state in time sequence; the calculating sub-module is used for calculating the acceleration and the vector speed corresponding to each two adjacent space coordinates in all the space coordinates obtained in a recording way in preset time; the dividing sub-module is used for dividing the space coordinates with different acceleration and vector speed values into tracks according to different colors to obtain limb and gesture tracks corresponding to the user in preset time; and a sixth determining submodule, configured to analyze the limb and the gesture track according to scene information of the interaction scene, and determine second user behavior information expressed by limb motion and gesture data in the interaction scene according to an analysis result.
Optionally, the data acquisition module 810 includes: the acquisition sub-module is used for acquiring user sound data generated by a user in an interaction scene and sound simulation data of environmental noise in the interaction scene based on the array microphone in the immersion type equipment; and the noise reduction processing sub-module is used for obtaining voice data generated by the user under the interaction scene after noise reduction processing is carried out on the voice data of the user through hardware and software respectively based on the voice simulation data.
Optionally, the third determining module 840 includes: the average pronunciation score determining sub-module is used for analyzing the voice data and determining average pronunciation scores corresponding to the voice data under the pronunciation quality dimension; the average pronunciation rate determination submodule is used for analyzing the voice data and determining the average pronunciation rate corresponding to the voice data in the language fluency dimension; the language accuracy rate determination submodule is used for analyzing the voice data and determining the language accuracy rate corresponding to the voice data under the term accuracy dimension; the standard language rate determination submodule is used for analyzing the voice data and determining the standard language rate corresponding to the voice data under the term standard dimension; and the user language information determining submodule is used for taking the average pronunciation score, the average pronunciation rate, the language accuracy and the standard language rate as user language information expressed by the voice data in the interactive scene, wherein the user language information is used for indicating the current various capabilities and indexes of the user.
Optionally, the apparatus 800 further comprises: and the display module is used for displaying the first user behavior information, the second user behavior information and the user language information to the user.
Thus, according to this embodiment, the computing device separately acquires the head motion data, the limb motion and gesture data, and the voice data generated by the user in the interaction scene, and, in combination with the user's current interaction scene, determines the user behavior information and language information that the user behavior data and voice data express in that scene. Based on the immersive device, the computing device performs a multidimensional analysis of the user data and determines the user behavior information and language information expressed by the user data (including the head motion data, the limb motion and gesture data, and the voice data) generated in the interaction scene, thereby achieving the purpose of dynamically evaluating the user data. This solves the technical problem in the prior art that it has not yet been realized how to analyze, from multiple dimensions, the user data generated by a user in an interaction scene and how to determine, according to the analysis result, the visual information expressed by the user through that data.
Example 3
Fig. 9 shows an apparatus 900 for dynamically evaluating user data based on an immersive device according to the present embodiment; the apparatus 900 corresponds to the method according to the first aspect of embodiment 1. Referring to fig. 9, the apparatus 900 includes: a processor 910; and a memory 920 coupled to the processor 910 and configured to provide the processor 910 with instructions for processing the following steps: acquiring, based on the immersive device, the head motion data, the limb motion and gesture data, and the voice data generated by a user in an interaction scene, respectively; analyzing the head motion data and determining first user behavior information expressed by the head motion data in the interaction scene; analyzing the limb motion and gesture data and determining second user behavior information expressed by the limb motion and gesture data in the interaction scene; and analyzing the voice data and determining user language information expressed by the voice data in the interaction scene.
Optionally, the operation of analyzing the head motion data to determine the first user behavior information expressed by the head motion data in the interaction scene includes: analyzing the head motion data according to a preset class definition rule corresponding to the head motion data, and determining the head motion information represented by the head motion data; determining the time interval and the relative position used to calculate the user's head motion track; determining the user's head motion track from the head motion information, the time interval and the relative position; and analyzing the head motion track based on the scene information of the interaction scene, and determining, from the analysis result, the first user behavior information expressed by the head motion data in the interaction scene.
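As a rough illustration of the trajectory step, the sketch below builds a head motion track from orientation samples taken at a fixed time interval, expressed relative to the initial pose; the input format and field names are assumptions rather than the actual data produced by the device.

```python
import numpy as np

def head_trajectory(orientation_samples, sample_interval_s, reference_pose=(0.0, 0.0, 0.0)):
    """Build a head motion track from (yaw, pitch, roll) samples recorded at a
    fixed time interval, relative to the initial (reference) head pose."""
    samples = np.asarray(orientation_samples, dtype=float)   # shape (N, 3), degrees
    relative = samples - np.asarray(reference_pose)          # orientation relative to start
    times = np.arange(len(samples)) * sample_interval_s      # timestamp of each sample

    # Angular velocity between adjacent samples: scene-specific rules can map
    # slow, sustained values to "focused attention" and large, frequent changes
    # to "looking around", for example.
    angular_velocity = np.diff(relative, axis=0) / sample_interval_s
    return {
        "time_s": times,
        "relative_orientation_deg": relative,
        "angular_velocity_deg_s": angular_velocity,
    }
```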
Optionally, the operation of acquiring, based on the immersive device, the limb motion and gesture data generated by the user in the interaction scene includes: recording the initial spatial coordinates and the initial time of the user's hands according to the initial state of the immersive device; determining, according to the precision required by the preset interaction scene, the time interval at which the limb motion and gesture data generated by the user in the interaction scene are recorded; recording, at the determined time interval, the spatial coordinates of the limb motions and gestures generated by the user in the interaction scene within a preset time; and taking all the spatial coordinates recorded within the preset time as the limb motion and gesture data generated by the user in the interaction scene.
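The sampling loop below sketches one way to derive the recording interval from the scene's required precision and collect the coordinates; read_hand_coordinates is a hypothetical callable standing in for the device's tracking API, and the speed-based interval rule is an assumption.

```python
import time

def record_hand_coordinates(read_hand_coordinates, required_precision_m,
                            duration_s=10.0, max_hand_speed_m_s=2.0):
    """Sample hand coordinates for a preset duration, choosing the sampling
    interval from the precision the interaction scene requires."""
    # Finer precision demands a shorter interval: at a hand speed of 2 m/s,
    # 1 cm precision needs a sample roughly every 5 ms.
    interval_s = required_precision_m / max_hand_speed_m_s

    start = time.monotonic()
    samples = [(0.0, read_hand_coordinates())]   # initial state: t = 0, start pose
    while time.monotonic() - start < duration_s:
        time.sleep(interval_s)
        t = time.monotonic() - start
        samples.append((t, read_hand_coordinates()))
    return samples
```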
Optionally, the operation of analyzing the limb motion and gesture data to determine the second user behavior information expressed by the limb motion and gesture data in the interaction scene includes: connecting, in time order, all the spatial coordinates recorded within the preset time, from the initial state to the final state; calculating the acceleration and vector velocity corresponding to every two adjacent spatial coordinates among all the coordinates recorded within the preset time; dividing the spatial coordinates into track segments marked with different colors according to their acceleration and vector velocity values, to obtain the limb and gesture track of the user within the preset time; and analyzing the limb and gesture track according to the scene information of the interaction scene, and determining, from the analysis result, the second user behavior information expressed by the limb motion and gesture data in the interaction scene.
Optionally, the operation of acquiring, based on the immersive device, the voice data generated by the user in the interaction scene includes: acquiring, via the array microphone in the immersive device, the user sound data generated by the user in the interaction scene and the sound simulation data of the environmental noise in the interaction scene; and performing noise reduction on the user sound data, in hardware and in software respectively, based on the sound simulation data, to obtain the voice data generated by the user in the interaction scene.
Optionally, the operation of analyzing the voice data to determine the user language information expressed by the voice data in the interaction scene includes: analyzing the voice data to determine the average pronunciation score of the voice data in the pronunciation quality dimension; analyzing the voice data to determine the average pronunciation rate of the voice data in the language fluency dimension; analyzing the voice data to determine the language accuracy of the voice data in the term accuracy dimension; analyzing the voice data to determine the standard language rate of the voice data in the term standard dimension; and taking the average pronunciation score, the average pronunciation rate, the language accuracy and the standard language rate as the user language information expressed by the voice data in the interaction scene, where the user language information indicates the user's current abilities and indices.
Optionally, the memory 920 is further configured to provide the processor 910 with instructions for processing the following step: displaying the first user behavior information, the second user behavior information and the user language information to the user.
Thus, according to this embodiment, the computing device separately acquires the head motion data, the limb motion and gesture data, and the voice data generated by the user in the interaction scene, and, in combination with the user's current interaction scene, determines the user behavior information expressed by the behavior data and the language information expressed by the voice data. By performing multidimensional analysis of the user data (including the head motion data, limb motion and gesture data, and voice data) on the basis of the immersive device, the computing device determines the behavior information and language information expressed by that data in the interaction scene, thereby achieving the purpose of dynamically evaluating the user data. This addresses the technical problem in the prior art that there is as yet no way to analyze the user data generated in an interaction scene from multiple dimensions and to determine, from the analysis result, the intuitive information the user expresses through that data.
The foregoing embodiment numbers of the present invention are for description only and do not indicate the relative merits of the embodiments.
In the foregoing embodiments of the present invention, each embodiment is described with its own emphasis; for any part not described in detail in one embodiment, reference may be made to the related descriptions of the other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed technology may be implemented in other manners. The apparatus embodiments described above are merely exemplary; for example, the division into units is merely a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the couplings, direct couplings or communication connections shown or discussed may be implemented through certain interfaces, and the indirect couplings or communication connections between units or modules may be electrical or take other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated units are implemented in the form of software functional units and sold or used as stand-alone products, they may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device or the like) to perform all or some of the steps of the methods according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk or an optical disk.
The foregoing is merely a preferred embodiment of the present invention. It should be noted that those skilled in the art may make various improvements and modifications without departing from the principles of the present invention, and such improvements and modifications shall also fall within the scope of protection of the present invention.

Claims (5)

1. A method for dynamically evaluating user data based on an immersive device, comprising:
Based on the immersion equipment, head action data, limb action and gesture data and voice data generated by a user under an interaction scene are respectively acquired;
comprising the following steps:
recording the initial space coordinates and the initial time of the user's hands according to the initial state of the immersive device;
determining, according to the preset precision required by the interaction scene, a time interval for recording the limb actions and gesture data generated by the user in the interaction scene;
according to the determined time interval, recording the space coordinates of the limb actions and gestures generated by the user in the interaction scene in preset time; and
Taking all the space coordinates recorded in the preset time as limb actions and gesture data generated by the user in an interaction scene;
analyzing the head action data, and determining first user behavior information expressed by the head action data in the interactive scene;
comprising the following steps:
Analyzing the head motion data according to a preset class definition rule corresponding to the head motion data, and determining head motion information represented by the head motion data;
determining a time interval and a relative position associated with calculating a head motion trajectory of the user;
Determining a head motion track of the user according to the head motion information, the time interval and the relative position; and
Analyzing the head action track based on the scene information of the interaction scene, and determining first user behavior information expressed by the head action data under the interaction scene according to an analysis result;
analyzing the limb actions and gesture data, and determining second user behavior information expressed by the limb actions and gesture data in the interaction scene; and
Analyzing the voice data and determining user language information expressed by the voice data in the interaction scene;
comprising the following steps:
Connecting all the space coordinates recorded in the preset time from an initial state to a final state by taking time as a sequence;
calculating the corresponding acceleration and vector speed of every two adjacent space coordinates in all the space coordinates recorded and obtained in the preset time;
Carrying out track division on the space coordinates with the acceleration and the vector speed of different values according to different colors to obtain limb and gesture tracks corresponding to the user in the preset time; and
Analyzing the limb and gesture tracks according to the scene information of the interaction scene, and determining second user behavior information expressed by the limb action and gesture data under the interaction scene according to an analysis result;
based on the immersive device, the operation of acquiring voice data generated by a user in an interactive scene comprises the following steps:
Acquiring user sound data generated by the user in the interaction scene and sound simulation data of environmental noise in the interaction scene based on an array microphone in the immersion device; and
Based on the sound simulation data, noise reduction processing is carried out on the user sound data through hardware and software respectively, and then voice data generated by the user under the interaction scene is obtained;
The operation of determining the user language information expressed by the voice data in the interaction scene comprises the following steps:
analyzing the voice data, and determining an average pronunciation score corresponding to the voice data under the pronunciation quality dimension;
analyzing the voice data and determining the average pronunciation rate corresponding to the voice data under the language fluency dimension;
analyzing the voice data, and determining the language accuracy corresponding to the voice data under the term accuracy dimension;
analyzing the voice data and determining a standard language rate corresponding to the voice data under a term standard dimension; and
And taking the average pronunciation score, the average pronunciation rate, the language accuracy and the standard language rate as user language information expressed by the voice data in the interactive scene, wherein the user language information is used for indicating the current various capabilities and indexes of the user.
2. The method as recited in claim 1, further comprising: and displaying the first user behavior information, the second user behavior information and the user language information to the user.
3. A storage medium comprising a stored program, wherein the method of any one of claims 1 to 2 is performed by a processor when the program is run.
4. An apparatus for dynamically evaluating user data based on an immersive device, comprising:
The data acquisition module is used for respectively acquiring head motion data, limb motion and gesture data and voice data generated by a user under an interaction scene based on the immersion equipment;
comprising the following steps:
recording the initial space coordinates and the initial time of the user's hands according to the initial state of the immersive device;
determining, according to the preset precision required by the interaction scene, a time interval for recording the limb actions and gesture data generated by the user in the interaction scene;
according to the determined time interval, recording the space coordinates of the limb actions and gestures generated by the user in the interaction scene in preset time; and
Taking all the space coordinates recorded in the preset time as limb actions and gesture data generated by the user in an interaction scene;
The first determining module is used for analyzing the head action data and determining first user behavior information expressed by the head action data in the interaction scene;
comprising the following steps:
Analyzing the head motion data according to a preset class definition rule corresponding to the head motion data, and determining head motion information represented by the head motion data;
determining a time interval and a relative position associated with calculating a head motion trajectory of the user;
Determining a head motion track of the user according to the head motion information, the time interval and the relative position; and
Analyzing the head action track based on the scene information of the interaction scene, and determining first user behavior information expressed by the head action data under the interaction scene according to an analysis result;
the second determining module is used for analyzing the limb actions and gesture data and determining second user behavior information expressed by the limb actions and gesture data in the interaction scene; and
The third determining module is used for analyzing the voice data and determining user language information expressed by the voice data in the interaction scene;
comprising the following steps:
Connecting all the space coordinates recorded in the preset time from an initial state to a final state by taking time as a sequence;
calculating the corresponding acceleration and vector speed of every two adjacent space coordinates in all the space coordinates recorded and obtained in the preset time;
Carrying out track division on the space coordinates with the acceleration and the vector speed of different values according to different colors to obtain limb and gesture tracks corresponding to the user in the preset time; and
Analyzing the limb and gesture tracks according to the scene information of the interaction scene, and determining second user behavior information expressed by the limb action and gesture data under the interaction scene according to an analysis result;
based on the immersive device, the operation of acquiring voice data generated by a user in an interactive scene comprises the following steps:
Acquiring user sound data generated by the user in the interaction scene and sound simulation data of environmental noise in the interaction scene based on an array microphone in the immersion device; and
Based on the sound simulation data, noise reduction processing is carried out on the user sound data through hardware and software respectively, and then voice data generated by the user under the interaction scene is obtained;
The operation of determining the user language information expressed by the voice data in the interaction scene comprises the following steps:
analyzing the voice data, and determining an average pronunciation score corresponding to the voice data under the pronunciation quality dimension;
analyzing the voice data and determining the average pronunciation rate corresponding to the voice data under the language fluency dimension;
analyzing the voice data, and determining the language accuracy corresponding to the voice data under the term accuracy dimension;
analyzing the voice data and determining a standard language rate corresponding to the voice data under a term standard dimension; and
And taking the average pronunciation score, the average pronunciation rate, the language accuracy and the standard language rate as user language information expressed by the voice data in the interactive scene, wherein the user language information is used for indicating the current various capabilities and indexes of the user.
5. An apparatus for dynamically evaluating user data based on an immersive device, comprising:
A processor; and
A memory, coupled to the processor, for providing instructions to the processor to process the following processing steps:
Based on the immersion equipment, head action data, limb action and gesture data and voice data generated by a user under an interaction scene are respectively acquired;
comprising the following steps:
recording the initial space coordinates and the initial time of the user's hands according to the initial state of the immersive device;
determining, according to the preset precision required by the interaction scene, a time interval for recording the limb actions and gesture data generated by the user in the interaction scene;
according to the determined time interval, recording the space coordinates of the limb actions and gestures generated by the user in the interaction scene in preset time; and
Taking all the space coordinates recorded in the preset time as limb actions and gesture data generated by the user in an interaction scene;
analyzing the head action data, and determining first user behavior information expressed by the head action data in the interactive scene;
comprising the following steps:
Analyzing the head motion data according to a preset class definition rule corresponding to the head motion data, and determining head motion information represented by the head motion data;
determining a time interval and a relative position associated with calculating a head motion trajectory of the user;
Determining a head motion track of the user according to the head motion information, the time interval and the relative position; and
Analyzing the head action track based on the scene information of the interaction scene, and determining first user behavior information expressed by the head action data under the interaction scene according to an analysis result;
analyzing the limb actions and gesture data, and determining second user behavior information expressed by the limb actions and gesture data in the interaction scene; and
Analyzing the voice data and determining user language information expressed by the voice data in the interaction scene;
comprising the following steps:
Connecting all the space coordinates recorded in the preset time from an initial state to a final state by taking time as a sequence;
calculating the corresponding acceleration and vector speed of every two adjacent space coordinates in all the space coordinates recorded and obtained in the preset time;
Carrying out track division on the space coordinates with the acceleration and the vector speed of different values according to different colors to obtain limb and gesture tracks corresponding to the user in the preset time; and
Analyzing the limb and gesture tracks according to the scene information of the interaction scene, and determining second user behavior information expressed by the limb action and gesture data under the interaction scene according to an analysis result;
based on the immersive device, the operation of acquiring voice data generated by a user in an interactive scene comprises the following steps:
Acquiring user sound data generated by the user in the interaction scene and sound simulation data of environmental noise in the interaction scene based on an array microphone in the immersion device; and
Based on the sound simulation data, noise reduction processing is carried out on the user sound data through hardware and software respectively, and then voice data generated by the user under the interaction scene is obtained;
The operation of determining the user language information expressed by the voice data in the interaction scene comprises the following steps:
analyzing the voice data, and determining an average pronunciation score corresponding to the voice data under the pronunciation quality dimension;
analyzing the voice data and determining the average pronunciation rate corresponding to the voice data under the language fluency dimension;
analyzing the voice data, and determining the language accuracy corresponding to the voice data under the term accuracy dimension;
analyzing the voice data and determining a standard language rate corresponding to the voice data under a term standard dimension; and
And taking the average pronunciation score, the average pronunciation rate, the language accuracy and the standard language rate as user language information expressed by the voice data in the interactive scene, wherein the user language information is used for indicating the current various capabilities and indexes of the user.
CN202110389780.3A 2021-04-12 2021-04-12 Method and device for dynamically evaluating user data based on immersion type equipment Active CN113076004B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110389780.3A CN113076004B (en) 2021-04-12 2021-04-12 Method and device for dynamically evaluating user data based on immersion type equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110389780.3A CN113076004B (en) 2021-04-12 2021-04-12 Method and device for dynamically evaluating user data based on immersion type equipment

Publications (2)

Publication Number Publication Date
CN113076004A CN113076004A (en) 2021-07-06
CN113076004B true CN113076004B (en) 2024-05-03

Family

ID=76617274

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110389780.3A Active CN113076004B (en) 2021-04-12 2021-04-12 Method and device for dynamically evaluating user data based on immersion type equipment

Country Status (1)

Country Link
CN (1) CN113076004B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114721520A (en) * 2022-04-07 2022-07-08 江苏中科小达人智能科技有限公司 Interactive training system and method based on virtual reality

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108122561A (en) * 2017-12-19 2018-06-05 广东小天才科技有限公司 A kind of spoken voice assessment method and electronic equipment based on electronic equipment
CN109710066A (en) * 2018-12-19 2019-05-03 平安普惠企业管理有限公司 Exchange method, device, storage medium and electronic equipment based on gesture identification
CN110930780A (en) * 2019-11-25 2020-03-27 上海交通大学 Virtual autism teaching system, method and equipment based on virtual reality technology

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9129604B2 (en) * 2010-11-16 2015-09-08 Hewlett-Packard Development Company, L.P. System and method for using information from intuitive multimodal interactions for media tagging

Also Published As

Publication number Publication date
CN113076004A (en) 2021-07-06

Similar Documents

Publication Publication Date Title
Chen et al. ImmerTai: Immersive motion learning in VR environments
Chow et al. Music education using augmented reality with a head mounted display
Iqbal et al. Acceptance of dance training system based on augmented reality and technology acceptance model (TAM)
KR101936692B1 (en) Dance training apparatus and method using automatic generation of dance key motion
CN106325509A (en) Three-dimensional gesture recognition method and system
CN113946211A (en) Method for interacting multiple objects based on metauniverse and related equipment
WO2021196646A1 (en) Interactive object driving method and apparatus, device, and storage medium
TW202138993A (en) Method and apparatus for driving interactive object, device and storage medium
WO2021196644A1 (en) Method, apparatus and device for driving interactive object, and storage medium
He et al. Immersive and collaborative Taichi motion learning in various VR environments
CN109545003A (en) A kind of display methods, device, terminal device and storage medium
CN113076004B (en) Method and device for dynamically evaluating user data based on immersion type equipment
Tang et al. Learning to create 3D models via an augmented reality smartphone interface
CN108815845B (en) The information processing method and device of human-computer interaction, computer equipment and readable medium
CN114904268A (en) Virtual image adjusting method and device, electronic equipment and storage medium
CN114299777A (en) Virtual reality industrial simulation training system
CN107729983B (en) Method and device for realizing man-machine chess playing by using machine vision and electronic equipment
Handosa et al. Extending embodied interactions in mixed reality environments
Grammatikopoulou et al. An adaptive framework for the creation of bodymotion-based games
US20230360550A1 (en) Selecting lesson asset information based on a physicality assessment
CN112364478A (en) Virtual reality-based testing method and related device
Dong et al. Touch-move-release: studies of surface and motion gestures for mobile augmented reality
US20230326092A1 (en) Real-time visualization of head mounted display user reactions
Torres et al. Learning how to play a guitar with the HoloLens: A case study
Pangestu et al. Comparison Analysis of Usability Using Controllers and Hand Tracking in Virtual Reality Gamelan (Sharon) Based On User Experience

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant