CN115587347A - Virtual world content processing method and device - Google Patents

Virtual world content processing method and device

Info

Publication number
CN115587347A
Authority
CN
China
Prior art keywords
image
characteristic
sign data
virtual world
content
Prior art date
Legal status
Pending
Application number
CN202211193143.XA
Other languages
Chinese (zh)
Inventor
曹佳炯
丁菁汀
Current Assignee
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd
Priority to CN202211193143.XA
Publication of CN115587347A

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 - Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/30 - Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F 21/31 - User authentication
    • G06F 21/32 - User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
    • G06V - IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 - Arrangements for image or video recognition or understanding
    • G06V 10/70 - Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V 10/77 - Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; blind source separation
    • G06V 10/80 - Fusion, i.e. combining data from various sources at the sensor level, preprocessing level, feature extraction level or classification level
    • G06V 10/806 - Fusion of extracted features
    • G06V 10/809 - Fusion of classification results, e.g. where the classifiers operate on the same input data
    • G06V 10/811 - Fusion of classification results, the classifiers operating on different input data, e.g. multi-modal recognition

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Computer Security & Cryptography (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computing Systems (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of this specification provide a content processing method and apparatus for a virtual world. The content processing method comprises: acquiring vital sign data and feature images collected for a user over a preset period by an access device of the virtual world; inputting the vital sign data and feature images of at least one time slice within the preset period into a first model for physiological feature extraction to obtain physiological features, and inputting the vital sign data and feature images of the whole preset period into a second model for psychological feature extraction to obtain psychological features; concatenating the physiological features and the psychological features to obtain a concatenated feature; and determining a content adaptation level for the virtual world based on the concatenated feature, and outputting adapted content corresponding to that level to the access device.

Description

Virtual world content processing method and device
Technical Field
The present disclosure relates to the field of virtualization technologies, and in particular, to a method and an apparatus for processing content in a virtual world.
Background
The virtual world simulates the real world and can even present scenes that are difficult to realize in reality, so it is being applied to more and more scenarios. In a virtual world scenario, a user logs into a three-dimensional virtual world with a specific ID and acts through a virtual user role; typically, many different user roles are active in the virtual world at the same time, each engaged in different activities.
Disclosure of Invention
One or more embodiments of this specification provide a content processing method for a virtual world, comprising: acquiring vital sign data and feature images collected for a user over a preset period by an access device of the virtual world; inputting the vital sign data and feature images of at least one time slice within the preset period into a first model for physiological feature extraction to obtain physiological features, and inputting the vital sign data and feature images of the whole preset period into a second model for psychological feature extraction to obtain psychological features; concatenating the physiological features and the psychological features to obtain a concatenated feature; and determining a content adaptation level for the virtual world based on the concatenated feature, and outputting adapted content corresponding to that level to the access device.
One or more embodiments of this specification provide a content processing apparatus for a virtual world, comprising: a data acquisition module configured to acquire vital sign data and feature images collected for a user over a preset period by an access device of the virtual world; a feature extraction module configured to input the vital sign data and feature images of at least one time slice within the preset period into a first model for physiological feature extraction to obtain physiological features, and to input the vital sign data and feature images of the whole preset period into a second model for psychological feature extraction to obtain psychological features; a feature concatenation module configured to concatenate the physiological features and the psychological features to obtain a concatenated feature; and a content output module configured to determine a content adaptation level for the virtual world based on the concatenated feature and to output adapted content corresponding to that level to the access device.
One or more embodiments of this specification provide a content processing device for a virtual world, comprising a processor and a memory configured to store computer-executable instructions that, when executed, cause the processor to: acquire vital sign data and feature images collected for a user over a preset period by an access device of the virtual world; input the vital sign data and feature images of at least one time slice within the preset period into a first model for physiological feature extraction to obtain physiological features, and input the vital sign data and feature images of the whole preset period into a second model for psychological feature extraction to obtain psychological features; concatenate the physiological features and the psychological features to obtain a concatenated feature; and determine a content adaptation level for the virtual world based on the concatenated feature, and output adapted content corresponding to that level to the access device.
One or more embodiments of this specification provide a storage medium storing computer-executable instructions that, when executed by a processor, implement the following flow: acquire vital sign data and feature images collected for a user over a preset period by an access device of the virtual world; input the vital sign data and feature images of at least one time slice within the preset period into a first model for physiological feature extraction to obtain physiological features, and input the vital sign data and feature images of the whole preset period into a second model for psychological feature extraction to obtain psychological features; concatenate the physiological features and the psychological features to obtain a concatenated feature; and determine a content adaptation level for the virtual world based on the concatenated feature, and output adapted content corresponding to that level to the access device.
Drawings
To illustrate the technical solutions in one or more embodiments of this specification or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. The drawings described below are obviously only some of the embodiments described in this specification; those skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flowchart of a content processing method for a virtual world according to one or more embodiments of this specification;
FIG. 2 is a flowchart of a content processing method for a virtual world applied to a virtual health scenario according to one or more embodiments of this specification;
FIG. 3 is a schematic diagram of a content processing apparatus for a virtual world according to one or more embodiments of this specification;
FIG. 4 is a schematic structural diagram of a content processing device for a virtual world according to one or more embodiments of this specification.
Detailed Description
To help those skilled in the art better understand the technical solutions in one or more embodiments of this specification, these solutions are described below clearly and completely with reference to the drawings. The described embodiments are obviously only some, not all, of the embodiments of this specification. All other embodiments that a person skilled in the art can derive from one or more of the embodiments described herein without inventive effort shall fall within the scope of protection of this document.
An embodiment of the content processing method for a virtual world provided by this specification:

In the content processing method provided by this embodiment, feature extraction is performed on the vital sign data and feature images of a user collected over a preset period by an access device of the virtual world, yielding physiological features and psychological features; a content adaptation level for the user in the virtual world is determined from these features, and adapted content corresponding to that level is output to the access device. By analyzing the physiology and psychology of a user accessing the virtual world, content suited to the user's current state is selected for the user to experience. This avoids negative effects of the virtual world on the user and safeguards the user's physiological and psychological health during access while still meeting the user's need to access the virtual world.

Referring to FIG. 1, the content processing method for a virtual world provided by this embodiment specifically includes steps S102 to S108.
Step S102: acquire vital sign data and feature images collected for a user over a preset period by an access device of the virtual world.
The virtual world is a simulated virtual world realized through decentralized collaboration and possessing an open economic system. Specifically, a user in the real world accesses the virtual world through an access device and carries out activities there, for example playing in a game virtual world or holding an online meeting in a conference virtual world.

Furthermore, an identity mapping can be established between an avatar in the virtual world and the user in the real world, and activities in the virtual world are then carried out on the basis of this mapping. The access device may be a VR (Virtual Reality) device, an AR (Augmented Reality) device, or the like connected to the virtual world, for example a head-mounted VR device. The content processing method provided by this embodiment may be executed on a server, namely a server or service platform that provides services to access devices accessing the virtual world or that maintains the operation of the virtual world.
The vital sign data comprise vital sign signals, for example brain wave signals, heart rate, and blood pressure. The feature images comprise biometric, physiological, or physical-sign images of the user, for example eye images and mouth images. Specifically, which vital sign data and feature images are collected in this embodiment depends on the sensors mounted on the access device; the above descriptions are only exemplary. For example, if the access device is equipped with a brain wave sensor, brain wave signals are collected; if it is equipped with other sensors, other signals or images can be collected.

Vital sign data refer to the electrical signals collected by signal sensors while the user's body is active; for example, an electroencephalogram signal is the electrical signal collected by a brain wave sensor while the user's brain is active. Optionally, in a scenario where the user accesses the virtual world, the vital sign data are collected by signal sensors integrated into or configured on the access device, for example an electroencephalogram sensor that collects the user's electroencephalogram signals.
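By way of illustration, one collected sample might be organized as in the following minimal sketch; the field names, types, and shapes are assumptions for illustration only, since the specification ties the collected data to whatever sensors the access device carries.

```python
# Illustrative container for one sample collected by the access device.
# All field names and shapes are assumptions for this sketch.
from dataclasses import dataclass
import numpy as np

@dataclass
class CollectedSample:
    timestamp: float                 # seconds since the preset period began
    eeg: np.ndarray                  # brain-wave trace, shape (channels, samples)
    heart_rate: float                # beats per minute
    blood_pressure: tuple[int, int]  # (systolic, diastolic) in mmHg
    eye_image: np.ndarray            # eye-region crop, shape (H, W, 3), uint8
```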
In specific implementations, to ensure healthy access to the virtual world and to avoid adverse physiological and psychological effects on users who become absorbed in its content, step S102 in this embodiment may be replaced by: acquiring the vital sign data and feature images, collected by the access device over a preset period, of a user accessing the virtual world through the access device; this forms a new implementation together with the other processing steps provided by this embodiment. Alternatively, step S102 may be replaced by: acquiring vital sign data and feature images collected over a preset period by the access device of the virtual world. Alternatively, step S102 may be replaced by: acquiring vital sign information collected by the access device of the virtual world, where the vital sign information optionally includes vital sign signals and/or vital sign images. Each variant likewise forms a new implementation together with the other processing steps provided by this embodiment.
Specifically, while the user accesses the virtual world through the access device, the device collects the user's vital sign data and feature images with its onboard sensors. To monitor the user's health promptly and efficiently without degrading the experience by pushing content too frequently, a preset period may be configured in this embodiment, for example three hours: the content adaptation level is determined and the adapted content is updated for the user once per preset period.
Step S104: input the vital sign data and feature images of at least one time slice within the preset period into a first model for physiological feature extraction to obtain physiological features, and input the vital sign data and feature images of the whole preset period into a second model for psychological feature extraction to obtain psychological features.

The at least one time slice within the preset period is determined by dividing the preset period into multiple time slices and selecting at least one of them for physiological feature extraction. For example, a three-hour preset period is divided into 5-minute slices and the final 5 minutes are selected; that is, the vital sign data and feature images of the last 5 minutes are input into the first model for physiological feature extraction to obtain the physiological features. Alternatively, after dividing the preset period, the slices that satisfy a condition may be selected; optionally, the condition is that the difference in the vital sign data and feature images exceeds a difference threshold, or that the time gap between the slice and the end of the preset period is smaller than a preset distance.
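The two selection strategies can be sketched as follows, assuming each slice is summarized by its end time and a precomputed change score (both names are placeholders for whatever slice summary an implementation keeps):

```python
# Select time slices for physiological feature extraction: either the data
# changed sharply within the slice, or the slice lies near the period's end.
def select_slices(slices, period_end, diff_threshold=0.5, max_gap=300.0):
    """`slices` is a list of dicts with keys 'end_time' (seconds) and
    'change_score' (how much the signs/images varied inside the slice)."""
    return [s for s in slices
            if s["change_score"] > diff_threshold
            or period_end - s["end_time"] < max_gap]

# Example: keep the last 5 minutes of a 3-hour period (10800 s).
slices = [{"end_time": t, "change_score": 0.1} for t in range(300, 11100, 300)]
print(select_slices(slices, period_end=10800.0))  # -> [{'end_time': 10800, ...}]
```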
In practice, a user accessing the virtual world undergoes physiological changes driven by its content; for example, prolonged immersion in the virtual world can cause pupil dilation or an accelerated heart rate.

In specific implementations, to prevent a user from becoming so absorbed in virtual world content that physiological or psychological problems arise, this embodiment determines, after acquiring the vital sign data and feature images collected for the user over the preset period, physiological features from the data of at least one time slice within the period and psychological features from the data of the whole period.

Specifically, the vital sign data and feature images of at least one time slice are input into the first model for physiological feature extraction to obtain the physiological features, and the vital sign data and feature images of the whole preset period are input into the second model for psychological feature extraction to obtain the psychological features.
(1) Inputting the vital sign data and feature images of at least one time slice within the preset period into the first model for physiological feature extraction to obtain physiological features

Optionally, the first model comprises a vital sign data encoder, an image encoder, and a multi-modal hybrid network. The vital sign data encoder encodes the vital sign data of the at least one time slice and outputs vital sign features; the image encoder encodes the feature images of the at least one time slice and outputs image features; the multi-modal hybrid network performs multi-modal processing on the vital sign features and the image features and outputs the physiological features. Optionally, while the first model is in use, the multi-modal hybrid network reduces to a multi-modal encoder; that is, the composition "vital sign data encoder, image encoder, and multi-modal hybrid network" may be replaced by "vital sign data encoder, image encoder, and multi-modal encoder", forming a new implementation together with the other content provided by this embodiment.

To make the extracted physiological features more accurate and effective, the vital sign data and feature images of at least one time slice within the preset period are input into the first model, which comprises a physiological feature extraction model.
In an optional implementation provided by this embodiment, the first model performs physiological feature extraction as follows:

inputting the vital sign data of the at least one time slice into the vital sign data encoder for encoding to obtain vital sign features, and inputting the feature images of the at least one time slice into the image encoder for encoding to obtain image features;

inputting the vital sign features and the image features into the multi-modal encoder for multi-modal processing to obtain the physiological features.
Further, in an optional implementation provided by this embodiment, the multi-modal processing is realized as follows:

determining a concatenation dimension from the feature dimension of the vital sign features and the feature dimension of the image features;

concatenating the vital sign features and the image features along the concatenation dimension to obtain a multi-modal concatenated feature;

determining a reduced dimension from the feature dimension of the multi-modal concatenated feature and a preset reduction ratio;

reducing the multi-modal concatenated feature to the reduced dimension to obtain the physiological features.

Optionally, the number of dimensions of the concatenation dimension equals the sum of the numbers of dimensions of the vital sign features and of the image features; the number of dimensions of the reduced dimension is the number of dimensions of the vital sign features, the number of dimensions of the image features, or the average of the two.
Because physiological features cannot be determined directly from raw vital sign data and feature images, the data are first encoded into recognizable vector-form vital sign features and image features, which are then fused by multi-modal processing into the physiological features.

Specifically, the vital sign data and feature images of at least one time slice within the preset period are input into the first model, which comprises a first vital sign data encoder, a first image encoder, and a first multi-modal encoder. During physiological feature extraction, the vital sign data of the at least one time slice are input into the first vital sign data encoder to obtain first vital sign features, the feature images are input into the first image encoder to obtain first image features, and the first vital sign features and first image features are input into the first multi-modal encoder for multi-modal encoding to obtain the physiological features. Optionally, during multi-modal encoding the first multi-modal encoder concatenates the first vital sign features and the first image features in series to obtain a concatenated feature, and then applies dimension-reducing encoding to the concatenated feature to obtain the physiological features.

For example, the vital sign data encoder outputs 128-dimensional vital sign features and the image encoder outputs 128-dimensional image features; both are input into the multi-modal encoder, which first concatenates them in series into a 256-dimensional multi-modal concatenated feature and then encodes that into a single 128-dimensional feature serving as the physiological features.
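A minimal PyTorch sketch of this forward pass, matching the 128/256/128 dimensions of the example, is given below; the encoder bodies are placeholder assumptions, as the specification does not fix their internal architectures.

```python
import torch
import torch.nn as nn

class FirstModel(nn.Module):
    """Sketch: vital sign encoder + image encoder + multi-modal encoder."""
    def __init__(self, sign_dim=32, feat_dim=128):
        super().__init__()
        self.sign_encoder = nn.Sequential(            # vital sign data encoder
            nn.Linear(sign_dim, 256), nn.ReLU(), nn.Linear(256, feat_dim))
        self.image_encoder = nn.Sequential(           # image encoder
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, feat_dim))
        # multi-modal encoder: serial concatenation, then dimension reduction
        self.multimodal = nn.Linear(2 * feat_dim, feat_dim)

    def forward(self, signs, images):
        f_sign = self.sign_encoder(signs)             # (B, 128)
        f_img = self.image_encoder(images)            # (B, 128)
        fused = torch.cat([f_sign, f_img], dim=-1)    # (B, 256) concatenation
        return self.multimodal(fused)                 # (B, 128) physiological

model = FirstModel()
phys = model(torch.randn(4, 32), torch.randn(4, 3, 64, 64))
print(phys.shape)  # torch.Size([4, 128])
```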
In practice, the first model may be trained in advance, for example on a cloud server. In an optional implementation provided by this embodiment, the first model is trained as follows:

inputting vital sign data samples into the vital sign data encoder to be trained for encoding to obtain vital sign features, and inputting feature image samples into the image encoder to be trained for encoding to obtain image features;

inputting the vital sign features and the image features into the multi-modal hybrid network to be trained for multi-modal processing to obtain sample physiological features and a physiological feature index;

computing a training loss from the vital sign data samples, the feature image samples, the sample physiological features, and the physiological feature index;

adjusting the parameters of the vital sign data encoder, the image encoder, and the multi-modal hybrid network based on the training loss.

Specifically, training operates on a to-be-trained model consisting of the vital sign data encoder, the image encoder, the multi-modal hybrid network, and a decoder; optionally, the multi-modal hybrid network comprises a multi-modal encoder and a classifier. After this model is trained, the classifier and the decoder are removed, leaving the first model consisting of the vital sign data encoder, the image encoder, and the multi-modal encoder.
Further, to improve the effectiveness of the trained first model's physiological feature extraction, in an optional implementation provided by this embodiment the training loss is computed in two parts, a prediction loss and a feature loss; that is, the loss computation based on the vital sign data samples, the feature image samples, the sample physiological features, and the physiological feature index proceeds as follows:

computing the prediction loss from the vital sign data samples, the feature image samples, and the sample physiological features, and computing the feature loss from the vital sign data samples, the feature image samples, and the physiological feature index.

Optionally, the prediction loss is computed as follows:

determining an intermediate time from the collection times of the vital sign data samples and feature image samples, and extracting the target vital sign data samples and target feature image samples that precede the intermediate time;

inputting the target vital sign data samples, the target feature image samples, and the sample physiological features into the decoder for data prediction to obtain predicted vital sign data and predicted feature images;

computing the prediction loss from the vital sign data samples after the intermediate time, the feature image samples after the intermediate time, the predicted vital sign data, and the predicted feature images.
Specifically, during training of the first model: the vital sign data samples are encoded by the vital sign data encoder into vital sign features and the feature image samples by the image encoder into image features; both are input into the multi-modal encoder for multi-modal encoding to obtain the sample physiological features; the sample physiological features are input into the classifier for health classification, yielding a health probability (the physiological feature index); the first-half-time vital sign data samples and feature image samples are input into the decoder for data prediction, yielding predicted vital sign data and predicted feature images; the Euclidean distance between the second-half-time samples and the predictions is computed as the prediction loss; the Euclidean distance between the samples' annotation data and the health probability is computed as the feature loss; the to-be-trained model's parameters are adjusted on both losses until the model converges; and the first model is then built from the converged vital sign data encoder, image encoder, and multi-modal encoder for physiological feature extraction.

For example, when computing the prediction loss, one minute of vital sign data samples and feature image samples is taken; the first 30 seconds are input into the decoder for data prediction, yielding the decoder's predicted vital sign data and predicted feature images, and the Euclidean distance between the last 30 seconds of samples and the predictions is computed as the prediction loss.
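The two-part loss can be sketched as follows, with `encode`, `decode`, and `classify` standing in for the to-be-trained modules whose internals the specification leaves open, and with mean squared error used as a smooth stand-in for the Euclidean distance:

```python
import torch
import torch.nn.functional as F

def two_part_loss(encode, decode, classify, signs, images, health_label):
    """signs: (B, T, D) vital sign samples; images: (B, T, C, H, W);
    health_label: (B, 1) annotated health probability in [0, 1]."""
    t = signs.shape[1] // 2                       # intermediate time
    feature = encode(signs, images)               # sample physiological feature
    # prediction loss: decode first half + feature into the second half
    pred_signs, pred_images = decode(signs[:, :t], images[:, :t], feature)
    prediction_loss = (F.mse_loss(pred_signs, signs[:, t:])
                       + F.mse_loss(pred_images, images[:, t:]))
    # feature loss: classifier's health probability vs. the annotation
    feature_loss = F.mse_loss(classify(feature), health_label)
    return prediction_loss + feature_loss
```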
To distinguish the first model from the second model described below, the vital sign data encoder, image encoder, multi-modal hybrid network, multi-modal encoder, prediction loss, feature loss, classifier, and/or decoder associated with the first model may be referred to with the qualifier "first", i.e. as the first vital sign data encoder, first image encoder, and so on.

(2) Inputting the vital sign data and feature images of the preset period into the second model for psychological feature extraction to obtain psychological features

It should be noted that the psychological feature extraction process and the training of the second model are analogous to the physiological feature extraction process and the training of the first model; refer to the corresponding content above.
Optionally, the second model comprises a vital sign data encoder, an image encoder, and a multi-modal encoder. The vital sign data encoder encodes the vital sign data of the preset period and outputs vital sign features; the image encoder encodes the feature images of the preset period and outputs image features; the multi-modal encoder performs multi-modal encoding on the vital sign features and image features and outputs the psychological features.

Because a user's psychological state develops over a longer time span, the vital sign data and feature images of the whole preset period are input into the second model, which comprises a psychological feature extraction model. In an optional implementation provided by this embodiment, the second model performs psychological feature extraction as follows:

inputting the vital sign data of the preset period into the vital sign data encoder for encoding to obtain vital sign features, and inputting the feature images of the preset period into the image encoder for encoding to obtain image features;

inputting the vital sign features and the image features into the multi-modal encoder for multi-modal processing to obtain the psychological features.
Further, in an optional implementation provided by this embodiment, the multi-modal processing is realized as follows:

determining a concatenation dimension from the feature dimension of the vital sign features and the feature dimension of the image features;

concatenating the vital sign features and the image features along the concatenation dimension to obtain a multi-modal concatenated feature;

determining a reduced dimension from the feature dimension of the multi-modal concatenated feature and a preset reduction ratio;

reducing the multi-modal concatenated feature to the reduced dimension to obtain the psychological features.

Optionally, the number of dimensions of the concatenation dimension equals the sum of the numbers of dimensions of the vital sign features and of the image features; the number of dimensions of the reduced dimension is the number of dimensions of the vital sign features, the number of dimensions of the image features, or the average of the two.
Since psychological features cannot be determined directly from raw vital sign data and feature images, the data are first encoded into recognizable vector-form vital sign features and image features, which are then fused by multi-modal processing into the psychological features.

Specifically, the vital sign data and feature images of the preset period are input into the second model for psychological feature extraction, the second model comprising a second vital sign data encoder, a second image encoder, and a second multi-modal encoder. During extraction, the vital sign data of the preset period are input into the second vital sign data encoder to obtain second vital sign features, the feature images of the preset period are input into the second image encoder to obtain second image features, and both are input into the second multi-modal encoder for multi-modal encoding to obtain the psychological features. Optionally, during multi-modal encoding the second multi-modal encoder concatenates the second vital sign features and the second image features in series to obtain a concatenated feature, and then applies dimension-reducing encoding to obtain the psychological features.
In practice, the second model may likewise be trained in advance, for example on a cloud server. In an optional implementation provided by this embodiment, the second model is trained as follows:

inputting vital sign data samples into the vital sign data encoder to be trained for encoding to obtain vital sign features, and inputting feature image samples into the image encoder to be trained for encoding to obtain image features;

inputting the vital sign features and the image features into the multi-modal hybrid network to be trained for multi-modal processing to obtain sample psychological features and a psychological feature index;

computing a training loss from the vital sign data samples, the feature image samples, the sample psychological features, and the psychological feature index;

adjusting the parameters of the vital sign data encoder, the image encoder, and the multi-modal hybrid network based on the training loss.

Specifically, training operates on a to-be-trained model consisting of a second vital sign data encoder, a second image encoder, a second multi-modal hybrid network, and a second decoder; optionally, the second multi-modal hybrid network comprises a second multi-modal encoder and a second classifier. After this model is trained, the second classifier and second decoder are removed, leaving the second model consisting of the second vital sign data encoder, the second image encoder, and the second multi-modal encoder.
Further, to improve the effectiveness of the trained second model's psychological feature extraction, in an optional implementation provided by this embodiment the training loss is again computed in two parts, a prediction loss and a feature loss; that is, the loss computation based on the vital sign data samples, the feature image samples, the sample psychological features, and the psychological feature index proceeds as follows:

computing the prediction loss from the vital sign data samples, the feature image samples, and the sample psychological features, and computing the feature loss from the vital sign data samples, the feature image samples, and the psychological feature index.

Optionally, the prediction loss is computed as follows:

determining an intermediate time from the collection times of the vital sign data samples and feature image samples, and extracting the target vital sign data samples and target feature image samples that precede the intermediate time;

inputting the target vital sign data samples, the target feature image samples, and the sample psychological features into the decoder for data prediction to obtain predicted vital sign data and predicted feature images;

computing the prediction loss from the vital sign data samples after the intermediate time, the feature image samples after the intermediate time, the predicted vital sign data, and the predicted feature images.
Specifically, during training of the second model: the vital sign data samples are encoded into vital sign features and the feature image samples into image features; both are input into the multi-modal encoder for multi-modal encoding to obtain the sample psychological features; the sample psychological features are input into the classifier for health classification, yielding a health probability (the psychological feature index); the first-half-time samples are input into the decoder for data prediction, yielding predicted vital sign data and predicted feature images; the Euclidean distance between the second-half-time samples and the predictions is computed as the prediction loss; the Euclidean distance between the samples' annotation data and the health probability is computed as the feature loss; the model's parameters are adjusted on both losses until convergence; and the second model is then built from the converged vital sign data encoder, image encoder, and multi-modal encoder for psychological feature extraction.

For example, when computing the prediction loss, one hour of vital sign data samples and feature image samples is taken; the first 30 minutes are input into the decoder for data prediction, yielding the decoder's predicted vital sign data and predicted feature images, and the Euclidean distance between the last 30 minutes of samples and the predictions is computed as the prediction loss.

It should be noted that, to distinguish the description of the second model from that of the first model, the qualifier "second" may be prefixed to the corresponding features.

The above specifically explains the process of inputting the vital sign data and feature images of at least one time slice into the first model to obtain physiological features, the process of inputting the vital sign data and feature images of the preset period into the second model to obtain psychological features, and the associated model training.

It should be added that physiological state can change within a short time, whereas psychological state must be judged from data accumulated over a longer time; therefore, when training the first and second models, the time span covered by the first model's training samples is shorter than that of the second model's. For example, the first model is trained on one-minute samples and the second model on one-hour samples.
To further improve the feature extraction effectiveness of the trained first and second models, the initial vital sign data and initial feature images collected by the access device are preprocessed before model training, and training then proceeds on the resulting vital sign data samples and feature image samples. In an optional implementation provided by this embodiment, the samples are obtained as follows:

acquiring initial vital sign data and initial feature images of users collected by at least one access device;

preprocessing the initial vital sign data and initial feature images to obtain target vital sign data and target feature images;

annotating the target vital sign data and target feature images to obtain the vital sign data samples and feature image samples.
Optionally, the preprocessing comprises: filtering the initial vital sign data to obtain the target vital sign data; evaluating the initial feature images to obtain an image index for each initial feature image; and keeping as target feature images only those initial feature images whose image index exceeds an index threshold.

Specifically, the initial vital sign data collected by the access device's onboard sensors are filtered to remove noise and fluctuations, yielding the target vital sign data; for example, median filtering is applied to the collected initial vital sign signals.

The initial feature images collected by the access device's onboard sensors are input into an image quality evaluation model for quality evaluation, yielding an image index (image quality index) for each; the initial feature images whose index exceeds the index threshold are kept as target feature images.

The target vital sign data and target feature images are then annotated based on pre-stored health data or health data obtained from a third party, yielding the vital sign data samples and feature image samples. Optionally, the annotated health data are identical for the target vital sign data and target feature image of the same user at the same time.
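The preprocessing can be sketched as follows; `quality_score` stands in for the image quality evaluation model, which the specification does not detail, and the kernel size and threshold are illustrative:

```python
import numpy as np
from scipy.signal import medfilt

def preprocess(raw_signs, raw_images, quality_score, index_threshold=0.5):
    """raw_signs: 1-D vital sign trace; raw_images: list of image arrays;
    quality_score: callable image -> quality index in [0, 1]."""
    target_signs = medfilt(raw_signs, kernel_size=5)  # remove noise/spikes
    target_images = [img for img in raw_images
                     if quality_score(img) > index_threshold]
    return target_signs, target_images

# Example with a dummy quality model that scores by mean brightness.
signs = np.sin(np.linspace(0, 10, 200)) + np.random.normal(0, 0.2, 200)
images = [np.random.rand(64, 64, 3) for _ in range(8)]
clean_signs, kept_images = preprocess(signs, images, lambda im: im.mean())
```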
In addition, step S104 may be replaced by: performing feature extraction based on the vital sign data and feature images to obtain physiological features and psychological features; this forms a new implementation together with the other processing steps provided by this embodiment. Alternatively, it may be replaced by: performing feature extraction based on the vital sign data and feature images to obtain user access features; correspondingly, steps S106 to S108 below may then be replaced by: determining a content adaptation level for the virtual world based on the user access features and outputting adapted content corresponding to that level to the access device, forming a new implementation together with step S102. Optionally, the user access features include physiological features and/or psychological features.
Step S106: concatenate the physiological features and the psychological features to obtain a concatenated feature.

The concatenated feature is a fused feature, obtained by fusing the physiological and psychological features, that still identifies both.

In specific execution, after the physiological and psychological features are obtained, they are concatenated into a single feature to make determining the content adaptation level more convenient.

In specific implementations, the concatenation is performed by inputting the physiological and psychological features into a feature concatenation model. Optionally, during concatenation the model determines a concatenation dimension from the feature dimensions of the physiological and psychological features; concatenates the two along that dimension to obtain an intermediate feature; determines a reduced dimension from the intermediate feature's dimension and a preset reduction ratio; and reduces the intermediate feature to the reduced dimension, outputting the concatenated feature.

Optionally, the number of dimensions of the concatenation dimension equals the sum of the numbers of dimensions of the physiological and psychological features; the number of dimensions of the reduced dimension is the number of dimensions of the physiological features, the number of dimensions of the psychological features, or the average of the two.
In practice, the feature concatenation model may be trained in advance, for example on a cloud server. In an optional implementation provided by this embodiment, it is trained as follows:

inputting physiological feature samples and psychological feature samples into the feature concatenation model to be trained to obtain training concatenated features;

reconstructing physiological and psychological features from the training concatenated features to obtain reconstructed physiological features and reconstructed psychological features;

computing a training loss from the physiological feature samples, psychological feature samples, reconstructed physiological features, and reconstructed psychological features, and adjusting the parameters of the feature concatenation model based on the training loss.

The feature concatenation model may adopt a multi-layer perceptron structure, for example a 5-layer MLP (Multi-Layer Perceptron), whose input is a physiological feature sample and a psychological feature sample and whose output is the concatenated feature. The loss function is a reconstruction loss, for example the Euclidean distance between the reconstructed physiological and psychological features and the corresponding samples, either taken jointly or as the sum of the physiological-side and psychological-side distances.

Training proceeds in this manner until the model converges, yielding the feature concatenation model. Thereafter, in practice, once the physiological and psychological features are obtained they are input into the feature concatenation model to obtain the concatenated feature.
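A minimal sketch of such a concatenation model and its reconstruction loss is given below; the hidden width, the two reconstruction heads, and the use of `torch.dist` for the Euclidean distance are assumptions within the 5-layer MLP structure named above.

```python
import torch
import torch.nn as nn

class ConcatMLP(nn.Module):
    """5-layer MLP that fuses physiological + psychological features."""
    def __init__(self, phys_dim=128, psych_dim=128, out_dim=128, hidden=256):
        super().__init__()
        dims = [phys_dim + psych_dim, hidden, hidden, hidden, hidden, out_dim]
        layers = []
        for i in range(5):                         # 5 linear layers
            layers += [nn.Linear(dims[i], dims[i + 1]), nn.ReLU()]
        self.mlp = nn.Sequential(*layers[:-1])     # no ReLU after the last
        # reconstruction heads used only during training
        self.rebuild_phys = nn.Linear(out_dim, phys_dim)
        self.rebuild_psych = nn.Linear(out_dim, psych_dim)

    def forward(self, phys, psych):
        return self.mlp(torch.cat([phys, psych], dim=-1))

def reconstruction_loss(model, phys, psych):
    fused = model(phys, psych)
    return (torch.dist(model.rebuild_phys(fused), phys)     # Euclidean
            + torch.dist(model.rebuild_psych(fused), psych))
```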
Step S108: determine a content adaptation level for the virtual world based on the concatenated feature, and output adapted content corresponding to that level to the access device.

In practice, to prevent long-term immersion in the virtual world from harming the user's psychological or physiological health, this embodiment uses the concatenated feature, once obtained, to determine what content, and what level of content, the user's current health condition permits. To manage the virtual world's content and access to it effectively, both are graded by level; thus, to determine the adapted content output to the access device more accurately, the user's health level is determined from the concatenated feature, and this health level serves as the content adaptation level.

The adapted content comprises services or items in the virtual world that are pre-configured to correspond to the content adaptation level. Take a roller-coaster item in the virtual world as an example: if the user's content adaptation level is level 1 (the lowest health level), the user's health is poor, and to avoid physical discomfort from an overly intense ride, the roller coaster's experience intensity is set to level 1 (the weakest) for such users. Optionally, the adapted content includes the item data of service items, matched to the content adaptation level, in which the user can participate while accessing the virtual world through the access device; or the item data of the determined level-adapted service items in which the user participates in the virtual world.
In specific implementation, in order to improve the accuracy of the determined content adaptation level, in an optional implementation manner provided by this embodiment, the content adaptation level of the virtual world is determined in the following manner:
inputting the splicing characteristics into a preset index algorithm to perform characteristic index calculation to obtain characteristic indexes;
determining a content adaptation level matching the feature index.
Optionally, the preset index algorithm is obtained by the following method:
inputting the splicing characteristic sample into a distribution model to be trained for index calculation to obtain a sample index;
calculating training loss based on the sample indexes and the sample labels of the splicing characteristic samples, and performing parameter adjustment on the distribution model based on the training loss to obtain distribution parameters;
and constructing a distribution expression based on the distribution parameters to serve as the preset index algorithm.
Specifically, in the process of determining the content adaptation level of the virtual world based on the splicing characteristics, the characteristic index is first calculated from the splicing characteristics, and then the content adaptation level matching the characteristic index is determined. The characteristic index comprises the health probability of the user. Further, the characteristic index is calculated by inputting the splicing characteristics into a preset index algorithm. Optionally, the preset index algorithm includes a distribution expression obtained by training on splicing characteristic samples.
For example, using a Gaussian mixture model, the Gaussian distribution corresponding to healthy users is fitted based on the splicing characteristic samples; after a splicing characteristic is obtained, it is input into the fitted Gaussian distribution to calculate the user's health probability. The label of each splicing characteristic sample is either healthy or unhealthy.
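A sketch of how such a health distribution might be fitted and queried, assuming scikit-learn; the number of mixture components and the sigmoid mapping from log-density to a probability in (0, 1) are illustrative assumptions, not specified by this embodiment:

import numpy as np
from sklearn.mixture import GaussianMixture

def fit_health_distribution(healthy_splice_samples: np.ndarray) -> GaussianMixture:
    # healthy_splice_samples: (n_samples, splice_dim) splicing characteristics
    # of users labeled healthy.
    gmm = GaussianMixture(n_components=3, covariance_type="full", random_state=0)
    gmm.fit(healthy_splice_samples)
    return gmm

def health_probability(gmm: GaussianMixture, splice_feature: np.ndarray) -> float:
    # score_samples returns the log-density under the fitted distribution;
    # squash it to (0, 1) so it can serve as the characteristic index p.
    log_density = gmm.score_samples(splice_feature.reshape(1, -1))[0]
    return float(1.0 / (1.0 + np.exp(-log_density)))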
After the characteristic index is calculated, the content adaptation level matching the characteristic index is determined: the index interval to which the characteristic index belongs is identified, and the content adaptation level corresponding to that index interval is taken as the user's content adaptation level in the virtual world. Optionally, the index intervals are preconfigured.
For example, five levels are preconfigured via four thresholds T1, T2, T3, and T4. Representing the characteristic index by p: if p < T1, the level is the first level; if T1 ≤ p < T2, the second level; if T2 ≤ p < T3, the third level; if T3 ≤ p < T4, the fourth level; and if T4 ≤ p, the fifth level. The higher the level, the healthier the user, the more content the user can participate in, and the higher the permitted intensity of that content.
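The interval lookup in this example is a direct comparison against the four thresholds; the threshold values below are placeholders:

def content_adaptation_level(p: float, thresholds=(0.2, 0.4, 0.6, 0.8)) -> int:
    # thresholds = (T1, T2, T3, T4); the numeric values are illustrative.
    t1, t2, t3, t4 = thresholds
    if p < t1:
        return 1  # lowest health level: weakest experience intensity
    if p < t2:
        return 2
    if p < t3:
        return 3
    if p < t4:
        return 4
    return 5      # highest health level: most content, highest intensity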
After the content adaptation level is determined, the adapted content corresponding to the content adaptation level is output to the access device. In this embodiment, in the process of outputting the adapted content, the content list corresponding to the content adaptation level may be output to the access device, or the level of the content that the user accesses in the virtual world through the access device may be adjusted. In an optional implementation manner provided by this embodiment, outputting the adapted content corresponding to the content adaptation level to the access device includes:
reading the adaptive content corresponding to the content adaptation level in the virtual world;
and constructing a content list based on the adaptive content and the content adaptation level and outputting the content list to the access equipment, or updating the output content of the access equipment based on the adaptive content.
Specifically, after the content adaptation level of the virtual world is determined, a content list is constructed based on the content adaptation level, the adapted content corresponding to it, and the content intensity levels, and is output to the access device for the user to select content from; or, the content output by the access device is level-adjusted based on the content adaptation level.
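A minimal sketch of constructing such a content list; the catalog structure and field names are hypothetical stand-ins for the virtual world's preconfigured services and items:

from typing import Dict, List

# Hypothetical catalog: each entry records the minimum adaptation level an
# item requires and the experience intensity available at each level.
CATALOG = [
    {"item": "roller_coaster", "min_level": 1, "intensity_by_level": {1: 1, 3: 3, 5: 5}},
    {"item": "zero_gravity_room", "min_level": 3, "intensity_by_level": {3: 2, 5: 5}},
]

def build_content_list(level: int) -> List[Dict]:
    content_list = []
    for entry in CATALOG:
        if level >= entry["min_level"]:
            # Permit the strongest intensity the user's level allows.
            allowed = [v for k, v in entry["intensity_by_level"].items() if k <= level]
            content_list.append({"item": entry["item"], "intensity": max(allowed)})
    return content_list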
To sum up, in the content processing method of the virtual world provided in this embodiment, after the sign data and the characteristic images of a preset period, acquired by the access device of the virtual world, are obtained, the sign data and the characteristic images of the ending time segment of the preset period (for example, if the preset period is one hour, the ending time segment is the last ten minutes or the last minute of that hour) are input into the physiological feature extraction model for physiological feature extraction to obtain physiological features, and the sign data and the characteristic images of the whole preset period are input into the psychological feature extraction model for psychological feature extraction to obtain psychological features; the physiological features and the psychological features are input into the feature splicing model for feature splicing to obtain splicing features; the splicing features are substituted into a preset distribution expression to calculate the characteristic index, and the content adaptation level matching the characteristic index is determined; the adapted content corresponding to the content adaptation level is read; and a content list is constructed based on the content adaptation level and the adapted content and output to the access device. In this way, physical and psychological detection is performed on the user who accesses the virtual world from the access device, and the adapted content output to the user through the access device is adjusted according to the detection result, ensuring the user's physical and psychological health during access to the virtual world and avoiding the harm caused by long-term addiction to high-intensity content in the virtual world; the user's experience content need not be managed rigidly, and real-time content adaptation improves the user's perception of the virtual world.
The following takes an application of the content processing method of the virtual world provided in this embodiment in a virtual health scene as an example, and further describes the content processing method of the virtual world provided in this embodiment, referring to fig. 2, the content processing method of the virtual world applied in the virtual health scene specifically includes the following steps.
Step S202, acquiring sign signals and sign images of a preset period, which are acquired by the access device of the virtual world for a user.
Step S204, inputting the sign signals and the sign images of the target time segment in the preset period into a physiological characteristic extraction model for physiological characteristic extraction, and obtaining physiological characteristics.
The target time segment is the last time segment in the preset period. For example, if the preset period is one hour, the target time segment is the last ten minutes of that hour.
Step S206, inputting the sign signals and the sign images in the preset period into a psychological characteristic extraction model for psychological characteristic extraction, and obtaining psychological characteristics.
Step S204 and step S206 may be executed simultaneously, or step S206 may be executed first and step S204 afterwards; the order is not limited herein.
And S208, inputting the physiological characteristics and the psychological characteristics into the characteristic splicing model for characteristic splicing to obtain splicing characteristics.
And step S210, substituting the splicing characteristics into a health distribution expression, and calculating the health probability of the user.
Step S212, determining the content adaptation level corresponding to the health probability, and reading the adaptation content corresponding to the content adaptation level.
And step S214, constructing a content list based on the adaptive content and the content adaptive level and outputting the content list to the access device.
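Putting these steps together, steps S202 to S214 could be orchestrated as below, reusing the hypothetical helpers from the earlier sketches; the model objects, the data layout, and the helper names are all assumptions made for illustration:

def process_period(sign_signals, sign_images, phys_model, psych_model,
                   splice_model, gmm):
    # S204: last time segment of the period -> physiological features.
    phys = phys_model(sign_signals[-1], sign_images[-1])
    # S206: the whole preset period -> psychological features.
    psych = psych_model(sign_signals, sign_images)
    # S208: feature splicing.
    splice, _, _ = splice_model(phys, psych)
    # S210: substitute into the health distribution expression.
    p = health_probability(gmm, splice.detach().numpy().ravel())
    # S212 / S214: level lookup and content list construction.
    level = content_adaptation_level(p)
    return build_content_list(level)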
An embodiment of a content processing apparatus for a virtual world provided in this specification is as follows:
in the foregoing embodiment, a method for processing content in a virtual world is provided, and correspondingly, a device for processing content in a virtual world is also provided, which is described below with reference to the accompanying drawings.
Referring to fig. 3, a schematic diagram of a content processing apparatus of a virtual world provided in this embodiment is shown.
Since the device embodiments correspond to the method embodiments, the description is relatively simple, and the relevant portions may refer to the corresponding description of the method embodiments provided above. The device embodiments described below are merely illustrative.
The present embodiment provides a content processing apparatus for a virtual world, including:
a data acquisition module 302 configured to acquire sign data and characteristic images of a preset period, which are acquired by an access device of a virtual world for a user;
a feature extraction module 304, configured to input the sign data and the feature image of at least one time slice in the preset period into a first model for physiological feature extraction to obtain physiological features, and input the sign data and the feature image of the preset period into a second model for psychological feature extraction to obtain psychological features;
a feature splicing module 306 configured to perform feature splicing on the physiological features and the psychological features to obtain spliced features;
a content output module 308 configured to determine a content adaptation level of the virtual world based on the splicing characteristics, and output adapted content corresponding to the content adaptation level to the access device.
An embodiment of a content processing device in a virtual world provided in this specification is as follows:
on the basis of the same technical concept, corresponding to the content processing method of the virtual world described above, one or more embodiments of the present specification further provide a content processing device of the virtual world, where the content processing device of the virtual world is configured to execute the content processing method of the virtual world provided above, and fig. 4 is a schematic structural diagram of the content processing device of the virtual world provided in one or more embodiments of the present specification.
The present embodiment provides a content processing device of a virtual world, including:
as shown in fig. 4, the content processing device of the virtual world may have a relatively large difference due to different configurations or performances, and may include one or more processors 401 and a memory 402, where one or more stored applications or data may be stored in the memory 402. Memory 402 may be, among other things, transient storage or persistent storage. The application program stored in memory 402 may include one or more modules (not shown), each of which may include a series of computer-executable instructions in a content processing device of the virtual world. Still further, the processor 401 may be configured to communicate with the memory 402 to execute a series of computer-executable instructions in the memory 402 on a content processing device of the virtual world. The virtual world's content processing apparatus may also include one or more power supplies 403, one or more wired or wireless network interfaces 404, one or more input/output interfaces 405, one or more keyboards 406, and the like.
In one particular embodiment, the virtual world's content processing apparatus includes a memory, and one or more programs, wherein the one or more programs are stored in the memory, and the one or more programs may include one or more modules, and each module may include a series of computer-executable instructions for the virtual world's content processing apparatus, and the one or more programs configured for execution by the one or more processors include computer-executable instructions for:
acquiring sign data and characteristic images of a user in a preset period, which are acquired by an access device of the virtual world for the user;
inputting the sign data and the characteristic image of at least one time slice in the preset period into a first model to perform physiological characteristic extraction to obtain physiological characteristics, and inputting the sign data and the characteristic image of the preset period into a second model to perform psychological characteristic extraction to obtain psychological characteristics;
performing feature splicing on the physiological features and the psychological features to obtain spliced features;
and determining the content adaptation level of the virtual world based on the splicing characteristics, and outputting adaptation content corresponding to the content adaptation level to the access equipment.
An embodiment of a storage medium provided in this specification is as follows:
on the basis of the same technical concept, one or more embodiments of the present specification further provide a storage medium corresponding to the above-described content processing method of the virtual world.
The storage medium provided in this embodiment is used to store computer-executable instructions, and when the computer-executable instructions are executed by the processor, the following processes are implemented:
acquiring sign data and characteristic images of a user in a preset period, which are acquired by an access device of the virtual world for the user;
inputting the sign data and the characteristic image of at least one time slice in the preset period into a first model to perform physiological characteristic extraction to obtain physiological characteristics, and inputting the sign data and the characteristic image of the preset period into a second model to perform psychological characteristic extraction to obtain psychological characteristics;
performing feature splicing on the physiological features and the psychological features to obtain splicing features;
and determining the content adaptation level of the virtual world based on the splicing characteristics, and outputting adaptation content corresponding to the content adaptation level to the access equipment.
It should be noted that the embodiment of the storage medium in this specification and the embodiment of the content processing method in the virtual world in this specification are based on the same inventive concept, and therefore, for specific implementation of this embodiment, reference may be made to implementation of the foregoing corresponding method, and repeated parts are not described again.
The foregoing description of specific embodiments has been presented for purposes of illustration and description. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
In the 1990s, improvements in a technology could be clearly distinguished as improvements in hardware (e.g., improvements in circuit structures such as diodes, transistors, and switches) or improvements in software (improvements in method flows). However, as technology has advanced, many of today's method-flow improvements can be viewed as direct improvements in hardware circuit structures. Designers almost always obtain a corresponding hardware circuit structure by programming an improved method flow into a hardware circuit. Thus, it cannot be said that an improvement in a method flow cannot be realized by hardware entity modules. For example, a Programmable Logic Device (PLD) (e.g., a Field Programmable Gate Array, FPGA) is an integrated circuit whose logic functions are determined by the user's programming of the device. A designer "integrates" a digital system onto a single PLD by programming it, without asking a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, instead of manually fabricating integrated circuit chips, this programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compilers used in program development; the source code to be compiled is written in a specific programming language called a Hardware Description Language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language); at present, VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are the most commonly used. It will also be apparent to those skilled in the art that a hardware circuit implementing a logical method flow can readily be obtained merely by lightly programming the method flow into an integrated circuit using the above hardware description languages.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an Application Specific Integrated Circuit (ASIC), a programmable logic controller, or an embedded microcontroller; examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art will also appreciate that, in addition to implementing the controller purely as computer-readable program code, the method steps can be logically programmed so that the controller achieves the same functionality in the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the means included therein for performing the various functions may also be regarded as structures within the hardware component. Or even the means for performing the functions may be regarded both as software modules for implementing the method and as structures within the hardware component.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. One typical implementation device is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smartphone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, respectively. Of course, the functions of the units may be implemented in the same software and/or hardware or in multiple software and/or hardware when implementing the embodiments of the present description.
One skilled in the art will appreciate that one or more embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, one or more embodiments of the present description may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the description may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and so forth) having computer-usable program code embodied therein.
The description has been presented with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the description. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, such as Random Access Memory (RAM), and/or non-volatile memory, such as Read-Only Memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, Phase-change Memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), flash memory or other memory technology, Compact Disc Read-Only Memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," and any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
One or more embodiments of the present description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. One or more embodiments of the specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
The embodiments in the present specification are described in a progressive manner, and the same and similar parts among the embodiments are referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the system embodiment, since it is substantially similar to the method embodiment, the description is relatively simple, and reference may be made to the partial description of the method embodiment for relevant points.
The above description is only an example of the present document and is not intended to limit the present document. Various modifications and changes may occur to those skilled in the art from this document. Any modification, equivalent replacement, improvement or the like made within the spirit and principle of this document shall be included in the scope of the claims of this document.

Claims (17)

1. A content processing method of a virtual world comprises the following steps:
acquiring sign data and characteristic images of a user in a preset period, which are acquired by an access device of the virtual world for the user;
inputting the sign data and the characteristic image of at least one time slice in the preset period into a first model to perform physiological characteristic extraction to obtain physiological characteristics, and inputting the sign data and the characteristic image of the preset period into a second model to perform psychological characteristic extraction to obtain psychological characteristics;
performing feature splicing on the physiological features and the psychological features to obtain splicing features;
and determining the content adaptation level of the virtual world based on the splicing characteristics, and outputting adaptation content corresponding to the content adaptation level to the access equipment.
2. The content processing method of the virtual world according to claim 1, wherein the outputting of the adapted content corresponding to the content adaptation level to the access device includes:
reading the adaptive content corresponding to the content adaptation level in the virtual world;
and constructing a content list based on the adaptive content and the content adaptation level and outputting the content list to the access equipment, or updating the output content of the access equipment based on the adaptive content.
3. The content processing method of the virtual world according to claim 1, wherein the first model includes: a sign data encoder, an image encoder and a multi-modal hybrid network;
the sign data encoder encodes the sign data of the at least one time slice and outputs sign characteristics; the image encoder performs image encoding on the characteristic image of the at least one time slice and outputs image characteristics; and the multi-modal hybrid network performs multi-modal processing on the sign features and the image features and outputs the physiological features.
4. The method for processing the contents of the virtual world according to claim 1, wherein the first model is trained in the following manner:
inputting the sign data sample into a sign data encoder to be trained for sign data encoding to obtain sign characteristics, and inputting the characteristic image sample into an image encoder to be trained for image encoding to obtain image characteristics;
inputting the physical sign characteristics and the image characteristics into a multi-modal mixed network to be trained for multi-modal processing to obtain sample physiological characteristics and physiological characteristic indexes;
calculating a training loss based on the sign data samples, the feature image samples, the sample physiological features, and the physiological feature indicators;
and adjusting parameters of the sign data encoder to be trained, the image encoder to be trained and the multi-modal hybrid network to be trained based on the training loss.
5. The method for processing contents of a virtual world according to claim 4, wherein said calculating a training loss based on the physical sign data samples, the characteristic image samples, the sample physiological characteristics and the physiological characteristic indicators comprises:
calculating a prediction loss based on the sign data samples, the characteristic image samples, and the sample physiological characteristics, and calculating a characteristic loss based on the sign data samples, the characteristic image samples, and the physiological characteristic indicators.
6. The method for processing the contents of the virtual world according to claim 5, wherein the calculating the predicted loss based on the sign data samples, the characteristic image samples and the sample physiological characteristics comprises:
determining intermediate time according to the acquisition time of the sign data sample and the characteristic image sample, and extracting a target sign data sample and a target characteristic image sample before the intermediate time from the sign data sample and the characteristic image sample;
inputting the target sign data sample, the target characteristic image sample and the sample physiological characteristics into a decoder for data prediction to obtain predicted sign data and a predicted characteristic image;
calculating the prediction loss based on sign data samples after the intermediate time in the sign data samples, feature image samples after the intermediate time in the feature image samples, the predicted sign data, and the predicted feature image.
7. The method for processing the contents of the virtual world according to claim 4, wherein the sign data samples and the characteristic image samples are obtained by:
acquiring initial sign data and initial characteristic images of users acquired by at least one access device;
preprocessing the initial sign data and the initial characteristic image to obtain target sign data and a target characteristic image;
and performing labeling processing on the target sign data and the target characteristic image to obtain a sign data sample and a characteristic image sample.
8. The content processing method of the virtual world according to claim 7, wherein the preprocessing comprises:
filtering the initial sign data to obtain the target sign data; and
performing image evaluation on the initial characteristic image to obtain an image index of the initial characteristic image; and screening, from the initial characteristic images, the initial characteristic images whose image index is greater than an index threshold as the target characteristic image.
9. The content processing method of the virtual world according to claim 1, wherein the second model comprises: a sign data encoder, an image encoder, and a multi-modal hybrid network;
the sign data encoder encodes the sign data of the preset period and outputs sign characteristics; the image encoder performs image encoding on the characteristic image of the preset period and outputs image characteristics; and the multi-modal hybrid network performs multi-modal processing on the sign characteristics and the image characteristics and outputs the psychological characteristics.
10. The content processing method of the virtual world according to claim 1, wherein the determining the content adaptation level of the virtual world based on the splicing features comprises:
inputting the splicing characteristics into a preset index algorithm to perform characteristic index calculation to obtain characteristic indexes;
determining a content adaptation level matching the feature index.
11. The method for processing the content of the virtual world according to claim 10, wherein the preset index algorithm is obtained by:
inputting the splicing characteristic sample into a distribution model to be trained for index calculation to obtain a sample index;
calculating training loss based on the sample indexes and the sample labels of the splicing characteristic samples, and performing parameter adjustment on the distribution model based on the training loss to obtain distribution parameters;
and constructing a distribution expression based on the distribution parameters to serve as the preset index algorithm.
12. The content processing method of the virtual world according to claim 1, wherein the physiological feature extraction comprises:
inputting the sign data of the at least one time slice into a sign data encoder to encode the sign data to obtain sign characteristics, and inputting the characteristic image of the at least one time slice into an image encoder to encode a characteristic image to obtain image characteristics;
and inputting the sign features and the image features into a multi-modal encoder to perform multi-modal processing to obtain the physiological features.
13. The content processing method of the virtual world according to claim 12, the multi-modal processing, comprising:
determining a splicing dimension according to the feature dimension of the physical sign feature and the feature dimension of the image feature;
performing feature splicing on the sign features and the image features according to the splicing dimension to obtain multi-mode splicing features;
determining dimensionality reduction based on the characteristic dimensionality of the multi-mode splicing characteristics and a preset dimensionality reduction proportion;
and performing feature dimension reduction processing on the multi-mode splicing features according to the dimension reduction dimension to obtain the physiological features.
14. The content processing method of the virtual world according to claim 1, wherein the adapted content includes item data of the determined service items, adapted to the content adaptation level, in which the user participates in the virtual world.
15. A content processing apparatus of a virtual world, comprising:
the data acquisition module is configured to acquire sign data and characteristic images of a preset period, which are acquired by access equipment of the virtual world aiming at a user;
the characteristic extraction module is configured to input the sign data and the characteristic images of at least one time slice in the preset period into a first model for physiological characteristic extraction to obtain physiological characteristics, and input the sign data and the characteristic images of the preset period into a second model for psychological characteristic extraction to obtain psychological characteristics;
the characteristic splicing module is configured to perform characteristic splicing on the physiological characteristics and the psychological characteristics to obtain spliced characteristics;
and the content output module is configured to determine the content adaptation level of the virtual world based on the splicing characteristics and output adaptation content corresponding to the content adaptation level to the access equipment.
16. A content processing apparatus of a virtual world, comprising:
a processor; and the number of the first and second groups,
a memory configured to store computer-executable instructions that, when executed, cause the processor to:
acquiring sign data and characteristic images of a user in a preset period, which are acquired by an access device of the virtual world for the user;
inputting the sign data and the characteristic image of at least one time slice in the preset period into a first model to perform physiological characteristic extraction to obtain physiological characteristics, and inputting the sign data and the characteristic image of the preset period into a second model to perform psychological characteristic extraction to obtain psychological characteristics;
performing feature splicing on the physiological features and the psychological features to obtain splicing features;
and determining the content adaptation level of the virtual world based on the splicing characteristics, and outputting adaptation content corresponding to the content adaptation level to the access equipment.
17. A storage medium storing computer-executable instructions that when executed by a processor implement the following:
acquiring sign data and characteristic images of a user in a preset period, which are acquired by an access device of the virtual world for the user;
inputting the sign data and the characteristic image of at least one time slice in the preset period into a first model to perform physiological characteristic extraction to obtain physiological characteristics, and inputting the sign data and the characteristic image of the preset period into a second model to perform psychological characteristic extraction to obtain psychological characteristics;
performing feature splicing on the physiological features and the psychological features to obtain splicing features;
and determining the content adaptation level of the virtual world based on the splicing characteristics, and outputting adaptation content corresponding to the content adaptation level to the access equipment.
CN202211193143.XA 2022-09-28 2022-09-28 Virtual world content processing method and device Pending CN115587347A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211193143.XA CN115587347A (en) 2022-09-28 2022-09-28 Virtual world content processing method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211193143.XA CN115587347A (en) 2022-09-28 2022-09-28 Virtual world content processing method and device

Publications (1)

Publication Number Publication Date
CN115587347A true CN115587347A (en) 2023-01-10

Family

ID=84773163

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211193143.XA Pending CN115587347A (en) 2022-09-28 2022-09-28 Virtual world content processing method and device

Country Status (1)

Country Link
CN (1) CN115587347A (en)

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108305296A (en) * 2017-08-30 2018-07-20 深圳市腾讯计算机系统有限公司 Iamge description generation method, model training method, equipment and storage medium
CN114639132A (en) * 2020-12-16 2022-06-17 腾讯科技(深圳)有限公司 Feature extraction model processing method, device and equipment in face recognition scene
CN112597967A (en) * 2021-01-05 2021-04-02 沈阳工业大学 Emotion recognition method and device for immersive virtual environment and multi-modal physiological signals
CN114581823A (en) * 2022-02-24 2022-06-03 华南理工大学 Virtual reality video emotion recognition method and system based on time sequence characteristics
CN114662144A (en) * 2022-03-07 2022-06-24 支付宝(杭州)信息技术有限公司 Biological detection method, device and equipment
CN114463825A (en) * 2022-04-08 2022-05-10 北京邮电大学 Face prediction method based on multi-mode fusion and related equipment
CN114999237A (en) * 2022-06-07 2022-09-02 青岛理工大学 Intelligent education interactive teaching method
CN114972944A (en) * 2022-06-16 2022-08-30 中国电信股份有限公司 Training method and device of visual question-answering model, question-answering method, medium and equipment

Similar Documents

Publication Publication Date Title
CN109558832B (en) Human body posture detection method, device, equipment and storage medium
US20160071024A1 (en) Dynamic hybrid models for multimodal analysis
CN104995662A (en) Avatar-based transfer protocols, icon generation and doll animation
US20210397954A1 (en) Training device and training method
Bai et al. Motion2Vector: Unsupervised learning in human activity recognition using wrist-sensing data
CN116824278B (en) Image content analysis method, device, equipment and medium
CN109698017B (en) Medical record data generation method and device
CN112308113A (en) Target identification method, device and medium based on semi-supervision
CN116343314B (en) Expression recognition method and device, storage medium and electronic equipment
CN112818955A (en) Image segmentation method and device, computer equipment and storage medium
CN113887206B (en) Model training and keyword extraction method and device
CN113657272B (en) Micro video classification method and system based on missing data completion
CN111639684B (en) Training method and device for data processing model
CN114359775A (en) Key frame detection method, device, equipment, storage medium and program product
CN111476291B (en) Data processing method, device and storage medium
CN115499635B (en) Data compression processing method and device
CN115374141B (en) Update processing method and device for virtual image
CN115587347A (en) Virtual world content processing method and device
CN115358777A (en) Advertisement putting processing method and device of virtual world
CN115393022A (en) Cross-domain recommendation processing method and device
CN115456114A (en) Method, device, medium and equipment for model training and business execution
CN112312205B (en) Video processing method and device, electronic equipment and computer storage medium
CN111063420A (en) Method and device for identifying psychological pressure based on neural network
CN111291645A (en) Identity recognition method and device
CN117540007B (en) Multi-mode emotion analysis method, system and equipment based on similar mode completion

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination