CN116740237A

CN116740237A - Bad behavior filtering method, electronic equipment and computer readable storage medium

Info

Publication number: CN116740237A
Application number: CN202310611328.6A
Authority: CN
Inventors: 柴金详; 谭宏冰; 解澎莉; 李熹昊; 栾欣洋
Original assignee: Shanghai Movu Technology Co Ltd; Mofa Shanghai Information Technology Co Ltd
Current assignee: Shanghai Movu Technology Co Ltd; Mofa Shanghai Information Technology Co Ltd
Priority date: 2023-05-28
Filing date: 2023-05-28
Publication date: 2023-09-12

Abstract

The application provides a bad behavior filtering method, an electronic device, and a computer-readable storage medium for providing a live broadcast function using a virtual anchor, wherein the virtual anchor is driven by a human performer (the "person inside"). The method comprises: acquiring behavior information of the person inside in real time during the live broadcast, and detecting, based on the behavior information, whether the behavior of the person inside is bad behavior, wherein the behavior information comprises image information and/or voice information; and, if the behavior is bad behavior, driving the virtual anchor with a first driving strategy to filter out the bad behavior, wherein the first driving strategy comprises at least one of: a first preset text and a first preset action. By monitoring and filtering the bad behavior of the person inside in real time during the live broadcast, the application keeps the live content healthy and positive, prevents bad content from appearing, and improves the user experience.

Description

Bad behavior filtering method, electronic equipment and computer readable storage medium

Technical Field

The application relates to the technical field of virtual persons and artificial intelligence, in particular to a bad behavior filtering method, electronic equipment and a computer readable storage medium.

Background

Virtual objects include virtual humans, virtual animals, virtual cartoon characters, and the like. A virtual person is an anthropomorphic figure constructed with CG technology and run in the form of code, with multiple interaction modes such as spoken communication, facial expression, and action display. Virtual human technology has developed rapidly within the field of artificial intelligence and has been applied in many fields such as video, media, games, finance, travel, education, and medical care.

When an existing virtual anchor is driven by a person inside (the human performer behind the avatar), its movements and expressions stay consistent with those of the person inside. When the person inside exhibits bad behavior (such as eye-rolling), the virtual anchor reproduces that behavior, which seriously degrades the viewing experience of the audience. In view of this, the present application provides a bad behavior filtering method, an electronic device, and a computer-readable storage medium to improve on the prior art.

Disclosure of Invention

The application aims to provide a bad behavior filtering method, an electronic device, and a computer-readable storage medium that keep live content healthy and positive, prevent bad content from appearing, and improve the user experience.

The application adopts the following technical scheme:

In a first aspect, the present application provides a bad behavior filtering method for providing a live broadcast function using a virtual anchor, the virtual anchor being driven by a person inside, the method comprising:

acquiring behavior information of the person inside in real time during the live broadcast, and detecting, based on the behavior information, whether the behavior of the person inside is bad behavior, wherein the behavior information comprises image information and/or voice information, and the bad behavior comprises at least one of the following: eye-rolling, bad speech, smoking, excessive drinking, gambling, revealing clothing, self-harm, and harming others;

if the behavior of the person inside is bad behavior, driving the virtual anchor with a first driving strategy to filter out the bad behavior of the person inside, wherein the first driving strategy comprises at least one of the following: a first preset text and a first preset action.

The beneficial effects of this technical scheme are as follows. The bad behavior of the person inside is monitored and filtered in real time during the live broadcast, which keeps the live content healthy and positive, prevents bad content from appearing, and improves the user experience. When the person inside behaves badly, the virtual anchor does not reproduce the bad behavior; instead, the virtual anchor is driven with the first driving strategy (preset text and/or action), so the bad behavior is effectively filtered out. This safeguards the legitimate rights and interests of the audience and, in particular, protects underage viewers from bad information. In addition, reducing bad content improves the platform's reputation to some extent and strengthens users' recognition of and trust in the platform. The method is applicable to filtering many kinds of bad behavior, is flexible and extensible, and can be continuously optimized and upgraded according to actual conditions to suit the needs of different platforms. In summary, the method can improve the live broadcast quality and user experience of a live broadcast platform, protect the interests of the audience and the reputation of the platform, and has high practical value and good application prospects.

In some alternative embodiments, the first driving strategy is obtained by:

acquiring a behavior identifier of the behavior information;

and inputting the behavior identifier into a strategy configuration model to obtain the first driving strategy corresponding to the behavior identifier.

The beneficial effects of this technical scheme are as follows. Different driving strategies are provided for different bad behaviors of the person inside, so the filtering appears more natural and does not make the audience uncomfortable. Specifically, behavior information is acquired in real time and the strategy is matched to the current situation, so strategy execution is real-time and dynamic and can meet practical application requirements. Obtaining the behavior identifier and inputting it into the strategy configuration model allows the corresponding first driving strategy to be matched more accurately. Because the driving strategy is derived from an algorithmic model and behavior data, it can be determined quickly and accurately without laborious manual data processing and analysis, which improves the operating efficiency and management of the virtual anchor platform. Accurately matching the first driving strategy can also improve the utilization of platform resources and operating efficiency, thereby reducing platform costs and increasing revenue.

In some alternative embodiments, driving the virtual anchor with the first driving strategy includes:

driving the expression and the mouth shape of the virtual anchor based on the first preset text; and/or

driving the action of the virtual anchor based on the first preset action.

The beneficial effects of this technical scheme are as follows. The expression, mouth shape, and action of the virtual anchor are driven by the first preset text and/or the first preset action, so the virtual anchor can imitate human communication and behavior more realistically, which further enhances its realism and interactivity and improves audience immersion and engagement. Using the first driving strategy also makes production of the virtual anchor more automated and standardized, reducing production cost and improving production efficiency. For example, during production, the expression and action of the virtual anchor only need to be set according to the first preset text or action, without designing and producing every action and expression in detail. This improves the realism, intelligent interaction capability, and production efficiency of the virtual anchor, better meets user needs, and improves the user experience.

In some optional embodiments, detecting whether the behavior of the person inside is bad behavior based on the behavior information includes:

inputting the behavior information into a behavior detection model to obtain a behavior detection result corresponding to the person inside, wherein the behavior detection result indicates whether the behavior is bad behavior;

the training process of the behavior detection model comprises the following steps:

acquiring a training set, wherein the training set comprises a plurality of training data items, and each training data item comprises sample behavior information and labeling data of the behavior detection result corresponding to the sample behavior information;

for each training data item, the following processing is performed:

inputting the sample behavior information in the training data item into a preset deep learning model to obtain prediction data of the behavior detection result corresponding to the sample behavior information;

updating model parameters of the deep learning model based on the prediction data and the labeling data of the behavior detection result corresponding to the sample behavior information;

detecting whether a preset training ending condition is met; if yes, taking the trained deep learning model as the behavior detection model; if not, continuing to train the deep learning model by using the next training data.
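The training procedure above (predict, update, check an ending condition, move to the next training data item) can be sketched as follows. This is only an illustration under stated assumptions: a single-layer logistic model stands in for the "preset deep learning model", the two-element feature vectors are hypothetical scores rather than real image or voice data, and the ending condition (a smoothed loss below `loss_target`, or `max_steps` reached) is one possible choice that the application does not specify.

```python
import math

def predict(weights, bias, features):
    """Predicted probability that a sample shows bad behavior."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    z = max(-60.0, min(60.0, z))  # clamp so math.exp stays in range
    return 1.0 / (1.0 + math.exp(-z))

def train(training_set, lr=0.5, loss_target=0.02, max_steps=20000):
    """Per-sample training loop with a preset ending condition (assumed form)."""
    n_features = len(training_set[0][0])
    weights, bias = [0.0] * n_features, 0.0
    avg_loss = None
    for step in range(1, max_steps + 1):
        # Take the next training data item: (sample behavior info, label).
        features, label = training_set[(step - 1) % len(training_set)]
        p = predict(weights, bias, features)
        # Cross-entropy between prediction data and labeling data.
        p_safe = min(max(p, 1e-12), 1.0 - 1e-12)
        loss = -(label * math.log(p_safe) + (1 - label) * math.log(1.0 - p_safe))
        # Update model parameters from the prediction error.
        grad = p - label
        weights = [w - lr * grad * x for w, x in zip(weights, features)]
        bias -= lr * grad
        # Ending condition: smoothed loss is low after at least one full pass.
        avg_loss = loss if avg_loss is None else 0.9 * avg_loss + 0.1 * loss
        if step >= len(training_set) and avg_loss < loss_target:
            break
    return weights, bias

# Hypothetical samples: [eye_roll_score, smoking_score] -> 1 (bad) or 0 (not bad).
TRAINING_SET = [([1.0, 0.0], 1), ([0.0, 1.0], 1), ([0.0, 0.0], 0), ([0.1, 0.1], 0)]
weights, bias = train(TRAINING_SET)
is_bad = predict(weights, bias, [0.9, 0.0]) > 0.5
```

In a real system the model, features, and stopping rule would be replaced by whatever the platform actually trains; only the loop structure mirrors the steps listed above.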

The beneficial effects of this technical scheme are as follows. Training a deep learning model and using it for recognition enables automatic detection that is fast, reduces cost, improves efficiency, and identifies bad behavior more accurately, thereby reducing the spread of bad behavior online. During training, the weights of the deep learning model are updated according to the labeling data and the prediction data, so detection accuracy improves continuously and the model adapts better to varied scenes and changes. Detecting bad behavior allows corresponding measures to be taken promptly, protecting network security and maintaining a healthy network environment.

In some alternative embodiments, the method further comprises:

generating a bad behavior record for the person inside based on the behavior information, the bad behavior record comprising at least one of:

a frame image in which the bad behavior occurs;

a timestamp corresponding to the frame image;

and a video containing the frame image, the duration of the video being no longer than a preset duration.
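One way to picture such a record is as a small data structure holding the frame, its timestamp, and the bounds of a short clip around it. The sketch below is an assumption-laden illustration: the field names, the use of a frame index to identify the frame image, and the choice to center the clip on the timestamp are all hypothetical and not taken from the application.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class BadBehaviorRecord:
    frame_index: int     # identifies the frame image in which bad behavior occurs
    timestamp_s: float   # timestamp of that frame, in seconds from stream start
    clip_start_s: float  # start of the video clip containing the frame
    clip_end_s: float    # end of the video clip containing the frame

    @property
    def clip_duration_s(self) -> float:
        return self.clip_end_s - self.clip_start_s

def make_record(frame_index: int, timestamp_s: float,
                max_clip_s: float = 10.0) -> BadBehaviorRecord:
    """Build a record whose clip never exceeds the preset duration."""
    half = max_clip_s / 2.0
    start = max(0.0, timestamp_s - half)  # do not reach before stream start
    end = timestamp_s + half
    return BadBehaviorRecord(frame_index, timestamp_s, start, end)

record = make_record(frame_index=900, timestamp_s=30.0)
```

Clamping the start at zero shortens the clip near the beginning of the stream, so the duration stays at or below `max_clip_s` in every case.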

The beneficial effects of this technical scheme are as follows. Generating a bad behavior record for the person inside allows bad behavior to be recorded and managed more effectively, which facilitates later investigation and handling. Recording and analyzing bad behavior provides insight into how and when it occurs, and an important basis for preventing similar behavior. Through recording and management, bad behavior can be discovered promptly and dealt with more firmly, which helps maintain social order and network security. Frame images, timestamps, and videos containing the frame images allow bad behavior to be recorded more accurately, improving the validity and accuracy of the records. In summary, generating bad behavior records for the person inside improves the ability to record and manage bad behavior, provides a basis for preventing similar behavior, strengthens enforcement against bad behavior, and improves the accuracy and validity of the records.

In some alternative embodiments, the method further comprises:

when the number of bad behavior records of the person inside is not less than a first preset number, displaying behavior prompt information on a live room management interface to prompt the person inside to adjust their behavior;

when the number of bad behavior records of the person inside is not less than a second preset number, sending violation prompt information to a terminal device of platform management personnel so that the platform management personnel can urge the person inside to adjust their behavior, wherein the first preset number is smaller than the second preset number.
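The two-threshold escalation can be sketched as a single function. The action names returned here are hypothetical labels for illustration; the application only specifies the two conditions and that the first preset number is smaller than the second.

```python
def escalation_actions(record_count: int, first_preset: int, second_preset: int) -> list:
    """Return the prompts triggered by the current number of bad behavior records."""
    if not first_preset < second_preset:
        raise ValueError("first preset number must be smaller than the second")
    actions = []
    if record_count >= first_preset:
        # Prompt shown on the live room management interface.
        actions.append("show_behavior_prompt")
    if record_count >= second_preset:
        # Violation notice sent to the platform manager's terminal device.
        actions.append("notify_platform_manager")
    return actions

actions = escalation_actions(record_count=5, first_preset=3, second_preset=5)
```

Both conditions use "not less than", so a count equal to a threshold already triggers the corresponding prompt.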

The beneficial effects of this technical scheme are as follows. Setting thresholds on the number of bad behavior records allows the bad behavior of the person inside to be discovered and monitored promptly, which helps maintain order in the live room and network security.

When the number of bad behavior records is relatively small but has reached the first preset number, behavior prompt information is displayed on the live room management interface, reminding the person inside to adjust their behavior without disturbing other users. When the number of bad behavior records is larger and has reached the second preset number, violation prompt information is sent to the terminal device of platform management personnel so that they can promptly urge the person inside to adjust their behavior; when the bad behavior is serious, the live room may be restricted or banned, or the share allotted to the person inside may be reduced. Monitoring and handling the bad behavior of the person inside protects the legitimate rights and interests of other users and improves the viewing quality and user experience of the live room. Automatically sending violation prompt information to platform management personnel improves management efficiency and response speed and saves management cost and human resources. In summary, setting thresholds on the number of bad behavior records, and handling bad behavior through the live room management interface and through prompts sent to the terminal devices of platform management personnel, improves the monitoring of bad behavior and the effectiveness of prompting the person inside to adjust their behavior, protects the rights and interests of other users, and makes platform management more efficient.

In some optional embodiments, the live room corresponding to the virtual anchor is a live room for a preset theme, and the preset theme includes at least one of the following: education, product promotion, fitness, games, and music;

the method further comprises:

obtaining, based on the behavior information, the relevance between the behavior of the person inside and the preset theme;

when the relevance is smaller than a preset relevance, driving the virtual anchor with a second driving strategy, wherein the second driving strategy comprises at least one of the following: a second preset text and a second preset action.
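The application does not say how the relevance is computed. As a minimal sketch, assume the behavior information has already been reduced to a list of keywords and that relevance is the fraction of those keywords found in a per-theme vocabulary. The vocabularies, the threshold of 0.5, and the second strategy's text and action below are all hypothetical.

```python
from typing import List, Optional, Tuple

# Hypothetical per-theme vocabularies.
THEME_KEYWORDS = {
    "fitness": {"squat", "stretch", "warm-up", "plank"},
    "music": {"chord", "melody", "rhythm", "lyrics"},
}

# Hypothetical second driving strategy: (second preset text, second preset action).
SECOND_STRATEGY = ("Let's get back to today's topic", "clap hands")

def relevance(behavior_keywords: List[str], theme: str) -> float:
    """Fraction of observed keywords that belong to the theme vocabulary."""
    if not behavior_keywords:
        return 1.0  # nothing observed, so nothing to redirect
    vocab = THEME_KEYWORDS[theme]
    hits = sum(1 for k in behavior_keywords if k in vocab)
    return hits / len(behavior_keywords)

def choose_second_strategy(behavior_keywords: List[str], theme: str,
                           preset_relevance: float = 0.5) -> Optional[Tuple[str, str]]:
    """Return the second driving strategy only when relevance is below the preset value."""
    if relevance(behavior_keywords, theme) < preset_relevance:
        return SECOND_STRATEGY
    return None

strategy = choose_second_strategy(["gossip", "shopping"], "fitness")
```

A production system would more plausibly score relevance with a learned model over image and voice features; the comparison against a preset relevance is the only part taken from the text above.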

The beneficial effects of this technical scheme are as follows. Because the relevance between the behavior and the preset theme is obtained from the behavior information of the person inside, the driving strategy of the virtual anchor can be adjusted in real time to better fit the preset theme of the live room, improving the thematic relevance of the live room. Keeping the behavior of the virtual anchor relevant to the preset theme can increase users' sense of identification with and belonging to the live room, improve user engagement and loyalty, and increase user stickiness. Driving the virtual anchor with the second driving strategy can make the live room more interactive and entertaining, attract more users to take part in live interaction, and improve the user experience and entertainment value. Live rooms with different preset themes can promote different categories of products and services in a targeted way, improving the commercial value and traffic monetization capability of the live room. In summary, obtaining the relevance between the behavior of the person inside and the preset theme, and driving the virtual anchor with the second driving strategy, improves the thematic relevance, user stickiness, interactivity, and entertainment value of the live room and increases its commercial value.

In some alternative embodiments, the preset theme is education for minors, and the method further comprises:

acquiring real-time bullet comments of the live room, and detecting whether each real-time bullet comment is bad speech so as to obtain the number of bad speech items in the live room;

and when the number of bad speech items in the live room is not less than a third preset number, opening a bullet comment filtering pop-up window on the management interface of the live room so that the person inside can enable the bad speech filtering function through the pop-up window.
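The counting step can be illustrated with a simple word blocklist. This is only a sketch: the application does not specify how a bullet comment is judged to be bad speech, and the blocklist entries, function names, and sample comments below are hypothetical stand-ins for whatever detector a platform uses.

```python
# Hypothetical blocklist standing in for a real bad speech detector.
BLOCKLIST = {"idiot", "loser"}

def is_bad_speech(comment: str) -> bool:
    """Flag a bullet comment if any word, stripped of punctuation, is blocklisted."""
    return any(word.strip(".,!?") in BLOCKLIST for word in comment.lower().split())

def count_bad_speech(comments) -> int:
    return sum(1 for c in comments if is_bad_speech(c))

def should_open_filter_popup(comments, third_preset: int) -> bool:
    """True when the bad speech count is not less than the third preset number."""
    return count_bad_speech(comments) >= third_preset

comments = ["Great lesson!", "You idiot!", "what a loser", "hello"]
open_popup = should_open_filter_popup(comments, third_preset=2)
```

The pop-up only offers the filtering function; whether filtering is enabled remains the choice of the person inside, as described above.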

The beneficial effects of this technical scheme are as follows. By acquiring and checking the bullet comments of the live room in real time and opening the bullet comment filtering pop-up window, bad speech can be filtered from the live room management interface, which reduces the amount of bad speech in the live room and protects the healthy development of minors. The pop-up window lets the person inside enable bad speech filtering from the management interface, which strengthens control over bad speech and improves the safety and orderliness of the live room. Filtering bad comments can also make the live room more interactive and lively, attract more users to take part in live interaction, and improve the user experience and entertainment value. Reducing the amount of bad speech can raise users' trust in and acceptance of the live room, increase their attention to and support for its theme, and increase user stickiness. In summary, acquiring and checking bullet comments in real time and opening the filtering pop-up window so that bad speech filtering can be set in the management interface reduces the amount of bad speech in the live room, protects the healthy development of minors, strengthens control over bad speech, and improves the interactivity, liveliness, and trustworthiness of the live room.

In a second aspect, the present application provides an electronic device for providing live functionality with a virtual anchor, the electronic device comprising a memory storing a computer program and at least one processor configured to implement the following steps when executing the computer program:

In some alternative embodiments, the at least one processor is configured to obtain the first driving policy when executing the computer program by:

Acquiring a behavior identifier of the behavior information;

In some alternative embodiments, the at least one processor is configured to drive the virtual anchor with a first drive policy when executing the computer program in the following manner:

and driving the action of the virtual anchor based on the first preset action.

In some alternative embodiments, the at least one processor is configured to detect whether the person's behavior is bad behavior based on the behavior information when executing the computer program in the following manner:

For each of the training data, the following processing is performed:

In some alternative embodiments, the at least one processor is configured to execute the computer program to further implement the steps of:

a frame image in which bad behavior occurs;

a timestamp corresponding to the frame image;

the at least one processor is configured to execute the computer program to further implement the steps of:

In some alternative embodiments, the preset theme is education for minors, and the at least one processor is configured to execute the computer program to further implement the steps of:

In a third aspect, the present application provides a computer-readable storage medium storing a computer program which, when executed by at least one processor, performs the steps of any of the methods or performs the functions of any of the electronic devices described above.

Drawings

The application will be further described with reference to the drawings and embodiments.

Fig. 1 is a flow chart of a bad behavior filtering method according to an embodiment of the present application.

Fig. 2 is a schematic flow chart of driving a virtual anchor according to an embodiment of the present application.

Fig. 3 is a flow chart of another method for filtering bad behaviors according to an embodiment of the present application.

Fig. 4 is a flow chart of another bad behavior filtering method according to an embodiment of the present application.

Fig. 5 is a block diagram of an electronic device according to an embodiment of the present application.

Fig. 6 is a schematic structural diagram of a program product according to an embodiment of the present application.

Detailed Description

The technical scheme of the present application is described below with reference to the drawings and specific embodiments. It should be noted that, provided there is no conflict, the embodiments or technical features described below may be combined in any way to form new embodiments.

In embodiments of the application, words such as "exemplary" or "such as" are used to mean serving as an example, instance, or illustration. Any implementation or design described as "exemplary" or "e.g." in the examples of this application should not be construed as preferred or advantageous over other implementations or designs. Rather, the use of words such as "exemplary" or "such as" is intended to present related concepts in a concrete fashion.

Terms such as "first" and "second" in the embodiments of the present application are used only to illustrate and distinguish the objects described. They imply no order, do not indicate any particular limit on the number of items, and should not be construed as limiting the embodiments of the present application in any way.

The technical field and related terms of the embodiments of the present application are briefly described below.

Virtual objects include virtual humans, virtual animals, virtual cartoon characters, and the like. A virtual person is an anthropomorphic figure constructed with CG technology and run in the form of code, with multiple interaction modes such as spoken communication, facial expression, and action display. Virtual human technology has developed rapidly within the field of artificial intelligence and has been applied in many fields such as video, media, games, finance, travel, education, and medical care. It can be used not only to create virtual hosts, virtual anchors, virtual idols, virtual customer service agents, virtual lawyers, virtual financial advisors, virtual doctors, virtual instructors, virtual assistants, and the like, but also to generate video from text or audio with one click. Among virtual persons, service-type virtual persons mainly stand in for real people to provide services and daily companionship. They are the virtualization of real-world service roles, and their industrial value lies mainly in reducing costs and improving efficiency for existing service industries.

Artificial intelligence (AI) is the theory, methods, technology, and application systems that use a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive branch of computer science that attempts to understand the essence of intelligence and to produce new intelligent machines that can react in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines so that the machines can perceive, reason, and make decisions. Artificial intelligence is a broad discipline that covers both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operation/interaction systems, mechatronics, and the like. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, machine learning/deep learning, autonomous driving, intelligent transportation, and other directions.

Machine learning (ML) is an interdisciplinary field involving probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other disciplines. A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E. Machine learning studies how a computer can simulate or implement human learning behavior to acquire new knowledge or skills and reorganize existing knowledge structures so as to continually improve its own performance. Machine learning is the core of artificial intelligence and the fundamental way of giving computers intelligence, and it is applied throughout the various areas of artificial intelligence.

Deep learning is a particular kind of machine learning that achieves great power and flexibility by representing the world as a nested hierarchy of concepts, with each concept defined in relation to simpler concepts and more abstract representations computed in terms of less abstract ones. Machine learning and deep learning typically include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.

The virtual object interaction application is used for providing virtual object interaction functions. Virtual objects can simulate human communication and behavior and interact with users. Such software (referred to as virtual human interactive applications) is typically driven by artificial intelligence and natural language processing techniques and is capable of interacting with a user by means of text, speech, images, forms, etc.

In virtual human technology, the "person inside" (中之人) refers to the person who performs and refines the virtual person's character by means of motion capture and facial capture. The person inside makes interaction between the virtual person and the real world possible, so that the virtual person can interact freely with real people.

The scheme provided by the embodiments of the application involves technologies such as virtual humans, artificial intelligence, 3D modeling, and cloud computing, and is described in detail through the following embodiments. The order in which the embodiments are described below is not intended to indicate an order of preference.

(bad behavior filtering method)

Referring to fig. 1, fig. 1 is a flow chart of a bad behavior filtering method according to an embodiment of the present application.

The embodiment of the application provides a bad behavior filtering method for providing a live broadcast function using a virtual anchor, the virtual anchor being driven by a person inside, and the method comprises the following steps:

step S101: acquiring behavior information of the person inside in real time during the live broadcast, and detecting, based on the behavior information, whether the behavior of the person inside is bad behavior, wherein the behavior information comprises image information and/or voice information, and the bad behavior comprises at least one of the following: eye-rolling, bad speech, smoking, excessive drinking, gambling, revealing clothing, self-harm, and harming others;

step S102: if the behavior of the person inside is bad behavior, driving the virtual anchor with a first driving strategy to filter out the bad behavior of the person inside, wherein the first driving strategy comprises at least one of the following: a first preset text and a first preset action.
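Steps S101 and S102 amount to a per-time-slice decision: either pass the captured performance through to the virtual anchor, or substitute the preset strategy. The sketch below illustrates that control flow only. The `DriveCommand` type, the `toy_detector` function, and the label names are hypothetical; the detector stands in for the behavior detection model, and the preset text and action reuse example wording from this description.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass(frozen=True)
class DriveCommand:
    """What the avatar renderer receives for one time slice (hypothetical type)."""
    source: str                   # "performer" (motion capture) or "preset"
    text: Optional[str] = None    # preset text driving expression and mouth shape
    action: Optional[str] = None  # preset action driving body motion

# First driving strategy, using example wording from this description.
FIRST_STRATEGY = DriveCommand(
    source="preset",
    text="Taking a short break, back in a moment",
    action="wink and smile while making a heart with both hands",
)

def filter_step(behavior_info: dict, detect: Callable[[dict], bool]) -> DriveCommand:
    """S101: detect bad behavior. S102: substitute the preset strategy if detected."""
    if detect(behavior_info):
        return FIRST_STRATEGY
    return DriveCommand(source="performer")

def toy_detector(behavior_info: dict) -> bool:
    # Stand-in for the behavior detection model: checks pre-extracted labels.
    return bool(set(behavior_info.get("labels", [])) & {"eye_rolling", "smoking"})

command = filter_step({"labels": ["smoking"]}, toy_detector)
```

Passing the detector as a parameter keeps the substitution logic independent of how detection is implemented.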

In an embodiment of the present application, the virtual anchor comprises one or more of a virtual person, a virtual animal, and a virtual cartoon character. As one example, the virtual anchor is the virtual person "JING" (Chinese name meaning "mirror").

The bad behavior filtering method may run on an electronic device. The electronic device and the terminal device mentioned below may be separate devices or may be integrated into one. When they are separate, the electronic device may be a computer, a server (including a cloud server), or another device with computing capability. The embodiments of the application do not limit the terminal device; it may be, for example, a smart terminal such as a mobile phone, tablet computer, notebook computer, desktop computer, or smart wearable device, or it may be a workstation or a console.

In the embodiments of the application, bad speech comprises any of the following: cyberbullying, sexual innuendo, defamation, vulgar language, insults, reactionary speech, and the like.

As an example, the first preset text may be: "Newcomers, remember to tap follow! Thank you all for watching", and the first preset action may be: "Feishunxin".

The first driving strategy may be fixed, i.e. for different bad behaviors of the person, the virtual anchor is driven by the same text and/or action. For example, the text content is "taking a short break, back right away", and the action is "blinking and smiling while making a heart shape with both hands".

The first driving strategy may also be dynamically adjusted, i.e. the virtual anchor is driven with different text and/or actions depending on the specific type of the bad behavior. For example, when the person makes bad speech, the corresponding first driving strategy combines text with action: the text content is "taking a short break, back right away", and the action is "blinking and smiling while making a heart shape with both hands"; when the person smokes, the corresponding first driving strategy also combines text with action: the text content is "let me see how many babies are watching me", and the action is "smiling and tapping the legs with both hands".
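The difference between a fixed and a dynamically adjusted first driving strategy can be sketched as a simple lookup. This is only an illustrative sketch: the class `DrivingStrategy`, the function `select_first_strategy`, and the identifier keys such as `"bad_speech"` are hypothetical names, and the texts paraphrase the examples above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DrivingStrategy:
    """A preset text and/or preset action used to drive the virtual anchor."""
    text: Optional[str] = None
    action: Optional[str] = None


# Fixed strategy: the same text and action regardless of the bad behavior.
FIXED_STRATEGY = DrivingStrategy(
    text="Taking a short break, back right away",
    action="blink and smile, make a heart shape with both hands",
)

# Dynamic strategy: one entry per behavior type (keys are hypothetical identifiers).
DYNAMIC_STRATEGIES = {
    "bad_speech": FIXED_STRATEGY,
    "smoking": DrivingStrategy(
        text="Let me see how many babies are watching me",
        action="smile, tap the legs with both hands",
    ),
}


def select_first_strategy(behavior_id: str, dynamic: bool = True) -> DrivingStrategy:
    """Return the first driving strategy for a detected bad behavior."""
    if not dynamic:
        return FIXED_STRATEGY
    # Assumption: behavior types without a dedicated entry fall back to the fixed strategy.
    return DYNAMIC_STRATEGIES.get(behavior_id, FIXED_STRATEGY)
```

The fallback to the fixed strategy is a design assumption of this sketch, chosen so that an unrecognized behavior type is still filtered rather than passed through.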

Therefore, the bad behaviors of the person during the live broadcast are monitored and filtered in real time, which keeps the live broadcast content healthy and positive, avoids bad content, and improves user experience. When the person makes a bad behavior, the virtual anchor does not reproduce it; instead, the first driving strategy (preset text and/or action) is used to drive the virtual anchor, so that the bad behavior of the person is effectively filtered and the legitimate rights and interests of the audience are protected. Underage viewers in particular are shielded from bad information. In addition, reducing bad content improves the reputation of the platform to a certain extent and strengthens users' recognition of and trust in the platform. The method is suitable for filtering various bad behaviors, is flexible and extensible, and can be continuously optimized and upgraded according to actual conditions to meet the requirements of different platforms. In conclusion, the method can effectively improve the live broadcast quality and user experience of the live broadcast platform, protect the interests of the audience and the reputation of the platform, and has high practical value and application prospects.

In some embodiments, the method further comprises: if the behavior of the person is not bad behavior, converting the collected voice information of the person into a driving text, and driving the expression and the mouth shape of the virtual anchor with the driving text; and collecting a real-time image of the person, extracting the driving action of the person from the real-time image, and driving the action of the virtual anchor with the driving action of the person.

The advantage is that when the person does not make bad behavior, the behavior of the virtual anchor stays consistent with that of the person, enabling natural and humanized interaction; when the person makes bad behavior, the virtual anchor acts according to the first driving strategy instead, so that the bad behavior of the person is filtered and the audience is always presented with a positive, healthy and upbeat image.
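The two branches described above (pass the person's behavior through, or replace it with the first driving strategy) can be sketched as follows. The structure `AnchorCommand` and the function `build_command` are hypothetical names introduced for illustration, and speech-to-text conversion and motion extraction are assumed to have already produced `speech_text` and `captured_action`.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AnchorCommand:
    """What the renderer receives for one time slice (hypothetical structure)."""
    text: Optional[str]    # drives expression and mouth shape
    action: Optional[str]  # drives body motion
    filtered: bool         # True when the person's own behavior was replaced


def build_command(is_bad: bool,
                  speech_text: str,
                  captured_action: str,
                  preset_text: Optional[str],
                  preset_action: Optional[str]) -> AnchorCommand:
    """Pass the person's behavior through, or replace it with the preset strategy."""
    if is_bad:
        # Bad behavior: drive the anchor with the first preset text and/or action.
        return AnchorCommand(preset_text, preset_action, filtered=True)
    # Normal behavior: the anchor mirrors the person's speech and motion.
    return AnchorCommand(speech_text, captured_action, filtered=False)
```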

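In some embodiments, in step S102, the manner of acquiring the first driving strategy includes:

acquiring a behavior identifier of the behavior information;

inputting the behavior identifier into a strategy configuration model to obtain the first driving strategy corresponding to the behavior identifier.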

Thus, different driving strategies are provided for different bad behaviors of the person, so that the filtering mechanism is more natural and does not make the audience uncomfortable. Specifically, the behavior information is obtained in real time and strategy matching is performed according to real-time conditions, so that the strategy is executed in a more timely and dynamic manner and meets actual application requirements. By acquiring the behavior identifier and inputting it into the strategy configuration model, the corresponding first driving strategy can be matched more accurately, which improves the accuracy of strategy matching. Based on the algorithm model and the analysis of behavior data, a suitable driving strategy can be worked out quickly and accurately, avoiding tedious manual data processing and analysis and improving the operation efficiency and management level of the virtual anchor platform. Accurately matching the first driving strategy also improves the utilization and operating efficiency of platform resources, thereby reducing platform cost and increasing revenue.

The behavior identifier is used to indicate the behavior type of the bad behavior, and can be obtained by the behavior detection model mentioned below.

Specifically, the behavior information is input into the behavior detection model to obtain a behavior detection result, wherein the behavior detection result is used for indicating whether the behavior is bad behavior, and when the behavior is bad behavior, the behavior detection result also comprises a behavior identifier.

The behavior identifier may be represented by one or more of the following: Chinese characters, digits, letters, and special characters. The behavior identifier may be, for example: "speech abuse", "smoking", "rolling the eyes" or "attacking others".
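One possible way to represent such a detection result in code is sketched below. The class name `BehaviorDetectionResult` and its field names are assumptions for illustration; the rule that a result for non-bad behavior carries no identifier is also an assumption of this sketch, since the text only states that a bad-behavior result includes one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BehaviorDetectionResult:
    """Output of the behavior detection model (field names are assumptions)."""
    is_bad: bool
    behavior_id: Optional[str] = None  # e.g. "smoking"; set only for bad behavior

    def __post_init__(self) -> None:
        # A bad-behavior result must carry a behavior identifier.
        if self.is_bad and not self.behavior_id:
            raise ValueError("bad behavior requires a behavior identifier")
        # Assumption of this sketch: a non-bad result carries no identifier.
        if not self.is_bad and self.behavior_id is not None:
            raise ValueError("non-bad behavior must not carry an identifier")
```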

The strategy configuration model may be a semantic extraction model: for the text information corresponding to the behavior identifier, a pre-trained language model, such as BERT or GPT, is used to extract the semantic features of the input text. Such models learn the structure and semantic information of language by pre-training on a large amount of text, so that the semantic information of the input text can be effectively extracted.

Referring to fig. 2, fig. 2 is a schematic flow chart of driving a virtual anchor according to an embodiment of the present application.

In some embodiments, in the step S102, driving the virtual anchor with the first driving policy includes:

step S201: driving the expression and the mouth shape of the virtual anchor based on the first preset text; and/or

step S202: and driving the action of the virtual anchor based on the first preset action.

Therefore, the expression, the mouth shape and the action of the virtual anchor are driven based on the first preset text and/or the first preset action, so that the virtual anchor simulates human communication and behavior more realistically, which enhances the realism and interactivity of the virtual anchor and improves the immersion and participation of the audience. Adopting the first driving strategy also makes the production of the virtual anchor more automated and standardized, reducing production cost and improving production efficiency. For example, during production, the expression and the action of the virtual anchor only need to be set according to the first preset text or action, without designing and producing every action and expression in detail. The realism, intelligent interaction capability and production efficiency of the virtual anchor are thereby improved, user requirements are better met, and user experience is improved.

In some embodiments, in the step S101, detecting whether the behavior of the person is bad behavior based on the behavior information includes:

for each of the training data, the following processing is performed:

Therefore, a deep learning model is trained and used for recognition, which enables automatic recognition, takes little time, reduces cost, improves efficiency, and identifies bad behaviors more accurately, so that the spread of bad behaviors in cyberspace can be effectively reduced. During training, the weights of the deep learning model are updated according to the labeling data and the prediction data, so that the detection accuracy keeps improving and the model adapts better to various scenes and changes. By detecting bad behaviors, corresponding measures can be taken in time to protect network security and maintain the healthy development of the network environment.

The behavior detection result may be represented by one or more of Chinese, letters, numbers, and symbols, for example, "bad behavior," "Y," "1," "v," etc. may be used to indicate bad behavior, and "not bad behavior," "N," "0," "x," etc. may be used to indicate not bad behavior.
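Because the detection result may be encoded in several equivalent ways, downstream code would typically normalize it to a boolean first. A minimal sketch, assuming the example tokens listed above are the only accepted encodings (the function name `parse_detection_result` is hypothetical):

```python
# Tokens taken from the examples in the text; any other encoding is rejected.
BAD_TOKENS = {"bad behavior", "Y", "1", "v"}
NOT_BAD_TOKENS = {"not bad behavior", "N", "0", "x"}


def parse_detection_result(raw: str) -> bool:
    """Map a textual behavior detection result to True (bad) or False (not bad)."""
    token = raw.strip()
    if token in BAD_TOKENS:
        return True
    if token in NOT_BAD_TOKENS:
        return False
    raise ValueError(f"unknown behavior detection result: {raw!r}")
```

Rejecting unknown tokens instead of defaulting to "not bad" is a deliberate choice in this sketch, so that an unexpected model output is surfaced rather than silently passed through.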

The embodiment of the present application does not limit how the strategy configuration model and the behavior detection model are obtained. In some embodiments, the models may be obtained by training within the application; in other embodiments, models trained in advance may be used.

When each model is obtained through deep learning training, a preset deep learning model corresponding to each model (namely, the initial model of each model) can be obtained by designing a suitable number of neuron computing nodes and a multi-layer operation hierarchy and selecting a suitable input layer and output layer. Through learning and tuning of the deep learning model, a functional relation from input to output is established. Although this functional relation cannot be recovered with 100% accuracy, it can approximate the actual association as closely as possible, so that each trained model can produce the corresponding output data based on the input data.

The deep learning model is trained with the training set corresponding to each model. Modeling can be performed quickly by learning from a small number of samples; the training error of the deep learning model gradually decreases as training continues, and the best weights are saved and loaded; the accuracy on the training set and the verification set is recorded to facilitate tuning of the model parameters; and the model parameters of the deep learning model are updated so that the model fits the data better, generalizes effectively, and improves in robustness and fitting precision.

In some alternative embodiments, data mining may be performed on historical data to obtain the sample data in the training set. That is, the sample data may be collected during real interaction. In addition, the sample data may be automatically generated by the generation network of a GAN model.

The GAN model is a generative adversarial network (Generative Adversarial Network), which is composed of a generation network and a discrimination network. The generation network takes random samples from the latent space as input, and its output needs to imitate the real samples in the training set as closely as possible. The input of the discrimination network is either a real sample or the output of the generation network, and its purpose is to distinguish the output of the generation network from the real samples as well as possible. The generation network, in turn, should deceive the discrimination network as far as possible. The two networks compete with each other and continuously adjust their parameters, with the final goal that the discrimination network cannot tell whether the output of the generation network is real. A large amount of sample data can be generated by using the GAN model and used for the training process of each model, so that the amount of original data to be collected can be effectively reduced, and the cost of data acquisition and labeling is greatly reduced.
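The application does not state a loss function for the GAN. For reference only, the adversarial game described above is commonly written as the standard minimax objective below, where $G$ denotes the generation network, $D$ the discrimination network, $p_{\text{data}}$ the distribution of real samples, and $p_z$ the distribution sampled from the latent space; these symbols are introduced here and do not appear in the application.

```latex
\min_{G}\,\max_{D}\; V(D, G)
  = \mathbb{E}_{x \sim p_{\text{data}}(x)}\big[\log D(x)\big]
  + \mathbb{E}_{z \sim p_{z}(z)}\big[\log\big(1 - D(G(z))\big)\big]
```

The discrimination network maximizes $V$ by assigning high scores to real samples and low scores to generated ones, while the generation network minimizes $V$ by producing samples that the discrimination network scores as real.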

The training process of each model is not limited, and for example, a training mode of supervised learning, a training mode of semi-supervised learning or a training mode of unsupervised learning can be adopted.

When a training mode of supervised learning or semi-supervised learning is adopted, the method for acquiring the annotation data is not limited, and for example, a manual annotation mode or an automatic annotation or semi-automatic annotation mode can be adopted. When the sample data is acquired in the real interaction process, the real data can be acquired from the historical data in a keyword extraction mode to serve as the annotation data.

The embodiment of the application does not limit the training end condition of each model. For example, training may end when the number of training iterations reaches a preset number (for example, 1, 3, 10, 100, 1000, 10000, etc.), when all training data in the training set have been trained on one or more times, or when the total loss value obtained in the current training is not greater than a preset loss value.
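The three example end conditions can be combined into a single check, as in the sketch below. The function name `training_finished`, its parameter names, and the default values are assumptions chosen for illustration.

```python
def training_finished(steps_done: int,
                      epochs_done: int,
                      total_loss: float,
                      max_steps: int = 1000,
                      max_epochs: int = 1,
                      loss_threshold: float = 0.01) -> bool:
    """Return True when any of the example end conditions is met."""
    reached_step_limit = steps_done >= max_steps      # preset number of iterations
    finished_passes = epochs_done >= max_epochs       # training set fully used N times
    loss_small_enough = total_loss <= loss_threshold  # loss not greater than preset value
    return reached_step_limit or finished_passes or loss_small_enough
```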

In some embodiments, the method further comprises:

A frame image in which bad behavior occurs;

a timestamp corresponding to the frame image;

A frame image refers to a static image of a frame in a video, and can be said to be every instant in a video stream. Typically, a video stream is made up of a series of successive frame images. In video applications, frame images are the fundamental unit of video analysis, processing, and encoding, each of which contains complete information in the video stream. For example, in video monitoring applications, the shape, size, color, etc. of an object can be identified by analyzing a frame image, thereby achieving the purposes of detection, tracking, and alarm. In video coding, compression processing is also required for each frame image in order to more effectively store and transmit video data. Therefore, frame images have important significance in video applications.

A time stamp refers to a number or character string representing a certain point in time. It is typically expressed in a particular format, common timestamp formats include ISO 8601, RFC 2822, and the like. The time stamp is often applied to record the exact time that the event occurred, for example, in a computer system, the order of occurrence of the plurality of events may need to be ordered and processed according to the time stamp. Time stamps are also widely used in communications between server and client to ensure the order and accuracy of data transmission.

The present application is not limited to the preset time period, and the preset time period is, for example, 1 minute, 2 minutes, 5 minutes, 10 minutes, or 30 minutes.

Thus, by generating bad behavior records for the person, bad behaviors can be better recorded and managed, which helps with later investigation and handling. By recording and analyzing the bad behaviors, their causes and patterns can be understood in depth, providing an important basis for preventing similar bad behaviors. Through recording and management, bad behaviors can be discovered in time and dealt with more strictly, so that social order and network security are effectively maintained. Recording by means of frame images, time stamps, videos containing the frame images and the like makes the records more accurate and effective. In summary, generating bad behavior records for the person improves the ability to record and manage bad behaviors, provides an important basis for preventing similar bad behaviors, strengthens the crackdown on bad behaviors, and improves the accuracy and effectiveness of the records.
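A bad behavior record holding the items mentioned above (frame image, timestamp, and optionally a video containing the frame) might be modeled as follows. This is a sketch under assumptions: the names `BadBehaviorRecord` and `make_record` are hypothetical, the frame is referenced by index rather than stored as pixels, and the timestamp is serialized as ISO 8601 in UTC.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class BadBehaviorRecord:
    """One bad behavior record (field names are assumptions)."""
    behavior_id: str                 # behavior type, e.g. "smoking"
    frame_index: int                 # which frame image showed the bad behavior
    timestamp: str                   # ISO 8601 string for that frame
    clip_path: Optional[str] = None  # hypothetical path to a video containing the frame


def make_record(behavior_id: str,
                frame_index: int,
                when: datetime,
                clip_path: Optional[str] = None) -> BadBehaviorRecord:
    """Build a record, normalizing the timestamp to UTC in ISO 8601 format."""
    stamp = when.astimezone(timezone.utc).isoformat()
    return BadBehaviorRecord(behavior_id, frame_index, stamp, clip_path)
```

Storing the timestamp in a single normalized format keeps records from different servers sortable by plain string comparison.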

Referring to fig. 3, fig. 3 is a flow chart of another bad behavior filtering method according to an embodiment of the present application.

In some embodiments, the method further comprises:

step S103: when the number of bad behavior records of the person is not less than a first preset number, displaying behavior prompt information on a live broadcasting room management interface so as to prompt the person to adjust his or her behavior;

step S104: when the number of bad behavior records of the person is not less than a second preset number, sending violation prompt information to a terminal device of a platform manager so that the platform manager can urge the person to adjust his or her behavior, wherein the first preset number is smaller than the second preset number.

The embodiment of the application does not limit the first preset number, which is, for example, 1, 2, 3, 5 or 7 records, or the second preset number, which is, for example, 10, 15, 18, 20, 23, 28 or 30 records.

The behavior prompt information may be, for example: "Bad behavior has been detected. Please mind your words and actions."

The violation prompt information may be, for example: "Anchor A has made bad behaviors multiple times; please supervise anchor A in time".
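Steps S103 and S104 amount to a two-level threshold check on the number of records, which can be sketched as follows. The function name `escalation_actions`, the returned action labels, and the default thresholds of 3 and 10 (both taken from the example values above) are assumptions of this sketch.

```python
from typing import List


def escalation_actions(record_count: int,
                       first_preset: int = 3,
                       second_preset: int = 10) -> List[str]:
    """Return the actions triggered by the current number of bad behavior records."""
    if first_preset >= second_preset:
        raise ValueError("the first preset number must be smaller than the second")
    actions: List[str] = []
    if record_count >= first_preset:
        actions.append("show_behavior_prompt")     # step S103: prompt the person
    if record_count >= second_preset:
        actions.append("notify_platform_manager")  # step S104: alert the platform manager
    return actions
```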

In some embodiments, the method further comprises: when the number of bad behavior records of the person is not less than the second preset number, setting the commission proportion of the person based on the number of bad behavior records of the person. The advantage is that the more often the person makes bad behaviors, the lower the commission proportion, which encourages the person to strictly regulate his or her own behavior in order to obtain a higher commission.

Therefore, by setting thresholds on the number of bad behavior records, the bad behaviors of the person can be discovered and managed in time, which helps to maintain the order and network security of the live broadcasting room.

Referring to fig. 4, fig. 4 is a flow chart of another bad behavior filtering method according to an embodiment of the present application.

In some embodiments, the live room corresponding to the virtual anchor is a live room for a preset theme, and the preset theme includes at least one of the following: education, product promotion, fitness, games, and music;

the method further comprises the steps of:

step S105: based on the behavior information, obtaining the correlation degree between the behavior of the person and the preset theme;

step S106: when the correlation is smaller than a preset correlation, driving the virtual anchor by adopting a second driving strategy, wherein the second driving strategy comprises at least one of the following: a second preset text and a second preset action.

The embodiment of the application does not limit the preset correlation degree, and the preset correlation degree can be 30%, 35%, 38%, 43%, 49%, 55%, 60%, 65% or 70%.

The embodiment of the application does not limit the second driving strategy, and the second driving strategy can be automatically generated according to a preset theme or manually configured in advance by a configurator.

The second driving strategy may be automatically generated by using the aforementioned strategy configuration model. Specifically, the theme identifier of the preset theme is input into the strategy configuration model to obtain the second driving strategy corresponding to the preset theme.

When the preset theme is an educational theme, the text of the second driving strategy (i.e., the second preset text) may be: "Studying is tiring, so let's relax for a moment", and the action (i.e., the second preset action) may be "tapping the legs with both hands". When the preset theme is a game theme, the text of the second driving strategy may be: "Whatever thoughts you have about the game, you are welcome to discuss them in the live broadcasting room", and the action may be "lightly tapping the head with a finger".
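Steps S105 and S106 can be illustrated with a threshold check followed by a per-theme lookup. In this sketch the correlation is assumed to be a value in [0, 1] already produced by some detector, the default preset correlation of 0.6 corresponds to the 60% example above, and the names `SECOND_STRATEGIES` and `second_strategy` as well as the theme keys are hypothetical. The texts paraphrase the examples above.

```python
from typing import Dict, Optional, Tuple

# (second preset text, second preset action) per theme identifier (keys are hypothetical).
SECOND_STRATEGIES: Dict[str, Tuple[str, str]] = {
    "education": ("Studying is tiring, so let's relax for a moment",
                  "tap the legs with both hands"),
    "games": ("Share your thoughts about the game in the live broadcasting room",
              "lightly tap the head with a finger"),
}


def second_strategy(theme_id: str,
                    correlation: float,
                    preset_correlation: float = 0.6) -> Optional[Tuple[str, str]]:
    """Return (text, action) when the behavior drifts off-theme, otherwise None."""
    if correlation < preset_correlation:
        # Step S106: behavior is not sufficiently related to the preset theme.
        return SECOND_STRATEGIES.get(theme_id)
    return None
```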

Therefore, by obtaining the correlation between the behavior of the person and the preset theme based on the behavior information, the driving strategy of the virtual anchor can be adjusted in real time so that it better matches the preset theme of the live broadcasting room, improving the theme relevance of the live broadcasting room. Keeping the behavior of the virtual anchor related to the preset theme increases users' sense of identity with and belonging to the live broadcasting room, improves user participation and loyalty, and increases the user stickiness of the live broadcasting room. Driving the virtual anchor with the second driving strategy makes the live broadcasting room more interactive and entertaining, attracts more users to participate in live interaction, and improves user experience and entertainment value. For live broadcasting rooms with different preset themes, different categories of products and services can be promoted in a targeted manner, improving the commercial value and traffic monetization capability of the live broadcasting rooms. In summary, obtaining the correlation between the behavior of the person and the preset theme and driving the virtual anchor with the second driving strategy improves the theme relevance, user stickiness, interactivity and entertainment of the live broadcasting room, and thereby its commercial value.

In some embodiments, obtaining the correlation between the behavior of the person and the preset theme may include:

inputting the behavior information of the person and the theme identifier of the preset theme into a correlation detection model to obtain the correlation between the behavior of the person and the preset theme.

The training process of the correlation detection model comprises the following steps:

acquiring a correlation training set, wherein the correlation training set comprises a plurality of correlation training data, and each correlation training data comprises sample behavior information, a sample theme identifier and labeling data of the correlation between the sample behavior information and the sample theme identifier;

for each correlation training data in the correlation training set, performing the following processing:

inputting the sample behavior information and the sample theme identifier in the correlation training data into a preset deep learning model to obtain prediction data of the correlation between the sample behavior information and the sample theme identifier;

updating model parameters of the deep learning model based on the prediction data and the labeling data of the correlation between the sample behavior information and the sample theme identifier;

detecting whether a preset correlation training end condition is met; if yes, taking the trained deep learning model as the correlation detection model; if not, continuing to train the deep learning model by using the next correlation training data.
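The training steps above follow the usual predict, compare, update, check loop. The sketch below shows that loop with a deliberately tiny stand-in model (a single sigmoid unit trained by gradient descent on squared error) instead of a real deep learning model; the feature encoding of behavior information and theme identifier, the function names, and all hyperparameters are assumptions made for illustration.

```python
import math
from typing import List, Sequence, Tuple

# (encoded behavior + theme features, labeled correlation in [0, 1])
Sample = Tuple[Sequence[float], float]


def predict(weights: List[float], features: Sequence[float]) -> float:
    """Predicted correlation: a sigmoid over a weighted sum of the features."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train_correlation_model(training_set: List[Sample],
                            lr: float = 0.5,
                            max_epochs: int = 200,
                            loss_threshold: float = 1e-3) -> List[float]:
    """Train the stand-in model until an end condition is met."""
    weights = [0.0] * len(training_set[0][0])
    for _ in range(max_epochs):                    # end condition: preset number of passes
        total_loss = 0.0
        for features, label in training_set:
            pred = predict(weights, features)      # prediction data
            error = pred - label                   # compare with labeling data
            total_loss += error * error
            # Gradient of the squared error through the sigmoid.
            grad_scale = 2.0 * error * pred * (1.0 - pred)
            weights = [w - lr * grad_scale * x for w, x in zip(weights, features)]
        if total_loss <= loss_threshold:           # end condition: loss small enough
            break
    return weights
```

A possible usage, with two toy samples whose first feature acts as a bias term:

```python
toy_set = [([1.0, 1.0, 0.0], 0.9), ([1.0, 0.0, 1.0], 0.1)]
weights = train_correlation_model(toy_set)
on_theme = predict(weights, [1.0, 1.0, 0.0])   # moves toward 0.9 during training
off_theme = predict(weights, [1.0, 0.0, 1.0])  # moves toward 0.1 during training
```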

In some embodiments, the preset theme is an educational theme for minors, and the method further comprises:

The third preset number is not limited in the embodiment of the present application, and the third preset number may be 3, 5, 8, 9, 12, 16, 21 or 30, for example.

Therefore, the bullet screens of the live broadcasting room are obtained and detected in real time and a bullet-screen filtering pop-up window is opened, so that the filtering of bad speech can be configured in the live broadcasting room management interface, which effectively reduces the amount of bad speech in the live broadcasting room and protects the healthy growth of minors. By opening the bullet-screen filtering pop-up window, the person can filter bad speech in the live broadcasting room management interface, which strengthens the control of bad speech and improves the safety and standardization of the live broadcasting room. Filtering bad comments also improves the interactivity and positivity of the live broadcasting room, attracts more users to participate in live interaction, and improves user experience and entertainment value. Reducing the amount of bad speech in the live broadcasting room improves users' trust in and acceptance of the live broadcasting room, further increases their attention to and support for its theme, and increases user stickiness. In summary, obtaining and detecting the bullet screens of the live broadcasting room in real time and opening the bullet-screen filtering pop-up window allows bad comments to be filtered in the live broadcasting room management interface, reduces the amount of bad speech, protects the healthy growth of minors, strengthens the control of bad speech, and improves the interactivity, positivity and trustworthiness of the live broadcasting room.

Wherein detecting whether each real-time bullet screen is bad speech includes:

inputting the real-time bullet screen into a semantic detection model to obtain a semantic detection result of the real-time bullet screen, wherein the semantic detection result is used for indicating whether the real-time bullet screen is bad speech. It should be noted that the semantic detection model may be obtained by a training method similar to that of the behavior detection model.
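The exact trigger condition for the bullet-screen filtering pop-up window is not spelled out in this text. The sketch below assumes that the pop-up window is opened once the number of real-time bullet screens (barrages) detected as bad speech reaches the third preset number. The semantic detection model is replaced by a trivial keyword stand-in, and all names and placeholder words are hypothetical.

```python
from typing import Callable, Iterable


def keyword_detector(text: str) -> bool:
    """Stand-in for the semantic detection model (placeholder keywords only)."""
    return any(word in text.lower() for word in ("idiot", "stupid"))


def should_open_filter_popup(bullet_screens: Iterable[str],
                             is_bad_speech: Callable[[str], bool],
                             third_preset_number: int = 5) -> bool:
    """Assumed rule: open the pop-up once enough bad bullet screens are seen."""
    bad_count = 0
    for text in bullet_screens:
        if is_bad_speech(text):
            bad_count += 1
            if bad_count >= third_preset_number:
                return True  # threshold reached, no need to scan further
    return False
```

Passing the detector in as a callable keeps the counting logic independent of whichever semantic detection model is actually used.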

In a specific application scenario, the embodiment of the application further provides a bad behavior filtering method, which is used for providing a live broadcast function by using a virtual anchor, wherein the virtual anchor adopts a person driver, and the method comprises the following steps:

The method for acquiring the first driving strategy comprises the following steps:

acquiring a behavior identifier of the behavior information;

The driving the virtual anchor by adopting the first driving strategy comprises the following steps:

and driving the action of the virtual anchor based on the first preset action.

The detecting whether the behavior of the person is bad behavior based on the behavior information comprises:

for each of the training data, the following processing is performed:

The method further comprises the steps of:

a frame image in which bad behavior occurs;

a timestamp corresponding to the frame image;

The method further comprises the steps of:

The live broadcasting room corresponding to the virtual host is a live broadcasting room aiming at a preset theme, and the preset theme comprises at least one of the following: education, product promotion, fitness, games, and music;

the method further comprises the steps of:

The preset theme is an educational theme for minors, and the method further comprises:

(electronic device)

The embodiment of the application also provides an electronic device. Its specific embodiments and achieved technical effects are consistent with those described in the method embodiments above, and some of the content is not repeated here.

The electronic device is used for providing a live broadcast function with a virtual anchor, and comprises a memory storing a computer program and at least one processor configured to implement the following steps when executing the computer program:

In some embodiments, the at least one processor is configured to obtain the first drive strategy when executing the computer program by:

acquiring a behavior identifier of the behavior information;

In some embodiments, the at least one processor is configured to drive the virtual anchor with a first drive policy when executing the computer program in the following manner:

and driving the action of the virtual anchor based on the first preset action.

In some embodiments, the at least one processor is configured to detect whether the person's behavior is bad behavior based on the behavior information when executing the computer program in the following manner:

For each of the training data, the following processing is performed:

In some embodiments, the at least one processor is configured to execute the computer program to further implement the steps of:

a frame image in which bad behavior occurs;

a timestamp corresponding to the frame image;

In some embodiments, the preset theme is an educational theme for minors, and the at least one processor is configured to execute the computer program to further implement the steps of:

Referring to fig. 5, fig. 5 is a block diagram of an electronic device 10 according to an embodiment of the present application.

The electronic device 10 may for example comprise at least one memory 11, at least one processor 12 and a bus 13 connecting the different platform systems.

Memory 11 may include readable media in the form of volatile memory, such as Random Access Memory (RAM) 111 and/or cache memory 112, and may further include Read Only Memory (ROM) 113.

The memory 11 also stores a computer program executable by the processor 12 to cause the processor 12 to implement the steps of any of the methods described above.

Memory 11 may also include utility 114 having at least one program module 115, such program modules 115 include, but are not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.

Accordingly, the processor 12 may execute the computer programs described above, as well as may execute the utility 114.

The processor 12 may employ one or more application-specific integrated circuits (ASIC, Application Specific Integrated Circuit), digital signal processors (DSP), programmable logic devices (PLD, Programmable Logic Device), complex programmable logic devices (CPLD, Complex Programmable Logic Device), field-programmable gate arrays (FPGA, Field-Programmable Gate Array), or other electronic components.

Bus 13 may represent one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor bus, or a local bus using any of a variety of bus architectures.

The electronic device 10 may also communicate with one or more external devices such as a keyboard, a pointing device, a Bluetooth device, etc., with one or more devices capable of interacting with the electronic device 10, and/or with any device (e.g., a router, a modem, etc.) that enables the electronic device 10 to communicate with one or more other computing devices. Such communication may be via the input-output interface 14. Also, the electronic device 10 may communicate with one or more networks, such as a Local Area Network (LAN), a Wide Area Network (WAN) and/or a public network such as the Internet, through a network adapter 15. The network adapter 15 may communicate with other modules of the electronic device 10 via the bus 13. It should be appreciated that, although not shown, other hardware and/or software modules may be used in connection with the electronic device 10 in actual applications, including, but not limited to: microcode, device drivers, redundant processors, external disk drive arrays, RAID systems, tape drives, data backup storage platforms, and the like.

(computer-readable storage Medium)

The embodiment of the application also provides a computer readable storage medium. Its specific embodiments and achieved technical effects are consistent with those described in the method embodiments above, and some of the content is not repeated here.

The computer readable storage medium stores a computer program which, when executed by at least one processor, performs the steps of any of the methods or performs the functions of any of the electronic devices described above.

Fig. 6 is a schematic structural diagram of a program product according to an embodiment of the present application.

The program product is used to implement the steps of any of the methods described above or the functions of any of the electronic devices described above. The program product may take the form of a portable compact disc read-only memory (CD-ROM) that includes program code, and may be run on a terminal device such as a personal computer. However, the program product of the present application is not limited thereto. In the embodiments of the present application, the readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.

The computer readable storage medium may include a data signal propagated in baseband or as part of a carrier wave, with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination of the foregoing. A readable storage medium may also be any readable medium that can transmit, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable storage medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, or any suitable combination of the foregoing. Program code for carrying out operations of the present application may be written in any combination of one or more programming languages, including object oriented programming languages such as Java or C++ and conventional procedural programming languages such as the C programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. Where a remote computing device is involved, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (e.g., via the Internet using an Internet service provider).

The present application has been described in terms of its purpose of use, performance, advancement, and novelty, and accordingly meets the requirements of functional enhancement and use emphasized by the patent law. The above description and drawings are merely preferred embodiments of the present application and do not limit it. Therefore, all structures, devices, and features that are similar or identical to those of the present application, that is, all equivalent substitutions or modifications made within the scope of the present application, shall fall within the scope of protection of the present application.

Claims

1. A bad behavior filtering method for providing a live broadcast function by using a virtual anchor, the virtual anchor being driven by a person, the method comprising:

2. The bad behavior filtering method according to claim 1, wherein the manner of obtaining the first driving policy comprises:

acquiring a behavior identifier of the behavior information;

3. The bad behavior filtering method according to claim 2, wherein driving the virtual anchor with the first driving policy comprises:

and driving the action of the virtual anchor based on the first preset action.

4. The bad behavior filtering method according to claim 3, wherein the detecting whether the behavior of the person in question is a bad behavior based on the behavior information comprises:

for each piece of the training data, the following processing is performed:

5. The bad behavior filtering method according to claim 1, wherein the method further comprises:

a frame image in which bad behavior occurs;

a timestamp corresponding to the frame image;

6. The bad behavior filtering method according to claim 5, further comprising:

7. The bad behavior filtering method according to claim 1, wherein the live room corresponding to the virtual anchor is a live room for a preset theme, and the preset theme includes at least one of the following: education, product promotion, fitness, games, and music;

the method further comprises the steps of:

8. The bad behavior filtering method according to claim 7, wherein the preset theme is an education theme for minors, the method further comprising:

9. An electronic device for providing a live broadcast function by using a virtual anchor, the electronic device comprising a memory and at least one processor, the memory storing a computer program, the at least one processor being configured to implement the following steps when executing the computer program:

10. A computer-readable storage medium, characterized in that it stores a computer program which, when executed by at least one processor, implements the steps of the method of any of claims 1-8 or implements the functionality of the electronic device of claim 9.