CN115423909A - Virtual character generation method, device, equipment and computer readable storage medium

Virtual character generation method, device, equipment and computer readable storage medium

Info

Publication number: CN115423909A
Application number: CN202211003831.5A
Authority: CN (China)
Prior art keywords: virtual character, user, current, screenshot, time period
Legal status: Pending
Other languages: Chinese (zh)
Inventor: 孙思凯 (Sun Sikai)
Current Assignee: Shenzhen Skyworth RGB Electronics Co Ltd
Original Assignee: Shenzhen Skyworth RGB Electronics Co Ltd
Application filed by Shenzhen Skyworth RGB Electronics Co Ltd
Priority to CN202211003831.5A
Publication of CN115423909A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T13/00 Animation
    • G06T13/20 3D [Three Dimensional] animation
    • G06T13/40 3D [Three Dimensional] animation of characters, e.g. humans, animals or virtual beings
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/02 Feature extraction for speech recognition; Selection of recognition unit

Abstract

The invention discloses a virtual character generation method, device, equipment and computer readable storage medium, wherein the method includes the following steps: acquiring a screenshot and audio corresponding to a program played within a preset time period; recognizing the screenshot and the audio through a pre-established recognition model to obtain a current recognition result, and determining the user's usage habit according to the current recognition result and historical recognition results; and determining a virtual character image according to the user's usage habit, and generating a virtual character according to the virtual character image. Because the recognition model identifies the screenshot and audio of the program played within the preset time period to determine the user's usage habit, and the virtual character image is determined from that habit before the virtual character is generated, the virtual character meets the user's needs, which increases the user's willingness to interact with it and improves the user experience.

Description

Virtual character generation method, device, equipment and computer readable storage medium
Technical Field
The invention relates to the technical field of televisions, in particular to a virtual character generation method, a virtual character generation device, virtual character generation equipment and a computer readable storage medium.
Background
With the development of television technology, televisions are now provided with a virtual character that can announce information by voice or in other forms at certain times. However, the current virtual character has a fixed default appearance that cannot be changed, so it does not meet the user's needs: when interacting with the default virtual character, the user easily grows tired of it, the user's willingness to interact declines, and the user experience suffers.
Therefore, how to generate a virtual character image that improves the user's willingness to interact is a problem that urgently needs to be solved.
Disclosure of Invention
The invention mainly aims to provide a virtual character generation method, device, equipment and computer readable storage medium, so as to solve the problem of generating a virtual character image that improves the user's willingness to interact.
In order to achieve the above object, the present invention provides a virtual character generation method, including the steps of:
acquiring a screenshot and audio corresponding to a program played within a preset time period;
recognizing the screenshot and the audio through a pre-established recognition model to obtain a current recognition result, and determining the user's usage habit according to the current recognition result and historical recognition results;
and determining a virtual character image according to the user's usage habit, and generating a virtual character according to the virtual character image.
Optionally, the step of obtaining a screenshot and an audio corresponding to a program played in a preset time period includes:
acquiring a current time period, and comparing the current time period with a preset time period;
and if the current time period is within the preset time period, performing screenshot operation and recording operation on the program played in the current time period according to the preset acquisition frequency to obtain a screenshot and an audio corresponding to the program played in the preset time period.
Optionally, the step of recognizing the screenshot and the audio through a pre-established recognition model to obtain a current recognition result includes:
extracting key pixel points in the screenshot through a pre-established recognition model, and determining first scene information corresponding to the screenshot according to the key pixel points;
extracting voiceprint features corresponding to the audio through a pre-established recognition model, and determining second scene information corresponding to the audio according to the voiceprint features;
and determining a current identification result according to the first scene information and the second scene information.
Optionally, before the step of determining the usage habit of the user according to the current recognition result and the historical recognition result, the method includes:
acquiring a scene proportion set of the current recognition result, and comparing the scene proportion set with a preset confidence level;
if a scene proportion greater than the preset confidence level exists in the scene proportion set, storing the current recognition result, and executing the step of: determining the usage habit of the user according to the current recognition result and the historical recognition result;
if no scene proportion greater than the preset confidence level exists in the scene proportion set, deleting the current recognition result, and re-executing the step of: acquiring a screenshot and audio corresponding to the program played in a preset time period.
Optionally, the step of determining the usage habit of the user according to the current recognition result and the historical recognition result includes:
acquiring a preset number of previous historical recognition results corresponding to the current recognition result, and calculating the similarity between the current recognition result and those historical recognition results;
and if the similarity between the current recognition result and each of the preset number of previous historical recognition results is greater than a preset similarity threshold, determining the usage habit of the user according to the current recognition result and those historical recognition results.
Optionally, the step of determining a virtual character image according to the user's usage habit and generating a virtual character according to the virtual character image includes:
inputting the user's usage habit into a pre-established virtual character image management module, and determining a virtual character image according to the usage habit through the virtual character image management module;
acquiring a user image corresponding to the user's usage habit;
and sending the virtual character image and the user image to a cloud virtual character production end, generating a virtual character through the cloud virtual character production end according to the virtual character image and the user image, and transmitting the virtual character back to the virtual character image management module.
Optionally, after the step of determining a virtual character image according to the user's usage habit and generating a virtual character according to the virtual character image, the method includes:
and monitoring push information, and displaying the push information to a user through the virtual character.
Further, to achieve the above object, the present invention provides a virtual character generation apparatus including:
the acquisition module is used for acquiring a screenshot and audio corresponding to a program played within a preset time period;
the recognition module is used for recognizing the screenshot and the audio through a pre-established recognition model to obtain a current recognition result, and determining the user's usage habit according to the current recognition result and historical recognition results;
and the generating module is used for determining a virtual character image according to the user's usage habit and generating a virtual character according to the virtual character image.
Further, the obtaining module is further configured to:
acquiring a current time period, and comparing the current time period with a preset time period;
and if the current time period is within the preset time period, performing screenshot operation and recording operation on the program played in the current time period according to the preset acquisition frequency to obtain a screenshot and an audio corresponding to the program played in the preset time period.
Further, the identification module is further configured to:
extracting key pixel points in the screenshot through a pre-established recognition model, and determining first scene information corresponding to the screenshot according to the key pixel points;
extracting voiceprint features corresponding to the audio through a pre-established recognition model, and determining second scene information corresponding to the audio according to the voiceprint features;
and determining a current identification result according to the first scene information and the second scene information.
Further, the identification module is further configured to:
acquiring a scene proportion set of the current recognition result, and comparing the scene proportion set with a preset confidence level;
if a scene proportion greater than the preset confidence level exists in the scene proportion set, storing the current recognition result, and executing the step of: determining the usage habit of the user according to the current recognition result and the historical recognition result;
if no scene proportion greater than the preset confidence level exists in the scene proportion set, deleting the current recognition result, and re-executing the step of: acquiring a screenshot and audio corresponding to the program played in a preset time period.
Preferably, the identification module further comprises a determination module configured to:
acquiring a preset number of previous historical recognition results corresponding to the current recognition result, and calculating the similarity between the current recognition result and those historical recognition results;
and if the similarity between the current recognition result and each of the preset number of previous historical recognition results is greater than a preset similarity threshold, determining the usage habit of the user according to the current recognition result and those historical recognition results.
Preferably, the generating module is further configured to:
inputting the user's usage habit into a pre-established virtual character image management module, and determining a virtual character image according to the usage habit through the virtual character image management module;
acquiring a user image corresponding to the user's usage habit;
and sending the virtual character image and the user image to a cloud virtual character production end, generating a virtual character through the cloud virtual character production end according to the virtual character image and the user image, and transmitting the virtual character back to the virtual character image management module.
Preferably, the generating module further comprises a presentation module, the presentation module is configured to:
and monitoring push information, and displaying the push information to a user through the virtual character.
Further, to achieve the above object, the present invention also provides a virtual character generation apparatus including: a memory, a processor, and a virtual character generation program stored on the memory and executable on the processor, the virtual character generation program, when executed by the processor, implementing the steps of the virtual character generation method as described above.
Further, to achieve the above object, the present invention also provides a computer-readable storage medium having stored thereon a virtual character generation program which, when executed by a processor, realizes the steps of the virtual character generation method as described above.
The virtual character generation method provided by the invention acquires a screenshot and audio corresponding to a program played within a preset time period; recognizes the screenshot and the audio through a pre-established recognition model to obtain a current recognition result, and determines the user's usage habit according to the current recognition result and historical recognition results; and determines a virtual character image according to the user's usage habit and generates a virtual character according to the virtual character image. Because the recognition model identifies the screenshot and audio of the program played within the preset time period to determine the user's usage habit, and the virtual character image is determined from that habit before the virtual character is generated, the virtual character meets the user's needs, which increases the user's willingness to interact with it and improves the user experience.
Drawings
FIG. 1 is a schematic diagram of an apparatus architecture of a hardware operating environment according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart illustrating a first embodiment of a method for generating a virtual character according to the present invention;
FIG. 3 is a flowchart illustrating a second embodiment of a method for generating a virtual character according to the present invention;
FIG. 4 is a schematic structural diagram of a virtual character generating apparatus according to the present invention.
The implementation, functional features and advantages of the objects of the present invention will be further explained with reference to the accompanying drawings.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
As shown in fig. 1, fig. 1 is a schematic device structure diagram of a hardware operating environment according to an embodiment of the present invention.
The device of the embodiment of the invention can be a PC or a server device.
As shown in fig. 1, the apparatus may include: a processor 1001 (e.g. a CPU), a network interface 1004, a user interface 1003, a memory 1005 and a communication bus 1002. The communication bus 1002 is used to implement connection communication among these components. The user interface 1003 may include a Display screen (Display) and an input unit such as a Keyboard (Keyboard); optionally, the user interface 1003 may also include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a WI-FI interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory (e.g., a magnetic disk memory). The memory 1005 may alternatively be a storage device separate from the processor 1001 described previously.
Those skilled in the art will appreciate that the configuration of the apparatus shown in fig. 1 is not intended to be limiting of the apparatus and may include more or fewer components than those shown, or some components may be combined, or a different arrangement of components.
As shown in fig. 1, a memory 1005, which is a kind of computer storage medium, may include therein an operating system, a network communication module, a user interface module, and a virtual character generation program.
The operating system is a program for managing and controlling the virtual character generation equipment and its software resources, and supports the operation of the network communication module, the user interface module, the virtual character generation program and other programs or software; the network communication module is used for managing and controlling the network interface 1004; the user interface module is used to manage and control the user interface 1003.
In the virtual character generation apparatus shown in fig. 1, the apparatus calls, through the processor 1001, the virtual character generation program stored in the memory 1005 and performs the operations in the embodiments of the virtual character generation method described below.
Based on the hardware structure, the embodiment of the virtual character generation method is provided.
Referring to fig. 2, fig. 2 is a schematic flowchart of a first embodiment of a method for generating a virtual character according to the present invention, where the method includes:
Step S10, acquiring a screenshot and audio corresponding to a program played within a preset time period;
Step S20, recognizing the screenshot and the audio through a pre-established recognition model to obtain a current recognition result, and determining the user's usage habit according to the current recognition result and historical recognition results;
and Step S30, determining a virtual character image according to the user's usage habit, and generating a virtual character according to the virtual character image.
The virtual character generation method is applied to intelligent equipment, including smart televisions, intelligent terminals, PC terminals and the like; for convenience of description, a smart television is taken as an example. When a user watches a program, the smart television acquires the current time period and compares it with a preset time period; if the current time period is within the preset time period, it performs screenshot and recording operations on the program played in the current time period according to a preset acquisition frequency to obtain the screenshot and audio corresponding to the program played within the preset time period. The smart television extracts key pixel points in the screenshot through a pre-established recognition model and determines first scene information corresponding to the screenshot according to the key pixel points; it extracts voiceprint features corresponding to the audio through the recognition model and determines second scene information corresponding to the audio according to the voiceprint features; it then determines a current recognition result according to the first scene information and the second scene information. The smart television acquires a preset number of previous historical recognition results corresponding to the current recognition result and calculates the similarity between the current recognition result and those historical recognition results; if the similarity is greater than a preset similarity threshold, it determines the user's usage habit according to the current recognition result and those historical recognition results. The smart television inputs the usage habit into a pre-established virtual character image management module, which determines a virtual character image according to the usage habit; the smart television then acquires a user image corresponding to the usage habit and generates a virtual character according to the virtual character image and the user image. It should be noted that the preset time period is set in the smart television in advance and can be adjusted by the user according to his or her own needs, and the user's usage habit is the type of program the user spends the most time watching.
The virtual character generation method of this embodiment acquires a screenshot and audio corresponding to a program played within a preset time period; recognizes the screenshot and the audio through a pre-established recognition model to obtain a current recognition result, and determines the user's usage habit according to the current recognition result and historical recognition results; and determines a virtual character image according to the user's usage habit and generates a virtual character according to the virtual character image. Because the recognition model identifies the screenshot and audio of the program played within the preset time period to determine the user's usage habit, and the virtual character image is determined from that habit before the virtual character is generated, the virtual character meets the user's needs, which increases the user's willingness to interact with it and improves the user experience.
The respective steps will be described in detail below:
Step S10, acquiring a screenshot and audio corresponding to a program played within a preset time period;
In this embodiment, while the user watches a television program, the smart television acquires the screenshot and audio corresponding to the program played within a preset time period. Optionally, the preset time period is preset in the smart television; generally speaking, it is 24 hours, and the user can also set it according to his or her own needs. Optionally, the smart television counts the periods during which the user uses the television and determines the preset time period accordingly. For example, if eight to ten each night is the period during which the user uses the television most, the smart television can take eight to ten at night as the preset time period. As another example, if the user uses the television throughout the day from July to August and from January to February, the smart television can take the full 24 hours as the preset time period during July to August and January to February. This ensures that the smart television obtains more screenshots and audio of reference value without wasting its computing resources.
It can be understood that the smart television is provided with a screenshot module and a recording module; when the smart television detects that the user is watching programs within the preset time period, it starts the screenshot module to capture screenshots of the played program and starts the recording module to record it, obtaining the audio corresponding to the program.
Further, step S10 includes:
Step a, acquiring the current time period, and comparing the current time period with the preset time period;
In this step, after the smart television is started, it acquires the current time period and compares it with the preset time period to determine whether the current time period falls within the preset time period, and thus whether to acquire the screenshot and audio corresponding to the played program. Preferably, when the preset time period is the full 24 hours, the smart television directly acquires the screenshot and audio corresponding to the played program after being started, without acquiring the current time period.
Step b, if the current time period is within the preset time period, performing screenshot and recording operations on the program played in the current time period according to a preset acquisition frequency to obtain the screenshot and audio corresponding to the program played within the preset time period.
In this step, if the smart television determines that the current time period is within the preset time period, it starts the screenshot module and the recording module: the screenshot module captures the program played in the current time period according to the preset acquisition frequency to obtain the screenshots of the program played within the preset time period, and the recording module records the program played in the current time period according to the preset acquisition frequency to obtain the audio of the program played within the preset time period. Preferably, the preset acquisition frequency is set in the smart television in advance and is generally one capture every 330 ms, that is, a screenshot and recording operation is performed on the currently played program every 330 ms to acquire the corresponding screenshot and audio; optionally, the preset acquisition frequency can be adjusted according to the user's needs.
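This capture logic amounts to a timed sampling loop. The following Python code is only a minimal sketch of the flow described above: the capture_screenshot and record_audio_chunk stubs are hypothetical stand-ins for the TV's screenshot and recording modules (which the disclosure does not specify), and the 20:00 to 22:00 window reflects the "eight to ten at night" example.

    import time
    from datetime import datetime

    CAPTURE_INTERVAL_S = 0.33  # preset acquisition frequency: one sample every 330 ms


    def capture_screenshot() -> bytes:
        """Stub for the screenshot module (hypothetical)."""
        return b"<frame>"


    def record_audio_chunk(duration_s: float) -> bytes:
        """Stub for the recording module (hypothetical)."""
        return b"<audio>"


    def within_preset_period(start_hour: int, end_hour: int) -> bool:
        """True if the current time falls inside the preset time period."""
        return start_hour <= datetime.now().hour < end_hour


    def capture_samples(start_hour: int = 20, end_hour: int = 22) -> list[tuple[bytes, bytes]]:
        """Collect (screenshot, audio) pairs while the preset time period is active."""
        samples = []
        while within_preset_period(start_hour, end_hour):
            samples.append((capture_screenshot(), record_audio_chunk(CAPTURE_INTERVAL_S)))
            time.sleep(CAPTURE_INTERVAL_S)  # wait until the next acquisition tick
        return samples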
Step S20, recognizing the screenshot and the audio through a pre-established recognition model to obtain a current recognition result, and determining the user's usage habit according to the current recognition result and historical recognition results;
In this embodiment, after acquiring the screenshot and audio of the program played within the preset time period, the smart television inputs them into a pre-established recognition model, which recognizes the screenshot and the audio respectively to obtain a current recognition result; the smart television then acquires the historical recognition results corresponding to the current recognition result and determines the user's usage habit from the two. Preferably, the smart television feeds the screenshot and audio into the recognition model while they are still being acquired, so that the model recognizes the input as it arrives and can output the current recognition result as soon as acquisition for the preset time period finishes; recognizing during acquisition in this way improves recognition efficiency. Optionally, the smart television instead inputs all screenshots and audio into the recognition model after acquisition for the preset time period completes, and then obtains the current recognition result.
It should be noted that each time the smart television obtains a recognition result, it judges whether the result is valid and stores valid results in the chronological order in which they were obtained, yielding the historical recognition results.
Specifically, the step of recognizing the screenshot and the audio through a pre-established recognition model to obtain a current recognition result includes:
step c, extracting key pixel points in the screenshot through a pre-established identification model, and determining first scene information corresponding to the screenshot according to the key pixel points;
in this step, the smart television extracts key pixel points in the screenshot through a pre-created identification model, and determines first scene information corresponding to the screenshot according to the key pixel points, it can be understood that some pixel points in the screenshot record important information, for example, in a program screenshot of live NBA, a symbolic NBA icon generally exists in the upper left corner of the screenshot, and objects such as basketball stands and basketball stands still exist in the screenshot, the pixel points used for expressing the icons or objects in the screenshot are the key pixel points, and the identification module determines the first scene information corresponding to the screenshot according to the key pixel points, where the first scene information is what program type the program corresponding to the screenshot belongs to, such as basketball programs, news programs, movie programs, and the like.
Step d, extracting voiceprint features corresponding to the audio through the pre-established recognition model, and determining second scene information corresponding to the audio according to the voiceprint features;
In this step, the smart television extracts the voiceprint features corresponding to the audio through the pre-established recognition model and determines second scene information corresponding to the audio according to the voiceprint features. It can be understood that the voiceprint features record the key information of the audio: the recognition model extracts them and infers the second scene information from them. For example, if the voiceprint features match those of a piano, the second scene information may be a piano performance program; if they match the sound of a basketball hitting the floor, the second scene information may be a basketball program.
Further, when the recognition model detects speech in the audio, it can convert the speech into text and determine the second scene information by recognizing both the text and the voiceprint features.
And e, determining a current recognition result according to the first scene information and the second scene information.
In this step, after the smart television obtains the first scene information and the second scene information by recognizing the screenshot and audio of the program played within the preset time period, it determines the current recognition result from the two. It can be understood that the first and second scene information each cover a number of different scenes; the smart television counts these scenes and determines the current recognition result as the proportion of each scene, for example: basketball program 80%, news program 10%, music program 5%, and science and education program 5%.
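Steps c through e can be pictured as two classifier branches whose per-sample outputs are tallied into scene proportions. The Python sketch below is purely illustrative: the disclosure does not specify the recognition model's internals, so classify_screenshot and classify_audio are hypothetical placeholders for the key-pixel branch and the voiceprint branch.

    from collections import Counter


    def classify_screenshot(screenshot: bytes) -> str:
        # Placeholder: a real model would locate key pixel points (an NBA icon,
        # a basketball hoop, ...) and return the first scene information.
        return "basketball"


    def classify_audio(audio: bytes) -> str:
        # Placeholder: a real model would extract voiceprint features (piano timbre,
        # a ball hitting the floor, recognized speech) and return the second scene
        # information.
        return "basketball"


    def current_recognition_result(samples: list[tuple[bytes, bytes]]) -> dict[str, float]:
        """Tally per-sample scene labels into a scene-proportion set."""
        labels = []
        for screenshot, audio in samples:
            labels.append(classify_screenshot(screenshot))
            labels.append(classify_audio(audio))
        counts = Counter(labels)
        total = sum(counts.values())
        return {scene: count / total for scene, count in counts.items()}

With real classifiers the result would spread across program types, e.g. {"basketball": 0.80, "news": 0.10, "music": 0.05, "science_education": 0.05}, matching the example above.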
Specifically, the step of determining the user's usage habit according to the current recognition result and the historical recognition results includes:
Step f, acquiring a preset number of previous historical recognition results corresponding to the current recognition result, and calculating the similarity between the current recognition result and those historical recognition results;
In this step, the smart television calculates the similarity between the current recognition result and the preset number of previous historical recognition results. It can be understood that the preset number is set in the smart television in advance and is generally 2: when the smart television obtains the current recognition result, it fetches the two previous historical recognition results and calculates the similarity between the current result and each of them. For example, if the smart television has obtained historical recognition results for July 20 through July 25 and obtains the current recognition result on July 26, then with a preset number of 2 it fetches the historical recognition results of July 24 and July 25 and calculates their similarity to the current recognition result obtained on July 26.
And g, if the similarity between the current recognition result and each of the preset number of previous historical recognition results is greater than a preset similarity threshold, determining the user's usage habit according to the current recognition result and those historical recognition results.
In this step, the smart television compares the calculated similarities with a preset similarity threshold, and if the similarity between the current recognition result and the preset number of previous historical recognition results is greater than the threshold, it determines the user's usage habit from them. For example: the current recognition result is basketball program 80%, news program 10%, music program 10%, and the two historical recognition results are basketball program 82%, news program 10%, music program 8% and basketball program 80%, news program 12%, music program 8%; from these the smart television can determine that the user's habit is watching basketball programs.
It can be understood that the smart television determines the user's usage habit only when the similarity across several consecutive recognition results exceeds the preset similarity threshold, which prevents an occasional outlier recognition result from making the determined usage habit inaccurate.
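The disclosure does not name a similarity measure or a threshold value, so the sketch below uses cosine similarity between scene-proportion vectors and a 0.99 threshold purely as an illustration; only the preset number of 2 and the example proportions come from the text above.

    import math


    def cosine_similarity(a: dict[str, float], b: dict[str, float]) -> float:
        """Cosine similarity between two scene-proportion sets."""
        scenes = set(a) | set(b)
        dot = sum(a.get(s, 0.0) * b.get(s, 0.0) for s in scenes)
        norm_a = math.sqrt(sum(v * v for v in a.values()))
        norm_b = math.sqrt(sum(v * v for v in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


    def determine_usage_habit(
        current: dict[str, float],
        history: list[dict[str, float]],
        preset_number: int = 2,
        threshold: float = 0.99,
    ) -> str | None:
        """Return the dominant scene if the current result agrees with the
        preset number of previous historical results, else None."""
        recent = history[-preset_number:]
        if len(recent) < preset_number:
            return None
        if any(cosine_similarity(current, past) <= threshold for past in recent):
            return None  # an outlier result; do not update the habit
        return max(current, key=current.get)  # dominant scene, e.g. "basketball"


    current = {"basketball": 0.80, "news": 0.10, "music": 0.10}
    history = [
        {"basketball": 0.82, "news": 0.10, "music": 0.08},
        {"basketball": 0.80, "news": 0.12, "music": 0.08},
    ]
    print(determine_usage_habit(current, history))  # basketball

With the example numbers from this step, both pairwise similarities are above 0.99, so the habit "basketball" is confirmed.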
And Step S30, determining a virtual character image according to the user's usage habit, and generating a virtual character according to the virtual character image.
In this embodiment, after determining the user's usage habit, the smart television produces a corresponding virtual character image according to that habit and generates a virtual character according to the virtual character image.
Furthermore, since a family has more than one member, the smart television can bind usage habits to the corresponding family members; for example, in one family the father's usage habit is watching basketball programs while the child's is watching animation programs. The smart television can then determine the corresponding virtual character image for each family member's usage habit and generate a virtual character from it.
Specifically, step S30 includes:
step h, inputting the user use habits into a pre-established virtual character management module, and determining a virtual character according to the user use habits through the virtual character management module;
step i, acquiring a user image corresponding to the use habit of the user;
step j, sending the virtual character image and the user image to a cloud virtual character making end, generating a virtual character through the cloud virtual character making end according to the virtual character image and the user image, and transmitting the virtual character back to the virtual character image management module.
In the steps h to j, the intelligent television inputs the use habits of the user into a pre-established virtual character management module, the virtual character management module determines a virtual character according to the use habits of the user, for example, the use habits of the user like watching basketball programs, and the virtual character management module determines that the virtual character is a basketball character; the intelligent television acquires a user image corresponding to the use habit of the user through the camera module, sends the virtual character image and the user image to the cloud virtual character making end, acquires the information of the five sense organs of the user through the cloud virtual character making end according to the user image, generates a virtual character according to the information of the virtual character image and the five sense organs of the user, and then transmits the virtual character back to the virtual character image management module for the intelligent television to use. It should be noted that the virtual character production end can also be directly arranged in the smart television.
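The hand-off between the TV and the cloud production end can be sketched as follows. Everything here is a hypothetical illustration: the habit-to-image mapping, the extract_facial_features helper and the CloudAvatarMaker class are stand-ins for components the disclosure only describes functionally.

    # Hypothetical mapping from usage habit to a virtual character image.
    HABIT_TO_CHARACTER_IMAGE = {
        "basketball": "basketball_player",
        "animation": "cartoon_companion",
        "news": "news_anchor",
    }


    def extract_facial_features(user_image: bytes) -> dict:
        """Stand-in for the cloud end's facial-feature extraction."""
        return {"face": "<features>"}


    class CloudAvatarMaker:
        """Stand-in for the cloud virtual character production end."""

        def generate(self, character_image: str, user_image: bytes) -> dict:
            # Combine the chosen character image with the user's facial
            # features to produce the final virtual character.
            features = extract_facial_features(user_image)
            return {"image": character_image, "features": features}


    def build_virtual_character(usage_habit: str, user_image: bytes) -> dict:
        character_image = HABIT_TO_CHARACTER_IMAGE.get(usage_habit, "default")
        avatar = CloudAvatarMaker().generate(character_image, user_image)
        return avatar  # transmitted back to the image management module


    print(build_virtual_character("basketball", b"<camera frame>"))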
Further, after step S30, the method includes:
and monitoring push information, and displaying the push information to a user through the virtual character.
In the step, after the virtual character is generated by the smart television, a user using the television at present can be determined through the camera module, the corresponding virtual character is called according to the user, when the push information is monitored, the push information is displayed to the user through the virtual character in a voice mode, the virtual character can also receive the voice of the user, the intention of the user is determined according to the voice, and then the smart television is controlled to finish the action corresponding to the intention of the user, so that the interaction between the virtual character and the user is realized.
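This interaction loop can be sketched with hypothetical stubs for camera-based user recognition and text-to-speech output; none of these components are detailed in the disclosure.

    def recognize_current_user() -> str:
        """Stub for the camera module's user recognition."""
        return "dad"


    def speak(avatar: dict, text: str) -> None:
        """Stub for presenting a message by voice through the avatar."""
        print(f"[{avatar['image']}] says: {text}")


    def on_push_information(avatars: dict[str, dict], message: str) -> None:
        """Show monitored push information through the current user's avatar."""
        user = recognize_current_user()
        avatar = avatars.get(user, {"image": "default"})
        speak(avatar, message)


    avatars = {"dad": {"image": "basketball_player"}}
    on_push_information(avatars, "Tonight's game starts at 8 pm.")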
In the virtual character generation method of this embodiment, when the user watches a program, the smart television acquires the current time period and compares it with the preset time period; if the current time period is within the preset time period, it performs screenshot and recording operations on the program played in the current time period according to the preset acquisition frequency to obtain the screenshot and audio corresponding to the program played within the preset time period. The smart television extracts key pixel points in the screenshot through the pre-established recognition model and determines the first scene information; it extracts the voiceprint features corresponding to the audio and determines the second scene information; it then determines the current recognition result from the two. The smart television acquires the preset number of previous historical recognition results corresponding to the current recognition result and calculates their similarity to the current result; if the similarity is greater than the preset similarity threshold, it determines the user's usage habit from the current and historical recognition results. The smart television inputs the usage habit into the pre-established virtual character image management module, which determines the virtual character image; the smart television acquires the user image corresponding to the usage habit and generates the virtual character from the virtual character image and the user image. Because the recognition model identifies the screenshot and audio of the program played within the preset time period to determine the user's usage habit, and the virtual character image is determined from that habit before the virtual character is generated, the virtual character meets the user's needs, which increases the user's willingness to interact with it and improves the user experience.
Further, as shown in fig. 3, a second embodiment of the virtual character generating method of the present invention is proposed based on the first embodiment of the virtual character generating method of the present invention.
The second embodiment of the virtual character generation method differs from the first in that, before the step of determining the user's usage habit according to the current recognition result and the historical recognition results, the method includes:
step k, acquiring a scene proportion set of a current recognition result, and comparing the scene proportion set with a preset confidence coefficient;
step l, if the scene occupation ratio greater than the preset confidence coefficient exists in the scene occupation ratio set, storing the current recognition result, and executing the steps of: determining the use habits of the user according to the current identification result and the historical identification result;
step m, if the scene occupation ratio larger than the preset confidence coefficient does not exist in the scene occupation ratio set, deleting the current recognition result, and executing the steps again: and acquiring a screenshot and audio corresponding to the program played in a preset time period.
In this embodiment, after obtaining a current recognition result, the smart television acquires a scene proportion set of the current recognition result, compares the scene proportion set with a preset confidence level, stores the current recognition result if it is determined that a scene proportion larger than the preset confidence level exists in the scene proportion set, and performs the following steps of determining a user use habit according to the current recognition result and a historical recognition result; and if the scene occupation ratio set does not have the scene occupation ratio larger than the preset confidence coefficient, deleting the current recognition result, and re-executing the steps of obtaining the screenshot and the audio corresponding to the program played in the preset time period and the subsequent steps. For example: supposing that the current recognition result is 90% of a basketball program, 5% of a news program and 5% of a music program, and the preset confidence coefficient is 85%, the intelligent television compares the scene proportion set of the current recognition result with the preset confidence coefficient to obtain the scene proportion which is greater than the preset confidence coefficient in the scene proportion set, at the moment, the intelligent television can determine that the current recognition result is an effective recognition result, store the current recognition result, and execute the subsequent steps; assuming that the current recognition result is 80% of the basketball program, 10% of the news program and 10% of the music program, and the preset confidence level is 85%, the smart television compares the scene proportion set of the current recognition result with the preset confidence level to obtain the scene proportion which is not greater than the preset confidence level in the scene proportion set, at this time, the smart television can determine that the current recognition result is an invalid recognition result, delete the current recognition result, wait for the next preset time period, and re-execute the steps of obtaining the screenshot and the audio corresponding to the program played in the preset time period and the subsequent steps. Whether the current recognition result is effective or not is judged by analyzing the current recognition result, so that the accuracy of determining the use habits of the user is improved, and the accuracy of determining the virtual character image is improved.
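The validity test on a recognition result reduces to checking whether any scene proportion exceeds the preset confidence level. A minimal sketch using the same scene-proportion dictionaries as above; the 85% threshold comes from the example in this embodiment.

    def is_valid_result(proportions: dict[str, float], confidence: float = 0.85) -> bool:
        """Keep the recognition result only if some scene proportion
        exceeds the preset confidence level."""
        return any(p > confidence for p in proportions.values())


    print(is_valid_result({"basketball": 0.90, "news": 0.05, "music": 0.05}))  # True: stored
    print(is_valid_result({"basketball": 0.80, "news": 0.10, "music": 0.10}))  # False: deleted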
Further, if the stored current recognition result is not used to determine the user's usage habit in a subsequent step, it is kept in the smart television as a historical recognition result.
In this embodiment, the smart television screens each recognition result against the preset confidence level before using it: results containing a scene proportion above the confidence level are stored and used to determine the user's usage habit, while results without one are deleted and the acquisition step is repeated. Judging the validity of the current recognition result in this way improves the accuracy of determining the user's usage habit and hence of determining the virtual character image.
As shown in fig. 4, the present invention also provides a virtual character generating apparatus. The virtual character generation apparatus of the present invention includes:
the acquisition module 101 is configured to acquire a screenshot and audio corresponding to a program played within a preset time period;
the recognition module 102 is configured to recognize the screenshot and the audio through a pre-established recognition model to obtain a current recognition result, and determine the user's usage habit according to the current recognition result and historical recognition results;
the generating module 103 is configured to determine a virtual character image according to the user's usage habit, and generate a virtual character according to the virtual character image.
Further, the obtaining module is further configured to:
acquiring a current time period, and comparing the current time period with a preset time period;
and if the current time period is within the preset time period, performing screenshot operation and recording operation on the program played in the current time period according to the preset acquisition frequency to obtain a screenshot and an audio corresponding to the program played in the preset time period.
Further, the identification module is further configured to:
extracting key pixel points in the screenshot through a pre-established recognition model, and determining first scene information corresponding to the screenshot according to the key pixel points;
extracting voiceprint features corresponding to the audio through a pre-established recognition model, and determining second scene information corresponding to the audio according to the voiceprint features;
and determining a current identification result according to the first scene information and the second scene information.
Further, the identification module is further configured to:
acquiring a scene proportion set of the current recognition result, and comparing the scene proportion set with a preset confidence level;
if a scene proportion greater than the preset confidence level exists in the scene proportion set, storing the current recognition result, and executing the step of: determining the usage habit of the user according to the current recognition result and the historical recognition result;
if no scene proportion greater than the preset confidence level exists in the scene proportion set, deleting the current recognition result, and re-executing the step of: acquiring a screenshot and audio corresponding to the program played in a preset time period.
Preferably, the identification module further comprises a determination module configured to:
acquiring a preset number of previous historical recognition results corresponding to the current recognition result, and calculating the similarity between the current recognition result and those historical recognition results;
and if the similarity between the current recognition result and each of the preset number of previous historical recognition results is greater than a preset similarity threshold, determining the usage habit of the user according to the current recognition result and those historical recognition results.
Preferably, the generating module is further configured to:
inputting the user's usage habit into a pre-established virtual character image management module, and determining a virtual character image according to the usage habit through the virtual character image management module;
acquiring a user image corresponding to the user's usage habit;
and sending the virtual character image and the user image to a cloud virtual character production end, generating a virtual character through the cloud virtual character production end according to the virtual character image and the user image, and transmitting the virtual character back to the virtual character image management module.
Preferably, the generating module further comprises a display module, and the display module is configured to:
and monitoring push information, and displaying the push information to a user through the virtual character.
The invention also provides virtual character generation equipment.
The virtual character generation apparatus of the present invention includes: a memory, a processor, and a virtual character generation program stored on the memory and executable on the processor, the virtual character generation program when executed by the processor implementing the steps of the virtual character generation method as described above.
For the method implemented when the virtual character generation program running on the processor is executed, reference may be made to the embodiments of the virtual character generation method of the present invention, which will not be described again here.
The invention also provides a computer readable storage medium.
The computer-readable storage medium of the present invention has stored thereon a virtual character generation program which, when executed by a processor, implements the steps of the virtual character generation method as described above.
For the method implemented when the virtual character generation program running on the processor is executed, reference may be made to the embodiments of the virtual character generation method of the present invention, which will not be described again here.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or system comprising the element.
The above-mentioned serial numbers of the embodiments of the present invention are only for description, and do not represent the advantages and disadvantages of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention or portions thereof contributing to the prior art may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) as described above and includes several instructions for enabling a terminal device (which may be a mobile phone, a computer, a server, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A virtual character generation method is characterized by comprising the following steps:
acquiring a screenshot and audio corresponding to a program played within a preset time period;
recognizing the screenshot and the audio through a pre-established recognition model to obtain a current recognition result, and determining the usage habit of a user according to the current recognition result and a historical recognition result;
and determining a virtual character image according to the use habit of the user, and generating a virtual character according to the virtual character image.
2. The virtual character generation method of claim 1, wherein the step of obtaining the screenshot and the audio corresponding to the program played in the preset time period comprises:
acquiring a current time period, and comparing the current time period with a preset time period;
and if the current time period is within the preset time period, performing screenshot operation and recording operation on the program played in the current time period according to the preset acquisition frequency to obtain a screenshot and an audio corresponding to the program played in the preset time period.
3. The virtual character generation method of claim 1, wherein the step of recognizing the screenshot and the audio through a pre-established recognition model to obtain a current recognition result comprises:
extracting key pixel points in the screenshot through a pre-established recognition model, and determining first scene information corresponding to the screenshot according to the key pixel points;
extracting voiceprint features corresponding to the audio through a pre-established recognition model, and determining second scene information corresponding to the audio according to the voiceprint features;
and determining a current recognition result according to the first scene information and the second scene information.
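A sketch of the two-channel recognition in claim 3, with trivial heuristics standing in for the pre-established recognition model; the fusion rule and every threshold here are assumptions.

```python
def classify_screenshot(pixels):
    # "Key pixel points" -> first scene information; a mean-brightness
    # heuristic stands in for the real recognition model.
    mean = sum(pixels) / len(pixels)
    return "sports" if mean > 128 else "drama"

def classify_audio(voiceprint):
    # Voiceprint features -> second scene information (another stub).
    return "sports" if max(voiceprint) > 0.8 else "drama"

def current_recognition_result(pixels, voiceprint):
    first = classify_screenshot(pixels)    # first scene information
    second = classify_audio(voiceprint)    # second scene information
    # Fusion rule is an assumption: agreement gives full confidence,
    # disagreement keeps the image-derived scene at half confidence.
    confidence = 1.0 if first == second else 0.5
    return {"scene": first, "confidence": confidence}

print(current_recognition_result([200, 220, 180], [0.9, 0.4]))
# -> {'scene': 'sports', 'confidence': 1.0}
```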
4. The virtual character generation method of claim 1, wherein before the step of determining the use habit of the user according to the current recognition result and the historical recognition result, the method further comprises:
acquiring a scene proportion set of the current recognition result, and comparing the scene proportion set with a preset confidence threshold;
if a scene proportion greater than the preset confidence threshold exists in the scene proportion set, storing the current recognition result, and executing the step of determining the use habit of the user according to the current recognition result and the historical recognition result;
and if no scene proportion in the scene proportion set is greater than the preset confidence threshold, deleting the current recognition result, and executing again the step of acquiring the screenshot and the audio corresponding to the program played in the preset time period.
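Claim 4's filter reduces to a threshold test over the scene proportion set; a minimal sketch, with 0.6 as an assumed preset confidence threshold.

```python
def keep_recognition_result(scene_proportions: dict,
                            confidence: float = 0.6) -> bool:
    # Keep the current result only if some scene's proportion in the
    # set exceeds the preset confidence threshold (0.6 is an assumed
    # value); otherwise the caller deletes it and re-captures.
    return any(p > confidence for p in scene_proportions.values())

print(keep_recognition_result({"sports": 0.7, "news": 0.3}))  # True: store
print(keep_recognition_result({"sports": 0.5, "news": 0.5}))  # False: re-capture
```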
5. The virtual character generation method of claim 1, wherein the step of determining the use habit of the user according to the current recognition result and the historical recognition result comprises:
acquiring a preset number of preceding historical recognition results corresponding to the current recognition result, and calculating a similarity between the current recognition result and each of the preset number of historical recognition results;
and if the similarity between the current recognition result and each of the preceding preset number of historical recognition results is greater than a preset similarity threshold, determining the use habit of the user according to the current recognition result and the preceding preset number of historical recognition results.
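A sketch of claim 5's similarity gate, using Jaccard similarity as one possible measure (the claim fixes none) and assumed values for the preset number and the similarity threshold.

```python
def jaccard(a: set, b: set) -> float:
    # One possible similarity measure; the claim does not specify one.
    return len(a & b) / len(a | b) if (a | b) else 0.0

def habit_from_history(current: set, history: list,
                       n: int = 5, threshold: float = 0.4):
    # Compare the current result with the preceding n historical
    # results; if all are similar enough, merge them into a use habit.
    # n and threshold are assumed values.
    recent = history[-n:]
    if recent and all(jaccard(current, h) > threshold for h in recent):
        return set().union(current, *recent)
    return None  # not yet a stable habit; keep collecting results

print(habit_from_history({"sports"}, [{"sports"}, {"sports", "news"}]))
# -> {'sports', 'news'} (set order may vary)
```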
6. The virtual character generation method of claim 1, wherein the step of determining the virtual character image according to the use habit of the user and generating the virtual character according to the virtual character image comprises:
inputting the use habit of the user into a pre-established virtual character image management module, and determining the virtual character image according to the use habit of the user through the virtual character image management module;
acquiring a user image corresponding to the use habit of the user;
and sending the virtual character image and the user image to a cloud virtual character production end, generating the virtual character through the cloud virtual character production end according to the virtual character image and the user image, and transmitting the virtual character back to the virtual character image management module.
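A sketch of claim 6's flow, with the image management module and the cloud production end modelled as injected callables; both, and the data they exchange, are hypothetical stand-ins rather than the patent's actual components.

```python
def make_character(use_habit, user_image, decide_image, cloud_render):
    # The management module picks a character image from the use habit;
    # image + user image go to the cloud production end, whose output
    # is returned to the module.
    character_image = decide_image(use_habit)         # management module
    return cloud_render(character_image, user_image)  # cloud production end

# Minimal stand-ins so the sketch runs end to end.
character = make_character(
    "sports",
    user_image=b"<user photo bytes>",
    decide_image=lambda habit: f"{habit}_avatar",
    cloud_render=lambda img, user: {"image": img, "from_user": bool(user)},
)
print(character)  # -> {'image': 'sports_avatar', 'from_user': True}
```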
7. The virtual character generation method of claim 1, wherein after the step of determining the virtual character image according to the use habit of the user and generating the virtual character according to the virtual character image, the method further comprises:
monitoring push information, and displaying the push information to the user through the virtual character.
8. A virtual character generation apparatus, characterized by comprising:
the acquisition module is used for acquiring a screenshot and an audio corresponding to a program played in a preset time period;
the recognition module is used for recognizing the screenshot and the audio through a pre-established recognition model to obtain a current recognition result, and determining the use habit of the user according to the current recognition result and a historical recognition result;
and the generating module is used for determining a virtual character image according to the use habit of the user and generating a virtual character according to the virtual character image.
9. A virtual character generation apparatus, characterized by comprising: a memory, a processor, and a virtual character generation program stored on the memory and executable on the processor, wherein the virtual character generation program, when executed by the processor, implements the steps of the virtual character generation method according to any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that a virtual character generation program is stored on the computer-readable storage medium, and the virtual character generation program, when executed by a processor, implements the steps of the virtual character generation method according to any one of claims 1 to 7.
CN202211003831.5A 2022-08-19 2022-08-19 Virtual character generation method, device, equipment and computer readable storage medium Pending CN115423909A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211003831.5A CN115423909A (en) 2022-08-19 2022-08-19 Virtual character generation method, device, equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211003831.5A CN115423909A (en) 2022-08-19 2022-08-19 Virtual character generation method, device, equipment and computer readable storage medium

Publications (1)

Publication Number Publication Date
CN115423909A true CN115423909A (en) 2022-12-02

Family

ID=84198471

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211003831.5A Pending CN115423909A (en) 2022-08-19 2022-08-19 Virtual character generation method, device, equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN115423909A (en)

Similar Documents

Publication Publication Date Title
CN109688475B (en) Video playing skipping method and system and computer readable storage medium
CN110691281B (en) Video playing processing method, terminal device, server and storage medium
CN110737840A (en) Voice control method and display device
CN112104915B (en) Video data processing method and device and storage medium
CN104866275B (en) Method and device for acquiring image information
CN112752121B (en) Video cover generation method and device
CN109495427B (en) Multimedia data display method and device, storage medium and computer equipment
CN108958503A (en) input method and device
CN112002321B (en) Display device, server and voice interaction method
CN113542833A (en) Video playing method, device and equipment based on face recognition and storage medium
CN110545475B (en) Video playing method and device and electronic equipment
CN109683760B (en) Recent content display method, device, terminal and storage medium
CN113852767B (en) Video editing method, device, equipment and medium
CN108881766B (en) Video processing method, device, terminal and storage medium
CN109791545A (en) The contextual information of resource for the display including image
CN113596520A (en) Video playing control method and device and electronic equipment
CN115423909A (en) Virtual character generation method, device, equipment and computer readable storage medium
CN110662117A (en) Content recommendation method, smart television and storage medium
CN113593614B (en) Image processing method and device
CN111225250B (en) Video extended information processing method and device
CN117319340A (en) Voice message playing method, device, terminal and storage medium
CN113139093A (en) Video search method and apparatus, computer device, and medium
CN112887782A (en) Image output method and device and electronic equipment
CN113382310B (en) Information recommendation method and device, electronic equipment and medium
US20240112702A1 (en) Method and apparatus for template recommendation, device, and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination