WO2023001152A1 - Method for recommending video clips, electronic device and server - Google Patents

Method for recommending video clips, electronic device and server

Info

Publication number
WO2023001152A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
user
electronic device
information
server
Prior art date
Application number
PCT/CN2022/106529
Other languages
English (en)
Chinese (zh)
Inventor
赵静 (Zhao Jing)
尹明伟 (Yin Mingwei)
黎沙 (Li Sha)
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Publication of WO2023001152A1



Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73 Querying
    • G06F16/735 Filtering based on additional data, e.g. user or group profiles
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9536 Search customisation based on social or collaborative filtering
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/451 Execution arrangements for user interfaces
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/238 Interfacing the downstream path of the transmission network, e.g. adapting the transmission rate of a video stream to network bandwidth; Processing of multiplex streams
    • H04N21/2387 Stream processing in response to a playback request from an end-user, e.g. for trick-play
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/239 Interfacing the upstream path of the transmission network, e.g. prioritizing client content requests
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/258 Client or end-user data management, e.g. managing client capabilities, user preferences or demographics, processing of multiple end-users preferences to derive collaborative data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/25 Management operations performed by the server for facilitating the content distribution or administrating data related to end-users or client devices, e.g. end-user or client device authentication, learning user preferences for recommending movies
    • H04N21/262 Content or additional data distribution scheduling, e.g. sending additional data at off-peak times, updating software modules, calculating the carousel transmission frequency, delaying a video stream transmission, generating play-lists
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431 Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/437 Interfacing the upstream path of the transmission network, e.g. for transmitting client requests to a VOD server
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/441 Acquiring end-user identification, e.g. using personal code sent by the remote control or by inserting a card
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/441 Acquiring end-user identification, e.g. using personal code sent by the remote control or by inserting a card
    • H04N21/4415 Acquiring end-user identification, e.g. using personal code sent by the remote control or by inserting a card using biometric characteristics of the user, e.g. by voice recognition or fingerprint scanning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/442 Monitoring of processes or resources, e.g. detecting the failure of a recording device, monitoring the downstream bandwidth, the number of times a movie has been viewed, the storage space available from the internal hard disk
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45 Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45 Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/454 Content or additional data filtering, e.g. blocking advertisements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/45 Management operations performed by the client for facilitating the reception of or the interaction with the content or administrating data related to the end-user or to the client device itself, e.g. learning user preferences for recommending movies, resolving scheduling conflicts
    • H04N21/466 Learning process for intelligent management, e.g. learning user preferences for recommending movies
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/482 End-user interface for program selection

Definitions

  • the present application relates to the field of electronic technology, and in particular to a method for recommending video clips, an electronic device and a server.
  • the user can set the working mode of the electronic device to match the usage requirements of different users. For example, when a child user is watching a video on a tablet, a parent can set the tablet to children's mode, and switch it back to normal mode when the child finishes using it, to avoid recommending videos with violent, bloody, or other themes unsuitable for children.
  • however, the children's mode cannot guarantee that the recommended videos meet children's viewing needs, and its effectiveness is low.
  • in addition, during use, the electronic device can generally be set to only one mode at a time. If the user wants to switch the working mode of the electronic device, a series of operations is required; the switching process is cumbersome, and the user experience is poor.
  • users may also have different viewing needs in different scenes, such as family scenarios and work scenarios. How to recommend videos that meet the current viewing needs of different user groups and scenarios, and thereby improve the user's viewing experience, is an urgent problem to be solved.
  • This application provides a method for recommending video clips, an electronic device, and a server.
  • the method can recommend videos that match the user's identity and the current scene; in addition, when the user selects and plays a video, the method can, based on the current user identity, match the user with clips that meet the user's viewing needs, finely control the playback effect of the video, and improve the user experience.
  • a method for recommending video clips is provided, comprising: an electronic device displays a video list, the video list including one or more videos; the electronic device receives an operation of the user to play a first video and, in response to the operation, sends a play request for the first video to a server, the play request including first identification information, the first video being any one of the one or more videos; the electronic device receives a response message sent by the server, the response message indicating a target segment and a filter segment of the first video, the target segment being associated with the first identification information, the target segment being a segment to be played to the user and the filter segment being a segment not to be played to the user; and the electronic device plays the target segment of the first video according to the response message.
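As a minimal illustrative sketch of the request/response exchange described above (all class and field names here are assumptions, not terms from the application):

```python
from dataclasses import dataclass

@dataclass
class Segment:
    start_s: float          # playback start time of the segment, in seconds
    end_s: float            # playback end time of the segment, in seconds
    tags: tuple = ()        # tag information attached to this segment

@dataclass
class PlayRequest:
    video_id: str
    identification: dict    # e.g. {"identity": "child", "scene": "living_room"}

@dataclass
class PlayResponse:
    target_segments: list   # segments to be played to the user
    filter_segments: list   # segments not to be played to the user

def segments_to_play(response: PlayResponse) -> list:
    """The device plays only the target segments, in playback order."""
    return sorted(response.target_segments, key=lambda s: s.start_s)

resp = PlayResponse(
    target_segments=[Segment(120, 300), Segment(0, 60)],
    filter_segments=[Segment(60, 120, tags=("violent",))],
)
print([(s.start_s, s.end_s) for s in segments_to_play(resp)])  # [(0, 60), (120, 300)]
```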
  • the "play request for the first video" can be used to request the address of the playing content of the first video (a content identifier, content ID); the video application of the electronic device can then obtain the image data, audio data, etc. of the first video from the address of the playing content to play the first video normally, which will not be repeated here.
  • matching can be understood as “association”, that is, the category to which the tag information of the target segment belongs is associated with the identity information of the current user, the current scene information, and the like.
  • the first identification information includes identity information of the user, and/or current scene information of the electronic device.
  • the electronic device can collect user characteristics, accurately determine the user's identity and build a user portrait from the collected characteristics, and then combine the tag information of each video segment with the user's preferences, behavioral habits, etc., to provide the current user with refined, personalized videos, or to match the current user with a target segment in a given video.
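The tag-based matching between videos and a user portrait could be sketched as a simple tag-overlap score (the portrait fields and scoring rule are illustrative assumptions, not the patent's actual matching algorithm):

```python
def score(video_tags, portrait_prefs):
    """Rank a video by how many of its tags overlap the user's preferences."""
    return len(set(video_tags) & set(portrait_prefs))

# Hypothetical user portrait and video catalog for illustration.
portrait = {"prefs": ["animation", "education"]}
videos = {
    "v1": ["animation", "music"],
    "v2": ["action"],
    "v3": ["education", "animation"],
}

# Sort videos by descending overlap with the portrait's preferences.
ranked = sorted(videos, key=lambda v: score(videos[v], portrait["prefs"]), reverse=True)
print(ranked)  # ['v3', 'v1', 'v2']
```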
  • this method can control the electronic device to automatically switch to children's mode, control the scope of the video permissions that the electronic device can play in children's mode, and also block the content of video clips in a given video that are unsuitable for child users to watch (the filter segments).
  • this process does not require manually switching the electronic device to children's mode, and achieves fine-grained control of video playback in children's mode, thereby improving the user experience.
  • the above method can also be combined with the life scene where the electronic device is located, and recommend different video content to the user from the video content library according to the recognition result of the scene.
  • for the kitchen scene, it can recommend food and cooking-related videos; for the study scene, teaching-related videos; for the living room scene, movies and TV series suitable for family members to watch together; and for the balcony scene, videos related to home care and cleaning.
  • the method can recommend videos that meet the current scene for the user to select, thereby improving the user experience.
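The scene-to-content mapping in the examples above could be sketched as a lookup table (the table entries merely echo the kitchen/study/living-room/balcony examples and are not an exhaustive specification):

```python
# Illustrative mapping from recognized scene to recommended content categories.
SCENE_CATEGORIES = {
    "kitchen": ["food", "cooking"],
    "study": ["teaching", "education"],
    "living_room": ["movies", "tv_series"],
    "balcony": ["home", "cleaning"],
}

def recommend_categories(scene: str) -> list:
    """Return the content categories to recommend for a recognized scene."""
    return SCENE_CATEGORIES.get(scene, [])

print(recommend_categories("kitchen"))  # ['food', 'cooking']
print(recommend_categories("garage"))   # [] (unknown scene: no scene-based filter)
```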
  • the response message further includes information about a video list, where the information about the video list includes information about at least one second video, and the second video matches the first identification information.
  • the method further includes: the electronic device updates the video list according to the information about the video list and displays the at least one second video in the video list.
  • the electronic device can update the video displayed in the video recommendation menu of the electronic device according to the video list information sent by the server, so that the video displayed in the video recommendation menu is a video matching the current user identity and scene for the user to choose, improving user viewing experience.
  • the play request of the first video further includes the account information logged in on the electronic device.
  • the second video is, among the one or more videos stored by the server, a video whose tag information matches the identity of the user and the historical playback records corresponding to the account logged in on the electronic device.
  • the user's historical playback records and other information can be obtained according to the user's account information, so as to recommend videos for the user or control the video playback process.
  • this method can accurately recommend video content that matches the current scene and the user's habits and hobbies, and can recommend videos or video clips that better match the user's viewing needs, improving the user's viewing experience.
  • the method further includes: the electronic device collects user characteristics and determines the identity of the user according to the user characteristics; or the electronic device collects user characteristics, sends them to the server, and receives the identification result of the user characteristics sent by the server to determine the identity of the user; wherein the user characteristics include one or more of facial features, age features, height features, dress features, and occupational features.
  • the electronic device may collect user characteristics through one or more collection and detection modules such as a camera, a fingerprint sensor, a touch sensor, and an infrared sensor.
  • the camera includes, but is not limited to, a front camera, a rear camera, or an under-screen camera of the electronic device.
  • in this way, the electronic device can identify the user's identity independently, reducing the interaction between the electronic device and the server and accelerating user identification.
  • the server can receive the user characteristic data reported by the electronic device, build user portraits from massive user characteristic data, and encrypt and store the data of the different user portraits. After receiving new user characteristic information, the server can quickly query the large amount of stored user portrait data and determine the identity of the current user, thereby increasing the speed of user identification and reducing the amount of data processing; it can also improve the accuracy and robustness of user identification.
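A rule-of-thumb sketch of deriving an identity label from collected user characteristics (the thresholds and labels are assumptions for illustration; a real system would presumably use trained models over the stored user portrait data):

```python
def classify_identity(features: dict) -> str:
    """Map collected user characteristics to a coarse identity label."""
    age = features.get("age")  # e.g. estimated from facial/age features
    if age is not None:
        if age < 14:
            return "child"
        if age >= 60:
            return "elderly"
        return "adult"
    return "unknown"  # insufficient features collected

print(classify_identity({"age": 8}))   # child
print(classify_identity({"age": 65}))  # elderly
print(classify_identity({}))           # unknown
```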
  • when the electronic device recognizes that the current user is an owner user, a parent user, a child user, or another user identity, it can further recommend videos according to the current user identity.
  • for example, education-related videos can be recommended for parent users; for the owner user, videos of the same type or with subject matter similar to the owner's historical playback records can be matched; and for child users, children's animation and learning videos can be recommended, which is not limited in this embodiment of the present application.
  • the electronic device periodically collects user characteristics; and/or the electronic device detects an operation of the user running a video application and, in response to the operation, collects user characteristics.
  • when the user turns on the personalized video clip recommendation function in the video application of the electronic device and enables the "allow intelligent detection of identity information" function, the user's clicking the icon of the video application (that is, opening the video application) and entering its running interface can trigger the collection and detection module of the electronic device to collect user information.
  • the collection and detection module of the electronic device may periodically collect user information according to a certain period (for example, 10 collections per minute).
  • the electronic device may periodically collect current user information to determine the identity of the current user.
  • the embodiment of the present application does not limit the timing for the electronic device to collect user information.
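For the periodic collection mentioned above, the example rate of 10 collections per minute implies a sampling interval of 60 / 10 = 6 seconds; a minimal scheduling sketch (function names are illustrative):

```python
def sampling_interval_s(collections_per_minute: int) -> float:
    """Interval between collections, in seconds."""
    return 60.0 / collections_per_minute

def schedule(duration_s: float, collections_per_minute: int) -> list:
    """Collection timestamps (seconds) over a window of duration_s."""
    step = sampling_interval_s(collections_per_minute)
    times, t = [], 0.0
    while t < duration_s:
        times.append(t)
        t += step
    return times

print(sampling_interval_s(10))   # 6.0
print(len(schedule(60, 10)))     # 10 collections in one minute
```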
  • the collection and detection module of the electronic device may not collect user information, and the user may manually set the current user identity of the electronic device.
  • the method further includes: the electronic device displays a first window, the first window displays first prompt information, and the first prompt information is used to prompt the user that the electronic device can recommend videos for the user according to the identity of the user.
  • for example, upon identifying a child user, the mobile phone can automatically switch to children's mode, display the first window, and show a prompt message in the first window, such as: "Dear user, you have been identified as an underage user, and some video clips will be blocked from display." The user can choose whether to accept the identification result according to their own needs.
  • in this way, the electronic device or video application can accurately determine the identity or age information of the current user, for example, whether the current user is a child or underage user, a parent or adult user, an elderly user, etc., and subsequently provide different services to meet the needs of different users.
  • the electronic device plays the target segment of the first video according to the response message, including: displaying a playback progress bar on the playback interface, the playback progress bar including a first area corresponding to the target segment and a second area corresponding to the filter segment, the second area of the playback progress bar being displayed in grayscale; or,
  • displaying a playback progress bar that includes only the first area corresponding to the target segment and does not include the second area corresponding to the filter segment.
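The two progress-bar styles described above could be computed as region lists (a sketch; the tuple format and style names are assumptions):

```python
def progress_regions(duration, filter_segments, hide_filtered=False):
    """Split [0, duration] into (start, end, style) regions.

    Target-segment regions get style "normal"; filtered regions get style
    "gray" (grayscale), or are omitted entirely when hide_filtered is True.
    """
    regions, cursor = [], 0
    for f_start, f_end in sorted(filter_segments):
        if cursor < f_start:
            regions.append((cursor, f_start, "normal"))   # first area (target)
        if not hide_filtered:
            regions.append((f_start, f_end, "gray"))      # second area (filtered)
        cursor = f_end
    if cursor < duration:
        regions.append((cursor, duration, "normal"))
    return regions

print(progress_regions(300, [(60, 120)]))
# [(0, 60, 'normal'), (60, 120, 'gray'), (120, 300, 'normal')]
print(progress_regions(300, [(60, 120)], hide_filtered=True))
# [(0, 60, 'normal'), (120, 300, 'normal')]
```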
  • when the playback progress bar includes the first area and the second area, during playback of the target segment, when the current playing moment shown on the playback progress bar reaches the starting position of the second area, the electronic device displays a second window, and the second window displays second prompt information, which is used to prompt the user that the electronic device will skip the filter segment and continue playing the target segment of the first video.
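The skip behavior described above could be sketched as a playhead check (names are illustrative; in a real player the prompt window would be shown at the moment the skip occurs):

```python
def next_position(current_s, filter_segments):
    """Return (new_position, skipped): jump past any filtered segment
    containing the current playhead, otherwise stay put."""
    for f_start, f_end in filter_segments:
        if f_start <= current_s < f_end:
            return f_end, True   # skip the filter segment, continue the target segment
    return current_s, False

pos, skipped = next_position(60, [(60, 120)])
print(pos, skipped)                       # 120 True (prompt shown, segment skipped)
print(next_position(30, [(60, 120)]))     # (30, False) (normal playback)
```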
  • when the first identification information includes the current scene information of the electronic device, before the electronic device receives the response message sent by the server, the method further includes: the electronic device collects the current scene features and determines the current scene information according to the scene features; or the electronic device collects the current scene features, sends them to the server, receives the recognition result of the scene features sent by the server, and determines the current scene information.
  • the above process may have different occurrence timings and implementation manners.
  • for a home device with a relatively fixed location, such as a smart screen, the smart screen can be triggered to start collecting life scene information, and the scene where the smart screen is located may not change for a long time.
  • the collection and detection module of the electronic device may periodically collect life scene information according to a certain period, which is not limited in this embodiment of the present application.
  • the play request of the first video further includes the account information logged in on the electronic device, and the target segment is the segment of the first video that matches the first identification information and the historical playback records corresponding to the account logged in on the electronic device.
  • the above process can combine information such as the current life scene of the electronic device, the identity of the current user, the account logged in on the electronic device, and the historical browsing records corresponding to the user account with the tag information of each video in the video content library, to recommend one or more videos for the user.
  • the method can accurately recommend video content for the user that conforms to the current scene, the user's habits and hobbies, and improves the user's viewing experience.
  • the response message further includes metadata of the target segment, and the metadata of the target segment includes one or more of a playback address, image data, audio data, and text data of the target segment.
  • the electronic device can collect and identify the user's identity.
  • the electronic device can report the user's identity information, and the server can query the tag information of each video clip included in the video and, according to the current user's identity, match the user with video clips that meet the user's viewing needs.
  • the method can also control the electronic device to switch to children's mode and control the scope of video permissions that the electronic device can play in children's mode.
  • the video provided by the server to the electronic device may also be a filtered video, that is, a video in which the content of segments unsuitable for child users is blocked. This process does not require manually switching the electronic device to children's mode, and achieves fine control of the video playback effect in children's mode, improving the user experience.
  • a method for recommending video clips is provided, which is applied to a server, the server storing one or more videos and the tag information of each of the one or more videos, and the method comprising: the server receives a play request for a first video sent by an electronic device, the play request including first identification information, the first video being any one of the one or more videos and including one or more segments; the server queries the tag information of the first video according to the play request, the tag information of the first video including the tag information of each of the one or more segments; the server determines a target segment and a filter segment of the first video according to the first identification information and the tag information of each of the one or more segments, the target segment being associated with the first identification information, the target segment being a segment to be played to the user and the filter segment being a segment not to be played to the user; and the server sends a response message to the electronic device, the response message indicating the target segment and the filter segment.
  • the first identification information includes identity information of the user, and/or current scene information of the electronic device.
  • the server can receive the user identity information and/or the current scene information sent by the electronic device and, after determining the user identity and/or the current scene, combine the video tag information in the server's video content library to recommend, from the video content library, videos that match the user's identity and/or the current scene.
  • if the server determines that the current user is a child user, it can query the tag information of each video segment included in the first video, determine the tags matching the child user, and then, based on the association between the tag information of each video segment and the playback progress of each video segment, determine the target segment and its corresponding playback start time and end time; that is, from the multiple video segments included in the first video, it matches the child user with the target segments that can be played to the child user.
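The server-side partitioning into target and filter segments could be sketched as follows (the blocked-tag table and segment format are assumptions for illustration, not the patent's actual tag taxonomy):

```python
# Hypothetical tags blocked per identity; unlisted identities block nothing.
BLOCKED = {"child": {"violent", "bloody", "horror"}}

def partition(segments, identity):
    """segments: list of (start_s, end_s, tags). Returns (target, filtered),
    each a list of (start_s, end_s) with playback start/end times."""
    blocked = BLOCKED.get(identity, set())
    target, filtered = [], []
    for start, end, tags in segments:
        (filtered if set(tags) & blocked else target).append((start, end))
    return target, filtered

video = [(0, 60, {"animation"}), (60, 120, {"violent"}), (120, 300, {"animation"})]
t, f = partition(video, "child")
print(t)  # [(0, 60), (120, 300)] -- target segments with their start/end times
print(f)  # [(60, 120)]           -- filter segment, not played to the child user
```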
  • when the first identification information includes the identity information of the user, the method further includes: the server receives the user characteristics sent by the electronic device, identifies the identity of the user according to the user characteristics, and sends the identification result of the user's identity to the electronic device.
  • the user characteristics include facial features, age features, height features, dress features, and occupational features.
• The method further includes: the server receives the scene feature sent by the electronic device, identifies the current scene according to the scene feature, and sends the scene recognition result to the electronic device.
• The server can receive the user characteristic data reported by the electronic device and, combined with the massive stored user characteristic data, build a user portrait of the current user, quickly determine the current user's identity, speed up user identification, and improve its accuracy. After the user's identity is determined, the server can recommend videos matching that identity, combined with the label information of the videos in the server's video content library.
  • the server may identify the life scene where the electronic device is located, and recommend different video content to the user from the video content library according to the scene recognition result.
• This method can target scenarios in which multiple people in a family use multiple large-screen devices such as smart screens, in which multiple family members use the same smart screen or similar device, or in which different users use a large-screen device such as a smart screen in split-screen mode.
• The process can intelligently identify information such as the current life scene of the electronic device, the identity of the current user, the account the electronic device is logged in to, and the historical browsing records corresponding to that account, and, combined with the label information of each video, recommend one or more videos for this user from the video content library.
• The method can accurately recommend video content that matches the current scene and the user's habits and hobbies, improving the user's viewing experience.
• The method further includes: the server determines, according to the first identification information and the label information of each of the one or more videos, a video list matching the first identification information from the one or more videos, where the video list includes at least one second video; and the server sends the video list to the electronic device.
  • the information of the video list includes one or more of the playback address, image data, audio data and text data of the at least one second video.
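The video-list step above can be sketched as ranking the stored videos by label overlap with the first identification information; the overlap-count scoring rule is an assumption for illustration, not stated in the claim.

```python
# Illustrative ranking rule: score each video by how many of its labels
# match the identification information, and return the top-k as the list.
def recommend_video_list(videos, identification_tags, k=5):
    """Rank videos by label overlap with the identification information and
    return the ids of the top-k videos that match at least one label."""
    scored = sorted(videos,
                    key=lambda v: len(set(v["tags"]) & identification_tags),
                    reverse=True)
    return [v["id"] for v in scored[:k] if set(v["tags"]) & identification_tags]
```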
• The method further includes: the server acquires the first video and determines the multiple frames included in the first video; the server detects each of the multiple frames and identifies the content included in each frame; the server determines the label information of each frame according to the content included in that frame; the server divides the first video into one or more segments according to the label information of each frame; and the server determines, according to the label information of each frame, the label information of each of the one or more segments and the label information of the first video.
  • the operator of the video application may upload one or more videos to a server corresponding to the video application, and the server performs intelligent frame-by-frame analysis on each video.
• The server may perform intelligent frame-by-frame analysis of the first video based on various media artificial intelligence (AI) algorithms, deep learning algorithms, and the like, to generate the metadata of the first video and the label information of the first video.
• After the server analyzes the first video frame by frame to obtain the label information of each frame, it can aggregate the labels according to the label similarity between adjacent frames and thereby divide the first video into video segments; for example, the first video is divided into multiple video segments, each of which may correspond to different label information.
  • each frame may include multiple tags, and the tag information of the video clip may be determined according to any tag of each frame in the multiple frames included in the video clip.
• The frame-by-frame detection and intelligent analysis of the video can be triggered based on an AI algorithm or the like, yielding the multiple fragmented video clips included in the video and the label information of each video clip. In the process of dividing the video into multiple fragmented video clips and determining the label information of each clip, the dimensions of label extraction can support intelligent classification and extraction of video labels according to the behavior characteristics input by the user.
• On the one hand, this process avoids the problems of upload efficiency, label accuracy, and label validity caused by manual operation; on the other hand, the label information of each video segment detected by the server can be further confirmed or adjusted by the operator, and the adjustment result can update or correct the existing frame-by-frame detection algorithm. This process can form a closed loop of "label detection and extraction - intelligent optimization of the label detection algorithm - pushing video clips of different labels - user experience effect", thereby improving the accuracy of the generated label information.
  • the server divides the first video into one or more segments according to the label information of each frame, including: The server divides the first video into the one or more segments according to the similarity between tag information of adjacent frames.
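One possible reading of this division step, assuming per-frame tag sets and a shared-label similarity test (both are illustrative choices; the patent does not fix a similarity measure):

```python
def divide_by_tag_similarity(frame_tags, fps=25.0):
    """Group consecutive frames into segments: a new segment starts whenever
    a frame shares no label with the previous frame.

    frame_tags: list of sets, one set of labels per frame.
    Returns a list of (start_seconds, end_seconds) pairs."""
    if not frame_tags:
        return []
    segments, start = [], 0
    for i in range(1, len(frame_tags)):
        if not (frame_tags[i] & frame_tags[i - 1]):
            segments.append((start / fps, i / fps))  # [start, end) in seconds
            start = i
    segments.append((start / fps, len(frame_tags) / fps))
    return segments
```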
• The server determining the label information of each of the one or more segments according to the label information of each frame, and determining the label information of the first video, includes: the server determines, according to the label information of each frame, the label that occurs most often among the frames included in each segment as the label information of that segment; and the server determines, according to the label information of each segment, the label that occurs most often among the one or more segments as the label information of the first video.
  • the server may determine the tag that appears most frequently among multiple frames included in the video segment as the tag information of the video segment.
  • Each video clip can have multiple tags.
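The most-repeated-label rule above can be sketched with a counter; the same helper lifts per-frame labels to a segment label and per-segment labels to the video label (the use of a plain frequency count is an assumed interpretation):

```python
from collections import Counter

def most_repeated_label(label_sets):
    """Return the label that occurs most often across the given label sets
    (per-frame sets for one segment, or per-segment sets for the video)."""
    counts = Counter(label for labels in label_sets for label in labels)
    return counts.most_common(1)[0][0]
```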
• A noise threshold can be set to mark the critical number M of consecutive frames with discontinuous labels, where M is greater than or equal to 1.
• The noise threshold allows such errors to be ignored when aggregating video clips, improving the accuracy of determining the label information of a video segment and the accuracy of dividing video segments.
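A possible implementation of the noise threshold treats any run of at most M frames whose label differs from identical surrounding labels as detection noise and overwrites it; this smoothing rule is an assumed interpretation of the threshold described above.

```python
def smooth_labels(labels, m=2):
    """Overwrite runs of at most `m` frames whose label differs from the
    identical labels on both sides (the noise threshold M), so short
    detection errors do not split a segment."""
    out = list(labels)
    i = 0
    while i < len(out):
        j = i
        while j < len(out) and out[j] == out[i]:
            j += 1  # advance past the current run of equal labels
        run = j - i
        # A short run sandwiched between two identical labels is noise.
        if 0 < i and j < len(out) and run <= m and out[i - 1] == out[j]:
            for k in range(i, j):
                out[k] = out[i - 1]
        i = j
    return out
```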
• The play request of the first video further includes the account information with which the electronic device is logged in, and the server determining the target segment and the filter segment of the first video according to the first identification information and the label information of each of the one or more segments includes: the server obtains the historical playback records of the electronic device; and the server determines the target segment according to the first identification information, the historical playback records of the electronic device, and the label information of each of the one or more segments.
• The response message further includes the metadata of the target segment, and the metadata of the target segment includes one or more of the playback address, image data, audio data, and text data of the target segment.
• An electronic device is provided, including: a display screen; one or more processors; one or more memories; and a module in which multiple application programs are installed. The memory stores one or more programs which, when executed by the processor, cause the electronic device to execute the method according to any one of the first aspect and the implementations of the first aspect.
• A graphical user interface system on an electronic device is provided. The electronic device has a display screen, one or more memories, and one or more processors, the one or more processors being configured to execute one or more computer programs stored in the one or more memories. The graphical user interface system includes the graphical user interface displayed while the electronic device performs the method according to any one of the first aspect and its implementations, or the method according to any one of the second aspect and its implementations.
• A server is provided, including: one or more processors; and one or more memories. The memories store one or more programs which, when executed by the processors, cause the server to execute the method according to any one of the second aspect and the implementations of the second aspect.
• In a sixth aspect, a system is provided that includes an electronic device and a server. The electronic device can execute the method according to any one of the first aspect and its implementations, and the server can execute the method according to any one of the second aspect and its implementations.
• An apparatus is provided that is included in an electronic device or a server and has the function of realizing the behavior of the electronic device in the method according to any one of the first aspect and its implementations, and the function of realizing the behavior of the server in the method according to any one of the second aspect and its implementations.
  • This function may be implemented by hardware, or may be implemented by executing corresponding software on the hardware.
  • Hardware or software includes one or more modules or units corresponding to the functions described above. For example, a display module or unit, a detection module or unit, a processing module or unit, etc.
• A computer-readable storage medium is provided that stores computer instructions which, when run on an electronic device, cause the electronic device to execute the method according to any one of the first aspect and the implementations of the first aspect.
• A computer program product is provided which, when run on an electronic device, causes the electronic device to execute any possible method of the first aspect and its implementations, or the method according to any one of the implementations of the second aspect.
  • FIG. 1 is a schematic diagram of a user interface of a user watching a video through a video application installed on a mobile phone.
  • FIG. 2 is a schematic structural diagram of an example of an electronic device provided by an embodiment of the present application.
  • Fig. 3 is a block diagram of the software structure of the electronic device according to the embodiment of the present application.
  • FIG. 4 is a schematic interface diagram of an example of a user enabling a function of personalized recommended video clips provided by the embodiment of the present application.
  • Fig. 5 is a schematic flowchart of an example of a method for personalized recommendation of video clips provided by an embodiment of the present application.
  • FIG. 6 is a schematic diagram of a frame-by-frame analysis result of a video provided in the embodiment of the present application.
  • FIG. 7 is a schematic diagram of an interface for recommending video clips for a user on a mobile phone according to an embodiment of the present application.
  • FIG. 8 is a schematic diagram of another example of the process of recommending video clips for a user on a mobile phone according to an embodiment of the present application.
  • Fig. 9 is a schematic flowchart of another example of a method for personalized recommendation of video clips provided by an embodiment of the present application.
  • FIG. 10 is another example of an interface for recommending video clips for users on a smart screen according to an embodiment of the present application.
  • FIG. 11 is another example of an interface for recommending video clips for users on a smart screen according to an embodiment of the present application.
  • FIG. 1 is a schematic diagram of a user interface (graphical user interface, GUI) in the process of watching a video by a user through a video application installed on a mobile phone.
• (a) of FIG. 1 shows an interface 101 currently output by the mobile phone in an unlocked state, and the interface 101 displays content such as a weather clock component and various application programs (application, App).
  • the application program may include browser, phone, video, setting and so on. It should be understood that the interface 101 may also include other more application programs, which is not limited in this embodiment of the present application.
• The user clicks the icon of the video application, and in response to the user's click operation, the mobile phone runs the video application and displays the main interface 102 of the video application as shown in (b) of FIG. 1.
  • the main interface 102 of the video application can display different functional areas and menus, such as a video search box, setting controls 10, and multiple videos recommended for the user.
  • the interface 102 may display different types and different quantities of video clips recommended for the user.
• The interface 102 is an interface corresponding to the "daily recommendation" menu, and the interface 102 displays a plurality of different videos currently recommended for the user, such as video 1, video 2, video 3, video 4, video 5, and so on.
  • the user can click the icon of a desired video to trigger the mobile phone to start playing the video according to his hobbies and viewing needs.
• The user clicks the icon of "Video 2", and in response to the user's click operation, the mobile phone displays the interface 103 shown in (c) of FIG. 1 and starts to play video 2.
  • the video application can continue to play the seventh episode of the video 2 according to the user's historical viewing records, which will not be repeated here.
  • the multiple videos recommended by the video application for the user may belong to different categories, or the multiple videos may correspond to different themes.
  • video 1 on the interface 102 shown in (b) of FIG. 1 may correspond to a violent theme
  • video 2 may correspond to a bloody theme
  • video 3 may correspond to a love theme
  • video 4 may correspond to a teaching theme
• video 5 may correspond to inspirational themes, etc.
  • each video may include multiple video segments of different types.
  • the seventh episode of the video 2 being played by the user may simultaneously include different segments such as violent segments, bloody segments, love segments, family affection segments, and funny segments.
  • users may have different viewing needs in different scenarios.
  • users may use different electronic devices such as mobile phones, tablets, personal computers, and smart screens to watch videos in different scenarios, such as home scenarios and work scenarios.
  • the user's viewing requirements may also change, and the user's playback requirements will also vary during the process of watching the video. For example, the user may expect to play certain video clips, expect to skip certain video clips, and so on.
  • the embodiment of the present application will provide a method for recommending video clips, so as to recommend video clips that meet the current viewing needs for different users and different scenarios, so as to improve the viewing experience of users.
• The terms "first" and "second" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance or implicitly specifying the quantity of the indicated technical features. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of these features.
• The method for personalized recommendation of video clips provided in the embodiments of the present application can be applied to electronic devices such as mobile phones, tablet computers, wearable devices, vehicle-mounted devices, augmented reality (augmented reality, AR)/virtual reality (virtual reality, VR) devices, notebook computers, ultra-mobile personal computers (ultra-mobile personal computer, UMPC), netbooks, and personal digital assistants (personal digital assistant, PDA); the embodiments of the present application do not impose any restriction on the specific type of electronic device.
  • FIG. 2 is a schematic structural diagram of an electronic device 100 provided in an embodiment of the present application.
• The electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (universal serial bus, USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, an earphone jack 170D, a sensor module 180, a button 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (subscriber identification module, SIM) card interface 195, and the like.
  • the sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, bone conduction sensor 180M, etc.
  • the structure illustrated in the embodiment of the present application does not constitute a specific limitation on the electronic device 100 .
  • the electronic device 100 may include more or fewer components than shown in the figure, or combine certain components, or separate certain components, or arrange different components.
  • the illustrated components can be realized in hardware, software or a combination of software and hardware.
• The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (application processor, AP), a modem processor, a graphics processing unit (graphics processing unit, GPU), an image signal processor (image signal processor, ISP), a controller, a memory, a video codec, a digital signal processor (digital signal processor, DSP), a baseband processor, and/or a neural-network processing unit (neural-network processing unit, NPU), etc. The different processing units may be independent devices or may be integrated into one or more processors.
  • the controller may be the nerve center and command center of the electronic device 100 .
  • the controller can generate an operation control signal according to the instruction opcode and timing signal, and complete the control of fetching and executing the instruction.
  • a memory may also be provided in the processor 110 for storing instructions and data.
• The memory in the processor 110 is a cache memory. This memory may hold instructions or data that the processor 110 has just used or uses cyclically. If the processor 110 needs to use those instructions or data again, it can call them directly from this memory, avoiding repeated access and reducing the waiting time of the processor 110, thus improving system efficiency.
  • the processor 110 stores programs or instructions corresponding to the method of recommending video clips for the user in the embodiment of the present application.
• This method can trigger the electronic device 100 to collect and identify the user's identity, collect the current scene information, and so on, and to upload the collected user identity and current scene information to the server. It can also control the electronic device 100 to receive information about one or more videos recommended by the server, as well as the segments included in each video and their label information.
• The processor 110 can also query the label information of each video segment included in a video and, according to the current user identity, match segments that meet the user's viewing needs. For example, when it is determined that the current user is a child user or an underage user, the processor 110 may control the electronic device 100 to switch to children's mode and restrict the range of content the electronic device 100 is permitted to play in that mode. If the child or underage user plays a selected video, the electronic device 100 is controlled to play only the video clips suitable for the child user to watch.
  • the processor 110 may also determine the identity of the user according to the collected user characteristics, and determine the current scene according to the collected scene information, which will not be described herein again.
  • processor 110 may include one or more interfaces.
• The interface may include an inter-integrated circuit (inter-integrated circuit, I2C) interface, an inter-integrated circuit sound (inter-integrated circuit sound, I2S) interface, a pulse code modulation (pulse code modulation, PCM) interface, a universal asynchronous receiver/transmitter (universal asynchronous receiver/transmitter, UART) interface, a mobile industry processor interface (mobile industry processor interface, MIPI), a general-purpose input/output (general-purpose input/output, GPIO) interface, a subscriber identity module (subscriber identity module, SIM) interface, and/or a universal serial bus (universal serial bus, USB) interface, etc.
• The I2C interface is a bidirectional synchronous serial bus, including a serial data line (serial data line, SDA) and a serial clock line (serial clock line, SCL).
  • processor 110 may include multiple sets of I2C buses.
  • the processor 110 can be respectively coupled to the touch sensor 180K, the charger, the flashlight, the camera 193 and the like through different I2C bus interfaces.
  • the processor 110 may be coupled to the touch sensor 180K through the I2C interface, so that the processor 110 and the touch sensor 180K communicate through the I2C bus interface to realize the touch function of the electronic device 100 .
  • the I2S interface can be used for audio communication.
  • the PCM interface can also be used for audio communication, sampling, quantizing and encoding the analog signal.
  • the audio module 170 and the wireless communication module 160 may be coupled through a PCM bus interface.
  • the audio module 170 can also transmit audio signals to the wireless communication module 160 through the PCM interface, so as to realize the function of answering calls through the Bluetooth headset. Both the I2S interface and the PCM interface can be used for audio communication.
  • the UART interface is a universal serial data bus used for asynchronous communication.
  • the bus can be a bidirectional communication bus. It converts the data to be transmitted between serial communication and parallel communication.
  • a UART interface is generally used to connect the processor 110 and the wireless communication module 160 .
  • the processor 110 communicates with the Bluetooth module in the wireless communication module 160 through the UART interface to realize the Bluetooth function.
  • the audio module 170 can transmit audio signals to the wireless communication module 160 through the UART interface, so as to realize the function of playing music through the Bluetooth headset.
  • the MIPI interface can be used to connect the processor 110 with peripheral devices such as the display screen 194 and the camera 193 .
  • MIPI interface includes camera serial interface (camera serial interface, CSI), display serial interface (display serial interface, DSI), etc.
  • the processor 110 communicates with the camera 193 through the CSI interface to realize the shooting function of the electronic device 100 .
  • the processor 110 communicates with the display screen 194 through the DSI interface to realize the display function of the electronic device 100 .
  • the GPIO interface can be configured by software.
  • the GPIO interface can be configured as a control signal or as a data signal.
  • the GPIO interface can be used to connect the processor 110 with the camera 193 , the display screen 194 , the wireless communication module 160 , the audio module 170 , the sensor module 180 and so on.
  • the GPIO interface can also be configured as an I2C interface, I2S interface, UART interface, MIPI interface, etc.
  • the USB interface 130 is an interface conforming to the USB standard specification, specifically, it can be a Mini USB interface, a Micro USB interface, a USB Type C interface, and the like.
  • the USB interface 130 can be used to connect a charger to charge the electronic device 100 , and can also be used to transmit data between the electronic device 100 and peripheral devices. It can also be used to connect headphones and play audio through them. This interface can also be used to connect other electronic devices, such as AR devices.
  • the interface connection relationship between the modules shown in the embodiment of the present application is only a schematic illustration, and does not constitute a structural limitation of the electronic device 100 .
  • the electronic device 100 may also adopt different interface connection manners in the foregoing embodiments, or a combination of multiple interface connection manners.
  • the charging management module 140 is configured to receive a charging input from a charger.
  • the charger may be a wireless charger or a wired charger.
  • the charging management module 140 can receive charging input from the wired charger through the USB interface 130 .
  • the charging management module 140 may receive a wireless charging input through a wireless charging coil of the electronic device 100 . While the charging management module 140 is charging the battery 142 , it can also provide power for electronic devices through the power management module 141 .
  • the power management module 141 is used for connecting the battery 142 , the charging management module 140 and the processor 110 .
  • the power management module 141 receives the input from the battery 142 and/or the charging management module 140 to provide power for the processor 110 , the internal memory 121 , the external memory, the display screen 194 , the camera 193 , and the wireless communication module 160 .
  • the power management module 141 can also be used to monitor parameters such as battery capacity, battery cycle times, and battery health status (leakage, impedance).
  • the power management module 141 may also be disposed in the processor 110 .
  • the power management module 141 and the charging management module 140 may also be set in the same device.
  • the wireless communication function of the electronic device 100 can be realized by the antenna 1 , the antenna 2 , the mobile communication module 150 , the wireless communication module 160 , a modem processor, a baseband processor, and the like.
  • Antenna 1 and Antenna 2 are used to transmit and receive electromagnetic wave signals.
  • Each antenna in electronic device 100 may be used to cover single or multiple communication frequency bands. Different antennas can also be multiplexed to improve the utilization of the antennas.
  • Antenna 1 can be multiplexed as a diversity antenna of a wireless local area network.
  • the antenna may be used in conjunction with a tuning switch.
  • the mobile communication module 150 can provide wireless communication solutions including 2G/3G/4G/5G applied on the electronic device 100 .
  • the mobile communication module 150 may include at least one filter, switch, power amplifier, low noise amplifier (low noise amplifier, LNA) and the like.
  • the mobile communication module 150 can receive electromagnetic waves through the antenna 1, filter and amplify the received electromagnetic waves, and send them to the modem processor for demodulation.
  • the mobile communication module 150 can also amplify the signals modulated by the modem processor, and convert them into electromagnetic waves through the antenna 1 for radiation.
  • at least part of the functional modules of the mobile communication module 150 may be set in the processor 110.
  • at least part of the functional modules of the mobile communication module 150 and at least part of the modules of the processor 110 may be set in the same device.
  • a modem processor may include a modulator and a demodulator.
  • the modulator is used for modulating the low-frequency baseband signal to be transmitted into a medium-high frequency signal.
  • the demodulator is used to demodulate the received electromagnetic wave signal into a low frequency baseband signal. Then the demodulator sends the demodulated low-frequency baseband signal to the baseband processor for processing.
  • the low-frequency baseband signal is passed to the application processor after being processed by the baseband processor.
  • the application processor outputs sound signals through audio equipment (not limited to speaker 170A, receiver 170B, etc.), or displays images or videos through display screen 194 .
  • the modem processor may be a stand-alone device.
  • the modem processor may be independent from the processor 110, and be set in the same device as the mobile communication module 150 or other functional modules.
• The wireless communication module 160 can provide wireless communication solutions applied on the electronic device 100, including wireless local area networks (wireless local area networks, WLAN) (such as a wireless fidelity (wireless fidelity, Wi-Fi) network), Bluetooth (bluetooth, BT), the global navigation satellite system (global navigation satellite system, GNSS), frequency modulation (frequency modulation, FM), near field communication (near field communication, NFC) technology, infrared (infrared, IR) technology, and the like.
  • the wireless communication module 160 may be one or more devices integrating at least one communication processing module.
  • the wireless communication module 160 receives electromagnetic waves via the antenna 2 , frequency-modulates and filters the electromagnetic wave signals, and sends the processed signals to the processor 110 .
  • the wireless communication module 160 can also receive the signal to be sent from the processor 110 , frequency-modulate it, amplify it, and convert it into electromagnetic waves through the antenna 2 for radiation.
  • the antenna 1 of the electronic device 100 is coupled to the mobile communication module 150, and the antenna 2 is coupled to the wireless communication module 160, so that the electronic device 100 can communicate with the network and other devices through wireless communication technology.
  • the wireless communication technology may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technology, etc.
  • the GNSS may include a global positioning system (GPS), a global navigation satellite system (GLONASS), a BeiDou navigation satellite system (BDS), a quasi-zenith satellite system (QZSS), and/or a satellite based augmentation system (SBAS).
  • the electronic device 100 can communicate with the server through the mobile communication module 150 or the wireless communication module 160 .
  • information may be exchanged with the server based on a Wi-Fi network, or based on wireless communication such as 2G/3G/4G/5G, which is not limited in this embodiment of the present application.
  • the electronic device 100 realizes the display function through the GPU, the display screen 194 , and the application processor.
  • the GPU is a microprocessor for image processing, and is connected to the display screen 194 and the application processor. GPUs are used to perform mathematical and geometric calculations for graphics rendering.
  • Processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
  • the display screen 194 is used to display images, videos and the like.
  • the display screen 194 includes a display panel.
  • the display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light emitting diode (AMOLED), a flexible light-emitting diode (FLED), a MiniLED, a MicroLED, a Micro-OLED, quantum dot light emitting diodes (QLED), etc.
  • the electronic device 100 may include 1 or N display screens 194 , where N is a positive integer greater than 1.
  • the display screen 194 of the mobile phone may receive the user's operation, trigger the mobile phone to start collecting user characteristics, or start collecting current scene information.
  • a corresponding video list interface and the like may be displayed on the display screen 194 of the mobile phone.
  • the playback interface of the video can be displayed for the user on the display screen 194, and when a certain segment of the video is filtered out, prompt information or prompt windows with different contents can be displayed to notify the user of the current playback progress.
  • the user can also perform operations such as clicking, double-clicking, and sliding on the display screen 194 , and the content displayed on the display screen 194 of the electronic device 100 responds according to the user's operation, which will not be repeated here.
  • the electronic device 100 can realize the shooting function through the ISP, the camera 193 , the video codec, the GPU, the display screen 194 and the application processor.
  • the ISP is used for processing the data fed back by the camera 193 .
  • light is transmitted through the lens to the photosensitive element of the camera, where the optical signal is converted into an electrical signal; the photosensitive element transmits the electrical signal to the ISP for processing, which converts it into an image visible to the naked eye.
  • ISP can also perform algorithm optimization on image noise, brightness, and skin color.
  • ISP can also optimize the exposure, color temperature and other parameters of the shooting scene.
  • the ISP may be located in the camera 193 .
  • Camera 193 is used to capture still images or video.
  • the object generates an optical image through the lens and projects it to the photosensitive element.
  • the photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor.
  • the photosensitive element converts the light signal into an electrical signal, and then transmits the electrical signal to the ISP to convert it into a digital image signal.
  • the ISP outputs the digital image signal to the DSP for processing.
  • DSP converts digital image signals into standard RGB, YUV and other image signals.
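As an illustration of the kind of conversion the DSP performs, the following Python sketch applies the full-range BT.601 YUV-to-RGB equations. These coefficients are one common choice and are an assumption here; an actual DSP may use different coefficients or value ranges.

```python
def yuv_to_rgb(y, u, v):
    """Convert one full-range BT.601 YUV pixel (0-255 per channel) to RGB.

    Illustrative sketch only; real DSPs may use other coefficient sets.
    """
    r = y + 1.402 * (v - 128)
    g = y - 0.344136 * (u - 128) - 0.714136 * (v - 128)
    b = y + 1.772 * (u - 128)
    clamp = lambda x: max(0, min(255, int(round(x))))  # keep result in 0-255
    return clamp(r), clamp(g), clamp(b)
```

For example, a neutral gray pixel (Y=128, U=V=128) maps back to gray in RGB.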
  • the electronic device 100 may include 1 or N cameras 193 , where N is a positive integer greater than 1.
  • the camera 193 of the electronic device 100 may be a front camera, a rear camera, or an under-screen camera.
  • the camera 193 may receive instructions from the processor 110 to collect user characteristics, or collect current scene information, etc., which will not be repeated here.
  • Digital signal processors are used to process digital signals. In addition to digital image signals, they can also process other digital signals. For example, when the electronic device 100 selects a frequency point, the digital signal processor is used to perform Fourier transform on the energy of the frequency point.
  • Video codecs are used to compress or decompress digital video.
  • the electronic device 100 may support one or more video codecs.
  • the electronic device 100 can play or record videos in various encoding formats, for example: moving picture experts group (moving picture experts group, MPEG) 1, MPEG2, MPEG3, MPEG4 and so on.
  • the NPU is a neural-network (NN) computing processor.
  • Applications such as intelligent cognition of the electronic device 100 can be realized through the NPU, such as image recognition, face recognition, speech recognition, text understanding, and the like.
  • the external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, so as to expand the storage capacity of the electronic device 100.
  • the external memory card communicates with the processor 110 through the external memory interface 120 to implement a data storage function, for example, saving files such as music and videos on the external memory card.
  • the internal memory 121 may be used to store computer-executable program codes including instructions.
  • the processor 110 executes various functional applications and data processing of the electronic device 100 by executing instructions stored in the internal memory 121 .
  • the internal memory 121 may include an area for storing programs and an area for storing data.
  • the program storage area can store the operating system, at least one application program required by a function (such as a sound playing function or an image playing function), and the like.
  • the data storage area can store data created during the use of the electronic device 100 (such as audio data and a phonebook) and the like.
  • the internal memory 121 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, flash memory device, universal flash storage (universal flash storage, UFS) and the like.
  • the electronic device 100 can implement audio functions, such as music playback and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the earphone interface 170D, and the application processor.
  • the audio module 170 is used to convert digital audio information into analog audio signal output, and is also used to convert analog audio input into digital audio signal.
  • the audio module 170 may also be used to encode and decode audio signals.
  • the audio module 170 may be set in the processor 110 , or some functional modules of the audio module 170 may be set in the processor 110 .
  • Speaker 170A, also referred to as a "horn", is used to convert audio electrical signals into sound signals.
  • Electronic device 100 can listen to music through speaker 170A, or listen to hands-free calls.
  • Receiver 170B, also called an "earpiece", is used to convert audio electrical signals into sound signals.
  • the receiver 170B can be placed close to the human ear to receive the voice.
  • the microphone 170C, also called a "mic", is used to convert sound signals into electrical signals. When making a phone call or sending a voice message, the user can put the mouth close to the microphone 170C to make a sound and input the sound signal into the microphone 170C.
  • the electronic device 100 may be provided with at least one microphone 170C. In some other embodiments, the electronic device 100 may be provided with two microphones 170C, which may also implement a noise reduction function in addition to collecting sound signals. In some other embodiments, the electronic device 100 can also be provided with three, four or more microphones 170C to collect sound signals, reduce noise, identify sound sources, and realize directional recording functions, etc.
  • the earphone interface 170D is used for connecting wired earphones.
  • the earphone interface 170D may be the USB interface 130, or a 3.5 mm open mobile terminal platform (OMTP) standard interface, or a Cellular Telecommunications Industry Association of the USA (CTIA) standard interface.
  • the pressure sensor 180A is used to sense the pressure signal and convert the pressure signal into an electrical signal.
  • pressure sensor 180A may be disposed on display screen 194 .
  • the gyro sensor 180B can be used to determine the motion posture of the electronic device 100 .
  • the angular velocity of the electronic device 100 around three axes (i.e., the x, y, and z axes) may be determined by the gyro sensor 180B.
  • the gyro sensor 180B can be used for image stabilization.
  • the air pressure sensor 180C is used to measure air pressure.
  • the electronic device 100 calculates the altitude based on the air pressure value measured by the air pressure sensor 180C to assist positioning and navigation.
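The altitude calculation described above can be sketched with the international barometric formula. The constants below assume a standard atmosphere (sea-level pressure 1013.25 hPa); a real device would calibrate against the local sea-level pressure.

```python
def altitude_from_pressure(p_hpa, p0_hpa=1013.25):
    """Estimate altitude in meters from measured air pressure.

    Standard-atmosphere sketch; p0_hpa is the assumed sea-level pressure.
    """
    return 44330.0 * (1.0 - (p_hpa / p0_hpa) ** (1.0 / 5.255))
```

At the reference pressure the estimated altitude is zero, and lower measured pressure yields a higher altitude, which is the relationship the positioning assistance relies on.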
  • the magnetic sensor 180D includes a Hall sensor.
  • the electronic device 100 may use the magnetic sensor 180D to detect the opening and closing of the flip leather case.
  • when the electronic device 100 is a flip phone, the electronic device 100 can detect the opening and closing of the flip cover according to the magnetic sensor 180D, and features such as automatic unlocking upon opening the flip cover can be set accordingly.
  • the acceleration sensor 180E can detect the acceleration of the electronic device 100 in various directions (generally along three axes). When the electronic device 100 is stationary, the magnitude and direction of gravity can be detected. It can also be used to identify the posture of the electronic device, and can be applied to horizontal/vertical screen switching, pedometers, and the like.
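For instance, horizontal/vertical screen switching can be inferred from the gravity components reported by the acceleration sensor. This simplified two-axis Python sketch (the axis convention is an assumption) picks the axis with the larger gravity projection:

```python
def screen_orientation(ax, ay):
    """Infer portrait vs. landscape from gravity along the device's x and y axes.

    Simplified two-axis sketch; real devices also use the z axis and hysteresis.
    """
    return "portrait" if abs(ay) >= abs(ax) else "landscape"
```

With gravity mostly along the y axis the device is upright (portrait); mostly along x, it is on its side (landscape).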
  • the distance sensor 180F is used to measure the distance.
  • the electronic device 100 may measure the distance by infrared or laser. In some embodiments, when shooting a scene, the electronic device 100 may use the distance sensor 180F for distance measurement to achieve fast focusing.
  • Proximity light sensor 180G may include, for example, light emitting diodes (LEDs) and light detectors, such as photodiodes.
  • the light emitting diodes may be infrared light emitting diodes.
  • the electronic device 100 emits infrared light through the light emitting diode.
  • Electronic device 100 uses photodiodes to detect infrared reflected light from nearby objects. When sufficient reflected light is detected, it may be determined that there is an object near the electronic device 100 . When insufficient reflected light is detected, the electronic device 100 may determine that there is no object near the electronic device 100 .
  • the electronic device 100 can use the proximity light sensor 180G to detect that the user is holding the electronic device 100 close to the ear to make a call, so as to automatically turn off the screen to save power.
  • the proximity light sensor 180G can also be used in leather case mode, automatic unlock and lock screen in pocket mode.
  • the ambient light sensor 180L is used for sensing ambient light brightness.
  • the electronic device 100 can adaptively adjust the brightness of the display screen 194 according to the perceived ambient light brightness.
  • the ambient light sensor 180L can also be used to automatically adjust the white balance when taking pictures.
  • the ambient light sensor 180L can also cooperate with the proximity light sensor 180G to detect whether the electronic device 100 is in the pocket, so as to prevent accidental touch.
  • the fingerprint sensor 180H is used to collect fingerprints.
  • the electronic device 100 can use the collected fingerprint characteristics to implement fingerprint unlocking, access to application locks, take pictures with fingerprints, answer incoming calls with fingerprints, and the like.
  • the fingerprint sensor 180H can obtain the user's fingerprint information and transmit it to the processor 110, and the processor 110 compares it with the stored fingerprint information to perform identity verification and confirmation.
  • the temperature sensor 180J is used to detect temperature.
  • the electronic device 100 uses the temperature detected by the temperature sensor 180J to implement a temperature treatment strategy. For example, when the temperature reported by the temperature sensor 180J exceeds the threshold, the electronic device 100 may reduce the performance of the processor located near the temperature sensor 180J, so as to reduce power consumption and implement thermal protection.
  • when the temperature is lower than another threshold, the electronic device 100 heats the battery 142 to prevent the electronic device 100 from being shut down abnormally due to the low temperature.
  • when the temperature is lower than still another threshold, the electronic device 100 boosts the output voltage of the battery 142 to avoid abnormal shutdown caused by low temperature.
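The multi-threshold temperature treatment strategy can be sketched as follows; the threshold values and action names are illustrative assumptions, not values taken from this application:

```python
def thermal_action(temp_c, high=45.0, low=0.0, very_low=-10.0):
    """Pick a thermal-protection action from the measured temperature.

    Three-threshold policy sketch: throttle when hot, heat the battery when
    cold, boost battery voltage when very cold. Thresholds are assumed.
    """
    if temp_c > high:
        return "throttle_processor"   # reduce performance near the sensor
    if temp_c < very_low:
        return "boost_battery_voltage"  # avoid abnormal low-temperature shutdown
    if temp_c < low:
        return "heat_battery"         # warm the battery 142
    return "normal"
```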
  • Touch sensor 180K is also known as a "touch panel".
  • the touch sensor 180K can be disposed on the display screen 194, and the touch sensor 180K and the display screen 194 form a touch screen, also called a “touch screen”.
  • the touch sensor 180K is used to detect a touch operation on or near it.
  • the touch sensor can pass the detected touch operation to the application processor to determine the type of touch event.
  • Visual output related to the touch operation can be provided through the display screen 194 .
  • the touch sensor 180K may also be disposed on the surface of the electronic device 100 , which is different from the position of the display screen 194 .
  • the touch sensor 180K can detect user operations such as touch, click, double click, etc., and transmit the user operations to the processor 110, and the processor 110 makes a response.
  • the bone conduction sensor 180M can acquire vibration signals. In some embodiments, the bone conduction sensor 180M can acquire the vibration signal of the vibrating bone mass of the human voice. The bone conduction sensor 180M can also contact the human pulse and receive the blood pressure beating signal. In some embodiments, the bone conduction sensor 180M can also be disposed in the earphone, combined into a bone conduction earphone.
  • the audio module 170 can analyze the voice signal based on the vibration signal of the vibrating bone mass of the vocal part acquired by the bone conduction sensor 180M, so as to realize the voice function.
  • the application processor can analyze the heart rate information based on the blood pressure beating signal acquired by the bone conduction sensor 180M, so as to realize the heart rate detection function.
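Heart rate analysis from the pulse signal acquired by the bone conduction sensor can be approximated by counting upward threshold crossings, as in this hedged Python sketch (the sampling rate and threshold are assumed parameters; real implementations use more robust peak detection):

```python
def heart_rate_bpm(samples, fs, threshold):
    """Estimate heart rate from a pulse waveform by counting beats.

    Counts upward crossings of `threshold` in `samples` taken at `fs` Hz,
    then scales to beats per minute. Simplified sketch only.
    """
    beats = 0
    above = False
    for s in samples:
        if s >= threshold and not above:
            beats += 1          # one rising edge = one beat
            above = True
        elif s < threshold:
            above = False
    duration_min = len(samples) / fs / 60.0
    return beats / duration_min
```

For example, a synthetic signal with 12 pulses over 12 seconds yields 60 beats per minute.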
  • the keys 190 include a power key, a volume key and the like.
  • the key 190 may be a mechanical key or a touch key.
  • the electronic device 100 can receive key input and generate key signal input related to user settings and function control of the electronic device 100 .
  • the motor 191 can generate a vibrating reminder.
  • the motor 191 can be used for incoming call vibration prompts, and can also be used for touch vibration feedback.
  • touch operations applied to different applications may correspond to different vibration feedback effects.
  • the motor 191 may also correspond to different vibration feedback effects for touch operations acting on different areas of the display screen 194 .
  • different application scenarios (for example, time reminder, receiving information, alarm clock, games, etc.) may also correspond to different vibration feedback effects.
  • the touch vibration feedback effect can also support customization.
  • the indicator 192 can be an indicator light, and can be used to indicate charging status, power change, and can also be used to indicate messages, missed calls, notifications, and the like.
  • the SIM card interface 195 is used for connecting a SIM card.
  • the SIM card can be connected and separated from the electronic device 100 by inserting it into the SIM card interface 195 or pulling it out from the SIM card interface 195 .
  • the electronic device 100 may support 1 or N SIM card interfaces, where N is a positive integer greater than 1.
  • the SIM card interface 195 can support a Nano SIM card, a Micro SIM card, a SIM card, etc. Multiple cards can be inserted into the same SIM card interface 195 at the same time. The types of the multiple cards may be the same or different.
  • the SIM card interface 195 is also compatible with different types of SIM cards.
  • the SIM card interface 195 is also compatible with external memory cards.
  • the electronic device 100 interacts with the network through the SIM card to implement functions such as calling and data communication.
  • the electronic device 100 adopts an eSIM, that is, an embedded SIM card.
  • the eSIM card can be embedded in the electronic device 100 and cannot be separated from the electronic device 100 .
  • the electronic device 100 may include some or all of the structures introduced in FIG. 2 , or may include more or less structures than those in FIG. 2 , and the embodiment of the present application does not limit the hardware structure of the electronic device 100 .
  • the server may also include part or all of the structures introduced in FIG. 2 , or may include more or less structures than those in FIG. 2 , and this embodiment of the present application does not limit the structure of the server.
  • the software system of the electronic device 100 may adopt a layered architecture, an event-driven architecture, a micro-kernel architecture, a micro-service architecture, or a cloud architecture.
  • the embodiment of this application takes an Android system with a layered architecture as an example to illustrate the software structure of the electronic device 100.
  • FIG. 3 is a block diagram of the software structure of the electronic device 100 according to the embodiment of the present application.
  • the layered architecture divides the software into several layers, and each layer has a clear role and division of labor. Layers communicate through software interfaces.
  • the Android system is divided into four layers, which are, from top to bottom, the application layer, the application framework layer, the Android runtime and system libraries, and the kernel layer, as well as a network transport layer.
  • the application layer can consist of a series of application packages.
  • the application package may include application programs such as camera, gallery, calendar, call, map, navigation, WLAN, Bluetooth, music, video, and short message.
  • the application framework layer provides an application programming interface (application programming interface, API) and a programming framework for applications in the application layer.
  • the application framework layer includes some predefined functions.
  • the application framework layer can include window manager, content provider, view system, phone manager, resource manager, notification manager, etc.
  • a window manager is used to manage window programs.
  • the window manager can obtain the size of the display screen, determine whether the screen has a status bar, or participate in operations such as locking the screen and capturing the screen.
  • Content providers are used to store and retrieve data and make it accessible to applications.
  • the stored data may include video data, image data, audio data, etc., and may also include data on calls made and received, the user's browsing history, bookmarks, and other data, which will not be described here.
  • the window manager may determine the size of the video playback window on the display screen according to the size of the display screen of the electronic device 100, for example, play the video for the user in a full-screen or half-screen manner.
  • the electronic device 100 receives the video data sent by the server, and the content provider can obtain the video data, such as image data and audio data, draw a video picture according to that data, and display the video picture in the playback window determined by the window manager.
  • the content provider can also obtain data such as the browsing history and viewing preferences included in the user account, so that after the server obtains the data, it can recommend personalized videos or match video clips for the user based on the data, which will not be repeated here one by one.
  • the view system includes visual controls, such as controls for displaying text, controls for displaying pictures, and so on.
  • the view system can be used to build applications.
  • a display interface can consist of one or more views.
  • a display interface including a text message notification icon may include a view for displaying text and a view for displaying pictures.
  • the video playback interface displayed on the display screen of the electronic device 100 can all be provided based on the visual controls of the view system.
  • the phone manager is used to provide communication functions of the electronic device 100 .
  • for example, the management of call states (including connecting, hanging up, etc.).
  • the resource manager provides various resources to the application, such as localized strings, icons, pictures, layout files, video files, and so on.
  • the notification manager enables the application to display notification information in the status bar of the screen, which can be used to convey messages to the user.
  • the notification information can disappear automatically after a short stay in the status bar, without requiring the user to perform an interactive process such as closing operations.
  • the notification manager can notify the user of messages such as download completion.
  • the notification manager can also be a notification that appears on the top status bar of the system in the form of a chart or scroll bar text, such as a notification of an application running in the background; or, the notification manager can also be a notification that appears on the screen in the form of a dialog window, For example, prompting text information in the status bar; or, the notification manager can also control the electronic device to emit a prompt sound, vibrate the electronic device, and flash the indicator light of the electronic device, etc., which will not be repeated here.
  • the Android runtime includes a core library and a virtual machine. The Android runtime is responsible for the scheduling and management of the Android system.
  • the core library consists of two parts: one part is the function functions that the Java language needs to call, and the other part is the core library of Android.
  • the application layer and the application framework layer run in virtual machines.
  • the virtual machine executes the Java files of the application layer and the application framework layer as binary files.
  • the virtual machine is used to perform functions such as object lifecycle management, stack management, thread management, security and exception management, and garbage collection.
  • a system library can include multiple function modules. For example: surface manager (surface manager), media library (media libraries), three-dimensional (three dimensional, 3D) graphics processing library (for example: OpenGL ES), two-dimensional (two dimensional, 2D) graphics engine, etc.
  • the surface manager is used to manage the display subsystem of the electronic device, and provides the fusion of 2D and 3D layers for multiple applications.
  • the media library supports playback and recording of various commonly used audio and video formats, as well as still image files, etc.
  • the media library can support a variety of audio and video encoding formats, such as: MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, etc.
  • the 3D graphics processing library is used to implement 3D graphics drawing, image rendering, compositing, and layer processing, etc.
  • 2D graphics engine is a drawing engine for 2D drawing.
  • the kernel layer is the layer between hardware and software.
  • the kernel layer includes at least a display driver, a camera driver, an audio driver, and a sensor driver.
  • the network transmission layer may include a communication module, and a transmission channel may be established between the communication module of the electronic device 100 and the communication module of the server 200 .
  • the communication module of the electronic device 100 and the communication module of the server 200 can communicate based on a Wi-Fi channel; or, they can communicate based on wireless communication such as 2G/3G/4G/5G/6G, which is not limited in this embodiment of the present application.
  • FIG. 3 also shows possible software modules of the server 200 .
  • the server 200 may at least include a communication module, a distribution module, a media service module, a content service module, a user service module, and the like.
  • the media service module can perform frame-by-frame analysis on each video uploaded by the operator, such as performing target detection, content recognition, content understanding, and natural language understanding on each frame of each video in one or several dimensions, so as to determine the label information of each video and the label information of each frame.
  • the media service module can also perform tag aggregation according to the tag information of each frame, so as to divide the video into multiple video clips, and each video clip in the multiple video clips can correspond to different tag information.
  • the media service module can also analyze and extract tag information of each video or different segments of each video, and dynamically generate tag information of video segments.
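The tag-aggregation step described above, merging consecutive frames that share a tag into video clips, can be sketched as follows (the one-tag-per-frame input is a simplified stand-in for the multi-dimensional label information):

```python
def aggregate_clips(frame_tags):
    """Merge runs of consecutive frames with the same tag into clips.

    frame_tags: list of per-frame tags, indexed by frame number.
    Returns a list of (start_frame, end_frame, tag) tuples.
    """
    clips = []
    for i, tag in enumerate(frame_tags):
        if clips and clips[-1][2] == tag and clips[-1][1] == i - 1:
            # Extend the current clip to cover this frame.
            clips[-1] = (clips[-1][0], i, tag)
        else:
            # Tag changed: start a new clip at this frame.
            clips.append((i, i, tag))
    return clips
```

Each resulting clip then corresponds to one piece of tag information, matching the segmentation behavior described for the media service module.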
  • the content service module can provide different services for different users according to the metadata of each video segment included in the video, such as recommending a certain video for the user, determining a certain video segment of the video that can be played for the user, Services such as watching back a certain video segment of the video.
  • the content service module may acquire the user identity information sent by the electronic device 100, and request the distribution module to match the target clips that can be played for the user from multiple video clips, and details will not be repeated here.
  • the user service module can receive the user characteristic data reported by the electronic device 100, build user portraits according to the user characteristic data, and encrypt and store a large amount of data of different user portraits. After receiving the user information reported by the electronic device 100, it can quickly query the stored user portrait data and determine the identity of the current user.
  • the user service module can also receive the scene information reported by the electronic device 100, for example, the user service module can receive the life scene picture reported by the electronic device 100, and determine the scene corresponding to the life scene picture according to the picture detection and recognition algorithm.
  • the distribution module can obtain and store tag information of different videos of the media service module or content service module, or tag information of different segments of a video.
  • the distribution module can also obtain the identification result of the user's identity from the user service module, or obtain the identification result of the current scene, and recommend one or more videos that meet the current viewing needs for the user according to the user's identity and scene information. As well as the target clips that can be played by the current user in each video, etc., to realize personalized and intelligent video recommendation.
  • the communication module is used to communicate with the electronic device 100; please refer to the above related description, and details will not be repeated here.
  • the method provided in this embodiment of the application can be pre-configured in the video application by the developer of the video application, so that when the user uses the video application, video clips are recommended to the user by default and automatically according to the method provided in this application; or, a shortcut control or option may be provided to the user through the setting menu of the video application, through which the user can enable the video application's function of personalized recommendation of video clips, that is, recommending video clips for the user according to the method provided in this application; or, the shortcut control or option can be integrated into the menu of a system-level application such as the Settings application, and the user can enable the function of personalized recommendation of video clips through the shortcut control or option included in that system-level application, which is not limited in this embodiment of the application.
  • the user can enable the function of personalized recommendation of video clips of the electronic device through the setting function of the video application.
  • the "video application" in this embodiment of the application may be Huawei Video provided by Huawei, or any other third-party video application with the function, such as Youku Video, Tencent Video, iQiyi Video, and Mango TV, which is not limited in this embodiment of the present application.
  • FIG. 4 is a schematic interface diagram of an example of a user enabling a function of personalized recommended video clips provided by the embodiment of the present application.
  • Figure (a) in Figure 4 shows the main interface 401 currently output on the screen of the mobile phone in the unlocked state. As shown in (a) of Figure 4, the user clicks the video application icon on the main interface 401, and in response to the user's click operation, the mobile phone displays the main interface 402 of the video application as shown in (b) of FIG. 4.
  • the setting interface 403 may include a plurality of different setting menus and options. Exemplarily, as shown in (c) of FIG. 4, the setting interface 403 may include an "account settings" menu, an "appearance settings" menu, a "playback settings" menu, a "download settings" menu, an "other settings" menu, etc.
  • the "Account Settings” menu can include options such as account security center, personal data and regional settings
  • the "Appearance Settings” menu can include options such as dark mode
  • the "Playback Settings" menu can include options such as skipping opening and closing credits, continuous playback, etc., which will not be described in detail in this embodiment of the present application.
  • the shortcut control for enabling the function of personalized recommended video clips may be included in the interface corresponding to the "other settings” menu.
  • the user clicks the "Other Settings" menu, and in response to the user's click operation, the mobile phone further displays the interface 404 corresponding to the "Other Settings" menu as shown in (d) of Figure 4.
  • the interface 404 may include a variety of shortcut controls (or switches), such as the "allow recommendation based on historical records" control, the "allow display of advertisements" control, the "allow non-WiFi automatic play" control, the "allow intelligent detection of identity information" control, the "allow scene recognition" control, and the "allow matching video clips based on user identity" control, etc.
  • the user may click some or all of the shortcut controls (or switches) to enable corresponding functions according to their own needs.
  • the user can click the shortcut controls (or switches) to enable the video application's "allow recommendation based on historical records" function, "allow intelligent detection of identity information" function, "allow scene recognition" function, and "allow matching video clips based on user identity" function.
  • Fig. 5 is a schematic flowchart of an example of a method for personalized recommendation of video clips provided by an embodiment of the present application. As shown in FIG. 5, the method 500 can be applied to the system including the electronic device 100 and the server 200 described above, and includes a video preprocessing stage, a user identity determination stage, and a video matching stage.
  • the "video preprocessing stage" can be understood as: after the operator of the video application uploads one or more videos to the server 200 corresponding to the video application, the server 200 first preprocesses each video.
  • the "determining user identity stage" can be understood as: after the user runs the video application, the electronic device 100 collects user information, and the electronic device 100 or server 200 determines the user's identity based on the collected user information, such as an adult user, an underage user, a parent user, a device owner user, a child user, or another family member.
  • the "matching video stage" can be understood as: in the process of using a video application, the video application can request video resources from the server 200, and the server 200 matches one or more videos for the user from a large number of video resources and recommends them on the running interface of the video application; or, while the user is playing a certain video, the server matches and plays the target segment of the video for the user.
  • the three stages are described in detail below.
  • the operator of the video application uploads a first video file to the server 200 corresponding to the video application, and the media service module of the server 200 acquires the first video file.
  • the "first video file" in the embodiment of the present application may also be referred to as "the first video"; the operator uploads one or more videos to the server 200, and the "first video" may be any one of the one or more video files.
  • the embodiment of the present application will use the first video as an example to introduce the processing process of the first video.
  • the "server 200" in the embodiment of the present application may be a server corresponding to a video application, such as Huawei Video.
  • the server 200 is the server corresponding to Huawei Video; if the video application is Youku Video, the server may be the server corresponding to Youku Video. For simplicity, they are collectively referred to as the server 200, and the embodiment of the present application does not limit the video application.
  • the media service module of the server 200 analyzes the first video frame by frame, and generates metadata of the first video and tag information of the first video.
  • the media service module of the server 200 may perform intelligent frame-by-frame analysis on the first video based on various media artificial intelligence (AI) algorithms, and generate metadata of the first video and tag information of the first video.
  • the label information corresponding to each frame can be obtained, and then, according to the label information of each frame, the N frames of the first video are aggregated into a plurality of fragmented video segments, such as video segment 1, video segment 2, video segment 3, etc.
  • the "metadata of the first video" in the embodiment of the present application can be understood as: the content data corresponding to the fragmented video segments obtained by dividing the first video according to certain principles, with the label information of each frame as the dimension.
  • the metadata of the first video may include metadata of video clip 1, metadata of video clip 2, and metadata of video clip 3
  • the metadata of the video clip 1 may include basic content information such as image data, audio data, and text data of the video clip 1, as well as the media stream address of the video clip 1. This embodiment of the present application does not limit the content included in the metadata.
  • the media service module can perform detection in one or several dimensions, such as target detection, content recognition, content understanding, and natural language understanding, for each frame based on the media AI algorithm, and then determine the label information of each frame. Then, according to a deep learning algorithm and the label information of each frame, the first video is aggregated along the segment dimension.
  • the media service module stores various video analysis algorithms, such as media AI algorithms, deep learning algorithms, etc., and step 502 can be performed by the media service module.
  • step 502 may also be performed by other remote processing modules with stronger data processing capabilities, such as other media detection platforms or training platforms.
  • the media service module of the server 200 can synchronize the first video to remote processing modules such as other media detection platforms or training platforms, and use the larger set of AI algorithms and training models stored there to carry out accurate frame-by-frame analysis of the video. After the remote platform processes the first video, it returns the metadata of the first video and the tag information of the first video obtained through analysis to the media service module, which manages and stores them; this is not limited in this embodiment of the present application.
  • each frame of the first video may correspond to multi-dimensional tag information.
  • the media service module detects the content of the first frame and determines that the first frame includes a fighting scene, and at the same time determines, based on content understanding, natural language understanding, etc., that the first frame includes bloody content; the tag information corresponding to the first frame may then include fighting and/or bloody.
  • FIG. 6 is a schematic diagram of a frame-by-frame analysis result of a video provided by the embodiment of the present application. As shown in FIG. 6 , it is assumed that the first video includes N frames, and N is greater than or equal to 1.
  • the media service module can detect the content of each frame picture frame by frame, determine the content included in each frame picture, and determine the tag information corresponding to the frame according to the content included in each frame picture.
  • if the media service module detects that the first frame includes fighting content and at the same time detects that the first frame includes bloody content, the label information corresponding to the first frame can be generated in the following multiple ways:
  • the media service module detects the first frame to the Nth frame of the first video, and determines the label information corresponding to each frame.
  • tags of multiple dimensions in the same frame may correspond to different priority orders; for example, which tag among the multiple dimensions is preferentially used can be determined according to the tags of adjacent frames.
  • the first frame corresponds to the "fighting" label and the "bleeding" label, and it can be determined from the labels of the adjacent second and third frames that the "fighting" label appears more often, so the "fighting" label can be used as the priority label; the "fighting" label and the "bleeding" label thus correspond to different priority orders, with the "fighting" label having the higher priority, which is not limited in this embodiment of the present application.
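  • The adjacent-frame priority rule described above can be sketched in Python. This is an illustrative sketch only (the patent specifies no implementation); the window size, function name, and tag strings are assumptions:

```python
from collections import Counter

def priority_tag(frame_tags, index, window=2):
    """Pick the priority tag for frame `index` among its multi-dimensional
    tags, preferring the tag that appears most often in adjacent frames."""
    candidates = frame_tags[index]
    lo = max(0, index - window)
    hi = min(len(frame_tags), index + window + 1)
    # Count tag occurrences over the neighbouring frames only.
    counts = Counter(tag for i in range(lo, hi) if i != index
                     for tag in frame_tags[i])
    # The candidate with the highest neighbour count gets priority.
    return max(candidates, key=lambda t: counts[t])

# The first frame carries both "fighting" and "bleeding"; the adjacent
# 2nd and 3rd frames carry "fighting", so "fighting" becomes the priority tag.
frames = [["fighting", "bleeding"], ["fighting"], ["fighting"], ["hug"]]
print(priority_tag(frames, 0))  # fighting
```

On ties, `max` keeps the first candidate, so the frame's own tag order acts as a tie-breaker.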
  • the media service module may aggregate the tags of video segments according to tag similarities between adjacent frames, and further divide the first video into video segments; for example, the first video is divided into multiple video segments, each of which may correspond to different tag information.
  • Table 1 lists several possible ways of dividing video segments.
  • the first video may be divided into multiple video segments according to different tag information.
  • the tag with the most repeated occurrences among the multiple frames included in the video segment may be determined as the tag information of the video segment.
  • Each video clip can have multiple tags.
  • each frame of the first frame to the fourth frame included in segment 1 corresponds to the label information "action>fighting"; that is, "action>fighting" is the label with the most repeated occurrences in the first to fourth frames included in segment 1, so "action>fighting" is determined as the tag information of segment 1.
  • "emotion>hug" has the most repeated occurrences in the 5th to 11th frames included in segment 2, so "emotion>hug" can be determined as the label information of segment 2; "emotion>love" appears most frequently in the 12th to 15th frames included in segment 3, so "emotion>love" can be determined as the label information of segment 3.
  • each frame may include multiple tags, and the tag information of the video clip may be determined according to any tag of each frame in the multiple frames included in the video clip.
  • each frame of the first frame to the seventh frame included in segment 1 corresponds to two kinds of label information, and the most frequently occurring tag, "bloody", can be used as the tag information of segment 1.
  • each frame of the 8th to 11th frames included in segment 2 corresponds to two kinds of tag information, and any one of them can be used as the tag information of segment 2.
  • each of the 12th to 15th frames included in segment 3 corresponds to one kind of label information, and "emotion" can be used as the label information of segment 3.
  • each frame of the first to fourth frames included in segment 1 corresponds to two types of label information, and either one, such as "action", can be used as the tag information of segment 1.
  • each frame of the 5th to 15th frames included in segment 2 includes the "emotion" tag, so "emotion" can be used as the tag information of video segment 2, which will not be repeated here.
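  • The "most repeated tag wins" rule from Table 1 can be sketched as follows; this is an illustrative Python sketch, where the function name is an assumption and the tag strings merely echo the table:

```python
from collections import Counter

def label_segment(frame_tags):
    """Label a video segment with the tag that repeats most often
    across the frames it contains."""
    counts = Counter(tag for tags in frame_tags for tag in tags)
    return counts.most_common(1)[0][0]

# Segment 2 spans the 5th-11th frames, each tagged "emotion>hug",
# so "emotion>hug" becomes the segment's tag information.
segment2_frames = [["emotion>hug"]] * 7
print(label_segment(segment2_frames))  # emotion>hug
```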
  • a noise threshold can be set for adjacent consecutive frames, and the noise threshold can be used to mark a critical number M of frames with discontinuous labels, and M is greater than or equal to 1.
  • if the media service module determines that the labels of K consecutive frames differ from the labels of other adjacent frames, then: if K is greater than or equal to M, the first frame of the K consecutive frames is used as the dividing line for video segmentation; if K is less than M, the influence of noise can be ignored, and the K consecutive frames are divided into the same video segment as the preceding adjacent frame.
  • the label of the 3rd frame of the first video in Table 1 is "fight/bad guy", which differs from the labels of the 1st, 2nd, and 4th frames, but the run length 1 is less than the critical quantity 3, so the "fight/bad guy" label of the 3rd frame can be ignored, the 3rd frame can be divided into video segment 1 (including the 1st to 4th frames), and video segment 1 is labeled "action>bloody"; no more details are given here.
  • the noise threshold makes it possible to ignore the impact of detection errors on the aggregation of video segments, improving both the accuracy of determining the tag information of a video segment and the accuracy of dividing video segments.
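  • The noise-threshold segmentation described above (M = 3, with the 3rd frame's stray label absorbed) can be sketched in Python; the run-length representation, function name, and 8-frame example are illustrative assumptions:

```python
def split_segments(labels, noise_threshold=3):
    """Split a per-frame label sequence into segments; label runs shorter
    than `noise_threshold` frames (the critical number M) count as noise
    and are merged into the preceding segment."""
    # Collapse the sequence into runs of [label, length].
    runs = []
    for lab in labels:
        if runs and runs[-1][0] == lab:
            runs[-1][1] += 1
        else:
            runs.append([lab, 1])
    # Absorb noisy runs into the run before them; extend equal labels.
    merged = []
    for lab, n in runs:
        if merged and (n < noise_threshold or merged[-1][0] == lab):
            merged[-1][1] += n
        else:
            merged.append([lab, n])
    # Emit (label, start_frame, end_frame), frames 1-based inclusive.
    segments, start = [], 1
    for lab, n in merged:
        segments.append((lab, start, start + n - 1))
        start += n
    return segments

# Frame 3's lone "fight/bad guy" run (length 1 < 3) is absorbed, so frames
# 1-4 form one "action>bloody" segment, as in Table 1.
labels = ["action>bloody", "action>bloody", "fight/bad guy", "action>bloody",
          "emotion>hug", "emotion>hug", "emotion>hug", "emotion>hug"]
print(split_segments(labels))
# [('action>bloody', 1, 4), ('emotion>hug', 5, 8)]
```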
  • the media service module can determine the playback progress corresponding to each video segment, that is, determine the playback start time and end time corresponding to each segment. In other words, the media service module can determine the association between the tag information of each video segment and the playback progress of that segment.
  • the "playing progress of each video segment” here can be understood as the corresponding period between the start time and the end time occupied by the video segment within the complete duration (long video) of the first video.
  • video segment 1 includes the first frame to the fourth frame, and the display time of each frame is known, for example, 16.67 milliseconds (ms); the start time (e.g. 00:00) and end time (e.g. 00:09) corresponding to video segment 1 can then be accurately determined.
  • the media service module can accurately determine the playing start time and end time of each video segment such as video segment 2, video segment 3..., etc., which will not be repeated here.
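  • With a known per-frame display time, each segment's start and end timestamps follow directly from its frame range. The sketch below is a hypothetical Python illustration using the 16.67 ms frame time mentioned above; the function name and the 540-frame example are assumptions, not the patent's figures:

```python
def segment_times(first_frame, last_frame, frame_ms=16.67):
    """Return (start, end) playback timestamps in seconds for a segment
    spanning `first_frame`..`last_frame`, 1-based inclusive."""
    start = (first_frame - 1) * frame_ms / 1000.0
    end = last_frame * frame_ms / 1000.0
    return round(start, 3), round(end, 3)

# At ~60 fps, a segment covering frames 1-540 runs from 00:00 to about 00:09.
print(segment_times(1, 540))  # (0.0, 9.002)
```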
  • the media service module of the server 200 may generate metadata of the first video and label information of the first video after analyzing the first video frame by frame.
  • Table 2 lists a possible parameter content that may be included in the tag information of the video segment.
  • the first video corresponds to a unique "address of the content to be played" (content identification, content ID) and "tags" data.
  • the "address of the playing content" (content ID) is the media stream address or playing address of the first video introduced above; the parameter corresponding to the tags is represented by "TagInfo", that is, a list of tag information for the multiple video segments included in the first video.
  • Table 2:

    Parameter   Type            M/O   Parameter length (bytes)   Parameter description
    contentID   string          m     128                        The ID of the content being played
    tags        List<TagInfo>   m     -                          Long video tag information list
  • the parameter "TagInfo" corresponding to the tags of the first video in Table 2 may further include more core parameters, and Table 3 lists a possible core parameter content of TagInfo.
  • the media service module can generate information such as the tag information corresponding to each video segment included in the first video and the start time and end time occupied by each video segment. Specifically, the tag information corresponding to each video segment may include the tag semantics (theme ID) and the tag name (tag name).
  • taking "action>fighting" as an example, the tag semantics (theme ID) can indicate that the tag information includes "action" and that its further category (secondary classification) is "fighting"; or, taking "emotion>hug" as an example, the tag semantics (theme ID) can indicate that the tag information includes "emotion" and that its further category (secondary classification) is "hug", which will not be repeated here.
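  • The parameters of Tables 2 and 3 can be modelled as simple data structures; the sketch below is illustrative Python, where the field names approximate the table columns and are not the patent's exact schema:

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class TagInfo:
    """One entry of the long-video tag information list (Table 3)."""
    theme_id: str    # tag semantics, e.g. "action"
    tag_name: str    # secondary classification, e.g. "fighting"
    start_time: str  # playback start of the segment, e.g. "00:00"
    end_time: str    # playback end of the segment, e.g. "00:09"

@dataclass
class VideoTags:
    """Table 2: a unique content ID plus the tag information list."""
    content_id: str  # "contentID": string, mandatory, up to 128 bytes
    tags: List[TagInfo] = field(default_factory=list)

clip1 = TagInfo("action", "fighting", "00:00", "00:09")
video = VideoTags(content_id="first-video-0001", tags=[clip1])
print(video.tags[0].theme_id)  # action
```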
  • the media service module stores and manages the metadata of the first video and the tags of the first video, which can be understood as: the media service module stores and manages the metadata of each video segment included in the first video and the label information of each video segment, and stores the association relationship between the tag information of each video segment and the playback progress of that segment.
  • the media service module sends the metadata of the first video to the content service module.
  • the media service module may send the metadata of each video segment included in the first video to the content service module, so that when the user uses a video application to play the first video, the content service module can provide different services for different users according to the metadata of each video segment, such as recommending a certain video for the user, determining a certain video segment of the video that can be played for the user, or replaying a certain video segment of the video.
  • the media service module sends the label information of the first video to the distribution module.
  • the media service module may send the tag information of each video segment included in the first video to the distribution module, together with the association between the tag information of each video segment and the playback progress of that segment.
  • the media service module may store the obtained tag information of the one or more video segments of the first video, and provide a preview view for the operator to confirm and adjust. If the operator manually adjusts, the adjustment result can be synchronized in the media service module; that is, the media service module can update some parameters in Table 2 and Table 3 and save the updated parameters.
  • if step 502 is delegated by the media service module to other remote processing modules with stronger data processing capabilities, such as other media detection platforms or training platforms, those platforms analyze the first video, and the resulting metadata of the first video and label information of the first video are returned to and stored in the media service module. If the operator makes manual adjustments in the media service module, the adjustment results can be synchronized to the other media detection platforms or training platforms for deep learning, so that they correct their own detection processes according to the operator's adjustments; this will not be repeated here.
  • in this way, the massive number of videos uploaded by operators can be processed, and the metadata and tag information of each video can be stored, which will not be described here.
  • the user runs the video application, and triggers the collection and detection module of the electronic device 100 to collect user information.
  • when the user clicks the icon (that is, opens the video application) and enters the running interface 102 of the video application as shown in (b) of FIG. 1, the collection and detection module of the electronic device 100 can be triggered to collect user information.
  • the collection and detection module of the electronic device 100 may periodically collect user information according to a certain period (for example, 10 collections per minute).
  • the electronic device 100 may periodically collect current user information during the user's use of the video application to determine the identity of the current user.
  • the embodiment of the present application does not limit the timing for the electronic device 100 to collect user information.
  • the collection and detection module of the electronic device 100 may not collect user information, and the user may manually set the current user identity of the electronic device 100 .
  • the user can manually set the user of the current use period as a child user in the settings application or the video application; the collection and detection module of the electronic device 100 then no longer collects user information, and, according to the user's setting, recommends videos for the child user or controls the playback process of videos.
  • the user may associate the user account currently logged in on the electronic device 100, or the account logged in by the video application, with the child user; the collection and detection module of the electronic device 100 then no longer collects user information, and recommends videos for the child user or controls the playing process of the video.
  • the embodiment of the present application does not limit the manner of determining the identity of the user.
  • the acquisition and detection module of the electronic device 100 may include one or more devices such as a camera, a fingerprint sensor, a touch sensor, an infrared sensor, etc., where the camera is not limited to the front camera, rear camera, under-screen camera, etc. of the electronic device 100.
  • the user information collected by the collection and detection module may include one or more information such as the user's facial features, skin condition, fingerprint information, height information, etc., which is not limited in this embodiment of the present application.
  • the acquisition and detection module can be a camera of the mobile phone, which obtains information such as the user's facial features and skin condition; or a fingerprint sensor or touch sensor of the mobile phone, which collects the user's fingerprint information; or an infrared sensor of the mobile phone, which collects the user's facial features from infrared light reflected off the user's face; this is not limited in the embodiment of the present application.
  • the collection and detection module of the electronic device 100 may transmit the collected user information to the processor 110 of the electronic device 100, and then the processor 110 determines the identity of the user.
  • the collection and detection module of the electronic device 100 can upload the collected user information to the user service module or the distribution module of the server 200, which then further determines the identity of the user; the following introduces two possible ways respectively.
  • the collection and detection module of the electronic device 100 collects user information, and the processor 110 of the electronic device 100 determines the identity of the user according to the collected user information.
  • the camera 193 of the electronic device 100 may collect the face information of the current user and transmit it to the processor 110; the processor 110 performs feature recognition and comparison based on the face information, thereby determining information such as the user's age, height, clothing, and occupation, and determining whether the current user is a child or minor user, an adult user, an elderly user, etc.
  • the camera 193 of the electronic device 100 can collect the face information of the current user, and transmit the face information to the processor 110, and the processor 110 can compare the face information with the face information of the owner user of the electronic device 100 , through feature comparison, etc., to determine whether the current user is the owner user, parent user, child user, etc., or determine whether the current user is the user corresponding to the user account logged in by the electronic device 100, etc., which will not be repeated here.
  • the electronic device 100 may further recommend videos according to the current user identity.
  • education-related videos can be recommended for parent users; for the owner user, the owner's historical playback records can be combined to match videos of the same type or similar subject matter; for child users, children's animation and learning videos can be recommended, which is not limited in this embodiment of the present application.
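  • The identity-to-recommendation mapping above can be sketched as follows; this is a hypothetical Python sketch, and the identity strings and category names are illustrative assumptions:

```python
def recommend_categories(identity, history=None):
    """Map a recognized user identity to recommendation categories."""
    if identity == "parent":
        return ["education"]
    if identity == "child":
        return ["children's animation", "learning"]
    if identity == "owner" and history:
        # Match the genres seen in the owner's historical playback records.
        return sorted({genre for _title, genre in history})
    return ["general"]

print(recommend_categories("parent"))  # ['education']
print(recommend_categories("owner", [("video-a", "action"), ("video-b", "action")]))
```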
  • the electronic device 100 can independently realize the identification of the user identity, thereby reducing the interaction process between the electronic device 100 and the server 200 and accelerating user identification.
  • the collection and detection module of the electronic device 100 returns user identity information to the video application.
  • the collection and detection module may return the identification result to the video application; for example, the video application determines that the user currently requesting to play a video is a child user.
  • the electronic device 100 can encrypt and store the identification result, for example, associating the identification result with the face features of the current user and storing it encrypted. If the same face features are detected again, the stored identity recognition result can be called directly, without carrying out feature matching and comparison to determine the user's identity. This simplifies the identification process and reduces the detection workload and power consumption of the electronic device.
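  • Caching an identification result keyed by a face-feature fingerprint, as described above, can be sketched as follows. This is an illustrative Python sketch: the patent stores the result encrypted, whereas hashing here merely stands in for deriving a lookup key and is not encryption:

```python
import hashlib

class IdentityCache:
    """Cache identity results keyed by a fingerprint of the face features,
    so a repeat detection can skip full feature matching."""

    def __init__(self):
        self._store = {}

    def _key(self, face_feature: bytes) -> str:
        # Derive a fixed-size lookup key from the raw feature bytes.
        return hashlib.sha256(face_feature).hexdigest()

    def lookup(self, face_feature: bytes):
        # Returns the cached identity, or None on a cache miss.
        return self._store.get(self._key(face_feature))

    def save(self, face_feature: bytes, identity: str):
        self._store[self._key(face_feature)] = identity

cache = IdentityCache()
cache.save(b"feature-vector-bytes", "child user")
print(cache.lookup(b"feature-vector-bytes"))  # child user
```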
  • the collection and detection module of the electronic device 100 collects user information.
  • the collection and detection module of the electronic device 100 reports the collected user information to the user service module and/or distribution module of the server 200 through the communication module.
  • the communication module here is a module that both the electronic device 100 and the server 200 have; the communication module of the electronic device 100 and that of the server 200 can communicate based on various possible methods, such as WiFi, 5G/6G, etc., which is not limited in this embodiment of the present application.
  • the user service module of the server 200 further determines the identity of the user according to the received user information, and encrypts and stores the identity information of the user.
  • the user service module of the server 200 returns the identification result to the video application of the electronic device 100 through the communication module.
  • the camera 193 of the electronic device 100 can collect the face information of the current user and transmit it to the server 200; the server 200 performs feature recognition based on the face information, such as comparison of skin texture, to determine information such as the user's age, height, dress, and occupation, and thereby determine whether the current user is a child or underage user, an adult user, an elderly user, etc.
  • the camera 193 of the electronic device 100 can collect the face information of the current user and transmit it to the server 200; the server 200 performs face feature comparison, etc., to determine whether the current user is the owner user, or whether the current user is the user corresponding to the user account logged in on the electronic device 100, which will not be repeated here.
  • the user service module of the server 200 can receive the user characteristic data reported by the electronic device 100, build user portraits from the massive user characteristic data, and encrypt and store the data of the different user portraits. When the electronic device 100 later reports user information, the stored user portrait data can be queried quickly to determine the identity of the current user, thereby increasing the speed of user identification and reducing the amount of data processing; it can also improve the accuracy and robustness of user identification.
  • the electronic device 100 can determine the identity information of the current user through either of the above methods 1 and 2, or use the processes of methods 1 and 2 at the same time; for example, method 1 detects whether the current user is the owner of the electronic device, and method 2 determines identity information such as the current user's age, which is not limited in the example of this application.
  • a window automatically pops up on the electronic device 100 to prompt the user that the current electronic device is about to switch working modes.
  • the user performs an operation of allowing switching of the working mode in the pop-up window, and in response to the operation of allowing switching of the working mode, the electronic device 100 switches the working mode.
  • the pop-up window on the electronic device 100 may automatically disappear after receiving the user's permission operation; or, if the user does not perform the permission operation within a preset time period (for example, 5 seconds), the pop-up window automatically disappears after that period, the current user identity recognition is accepted by default, and video content can be requested according to the user identity, which is not limited in the embodiment of the present application.
  • step 512 and step 513 are processes performed only when it is detected that the user identity does not match the current working mode.
  • the processes of step 512 and step 513 will be executed, and a window pops up automatically on the electronic device 100 to remind the user that the electronic device 100 is about to switch working modes.
  • FIG. 7 is a schematic diagram of an interface for recommending video clips for a user on a mobile phone according to an embodiment of the present application.
• the user performs an operation on the interface 701 as shown in (a) in FIG. 7 to run the video application, and the main interface 702 of the video application as shown in (b) of FIG. 7 is displayed.
• the mobile phone may be triggered to perform the operation of collecting user information in step 505. Further, the mobile phone can determine the identity of the current user according to the procedures of steps 506-507 or steps 508-511 described above.
  • a prompt window 20 as shown in (c) in FIG. 7 can pop up automatically on the main interface 702 of the video application.
• the prompt window 20 can display prompt information telling the user the currently identified user identity; the working mode of the mobile phone will then be switched, and videos or video clips will be matched for the user according to the currently identified user identity.
• the mobile phone can automatically switch to the child mode, and a prompt message can be displayed in the prompt window 20: Dear user, you have been identified as an underage user, and a portion of the video clips will be blocked from showing.
• the user can choose whether to accept the identification result of the user's identity in the prompt window 20 according to his own needs. For example, when the user performs the operation shown in (c) in FIG. 7 and clicks the "OK" option in the prompt window 20, in response to the user's click operation, the prompt window 20 disappears, the mobile phone resumes displaying the interface 704 shown in (d) of FIG. 7, switches to the children's mode at the same time, and further enters the stage of matching videos.
• errors may occur in the process of user identification, and the current user may be mistakenly identified as an underage user; or, even if the current user is an underage user, the underage user may still want to play all the content of the first video; in these cases, the user can click the "cancel" option in the prompt window 20.
  • an identity verification window can pop up on the mobile phone to verify the user's identity.
• the verification method includes, but is not limited to, fingerprint verification, digital password input verification, face verification, etc., which is not limited in this embodiment of the present application.
• when the user's identity verification is passed, the mobile phone will not switch to the children's mode, that is, it maintains the current normal mode, recommends videos or video clips to the user in the existing way, and plays the complete content of the video 2, which will not be repeated here.
• the prompt window 20 may disappear automatically after the user clicks the "OK" option; or, if the user does not click the "OK" option within a preset duration, the prompt window 20 may disappear automatically after being displayed for the preset duration (for example, 5 seconds), in which case the identity of the current child user is taken as correct by default, and related videos matching the child user can be recommended according to that identity, which is not limited in the embodiments of the present application.
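The confirmation flow around the prompt window 20 (OK switches the mode, cancel triggers identity verification, a timeout defaults to accepting the identified identity) can be sketched as follows; the function name, state strings, and return values are illustrative assumptions, not part of the application:

```python
# Sketch of the prompt window 20 decision logic. The 5-second preset and the
# three outcomes follow the description above; all names are hypothetical.
PRESET_SECONDS = 5

def resolve_prompt(user_action, elapsed_seconds):
    """user_action: 'ok', 'cancel', or None (no action yet)."""
    if user_action == "ok":
        return "switch_to_child_mode"    # user accepts the identified identity
    if user_action == "cancel":
        return "verify_identity"         # e.g. fingerprint or password check
    if elapsed_seconds >= PRESET_SECONDS:
        return "switch_to_child_mode"    # timeout: identity correct by default
    return "waiting"

print(resolve_prompt(None, 5))      # timeout path
print(resolve_prompt("cancel", 1))  # verification path
```

The same logic applies to the later scene-recognition pop-up, with the recognized scene taking the place of the identified identity.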
• the electronic device 100 or the video application can accurately determine the identity or age information of the current user, for example, that the current user is a child user or an underage user, a parent user or an adult user, an elderly user, etc. In the subsequent process of the user requesting to play a video, different services are provided respectively to meet the usage needs of different users.
  • the user performs an operation of playing the first video.
• the user can find video 2 (the first video) on the running interface 704 of the video application and click the icon of video 2 (the first video); in response to the user's click operation, the electronic device 100 displays an interface 705 as shown in (e) of FIG. 7, and video 2 (the first video) starts to be played on the interface 705.
  • the video application of the electronic device 100 sends a play request of the first video to the content service module of the server 200, where the play request of the first video carries user identity information.
• the play request of the first video can be used to request the address (content identification, content ID) of the play content of the first video, and the video application of the electronic device 100 can obtain the image data, audio data, etc. corresponding to the first video from the address of the play content to realize normal playback of the first video, which will not be repeated here.
  • Table 4 lists a possible parameter content that may be included in the play request of the first video.
• the play request of the first video sent by the electronic device 100 may include information such as the user's age mode (age mode), the video operation column ID (category ID), and the play address of the first video (mv ID).
• the play request of the first video may also include the parameter content listed in Table 4, or the play request of the first video may also include other user characteristic parameters used to indicate that the current user is a child user, which is not limited in this embodiment of the present application.
• the video operation column ID (category ID) can be used to determine the content of the interface 402 corresponding to the "daily recommendation" menu, or to determine the interface content corresponding to the "TV series" menu, the "movie" menu, etc.;
• the play address (mv ID) of the first video can be used to determine the play address of any video (such as video 2) on the interface 402 corresponding to the "daily recommendation" menu, which will not be repeated here.
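As an illustration of the request described above, the following sketch assembles a play request carrying the age mode, category ID, and mv ID of Table 4; the JSON encoding and the exact field spellings are assumptions, since the application does not fix a wire format:

```python
import json

def build_play_request(age_mode, category_id, mv_id):
    """Assemble a hypothetical play request for the first video.

    The three parameters mirror Table 4 (age mode, category ID, mv ID);
    the JSON shape itself is an illustrative assumption.
    """
    return json.dumps({
        "ageMode": age_mode,        # e.g. "child" when a child user is identified
        "categoryId": category_id,  # video operation column, e.g. "daily recommendation"
        "mvId": mv_id,              # play address of the requested video
    })

# Example: a child user requests video 2 from the "daily recommendation" column.
print(build_play_request("child", "daily_recommendation", "video_2"))
```

The server-side content service module would read the `ageMode` field of such a request to decide whether segment filtering is needed.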
  • the content service module of the server 200 determines the first video selected by the user to play according to the play request of the first video, and requests the distribution module of the server 200 to query the tag information of the first video.
• the distribution module of the server 200 obtains the user identity information, and determines the filtered segments of the first video according to the metadata of the first video, the tag information of the first video, and the user identity information; that is, it determines, among the multiple video clips included in the first video, the video clips that need to be filtered out or blocked for child users, or in other words, the target clips that can be played for child users.
  • the content service module of the server 200 may obtain the user identity information included in the play request of the first video, send the user identity information to the distribution module, and request the distribution module to select from the multiple video clips included in the first video Target clips that match the child user and can be played for the child user.
  • the distribution module of the server 200 determines the target segment matching the identity of the child user in the tag information stored in the first video.
• the first video can be divided into one or more video segments after frame-by-frame analysis, and each video segment can include the label information shown in Table 2 and Table 3.
• when the distribution module of the server 200 determines that the current user is a child user, it can query the label information of each video segment included in the first video, determine the labels matching the child user, and then, according to the association between the label information of each video segment and the playback progress of each video segment, determine the target segment and its corresponding playback start time and end time.
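The filtering step described above can be sketched as follows; the label names and the segment structure are illustrative assumptions, standing in for the label information of Tables 2 and 3:

```python
# Each segment carries label information (standing in for Tables 2 and 3)
# plus its playback start/end times; segments whose labels fall in the
# blocked set are filtered out for child users, the rest are target segments.
BLOCKED_FOR_CHILDREN = {"violence", "gore"}  # illustrative label names

def select_target_segments(segments):
    """Split segments into (target, filtered) lists for a child user."""
    targets, filtered = [], []
    for seg in segments:
        if BLOCKED_FOR_CHILDREN & set(seg["labels"]):
            filtered.append(seg)
        else:
            targets.append(seg)
    return targets, filtered

video_segments = [
    {"start": 0,   "end": 300, "labels": ["family"]},
    {"start": 300, "end": 360, "labels": ["violence"]},  # the duration-t clip
    {"start": 360, "end": 900, "labels": ["comedy"]},
]
targets, filtered = select_target_segments(video_segments)
print([(s["start"], s["end"]) for s in filtered])  # the window to skip
```

The start/end times of the filtered list correspond to the gray progress-bar window discussed below.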
  • the "filtering segment of the first video” in the embodiment of the present application can be understood as: skip the video segment within the duration t as shown in Figure 7 (f), when the video is played to the gray progress When the start position of the gray progress bar is reached, skip the segment corresponding to the duration t and continue to play; or it can be understood as: the metadata sent by the server 200 to the electronic device 100 is the remaining after deleting the video metadata within the duration t corresponding to the gray progress bar.
  • the metadata of the first video below is not limited in this embodiment of the present application.
  • the play request of the first video in the above step 515 may further include information such as user account information and/or historical play records of the user.
• in step 516 and step 517, if the play request of the first video includes the user's account information, the content service module requests the distribution module to query the label information of each video and at the same time informs the distribution module of the user's account information.
• the distribution module can match the user's historical playback records according to the user account information, and draw the user's behavior characteristic portrait according to the historical playback records. After completing the user's behavior characteristic portrait, it combines the tag information of each video from the video content library to match one or more videos for the user.
  • the tag of each video segment included in the first video may be queried, and a target segment in the first video that can be played for the user is determined.
• the content service module also needs to determine, among the multiple video clips included in each of the one or more videos, the video clips that need to be filtered out or blocked for child users; in other words, the content service module also needs to determine the target segment that can be played for a child user among the multiple video clips included in each of the one or more videos, and then determine the response message, which will not be repeated here.
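The portrait-and-tag matching described above can be sketched as follows; reducing the behavior characteristic portrait to a set of preferred tags and scoring videos by tag overlap is an illustrative simplification, not the application's algorithm:

```python
# The behavior characteristic portrait is reduced to a set of preferred tags
# drawn from historical play records; library videos are ranked by how many
# of their tags overlap with the portrait. All names are hypothetical.
def build_portrait(history):
    tags = set()
    for record in history:
        tags.update(record["tags"])
    return tags

def match_videos(portrait, library, top_n=2):
    scored = sorted(library,
                    key=lambda v: len(portrait & set(v["tags"])),
                    reverse=True)
    return [v["id"] for v in scored[:top_n]]

history = [{"tags": ["cartoon", "science"]}, {"tags": ["cartoon"]}]
library = [
    {"id": "video_6", "tags": ["cartoon", "comedy"]},
    {"id": "video_7", "tags": ["thriller"]},
    {"id": "video_8", "tags": ["science", "cartoon"]},
]
print(match_videos(build_portrait(history), library))  # -> ['video_8', 'video_6']
```

In the flow above, the segment-level filtering for child users would then be applied to each matched video before the response message is assembled.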
• in this way, videos or video clips that better match the user's viewing needs can be recommended for the user, which improves the user's viewing experience.
  • the distribution module of the server 200 sends the filtered first video metadata to the video application of the electronic device 100.
  • the "filtered metadata of the first video” may be understood as metadata of the target segment included in the first video.
  • Table 5 lists a possible parameter content that may be included in the metadata of the target segment in the first video obtained after filtering.
• the metadata of the target segment in the filtered first video sent by the server 200 to the electronic device 100 can be identified by the parameter "mvInfos", which represents the target segments remaining after the first video has been filtered and the filtered video list.
  • the parameter "mvInfos" corresponding to the information of the filtered video and the video list after the video segment in Table 5 may further include more core parameters, and Table 6 lists a possible core parameter content of mvInfos.
  • the information controlling the playback progress of the first video in the child mode can be identified by the parameter "age modeTimeInfos".
  • the parameter "age modeTimeInfos" corresponding to the information controlling the playback progress of the first video in Table 6 may further include more core parameters, and Table 7 lists a possible core parameter content of agemodeTimeInfos, as can be seen from Table 7 , the parameter "age modeTimeInfos" corresponding to the first video playback progress information may further include the playback start time and playback end time corresponding to the filtered target segment that can be played for child users, and details will not be repeated this time.
• Table 7:
parameter name | parameter type | M/O | parameter length | parameter description
start time | string | M | / | the start time corresponding to the playback progress
end time | string | M | / | the end time corresponding to the playback progress
• the distribution module of the server 200 sends the metadata of the filtered target segments listed in Tables 5-7 above to the video application of the electronic device 100, and the video application of the electronic device 100 can accurately play the target segments of the first video for the child user based on the metadata of the filtered target segments.
  • the electronic device 100 plays the filtered part of the first video through the video application, that is, only plays the target segment in the first video that matches the child user.
• during the process of playing video 2 (the first video), the mobile phone only plays the filtered video clips of video 2 (the first video) for the user.
• the playback progress bar of video 2 includes a black area, a white area, and a gray area, wherein the black area is the part that has been played, the white area is the part that has not yet been played, and the gray area is the unplayable part (the filtered segment); assume that the gray area of the progress bar corresponds to a video segment with a duration of t.
• the start time (start time) and end time (end time) corresponding to the playback progress can be used to determine the video clip of duration t indicated by the gray area of the progress bar.
  • the gray area of the progress bar is used to identify unplayable video segments, and the black area and white area are used to identify playable video segments, and details will not be described later.
  • the progress bar shown in (c) in FIG. 1 is black, that is, all video clips included in video 2 can be played.
• in (e) of FIG. 7, video 2 is matched against the child user to filter out video clips with violent themes, bloody themes, and other content unsuitable for child users to watch; only the video clips corresponding to the black progress bar, that is, only the video clips that meet the viewing level of child users, can be played.
• the mobile phone automatically pauses the playback of the first video, for example, a pause icon 30 is displayed on the video playback screen, and further, a prompt window 40 pops up automatically; the prompt window 40 can prompt the user that the video segment is restricted from playing in the children's mode.
  • the critical point is the boundary between the playable video segment and the non-playable video segment, that is, the boundary between the black progress bar and the gray progress bar.
• when the electronic device 100 plays the filtered part of the first video through the video application, it may only display the progress bar corresponding to the target segments in the first video that match the child user.
  • the filtered video segment with a duration of t may not be displayed on the progress bar.
  • the embodiment of the application does not limit the display mode of the progress bar.
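The skip behavior at a gray (filtered) window can be sketched as follows, with the window list standing in for the start time/end time pairs of Table 7; representing times as seconds is an assumption made for the illustration:

```python
# When playback reaches the start of a filtered (gray) window of duration t,
# the player jumps to the window's end time and continues; the windows are
# assumed to come from the start time / end time pairs of Table 7.
def next_position(pos, filtered_windows):
    for start, end in filtered_windows:
        if start <= pos < end:
            return end  # jump past the unplayable segment
    return pos

windows = [(300.0, 360.0)]            # one gray window, t = 60 s
print(next_position(299.0, windows))  # still in the playable area
print(next_position(300.0, windows))  # skip t and continue playing
```

The same check also covers the alternative understanding above, where the filtered window is simply absent from the metadata and never reached.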
• the server 200 can not only filter some segments of the first video for the child user so that only the target segments matching the child user's viewing are played, but can also match recommended videos for the user according to the child user's identity and further send the matched video information to the electronic device 100; the electronic device 100 updates the video list displayed in the video recommendation menu according to the matched video information.
  • the process may also include the following steps:
  • the content service module of the server 200 obtains the user identity information included in the play request of the first video, and determines one or more videos according to the user identity.
  • the content service module of the server 200 sends information about one or more videos to the electronic device 100 through the communication module.
  • the video application of the electronic device 100 updates the video list recommended for the user according to the information of one or more videos.
• the content of the video list displayed on the interface 704 may change.
• the electronic device 100 can determine that the original video 1, video 2, video 3, video 4, and video 5 on the interface 704 corresponding to the "daily recommendation" menu do not match the child user, and can update the video content on the interface 704 to multiple videos that match the child user, for example, replacing the original video 1, video 2, video 3, video 4, and video 5 with video 6, video 7, video 8, video 9, video 10, etc.; or, when video 2 among the original video 1 to video 5 matches the user identity, video 2 can be kept and displayed, and details will not be described here.
• step 520 and step 516 are executed at the same time, and step 518 and step 521 can be the same step; that is, the server 200 can simultaneously send, to the video application of the electronic device 100, the filtered metadata of the first video and the information about the one or more videos recommended for the user.
  • the parameter "mvInfos" corresponding to the metadata of the target segment in Table 5 may also include one or more video information.
• step 518 and step 521 may be different steps; for example, when the user returns to the interface 704 corresponding to the "daily recommendation" menu shown in (d) of FIG. 7, the processes of step 521 and step 522 are executed; this embodiment of the present application does not limit the execution sequence and timing of the steps.
• the electronic device 100 may directly and quickly play the first video; in this case, the electronic device 100 may not yet have completed the collection of user information and the determination of the user's identity.
  • FIG. 8 is a schematic diagram of another example of the process of recommending video clips for a user on a mobile phone according to an embodiment of the present application. The following describes the possible implementation process and interface in this scenario with reference to FIG. 8 .
• the user performs an operation on the interface 801 as shown in (a) in FIG. 8 to run the video application, and the main interface 802 of the video application as shown in (b) of FIG. 8 is displayed.
• the user quickly finds video 2 (the first video) on the running interface 802 of the video application and performs the operation shown in (b) in FIG. 8 of clicking the icon of video 2 (the first video); in response to the user's click operation, the electronic device 100 displays the playback interface 803 of video 2 (the first video) as shown in (c) in FIG. 8, and video 2 (the first video) starts to play on the interface 803.
• the identity of the current user is determined according to the processes of step 506 to step 508 or step 508 to step 511 introduced above.
• the mobile phone determines the identity of the user after it starts playing video 2 (the first video); for example, after the mobile phone determines that the current user is a child user (or an underage user), a prompt window 20 as shown in (c) of FIG. 8 can pop up automatically on the playback interface 802. The prompt window 20 can display prompt information telling the user the currently identified user identity, after which the working mode of the mobile phone will be switched to the child mode and videos or video clips will be matched for the child user, which will not be detailed here.
  • the user can choose whether to receive the identification result of the user's identity in the prompt window 20 according to his own needs.
• the prompt window 20 disappears, the mobile phone resumes displaying the interface 804 shown in (d) of FIG. 8, switches to the child mode at the same time, and further enters the stage of matching videos.
  • the mobile phone only plays the filtered video segment of the video 2 (the first video) for the user.
  • the user may receive a certain video or a link to each video in other chat applications such as WeChat.
  • the video can be directly triggered to play.
  • the playing process of the video may also be controlled according to the method provided in the embodiment of the present application.
• after the video link is clicked, the mobile phone can directly trigger the playing of the video, that is, jump to display the interface 803 shown in (c) in FIG. 8.
  • the mobile phone can simultaneously send a request to the server to play the video corresponding to the video link, and the server determines the tags of the multiple segments included in the video, filters the video for the user, and returns the result of the video filtering to the mobile phone.
  • the mobile phone can continue to play the video for the user according to the process shown in (d) and (e) in Fig. 8, which will not be repeated here.
• the frame-by-frame detection and intelligent analysis process of the video can be triggered based on an AI algorithm, etc., obtaining the multiple fragmented video clips included in the video and the label information of each video clip.
  • the dimension of label extraction can realize the intelligent classification and extraction of video labels according to the behavior characteristics input by the user.
• on the one hand, this process avoids the problems of upload efficiency, label accuracy, and label validity caused by manual operation; on the other hand, the label information of each video segment detected by the server can be further confirmed or adjusted by the operator, and the adjustment result can update or correct the existing frame-by-frame detection algorithm. This process can form an inner-loop link of the whole process of "label detection and extraction - intelligent optimization of the label detection algorithm - pushing of video clips with different labels - user experience effect", thereby improving the accuracy of the generated label information.
• user information can be collected, the user's identity can be accurately judged and a user portrait drawn according to the collected user information, and then the label information of each video clip can be combined with the user's preferences, behavior habits, etc.
• this method can control the electronic device to switch to the child mode, control the scope of the video permissions that the electronic device can play in the child mode, and also block certain video clip content that is not suitable for child users to watch. This process does not require manually setting the electronic device to switch to the children's mode, and can achieve fine-grained control of video playback in the children's mode, thereby improving the user experience.
  • the electronic device 100 may be a device with relatively fixed positions such as a smart screen and a vehicle-mounted device.
• Electronic devices with relatively fixed positions, such as a smart screen and a vehicle-mounted device, may be used by different users at different times, or by multiple users at the same time, for example, in the following possible scenarios:
• Home devices such as smart screens may be installed in the bedrooms, study rooms, living rooms, and kitchens of a home, and users may use the smart screens in the bedroom, study room, living room, and kitchen at different times;
• for the above scenario (1), when the same user uses the smart screens in the bedroom, study, living room, and kitchen, recommending videos based only on the user's identity may recommend videos of the same type and content in every room; or, for the above scenario (2), when parents, children, and the elderly use the smart screen in the living room together, recommendations for a single user group cannot take into account everyone's viewing needs at the same time, so it is desirable to recommend videos that better meet the current viewing needs for different user groups.
  • the embodiment of the present application also provides another method for personalized recommendation of video clips, so as to recommend to the user videos that conform to the current scene and the user's habits for different scenes.
• FIG. 9 is a schematic flowchart of another example of a method for personalized recommendation of video clips provided by an embodiment of the present application. As shown in FIG. 9, the method 900 can be applied to the system including the electronic device 100 and the server 200 described above, and includes a scene determination stage and a stage of recommending video content for the user.
  • the "determining the scene stage” can be understood as: the electronic device 100 obtains the scene information of the current user, and the electronic device 100 or the server 200 performs scene identification, and then determines the scene of the current user.
  • the "stage of recommending video content for the user” can be understood as: the electronic device 100 requests the server 200 to obtain one or more video information, and the server 200 recommends personalized videos for the user according to the current scene of the user.
  • the two stages are described in detail below.
  • the user triggers the collection and detection module of the electronic device 100 to collect life scene information.
  • step 901 may have different timings and implementation manners.
• the smart screen can be triggered to start collecting life scene information, and the scene where the smart screen is located may remain unchanged for a long time.
  • the collection and detection module of the electronic device 100 may periodically collect life scene information according to a certain period, which is not limited in this embodiment of the present application.
• the electronic device 100 in the method 900 can also, following the process introduced in FIG. 4, enable the personalized video clip recommendation function of the electronic device 100 (such as a smart screen) through the settings menu of the electronic device 100 (such as a smart screen) and turn on the "allow collection of life scene information" function, or turn on the above functions through other shortcut operations or preset gestures; for brevity, the process will not be repeated here.
  • the collection and detection module of the electronic device 100 may include one or more devices such as a camera.
  • the acquisition and detection module may be a camera of the smart screen, and the smart screen may acquire a picture of a current life scene through the camera.
  • the collection and detection module of the electronic device 100 may transmit the collected picture of the living scene to the processor 110 of the electronic device 100, and then the processor 110 determines the current scene information.
• the collection and detection module of the electronic device 100 can upload the collected life scene picture to the user service module or the distribution module of the server 200, and the user service module or the distribution module of the server 200 further determines the current scene information. The following introduces the two possible ways respectively.
  • the collection and detection module of the electronic device 100 collects pictures of living scenes, and the processor 110 of the electronic device 100 determines the current scene information according to the collected pictures of living scenes.
  • the collection and detection module of the electronic device 100 returns current scene information to the video application.
• the electronic device 100 can realize scene recognition independently, thereby reducing the interaction process between the electronic device 100 and the server 200 and speeding up the rate of scene recognition.
  • the collection and detection module of the electronic device 100 collects life scene information.
  • the "life scene information" may be acquired by means of life scene pictures, in other words, the electronic device 100 may collect life scene pictures through a camera.
  • the collection and detection module of the electronic device 100 reports the collected life scene information to the user service module of the server 200 through the communication module.
  • Table 8 lists a possible parameter content that may be included in the life scene information.
• when the life scene information sent by the electronic device 100 is in the form of a life scene picture, the life scene information may include information such as the parameter name (image) and the device ID (device ID).
  • the life scene picture may be a picture file compressed to a resolution of 480p, which is not limited in this embodiment of the present application.
• the user service module of the server 200 further determines the scene where the electronic device 100 is currently located according to the life scene picture collected by the electronic device 100, and encrypts and stores the scene data of the electronic device 100, that is, encrypts and stores the recognition result of the scene.
  • the user service module of the server 200 may receive the life scene picture reported by the electronic device 100, and determine the scene corresponding to the life scene picture according to the picture detection and recognition algorithm.
• when the electronic device 100 or the server 200 detects objects such as cabinets, stoves, ovens, refrigerators, and range hoods in the life scene picture, it can be determined that the current scene is a kitchen scene; when objects such as beds and wardrobes appear in the life scene picture, it can be determined that the current scene is a bedroom scene; when objects such as bookcases and desks appear in the life scene picture, it can be determined that the current scene is a study scene; examples will not be given one by one here.
• the server 200 may store a large number of different types of life scene pictures, and the server 200 compares the currently uploaded life scene picture with the massive life scene pictures in the database; the stored picture with the highest similarity is regarded as belonging to the same scene category.
• the server 200 can also store the currently uploaded life scene picture in the database together with life scenes of the same type to enrich the content of the database. After receiving a new life scene picture reported by the electronic device 100, it can quickly compare it against the large number of life scene pictures already stored in the database and quickly determine the current scene, thereby increasing the rate of scene recognition and reducing the amount of data processing; in addition, the accuracy and robustness of scene recognition can also be improved.
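The similarity comparison described above can be sketched as nearest-neighbor matching over feature vectors; the cosine metric and the feature representation are illustrative assumptions, since the application leaves the picture detection and recognition algorithm open:

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def recognize_scene(uploaded, database):
    """database maps scene name -> stored feature vectors; return the
    scene whose stored picture is most similar to the uploaded one."""
    best_scene, best_sim = None, -1.0
    for scene, vectors in database.items():
        for vec in vectors:
            sim = cosine(uploaded, vec)
            if sim > best_sim:
                best_scene, best_sim = scene, sim
    return best_scene

db = {
    "kitchen": [[1.0, 0.0, 0.2]],   # e.g. pictures with stoves, range hoods
    "bedroom": [[0.0, 1.0, 0.1]],   # e.g. pictures with beds, wardrobes
}
print(recognize_scene([0.9, 0.1, 0.2], db))  # -> kitchen
```

Adding each newly recognized picture's features under its matched scene is the enrichment step described above, which makes later comparisons faster and more robust.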
  • Table 9 lists a possible parameter content that may be included in the recognition result of the scene.
  • the server 200 determines the recognition result of the scene according to the picture of the life scene.
• the recognition result of the scene may include information such as the parameter name (image) and the device ID (device ID).
  • the life scene picture may be a picture file compressed to a resolution of 480p, which is not limited in this embodiment of the present application.
  • the parameter "ScenarioInfo” corresponding to the recognition result of the scenario in Table 9 may further include more core parameters, and Table 10 lists a possible core parameter content of ScenarioInfo.
• “ScenarioInfo” may include the type of the scene where the electronic device 100 is located, which may be any one of living room, kitchen, balcony, and study.
  • the user service module of the server 200 sends the recognition result of the scene to the electronic device 100 (or the video application of the electronic device 100) through the communication module. Specifically, the user service module of the server 200 may send the parameter content of Table 9 and Table 10 listed above to the electronic device 100, and the electronic device 100 may determine the current scene according to the parameter content of Table 9 and Table 10.
• a pop-up window on the electronic device 100 automatically reminds the user of the scene where the electronic device is currently located.
  • the user may perform an operation in the pop-up window to allow recommending video content according to the current scene, and in response to the permission operation, the electronic device 100 closes the pop-up window.
  • the pop-up window on the electronic device 100 may automatically disappear after receiving the user's permission operation; or, if the user does not perform the permission operation within a preset time period (for example, 5 seconds), the pop-up window may automatically disappear after that period, the current scene recognition is treated as correct by default, and video content can be requested according to the current scene, which is not limited in this embodiment of the present application.
  • the electronic device 100 may encrypt and store the result of the scene recognition, for example, associate the current scene information with the current electronic device 100 and store them encrypted.
  • for example, the smart screen in the kitchen saves the kitchen scene information, and the smart screen in the study saves the study scene information.
  • in this way, the server 200 can recommend different video content for the smart screen in the kitchen and the smart screen in the study, and the follow-up process will be introduced in detail.
  • each electronic device 100 can determine the scene where it is currently located, and when the user requests video content through the video application of the electronic device 100, the process corresponding to the "recommend video content for the user" stage can continue to be executed.
  • FIG. 10 is another example of an interface for recommending video clips for users on a smart screen according to an embodiment of the present application.
  • when the smart screen 100 determines the current scene according to the above-mentioned steps 901-903 or 904-907, a prompt window 50 as shown in (a) in Figure 10 may automatically pop up on the main interface 1001 of the smart screen. The prompt window 50 can display prompt information telling the user the currently recognized scene, after which a video or video clip will be matched for the user according to that scene.
  • for example, the "kitchen" option may be selected in the prompt window 50, together with a prompt message: Dear user, your life scene has been identified as follows (kitchen scene), and video clips of relevant scenes will be recommended for you to experience.
  • the user can choose whether to accept the recognition result in the prompt window 50 according to his own needs. For example, when the user performs the operation shown in (a) in Figure 10 and clicks the "OK" option in the prompt window 50, in response to the user's click operation, the prompt window 50 disappears, the smart screen displays the interface 1002 shown in (b) in Figure 10, and related videos matching the kitchen scene are further recommended for the user according to the current kitchen scene, such as food video 1, food video 2 and food video 3 recommended on the interface 1002.
  • the user's operation on the electronic device 100 can be to directly use the finger to perform operations such as clicking, double-clicking, and long-pressing. If the electronic device 100 does not have a touch display screen, the user can perform corresponding operations through a stylus, a remote control, etc., which is not limited in this embodiment of the present application. Exemplarily, if the electronic device 100 is a mobile phone, a vehicle-mounted device, etc., the user can directly perform operations such as clicking, double-clicking, and long-pressing with fingers. If the electronic device 100 is a smart screen, the user can select a certain video through the remote control, and trigger the smart screen to start playing the video, which will not be repeated in the subsequent embodiments.
  • if the recognition result is wrong, the user can modify it in the prompt window 50. For example, the user selects the "Living Room" option, clicks the check box in front of the "Kitchen" option to cancel the "Kitchen" option, and then clicks the "OK" option; relevant videos matching the living room scene are then re-recommended for the user, which will not be repeated here.
  • the prompt window 50 may disappear automatically after the user clicks the "OK" option; or, if the user does not click the "OK" option within a preset duration (for example, 5 seconds), the prompt window 50 may automatically disappear after that duration, the currently recognized kitchen scene is treated as correct by default, and related videos matching the kitchen scene can be recommended for the user according to the current kitchen scene, which is not limited in this embodiment of the present application.
  • through the above process, the electronic device 100 can accurately determine the current scene, such as a kitchen scene, living room scene, driving scene, or study scene, and subsequently recommend related videos matching the current scene for the user, thereby providing different video services for different scenarios and meeting the needs of users in different scenarios.
  • the user runs a video application.
  • the electronic device 100 sends a request for acquiring video content to the server 200, where the request for acquiring video content carries scene information.
  • this process may not include step 910, in which case the process of step 911 is triggered when the user starts the smart screen; or, this process includes step 910. For example, different video applications such as Huawei Video and Youku Video may be installed on the smart screen, and when the user clicks the Huawei Video icon to run the Huawei Video application, step 911 may be triggered to send a request for obtaining video content to the server 200 corresponding to Huawei Video, which is not limited in this embodiment of the application.
  • the request for acquiring video content may be used to request information about one or more videos, and the electronic device 100 may display a list of one or more videos on the interface of the electronic device 100 according to the information about the one or more videos.
  • Table 11 lists a possible parameter content that may be included in the request for acquiring video content.
  • the request for acquiring video content sent by the electronic device 100 may include parameters such as the scene information (image), the video operation column ID (category ID) and the video content ID (mv ID).
  • the embodiment of the present application does not limit the parameters or content included in the request for obtaining video content.
  • the scene information (image) can be used to determine that the current scene is a kitchen scene; the video operation column ID (category ID) can be used to determine the content of the corresponding interface 1002, or to determine the interface content corresponding to the "Member Video" menu, "Daily Recommendation" menu, "TV Series" menu, etc.; and the video content ID (mv ID) can be used to determine the playback address of any video (for example, food video 2) on the interface 1002, which will not be repeated here.
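To make the parameter roles concrete, a request of the shape suggested by Table 11 could be assembled as below. The field names (`image`, `categoryId`, `mvId`) follow the parameters named in the text; the overall structure and the example values are assumptions.

```python
from typing import Optional

# Hypothetical sketch of the "acquire video content" request of Table 11.
def build_video_request(scene_image: str, category_id: str,
                        mv_id: Optional[str] = None) -> dict:
    request = {"image": scene_image, "categoryId": category_id}
    if mv_id is not None:
        # mvId is carried only when the playback address of one
        # specific video (e.g. food video 2) is being requested
        request["mvId"] = mv_id
    return request

# Request recommendations for the kitchen scene on interface 1002
req = build_video_request("<compressed 480p scene picture>", "homepage-1002")
```

A request without `mvId` asks for a recommended list; adding `mvId` asks for a specific video's playback information.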
  • the content service module of the server 200 determines the current scene according to the request for acquiring video content, and requests the distribution module of the server 200 to query the tag information of the video.
  • the distribution module of the server 200 obtains the current scene information, and determines the information of one or more videos and the target segment of each video according to the scene information.
  • the distribution module of the server 200 sends a response message to the electronic device 100 (or the video application of the electronic device 100), where the response message includes information of one or more videos and target segment information of each video.
  • the method 900 may include the process of the video preprocessing stage introduced in the method 500.
  • the massive videos stored in the video content library of the server 200 in the method 900 have also been processed through frame-by-frame analysis, and the content service module, the distribution module, etc. of the server 200 store the label information of each video and the label information of the multiple segments included in each video, which will not be repeated here.
  • querying video tag information may include different implementation processes.
  • the content service module may request the distribution module to query the global tag of a video, and the global tag may be used to identify the category information of different long videos. For example, the operator has uploaded a first video, a second video and a third video to the server 200, and each video can correspond to a different global label: the first video belongs to the food category, the second video belongs to the emotional category, and the third video belongs to the family category.
  • the content service module can first request to query the global tags of the video, find the first video of the gourmet category that matches the current kitchen scene, and determine that the first video is a video that can be recommended to the user.
  • the request for acquiring video content in step 911 may further include user identity information.
  • the content service module can also further request the distribution module to query the label information of each video, that is, according to the process of step 516 in FIG. 5, query the tags of the multiple video clips included in each video, and determine, according to the user identity, the target segment that can be played for the user in each video.
  • the content service module can first determine that the first video is a video that can be recommended to the user, and further query the tags of each video segment included in the first video, and determine the target segment that can be played for the user in the first video .
  • the content service module also needs to determine, among the multiple video segments included in the first video, the video segments that need to be filtered out or blocked for the child user, or in other words, the content service module also needs to determine, among the video segments included in the first video, the target clips that can be played for the child user, and use them to determine the response message, which is not limited in this embodiment of the present application.
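The two-level query described above (global tag first, then per-segment tags for the user identity) can be sketched as follows. The tag values, the scene-to-category mapping, and the `child-safe` marker are all illustrative assumptions rather than the actual tag vocabulary.

```python
# Illustrative sketch: filter videos by global tag, then keep only the
# target segments playable for the current user identity.
SCENE_TO_CATEGORY = {"kitchen": "food", "study": "teaching",
                     "living room": "family", "balcony": "home"}

def recommend(videos, scene, is_child_user):
    category = SCENE_TO_CATEGORY.get(scene)
    recommendations = []
    for video in videos:
        if video["globalTag"] != category:
            continue  # global tag does not match the current scene
        segments = video["segments"]
        if is_child_user:
            # keep only segments tagged as suitable for child users
            segments = [s for s in segments if "child-safe" in s["tags"]]
        recommendations.append({"id": video["id"], "targetSegments": segments})
    return recommendations

videos = [
    {"id": "v1", "globalTag": "food",
     "segments": [{"name": "segment 1", "tags": ["food", "child-safe"]},
                  {"name": "segment 2", "tags": ["emotional"]}]},
    {"id": "v2", "globalTag": "emotional", "segments": []},
]
```

For a child user in the kitchen scene, only `v1` is recommended and only its child-safe segment is kept as a target segment.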
  • the response message may be returned to the content service module, and the response message may include information about the one or more videos, the target segment information of each video, the tag information of each video or video segment, and the like.
  • the content service module can also generate the metadata of the one or more videos, or assemble the metadata of the target segment of each video, and send it to the electronic device 100 .
  • Table 12 lists a possible parameter content that the response message may include.
  • the response message sent by the server 200 to the electronic device 100 can be identified by the parameter "mvInfos", which means that the response message can include the information of the one or more videos and the target segment information of each video; and the parameter "tagInfos" may represent the tag information of each video or video segment, etc.
  • the parameter "mvInfos" in Table 12 may further include more core parameters, and Table 13 lists a possible core parameter content of mvInfos. It should be understood that, if it is recognized that the current scene is a kitchen scene, the electronic device 100 may use the parameter "tagTimeInfos" to identify the information for controlling the playing progress of the first video for the kitchen scene.
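A possible shape of such a response, written out as a literal for illustration. Only the parameter names "mvInfos", "tagInfos" and "tagTimeInfos" come from the text; the nesting, the identifiers and the time format are assumptions.

```python
# Hypothetical shape of the response message of Tables 12 and 13.
response = {
    "mvInfos": [
        {
            "mvId": "food-video-2",
            # playback-progress control for the kitchen scene:
            "tagTimeInfos": [
                {"startTime": "00:02:00", "endTime": "00:05:00",
                 "tagName": "segment 2"},
            ],
        },
    ],
    "tagInfos": [
        {"mvId": "food-video-2", "tags": ["food", "kitchen"]},
    ],
}
```

The electronic device would read `mvInfos` to build the recommended list and `tagTimeInfos` to control which part of each video is actually played.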
  • the parameter "tagTimeInfos" corresponding to the information controlling the playing progress of the first video in the kitchen scene in Table 13 may further include more core parameters.
  • Table 14 lists a possible core parameter content of tagTimeInfos. As can be seen from Table 14, the parameter "tagTimeInfos" corresponding to the playing progress information of the first video may further include the playing start time and playing end time corresponding to the filtered target segment that matches the kitchen scene. Exemplarily, in conjunction with the parameter content in Table 14 and (c) in Fig. 11, the start time and end time corresponding to the playback progress can be used to determine the video clip with a duration of t indicated by the gray area of the progress bar, which will not be described one by one here.
| Parameter name | Parameter type | M/O | Parameter length | Parameter description |
| --- | --- | --- | --- | --- |
| start time | string | m | / | The start time corresponding to the playback progress |
| end time | string | m | / | The end time corresponding to the playback progress |
| tagName | string | m | / | Label name |
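Using the start/end times carried in tagTimeInfos, a player could decide whether the current playback position falls inside a filtered segment and must jump. Times are expressed in seconds here for simplicity, which is an assumption since the table does not fix a concrete format.

```python
# Sketch: jump over any filtered segment described by tagTimeInfos.
def next_position(current, tag_time_infos):
    """Return the position playback should continue from."""
    for info in tag_time_infos:
        if info["startTime"] <= current < info["endTime"]:
            return info["endTime"]  # skip to the end of the filtered segment
    return current

# segment 2 of duration t = 180 s is filtered out
skips = [{"startTime": 120, "endTime": 300, "tagName": "segment 2"}]
print(next_position(150, skips))  # 300
```

A position inside the filtered range jumps to the segment's end time; any other position is left unchanged.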
  • the distribution module of the server 200 sends the response message listed in Table 12 to Table 14 to the video application of the electronic device 100, and the video application of the electronic device 100 can, according to the response message, determine the information of the one or more videos and the target segment information of each video, accurately display the recommended video list for the user, and, when the user plays the first video, play the target segment of the first video in combination with the user's identity.
  • the electronic device 100 displays a list of one or more recommended videos according to the information of the one or more videos and the target segment information of each video.
  • the user performs an operation of playing the first video.
  • FIG. 11 is another example of an interface for recommending video clips for users on a smart screen according to an embodiment of the present application.
  • the server 200 recommends relevant videos matching the kitchen scene for the user according to the current kitchen scene, and the smart screen can display the interface 1101 shown in (a) in FIG. 11, on which the recommended food video 1, food video 2 and food video 3 are displayed.
  • if the user desires to play the food video 2, he can select and click the icon of the food video 2 (the first video), and in response to the user's playback operation, the smart screen starts to play the food video 2.
  • part of the segment filtered by the first video may be displayed on the smart screen.
  • the smart screen may also display a target segment to be played for the user in the first video.
  • the user performs an operation of allowing the smart screen to filter part of the first video according to the current scene.
  • the smart screen starts to play the target segment of the first video according to the user's selection.
  • a prompt window 60 can be automatically displayed, and the prompt window 60 can display, for the user, the multiple segments included in the food video 2, and show the segments to be filtered and/or the target segments to be played for the user during the playing of the food video 2.
  • the prompt window 60 displayed on the interface 1102 includes segment 1, segment 2 and segment 3, where segment 2 is selected and segment 1 and segment 3 are not selected, that is, segment 2 is the segment to be filtered in the food video 2, and segment 1 and segment 3 are the target segments to be played for the user.
  • the smart screen can display an interface 1103 as shown in (c) in Figure 11, which is the playback interface of the food video 2, and the smart screen starts playing the food video 2 for the user.
  • the playback progress bar can also be displayed on the interface 1103.
  • the black area included on the progress bar is the part that has been played, the white area is the part that has not been played, and the gray area is the part that will not be played, that is, segment 2 filtered out in (b) in FIG. 11, and the corresponding playback duration of segment 2 is t.
  • during playback, a prompt window 70 can be automatically displayed on the interface 1104 shown in (d) of Figure 11, and the prompt window 70 may display: Dear user, this video segment is automatically skipped for you in the current scene. At the same time, the smart screen can automatically skip segment 2 with the duration of t corresponding to the gray area of the progress bar, and continue to play the content of segment 3.
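The three-color progress bar of interface 1103 can be modeled as below. The classification rule (gray for filtered ranges, black for the already played part, white otherwise) restates the description above; the units and tuple-based range representation are assumptions.

```python
# Sketch: classify a position on the progress bar of interface 1103.
def bar_color(position, played_until, filtered_ranges):
    for start, end in filtered_ranges:
        if start <= position < end:
            return "gray"   # filtered segment, will not be played
    if position < played_until:
        return "black"      # already played
    return "white"          # not yet played
```

For example, with segment 2 filtered between 120 s and 300 s and playback at 100 s, positions before 100 s render black, the filtered range renders gray, and the rest renders white.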
  • the prompt window 70 disappears automatically after being displayed for a preset duration (for example, 5 seconds); or, the user can click any area of the prompt window 70, and in response to the user's click operation, the prompt window 70 disappears, which is not limited in this embodiment of the present application.
  • in the current kitchen scene, the user may expect to watch video clips related to his favorite food, so the filtered clip automatically selected by the smart screen is segment 2, leaving only segment 1 and segment 3, which are related to food.
  • optionally, the user can modify and adjust the filtered segment content according to his own needs. For example, if the user does not want to watch segment 1, which is related to making pizza, the user can select the option corresponding to segment 1 and then click the "OK" option; segment 1 and segment 2 are then automatically filtered for the user, and only segment 3 is played for the user.
  • or, the user can also select the option corresponding to segment 2 to cancel the filtering of segment 2; in that case, segment 1 can be automatically filtered for the user, and segment 2 and segment 3 can be played for the user, which will not be repeated one by one here.
  • the above method recognizes the life scene where the electronic device is located, and recommends different video content to the user from the video content library according to the recognition result of the scene. For example, for the kitchen scene, it can recommend videos related to food and food preparation; for the study scene, it can recommend teaching-related videos; for the living room scene, it can recommend movies and TV series suitable for family members to watch together; and for the balcony scene, it can recommend videos related to home and cleaning.
  • the method can recommend videos that meet the current scene for the user to select, thereby improving the user experience.
  • the method can also combine information such as the user account logged in on the electronic device and the historical browsing records corresponding to the user account to build a portrait of the user's behavioral characteristics. After the portrait of the user's behavioral characteristics is completed, the tag information of each video in the video content library is combined to match one or more videos for the user. In this way, the method can accurately recommend video content that conforms to the current scene and the user's habits and hobbies, which improves the user's viewing experience.
  • the size of the serial numbers of the various implementation processes does not mean the order of execution; the execution order of each process should be determined by its functions and internal logic, and should not constitute any limitation on the implementation processes of the embodiments of the present application.
  • "preset" means that corresponding codes, tables or other information that can be used to indicate relevant information can be stored in electronic devices (such as mobile phones or smart screens) in advance; this application does not limit the specific implementation, such as the preset duration in the embodiments of this application.
  • method 500 and method 900 may be executed simultaneously, or only method 500 is executed in a certain scenario, or only method 900 is executed in a certain scenario, which is not limited in this embodiment of the present application.
  • the method provided in the embodiments of the present application can be used to recommend scenes of personalized videos for users, and can also be used to recommend scenes of personalized music and scenes of personalized theme pictures for users. This is not limited.
  • the server can identify the life scene where the electronic device is located, and identify the identity information of the current user, and then combine the tag information of each video clip with user preferences, behavior habits, etc., Provide the current user with a refined and personalized video, or match the current user with a target segment in a video.
  • a video suitable for the child user or underage user can be found from multiple videos matched according to the life scene, and the video is recommended to the user.
  • the method may also control the electronic device to switch to the children's mode, and control the scope of movie rights that the electronic device can play in the children's mode.
  • the video provided by the server to the electronic device may also be a filtered video, that is, the content of the video segment in the video that is not suitable for the child user to watch is blocked. This process does not need to manually set the electronic device to switch to the children's mode, and can realize the effect of finely controlling video playback in the children's mode, thereby improving user experience.
  • the method for recommending video clips can firstly realize the processing of massive videos by the server.
  • the server can analyze each video frame by frame, and process each frame of each video in one or several dimensions, such as target detection, content recognition, content understanding, and natural language understanding, so as to determine the tag of each frame.
  • through a deep learning algorithm combined with the tag information of each frame, the multiple frames of the video are aggregated into multiple segments, and the tag information of each video and the tag information of the aggregated segments are automatically extracted and stored.
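The frame-to-segment aggregation step can be illustrated with a simple grouping of consecutive equally-tagged frames. The real system uses a deep-learning algorithm, so the following only sketches the grouping idea under that simplification; the tag names are invented.

```python
# Sketch: merge consecutive frames carrying the same tag into segments.
def aggregate_segments(frame_tags):
    """frame_tags: list of (frame_index, tag); returns (tag, start, end)."""
    segments = []
    for index, tag in frame_tags:
        if segments and segments[-1][0] == tag and segments[-1][2] == index - 1:
            tag_, start, _ = segments[-1]
            segments[-1] = (tag_, start, index)   # extend the open segment
        else:
            segments.append((tag, index, index))  # open a new segment
    return segments

frames = [(0, "cooking"), (1, "cooking"), (2, "dialogue"), (3, "cooking")]
print(aggregate_segments(frames))
# [('cooking', 0, 1), ('dialogue', 2, 2), ('cooking', 3, 3)]
```

Each resulting (tag, start, end) triple corresponds to one segment whose tag information the server would then store alongside the video's global tag.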
  • the electronic device can collect and identify the identity of the user, or report the identity of the user to the server.
  • the server can receive user characteristic data reported by electronic devices, combine the stored massive user characteristic data to make a user portrait of the current user, quickly determine the identity of the current user, improve the speed of user identification, and improve the accuracy of user identification .
  • the server can recommend videos matching the user's identity for the user in combination with the tag information of the videos in the server's video content library.
  • the method can also query the label information of each video segment included in the video, and match the user with segments that meet the user's viewing needs according to the current user identity. For example, when it is identified that the current user is a child user or an underage user, the method can also control the electronic device to switch to the children's mode, and control the scope of movie rights that the electronic device can play in the children's mode. If the child user or underage user plays a selected video, the video provided by the server to the electronic device may also be a filtered video, that is, the content of the video segment in the video that is not suitable for the child user to watch is blocked. This process does not need to manually set the electronic device to switch to the children's mode, and can realize the fine control of the video playback effect in the children's mode, which improves the user experience.
  • the method can also recommend different video content to the user from the video content library according to the recognition result of the scene in combination with the life scene where the electronic device is located.
  • the above method can be aimed at scenarios where multiple people in a family use multiple large-screen devices such as smart screens, where multiple family members use the same device such as a smart screen, or where different users use split screens on large-screen devices such as smart screens.
  • the process can intelligently identify information such as the current life scene of the electronic device, the identity of the current user, the account of the user logged in by the electronic device, and the historical browsing records corresponding to the user account, and combine the label information of each video from the video content library. Recommend one or more videos for this user.
  • the method can accurately recommend video content for the user that conforms to the current scene, the user's habits and hobbies, and improves the user's viewing experience.
  • the electronic device includes hardware and/or software modules corresponding to each function.
  • the present application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is executed by hardware or computer software drives hardware depends on the specific application and design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions in combination with the embodiments for each specific application, but such implementation should not be regarded as exceeding the scope of the present application.
  • the functional modules of the electronic device may be divided according to the above method example.
  • each functional module may be divided corresponding to each function, or two or more functions may be integrated into one processing module.
  • the above integrated modules may be implemented in the form of hardware. It should be noted that the division of modules in this embodiment is schematic, and is only a logical function division, and there may be other division methods in actual implementation.
  • the electronic device 100 involved in the above embodiment may further include: a display unit, a detection unit, and a processing unit.
  • the display unit, the detection unit, and the processing unit cooperate with each other, and may be used to support the electronic device to execute the above steps, and/or be used in other processes of the technologies described herein.
  • the electronic device provided in this embodiment is used to execute the above video playing method, so the same effect as the above implementation method can be achieved.
  • the electronic device may include a processing module, a memory module and a communication module.
  • the processing module can be used to control and manage the actions of the electronic device, for example, it can be used to support the electronic device to execute the steps performed by the above-mentioned display unit, detection unit and processing unit.
  • the memory module can be used to support electronic devices to execute stored program codes and data, and the like.
  • the communication module can be used to support the communication between the electronic device and other devices.
  • the processing module may be a processor or a controller. It can implement or execute the various illustrative logical blocks, modules and circuits described in connection with the present disclosure.
  • the processor can also be a combination of computing functions, such as a combination of one or more microprocessors, a combination of digital signal processing (digital signal processing, DSP) and a microprocessor, and so on.
  • the storage module may be a memory.
  • the communication module may be a device that interacts with other electronic devices, such as a radio frequency circuit, a Bluetooth chip, and a Wi-Fi chip.
  • the electronic device 100 involved in this embodiment may be a device having the structure shown in FIG. 2 .
  • This embodiment also provides a computer-readable storage medium, where computer instructions are stored in the computer-readable storage medium; when the computer instructions are run on the electronic device, the electronic device executes the above-mentioned relevant method steps to realize the method for personalized recommendation of video clips in the above-mentioned embodiments.
  • This embodiment also provides a computer program product, which, when running on a computer, causes the computer to execute the above related steps, so as to implement the method for personalized video segment recommendation in the above embodiment.
  • an embodiment of the present application also provides a device, which may specifically be a chip, a component or a module, and the device may include a connected processor and a memory; wherein the memory is used to store computer-executable instructions, and when the device is running, The processor can execute the computer-executable instructions stored in the memory, so that the chip executes the methods for personalized recommendation of video clips in the above method embodiments.
  • the electronic device, computer-readable storage medium, computer program product or chip provided in this embodiment is all used to execute the corresponding method provided above, therefore, the beneficial effects it can achieve can refer to the above-mentioned The beneficial effects of the corresponding method will not be repeated here.
  • the disclosed devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of modules or units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components can be combined or integrated into another device, or some features may be omitted or not implemented.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or units may be in electrical, mechanical or other forms.
  • a unit described as a separate component may or may not be physically separated, and a component shown as a unit may be one physical unit or multiple physical units, which may be located in one place or distributed to multiple different places. Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, each unit may exist separately physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated units can be implemented in the form of hardware or in the form of software functional units.
  • if an integrated unit is realized in the form of a software functional unit and sold or used as an independent product, it can be stored in a readable storage medium.
  • based on this understanding, the technical solution of the embodiments of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, can be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions to make a device (which may be a single-chip microcomputer, a chip, etc.) or a processor execute all or part of the steps of the methods in the various embodiments of the present application.
  • the aforementioned storage medium includes: various media that can store program codes such as U disk, mobile hard disk, read only memory (ROM), random access memory (random access memory, RAM), magnetic disk or optical disk.

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Computer Graphics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The present application relates to a video clip recommendation method, an electronic device, and a server. The electronic device may be a mobile phone, a smart screen, or a similar apparatus. The server may perform target detection, content recognition, content understanding, natural language understanding, and similar operations on each image frame of a video; determine a tag for each image frame; aggregate the video into a plurality of clips; and extract and store the tag information of each video and of each clip. The method may then recognize the user's identity and/or the current scene and, by combining the tags of the video with the tags of each clip, recommend to the user a video that matches that identity and scene. In addition, when the user selects and plays a certain video, the method may further query the tag information of each video clip contained in that video and, according to the user's current identity, match the user with a clip that meets the user's viewing needs, thereby enabling fine-grained control over the video playback effect and improving the user experience.
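The abstract above describes a pipeline in which per-frame tags are aggregated into clip-level tags that are then matched against the user's identity and current scene. The sketch below illustrates only that final tag-matching step; all names (`Clip`, `recommend_clips`, `user_tags`, the sample tag values) are hypothetical illustrations, not the patent's actual implementation.

```python
# Hypothetical sketch of the clip-tag matching described in the abstract.
# All identifiers and sample tags are illustrative assumptions, not the
# patent's actual implementation.
from dataclasses import dataclass


@dataclass
class Clip:
    video_id: str
    start_s: float   # clip start time, in seconds
    end_s: float     # clip end time, in seconds
    tags: set        # labels aggregated from per-frame detection


def recommend_clips(clips, user_tags, min_overlap=1):
    """Rank clips by how many tags they share with the user's
    identity/scene tags; drop clips below a minimal overlap."""
    scored = []
    for clip in clips:
        overlap = len(clip.tags & user_tags)
        if overlap >= min_overlap:
            scored.append((overlap, clip))
    # Highest tag overlap first
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [clip for _, clip in scored]


clips = [
    Clip("v1", 0.0, 12.5, {"cartoon", "child"}),
    Clip("v1", 12.5, 40.0, {"violence", "adult"}),
    Clip("v2", 0.0, 30.0, {"cooking", "family"}),
]
matches = recommend_clips(clips, user_tags={"child", "cartoon", "family"})
```

Here a clip survives only if it shares at least `min_overlap` tags with the user profile, and surviving clips are ranked by raw set intersection; a production system would more plausibly use weighted or learned similarity rather than unweighted overlap.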
PCT/CN2022/106529 2021-07-21 2022-07-19 Video clip recommendation method, electronic device, and server WO2023001152A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110827774.1 2021-07-21
CN202110827774.1A CN115695860A (zh) Method for recommending video clips, electronic device, and server

Publications (1)

Publication Number Publication Date
WO2023001152A1 (fr) 2023-01-26

Family

ID=84980027

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/106529 WO2023001152A1 (fr) 2021-07-21 2022-07-19 Video clip recommendation method, electronic device, and server

Country Status (2)

Country Link
CN (1) CN115695860A (fr)
WO (1) WO2023001152A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117014687A (zh) * 2023-09-28 2023-11-07 Beijing Xiaotang Technology Co., Ltd. Video positioning and playing method and device based on user playing profile

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8942542B1 (en) * 2012-09-12 2015-01-27 Google Inc. Video segment identification and organization based on dynamic characterizations
US20160037217A1 (en) * 2014-02-18 2016-02-04 Vidangel, Inc. Curating Filters for Audiovisual Content
US20170272818A1 (en) * 2016-03-17 2017-09-21 Comcast Cable Communications, Llc Methods and systems for dynamic content modification
CN107995523A (zh) * 2017-12-21 2018-05-04 Guangdong OPPO Mobile Telecommunications Corp., Ltd. Video playing method and device, terminal, and storage medium
CN109255053A (zh) * 2018-09-14 2019-01-22 Beijing QIYI Century Science & Technology Co., Ltd. Resource search method, device, terminal, server, and computer-readable storage medium
CN109451349A (zh) * 2018-10-31 2019-03-08 Vivo Mobile Communication Co., Ltd. Video playing method and device, and mobile terminal
CN110381364A (zh) * 2019-06-13 2019-10-25 Beijing QIYI Century Science & Technology Co., Ltd. Video data processing method and device, computer equipment, and storage medium
CN110401873A (zh) * 2019-06-17 2019-11-01 Beijing QIYI Century Science & Technology Co., Ltd. Video clipping method and device, electronic equipment, and computer-readable medium
CN110475154A (zh) * 2018-05-10 2019-11-19 Tencent Technology (Shenzhen) Co., Ltd. Internet TV video playing method and device, Internet TV, and computer medium
CN111209440A (zh) * 2020-01-13 2020-05-29 Tencent Technology (Shenzhen) Co., Ltd. Video playing method, device, and storage medium
CN111866550A (zh) * 2020-07-24 2020-10-30 Shanghai Shengfutong Electronic Payment Service Co., Ltd. Method and device for shielding video clips
CN112423133A (zh) * 2019-08-23 2021-02-26 Tencent Technology (Shenzhen) Co., Ltd. Video switching method and device, computer-readable storage medium, and computer equipment
CN114025242A (zh) * 2021-11-09 2022-02-08 Vivo Mobile Communication Co., Ltd. Video processing method, video processing device, and electronic equipment

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105335595A (zh) * 2014-06-30 2016-02-17 Dolby Laboratories Licensing Corp. Perception-based multimedia processing
US10897642B2 (en) * 2019-03-27 2021-01-19 Rovi Guides, Inc. Systems and methods for media content navigation and filtering
CN112214636A (zh) * 2020-09-21 2021-01-12 Huawei Technologies Co., Ltd. Audio file recommendation method and device, electronic equipment, and readable storage medium

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8942542B1 (en) * 2012-09-12 2015-01-27 Google Inc. Video segment identification and organization based on dynamic characterizations
US20160037217A1 (en) * 2014-02-18 2016-02-04 Vidangel, Inc. Curating Filters for Audiovisual Content
US20170272818A1 (en) * 2016-03-17 2017-09-21 Comcast Cable Communications, Llc Methods and systems for dynamic content modification
CN107995523A (zh) * 2017-12-21 2018-05-04 Guangdong OPPO Mobile Telecommunications Corp., Ltd. Video playing method and device, terminal, and storage medium
CN110475154A (zh) * 2018-05-10 2019-11-19 Tencent Technology (Shenzhen) Co., Ltd. Internet TV video playing method and device, Internet TV, and computer medium
CN109255053A (zh) * 2018-09-14 2019-01-22 Beijing QIYI Century Science & Technology Co., Ltd. Resource search method, device, terminal, server, and computer-readable storage medium
CN109451349A (zh) * 2018-10-31 2019-03-08 Vivo Mobile Communication Co., Ltd. Video playing method and device, and mobile terminal
CN110381364A (zh) * 2019-06-13 2019-10-25 Beijing QIYI Century Science & Technology Co., Ltd. Video data processing method and device, computer equipment, and storage medium
CN110401873A (zh) * 2019-06-17 2019-11-01 Beijing QIYI Century Science & Technology Co., Ltd. Video clipping method and device, electronic equipment, and computer-readable medium
CN112423133A (zh) * 2019-08-23 2021-02-26 Tencent Technology (Shenzhen) Co., Ltd. Video switching method and device, computer-readable storage medium, and computer equipment
CN111209440A (zh) * 2020-01-13 2020-05-29 Tencent Technology (Shenzhen) Co., Ltd. Video playing method, device, and storage medium
CN111866550A (zh) * 2020-07-24 2020-10-30 Shanghai Shengfutong Electronic Payment Service Co., Ltd. Method and device for shielding video clips
CN114025242A (zh) * 2021-11-09 2022-02-08 Vivo Mobile Communication Co., Ltd. Video processing method, video processing device, and electronic equipment

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117014687A (zh) * 2023-09-28 2023-11-07 Beijing Xiaotang Technology Co., Ltd. Video positioning and playing method and device based on user playing profile
CN117014687B (zh) * 2023-09-28 2023-12-08 Beijing Xiaotang Technology Co., Ltd. Video positioning and playing method and device based on user playing profile

Also Published As

Publication number Publication date
CN115695860A (zh) 2023-02-03

Similar Documents

Publication Publication Date Title
WO2020211701A1 Model training method, emotion recognition method, and related apparatus and device
WO2020078299A1 Method for processing video file and electronic device
WO2021013145A1 Quick application startup method and related device
US20220080261A1 Recommendation Method Based on Exercise Status of User and Electronic Device
WO2020238356A1 Interface display method and apparatus, terminal, and storage medium
WO2020034227A1 Multimedia content synchronization method and electronic device
WO2021164445A1 Notification processing method, electronic apparatus, and system
CN113542839B Screen projection method for electronic device and electronic device
CN109981885B Method for an electronic device to present video during an incoming call and electronic device
WO2021249318A1 Screen projection method and terminal
CN114173000B Message reply method, electronic device, system, and storage medium
CN109819306B Media file clipping method, electronic device, and server
CN112214636A Audio file recommendation method and device, electronic equipment, and readable storage medium
WO2021093595A1 User identity verification method and electronic device
WO2022007707A1 Home device control method, terminal device, and computer-readable storage medium
US20230291826A1 Control Method Applied to Electronic Device and Electronic Device
CN116009999A Card sharing method, electronic device, and communication system
CN113810542B Control method applied to electronic device, electronic device, and computer storage medium
CN115333941B Method for obtaining application running status and related device
WO2022135157A1 Page display method and apparatus, electronic device, and readable storage medium
WO2023001152A1 Video clip recommendation method, electronic device, and server
WO2022088964A1 Control method and apparatus for electronic device
WO2022037479A1 Photographing method and photographing system
WO2020062014A1 Method for entering information into input box and electronic device
WO2023045597A1 Method and apparatus for controlling transfer of large-screen service between devices

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22845317

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE