CN112040277B - Video-based data processing method and device, computer and readable storage medium - Google Patents


Info

Publication number
CN112040277B
CN112040277B (application CN202010952375.3A)
Authority
CN
China
Prior art keywords
data
remark
video
timestamp
area
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010952375.3A
Other languages
Chinese (zh)
Other versions
CN112040277A (en)
Inventor
钟柯
陈旭东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202010952375.3A
Publication of CN112040277A
Application granted
Publication of CN112040277B
Legal status: Active
Anticipated expiration

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/235Processing of additional data, e.g. scrambling of additional data or processing content descriptors
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • H04L67/1097Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/431Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N21/4312Generation of visual interfaces for content selection or interaction; Content or additional data rendering involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/435Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47End-user applications
    • H04N21/488Data services, e.g. news ticker
    • H04N21/4884Data services, e.g. news ticker for displaying subtitles

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The embodiments of the present application disclose a video-based data processing method and apparatus, a computer, and a readable storage medium. The method relates to big data technology, can be applied to the field of education, and comprises the following steps: acquiring a remark data set, the remark data set comprising at least two pieces of remark data for remarking a target video and a data timestamp associated with each piece of remark data, where the data timestamp belongs to the target video; acquiring video subject data corresponding to each data timestamp from the target video, and classifying the at least two pieces of remark data based on the video subject data to obtain at least two remark classification groups; and in response to a display operation for the remark data set, displaying the at least two remark classification groups in separate areas. This process improves the orderliness with which remark data is displayed.

Description

Video-based data processing method and device, computer and readable storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a video-based data processing method and apparatus, a computer, and a readable storage medium.
Background
With the development of Internet technology, more and more activities are carried out over the Internet, including entertainment (such as online games), work, social networking, and education. In the field of education, for example, a student learning over the Internet generally wants to annotate the video being watched so as to review it later. At present, a screenshot function is typically added to the video: when the student triggers it, an image is captured from the video and stored as a note, and when the student later views the note, the captured image is displayed. As a result, the note content is relatively disordered, which reduces the orderliness of note display.
Disclosure of Invention
The embodiments of the present application provide a video-based data processing method and apparatus, a computer, and a readable storage medium, which can improve the orderliness of remark data display.
An embodiment of the present application provides a video-based data processing method, which includes:
acquiring a remark data set, the remark data set comprising at least two pieces of remark data for remarking a target video and a data timestamp associated with each piece of remark data, where the data timestamp belongs to the target video;
acquiring video subject data corresponding to each data timestamp from the target video, and classifying the at least two pieces of remark data based on the video subject data to obtain at least two remark classification groups;
and in response to a display operation for the remark data set, displaying the at least two remark classification groups in separate areas.
Wherein the method further comprises:
displaying a remark area in response to a trigger operation on an area switching control in an interaction area;
acquiring remark data in response to a trigger operation on the remark area;
and acquiring a first time point in the target video corresponding to the trigger operation on the remark area, taking the first time point as the data timestamp of the remark data, and adding the remark data and its data timestamp to the remark data set.
Wherein the remark area comprises a data category control;
and acquiring remark data in response to a trigger operation on the remark area comprises:
displaying, in response to a trigger operation on the data category control in the remark area, a remark data input area corresponding to the data category control;
and acquiring, in response to a trigger operation on the remark data input area, the remark data input in the remark data input area, where the data type of the remark data is the data type corresponding to the data category control.
Wherein the data category control comprises a voice acquisition control;
and acquiring, in response to a trigger operation on the remark data input area, the remark data input in the remark data input area comprises:
acquiring, in response to a trigger operation on the remark data input area, remark voice information input in the remark data input area;
and performing semantic recognition on the acquired remark voice information to obtain text information corresponding to the remark voice information, and taking the remark voice information and the text information as the remark data.
Wherein the remark area comprises a screenshot control, and acquiring remark data in response to a trigger operation on the remark area comprises:
determining, in response to a trigger operation on the screenshot control in the remark area, a video display area where the target video is located, capturing a first video image displayed in the video display area, and taking the first video image as the remark data.
Wherein the method further comprises:
displaying an interaction area in response to a trigger operation on an area switching control in the remark area, where the interaction area is used for displaying session messages sent by participating users of the target video;
and acquiring, in response to a trigger operation on the interaction area, a session message of the end user, and sending and displaying the session message of the end user, where the end user is the user corresponding to the terminal that displays the remark area.
Wherein acquiring video subject data corresponding to each data timestamp from the target video and classifying the at least two pieces of remark data based on the video subject data to obtain at least two remark classification groups comprises:
acquiring, from the target video, a video segment corresponding to the i-th data timestamp in the target video, and identifying the video segment corresponding to the i-th data timestamp to obtain video subject data corresponding to the i-th data timestamp, until the video subject data corresponding to each data timestamp is obtained, where i is a positive integer and i is less than or equal to the number of the at least two pieces of remark data;
and dividing remark data whose data timestamps have the same video subject data into one class to obtain the at least two remark classification groups.
Wherein acquiring, from the target video, the video segment corresponding to the i-th data timestamp in the target video and identifying the video segment corresponding to the i-th data timestamp to obtain the video subject data corresponding to the i-th data timestamp comprises:
determining a segment time range containing the i-th data timestamp, and acquiring a video segment corresponding to the segment time range from the target video;
acquiring at least two second video images forming the video segment, and determining a target video image corresponding to the i-th data timestamp among the at least two second video images, where the target video image is the p-th of the at least two second video images, p is a positive integer, and p is less than or equal to the number of the at least two second video images;
identifying the p-th second video image, and if an identification result of the p-th second video image is obtained, taking the identification result of the p-th second video image as the video subject data corresponding to the i-th data timestamp;
and if no identification result of the p-th second video image is obtained, identifying the (p-1)-th of the at least two second video images, and so on until the video subject data corresponding to the i-th data timestamp is obtained.
Wherein acquiring, from the target video, the video segment corresponding to the i-th data timestamp in the target video and identifying the video segment corresponding to the i-th data timestamp to obtain the video subject data corresponding to the i-th data timestamp comprises:
determining a segment time range containing the i-th data timestamp, acquiring a video segment corresponding to the segment time range from the target video, and acquiring video speech data in the video segment;
and performing semantic analysis on the video speech data to obtain the video subject data corresponding to the i-th data timestamp.
Wherein displaying, in response to the display operation for the remark data set, the at least two remark classification groups in separate areas comprises:
acquiring the at least two remark classification groups in response to the display operation for the remark data set;
and acquiring the data timestamps respectively corresponding to the k pieces of remark data in a remark classification group C_j, sorting the k pieces of remark data based on the data timestamps, and sequentially displaying the sorted k pieces of remark data in the j-th classification display area, where k is a positive integer representing the number of remark data included in the remark classification group C_j, j is a positive integer, and j is less than or equal to the number of the at least two remark classification groups.
Wherein the method further comprises:
displaying a remark prompt message, determining a video display area based on the remark prompt message, and capturing a third video image displayed in the video display area;
displaying a remark area based on the remark prompt message, and acquiring note data based on the remark area;
generating remark data according to the third video image and the note data, acquiring a second time point in the target video corresponding to the remark prompt message, and taking the second time point as the data timestamp of the remark data;
and adding the remark data and its data timestamp to the remark data set.
An aspect of an embodiment of the present application provides a video-based data processing apparatus, the apparatus comprising:
a remark acquisition module, configured to acquire a remark data set, the remark data set comprising at least two pieces of remark data for remarking a target video and a data timestamp associated with each piece of remark data, where the data timestamp belongs to the target video;
a remark classification module, configured to acquire video subject data corresponding to each data timestamp from the target video, and classify the at least two pieces of remark data based on the video subject data to obtain at least two remark classification groups;
and a remark display module, configured to display, in response to a display operation for the remark data set, the at least two remark classification groups in separate areas.
Wherein the apparatus further comprises:
an area switching module, configured to display the remark area in response to a trigger operation on an area switching control in the interaction area;
the remark acquisition module, further configured to acquire remark data in response to a trigger operation on the remark area;
and a remark storage module, configured to acquire a first time point in the target video corresponding to the trigger operation on the remark area, take the first time point as the data timestamp of the remark data, and add the remark data and its data timestamp to the remark data set.
Wherein the remark area comprises a data category control;
and the remark acquisition module comprises:
an input area display unit, configured to display, in response to a trigger operation on the data category control in the remark area, a remark data input area corresponding to the data category control;
and an input data acquisition unit, configured to acquire, in response to a trigger operation on the remark data input area, the remark data input in the remark data input area, where the data type of the remark data is the data type corresponding to the data category control.
Wherein the data category control comprises a voice acquisition control;
and the input data acquisition unit comprises:
a voice acquisition subunit, configured to acquire, in response to a trigger operation on the remark data input area, remark voice information input in the remark data input area;
and a remark determining subunit, configured to perform semantic recognition on the acquired remark voice information to obtain text information corresponding to the remark voice information, and take the remark voice information and the text information as the remark data.
Wherein the remark area comprises a screenshot control, and the remark acquisition module comprises:
an image capturing unit, configured to determine, in response to a trigger operation on the screenshot control in the remark area, the video display area where the target video is located, capture a first video image displayed in the video display area, and take the first video image as the remark data.
Wherein the apparatus further comprises:
an interaction display module, configured to display the interaction area in response to a trigger operation on the area switching control in the remark area, where the interaction area is used for displaying session messages sent by participating users of the target video;
and a user session module, configured to acquire, in response to a trigger operation on the interaction area, a session message of the end user, and send and display the session message of the end user, where the end user is the user corresponding to the terminal that displays the remark area.
Wherein the remark classification module comprises:
a subject acquisition unit, configured to acquire, from the target video, a video segment corresponding to the i-th data timestamp in the target video, and identify the video segment corresponding to the i-th data timestamp to obtain video subject data corresponding to the i-th data timestamp, until the video subject data corresponding to each data timestamp is obtained, where i is a positive integer and i is less than or equal to the number of the at least two pieces of remark data;
and a remark dividing unit, configured to divide remark data whose data timestamps have the same video subject data into one class to obtain the at least two remark classification groups.
In respect of acquiring, from the target video, the video segment corresponding to the i-th data timestamp in the target video and identifying the video segment corresponding to the i-th data timestamp to obtain the video subject data corresponding to the i-th data timestamp, the subject acquisition unit comprises:
a video segment acquisition subunit, configured to determine a segment time range containing the i-th data timestamp, and acquire a video segment corresponding to the segment time range from the target video;
a target determining subunit, configured to acquire at least two second video images forming the video segment, and determine a target video image corresponding to the i-th data timestamp among the at least two second video images, where the target video image is the p-th of the at least two second video images, p is a positive integer, and p is less than or equal to the number of the at least two second video images;
an image identification subunit, configured to identify the p-th second video image, and if an identification result of the p-th second video image is obtained, take the identification result of the p-th second video image as the video subject data corresponding to the i-th data timestamp;
the image identification subunit being further configured to, if no identification result of the p-th second video image is obtained, identify the (p-1)-th of the at least two second video images, and so on until the video subject data corresponding to the i-th data timestamp is obtained.
In respect of acquiring, from the target video, the video segment corresponding to the i-th data timestamp in the target video and identifying the video segment corresponding to the i-th data timestamp to obtain the video subject data corresponding to the i-th data timestamp, the subject acquisition unit comprises:
a voice segment acquisition subunit, configured to determine a segment time range containing the i-th data timestamp, acquire a video segment corresponding to the segment time range from the target video, and acquire video speech data in the video segment;
and a semantic analysis subunit, configured to perform semantic analysis on the video speech data to obtain the video subject data corresponding to the i-th data timestamp.
Wherein the remark display module comprises:
a classification group acquisition unit, configured to acquire the at least two remark classification groups in response to the display operation for the remark data set;
and a partition display unit, configured to acquire the data timestamps respectively corresponding to the k pieces of remark data in a remark classification group C_j, sort the k pieces of remark data based on the data timestamps, and sequentially display the sorted k pieces of remark data in the j-th classification display area, where k is a positive integer representing the number of remark data included in the remark classification group C_j, j is a positive integer, and j is less than or equal to the number of the at least two remark classification groups.
Wherein the apparatus further comprises:
an image acquisition module, configured to display a remark prompt message, determine a video display area based on the remark prompt message, and capture a third video image displayed in the video display area;
a note acquisition module, configured to display a remark area based on the remark prompt message, and acquire note data based on the remark area;
and a remark generation module, configured to generate remark data according to the third video image and the note data, acquire a second time point in the target video corresponding to the remark prompt message, and take the second time point as the data timestamp of the remark data;
the remark storage module being further configured to add the remark data and its data timestamp to the remark data set.
One aspect of the embodiments of the present application provides a computer device, including a processor, a memory, and an input/output interface;
The processor is connected to the memory and to the input/output interface, where the input/output interface is configured to receive and output data, the memory is configured to store a computer program, and the processor is configured to call the computer program to perform the video-based data processing method according to an aspect of the embodiments of the present application.
An aspect of the embodiments of the present application provides a computer-readable storage medium, in which a computer program is stored, where the computer program includes program instructions, and when the program instructions are executed by a processor, the method for processing data based on video according to an aspect of the embodiments of the present application is performed.
An aspect of an embodiment of the present application provides a computer program product or a computer program, which includes computer instructions stored in a computer-readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the method provided in the various alternatives in one aspect of the embodiments of the application.
The embodiment of the application has the following beneficial effects:
the embodiment of the application acquires a remark data set; the remark data set comprises at least two remark data for remarking the target video and a data timestamp associated with each remark data; the data timestamp belongs to the target video; acquiring video subject data corresponding to each data timestamp from a target video, and classifying at least two remark data based on the video subject data to obtain at least two remark classification groups; and responding to the display operation aiming at the remark data set, and performing regional display on at least two remark classification groups. Through the process, the remark data are acquired and classified, and the regional display can be performed on the basis of the classified remark data (namely, remark classification groups), so that the remark data can be classified and displayed on the basis of different video subject data, and the display order of the remark data is improved.
Drawings
To illustrate the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Clearly, the drawings in the following description show only some embodiments of the present application, and a person skilled in the art may derive other drawings from them without creative effort.
Fig. 1 is a network architecture diagram of video-based data processing provided by an embodiment of the present application;
Fig. 2 is a schematic view of a remark data display scene provided by an embodiment of the present application;
Fig. 3 is a flowchart of a video-based data processing method provided by an embodiment of the present application;
Fig. 4 is a schematic view of a video theme acquisition scene provided by an embodiment of the present application;
Fig. 5a is a schematic view of a remark data display scene provided by an embodiment of the present application;
Fig. 5b is a schematic view of another remark data display scene provided by an embodiment of the present application;
Fig. 6 is a schematic diagram of a process of acquiring remark data provided by an embodiment of the present application;
Fig. 7a is a schematic diagram of switching an area display provided by an embodiment of the present application;
Fig. 7b is a schematic diagram of another way of switching an area display provided by an embodiment of the present application;
Fig. 8a is a schematic diagram of a remark data input area corresponding to text provided by an embodiment of the present application;
Fig. 8b is a schematic diagram of a remark data input area corresponding to voice provided by an embodiment of the present application;
Fig. 8c is a schematic diagram of a remark data input area corresponding to handwriting provided by an embodiment of the present application;
Fig. 9 is a schematic structural diagram of a video-based data processing apparatus provided by an embodiment of the present application;
Fig. 10 is a schematic structural diagram of a computer device provided by an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the embodiments of the present application, cloud storage technology, big data technology, and other technologies in the cloud technology field may be applied to the field of cloud education.
A distributed cloud storage system (hereinafter referred to as a storage system) is a storage system that, through application software or application interfaces and by means of functions such as cluster applications, grid technology, and distributed storage file systems, integrates a large number of storage devices of various types in a network (storage devices are also referred to as storage nodes) to work cooperatively, providing data storage and service access functions externally.
At present, the storage method of a storage system is as follows: logical volumes are created, and when a logical volume is created, physical storage space is allocated to it; this space may consist of the disks of one storage device or of several storage devices. When a client stores data on a logical volume, the data is stored on a file system: the file system divides the data into several parts, each part being an object, and an object contains not only the data but also additional information such as a data identifier (ID). The file system writes each object into the physical storage space of the logical volume and records the storage location information of each object, so that when the client requests access to the data, the file system can let the client access the data according to the storage location information of each object.
The process by which the storage system allocates physical storage space to a logical volume is specifically as follows: the physical storage space is divided in advance into stripes according to a capacity estimate for the objects to be stored in the logical volume (the estimate often leaves a large margin relative to the capacity of the actual objects to be stored) and the Redundant Array of Independent Disks (RAID) scheme, and one logical volume can be understood as one stripe; physical storage space is thereby allocated to the logical volume. In the embodiments of the present application, the generated data, such as remark data sets and videos, can be stored through cloud storage technology, which reduces the storage space occupied on the computer device and improves the efficiency of data interaction between computer devices.
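As a rough illustration of the object-storage model just described, the following Python sketch splits data into objects, records each object's storage location, and reassembles the data on access. It is a minimal sketch under simplified assumptions; the LogicalVolume class, its put/get methods, and the chunk size are hypothetical and do not correspond to any real storage product's API.

```python
# A minimal sketch of the object-storage model described above; all names
# here (LogicalVolume, put, get) are hypothetical illustrations.
from dataclasses import dataclass, field

CHUNK_SIZE = 4  # bytes per object; real systems use far larger stripes


@dataclass
class LogicalVolume:
    objects: dict = field(default_factory=dict)  # object ID -> bytes
    index: dict = field(default_factory=dict)    # file name -> [object IDs]

    def put(self, name: str, data: bytes) -> None:
        # The file system splits the data into objects, writes each object,
        # and records the storage location (here, the object ID) of each.
        ids = []
        for i in range(0, len(data), CHUNK_SIZE):
            oid = f"{name}#{i // CHUNK_SIZE}"
            self.objects[oid] = data[i:i + CHUNK_SIZE]
            ids.append(oid)
        self.index[name] = ids

    def get(self, name: str) -> bytes:
        # Access uses the recorded location information of each object.
        return b"".join(self.objects[oid] for oid in self.index[name])


vol = LogicalVolume()
vol.put("remarks.json", b'{"remark": "angle basics"}')
assert vol.get("remarks.json") == b'{"remark": "angle basics"}'
```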
Big data refers to data sets that cannot be captured, managed, and processed by conventional software tools within a given time range; it is a massive, fast-growing, and diversified information asset that requires new processing modes to yield stronger decision-making power, insight, and process optimization capability. With the advent of the cloud era, big data has attracted more and more attention; it requires special techniques to effectively process large amounts of data within a tolerable elapsed time. Technologies applicable to big data include massively parallel processing databases, data mining, distributed file systems, distributed databases, cloud computing platforms, the Internet, and scalable storage systems. A large amount of data, including but not limited to remark data and videos, may be generated in the embodiments of the present application, and the data involved may be processed based on big data technology to improve data processing efficiency.
The embodiments of the present application can be applied to the field of cloud education. Cloud Computing Education (CCEDU) refers to education platform services based on the cloud computing business model. On a cloud platform, education institutions, training institutions, enrollment service institutions, publicity institutions, industry associations, management institutions, industry media, legal institutions, and the like are integrated into a resource pool in a centralized cloud mode, where all resources display, interact with, and communicate with one another on demand to achieve their goals, thereby reducing education costs and improving efficiency.
Specifically, referring to fig. 1, fig. 1 is a network architecture diagram of video-based data processing provided by an embodiment of the present application; the functions implemented in the embodiments of the present application may be applied to any computer device to which remark data can be added. As shown in fig. 1, the computer devices may exchange data through a server 101, where the server 101 is the server corresponding to an application program, and the application program may be any program that can upload or display a video; the video may be a live video or a recorded video, which is not limited herein. A computer device according to the embodiments of the present application may upload a video in the application program and may also view videos existing in the application program. For example, the computer device 102 uploads a target video in the application program, and the computer device 103a, the computer device 103b, or the computer device 103c can view the target video and add remark data to it. Optionally, the computer devices in the embodiments of the present application may also exchange data through a blockchain network; for example, the remark data, videos, and the like are stored in the blockchain network, and a computer device may upload a video to the blockchain network, acquire a video from the blockchain network for display, add remark data to the acquired video, and upload the added remark data to the blockchain network for storage, thereby improving the storage security of the remark data.
Taking the computer device 103a as an example, the computer device 103a may display the target video uploaded by the computer device 102, and the user associated with the computer device 103a may add remark data to the target video. The computer device 103a may obtain a remark data set and display it, where the remark data set includes the remark data added by the user associated with the computer device 103a. Optionally, the remark data set may be stored in the server (or cloud server) 101, or in the internal storage space of the computer device 103a.
It is understood that the computer device mentioned in the embodiments of the present application may include, but is not limited to, a terminal device or a server. In other words, the computer device may be a server or a terminal device, or may be a system composed of a server and a terminal device. The above-mentioned terminal device may be an electronic device, including but not limited to a mobile phone, a tablet computer, a desktop computer, a notebook computer, a palm-top computer, an Augmented Reality/Virtual Reality (AR/VR) device, a helmet-mounted display, a wearable device, a smart speaker, a digital camera, a camera, and other Mobile Internet Devices (MID) with network access capability. The above-mentioned server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as a cloud service, a cloud database, cloud computing, a cloud function, cloud storage, a Network service, cloud communication, a middleware service, a domain name service, a security service, a Content Delivery Network (CDN), a big data and artificial intelligence platform, and the like.
Further, please refer to fig. 2; fig. 2 is a schematic view of a remark data display scene provided by an embodiment of the present application. As shown in fig. 2, the computer device obtains a remark data set 201. Assume that the remark data set 201 includes remark data 1, remark data 2, and remark data 3 for remarking the target video, together with a data timestamp 1 associated with remark data 1, a data timestamp 2 associated with remark data 2, and a data timestamp 3 associated with remark data 3. The computer device obtains, from the target video 202, video subject data 1 corresponding to data timestamp 1, video subject data 2 corresponding to data timestamp 2, video subject data 1 corresponding to data timestamp 3, and so on. Taking remark data 1, remark data 2, and remark data 3 as an example, the computer device classifies them based on the video subject data to obtain a remark classification group 2031 corresponding to video subject data 1 and a remark classification group 2032 corresponding to video subject data 2, where the remark classification group 2031 includes remark data 1 and remark data 3, and the remark classification group 2032 includes remark data 2. In response to a display operation for the remark data set 201, the remark classification group 2031 and the remark classification group 2032 are displayed in different areas; for example, the remark classification group 2031 may be displayed in an area 2041 of the remark display page 204, and the remark classification group 2032 in an area 2042. Optionally, each area may also display the video subject data corresponding to that area; for example, video subject data 1 is displayed in the area 2041 and video subject data 2 in the area 2042. This makes the display of the remark data more orderly and the content of the remark display page more complete and clear, which can improve the user's experience of viewing remark data.
Further, please refer to fig. 3; fig. 3 is a flowchart of a video-based data processing method provided by an embodiment of the present application. As shown in fig. 3, the video-based data processing procedure includes the following steps:
step S301, a remark data set is obtained, wherein the remark data set comprises at least two remark data for remarking the target video and a data timestamp associated with each remark data.
In this embodiment of the present application, the computer device may obtain a remark data set, where each data timestamp included in the remark data set is obtained from the target video, that is, the data timestamp belongs to the target video. For example, if a data timestamp is "5 minutes 30 seconds", it refers to the 5 minute 30 second mark of the target video; in other words, the data timestamp indicates the position, within the target video, of the video content that the corresponding remark data concerns. If only one piece of remark data exists in the remark data set, the computer device can directly acquire and display it; if at least two pieces of remark data exist, the at least two pieces of remark data and the data timestamp associated with each piece are acquired, and step S302 is performed. The target video may be a video in any field, such as education (teaching videos), games (game commentary videos), or music (music theory videos), which is not limited herein. For example, in the field of education, the remark data may be classroom notes on an educational video; in the field of games, the remark data may be game strategies or skill combinations for a game commentary video; in the field of music, the remark data may be music theory notes on a music theory explanation video.
The remark data may be data input by the end user for the target video, or may be obtained by the computer device capturing a first video image displayed in the target video and generating the remark data from that image. The end user refers to the user corresponding to the computer device (i.e., the execution subject in this embodiment of the present application).
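As an illustration of the remark data set described in step S301, the following Python sketch shows one possible in-memory representation; the Remark class and its field names are hypothetical, chosen only to mirror the "remark data plus associated data timestamp" structure described above.

```python
# A minimal sketch of a remark data set, under assumed (hypothetical) names.
from dataclasses import dataclass


@dataclass
class Remark:
    content: str        # note text, an image path, a voice transcript, etc.
    timestamp_s: float  # data timestamp: position in the target video, in seconds


remark_set = [
    Remark("definition of an acute angle", 330.0),  # 5 min 30 s into the video
    Remark("screenshot of example 2", 585.0),       # 9 min 45 s into the video
]
```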
Step S302, video subject data corresponding to each data timestamp is obtained from the target video, and the at least two pieces of remark data are classified based on the video subject data to obtain at least two remark classification groups.
In this embodiment of the present application, the computer device obtains, from the target video, a video segment corresponding to the i-th data timestamp in the target video, identifies the video segment corresponding to the i-th data timestamp to obtain video subject data corresponding to the i-th data timestamp, and repeats this until the video subject data corresponding to every data timestamp is obtained, where i is a positive integer and i is less than or equal to the number of pieces of remark data included in the at least two pieces of remark data. Remark data whose data timestamps correspond to the same video subject data are then divided into one class, yielding at least two remark classification groups.
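The classification step can be pictured with the following minimal Python sketch: remark data whose timestamps map to the same video subject data fall into one remark classification group. The get_video_subject_data lookup is a hypothetical stand-in for the frame and speech recognition described below.

```python
# A minimal sketch of grouping remark data by video subject data.
from collections import defaultdict


def classify(remarks, get_video_subject_data):
    """remarks: list of (content, timestamp_s) pairs for one target video."""
    groups = defaultdict(list)  # video subject data -> remark classification group
    for content, t in remarks:
        groups[get_video_subject_data(t)].append((content, t))
    return groups


remarks = [("note on acute angles", 330.0),
           ("note on obtuse angles", 585.0),
           ("example 1", 350.0)]
# Stub subject lookup keyed on coarse time ranges, for illustration only:
subject_of = lambda t: "knowledge point 1" if t < 400 else "knowledge point 2"
print(dict(classify(remarks, subject_of)))
# -> {'knowledge point 1': [('note on acute angles', 330.0), ('example 1', 350.0)],
#     'knowledge point 2': [('note on obtuse angles', 585.0)]}
```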
In one method of determining video subject data, the computer device may determine a segment time range containing the i-th data timestamp and acquire the video segment corresponding to that segment time range from the target video; specifically, a preset segment duration may be obtained, and a segment time range of that duration is derived based on the i-th data timestamp. The computer device acquires at least two second video images forming the video segment, recognizes the at least two second video images, and obtains the video subject data corresponding to the i-th data timestamp from the recognition result of each second video image. Optionally, the second video images may be recognized using an image recognition technology such as Optical Character Recognition (OCR). The computer device may determine a target video image corresponding to the i-th data timestamp among the at least two second video images, where the target video image is the p-th of the at least two second video images, p is a positive integer, and p is less than or equal to the number of the at least two second video images. The p-th second video image is recognized; if a recognition result of the p-th second video image is obtained, that result is taken as the video subject data corresponding to the i-th data timestamp. If no recognition result of the p-th second video image is obtained, the (p-1)-th of the at least two second video images is recognized; if a recognition result of the (p-1)-th second video image is obtained, that result is taken as the video subject data corresponding to the i-th data timestamp, and if not, the (p-2)-th second video image is recognized, and so on until the video subject data corresponding to the i-th data timestamp is obtained. Optionally, without first acquiring the video segment, the at least two second video images forming it may be acquired directly; the target video image corresponding to the i-th data timestamp among them is determined and recognized, and if a recognition result of the target video image is obtained, it is taken as the video subject data corresponding to the i-th data timestamp; if not, the at least two second video images are traversed in turn, taking the target video image as the starting point, until the video subject data corresponding to the i-th data timestamp is obtained.
For example, if the i-th data timestamp is 4 minutes 20 seconds, a segment time range containing it is determined; assume the segment time range is from 1 minute 20 seconds to 8 minutes 20 seconds. The second video image at 4 minutes 20 seconds is recognized; if its recognition result is obtained, that result is taken as the video subject data corresponding to the i-th data timestamp. If no recognition result is obtained for the second video image at 4 minutes 20 seconds, the second video image of the frame preceding it is recognized, and so on, until the video subject data corresponding to the i-th data timestamp is obtained, where "preceding" is determined by the corresponding time point in the target video. In short, recognition can start from the second video image corresponding to the i-th data timestamp and traverse the at least two second video images forming the video segment corresponding to that timestamp until the video subject data is obtained.
Optionally, if the video segment has been fully traversed and the video subject data corresponding to the i-th data timestamp has still not been obtained, video frames adjacent to the video segment are acquired from the target video for recognition; that is, the video segment is expanded until the video subject data corresponding to the i-th data timestamp is obtained. Alternatively, the computer device may obtain the video speech data corresponding to the video segment and recognize it to obtain the video subject data corresponding to the i-th data timestamp.
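The frame-fallback traversal described above might look like the following Python sketch. The recognize() callable (for example, an OCR call) is assumed rather than a real API, and the backward-then-forward traversal order is one reasonable reading of "taking the target video image as a reference".

```python
def subject_for_timestamp(frames, p, recognize):
    """frames: the second video images of the segment, in time order;
    p: index of the target video image for the i-th data timestamp;
    recognize: returns a result string, or None when a frame yields nothing."""
    # Start at the target frame; on failure fall back to earlier frames
    # (p-1, p-2, ...), then scan forward past p.
    for idx in list(range(p, -1, -1)) + list(range(p + 1, len(frames))):
        result = recognize(frames[idx])
        if result is not None:
            return result
    return None  # caller may expand the segment or fall back to speech data


frames = ["img_a1", "img_a2", "img_a3", "img_a4"]
fake_ocr = lambda f: "size of an angle" if f == "img_a2" else None
print(subject_for_timestamp(frames, 2, fake_ocr))  # -> size of an angle
```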
In another method of determining video subject data, the computer device may determine a segment time range containing the i-th data timestamp, acquire the video segment corresponding to that segment time range from the target video, acquire the video speech data in the video segment, and perform semantic analysis on the video speech data to obtain the video subject data corresponding to the i-th data timestamp.
Optionally, the computer device may determine a segment time range containing the i-th data timestamp, acquire the video segment corresponding to that range from the target video, and acquire the video speech data in the video segment. The video segment is recognized to obtain the picture theme data of the video segment, and semantic analysis is performed on the video speech data to obtain the voice theme data. The picture theme data and the voice theme data are combined to obtain the video subject data corresponding to the i-th data timestamp. If the picture theme data and the voice theme data cannot clearly represent the video subject data of the video segment, keyword statistics are performed on the picture theme data and the voice theme data, and the keywords whose occurrence frequency is greater than a theme threshold and which are not meaningless words are combined to obtain the video subject data corresponding to the i-th data timestamp. Meaningless words are widely used words that generally play only an auxiliary role, such as "yes", "very", or "no"; they may be preset, or generated and refined from the video subject data as it is determined.
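The keyword-statistics merge can be sketched as follows in Python, assuming a whitespace-tokenizable language; the stop-word list standing in for the "meaningless words" and the theme threshold are hypothetical values.

```python
# A minimal sketch of merging picture theme data and voice theme data.
from collections import Counter

MEANINGLESS_WORDS = {"yes", "very", "no", "the", "a", "is", "of", "an", "and"}
THEME_THRESHOLD = 1  # keep keywords whose occurrence frequency exceeds this


def merge_theme_data(picture_theme: str, voice_theme: str) -> str:
    # Count keywords across both theme strings, drop meaningless words,
    # and keep the keywords whose frequency exceeds the theme threshold.
    words = (picture_theme + " " + voice_theme).lower().split()
    counts = Counter(w for w in words if w not in MEANINGLESS_WORDS)
    keywords = [w for w, n in counts.most_common() if n > THEME_THRESHOLD]
    return " ".join(keywords)


print(merge_theme_data("size of an angle", "angle size and measurement"))
# -> size angle
```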
For example, please refer to fig. 4; fig. 4 is a schematic view of a video theme acquisition scene provided by an embodiment of the present application. As shown in fig. 4, the computer device obtains the video segment 401 corresponding to the i-th data timestamp from the target video. In the first method, at least two second video images 402 forming the video segment 401 are obtained, including the second video image a1, the second video image a2, the second video image a3, the second video image a4, and so on. The second video image at the time point in the target video corresponding to the i-th data timestamp is acquired; assuming it is the second video image a3, the second video image a3 is recognized, and if its recognition result is obtained, that result is taken as the picture theme data corresponding to the i-th data timestamp. If no recognition result of the second video image a3 is obtained, the second video image a2, the second video images a1, ..., the second video image a4, and so on can be recognized in turn until the picture theme data of the i-th data timestamp is obtained, and the picture theme data is taken as the video subject data of the i-th data timestamp.
In the second method, the video speech data 402 corresponding to the video segment 401 is obtained, semantic analysis is performed on it to obtain the voice theme data corresponding to the i-th data timestamp, and the voice theme data is taken as the video subject data corresponding to the i-th data timestamp. In the third method, the computer device can integrate the picture theme data and the voice theme data to obtain the video subject data corresponding to the i-th data timestamp; for example, it counts the occurrence frequency of the keywords in the picture theme data and the voice theme data and determines the video subject data of the i-th data timestamp according to those frequencies, or it obtains a theme-leading word and, according to the theme-leading word, determines the video subject data corresponding to the i-th data timestamp from the picture theme data and the voice theme data, where a theme-leading word is a word or phrase that introduces the theme, such as "this section is about". Optionally, the video segment 401 may also be acquired and input into a theme acquisition model, and the video subject data corresponding to the i-th data timestamp is obtained from the prediction output by the model. Alternatively, when a theme display area exists in the target video, the target video image corresponding to the i-th data timestamp may be acquired, and the theme display area of the target video image is recognized to obtain the video subject data corresponding to the i-th data timestamp; if the video subject data is not obtained from the target video image, the theme display area of the video image of the preceding frame in the target video is recognized, and if the video subject data is still not obtained, the frame before that is recognized, and so on until the video subject data corresponding to the i-th data timestamp is obtained. In other words, the manner of obtaining the video subject data is not limited herein.
Step S303, in response to a display operation for the remark data set, the at least two remark classification groups are displayed in separate areas.
In this embodiment of the present application, the computer device, in response to the display operation for the remark data set, obtains the remark classification groups into which the remark data set has been divided. If there is only one remark classification group, the remark data in it are displayed in order based on their data timestamps; if there are at least two remark classification groups, they are displayed in separate areas. Specifically, when there are at least two remark classification groups, the computer device acquires them in response to the display operation for the remark data set; it then acquires the data timestamps respectively corresponding to the k pieces of remark data in a remark classification group C_j, sorts the k pieces of remark data based on the data timestamps, and sequentially displays the sorted k pieces of remark data in the j-th classification display area, where k is a positive integer representing the number of remark data included in the remark classification group C_j, j is a positive integer, and j is less than or equal to the number of the at least two remark classification groups.
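Step S303 can be pictured with the following minimal Python sketch: within each remark classification group C_j, the k pieces of remark data are sorted by data timestamp and rendered in the j-th classification display area. The render_area callback is a hypothetical display hook.

```python
# A minimal sketch of per-area display of sorted remark classification groups.
def display_by_area(groups, render_area):
    """groups: dict mapping video subject data -> list of (content, timestamp_s)."""
    for j, (subject, remarks) in enumerate(groups.items(), start=1):
        ordered = sorted(remarks, key=lambda r: r[1])  # sort by data timestamp
        render_area(j, subject, ordered)               # j-th classification display area


groups = {
    "knowledge point 1": [("example 1", 350.0), ("note on acute angles", 330.0)],
    "knowledge point 2": [("note on obtuse angles", 585.0)],
}
display_by_area(groups,
                lambda j, s, r: print(f"area {j} [{s}]: {[c for c, _ in r]}"))
# area 1 [knowledge point 1]: ['note on acute angles', 'example 1']
# area 2 [knowledge point 2]: ['note on obtuse angles']
```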
The user corresponding to the computer device is called the end user. The end user can view the target video, and the computer device can, in response to a display operation for the target video, display the target video together with the remark data remarking it. The computer device can also, in response to a trigger operation on a remark data display control, display all of the remark data the end user has made; specifically, the videos remarked by the end user are divided into video groups according to video category, and the remark data set corresponding to each video is displayed based on its video group.
For example, please refer to fig. 5a; fig. 5a is a schematic view of a remark data display scene provided by an embodiment of the present application. As shown in fig. 5a, assume the application includes three pages: "home page", "video", and "my". The home page may be used to display videos the application recommends based on the end user's history; the video page may be used to display the videos existing in the application, where the end user can search for and view videos; and the "my" page is used to display data related to the end user. This page division is only one example used in the embodiments of the present application; dividing pages in other ways does not affect the use of the embodiments. For example, controls such as history, remark data, and video resources exist in the "my" page 501. The computer device, in response to a trigger operation on the remark data display control corresponding to the remark data, displays the remark data display page 502 and displays, in the remark data display page 502, the remark data set corresponding to each video based on its video group. In the field of education, for example, the video category may be determined according to subject and course. As shown in fig. 5a, two video groups are acquired: a "[Math] [Grade 4] New Quayside summer systematic mathematics class" and a "[Chinese] [Grade 1] New Quayside three-in-one reading and writing class" (the video groups may be named after their video categories). The mathematics video group comprises a remark data set corresponding to "How Much Do You Know About the Size of an Angle (1)" and a remark data set corresponding to "How Much Do You Know About the Size of an Angle (2)", and the Chinese video group comprises a remark data set corresponding to a reading video. In response to a selection operation for a subject, the computer device may display, in the remark data display page 502, the remark data sets of the video groups included in the selected subject.
The computer device, in response to a display operation for the remark data set corresponding to "How Much Do You Know About the Size of an Angle (1)", displays the remark classification group display page 503 and displays the remark classification groups in it; at this point, the video corresponding to "How Much Do You Know About the Size of an Angle (1)" is the target video. For example, the remark data set is divided into two remark classification groups, where the video subject data of the first is knowledge point 1 and that of the second is knowledge point 2; the remark data corresponding to knowledge point 1 is displayed in the first classification display area 5031 of the remark classification group display page 503, and the remark data corresponding to knowledge point 2 is displayed in the second classification display area 5032. Optionally, in the field of games, the video category may be determined according to the game, and videos are classified on that basis to obtain video groups; for example, Peace Elite corresponds to one video category and CrossFire: Gunfight King (Cross Fire Mobile Games, CFM) corresponds to another, i.e., related videos of the same game are determined as one video group.
For example, please refer to fig. 5b, which is a schematic view of another scenario for displaying remark data provided in an embodiment of the present application. As shown in fig. 5b, assuming that the application includes the three pages "home page", "video" and "my", the computer device displays a history display page 505 in response to a trigger operation on the history in the "my" page 504, and displays the videos viewed by the end user, such as a video 5051 and a video 5052, in the history display page 505. In response to a trigger operation on the video 5051, the computer device displays the video 5051 and the remark data set corresponding to the video 5051 in the video display page 506, where the remark data set can be displayed in the remark display area 5061; at this time, the video 5051 is the target video. Optionally, the computer device may obtain the playing time point of the video 5051, obtain the to-be-displayed remark data corresponding to that playing time point from the remark data set according to the correspondence between the remark data in the remark data set and the data timestamps, display the to-be-displayed remark data in the remark display area 5061, and update it as the video 5051 plays. The computer device responds to a trigger operation on the remark display area 5061, displays a remark classification group display page 507, and displays the remark classification groups obtained from the remark data set in it; specifically, the remark classification group corresponding to knowledge point 1 is displayed in a first classification display area 5071, and the remark classification group corresponding to knowledge point 2 is displayed in a second classification display area 5072. For the display manner, refer to the remark classification group display page 503 in fig. 5a.
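To make the playback-synchronized lookup concrete, the following Python sketch selects the to-be-displayed remark for a given playing time point. It is an illustration only, not the embodiment's implementation; the RemarkEntry type and the assumption that the set is sorted by data timestamp are introduced here for the example.

    from bisect import bisect_right
    from dataclasses import dataclass
    from typing import List, Optional

    @dataclass
    class RemarkEntry:
        timestamp: float  # data timestamp, in seconds into the target video
        content: str      # remark data (text, or a reference to voice/image data)

    def current_remark(remark_set: List[RemarkEntry], play_position: float) -> Optional[RemarkEntry]:
        # Return the most recent remark whose data timestamp is at or before
        # the current playing time point; remark_set is assumed sorted.
        timestamps = [entry.timestamp for entry in remark_set]
        index = bisect_right(timestamps, play_position)
        return remark_set[index - 1] if index > 0 else None

As the play position of the video 5051 advances, calling current_remark again yields the next remark to show in the remark display area 5061.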
The embodiment of the application acquires a remark data set; the remark data set comprises at least two remark data for remarking the target video and a data timestamp associated with each remark data; the data timestamp belongs to the target video. It acquires video subject data corresponding to each data timestamp from the target video, classifies the at least two remark data based on the video subject data to obtain at least two remark classification groups, and, in response to a display operation for the remark data set, displays the at least two remark classification groups in separate regions. Through this process, the remark data are acquired and classified, and the classified remark data (namely, the remark classification groups) can be displayed region by region, so that the remark data are classified and displayed based on different video subject data, improving the order and tidiness of the remark data display.
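The classification step can be pictured with a minimal Python sketch. The topic_of_timestamp callback is an assumption standing in for the topic-extraction step described in later embodiments; everything else is ordinary grouping.

    from collections import defaultdict

    def classify_remarks(remark_set, topic_of_timestamp):
        # remark_set: iterable of (remark_data, data_timestamp) pairs.
        # topic_of_timestamp: function mapping a data timestamp to the video
        # topic data recovered from the target video, e.g. "knowledge point 1".
        groups = defaultdict(list)
        for remark_data, data_timestamp in remark_set:
            groups[topic_of_timestamp(data_timestamp)].append((remark_data, data_timestamp))
        return dict(groups)  # one remark classification group per topic

Each value in the returned mapping is one remark classification group, ready to be rendered in its own classification display area.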
Further, please refer to fig. 6, and fig. 6 is a schematic diagram illustrating a process of obtaining remark data according to an embodiment of the present application. As shown in fig. 6, the process of acquiring the remark data includes the following steps:
Step S601, responding to the trigger operation on the area switching control in the interaction area, and displaying the remark area.
In this embodiment of the application, if the computer device has already displayed the remark area, step S602 is executed; if the computer device displays the interaction area, the end user triggers the area switching control in the interaction area, and the computer device displays the remark area in response to the trigger operation on that control. For example, refer to fig. 7a, which is a schematic diagram of switching the area display provided in an embodiment of the present application. As shown in fig. 7a, the computer device displays a video display area b1, a video upload user display area c1 and an interaction area d1, and displays a remark area d2 in response to a trigger operation on an area switching control 701 in the interaction area d1; the area switching control 701 in the interaction area d1 may display "remark mode" to indicate that the control is used for displaying the remark area d2. Optionally, the remark area d2 may include a data category control 7031 and may further include a screenshot control 7032, where the data category control 7031 may include a text acquisition control (i.e., a "text input" control), a voice acquisition control (i.e., a "voice input" control), a handwriting acquisition control (i.e., a "handwriting input" control), and the like, which improves the flexibility of the remark input manner. The computer device displays an area switching control 702 in the remark area d2; the area switching control 702 may display "interaction mode" to indicate that the control is used for displaying the interaction area d1.
Fig. 7b shows another manner of switching the area display. As shown in fig. 7b, an "interaction mode" area switching control and a "remark mode" area switching control are displayed in the interaction area d1, where the "interaction mode" control is in the selected state. When the computer device responds to a trigger operation on the "remark mode" area switching control, the remark area d2 is displayed, and the "remark mode" control is in the selected state in the remark area d2. Figs. 7a and 7b differ only in how the area switching controls of the interaction area d1 and the remark area d2 are displayed; the other functions are the same.
Step S602, obtaining remark data in response to the trigger operation for the remark area.
In the embodiment of the present application, taking fig. 7a as an example, the computer device may obtain, based on the remark area d2, the remark data input by the end user, where the remark data may include data in text, voice or picture format; in other words, the remark area d2 corresponds to an input box, and the end user may input remark data in the remark area d2. Optionally, when the remark area d2 includes the data category control 7031 and the screenshot control 7032, the end user may trigger the corresponding control in the data category control 7031 according to the data type that needs to be input, and the computer device may obtain the remark data in response to the trigger operation on that control.
Specifically, the computer device responds to a trigger operation on a data category control in the remark area and displays the remark data input area corresponding to that data category control. In response to a trigger operation on the remark data input area, it acquires the remark data input in the remark data input area, where the data type of the remark data is the data type corresponding to the data category control; for example, the text acquisition control corresponds to the text data type, the voice acquisition control corresponds to the voice data type, and the handwriting acquisition control corresponds to the handwriting data type. The computer device may display the remark data input area inside the remark area; in this case its display does not occlude the playing of the target video, which improves the display effect of the video and the display independence of the remark data and the video, and thus the user experience. Alternatively, the remark data input area may be displayed directly and independently of the remark area, in which case its size and the like can be changed, making its display more flexible.
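One way to tie each data category control to the data type of the resulting remark is a simple dispatch table, sketched below; the control labels and record layout are assumptions made for the example, not part of the embodiment.

    def capture_remark(control_type, raw_input):
        # control_type: "text", "voice" or "handwriting", matching the text,
        # voice and handwriting acquisition controls described above.
        # raw_input: whatever the corresponding remark data input area produced.
        handlers = {
            "text": lambda data: {"type": "text", "text": data},
            "voice": lambda data: {"type": "voice", "audio": data},
            "handwriting": lambda data: {"type": "handwriting", "strokes": data},
        }
        if control_type not in handlers:
            raise ValueError("unknown data category control: " + control_type)
        return handlers[control_type](raw_input)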
The data category control includes a text acquisition control, and the remark data input area corresponding to the text acquisition control is displayed in response to a trigger operation on the text acquisition control, as shown in fig. 8a, which is a schematic diagram of the remark data input area corresponding to text provided in an embodiment of the present application. The computer device responds to the trigger operation on the text acquisition control and displays a remark data input area 7031; the end user can input remark text information in the remark data input area 7031, and the computer device acquires the remark text information input there and uses it as remark data.
The data category control includes a voice acquisition control, and the remark data input area corresponding to the voice acquisition control is displayed in response to a trigger operation on the voice acquisition control, as shown in fig. 8b, which is a schematic diagram of the remark data input area corresponding to voice provided in an embodiment of the present application. The computer device responds to the trigger operation on the voice acquisition control, displays a remark data input area 7032, and, in response to a trigger operation on the remark data input area 7032, acquires the remark voice information input in the remark data input area 7032. Semantic recognition is then performed on the collected remark voice information to obtain the corresponding text information, and the remark voice information and the text information are used together as remark data.
The data category control includes a handwriting acquisition control, and the remark data input area corresponding to the handwriting acquisition control is displayed in response to a trigger operation on the handwriting acquisition control, as shown in fig. 8c, which is a schematic diagram of the remark data input area corresponding to handwriting provided in an embodiment of the present application. The computer device responds to the trigger operation on the handwriting acquisition control, displays a remark data input area 7033, and, in response to a trigger operation on the remark data input area 7033, acquires the handwritten remark information input in the remark data input area 7033. The handwritten remark information is then converted into handwritten text information, which is used as remark data.
Figs. 8a to 8c take the independently displayed remark data input area as an example.
As shown in fig. 7a, in response to a trigger operation on the screenshot control 7032 in the remark area d2, the computer device determines the video display area b1 where the target video is located, intercepts a first video image displayed in the video display area b1, and uses the first video image as remark data.
Step S603, obtaining a first time point corresponding to the trigger operation of the remark region in the target video, taking the first time point as a data timestamp of the remark data, and adding the remark data and the data timestamp of the remark data to the remark data set.
In this embodiment of the application, the computer device obtains the first time point in the target video corresponding to the trigger operation on the remark area. For example, when the target video is played to 5 minutes 30 seconds, the end user triggers the remark area, that is, inputs remark data in the remark area; after obtaining the remark data, the computer device takes "5 minutes 30 seconds" as the data timestamp of the remark data and adds the remark data and its data timestamp to the remark data set.
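A minimal sketch of step S603, assuming the current play position of the target video is already available in seconds (330.0 corresponds to the "5 minutes 30 seconds" example):

    def add_remark(remark_set, remark_data, play_position_seconds):
        # play_position_seconds: the first time point at which the remark
        # area was triggered; it becomes the data timestamp of the remark.
        data_timestamp = play_position_seconds
        remark_set.append((remark_data, data_timestamp))
        return data_timestamp

    remarks = []
    add_remark(remarks, "definition of an acute angle", 330.0)  # illustrative call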
Further, as shown in fig. 7a, the computer device responds to a trigger operation on the area switching control 702 in the remark area d2 and displays the interaction area d1, where the interaction area d1 is used for displaying the session messages sent by the participating users of the target video, such as the session messages of participating user 1 and participating user 2 in fig. 7a. In response to a trigger operation on the interaction area d1, the computer device acquires the session message of the end user, and sends and displays it; the end user is the user corresponding to the terminal (i.e., the computer device in the embodiment of the present application) that displays the remark area, and may be referred to as the local user.
Further, the computer device displays a remark prompting message, determines the video display area based on the remark prompting message, and intercepts a third video image displayed in the video display area. It displays the remark area based on the remark prompting message, and acquires note data based on the remark area. Remark data are generated according to the third video image and the note data; a second time point corresponding to the remark prompting message in the target video is acquired and taken as the data timestamp of the remark data, and the remark data and the data timestamp of the remark data are added to the remark data set. For example, in the field of education, the target video is uploaded by a teacher side, and the computer device in the embodiment of the present application belongs to a student side. When the teacher-side device sends a remark prompting message, it indicates that the data corresponding to the remark prompting message in the target video (i.e., the data of the target video at the second time point) is important; the computer device displays the remark prompting message and acquires the remark data based on it, so that the remark data can be stored in time, improving classroom efficiency. Alternatively, the computer device acquires the remark data only when it receives a confirmation operation for the remark prompting message, to reduce unnecessary memory occupation: for example, when the end user has already mastered knowledge point A and the teacher-side device prompts that knowledge point A should be remarked, the computer device does not acquire the remark data associated with knowledge point A unless it obtains a confirmation operation for the remark prompting message, so the end user's learning is not affected and memory space occupation is reduced.
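The confirmation-gated behaviour can be sketched as follows; the names are illustrative, and the intercepted frame and the note data are assumed to have been gathered already by the steps described above.

    def on_remark_prompt(remark_set, second_time_point, third_video_image, note_data, confirmed=True):
        # second_time_point: the point in the target video that the remark
        # prompting message refers to; it becomes the data timestamp.
        if not confirmed:
            return None  # no confirmation operation: skip, reducing memory occupation
        remark_data = {"image": third_video_image, "note": note_data}
        remark_set.append((remark_data, second_time_point))
        return remark_data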
Optionally, the end user may further adjust the at least two remark classification groups, so that the remark data classification better conforms to the end user's reading habit, improving the storage flexibility of the remark data. Optionally, when the computer device 1 uploads the target video, it may divide the target video directly, that is, add key time points and the video topic data corresponding to each key time point to the target video. The video topic data of the video clip between two adjacent key time points is the video topic data of the earlier of the two key time points. For example, if the video topic data of the key time point "5 minutes 30 seconds" is knowledge point 1, the video topic data of the key time point "12 minutes" is knowledge point 2, and the two key time points are adjacent, then the video topic data of the video clip from 5 minutes 30 seconds to 12 minutes in the target video is knowledge point 1; in this way the target video is divided into a plurality of video clips. When the computer device 2 executes step S302, it may directly obtain the video clip in which each data timestamp falls and determine the video topic data corresponding to that clip as the video topic data of the data timestamp; if data timestamp 1 falls in the clip from 5 minutes 30 seconds to 12 minutes, the video topic data of data timestamp 1 is knowledge point 1.
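Under this key-time-point scheme, looking up the video topic data for a data timestamp reduces to a boundary search over the sorted key time points. The sketch below hard-codes the example's two knowledge points and is illustrative only.

    from bisect import bisect_right

    # Key time points (in seconds) and their video topic data, as in the example:
    # "5 minutes 30 seconds" -> knowledge point 1, "12 minutes" -> knowledge point 2.
    KEY_POINTS = [(330.0, "knowledge point 1"), (720.0, "knowledge point 2")]

    def topic_for_timestamp(data_timestamp, key_points=KEY_POINTS):
        # A video clip runs from one key time point to the next and carries
        # the topic of the earlier key time point, so a timestamp of 400.0
        # seconds (inside the 5:30-12:00 clip) maps to knowledge point 1.
        starts = [start for start, _ in key_points]
        index = bisect_right(starts, data_timestamp)
        if index == 0:
            raise ValueError("timestamp precedes the first key time point")
        return key_points[index - 1][1]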
Further, please refer to fig. 9, which is a schematic diagram of a video-based data processing apparatus according to an embodiment of the present application. The video-based data processing apparatus may be a computer program (comprising program code) running on a computer device; for example, the video-based data processing apparatus is application software. The apparatus may be used to perform the corresponding steps in the methods provided by the embodiments of the present application. As shown in fig. 9, the video-based data processing apparatus 900 may be used in the computer device in the embodiment corresponding to fig. 3 or fig. 6, and specifically, the apparatus may include: a remark acquiring module 11, a remark classifying module 12 and a remark displaying module 13.
A remark acquiring module 11, configured to acquire a remark data set; the remark data set comprises at least two remark data for remarking the target video and a data timestamp associated with each remark data; the data timestamp belongs to the target video;
the remark classification module 12 is configured to obtain video topic data corresponding to each data timestamp from a target video, and classify at least two remark data based on the video topic data to obtain at least two remark classification groups;
and the remark display module 13 is configured to respond to a display operation for the remark data set and perform regional display on at least two remark classification groups.
Wherein, the apparatus 900 further comprises:
the region switching module 14 is configured to respond to a trigger operation for a region switching control in the interaction region, and display a remark region;
the remark acquisition module 15 is used for responding to the trigger operation aiming at the remark area and acquiring remark data;
and the remark storage module 16 is configured to acquire a first time point corresponding to a trigger operation of the remark area in the target video, use the first time point as a data timestamp of the remark data, and add the remark data and the data timestamp of the remark data to the remark data set.
Wherein the remark area comprises a data category control;
the remark acquisition module 15 includes:
an input area display unit 151, configured to respond to a trigger operation for a data category control in a remark area, and display a remark data input area corresponding to the data category control;
an input data acquisition unit 152 for acquiring memo data input in the memo data input area in response to a trigger operation for the memo data input area; and the data type of the remark data is the data type corresponding to the data category control.
Wherein the data category control comprises a voice acquisition control;
the input data acquisition unit 152 includes:
a voice acquiring subunit 1521, configured to respond to a trigger operation for the memo data input area, and acquire memo voice information input in the memo data input area;
a remark determining subunit 1522, configured to perform semantic recognition on the collected remark voice information, obtain text information corresponding to the remark voice information, and use the remark voice information and the text information as remark data.
The remark area comprises a screenshot control; the remark acquisition module 15 includes:
the image capturing unit 153 is configured to determine, in response to a trigger operation for the screenshot control in the remark area, a video display area where the target video is located, capture a first video image displayed in the video display area, and use the first video image as remark data.
Wherein, the apparatus 900 further comprises:
the interactive display module 17 is configured to respond to a trigger operation for a region switching control in the remark region and display an interactive region; the interactive area is used for displaying conversation messages sent by participating users of the target video;
a user session module 18, configured to respond to a trigger operation for the interaction area, acquire a session message of the terminal user, and send and display the session message of the terminal user; the terminal user is a user corresponding to the terminal for displaying the remark area.
Wherein, the remark classification module 12 includes:
the theme obtaining unit 121 is configured to obtain, from the target video, the video segment corresponding to the ith data timestamp, and identify that video segment to obtain the video theme data corresponding to the ith data timestamp, until the video theme data corresponding to each data timestamp is obtained; i is a positive integer, and i is less than or equal to the number of the at least two remark data;
the remark dividing unit 122 is configured to divide remark data corresponding to the data timestamps with the same video theme data into one class, so as to obtain at least two remark classification groups.
In terms of obtaining the video segment corresponding to the ith data timestamp from the target video and identifying it to obtain the video theme data corresponding to the ith data timestamp, the theme obtaining unit 121 includes:
a video segment acquiring subunit 1211, configured to determine a segment time range including the ith data timestamp, and acquire a video segment corresponding to the segment time range from the target video;
a target determining subunit 1212, configured to acquire at least two second video images forming the video clip, and determine a target video image corresponding to the ith data timestamp in the at least two second video images; the target video image is the pth second video image in the at least two second video images, p is a positive integer and is less than or equal to the number of the at least two second video images;
an image identification subunit 1213, configured to identify the pth second video image, and if the identification result of the pth second video image is obtained, take the identification result of the pth second video image as video theme data corresponding to the ith data timestamp;
the image identifying subunit 1213 is further configured to, if the identification result of the pth second video image is not obtained, identify the (p-1)th second video image of the at least two second video images, and so on, until the video theme data corresponding to the ith data timestamp is obtained.
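The backward traversal performed by the image identifying subunit 1213 can be sketched as follows; identify stands in for the unspecified image recognition step and is supplied by the caller, so the fragment is an outline rather than the embodiment's recognition logic.

    def identify_topic(second_video_images, p, identify):
        # second_video_images: the frames forming the video segment for the
        # ith data timestamp; p is the 1-based index of the target video image.
        # identify: recognition function returning the topic string, or None
        # when no identification result is obtained for a frame.
        for index in range(p, 0, -1):  # pth, (p-1)th, (p-2)th, ...
            result = identify(second_video_images[index - 1])
            if result is not None:
                return result  # video theme data for the ith data timestamp
        return None  # segment exhausted; adjacent video frames would be tried next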
In terms of obtaining the video segment corresponding to the ith data timestamp from the target video and identifying it to obtain the video theme data corresponding to the ith data timestamp, the theme obtaining unit 121 may alternatively include:
a voice segment acquiring subunit 1214, configured to determine a segment time range including the ith data timestamp, acquire a video segment corresponding to the segment time range from the target video, and acquire video and voice data in the video segment;
and a semantic analysis subunit 1215, configured to perform semantic analysis on the video and voice data to obtain video topic data corresponding to the ith data timestamp.
Wherein, the remark display module 13 includes:
a group obtaining unit 131, configured to obtain at least two remark classification groups in response to a display operation for the remark data set;
a partition display unit 132, configured to acquire the data timestamps respectively corresponding to the k remark data in a remark classification group Cj, sort the k remark data based on the data timestamps, and sequentially display the sorted k remark data in the jth classification display area; k is a positive integer and represents the number of remark data included in the remark classification group Cj; j is a positive integer, and j is less than or equal to the number of the at least two remark classification groups.
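The ordering performed by the partition display unit 132 amounts to a sort by data timestamp; a one-function sketch, with the pair layout assumed for illustration:

    def display_order(remark_group):
        # remark_group: the k (remark_data, data_timestamp) pairs of one remark
        # classification group Cj; sorting by the data timestamp yields the
        # sequence shown in the jth classification display area.
        return sorted(remark_group, key=lambda pair: pair[1])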
Wherein, the apparatus 900 further comprises:
the image acquisition module 19 is configured to display a remark prompt message, determine a video display area based on the remark prompt message, and capture a third video image displayed in the video display area;
the note acquisition module 20 is configured to display a note area based on the note prompt message, and acquire note data based on the note area;
the remark generating module 21 is configured to generate remark data according to the third video image and the note data, acquire a second time point corresponding to the remark prompt message in the target video, and use the second time point as a data timestamp of the remark data;
the remark storage module 16 is further configured to add the remark data and the data timestamp of the remark data to the remark data set.
The embodiment of the application provides a data processing device based on video, which acquires a remark data set; the remark data set comprises at least two remark data for remarking the target video and a data timestamp associated with each remark data; the data timestamp belongs to the target video; acquiring video subject data corresponding to each data timestamp from a target video, and classifying at least two remark data based on the video subject data to obtain at least two remark classification groups; and responding to the display operation aiming at the remark data set, and performing regional display on at least two remark classification groups. Through the process, the remark data are acquired and classified, and the regional display can be performed on the basis of the classified remark data (namely, remark classification groups), so that the remark data can be classified and displayed on the basis of different video subject data, and the display order of the remark data is improved.
Referring to fig. 10, fig. 10 is a schematic structural diagram of a computer device according to an embodiment of the present application. As shown in fig. 10, the computer device in the embodiment of the present application may include: one or more processors 1001, memory 1002, and input-output interface 1003. The processor 1001, the memory 1002, and the input/output interface 1003 are connected by a bus 1004. The memory 1002 is used for storing a computer program including program instructions, and the input/output interface 1003 is used for receiving data and outputting data; the processor 1001 is configured to execute program instructions stored in the memory 1002 to perform the following operations:
acquiring a remark data set; the remark data set comprises at least two remark data for remarking the target video and a data timestamp associated with each remark data; the data timestamp belongs to the target video;
acquiring video subject data corresponding to each data timestamp from a target video, and classifying at least two remark data based on the video subject data to obtain at least two remark classification groups;
and responding to the display operation aiming at the remark data set, and performing regional display on at least two remark classification groups.
In some possible embodiments, the processor 1001 may be a Central Processing Unit (CPU); the processor may also be another general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
The memory 1002 may include both read-only memory and random-access memory, and provides instructions and data to the processor 1001 and the input/output interface 1003. A portion of the memory 1002 may also include non-volatile random access memory. For example, the memory 1002 may also store device type information.
In a specific implementation, the computer device may execute, through its built-in functional modules, the implementations provided in the steps of fig. 3 or fig. 6; for details, refer to the implementations provided in those steps, which are not described herein again.
The embodiment of the present application provides a computer device, including: a processor, an input/output interface and a memory. The processor acquires the computer instructions in the memory and executes the steps of the method shown in fig. 3 or fig. 6 to perform video-based data processing. The embodiment of the application thus realizes: acquiring a remark data set, where the remark data set comprises at least two remark data for remarking the target video and a data timestamp associated with each remark data, the data timestamp belonging to the target video; acquiring video subject data corresponding to each data timestamp from the target video, and classifying the at least two remark data based on the video subject data to obtain at least two remark classification groups; and responding to a display operation for the remark data set by displaying the at least two remark classification groups in regions. Through this process, the remark data are acquired and classified, and the classified remark data (namely, remark classification groups) can be displayed region by region, so that the remark data are classified and displayed based on different video subject data, improving the display order of the remark data.
An embodiment of the present application further provides a computer-readable storage medium storing a computer program, where the computer program includes program instructions which, when executed by a processor, implement the video-based data processing method provided in the steps of fig. 3 or fig. 6; for details, refer to the implementations provided in those steps, which are not described herein again. The description of the beneficial effects of the same method is likewise not repeated. For technical details not disclosed in the embodiments of the computer-readable storage medium of the present application, refer to the description of the method embodiments of the present application. By way of example, the program instructions may be deployed to be executed on one computer device, or on multiple computer devices located at one site, or on multiple computer devices distributed across multiple sites and interconnected by a communication network.
The computer-readable storage medium may be an internal storage unit of the video-based data processing apparatus provided in any of the foregoing embodiments or of the computer device, such as a hard disk or memory of the computer device. The computer-readable storage medium may also be an external storage device of the computer device, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a flash card provided on the computer device. Further, the computer-readable storage medium may include both an internal storage unit and an external storage device of the computer device. The computer-readable storage medium is used for storing the computer program and other programs and data required by the computer device, and may also be used to temporarily store data that has been output or is to be output.
Embodiments of the present application also provide a computer program product or computer program comprising computer instructions stored in a computer-readable storage medium. The processor of a computer device reads the computer instructions from the computer-readable storage medium and executes them, so that the computer device executes the method provided in the various optional manners in fig. 3 or fig. 6, thereby realizing the classified display of remark data, improving the order and tidiness of the remark data display, and at the same time providing a plurality of remark input manners (such as handwriting input, text input, voice input or screenshot), improving the flexibility of the remark input manner.
The terms "first," "second," and the like in the description and in the claims and drawings of the embodiments of the present application are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "comprises" and any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, apparatus, product, or apparatus that comprises a list of steps or elements is not limited to the listed steps or modules, but may alternatively include other steps or modules not listed or inherent to such process, method, apparatus, product, or apparatus.
Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein may be implemented in electronic hardware, computer software, or a combination of both. To clearly illustrate the interchangeability of hardware and software, the components and steps of the examples have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends on the particular application and the design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as a departure from the scope of the present application.
The method and the related apparatus provided by the embodiments of the present application are described with reference to the flowchart and/or the structural diagram of the method provided by the embodiments of the present application, and each flow and/or block of the flowchart and/or the structural diagram of the method, and the combination of the flow and/or block in the flowchart and/or the block diagram can be specifically implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block or blocks of the block diagram. These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block or blocks of the block diagram. These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block or blocks.
The above disclosure is only a preferred embodiment of the present application and is not intended to limit the scope of the present application; all equivalent variations and modifications made in accordance with the present application fall within its scope.

Claims (12)

1. A method for video-based data processing, the method comprising:
acquiring a remark data set; the remark data set comprises at least two remark data for remarking the target video and a data timestamp associated with each remark data; the data timestamp belongs to the target video;
acquiring video subject data corresponding to each data timestamp from the target video, and classifying the at least two remark data based on the video subject data to obtain at least two remark classification groups;
responding to the display operation aiming at the remark data set, and carrying out regional display on the at least two remark classification groups; a remark classification group corresponds to a classification display area; displaying video theme data of a corresponding remark classification group and remark data arranged in time sequence in the corresponding remark classification group in a classification display area;
the obtaining of the video subject data corresponding to the ith data timestamp from the target video includes: determining a segment time range containing the ith data timestamp, and acquiring a video segment corresponding to the segment time range from a target video, wherein i is a positive integer and is less than or equal to the number of the at least two remark data;
acquiring at least two second video images forming the video clip, and determining a target video image corresponding to the ith data timestamp in the at least two second video images; wherein the target video image is a pth second video image of the at least two second video images, p is a positive integer, and p is less than or equal to the number of the at least two second video images;
identifying the p-th second video image, and if the identification result of the p-th second video image is obtained, taking the identification result of the p-th second video image as the video subject data corresponding to the ith data timestamp;
if the identification result of the p-th second video image is not obtained, identifying the (p-1)-th second video image of the at least two second video images, and if the identification result of the (p-1)-th second video image is obtained, taking the identification result of the (p-1)-th second video image as the video theme data corresponding to the ith data timestamp; if the identification result of the (p-1)-th second video image is not obtained, continuing to traverse and identify the (p-2)-th second video image of the at least two second video images, until the video theme data corresponding to the ith data timestamp is obtained;
if the video clip is traversed and the video subject data corresponding to the ith data timestamp is not obtained yet, acquiring a video frame adjacent to the video clip from the target video for identification until the video subject data corresponding to the ith data timestamp is obtained;
or, the obtaining of the video topic data corresponding to the ith data timestamp from the target video includes: and obtaining a theme leading word, and determining video theme data corresponding to the ith data timestamp from the video theme data and the voice theme data of the target video according to the theme leading word.
2. The method of claim 1, wherein the method further comprises:
responding to the triggering operation of the area switching control in the interaction area, and displaying the remark area;
responding to the trigger operation aiming at the remark area, and acquiring remark data;
acquiring a first time point corresponding to the trigger operation aiming at the remark area in the target video, taking the first time point as a data time stamp of the remark data, and adding the remark data and the data time stamp of the remark data into the remark data set.
3. The method of claim 2, wherein the remark area includes a data category control;
the response is to the trigger operation of the remark area, and the obtaining of remark data comprises:
responding to the trigger operation aiming at the data category control in the remark area, and displaying a remark data input area corresponding to the data category control;
responding to a trigger operation aiming at the remark data input area, and acquiring the remark data input in the remark data input area; wherein the data type of the remark data is the data type corresponding to the data category control.
4. The method of claim 3, wherein the data category control comprises a voice capture control;
the acquiring the remark data input in the remark data input area in response to the trigger operation for the remark data input area includes:
responding to a trigger operation aiming at the remark data input area, and acquiring remark voice information input in the remark data input area;
and carrying out semantic recognition on the collected remark voice information to obtain text information corresponding to the remark voice information, and taking the remark voice information and the text information as remark data.
5. The method of claim 2, wherein the remark area includes a screenshot control; and the obtaining of remark data in response to the trigger operation for the remark area comprises:
responding to the triggering operation of a screenshot control in the remark area, determining a video display area where the target video is located, intercepting a first video image displayed in the video display area, and taking the first video image as remark data.
6. The method of claim 2, wherein the method further comprises:
responding to the triggering operation of the area switching control in the remark area, and displaying the interaction area; the interactive area is used for displaying session messages sent by participating users of the target video;
responding to the trigger operation aiming at the interaction area, acquiring the session message of the terminal user, and sending and displaying the session message of the terminal user; and the terminal user is a user corresponding to the terminal for displaying the remark area.
7. The method of claim 1, wherein the displaying the at least two memo taxonomy groups in regions in response to the displaying operation for the memo data set comprises:
responding to a display operation aiming at the remark data set, and acquiring the at least two remark classification groups;
acquiring the data timestamps respectively corresponding to the k remark data in a remark classification group Cj, sorting the k remark data based on the data timestamps, and sequentially displaying the sorted k remark data in the jth classification display area; k is a positive integer and represents the number of remark data included in the remark classification group Cj; j is a positive integer, and j is less than or equal to the number of the at least two remark classification groups.
8. The method of claim 1, wherein the method further comprises:
displaying a remark prompting message, determining a video display area based on the remark prompting message, and intercepting a third video image displayed in the video display area;
displaying a remark area based on the remark prompt message, and acquiring note data based on the remark area;
generating remark data according to the third video image and the note data, acquiring a second time point corresponding to the remark prompt message in the target video, and taking the second time point as a data timestamp of the remark data;
adding the remark data and a data timestamp of the remark data to the remark data set.
9. A video-based data processing apparatus, characterized in that the apparatus comprises:
the remark acquisition module is used for acquiring a remark data set; the remark data set comprises at least two remark data for remarking the target video and a data timestamp associated with each remark data; the data timestamp belongs to the target video;
the remark classification module is used for acquiring video subject data corresponding to each data timestamp from the target video, and classifying the at least two remark data based on the video subject data to obtain at least two remark classification groups;
the remark display module is used for responding to the display operation aiming at the remark data set and carrying out regional display on the at least two remark classification groups; a remark classification group corresponds to a classification display area; displaying video theme data of a corresponding remark classification group and remark data arranged in time sequence in the corresponding remark classification group in a classification display area;
the obtaining of the video subject data corresponding to the ith data timestamp from the target video includes: determining a segment time range containing the ith data timestamp, and acquiring a video segment corresponding to the segment time range from a target video, wherein i is a positive integer and is less than or equal to the number of the at least two remark data;
acquiring at least two second video images forming the video clip, and determining a target video image corresponding to the ith data timestamp in the at least two second video images; wherein the target video image is a pth second video image of the at least two second video images, p is a positive integer, and p is less than or equal to the number of the at least two second video images;
identifying the p-th second video image, and if the identification result of the p-th second video image is obtained, taking the identification result of the p-th second video image as the video subject data corresponding to the ith data timestamp;
if the identification result of the p-th second video image is not obtained, identifying the (p-1)-th second video image of the at least two second video images, and if the identification result of the (p-1)-th second video image is obtained, taking the identification result of the (p-1)-th second video image as the video theme data corresponding to the ith data timestamp; if the identification result of the (p-1)-th second video image is not obtained, continuing to traverse and identify the (p-2)-th second video image of the at least two second video images, until the video theme data corresponding to the ith data timestamp is obtained;
if the video clip is traversed and the video subject data corresponding to the ith data timestamp is not obtained yet, acquiring a video frame adjacent to the video clip from the target video for identification until the video subject data corresponding to the ith data timestamp is obtained;
or, the obtaining of the video topic data corresponding to the ith data timestamp from the target video includes: and obtaining a theme leading word, and determining video theme data corresponding to the ith data timestamp from the video theme data and the voice theme data of the target video according to the theme leading word.
10. The apparatus of claim 9, wherein the apparatus further comprises:
the region switching module is used for responding to the triggering operation aiming at the region switching control in the interaction region and displaying the remark region;
the remark acquisition module is used for responding to the trigger operation aiming at the remark area and acquiring remark data;
and the remark storage module is used for acquiring a first time point corresponding to the trigger operation aiming at the remark area in the target video, taking the first time point as a data time stamp of the remark data, and adding the remark data and the data time stamp of the remark data into the remark data set.
11. A computer device comprising a processor, a memory, an input output interface;
the processor is connected to the memory and the input/output interface, respectively, wherein the input/output interface is configured to receive data and output data, the memory is configured to store a computer program, and the processor is configured to call the computer program to perform the method according to any one of claims 1 to 8.
12. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program comprising program instructions which, when executed by a processor, perform the method according to any one of claims 1-8.
CN202010952375.3A 2020-09-11 2020-09-11 Video-based data processing method and device, computer and readable storage medium Active CN112040277B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010952375.3A CN112040277B (en) 2020-09-11 2020-09-11 Video-based data processing method and device, computer and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010952375.3A CN112040277B (en) 2020-09-11 2020-09-11 Video-based data processing method and device, computer and readable storage medium

Publications (2)

Publication Number Publication Date
CN112040277A CN112040277A (en) 2020-12-04
CN112040277B true CN112040277B (en) 2022-03-04

Family

ID=73588576

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010952375.3A Active CN112040277B (en) 2020-09-11 2020-09-11 Video-based data processing method and device, computer and readable storage medium

Country Status (1)

Country Link
CN (1) CN112040277B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114157877B (en) * 2021-10-08 2024-04-16 钉钉(中国)信息技术有限公司 Playback data generation method and device, playback method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110602546A (en) * 2019-09-06 2019-12-20 Oppo广东移动通信有限公司 Video generation method, terminal and computer-readable storage medium
CN111125435A (en) * 2019-12-17 2020-05-08 北京百度网讯科技有限公司 Video tag determination method and device and computer equipment
CN111523566A (en) * 2020-03-31 2020-08-11 易视腾科技股份有限公司 Target video clip positioning method and device
CN111581433A (en) * 2020-05-18 2020-08-25 Oppo广东移动通信有限公司 Video processing method and device, electronic equipment and computer readable medium

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080250468A1 (en) * 2007-04-05 2008-10-09 Sbc Knowledge Ventures. L.P. System and method for scheduling presentation of future video event data
US8051081B2 (en) * 2008-08-15 2011-11-01 At&T Intellectual Property I, L.P. System and method for generating media bookmarks
US20110307403A1 (en) * 2010-06-11 2011-12-15 Arad Rostampour Systems and method for providing monitoring of social networks
US9189707B2 (en) * 2014-02-24 2015-11-17 Invent.ly LLC Classifying and annotating images based on user context
US20180366013A1 (en) * 2014-08-28 2018-12-20 Ideaphora India Private Limited System and method for providing an interactive visual learning environment for creation, presentation, sharing, organizing and analysis of knowledge on subject matter
CN105979267A (en) * 2015-12-03 2016-09-28 乐视致新电子科技(天津)有限公司 Video compression and play method and device
CN106021293A (en) * 2016-05-03 2016-10-12 华中师范大学 Knowledge linkage based study note storage method, storage device and system
CN107291343B (en) * 2017-05-18 2020-01-24 网易(杭州)网络有限公司 Note recording method, device and computer readable storage medium
CN111290688A (en) * 2018-12-06 2020-06-16 中兴通讯股份有限公司 Multimedia note taking method, terminal and computer readable storage medium
CN110163115B (en) * 2019-04-26 2023-10-13 腾讯科技(深圳)有限公司 Video processing method, device and computer readable storage medium
CN110381382B (en) * 2019-07-23 2021-02-09 腾讯科技(深圳)有限公司 Video note generation method and device, storage medium and computer equipment
CN110717470B (en) * 2019-10-16 2023-09-26 山东瑞瀚网络科技有限公司 Scene recognition method and device, computer equipment and storage medium
CN111062194A (en) * 2019-12-18 2020-04-24 腾讯科技(深圳)有限公司 Document processing method and device, computer equipment and readable storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110602546A (en) * 2019-09-06 2019-12-20 Oppo广东移动通信有限公司 Video generation method, terminal and computer-readable storage medium
CN111125435A (en) * 2019-12-17 2020-05-08 北京百度网讯科技有限公司 Video tag determination method and device and computer equipment
CN111523566A (en) * 2020-03-31 2020-08-11 易视腾科技股份有限公司 Target video clip positioning method and device
CN111581433A (en) * 2020-05-18 2020-08-25 Oppo广东移动通信有限公司 Video processing method and device, electronic equipment and computer readable medium

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Major Cast Detection in Video Using Both Speaker and Face Information; Zhu Liu; IEEE Transactions on Multimedia; 2006-12-19; full text *
Design of an Autonomous Learning Model Based on the MOOC Platform; Zeng Lifang; Software Guide; 2020-06-30 (No. 6); full text *
Design and Development of Classroom-Integrated Mobile Learning Resources: Taking the Course "Modern Educational Technology" as an Example; Yang Ye; China Masters' Theses Full-text Database; 2016-06-15; full text *

Also Published As

Publication number Publication date
CN112040277A (en) 2020-12-04

Similar Documents

Publication Publication Date Title
CN111741356B (en) Quality inspection method, device and equipment for double-recording video and readable storage medium
US10592599B2 (en) System, method and computer program product for creating a summarization from recorded audio of meetings
CN110673777A (en) Online teaching method and device, storage medium and terminal equipment
CN102292713A (en) A multimedia collaboration system
CN112101304B (en) Data processing method, device, storage medium and equipment
US10015445B1 (en) Room conferencing system with heat map annotation of documents
US20220360825A1 (en) Livestreaming processing method and apparatus, electronic device, and computer-readable storage medium
CN111629222B (en) Video processing method, device and storage medium
DE102021125184A1 (en) PERSONAL TALK RECOMMENDATIONS USING LISTENER RESPONSES
CN113315979A (en) Data processing method and device, electronic equipment and storage medium
CN109242309A (en) The user that attends a meeting portrait generation method, device, intelligent meeting equipment and storage medium
CN112040277B (en) Video-based data processing method and device, computer and readable storage medium
WO2019033660A1 (en) Method and apparatus for determining associated teaching information, teaching device, and storage medium
CN111327943B (en) Information management method, device, system, computer equipment and storage medium
WO2022089220A1 (en) Image data processing method and apparatus, device, storage medium, and product
CN115374141A (en) Virtual image updating method and device
CN111885139B (en) Content sharing method, device and system, mobile terminal and server
CN113762056A (en) Singing video recognition method, device, equipment and storage medium
CN112542172A (en) Communication auxiliary method, device, equipment and medium based on online conference
CN112487175A (en) Exhibitor flow control method, exhibitor flow control device, server and computer-readable storage medium
CN112040249A (en) Recording and broadcasting method and device and single camera
CN112528790A (en) Teaching management method and device based on behavior recognition and server
Hsu et al. Multimedia fog computing: Minions in the cloud and crowd
Kaiser et al. The interaction ontology: low-level cue processing in real-time group conversations
CN114341866A (en) Simultaneous interpretation method, device, server and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40036249

Country of ref document: HK

GR01 Patent grant