WO2021062990A1 - Video segmentation method, apparatus, device and medium - Google Patents

Video segmentation method, apparatus, device and medium

Info

Publication number
WO2021062990A1
WO2021062990A1 (PCT/CN2020/083473, CN2020083473W)
Authority
WO
WIPO (PCT)
Prior art keywords
video
knowledge point
divided
point data
segmentation
Prior art date
Application number
PCT/CN2020/083473
Other languages
English (en)
French (fr)
Inventor
曾德政
Original Assignee
北京沃东天骏信息技术有限公司
北京京东世纪贸易有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京沃东天骏信息技术有限公司, 北京京东世纪贸易有限公司 filed Critical 北京沃东天骏信息技术有限公司
Priority to US17/763,480 (published as US20220375225A1)
Publication of WO2021062990A1

Links

Images

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/49Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/77Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
    • G06V10/7715Feature extraction, e.g. by transforming the feature space, e.g. multi-dimensional scaling [MDS]; Mappings, e.g. subspace methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00Electrically-operated educational appliances
    • G09B5/06Electrically-operated educational appliances with both visual and audible presentation of the material to be studied
    • G PHYSICS
    • G09 EDUCATION; CRYPTOGRAPHY; DISPLAY; ADVERTISING; SEALS
    • G09B EDUCATIONAL OR DEMONSTRATION APPLIANCES; APPLIANCES FOR TEACHING, OR COMMUNICATING WITH, THE BLIND, DEAF OR MUTE; MODELS; PLANETARIA; GLOBES; MAPS; DIAGRAMS
    • G09B5/00Electrically-operated educational appliances
    • G09B5/06Electrically-operated educational appliances with both visual and audible presentation of the material to be studied
    • G09B5/065Combinations of audio and video presentations, e.g. videotapes, videodiscs, television systems

Definitions

  • This application relates to the field of information processing, for example, to a video segmentation method, apparatus, device, and medium.
  • In the related art, video segmentation is mainly performed according to a specified frame, a specified duration, or blank gaps in the content.
  • Segmenting by a specified frame or duration requires manual operation and requires viewing all of the video content in advance.
  • Segmenting by blank gaps in the content is likely to produce inaccurate segmentation and make the video difficult for viewers to understand.
  • This application provides a video segmentation method, apparatus, device, and medium to simplify the video segmentation process and improve the accuracy of video segmentation.
  • An embodiment of the present application provides a video segmentation method, including:
  • acquiring a video to be divided, and determining a correspondence between knowledge point data in the video to be divided and video frames in the video to be divided; and
  • dividing the video to be divided according to the correspondence to obtain at least one video segment.
  • An embodiment of the present application also provides a video segmentation device, including:
  • a knowledge point recognition module configured to obtain a video to be divided, and determine the correspondence between knowledge point data in the video to be divided and video frames in the video to be divided;
  • the video segmentation module is configured to segment the video to be segmented according to the corresponding relationship to obtain at least one video segment.
  • An embodiment of the present application also provides a computer device, which includes:
  • one or more processors; and
  • a storage device configured to store one or more programs,
  • wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the video segmentation method provided in any embodiment of the present application.
  • the embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored.
  • When the program is executed by a processor, the video segmentation method provided in any embodiment of the present application is implemented.
  • FIG. 1 is a flowchart of a video segmentation method provided in Embodiment 1 of the present application;
  • FIG. 2 is a flowchart of a video segmentation method provided in Embodiment 2 of the present application.
  • FIG. 3a is a flowchart of a video segmentation method provided by Embodiment 3 of the present application.
  • Fig. 3b is a flowchart of a knowledge graph construction provided by the third embodiment of the present application.
  • FIG. 3c is a schematic flowchart of a video segmentation storage provided by Embodiment 3 of the present application.
  • FIG. 4 is a schematic structural diagram of a video segmentation device provided by Embodiment 4 of the present application.
  • FIG. 5 is a schematic structural diagram of a computer device provided in Embodiment 5 of the present application.
  • FIG. 1 is a flowchart of a video segmentation method provided in Embodiment 1 of the present application. This embodiment is applicable to the case of segmenting a video, for example, to the case of segmenting a teaching video.
  • the method can be executed by a video segmentation device, and the video segmentation device can be implemented in a software and/or hardware manner.
  • the video segmentation device can be configured in a computer device. As shown in Figure 1, the method includes:
  • the video to be divided may be an education teaching video or a popular science video.
  • After the video to be divided is obtained, its content is analyzed to extract the subtitle content, the voice content, and/or the text content in the video frames.
  • After sorting the recognized content, the knowledge point data contained in the video to be divided and the video frame corresponding to each piece of knowledge point data can be obtained.
  • Optionally, a natural scene text detection method (Connectionist Text Proposal Network, CTPN) can be used to locate the text regions in the video frames, the Tesseract tool can be used to recognize text in the images, and automatic speech recognition (ASR) can be used to recognize the voice content in the video.
  • In an embodiment, structured data can be formed by performing attribute extraction, relationship extraction, and entity extraction on the recognized video content, so as to determine the knowledge point data contained in the video to be divided and the correspondence between the knowledge point data and the video frames; a minimal sketch of this extraction step is given below.
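  • The following is a minimal, illustrative sketch (not the patent's implementation) of sampling frames and running OCR on them with OpenCV and pytesseract; the CTPN text-region detector and the ASR step are only indicated by comments, since their concrete APIs are not specified in the source.

```python
# Hypothetical sketch of the frame-text extraction step; assumes a local video
# file and that OpenCV and pytesseract (plus the needed language packs) are installed.
import cv2
import pytesseract

def extract_frame_text(video_path, sample_every=30):
    """Sample every Nth frame and OCR its text; returns {frame_index: text}."""
    cap = cv2.VideoCapture(video_path)
    texts, idx = {}, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if idx % sample_every == 0:
            gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
            # In the pipeline described above, a CTPN model would first locate
            # text regions and ASR would transcribe the audio track; here the
            # whole frame is handed to Tesseract for brevity.
            texts[idx] = pytesseract.image_to_string(gray, lang="chi_sim+eng")
        idx += 1
    cap.release()
    return texts
```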
  • For example, assuming that the knowledge point data obtained after content extraction and sorting includes knowledge point A, knowledge point B, and knowledge point C, the correspondence between the knowledge point data and the video frames can be: the video frame range corresponding to knowledge point A is 1-20, the video frame range corresponding to knowledge point B is 21-50, and the video frame range corresponding to knowledge point C is 51-90.
  • the knowledge point data corresponding to the video to be segmented can be directly obtained.
  • the knowledge point data corresponding to different types of videos may be determined in advance, and after the video to be divided is obtained, the knowledge point data corresponding to the video to be divided is obtained according to the type of the video to be divided.
  • the type of the video to be divided may be the subject and/or chapter to which the video to be divided belongs.
  • For example, if the video to be divided is a junior high school mathematics teaching video, the knowledge point data corresponding to the junior high school mathematics subject can be obtained as the knowledge point data corresponding to the video to be divided.
  • S120 Perform segmentation on the to-be-divided video according to the corresponding relationship to obtain at least one video segment.
  • the video frame interval corresponding to each knowledge point data can be regarded as a video segment according to the range of the number of video frames corresponding to the knowledge point data.
  • For example, assuming that the video frame range corresponding to knowledge point A is 1-20, the video frame range corresponding to knowledge point B is 21-50, and the video frame range corresponding to knowledge point C is 51-90, the video segment from video frame 1 to video frame 20 is determined to be video segment 1, the video segment from video frame 21 to video frame 50 is determined to be video segment 2, and the video segment from video frame 51 to video frame 90 is determined to be video segment 3; a sketch of this cutting step follows below.
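  • As an illustration only, the following Python sketch cuts a video into per-knowledge-point segments from a mapping of knowledge points to frame ranges; it assumes OpenCV is available, and the file names and the frame-range dictionary are made-up examples.

```python
# Hypothetical sketch: write one output file per knowledge point, copying the
# frames that fall inside that knowledge point's frame range.
import cv2

def split_by_knowledge_points(video_path, frame_ranges, out_pattern="segment_{}.mp4"):
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    size = (int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)),
            int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)))
    fourcc = cv2.VideoWriter_fourcc(*"mp4v")
    writers = {}
    idx = 1                                  # frames numbered from 1, as in the example above
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        for name, (start, end) in frame_ranges.items():
            if start <= idx <= end:
                if name not in writers:
                    writers[name] = cv2.VideoWriter(out_pattern.format(name), fourcc, fps, size)
                writers[name].write(frame)
        idx += 1
    cap.release()
    for writer in writers.values():
        writer.release()

# e.g. split_by_knowledge_points("lesson.mp4", {"A": (1, 20), "B": (21, 50), "C": (51, 90)})
```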
  • Considering that video segments divided directly according to the correspondence between the knowledge point data and the video frames may not be accurate enough, the correspondence can also be corrected by a boundary detection method to obtain a corrected correspondence between the knowledge point data and the video frames.
  • The video frame interval corresponding to each piece of corrected knowledge point data is then regarded as a video segment.
  • In the embodiment of the present application, the video to be divided is acquired, the correspondence between the knowledge point data in the video to be divided and the video frames in the video to be divided is determined, and the video to be divided is divided according to the correspondence to obtain at least one video segment. By dividing the video to be divided according to the knowledge point data it contains, the segmentation process of the video is simplified and the segmentation accuracy of the video is improved.
  • FIG. 2 is a flowchart of a video segmentation method provided in Embodiment 2 of the present application. This embodiment is described on the basis of the above-mentioned embodiment. As shown in Figure 2, the method includes:
  • S210 Obtain a video to be divided, input the video to be divided into a pre-trained video segmentation model, obtain segmentation data output by the video segmentation model, and determine a video frame number interval corresponding to the knowledge point data according to the segmentation data.
  • the video to be divided is divided by a machine learning algorithm.
  • the video to be segmented is input into the trained video segmentation model to obtain segmentation data output by the video segmentation model.
  • the segmentation data output by the video segmentation model may include the knowledge point data contained in the video to be segmented and the correspondence between the knowledge point data and the video frame.
  • the video frame number interval corresponding to the knowledge point data is determined according to the correspondence between the knowledge point data contained in the segmentation data and the video frame.
  • the video segmentation model is constructed based on a neural network.
  • the neural network may be a Recurrent Neural Network (RNN) or other forms of neural network models.
  • On the basis of the above solution, the method further includes: acquiring a sample video to be divided and segmentation data corresponding to the sample video to be divided; generating training sample pairs based on the sample video to be divided and its corresponding segmentation data; and using the training sample pairs to train a pre-built video segmentation model to obtain the trained video segmentation model.
  • Optionally, already-segmented videos can be obtained as sample videos to be divided; each sample video to be divided and its corresponding segmentation data are used as a training sample pair, and multiple training sample pairs are used to train the pre-built video segmentation model to obtain the trained video segmentation model.
  • the video can be segmented by manual segmentation, and the segmented video is used as a sample video to be segmented.
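  • A minimal, hypothetical sketch of this training setup is shown below: per-frame feature vectors go in, per-frame knowledge-point labels come out of a small GRU, and one optimisation step is run on a (features, labels) training pair. The feature dimension, label count, and the feature extraction itself are assumptions, not the patent's model.

```python
# Toy RNN-based frame labeler standing in for the "video segmentation model".
import torch
from torch import nn

class FrameLabeler(nn.Module):
    def __init__(self, feat_dim=512, hidden=128, num_knowledge_points=10):
        super().__init__()
        self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_knowledge_points)

    def forward(self, frame_feats):          # (batch, num_frames, feat_dim)
        out, _ = self.rnn(frame_feats)
        return self.head(out)                # per-frame knowledge-point logits

def train_step(model, optimizer, feats, labels):
    """One step on a (sample video features, per-frame labels) training pair."""
    logits = model(feats)                    # (batch, num_frames, classes)
    loss = nn.functional.cross_entropy(logits.transpose(1, 2), labels)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```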
  • S220 Determine a fuzzy boundary point of the knowledge point data according to the video frame number interval corresponding to the knowledge point data.
  • the correspondence between the knowledge point data and the video frame is obtained, the correspondence between the knowledge point data and the video frame is corrected, and the video to be divided is segmented according to the corrected correspondence.
  • the boundary point of the video frame number interval corresponding to the knowledge point data may be used as the fuzzy boundary point of the knowledge point data. Exemplarily, assuming that the video frame number interval corresponding to the knowledge point A is 21-50, the fuzzy boundary point of the knowledge point A is the video frame 21 and the video frame 50.
  • S230 Based on the fuzzy boundary point, obtain candidate video frames within a set range, perform boundary detection on the candidate video frame, and obtain the target boundary point corresponding to the knowledge point data.
  • a range may be preset to obtain candidate video frames corresponding to the knowledge point data.
  • the range can be set to (a-5, a+5), where a is the number of video frames corresponding to the fuzzy boundary point of the knowledge point data.
  • the candidate video frame is obtained according to the set range, and the candidate video frame is boundary detected, and the number of video frames corresponding to the detected boundary is used as the target boundary point of the knowledge point data.
  • The boundary detection on the candidate video frames may be performed by extracting video segment features from the candidate video frames and applying a segmentation algorithm that jointly handles abrupt (cut) shots and gradual-transition shots.
  • For example, if the set range is (a-5, a+5) and a fuzzy boundary point of knowledge point A is video frame 21, then video frame 16 to video frame 26 are obtained as the candidate video frames.
  • the video segment composed of video frame 16-video frame 26 is subjected to boundary detection. Assuming that the detected boundary is video frame 22, video frame 22 is taken as a target boundary point of knowledge point A.
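  • The sketch below illustrates this correction step under simplifying assumptions: around a fuzzy boundary point a, frames in (a-5, a+5) are compared and the frame with the largest colour-histogram change is taken as the target boundary. The histogram difference merely stands in for the joint abrupt/gradual shot detection mentioned above, and frame indices are assumed to be zero-based here.

```python
# Hypothetical boundary refinement around a fuzzy boundary point.
import cv2
import numpy as np

def refine_boundary(video_path, fuzzy_point, radius=5):
    cap = cv2.VideoCapture(video_path)
    start = max(fuzzy_point - radius, 0)
    cap.set(cv2.CAP_PROP_POS_FRAMES, start)
    hists, frame_ids = [], []
    for i in range(start, fuzzy_point + radius + 1):
        ok, frame = cap.read()
        if not ok:
            break
        hist = cv2.calcHist([frame], [0, 1, 2], None, [8, 8, 8], [0, 256] * 3)
        hists.append(cv2.normalize(hist, hist).flatten())
        frame_ids.append(i)
    cap.release()
    diffs = [np.linalg.norm(hists[k] - hists[k - 1]) for k in range(1, len(hists))]
    # The frame just after the largest jump is treated as the target boundary point.
    return frame_ids[int(np.argmax(diffs)) + 1] if diffs else fuzzy_point
```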
  • S240 Determine a video segment corresponding to the knowledge point data according to the target boundary point corresponding to the knowledge point data.
  • the video segment formed by the target boundary point of the knowledge point data is taken as the video segment corresponding to the knowledge point data.
  • For example, if the target boundary points corresponding to knowledge point A are video frame 22 and video frame 49, the video segment corresponding to knowledge point A is the video segment between video frame 22 and video frame 49.
  • determining the correspondence between the knowledge point data in the video to be divided and the video frame in the video to be divided includes: determining the correspondence between the knowledge point data and the video frame through a video segmentation model, and The boundary detection method is used to determine the target boundary point corresponding to the knowledge point data.
  • the corresponding relationship between the knowledge point data and the video frame is made more accurate, so that the segmentation result of the video to be segmented is more accurate.
  • FIG. 3a is a flowchart of a video segmentation method provided in Embodiment 3 of the present application. This embodiment is described on the basis of the above-mentioned embodiment. As shown in Figure 3a, the method includes:
  • S320 Perform segmentation on the to-be-divided video according to the corresponding relationship to obtain at least one video segment.
  • the association relationship between the segmented video segments is determined according to the association relationship between the knowledge point data
  • the learning path is determined according to the association relationship between the video segments.
  • the learning path is used to characterize the learning sequence between video clips.
  • The association relationship between knowledge point data can be a directed learning relationship between knowledge points. For example, if learning knowledge point B requires first learning knowledge point A, the association relationship between the knowledge points can be knowledge point A → knowledge point B. According to the directed learning relationships between the knowledge point data and the video segments corresponding to the knowledge point data, the directed learning relationships between the video segments can be determined, and these directed learning relationships between video segments constitute learning paths.
  • After the learning paths are determined, a label for each learning path can also be determined according to its characteristics. For example, if there are multiple learning paths between knowledge point A and knowledge point F, the learning paths can be labeled according to the number of knowledge points to be learned, such as "shortest learning path" or "most complete learning path".
  • the shortest learning path may be the learning path with the least knowledge point data required to learn the target knowledge point
  • the most complete learning path may be the learning path with the most complete knowledge point data required to learn the target knowledge point.
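  • As a hedged illustration of how such labels could be derived, the sketch below models the prerequisite relations as a directed graph and picks the shortest and the longest simple path between two knowledge points; the relations themselves are invented examples, and the networkx library is assumed to be available.

```python
import networkx as nx

# Made-up prerequisite edges: "A -> B" means A must be learned before B.
prereqs = [("A", "B"), ("B", "C"), ("A", "D"), ("D", "E"), ("E", "C")]
graph = nx.DiGraph(prereqs)

def learning_paths(graph, start, target):
    paths = list(nx.all_simple_paths(graph, start, target))
    if not paths:
        return {}
    return {
        "shortest learning path": min(paths, key=len),
        "most complete learning path": max(paths, key=len),
    }

print(learning_paths(graph, "A", "C"))
# {'shortest learning path': ['A', 'B', 'C'],
#  'most complete learning path': ['A', 'D', 'E', 'C']}
```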
  • the method for acquiring the association relationship between the knowledge point data may be: acquiring a knowledge graph, and determining the association relationship between the knowledge point data according to the knowledge graph.
  • the knowledge graph contains knowledge point data and the association relationship between the knowledge point data.
  • the association relationship between the knowledge points can be obtained through the knowledge graph.
  • the knowledge graph can be established by extracting the knowledge relationship from the video to be segmented, or the pre-built knowledge graph can be directly obtained.
  • the knowledge graph can be preset through teaching materials, related books, or web crawlers.
  • the preset knowledge graph will serve as an important basis for subsequent video segmentation.
  • the preset knowledge graph can be continuously improved artificially or through programs, so as to improve the accuracy of segmenting the video.
  • The knowledge graph can be obtained by analyzing the video to be divided, including: extracting the knowledge point data contained in the video to be divided; and performing relationship extraction on the knowledge point data to construct a knowledge graph containing the association relationships between the knowledge point data.
  • FIG. 3b is a flowchart of knowledge graph construction provided in Embodiment 3 of the present application. As shown in FIG. 3b, the subtitle content, text recognition content, and voice content in the video to be divided are sorted; structured data is formed through attribute extraction, relationship extraction, and entity extraction; a preliminary knowledge graph is then formed through entity alignment, entity disambiguation, and knowledge fusion; and finally the knowledge graph corresponding to the video to be divided is determined through quality evaluation. A toy sketch of the triple-assembly part of this flow is given below.
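  • The following toy sketch only covers the assembly step of the flow above: extracted (head, relation, tail) triples are normalised and collected into an adjacency structure. Real entity alignment, disambiguation, knowledge fusion, and quality evaluation are far richer than this, and the triples shown are invented examples.

```python
from collections import defaultdict

def normalise(entity):
    # Trivial stand-in for entity alignment: case and whitespace normalisation.
    return " ".join(entity.lower().split())

def build_knowledge_graph(triples):
    graph = defaultdict(list)
    for head, relation, tail in triples:
        graph[normalise(head)].append((relation, normalise(tail)))
    return dict(graph)

triples = [
    ("Knowledge point A", "prerequisite_of", "knowledge point B"),
    ("knowledge point B", "prerequisite_of", "Knowledge point C"),
]
print(build_knowledge_graph(triples))
# {'knowledge point a': [('prerequisite_of', 'knowledge point b')],
#  'knowledge point b': [('prerequisite_of', 'knowledge point c')]}
```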
  • S340 In response to the detected video viewing instruction, determine a learning path corresponding to the video viewing instruction.
  • a learning path that meets the user's needs can be recommended for the user according to the determined learning path and the user's learning needs.
  • the video viewing instruction may be an instruction triggered by the user through the terminal and used to instruct to view the video.
  • For example, if the user needs to learn target knowledge point C, a video viewing request for learning target knowledge point C can be triggered according to a prompt in the terminal interface.
  • After the terminal detects the video viewing request triggered by the user, it generates a video viewing instruction according to the video viewing request.
  • The video viewing instruction is sent to the video segmentation apparatus; the video segmentation apparatus parses the received video viewing instruction, obtains the target knowledge point C contained in the video viewing instruction, and obtains the learning path corresponding to the target knowledge point C as the learning path corresponding to the video viewing instruction.
  • the video viewing instruction may also include the user's learning needs
  • the video segmentation device may select a learning path that meets the user's needs from the learning path corresponding to the target knowledge point C according to the identification of the learning path.
  • For example, if the user's learning requirement is "shortest learning path", the shortest path is selected from the learning paths corresponding to the target knowledge point C as the learning path corresponding to the video viewing instruction.
  • FIG. 3c is a schematic flowchart of a video segmentation storage provided by Embodiment 3 of the present application.
  • As shown in FIG. 3c, in this embodiment multiple videos to be divided can be obtained at the same time and divided simultaneously through the preset knowledge graph; after the multiple videos are divided, the resulting video segments are classified according to segment type (such as discipline), segments of the same type are stored in the same video segment collection, and collections of similar types can be marked to facilitate subsequent recommendation of video segments; a small grouping sketch follows below.
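  • A minimal sketch of this grouping step, under the assumption that each segment record carries a subject field (the field names and records below are hypothetical):

```python
from collections import defaultdict

def group_segments(segments):
    """Group segment records by subject so one discipline maps to one collection."""
    collections = defaultdict(list)
    for seg in segments:
        collections[seg["subject"]].append(seg["id"])
    return dict(collections)

print(group_segments([
    {"id": "video1-A", "subject": "math"},
    {"id": "video2-B", "subject": "math"},
    {"id": "video3-C", "subject": "physics"},
]))
# {'math': ['video1-A', 'video2-B'], 'physics': ['video3-C']}
```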
  • S350 Generate path recommendation information according to the learning path, and send the path recommendation information to the client for display.
  • the path recommendation information is generated according to the knowledge point data contained in the learning path, and the path recommendation information is sent to the client for display.
  • the path recommendation information may be a directed learning relationship between knowledge point data, or a directed learning relationship between video clips.
  • On the basis of the above solution, the technical solution of the embodiment of the present application adds the operation of determining learning paths according to the association relationships between knowledge points.
  • By determining the learning path corresponding to a video viewing instruction, generating path recommendation information according to the learning path, and sending the path recommendation information to the client for display, a learning path that meets the user's needs can be recommended to the user, improving the user's learning effect.
  • FIG. 4 is a schematic structural diagram of a video segmentation device provided by Embodiment 4 of the present application.
  • the video splitting device can be implemented in software and/or hardware.
  • the video splitting device can be configured in a computer device.
  • the device includes a knowledge point recognition module 410 and a video segmentation module 420, where:
  • the knowledge point recognition module 410 is configured to acquire the video to be divided and determine the correspondence between the knowledge point data in the video to be divided and the video frames in the video to be divided; the video segmentation module 420 is configured to divide the video to be divided according to the correspondence to obtain at least one video segment.
  • In the embodiment of the present application, the knowledge point recognition module acquires the video to be divided and determines the correspondence between the knowledge point data in the video to be divided and the video frames in the video to be divided;
  • the video segmentation module divides the video to be divided according to the correspondence to obtain at least one video segment, so that the segmentation process of the video is simplified and the segmentation accuracy is improved.
  • the knowledge point identification module 410 is set to:
  • the video to be divided is input into a pre-trained video segmentation model, segmentation data output by the video segmentation model is obtained, and a video frame number interval corresponding to the knowledge point data is determined according to the segmentation data.
  • the video segmentation module 420 is configured to:
  • determine a fuzzy boundary point of the knowledge point data according to the video frame number interval corresponding to the knowledge point data; based on the fuzzy boundary point, obtain candidate video frames within a set range, perform boundary detection on the candidate video frames, and obtain a target boundary point corresponding to the knowledge point data; and determine a video segment corresponding to the knowledge point data according to the target boundary point.
  • the device further includes:
  • the learning path determination module is configured to, after the at least one video segment is obtained and in a case where the at least one video segment is multiple video segments, determine association relationships between the multiple video segments according to association relationships between the knowledge point data, and
  • determine at least one learning path according to the association relationships between the multiple video segments, where the learning path is used to characterize the learning order between video segments.
  • the device further includes:
  • the association relationship determination module is configured to obtain a knowledge graph before determining the association relationship between the multiple video clips according to the association relationship between the knowledge point data, and determine the association relationship between the knowledge point data according to the knowledge graph.
  • the association relationship determination module is configured to:
  • extract the knowledge point data contained in the video to be divided; and perform relationship extraction on the knowledge point data to construct a knowledge graph containing the association relationships between the knowledge point data.
  • the device further includes a path recommendation module, which is set to:
  • in response to a detected video viewing instruction, determine a learning path corresponding to the video viewing instruction; generate path recommendation information according to the learning path, and send the path recommendation information to the client for display.
  • the video segmentation device provided in the embodiments of the present application can execute the video segmentation method provided in any embodiment, and has functional modules and effects corresponding to the execution method.
  • FIG. 5 is a schematic structural diagram of a computer device provided in Embodiment 5 of the present application.
  • Figure 5 shows a block diagram of an exemplary computer device 512 suitable for implementing embodiments of the present application.
  • the computer device 512 shown in FIG. 5 is only an example, and should not bring any limitation to the function and scope of use of the embodiments of the present application.
  • the computer device 512 is represented in the form of a general-purpose computing device.
  • the components of the computer device 512 may include, but are not limited to: one or more processors 516, a system memory 528, and a bus 518 connecting different system components (including the system memory 528 and the processor 516).
  • the bus 518 represents one or more of several types of bus structures, including a memory bus or a memory controller, a peripheral bus, a graphics acceleration port, a processor 516, or a local bus using any bus structure among multiple bus structures.
  • These architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the MicroChannel Architecture (MAC) bus, the enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
  • Computer device 512 typically includes a variety of computer system readable media. These media can be any available media that can be accessed by the computer device 512, including volatile and nonvolatile media, removable and non-removable media.
  • the system memory 528 may include a computer system readable medium in the form of a volatile memory, such as a random access memory (RAM) 530 and/or a cache memory 532.
  • the computer device 512 may include other removable/non-removable, volatile/non-volatile computer system storage media.
  • the storage device 534 may be configured to read and write a non-removable, non-volatile magnetic medium (not shown in FIG. 5, usually referred to as a "hard drive").
  • Although not shown in FIG. 5, a disk drive configured to read and write a removable non-volatile magnetic disk (such as a "floppy disk"), and an optical disk drive configured to read and write a removable non-volatile optical disk (such as a Compact Disc Read-Only Memory (CD-ROM), a Digital Versatile Disc Read-Only Memory (DVD-ROM), or other optical media), can be provided.
  • each drive may be connected to the bus 518 through one or more data medium interfaces.
  • the memory 528 may include at least one program product, and the program product has a set of (for example, at least one) program modules, and these program modules are configured to perform the functions of the embodiments of the present application.
  • a program/utility tool 540 having a set of (at least one) program module 542 may be stored in, for example, the memory 528.
  • Such program modules 542 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a network environment.
  • the program module 542 generally executes the functions and/or methods in the embodiments described in this application.
  • The computer device 512 can also communicate with one or more external devices 514 (such as a keyboard, a pointing device, a display 524, etc.), with one or more devices that enable a user to interact with the computer device 512, and/or with any device (such as a network card, a modem, etc.) that enables the computer device 512 to communicate with one or more other computing devices. Such communication can be performed through an input/output (I/O) interface 522.
  • the computer device 512 may also communicate with one or more networks (for example, a local area network (LAN), a wide area network (WAN), and/or a public network, such as the Internet) through the network adapter 520.
  • The network adapter 520 communicates with the other modules of the computer device 512 through the bus 518. It should be understood that, although not shown in the figure, other hardware and/or software modules can be used in conjunction with the computer device 512, including but not limited to microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, etc.
  • the processor 516 executes a variety of functional applications and data processing by running programs stored in the system memory 528, for example, to implement the video segmentation method provided in the embodiment of the present application, the method includes:
  • Acquire a video to be divided, and determine the correspondence between the knowledge point data in the video to be divided and the video frames in the video to be divided; and divide the video to be divided according to the correspondence to obtain at least one video segment.
  • the processor may also implement the technical solution of the video segmentation method provided by any embodiment of the present application.
  • the sixth embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored.
  • the program is executed by a processor, the video segmentation method as provided in the embodiment of the present application is implemented, and the method includes:
  • Acquire a video to be divided, and determine the correspondence between the knowledge point data in the video to be divided and the video frames in the video to be divided; and divide the video to be divided according to the correspondence to obtain at least one video segment.
  • In the computer-readable storage medium provided by the embodiments of the present application, the computer program stored thereon is not limited to the method operations described above, and can also perform related operations in the video segmentation method provided by any embodiment of the present application.
  • the computer storage media in the embodiments of the present application may adopt any combination of one or more computer-readable media.
  • the computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium.
  • the computer-readable storage medium may be, for example, but not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, device, or device, or a combination of any of the above.
  • Examples of computer-readable storage media (a non-exhaustive list) include: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an Erasable Programmable Read-Only Memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • the computer-readable storage medium can be any tangible medium that contains or stores a program, and the program can be used by or in combination with an instruction execution system, apparatus, or device.
  • the computer-readable signal medium may include a data signal propagated in baseband or as a part of a carrier wave, and computer-readable program code is carried therein. This propagated data signal can take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the computer-readable signal medium may also be any computer-readable medium other than the computer-readable storage medium.
  • The computer-readable medium may send, propagate, or transmit the program for use by or in combination with the instruction execution system, apparatus, or device.
  • the program code contained on the computer-readable medium can be transmitted by any suitable medium, including but not limited to wireless, wire, optical cable, radio frequency (RF), etc., or any suitable combination of the foregoing.
  • the computer program code used to perform the operations of this application can be written in one or more programming languages or a combination thereof.
  • The programming languages include object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages.
  • the program code can be executed entirely on the user's computer, partly on the user's computer, executed as an independent software package, partly on the user's computer and partly executed on a remote computer, or entirely executed on the remote computer or server.
  • The remote computer can be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or can be connected to an external computer (for example, through the Internet using an Internet service provider).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Software Systems (AREA)
  • Business, Economics & Management (AREA)
  • Educational Administration (AREA)
  • Educational Technology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Image Analysis (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

A video segmentation method, apparatus, device and medium. The method includes: acquiring a video to be divided, and determining a correspondence between knowledge point data in the video to be divided and video frames in the video to be divided (S110); and dividing the video to be divided according to the correspondence to obtain at least one video segment (S120).

Description

Video segmentation method, apparatus, device and medium
This application claims priority to Chinese patent application No. 201910943037.0, filed with the Chinese Patent Office on September 30, 2019, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of information processing, and for example, to a video segmentation method, apparatus, device, and medium.
Background
With the rapid development of mobile Internet technology, video teaching resources have been greatly promoted, and digital teaching videos are favored by teachers and students. A typical teaching video is long: one class corresponds to one video, and one video contains multiple knowledge points. Therefore, if a single knowledge point in the video needs to be viewed, the video needs to be divided.
The above solutions have at least the following technical problems: video segmentation is mainly performed according to a specified frame, a specified duration, or blank gaps in the content. Segmenting by specified frame or duration requires manual operation and viewing all of the video content in advance, while segmenting by blank gaps in the content is likely to be inaccurate, making the video difficult for viewers to understand.
Summary
This application provides a video segmentation method, apparatus, device, and medium, so as to simplify the video segmentation process and improve the accuracy of video segmentation.
An embodiment of the present application provides a video segmentation method, including:
acquiring a video to be divided, and determining a correspondence between knowledge point data in the video to be divided and video frames in the video to be divided; and
dividing the video to be divided according to the correspondence to obtain at least one video segment.
An embodiment of the present application further provides a video segmentation apparatus, including:
a knowledge point recognition module, configured to acquire a video to be divided and determine a correspondence between knowledge point data in the video to be divided and video frames in the video to be divided; and
a video segmentation module, configured to divide the video to be divided according to the correspondence to obtain at least one video segment.
An embodiment of the present application further provides a computer device, including:
one or more processors; and
a storage device configured to store one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the video segmentation method provided in any embodiment of the present application.
An embodiment of the present application further provides a computer-readable storage medium on which a computer program is stored, where the program, when executed by a processor, implements the video segmentation method provided in any embodiment of the present application.
Brief Description of the Drawings
FIG. 1 is a flowchart of a video segmentation method provided in Embodiment 1 of the present application;
FIG. 2 is a flowchart of a video segmentation method provided in Embodiment 2 of the present application;
FIG. 3a is a flowchart of a video segmentation method provided in Embodiment 3 of the present application;
FIG. 3b is a flowchart of knowledge graph construction provided in Embodiment 3 of the present application;
FIG. 3c is a schematic flowchart of video segmentation and storage provided in Embodiment 3 of the present application;
FIG. 4 is a schematic structural diagram of a video segmentation apparatus provided in Embodiment 4 of the present application;
FIG. 5 is a schematic structural diagram of a computer device provided in Embodiment 5 of the present application.
Detailed Description
The present application is described below in conjunction with the drawings and embodiments. The specific embodiments described here are only used to explain the present application, not to limit it. For ease of description, only the parts related to the present application, rather than the entire structure, are shown in the drawings.
Embodiment 1
FIG. 1 is a flowchart of a video segmentation method provided in Embodiment 1 of the present application. This embodiment is applicable to the case of segmenting a video, for example, to the case of segmenting a teaching video. The method can be executed by a video segmentation apparatus, and the video segmentation apparatus can be implemented in software and/or hardware; for example, the video segmentation apparatus can be configured in a computer device. As shown in FIG. 1, the method includes the following steps.
S110: Acquire a video to be divided, and determine a correspondence between knowledge point data in the video to be divided and video frames in the video to be divided.
In this embodiment, the video to be divided may be an educational teaching video or a popular science video. After the video to be divided is acquired, its content is analyzed to extract the subtitle content, the voice content, and/or the text content in the video frames; after sorting the recognized content, the knowledge point data contained in the video to be divided and the video frames corresponding to each piece of knowledge point data can be obtained.
Optionally, a natural scene text detection method (Connectionist Text Proposal Network, CTPN) can be used to locate the text regions in the video frames, the Tesseract tool can be used to recognize the text in the images, and automatic speech recognition (ASR) can be used to recognize the voice content in the video. In an embodiment, structured data can be formed by performing attribute extraction, relationship extraction, and entity extraction on the recognized video content, so as to determine the knowledge point data contained in the video to be divided and the correspondence between the knowledge point data and the video frames.
For example, assuming that after content extraction and sorting the obtained knowledge point data includes knowledge point A, knowledge point B, and knowledge point C, the correspondence between the knowledge point data and the video frames can be: the video frame range corresponding to knowledge point A is 1-20, the video frame range corresponding to knowledge point B is 21-50, and the video frame range corresponding to knowledge point C is 51-90.
In another implementation of the present application, the knowledge point data corresponding to the video to be divided can be obtained directly. Optionally, the knowledge point data corresponding to different types of videos can be determined in advance, and after the video to be divided is acquired, the knowledge point data corresponding to the video to be divided is obtained according to its type. Optionally, if the video to be divided is an educational teaching video, its type can be the subject and/or chapter to which it belongs. For example, if the video to be divided is a junior high school mathematics teaching video, the knowledge point data corresponding to the junior high school mathematics subject can be obtained as the knowledge point data corresponding to the video to be divided.
S120: Divide the video to be divided according to the correspondence to obtain at least one video segment.
In this embodiment, according to the range of video frame numbers corresponding to each piece of knowledge point data, the video frame interval corresponding to each piece of knowledge point data can be regarded as one video segment. For example, assuming that the video frame range corresponding to knowledge point A is 1-20, the video frame range corresponding to knowledge point B is 21-50, and the video frame range corresponding to knowledge point C is 51-90, the video segment from video frame 1 to video frame 20 can be determined as video segment 1, the video segment from video frame 21 to video frame 50 as video segment 2, and the video segment from video frame 51 to video frame 90 as video segment 3.
Considering that the video segments divided directly according to the correspondence between the knowledge point data and the video frames may not be accurate enough, the correspondence can also be corrected by a boundary detection method to obtain a corrected correspondence between the knowledge point data and the video frames, and the video frame interval corresponding to each piece of corrected knowledge point data is regarded as one video segment.
In the embodiment of the present application, a video to be divided is acquired, the correspondence between the knowledge point data in the video to be divided and the video frames in the video to be divided is determined, and the video to be divided is divided according to the correspondence to obtain at least one video segment. By dividing the video to be divided according to the knowledge point data it contains, the segmentation process of the video is simplified and the segmentation accuracy of the video is improved.
Embodiment 2
FIG. 2 is a flowchart of a video segmentation method provided in Embodiment 2 of the present application. This embodiment is described on the basis of the above embodiment. As shown in FIG. 2, the method includes the following steps.
S210: Acquire a video to be divided, input the video to be divided into a pre-trained video segmentation model, obtain segmentation data output by the video segmentation model, and determine a video frame number interval corresponding to the knowledge point data according to the segmentation data.
In this embodiment, the video to be divided is divided by a machine learning algorithm. In an embodiment, the video to be divided is input into the trained video segmentation model to obtain the segmentation data output by the video segmentation model, where the segmentation data may include the knowledge point data contained in the video to be divided and the correspondence between the knowledge point data and the video frames. After the segmentation data output by the video segmentation model is obtained, the video frame number interval corresponding to the knowledge point data is determined according to the correspondence, contained in the segmentation data, between the knowledge point data and the video frames. Optionally, the video segmentation model is constructed based on a neural network, which may be a Recurrent Neural Network (RNN) or another form of neural network model.
On the basis of the above solution, the method further includes: acquiring a sample video to be divided and segmentation data corresponding to the sample video to be divided; generating training sample pairs based on the sample video to be divided and the corresponding segmentation data; and using the training sample pairs to train a pre-built video segmentation model to obtain the trained video segmentation model.
Optionally, already-segmented videos can be obtained as sample videos to be divided; each sample video to be divided and its corresponding segmentation data are used as a training sample pair, and multiple training sample pairs are used to train the pre-built video segmentation model to obtain the trained video segmentation model. The videos can be segmented manually, and the segmented videos are used as the sample videos to be divided.
S220: Determine a fuzzy boundary point of the knowledge point data according to the video frame number interval corresponding to the knowledge point data.
In this embodiment, after the correspondence between the knowledge point data and the video frames is obtained, the correspondence is corrected, and the video to be divided is divided according to the corrected correspondence. When correcting the correspondence, the fuzzy boundary points of the knowledge point data need to be determined, and the target boundary points corresponding to the knowledge point data are determined based on the fuzzy boundary points. Optionally, the boundary points of the video frame number interval corresponding to the knowledge point data can be used as the fuzzy boundary points of that knowledge point data. For example, assuming that the video frame number interval corresponding to knowledge point A is 21-50, the fuzzy boundary points of knowledge point A are video frame 21 and video frame 50.
S230: Based on the fuzzy boundary point, obtain candidate video frames within a set range, perform boundary detection on the candidate video frames, and obtain a target boundary point corresponding to the knowledge point data.
In one implementation, a range for obtaining the candidate video frames corresponding to the knowledge point data can be set in advance. Optionally, the set range can be (a-5, a+5), where a is the video frame number corresponding to a fuzzy boundary point of the knowledge point data. For each fuzzy boundary point of the knowledge point data, the candidate video frames are obtained according to the set range, boundary detection is performed on the candidate video frames, and the video frame number corresponding to the detected boundary is used as a target boundary point of the knowledge point data. The boundary detection on the candidate video frames may be performed by extracting video segment features from the candidate video frames and applying a segmentation algorithm that jointly handles abrupt shots and gradual-transition shots.
For example, if the set range is (a-5, a+5) and a fuzzy boundary point of knowledge point A is video frame 21, then video frame 16 to video frame 26 are obtained as candidate video frames, and boundary detection is performed on the video segment composed of video frame 16 to video frame 26. Assuming that the detected boundary is video frame 22, video frame 22 is used as a target boundary point of knowledge point A.
S240: Determine a video segment corresponding to the knowledge point data according to the target boundary point corresponding to the knowledge point data.
In this embodiment, the video segment formed by the target boundary points of the knowledge point data is used as the video segment corresponding to the knowledge point data. For example, if the target boundary points corresponding to knowledge point A are video frame 22 and video frame 49, the video segment corresponding to knowledge point A is the video segment between video frame 22 and video frame 49.
In the technical solution of the embodiment of the present application, determining the correspondence between the knowledge point data in the video to be divided and the video frames in the video to be divided includes: determining the correspondence between the knowledge point data and the video frames through a video segmentation model, and determining the target boundary points corresponding to the knowledge point data through a boundary detection method. This makes the correspondence between the knowledge point data and the video frames more accurate, so that the segmentation result of the video to be divided is more accurate.
Embodiment 3
FIG. 3a is a flowchart of a video segmentation method provided in Embodiment 3 of the present application. This embodiment is described on the basis of the above embodiments. As shown in FIG. 3a, the method includes the following steps.
S310: Acquire a video to be divided, and determine a correspondence between knowledge point data in the video to be divided and video frames in the video to be divided.
S320: Divide the video to be divided according to the correspondence to obtain at least one video segment.
S330: In a case where the at least one video segment is multiple video segments, determine association relationships between the multiple video segments according to association relationships between the knowledge point data, and determine at least one learning path according to the association relationships between the multiple video segments.
In this embodiment, after the video to be divided is divided, the association relationships between the divided video segments are determined according to the association relationships between the knowledge point data, and the learning paths are determined according to the association relationships between the video segments. A learning path is used to characterize the learning order between video segments.
Optionally, the association relationship between knowledge point data can be a directed learning relationship between knowledge points. For example, if learning knowledge point B requires first learning knowledge point A, the association relationship between the knowledge points can be knowledge point A → knowledge point B. According to the directed learning relationships between the knowledge point data and the video segments corresponding to the knowledge point data, the directed learning relationships between the video segments can be determined, and the directed learning relationships between the video segments constitute learning paths.
After the learning paths are determined, a label for each learning path can also be determined according to its characteristics. For example, if there are multiple learning paths between knowledge point A and knowledge point F, the learning paths can be labeled according to the number of knowledge points to be learned, such as "shortest learning path" or "most complete learning path". The shortest learning path can be the learning path requiring the least knowledge point data to be learned in order to learn the target knowledge point, and the most complete learning path can be the learning path with the most complete knowledge point data required to learn the target knowledge point.
In one implementation of the present application, the association relationships between the knowledge point data can be obtained by acquiring a knowledge graph and determining the association relationships between the knowledge point data according to the knowledge graph.
The knowledge graph contains knowledge point data and the association relationships between the knowledge point data; optionally, the association relationships between knowledge points can be obtained through the knowledge graph. The knowledge graph can be established by extracting knowledge relationships from the video to be divided, or a pre-built knowledge graph can be obtained directly.
In one implementation, the knowledge graph can be preset from teaching materials, related books, or web crawlers. The preset knowledge graph serves as an important basis for subsequent video segmentation. Optionally, the preset knowledge graph can be continuously improved manually or by programs, so as to improve the accuracy of segmenting videos.
In one implementation, the knowledge graph can be obtained by analyzing the video to be divided, including: extracting the knowledge point data contained in the video to be divided; and performing relationship extraction on the knowledge point data to construct a knowledge graph containing the association relationships between the knowledge point data. FIG. 3b is a flowchart of knowledge graph construction provided in Embodiment 3 of the present application. As shown in FIG. 3b, the subtitle content, text recognition content, and voice content in the video to be divided are sorted; structured data is formed through attribute extraction, relationship extraction, and entity extraction; a preliminary knowledge graph is then formed through entity alignment, entity disambiguation, and knowledge fusion; and finally the knowledge graph corresponding to the video to be divided is determined through quality evaluation.
S340: In response to a detected video viewing instruction, determine a learning path corresponding to the video viewing instruction.
After the learning paths are determined according to the association relationships between the knowledge point data and the correspondences between the knowledge point data and the video segments, a learning path that meets the user's needs can be recommended to the user according to the determined learning paths and the user's learning needs.
The video viewing instruction can be an instruction triggered by the user through a terminal and used to instruct viewing of a video. For example, if the user needs to learn target knowledge point C, a video viewing request for learning target knowledge point C can be triggered according to a prompt in the terminal interface. After detecting the video viewing request triggered by the user, the terminal generates a video viewing instruction according to the video viewing request and sends the video viewing instruction to the video segmentation apparatus; the video segmentation apparatus parses the received video viewing instruction, obtains the target knowledge point C contained in the video viewing instruction, and obtains the learning path corresponding to the target knowledge point C as the learning path corresponding to the video viewing instruction.
Optionally, the video viewing instruction can also contain the user's learning needs, and the video segmentation apparatus can select, according to the labels of the learning paths, a learning path that meets the user's needs from the learning paths corresponding to the target knowledge point C. For example, if the user's learning need is "shortest learning path", the shortest path is selected from the learning paths corresponding to the target knowledge point C as the learning path corresponding to the video viewing instruction.
FIG. 3c is a schematic flowchart of video segmentation and storage provided in Embodiment 3 of the present application. As shown in FIG. 3c, in this embodiment multiple videos to be divided can be acquired at the same time and divided simultaneously through the preset knowledge graph; after the multiple videos to be divided are divided, video segments of the same type are classified according to segment type (such as discipline), video segments of the same type are stored in the same video segment collection, and collections of similar types can be marked to facilitate subsequent recommendation of video segments.
S350: Generate path recommendation information according to the learning path, and send the path recommendation information to the client for display.
After the learning path corresponding to the video viewing instruction is obtained, path recommendation information is generated according to the knowledge point data contained in the learning path, and the path recommendation information is sent to the client for display. Optionally, the path recommendation information can be directed learning relationships between knowledge point data, or directed learning relationships between video segments.
On the basis of the above solutions, the technical solution of the embodiment of the present application adds the operation of determining learning paths according to the association relationships between knowledge points. By determining the learning path corresponding to a video viewing instruction according to the video viewing instruction, generating path recommendation information according to the learning path, and sending the path recommendation information to the client for display, a learning path that meets the user's needs can be recommended to the user, improving the user's learning effect.
Embodiment 4
FIG. 4 is a schematic structural diagram of a video segmentation apparatus provided in Embodiment 4 of the present application. The video segmentation apparatus can be implemented in software and/or hardware; for example, the video segmentation apparatus can be configured in a computer device. As shown in FIG. 4, the apparatus includes a knowledge point recognition module 410 and a video segmentation module 420, where:
the knowledge point recognition module 410 is configured to acquire a video to be divided and determine a correspondence between knowledge point data in the video to be divided and video frames in the video to be divided; and the video segmentation module 420 is configured to divide the video to be divided according to the correspondence to obtain at least one video segment.
In the embodiment of the present application, the knowledge point recognition module acquires the video to be divided and determines the correspondence between the knowledge point data in the video to be divided and the video frames in the video to be divided; the video segmentation module divides the video to be divided according to the correspondence to obtain at least one video segment. By dividing the video to be divided according to the knowledge point data it contains, the segmentation process of the video is simplified and the segmentation accuracy of the video is improved.
On the basis of the above solution, the knowledge point recognition module 410 is configured to:
input the video to be divided into a pre-trained video segmentation model, obtain segmentation data output by the video segmentation model, and determine a video frame number interval corresponding to the knowledge point data according to the segmentation data.
On the basis of the above solution, the video segmentation module 420 is configured to:
determine a fuzzy boundary point of the knowledge point data according to the video frame number interval corresponding to the knowledge point data; based on the fuzzy boundary point, obtain candidate video frames within a set range, perform boundary detection on the candidate video frames, and obtain a target boundary point corresponding to the knowledge point data; and determine a video segment corresponding to the knowledge point data according to the target boundary point corresponding to the knowledge point data.
On the basis of the above solution, the apparatus further includes:
a learning path determination module, configured to, after the at least one video segment is obtained and in a case where the at least one video segment is multiple video segments, determine association relationships between the multiple video segments according to association relationships between the knowledge point data, and determine at least one learning path according to the association relationships between the multiple video segments, where the learning path is used to characterize the learning order between video segments.
On the basis of the above solution, the apparatus further includes:
an association relationship determination module, configured to, before the association relationships between the multiple video segments are determined according to the association relationships between the knowledge point data, acquire a knowledge graph and determine the association relationships between the knowledge point data according to the knowledge graph.
On the basis of the above solution, the association relationship determination module is configured to:
extract the knowledge point data contained in the video to be divided; and perform relationship extraction on the knowledge point data to construct a knowledge graph containing the association relationships between the knowledge point data.
On the basis of the above solution, the apparatus further includes a path recommendation module configured to:
in response to a detected video viewing instruction, determine a learning path corresponding to the video viewing instruction; and generate path recommendation information according to the learning path, and send the path recommendation information to the client for display.
The video segmentation apparatus provided in the embodiments of the present application can execute the video segmentation method provided in any embodiment, and has functional modules and effects corresponding to the executed method.
Embodiment 5
FIG. 5 is a schematic structural diagram of a computer device provided in Embodiment 5 of the present application. FIG. 5 shows a block diagram of an exemplary computer device 512 suitable for implementing embodiments of the present application. The computer device 512 shown in FIG. 5 is only an example and should not impose any limitation on the functions and scope of use of the embodiments of the present application.
As shown in FIG. 5, the computer device 512 takes the form of a general-purpose computing device. The components of the computer device 512 may include, but are not limited to, one or more processors 516, a system memory 528, and a bus 518 connecting different system components (including the system memory 528 and the processor 516).
The bus 518 represents one or more of several types of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, the processor 516, or a local bus using any of a variety of bus structures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the MicroChannel Architecture (MAC) bus, the enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnect (PCI) bus.
The computer device 512 typically includes a variety of computer system readable media. These media can be any available media that can be accessed by the computer device 512, including volatile and non-volatile media, and removable and non-removable media.
The system memory 528 may include computer system readable media in the form of volatile memory, such as a Random Access Memory (RAM) 530 and/or a cache memory 532. The computer device 512 may include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, the storage device 534 may be configured to read and write a non-removable, non-volatile magnetic medium (not shown in FIG. 5, commonly referred to as a "hard drive"). Although not shown in FIG. 5, a disk drive configured to read and write a removable non-volatile magnetic disk (such as a "floppy disk"), and an optical disk drive configured to read and write a removable non-volatile optical disk (such as a Compact Disc Read-Only Memory (CD-ROM), a Digital Versatile Disc Read-Only Memory (DVD-ROM), or other optical media), can be provided. In these cases, each drive can be connected to the bus 518 through one or more data medium interfaces. The memory 528 may include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the embodiments of the present application.
A program/utility 540 having a set of (at least one) program modules 542 may be stored, for example, in the memory 528. Such program modules 542 include, but are not limited to, an operating system, one or more application programs, other program modules, and program data; each of these examples, or some combination thereof, may include an implementation of a network environment. The program modules 542 generally perform the functions and/or methods in the embodiments described in this application.
The computer device 512 can also communicate with one or more external devices 514 (such as a keyboard, a pointing device, a display 524, etc.), with one or more devices that enable a user to interact with the computer device 512, and/or with any device (such as a network card, a modem, etc.) that enables the computer device 512 to communicate with one or more other computing devices. Such communication can be performed through an input/output (I/O) interface 522. In addition, the computer device 512 can also communicate with one or more networks (for example, a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet) through a network adapter 520. As shown in the figure, the network adapter 520 communicates with the other modules of the computer device 512 through the bus 518. It should be understood that, although not shown in the figure, other hardware and/or software modules can be used in conjunction with the computer device 512, including but not limited to microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, etc.
The processor 516 executes a variety of functional applications and data processing by running programs stored in the system memory 528, for example, implementing the video segmentation method provided in the embodiments of the present application, the method including:
acquiring a video to be divided, and determining a correspondence between knowledge point data in the video to be divided and video frames in the video to be divided; and dividing the video to be divided according to the correspondence to obtain at least one video segment.
The processor can also implement the technical solution of the video segmentation method provided in any embodiment of the present application.
Embodiment 6
Embodiment 6 of the present application further provides a computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the video segmentation method provided in the embodiments of the present application is implemented, the method including:
acquiring a video to be divided, and determining a correspondence between knowledge point data in the video to be divided and video frames in the video to be divided; and dividing the video to be divided according to the correspondence to obtain at least one video segment.
The computer program stored on the computer-readable storage medium provided in the embodiments of the present application is not limited to the method operations described above, and can also perform related operations in the video segmentation method provided in any embodiment of the present application.
The computer storage medium in the embodiments of the present application may adopt any combination of one or more computer-readable media. The computer-readable medium may be a computer-readable signal medium or a computer-readable storage medium. The computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. Examples of computer-readable storage media (a non-exhaustive list) include: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM or flash memory), an optical fiber, a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In this document, the computer-readable storage medium can be any tangible medium that contains or stores a program, and the program can be used by or in combination with an instruction execution system, apparatus, or device.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal can take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. The computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium, and the computer-readable medium can send, propagate, or transmit the program for use by or in combination with the instruction execution system, apparatus, or device.
The program code contained on the computer-readable medium can be transmitted by any suitable medium, including but not limited to wireless, wire, optical cable, radio frequency (RF), etc., or any suitable combination of the above.
The computer program code for performing the operations of the present application can be written in one or more programming languages or a combination thereof. The programming languages include object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code can be executed entirely on the user's computer, partly on the user's computer, as an independent software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In the case involving a remote computer, the remote computer can be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or can be connected to an external computer (for example, through the Internet using an Internet service provider).

Claims (11)

  1. A video segmentation method, comprising:
    acquiring a video to be divided, and determining a correspondence between knowledge point data in the video to be divided and video frames in the video to be divided; and
    dividing the video to be divided according to the correspondence to obtain at least one video segment.
  2. The method according to claim 1, wherein determining the correspondence between the knowledge point data in the video to be divided and the video frames in the video to be divided comprises:
    inputting the video to be divided into a pre-trained video segmentation model, obtaining segmentation data output by the video segmentation model, and determining a video frame number interval corresponding to the knowledge point data according to the segmentation data.
  3. The method according to claim 2, wherein dividing the video to be divided according to the correspondence to obtain the at least one video segment comprises:
    determining a fuzzy boundary point of the knowledge point data according to the video frame number interval corresponding to the knowledge point data;
    based on the fuzzy boundary point, obtaining candidate video frames within a set range, and performing boundary detection on the candidate video frames to obtain a target boundary point corresponding to the knowledge point data; and
    determining a video segment corresponding to the knowledge point data according to the target boundary point corresponding to the knowledge point data.
  4. The method according to claim 1, further comprising, after obtaining the at least one video segment:
    in a case where the at least one video segment is multiple video segments, determining association relationships between the multiple video segments according to association relationships between the knowledge point data, and determining at least one learning path according to the association relationships between the multiple video segments, wherein the learning path is used to characterize a learning order between video segments.
  5. The method according to claim 4, further comprising, before determining the association relationships between the multiple video segments according to the association relationships between the knowledge point data:
    acquiring a knowledge graph, and determining the association relationships between the knowledge point data according to the knowledge graph.
  6. The method according to claim 5, wherein acquiring the knowledge graph comprises:
    extracting the knowledge point data contained in the video to be divided; and
    performing relationship extraction on the knowledge point data to construct a knowledge graph containing the association relationships between the knowledge point data.
  7. The method according to claim 4, further comprising:
    in response to a detected video viewing instruction, determining a learning path corresponding to the video viewing instruction; and
    generating path recommendation information according to the learning path, and sending the path recommendation information to a client for display.
  8. The method according to claim 2, further comprising:
    acquiring a sample video to be divided and segmentation data corresponding to the sample video to be divided; and
    generating training sample pairs based on the sample video to be divided and the segmentation data corresponding to the sample video to be divided, and using the training sample pairs to train a pre-built video segmentation model to obtain the trained video segmentation model.
  9. A video segmentation apparatus, comprising:
    a knowledge point recognition module, configured to acquire a video to be divided and determine a correspondence between knowledge point data in the video to be divided and video frames in the video to be divided; and
    a video segmentation module, configured to divide the video to be divided according to the correspondence to obtain at least one video segment.
  10. A computer device, comprising:
    at least one processor; and
    a storage device configured to store at least one program,
    wherein, when the at least one program is executed by the at least one processor, the at least one processor implements the video segmentation method according to any one of claims 1-8.
  11. A computer-readable storage medium storing a computer program, wherein the program, when executed by a processor, implements the video segmentation method according to any one of claims 1-8.
PCT/CN2020/083473 2019-09-30 2020-04-07 Video segmentation method, apparatus, device and medium WO2021062990A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/763,480 US20220375225A1 (en) 2019-09-30 2020-04-07 Video Segmentation Method and Apparatus, Device, and Medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910943037.0A 2019-09-30 2019-09-30 Video segmentation method, apparatus, device and medium
CN201910943037.0 2019-09-30

Publications (1)

Publication Number Publication Date
WO2021062990A1 true WO2021062990A1 (zh) 2021-04-08

Family

ID=72646108

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/083473 WO2021062990A1 (zh) 2019-09-30 2020-04-07 视频分割方法、装置、设备及介质

Country Status (3)

Country Link
US (1) US20220375225A1 (zh)
CN (1) CN111738041A (zh)
WO (1) WO2021062990A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114491254A (zh) * 2022-01-24 2022-05-13 湖南大学 知识点推荐方法、装置及存储介质
CN114697762A (zh) * 2022-04-07 2022-07-01 脸萌有限公司 一种处理方法、装置、终端设备及介质
CN115174982A (zh) * 2022-06-30 2022-10-11 咪咕文化科技有限公司 实时视频关联展示方法、装置、计算设备和存储介质

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112560663B (zh) * 2020-12-11 2024-08-23 南京谦萃智能科技服务有限公司 教学视频打点方法、相关设备及可读存储介质
CN112289321B (zh) * 2020-12-29 2021-03-30 平安科技(深圳)有限公司 讲解同步的视频高光处理方法、装置、计算机设备及介质
CN113051379B (zh) * 2021-02-24 2023-08-04 南京审计大学 一种知识点推荐方法、装置、电子设备及存储介质
CN113297419B (zh) * 2021-06-23 2024-04-09 南京谦萃智能科技服务有限公司 视频知识点确定方法、装置、电子设备和存储介质
CN113887334B (zh) * 2021-09-13 2024-08-09 华中师范大学 一种视频知识点抽取方法及装置
CN114708008A (zh) * 2021-12-30 2022-07-05 北京有竹居网络技术有限公司 一种推广内容处理方法、装置、设备、介质及产品
CN114550300A (zh) * 2022-02-25 2022-05-27 北京百度网讯科技有限公司 视频数据分析方法、装置、电子设备及计算机存储介质
CN117033665B (zh) * 2023-10-07 2024-01-09 成都华栖云科技有限公司 一种图谱知识点与视频的对齐方法及装置

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107968959A (zh) * 2017-11-15 2018-04-27 广东广凌信息科技股份有限公司 一种教学视频的知识点分割方法
CN109460488A (zh) * 2018-11-16 2019-03-12 广东小天才科技有限公司 一种辅助教学方法及系统
CN109934188A (zh) * 2019-03-19 2019-06-25 上海大学 一种幻灯片切换检测方法、系统、终端及存储介质
US20190272765A1 (en) * 2018-03-04 2019-09-05 NN Medical, Inc. Online Teaching System and Method thereof

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107343223B (zh) * 2017-07-07 2019-10-11 北京慕华信息科技有限公司 视频片段的识别方法和装置
CN108596940B (zh) * 2018-04-12 2021-03-30 北京京东尚科信息技术有限公司 一种视频分割方法和装置
CN109151615B (zh) * 2018-11-02 2022-01-25 湖南双菱电子科技有限公司 视频处理方法、计算机设备和计算机存储介质
CN109359215B (zh) * 2018-12-03 2023-08-22 江苏曲速教育科技有限公司 视频智能推送方法和系统
CN110147846A (zh) * 2019-05-23 2019-08-20 软通智慧科技有限公司 视频分割方法、装置、设备及存储介质

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107968959A (zh) * 2017-11-15 2018-04-27 广东广凌信息科技股份有限公司 一种教学视频的知识点分割方法
US20190272765A1 (en) * 2018-03-04 2019-09-05 NN Medical, Inc. Online Teaching System and Method thereof
CN109460488A (zh) * 2018-11-16 2019-03-12 广东小天才科技有限公司 一种辅助教学方法及系统
CN109934188A (zh) * 2019-03-19 2019-06-25 上海大学 一种幻灯片切换检测方法、系统、终端及存储介质

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114491254A (zh) * 2022-01-24 2022-05-13 湖南大学 知识点推荐方法、装置及存储介质
CN114697762A (zh) * 2022-04-07 2022-07-01 脸萌有限公司 一种处理方法、装置、终端设备及介质
CN114697762B (zh) * 2022-04-07 2023-11-28 脸萌有限公司 一种处理方法、装置、终端设备及介质
CN115174982A (zh) * 2022-06-30 2022-10-11 咪咕文化科技有限公司 实时视频关联展示方法、装置、计算设备和存储介质
CN115174982B (zh) * 2022-06-30 2024-04-09 咪咕文化科技有限公司 实时视频关联展示方法、装置、计算设备和存储介质

Also Published As

Publication number Publication date
US20220375225A1 (en) 2022-11-24
CN111738041A (zh) 2020-10-02

Similar Documents

Publication Publication Date Title
WO2021062990A1 (zh) 视频分割方法、装置、设备及介质
US11062090B2 (en) Method and apparatus for mining general text content, server, and storage medium
CN109614934B (zh) 在线教学质量评估参数生成方法及装置
CN112115706B (zh) 文本处理方法、装置、电子设备及介质
WO2023050650A1 (zh) 动画视频生成方法、装置、设备及存储介质
US20180366107A1 (en) Method and device for training acoustic model, computer device and storage medium
CN108305618B (zh) 语音获取及搜索方法、智能笔、搜索终端及存储介质
CN111475627B (zh) 解答推导题目的检查方法、装置、电子设备及存储介质
CN111639766B (zh) 样本数据的生成方法以及装置
CN111522970A (zh) 习题推荐方法、装置、设备及存储介质
US20210133623A1 (en) Self-supervised object detector training using raw and unlabeled videos
US11501655B2 (en) Automated skill tagging, knowledge graph, and customized assessment and exercise generation
CN110489747A (zh) 一种图像处理方法、装置、存储介质及电子设备
CN111325031B (zh) 简历解析方法及装置
CN114885216B (zh) 习题推送方法、系统、电子设备和存储介质
CN109408175B (zh) 通用高性能深度学习计算引擎中的实时交互方法及系统
CN110647613A (zh) 一种课件构建方法、装置、服务器和存储介质
CN111385659B (zh) 一种视频推荐方法、装置、设备及存储介质
CN113282509B (zh) 音色识别、直播间分类方法、装置、计算机设备和介质
CN111966839B (zh) 数据处理方法、装置、电子设备及计算机存储介质
WO2021104274A1 (zh) 图文联合表征的搜索方法、系统、服务器和存储介质
CN111968624B (zh) 数据构建方法、装置、电子设备及存储介质
CN112542163B (zh) 智能语音交互方法、设备及存储介质
CN116050382A (zh) 章节检测方法、装置、电子设备和存储介质
CN113850235B (zh) 一种文本处理方法、装置、设备及介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20871554

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 09.08.2022)

122 Ep: pct application non-entry in european phase

Ref document number: 20871554

Country of ref document: EP

Kind code of ref document: A1