WO2020052082A1 - Video floating paper sheet detection method, device, and computer-readable storage medium


Info

Publication number
WO2020052082A1
Authority
WO
WIPO (PCT)
Prior art keywords
detected
video
floating paper
picture
floating
Application number
PCT/CN2018/117711
Other languages
English (en)
French (fr)
Inventor
周多友
王长虎
Original Assignee
北京字节跳动网络技术有限公司
Application filed by 北京字节跳动网络技术有限公司
Publication of WO2020052082A1

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/20 - Analysis of motion
    • G06T 7/246 - Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 - Pattern recognition
    • G06F 18/20 - Analysing
    • G06F 18/24 - Classification techniques
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04N - PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 - Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 - Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 - Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/431 - Generation of visual interfaces for content selection or interaction; Content or additional data rendering
    • H04N 21/4312 - Generation of visual interfaces for content selection or interaction involving specific graphical features, e.g. screen layout, special fonts or colors, blinking icons, highlights or animations
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 - Image acquisition modality
    • G06T 2207/10016 - Video; Image sequence

Definitions

  • The present disclosure relates to the technical field of information processing, and in particular to a method, a device, and a computer-readable storage medium for detecting floating paper sheets in video.
  • The author of a video often adds text to the video, such as advertising slogans or introductions. These are generally printed characters, which existing recognition methods identify fairly easily. In other scenarios, however, the author adds a paper-sheet effect to the video bearing handwritten characters. Because handwriting is usually scribbled and difficult to recognize, such videos often go unrecognized and are misclassified as text-free videos, which makes video classification inaccurate.
  • The technical problem addressed by the present disclosure is to provide a video floating paper sheet detection method that at least partially solves the problem of inaccurate video classification. A video floating paper sheet detection device, a video floating paper sheet detection hardware device, a computer-readable storage medium, and a video floating paper sheet detection terminal are also provided.
  • A video floating paper sheet detection method includes:
  • performing floating paper sheet detection on at least one frame of a picture to be detected extracted from a video to be detected, the floating paper sheet being a sub-display window inserted into the video to be detected and irrelevant to the content of the video to be detected; and
  • determining whether the video to be detected contains a floating paper sheet according to the detection result of the at least one frame of the picture to be detected.
  • Further, the step of determining whether a floating paper sheet is included in the video to be detected according to the detection result of the at least one frame of the picture to be detected includes: if it is detected that at least one frame of the picture to be detected includes a floating paper sheet, determining that the video to be detected includes a floating paper sheet.
  • Further, the step of performing floating paper sheet detection on at least one frame of the picture to be detected extracted from the video to be detected includes: for multiple frames of pictures to be detected, extracting the image features of each frame; comparing the image features of the frames; and, if pictures to be detected containing the same image feature exist, determining that at least two frames of the pictures to be detected include floating paper sheets.
  • Further, the step of performing floating paper sheet detection on at least one frame of the picture to be detected extracted from the video to be detected includes: for a single frame of the picture to be detected, extracting the feature points of the picture and the neighboring feature points of those feature points; determining feature regions according to the similarity between the feature points and their neighbors; and, if the picture to be detected is detected to include at least two feature regions, determining that it includes a floating paper sheet.
  • Further, the method includes: using pictures known to contain floating paper sheets and/or pictures known not to contain them as training samples; labeling the training samples according to whether a floating paper sheet is included; and training the labeled samples with a deep learning classification algorithm to obtain an image classifier. The step of performing floating paper sheet detection on at least one frame of the picture to be detected then includes inputting the at least one frame into the image classifier and determining the detection result according to the classification result of the image classifier.
  • A video floating paper sheet detection device includes:
  • a floating paper sheet detection module configured to perform floating paper sheet detection on at least one frame of a picture to be detected extracted from a video to be detected, where the floating paper sheet is a sub-display window inserted into the video to be detected and unrelated to its content; and
  • a floating paper sheet determination module configured to determine whether a floating paper sheet is included in the video to be detected according to the detection result of the at least one frame of the picture to be detected.
  • the floating paper sheet determination module is specifically configured to: if it is detected that at least one frame of the to-be-detected picture includes a floating paper sheet, determine that the to-be-detected video includes a floating paper sheet.
  • Further, the floating paper sheet detection module is specifically configured to: for multiple frames of pictures to be detected, extract the image features of each frame; compare the image features of the frames; and, if pictures to be detected containing the same image feature exist, determine that at least two frames of the pictures to be detected include floating paper sheets.
  • the floating paper detection module is specifically configured to: for a single frame of a picture to be detected, extract feature points of the picture to be detected and neighboring feature points of the feature points; and according to the feature points and the neighboring features The similarity of the points determines the feature area; if it is detected that the picture to be detected includes at least two feature areas, it is determined that the picture to be detected includes a floating paper sheet.
  • the device further includes:
  • An image classifier training module is configured to use pictures known to contain floating paper sheets and/or pictures known not to contain them as training samples, label the training samples according to whether a floating paper sheet is included, and train the labeled samples with a deep learning classification algorithm to obtain an image classifier.
  • the floating paper detection module is specifically configured to input the at least one frame of the picture to be detected into the image classifier, and determine a detection result in the at least one frame of the picture to be detected according to a classification result of the image classifier.
  • A video floating paper sheet detection hardware device includes:
  • a memory configured to store non-transitory computer-readable instructions; and
  • a processor configured to run the computer-readable instructions such that, when executing them, the processor implements the steps of any one of the foregoing technical solutions of the video floating paper sheet detection method.
  • A computer-readable storage medium stores non-transitory computer-readable instructions that, when executed by a computer, cause the computer to perform the steps of any one of the foregoing technical solutions of the video floating paper sheet detection method.
  • A video floating paper sheet detection terminal includes any one of the above video floating paper sheet detection devices.
  • Embodiments of the present disclosure provide a video floating sheet detection method, a video floating sheet detection device, a video floating sheet detection hardware device, a computer-readable storage medium, and a video floating sheet detection terminal.
  • The video floating paper sheet detection method includes performing floating paper sheet detection on at least one frame of a picture to be detected extracted from a video to be detected, the floating paper sheet being a sub-display window inserted into the video to be detected and unrelated to its content, and determining whether the video to be detected contains a floating paper sheet according to the detection result of the at least one frame of the picture to be detected.
  • An embodiment of the present disclosure first performs floating paper sheet detection on at least one frame of a picture to be detected extracted from a video to be detected, where the floating paper sheet is a sub-display window inserted into the video to be detected and unrelated to its content, and then determines whether the video to be detected contains a floating paper sheet according to the detection result of the at least one frame, which can improve the accuracy of video classification.
  • FIG. 1a is a schematic flowchart of a video floating paper sheet detection method according to an embodiment of the present disclosure.
  • FIG. 1b is a schematic flowchart of a video floating paper sheet detection method according to another embodiment of the present disclosure.
  • FIG. 1c is a schematic flowchart of a video floating paper sheet detection method according to another embodiment of the present disclosure.
  • FIG. 1d is a schematic flowchart of a video floating paper sheet detection method according to another embodiment of the present disclosure.
  • FIG. 1e is a schematic flowchart of a video floating paper sheet detection method according to another embodiment of the present disclosure.
  • FIG. 2a is a schematic structural diagram of a video floating paper sheet detection device according to an embodiment of the present disclosure.
  • FIG. 2b is a schematic structural diagram of a video floating paper sheet detection device according to another embodiment of the present disclosure.
  • FIG. 3 is a schematic structural diagram of a video floating paper sheet detection hardware device according to an embodiment of the present disclosure.
  • FIG. 4 is a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present disclosure.
  • FIG. 5 is a schematic structural diagram of a video floating paper sheet detection terminal according to an embodiment of the present disclosure.
  • The video floating paper sheet detection method mainly includes the following steps S1 and S2:
  • Step S1: Perform floating paper sheet detection on at least one frame of the picture to be detected extracted from the video to be detected.
  • The floating paper sheet is a sub-display window that is inserted into the video to be detected and has nothing to do with the content of the video to be detected.
  • the picture to be detected may be one or more frames.
  • When the picture to be detected comprises multiple frames, each single frame may be detected individually, or the multiple frames may be compared with one another.
  • the sub-display window includes, but is not limited to, inserted advertisements, pornographic information, or handwritten text information.
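  • The disclosure does not specify how the at least one frame is chosen from the video to be detected. As a minimal sketch only (the function name and the uniform-sampling strategy are assumptions, not part of the disclosure), the frames could be taken at evenly spaced indices:

```python
def sample_frame_indices(total_frames: int, num_samples: int) -> list:
    """Pick evenly spaced frame indices from a video.

    Uniform sampling is only one plausible way to obtain the
    "pictures to be detected"; the patent leaves this step open.
    """
    if total_frames <= 0 or num_samples <= 0:
        return []
    if num_samples >= total_frames:
        return list(range(total_frames))
    step = total_frames / num_samples
    # Center each sample inside its interval so indices do not
    # cluster at the start of the video.
    return [int(step * i + step / 2) for i in range(num_samples)]

print(sample_frame_indices(300, 5))  # [30, 90, 150, 210, 270]
```

  • Each sampled frame would then be passed to the frame-level detection of step S1.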
  • Step S2 Determine whether a floating paper is included in the video to be detected according to a detection result of at least one frame of the picture to be detected.
  • The detection result includes, but is not limited to: only one frame of the pictures to be detected contains a floating paper sheet, multiple frames of the pictures to be detected contain floating paper sheets, or no picture to be detected contains a floating paper sheet.
  • This embodiment performs floating paper sheet detection on at least one frame of a picture to be detected extracted from a video to be detected, where the floating paper sheet is a sub-display window inserted into the video to be detected and unrelated to its content, and then determines whether the video to be detected contains a floating paper sheet according to the detection result of the at least one frame, which can improve the accuracy of video classification.
  • Step S2 specifically includes: if it is detected that at least one frame of the picture to be detected contains a floating paper sheet, determining that the video to be detected contains a floating paper sheet.
  • In this embodiment, floating paper sheet detection is performed on at least one frame of the picture to be detected extracted from the video to be detected, where the floating paper sheet is a sub-display window inserted into the video to be detected and unrelated to its content. If at least one frame of the pictures to be detected contains a floating paper sheet, the video to be detected is determined to contain a floating paper sheet, which can improve the accuracy of video classification.
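  • The frame-to-video decision rule of step S2 is simple enough to state directly in code. A minimal sketch (the function name is illustrative, not from the disclosure):

```python
def video_contains_floating_paper(frame_results):
    """Step S2 decision rule: the video to be detected is deemed to
    contain a floating paper sheet as soon as at least one frame-level
    detection result is positive."""
    return any(frame_results)

# One positive frame out of three is enough to flag the video.
print(video_contains_floating_paper([False, True, False]))  # True
print(video_contains_floating_paper([False, False]))        # False
```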
  • step S1 specifically includes:
  • S11: For multiple frames of pictures to be detected, extract the image features of each frame of the pictures to be detected.
  • the image feature may be a feature point of the picture to be detected, or a feature area of the picture to be detected.
  • S12: Compare the image features of the frames of pictures to be detected. If pictures to be detected containing the same image feature exist, determine that at least two frames of the pictures to be detected contain floating paper sheets.
  • Specifically, the shape context feature and the scale-invariant feature transform (SIFT) feature of each feature point are extracted, and the similarity of the feature points across the frames to be detected is compared according to these features. The matching result of the feature-point similarity between the pictures to be detected yields a matched feature region, which constitutes the same image feature.
  • This approach can detect the case where the position of the floating paper sheet changes from frame to frame.
  • Alternatively, the determination can be made by pixel matching or by computing the similarity of feature regions; this approach can detect the case where the position of the floating paper sheet is fixed across the frames of the video.
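  • The multi-frame comparison of steps S11 and S12 can be sketched as follows. This is an illustration only: plain cosine similarity over toy descriptor lists stands in for the shape context and SIFT features named above, and the matching threshold is an assumed value.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def frames_share_feature(descs_a, descs_b, threshold=0.95):
    """Return True if any descriptor in frame A closely matches one in
    frame B. A feature that reappears across frames with near-identical
    descriptors is taken as evidence of the same inserted sheet."""
    return any(
        cosine_similarity(da, db) >= threshold
        for da in descs_a
        for db in descs_b
    )

# Toy descriptors: the first frame's feature reappears almost unchanged.
print(frames_share_feature([[1, 0, 0]], [[0.99, 0.01, 0.0]]))  # True
print(frames_share_feature([[1, 0, 0]], [[0, 1, 0]]))          # False
```

  • A real implementation would obtain the descriptors from a feature extractor rather than hand-written lists; only the cross-frame matching logic is illustrated here.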
  • Alternatively, step S1 specifically includes the following steps S13 and S14:
  • S13: For a single frame of the picture to be detected, extract the feature points of the picture to be detected and the neighboring feature points of those feature points. The feature point may be, for example, a SIFT feature point.
  • S14: Determine the feature regions according to the similarity between the feature points and their neighboring feature points.
  • Because the content of a video is usually continuous, the pixels within a single extracted frame are highly correlated, whereas an inserted floating paper sheet is often unrelated to the video content, so its pixels differ considerably from those of the rest of the extracted frame.
  • the feature points of the picture to be detected and the neighboring feature points of the feature points can be extracted, and the feature area can be determined according to the similarity between the feature points and the neighboring feature points. If it is detected that the picture to be detected includes at least two feature regions, it is determined that the picture to be detected includes a floating sheet of paper.
  • This embodiment extracts the feature points of the picture to be detected and their neighboring feature points, and determines feature regions according to their similarity. If the picture to be detected contains at least two feature regions, it is determined to contain a floating paper sheet, and hence the video to be detected is determined to contain a floating paper sheet, which can improve the accuracy of video classification.
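  • A rough sketch of this single-frame variant (steps S13 and S14): feature points are greedily grouped into regions by neighbor similarity, and two or more distinct regions signal an inserted sheet. The grouping strategy, similarity function, and threshold are all assumptions made for illustration, not the disclosure's actual algorithm.

```python
def group_feature_regions(points, sim, threshold=0.8):
    """Greedily group feature points into regions: a point joins an
    existing region when it is similar to one of its members,
    otherwise it starts a new region."""
    regions = []
    for p in points:
        for region in regions:
            if any(sim(p, q) >= threshold for q in region):
                region.append(p)
                break
        else:
            regions.append([p])
    return regions

def frame_contains_floating_paper(points, sim, threshold=0.8):
    """At least two distinct feature regions (the video content plus
    the inserted sheet) signal a floating paper sheet in this frame."""
    return len(group_feature_regions(points, sim, threshold)) >= 2

# Toy 1-D "features": two well-separated clusters form two regions.
close = lambda a, b: 1.0 if abs(a - b) < 0.1 else 0.0
print(frame_contains_floating_paper([0.0, 0.05, 1.0, 1.02], close))  # True
print(frame_contains_floating_paper([0.0, 0.01, 0.02], close))       # False
```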
  • The method in this embodiment further includes the following steps:
  • S3: Use pictures known to contain floating paper sheets and/or pictures known not to contain them as training samples.
  • S4: Label the training samples according to whether a floating paper sheet is included.
  • Specifically, each picture needs to be labeled: for example, a picture containing a floating paper sheet is labeled 1, and a picture not containing one is labeled 0.
  • S5 Use a deep learning classification algorithm to perform training learning on the labeled training samples to obtain an image classifier.
  • The deep learning classification algorithms include, but are not limited to, any of the following: the naive Bayes algorithm, artificial neural network algorithms, genetic algorithms, the k-nearest neighbor (KNN) classification algorithm, clustering algorithms, and the like.
  • Step S1 specifically includes:
  • Specifically, the at least one frame of the picture to be detected is input into the image classifier, and the detection result for the at least one frame is determined according to the classification result of the image classifier.
  • Whether the video to be detected contains a floating paper sheet is then determined from that detection result, which can improve the accuracy of video classification.
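  • Of the classifiers listed above, the k-nearest-neighbor algorithm is simple enough to sketch without a learning framework. The feature vectors below are toy placeholders for real image features, and the 1/0 labels follow the labeling scheme described earlier; nothing here is the disclosure's actual classifier.

```python
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training
    samples. `train` is a list of (feature_vector, label) pairs with
    label 1 = contains a floating paper sheet, 0 = does not."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    neighbours = sorted(train, key=lambda tl: sq_dist(tl[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Toy training set: the cluster near the origin is label 0, the other label 1.
train = [([0, 0], 0), ([0, 1], 0), ([5, 5], 1), ([5, 6], 1), ([6, 5], 1)]
print(knn_predict(train, [5, 5]))  # 1
print(knn_predict(train, [0, 0]))  # 0
```

  • In practice the feature vectors would come from an image feature extractor, and a learned model (for example a neural network, as the disclosure also envisages) would replace this toy classifier.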
  • the following is a device embodiment of the present disclosure.
  • the device embodiment of the present disclosure can be used to perform the steps implemented by the method embodiments of the present disclosure.
  • Only the parts related to the embodiments of the present disclosure are shown; for specific technical details that are not disclosed, refer to the method embodiments of the present disclosure.
  • an embodiment of the present disclosure provides a video floating paper detection device.
  • the device can perform the steps in the above-mentioned embodiment of the video floating sheet detection method.
  • The device mainly includes a floating paper sheet detection module 21 and a floating paper sheet determination module 22. The floating paper sheet detection module 21 is configured to perform floating paper sheet detection on at least one frame of a picture to be detected extracted from a video to be detected, where the floating paper sheet is a sub-display window inserted into the video to be detected and unrelated to its content. The floating paper sheet determination module 22 is configured to determine whether the video to be detected contains a floating paper sheet according to the detection result of the at least one frame of the picture to be detected.
  • the picture to be detected may be one or more frames.
  • When the picture to be detected comprises multiple frames, each single frame may be detected individually, or the multiple frames may be compared with one another.
  • the sub-display window includes, but is not limited to, inserted advertisements, pornographic information, or handwritten text information.
  • The detection result includes, but is not limited to: only one frame of the pictures to be detected contains a floating paper sheet, multiple frames of the pictures to be detected contain floating paper sheets, or no picture to be detected contains a floating paper sheet.
  • In this embodiment, the floating paper sheet detection module 21 performs floating paper sheet detection on at least one frame of the picture to be detected extracted from the video to be detected, where the floating paper sheet is a sub-display window inserted into the video to be detected and unrelated to its content; the floating paper sheet determination module 22 then determines, according to the detection result of the at least one frame, whether the video to be detected contains a floating paper sheet, which can improve the accuracy of video classification.
  • the floating paper sheet determining module 22 is specifically configured to: if it is detected that at least one frame of the picture to be detected includes a floating paper sheet, determine that the video to be detected includes a floating paper sheet .
  • Specifically, if it is detected that at least one frame of the picture to be detected contains a floating paper sheet, the floating paper sheet determination module 22 determines that the video to be detected contains a floating paper sheet; otherwise, it determines that the video to be detected does not contain one.
  • This embodiment uses the floating paper sheet detection module 21 to perform floating paper sheet detection on at least one frame of the picture to be detected extracted from the video to be detected, where the floating paper sheet is a sub-display window inserted into the video to be detected and unrelated to its content. If the floating paper sheet determination module 22 detects that at least one frame of the picture to be detected contains a floating paper sheet, it determines that the video to be detected contains a floating paper sheet, which can improve the accuracy of video classification.
  • Optionally, the floating paper sheet detection module 21 is specifically configured to: for multiple frames of pictures to be detected, extract the image features of each frame; compare the image features of the frames; and, if pictures to be detected containing the same image feature exist, determine that at least two frames of the pictures to be detected contain floating paper sheets.
  • the image feature may be a feature point of the picture to be detected, or a feature area of the picture to be detected.
  • Specifically, the shape context feature and the SIFT feature of each feature point are extracted, and the similarity of the feature points between the pictures to be detected is compared according to these features; the matching result of the feature-point similarity yields a matched feature region, which constitutes the same image feature.
  • This approach can detect the case where the position of the floating paper sheet changes from frame to frame.
  • Alternatively, the determination can be made by pixel matching or by computing the similarity of feature regions; this approach can detect the case where the position of the floating paper sheet is fixed across the frames of the video.
  • The floating paper sheet detection module 21 extracts the image features of each frame of the picture to be detected and compares them; if pictures to be detected containing the same image feature exist, it determines that at least two frames of the pictures to be detected contain floating paper sheets, so that the floating paper sheet determination module 22 determines that the video to be detected contains a floating paper sheet, which can improve the accuracy of video classification.
  • the floating paper detection module 21 is specifically configured to: for a single frame of a picture to be detected, extract feature points of the picture to be detected and neighboring feature points of the feature points; according to the feature points The similarity with the neighboring feature points determines the feature area; if it is detected that the picture to be detected includes at least two feature areas, it is determined that the picture to be detected contains a floating sheet of paper.
  • the feature point may be a SIFT feature point.
  • Because the content of a video is usually continuous, the pixels within a single extracted frame are highly correlated, whereas an inserted floating paper sheet is often unrelated to the video content, so its pixels differ considerably from those of the rest of the extracted frame.
  • the feature points of the picture to be detected and the neighboring feature points of the feature points can be extracted, and the feature area can be determined according to the similarity between the feature points and the neighboring feature points. If it is detected that the picture to be detected includes at least two feature regions, it is determined that the picture to be detected includes a floating sheet of paper.
  • The floating paper sheet detection module 21 extracts the feature points of the picture to be detected and their neighboring feature points, and determines feature regions according to the similarity between the feature points and their neighbors. If the picture to be detected contains at least two feature regions, the floating paper sheet determination module 22 determines that the picture to be detected contains a floating paper sheet, and hence that the video to be detected contains a floating paper sheet, which can improve the accuracy of video classification.
  • Optionally, the device in this embodiment further includes an image classifier training module 23, which is configured to use pictures known to contain floating paper sheets and/or pictures known not to contain them as training samples, label the training samples according to whether a floating paper sheet is included, and train the labeled samples with a deep learning classification algorithm to obtain an image classifier.
  • the floating paper detection module 21 is specifically configured to input at least one frame of the picture to be detected into the image classifier, and determine the detection result in the at least one frame of the picture to be detected according to the classification result of the image classifier.
  • the image classifier training module 23 needs to label each picture in order to distinguish between pictures containing floating paper and pictures not containing floating paper. For example, a picture containing a floating piece of paper is labeled 1 and a picture not containing a piece of floating paper is labeled 0.
  • The deep learning classification algorithms include, but are not limited to, any of the following: the naive Bayes algorithm, artificial neural network algorithms, genetic algorithms, the k-nearest neighbor (KNN) classification algorithm, clustering algorithms, and the like.
  • In this embodiment, the image classifier is trained by the image classifier training module 23, and at least one frame of the picture to be detected is input into the image classifier. The detection result of the at least one frame is determined according to the classification result of the image classifier, and the floating paper sheet determination module 22 then determines, according to that detection result, whether the video to be detected contains a floating paper sheet, which can improve the accuracy of video classification.
  • FIG. 3 is a hardware block diagram illustrating a video floating sheet detection hardware device according to an embodiment of the present disclosure.
  • a video floating sheet detection hardware device 30 according to an embodiment of the present disclosure includes a memory 31 and a processor 32.
  • the memory 31 is configured to store non-transitory computer-readable instructions.
  • the memory 31 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and / or non-volatile memory.
  • the volatile memory may include, for example, a random access memory (RAM) and / or a cache memory.
  • the non-volatile memory may include, for example, a read-only memory (ROM), a hard disk, a flash memory, and the like.
  • The processor 32 may be a central processing unit (CPU) or another form of processing unit having data processing capability and/or instruction execution capability, and may control other components in the video floating paper sheet detection hardware device 30 to perform desired functions.
  • The processor 32 is configured to execute the computer-readable instructions stored in the memory 31, so that the video floating paper sheet detection hardware device 30 performs all or part of the steps of the foregoing video floating paper sheet detection method according to the embodiments of the present disclosure.
  • This embodiment may also include well-known structures such as a communication bus and interfaces; these well-known structures should also fall within the protection scope of the present disclosure.
  • FIG. 4 is a schematic diagram illustrating a computer-readable storage medium according to an embodiment of the present disclosure.
  • a computer-readable storage medium 40 according to an embodiment of the present disclosure stores non-transitory computer-readable instructions 41 thereon.
  • When the non-transitory computer-readable instructions 41 are executed by a processor, all or part of the steps of the video floating paper sheet detection method of the foregoing embodiments of the present disclosure are performed.
  • The computer-readable storage medium 40 includes, but is not limited to: optical storage media (e.g., CD-ROM and DVD), magneto-optical storage media (e.g., MO), magnetic storage media (e.g., magnetic tape or portable hard disk), rewritable non-volatile memory media (e.g., memory cards), and media with built-in ROM (e.g., ROM cartridges).
  • FIG. 5 is a schematic diagram illustrating a hardware structure of a terminal according to an embodiment of the present disclosure. As shown in FIG. 5, the video floating sheet detection terminal 50 includes the foregoing video floating sheet detection device embodiment.
  • The terminal may be implemented in various forms. The terminal in the present disclosure may include, but is not limited to, mobile terminals such as mobile phones, smart phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), navigation devices, in-vehicle terminals, in-vehicle display terminals, and in-vehicle electronic rear-view mirrors, as well as fixed terminals such as digital TVs and desktop computers.
  • the terminal may further include other components.
  • the video floating paper detection terminal 50 may include a power supply unit 51, a wireless communication unit 52, an A/V (audio/video) input unit 53, a user input unit 54, a sensing unit 55, an interface unit 56, a controller 57, an output unit 58, a memory 59, and so on.
  • FIG. 5 illustrates a terminal having various components, but it should be understood that not all of the illustrated components are required to be implemented; more or fewer components may be implemented instead.
  • the wireless communication unit 52 allows radio communication between the terminal 50 and a wireless communication system or network.
  • the A / V input unit 53 is used to receive audio or video signals.
  • the user input unit 54 may generate key input data according to a command input by the user to control various operations of the terminal.
  • the sensing unit 55 detects the current state of the terminal 50, the position of the terminal 50, the presence or absence of a user's touch input to the terminal 50, the orientation of the terminal 50, the acceleration or deceleration movement and direction of the terminal 50, and the like, and generates commands or signals for controlling the operation of the terminal 50.
  • the interface unit 56 functions as an interface through which at least one external device can be connected to the terminal 50.
  • the output unit 58 is configured to provide output signals in a visual, audio, and/or tactile manner.
  • the memory 59 may store software programs and the like for the processing and control operations performed by the controller 57, or may temporarily store data that has been output or is to be output.
  • the memory 59 may include at least one type of storage medium.
  • the terminal 50 may cooperate with a network storage device that performs the storage function of the memory 59 through a network connection.
  • the controller 57 generally controls the overall operation of the terminal.
  • the controller 57 may include a multimedia module for reproducing or playing back multimedia data.
  • the controller 57 may perform a pattern recognition process to recognize a handwriting input or a picture drawing input performed on the touch screen as characters or images.
  • the power supply unit 51 receives external power or internal power under the control of the controller 57 and provides appropriate power required to operate each element and component.
  • Various embodiments of the video feature comparison method proposed by the present disclosure may be implemented in a computer-readable medium using, for example, computer software, hardware, or any combination thereof.
  • For hardware implementation, various embodiments of the video feature comparison method proposed in the present disclosure can be implemented using at least one of an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a digital signal processing device (DSPD), a programmable logic device (PLD), a field-programmable gate array (FPGA), a processor, a controller, a microcontroller, a microprocessor, or an electronic unit designed to perform the functions described herein.
  • Various embodiments of the video feature comparison method proposed in the present disclosure may be implemented in the controller 57.
  • various embodiments of the video feature comparison method proposed by the present disclosure can be implemented with a separate software module that allows at least one function or operation to be performed.
  • the software codes may be implemented by a software application (or program) written in any suitable programming language, and the software codes may be stored in the memory 59 and executed by the controller 57.
  • an "or" used in an enumeration of items beginning with "at least one" indicates a disjunctive enumeration, so that, for example, an enumeration of "at least one of A, B, or C" means A or B or C, or AB or AC or BC, or ABC (i.e., A and B and C).
  • the word "exemplary” does not mean that the described example is preferred or better than other examples.
  • each component or each step can be decomposed and/or recombined.

Abstract

The present disclosure discloses a video floating paper detection method, a video floating paper detection device, a video floating paper detection hardware device, and a computer-readable storage medium. The video floating paper detection method includes: performing floating paper detection on at least one frame of to-be-detected picture extracted from a to-be-detected video, the floating paper being a sub-display window inserted into the to-be-detected video and unrelated to the content of the to-be-detected video; and determining, according to the detection result of the at least one frame of to-be-detected picture, whether the to-be-detected video contains floating paper. The embodiments of the present disclosure first perform floating paper detection on at least one frame of to-be-detected picture extracted from a to-be-detected video, where the floating paper is a sub-display window inserted into the to-be-detected video and unrelated to the content of the to-be-detected video, and then determine, according to the detection result of the at least one frame of to-be-detected picture, whether the to-be-detected video contains floating paper, which can improve video classification accuracy.

Description

Video floating paper detection method and device, and computer-readable storage medium
Cross-reference
The present disclosure claims priority to the Chinese patent application No. 201811068698.5, titled "Video floating paper detection method and device, and computer-readable storage medium", filed on September 13, 2018, which is incorporated herein by reference in its entirety.
Technical Field
The present disclosure relates to the field of information processing technology, and in particular to a video floating paper detection method and device, and a computer-readable storage medium.
Background
In recent years, with the rapid development of multimedia technology and computer networks, the volume of digital video has been growing at an astonishing rate. Images captured from digital video often contain important textual information, which plays an important role in text-based retrieval of video databases; that is, to some extent it facilitates concise description and annotation of the main content of a video, video classification, or the identification of illegal videos.
In the prior art, video authors often add text to a video, such as advertising slogans or introductions. In general, such text is in a printed typeface, which existing recognition methods can identify relatively easily. In some scenarios, however, a video author adds a paper-slip effect to the video, with handwritten text on the paper. Since handwriting is usually scrawled and hard to recognize, it causes difficulties for video classification: the text often goes unrecognized, and the video is misclassified as a text-free video.
Therefore, for videos of this type, the floating paper they contain needs to be detected, and the textual information on the floating paper is then recognized manually. It is thus very important to detect accurately whether a video contains floating paper.
Summary
The technical problem solved by the present disclosure is to provide a video floating paper detection method, so as to at least partially solve the technical problem of inaccurate video classification. In addition, a video floating paper detection device, a video floating paper detection hardware device, a computer-readable storage medium, and a video floating paper detection terminal are also provided.
To achieve the above objective, according to one aspect of the present disclosure, the following technical solution is provided:
A video floating paper detection method, including:
performing floating paper detection on at least one frame of to-be-detected picture extracted from a to-be-detected video, the floating paper being a sub-display window inserted into the to-be-detected video and unrelated to the content of the to-be-detected video;
determining, according to the detection result of the at least one frame of to-be-detected picture, whether the to-be-detected video contains floating paper.
Further, the step of determining, according to the detection result of the at least one frame of to-be-detected picture, whether the to-be-detected video contains floating paper includes:
if it is detected that at least one frame of to-be-detected picture contains floating paper, determining that the to-be-detected video contains floating paper.
Further, the step of performing floating paper detection on at least one frame of to-be-detected picture extracted from the to-be-detected video includes:
for multiple frames of to-be-detected pictures, extracting the image features of each frame of to-be-detected picture;
comparing the image features of the frames of to-be-detected pictures, and if there are to-be-detected pictures containing the same image feature, determining that at least two frames of the to-be-detected pictures contain floating paper.
Further, the step of performing floating paper detection on at least one frame of to-be-detected picture extracted from the to-be-detected video includes:
for a single frame of to-be-detected picture, extracting feature points of the to-be-detected picture and neighboring feature points of the feature points;
determining feature regions according to the similarity between the feature points and the neighboring feature points;
if it is detected that the to-be-detected picture contains at least two feature regions, determining that the to-be-detected picture contains floating paper.
Further, the method also includes:
taking pictures known to contain floating paper and/or pictures known not to contain floating paper as training samples;
labeling the training samples according to whether they contain floating paper;
training on the labeled training samples with a deep learning classification algorithm to obtain an image classifier;
the step of performing floating paper detection on at least one frame of to-be-detected picture extracted from the to-be-detected video includes:
inputting the at least one frame of to-be-detected picture into the image classifier, and determining the detection result of the at least one frame of to-be-detected picture according to the classification result of the image classifier.
To achieve the above objective, according to another aspect of the present disclosure, the following technical solution is also provided:
A video floating paper detection device, including:
a floating paper detection module, configured to perform floating paper detection on at least one frame of to-be-detected picture extracted from a to-be-detected video, the floating paper being a sub-display window inserted into the to-be-detected video and unrelated to the content of the to-be-detected video;
a floating paper determination module, configured to determine, according to the detection result of the at least one frame of to-be-detected picture, whether the to-be-detected video contains floating paper.
Further, the floating paper determination module is specifically configured to: if it is detected that at least one frame of to-be-detected picture contains floating paper, determine that the to-be-detected video contains floating paper.
Further, the floating paper detection module is specifically configured to: for multiple frames of to-be-detected pictures, extract the image features of each frame of to-be-detected picture; compare the image features of the frames of to-be-detected pictures, and if there are to-be-detected pictures containing the same image feature, determine that at least two frames of the to-be-detected pictures contain floating paper.
Further, the floating paper detection module is specifically configured to: for a single frame of to-be-detected picture, extract feature points of the to-be-detected picture and neighboring feature points of the feature points; determine feature regions according to the similarity between the feature points and the neighboring feature points; and if it is detected that the to-be-detected picture contains at least two feature regions, determine that the to-be-detected picture contains floating paper.
Further, the device also includes:
an image classifier training module, configured to take pictures known to contain floating paper and/or pictures known not to contain floating paper as training samples; label the training samples according to whether they contain floating paper; and train on the labeled training samples with a deep learning classification algorithm to obtain an image classifier;
the floating paper detection module is specifically configured to: input the at least one frame of to-be-detected picture into the image classifier, and determine the detection result of the at least one frame of to-be-detected picture according to the classification result of the image classifier.
To achieve the above objective, according to yet another aspect of the present disclosure, the following technical solution is also provided:
A video floating paper detection hardware device, including:
a memory, configured to store non-transitory computer-readable instructions; and
a processor, configured to run the computer-readable instructions such that, when executed, the processor implements the steps described in any of the above video floating paper detection method solutions.
To achieve the above objective, according to yet another aspect of the present disclosure, the following technical solution is also provided:
A computer-readable storage medium, configured to store non-transitory computer-readable instructions which, when executed by a computer, cause the computer to perform the steps described in any of the above video floating paper detection method solutions.
To achieve the above objective, according to yet another aspect of the present disclosure, the following technical solution is also provided:
A video floating paper detection terminal, including any of the above video floating paper detection devices.
The embodiments of the present disclosure provide a video floating paper detection method, a video floating paper detection device, a video floating paper detection hardware device, a computer-readable storage medium, and a video floating paper detection terminal. The video floating paper detection method includes: performing floating paper detection on at least one frame of to-be-detected picture extracted from a to-be-detected video, the floating paper being a sub-display window inserted into the to-be-detected video and unrelated to the content of the to-be-detected video; and determining, according to the detection result of the at least one frame of to-be-detected picture, whether the to-be-detected video contains floating paper. The embodiments of the present disclosure first perform floating paper detection on at least one frame of to-be-detected picture extracted from a to-be-detected video, where the floating paper is a sub-display window inserted into the to-be-detected video and unrelated to the content of the to-be-detected video, and then determine, according to the detection result of the at least one frame of to-be-detected picture, whether the to-be-detected video contains floating paper, which can improve video classification accuracy.
The above description is only an overview of the technical solutions of the present disclosure. In order that the technical means of the present disclosure may be understood more clearly and implemented in accordance with the contents of the specification, and in order to make the above and other objectives, features, and advantages of the present disclosure more apparent and easier to understand, preferred embodiments are described in detail below with reference to the accompanying drawings.
Brief Description of the Drawings
FIG. 1a is a schematic flowchart of a video floating paper detection method according to an embodiment of the present disclosure;
FIG. 1b is a schematic flowchart of a video floating paper detection method according to another embodiment of the present disclosure;
FIG. 1c is a schematic flowchart of a video floating paper detection method according to another embodiment of the present disclosure;
FIG. 1d is a schematic flowchart of a video floating paper detection method according to another embodiment of the present disclosure;
FIG. 1e is a schematic flowchart of a video floating paper detection method according to another embodiment of the present disclosure;
FIG. 2a is a schematic structural diagram of a video floating paper detection device according to an embodiment of the present disclosure;
FIG. 2b is a schematic structural diagram of a video floating paper detection device according to another embodiment of the present disclosure;
FIG. 3 is a schematic structural diagram of a video floating paper detection hardware device according to an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of a computer-readable storage medium according to an embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of a video floating paper detection terminal according to an embodiment of the present disclosure.
Detailed Description
The implementations of the present disclosure are described below through specific concrete examples, and those skilled in the art can easily understand other advantages and effects of the present disclosure from the contents disclosed in this specification. Obviously, the described embodiments are only some of the embodiments of the present disclosure, not all of them. The present disclosure can also be implemented or applied through other different specific implementations, and the details in this specification can also be modified or changed in various ways based on different viewpoints and applications without departing from the spirit of the present disclosure. It should be noted that, in the absence of conflict, the following embodiments and the features in the embodiments may be combined with each other. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present disclosure without creative work fall within the protection scope of the present disclosure.
It should be noted that various aspects of the embodiments within the scope of the appended claims are described below. It should be apparent that the aspects described herein can be embodied in a wide variety of forms, and any specific structure and/or function described herein is merely illustrative. Based on the present disclosure, those skilled in the art should understand that one aspect described herein can be implemented independently of any other aspect, and two or more of these aspects can be combined in various ways. For example, any number of the aspects set forth herein may be used to implement a device and/or practice a method. In addition, such a device may be implemented and/or such a method may be practiced using structures and/or functionality other than one or more of the aspects set forth herein.
It should also be noted that the illustrations provided in the following embodiments only illustrate the basic concept of the present disclosure in a schematic manner. The drawings show only components related to the present disclosure rather than being drawn according to the number, shape, and size of components in actual implementation; the type, quantity, and proportion of each component in actual implementation may be changed arbitrarily, and the component layout may also be more complex.
In addition, in the following description, specific details are provided to facilitate a thorough understanding of the examples. However, those skilled in the art will understand that the aspects can be practiced without these specific details.
To solve the technical problem of inaccurate video classification, an embodiment of the present disclosure provides a video floating paper detection method. As shown in FIG. 1a, the video floating paper detection method mainly includes the following steps S1 to S2.
Step S1: Perform floating paper detection on at least one frame of to-be-detected picture extracted from a to-be-detected video, where the floating paper is a sub-display window inserted into the to-be-detected video and unrelated to the content of the to-be-detected video.
The to-be-detected picture may be one frame or multiple frames. When there are multiple frames, each single frame is detected separately, or detection is performed by comparing the multiple frames.
The sub-display window includes, but is not limited to, inserted advertisements, pornographic information, or handwritten textual information.
Step S2: Determine, according to the detection result of the at least one frame of to-be-detected picture, whether the to-be-detected video contains floating paper.
The detection result includes, but is not limited to: only one frame of to-be-detected picture contains floating paper, multiple frames of to-be-detected pictures contain floating paper, or no to-be-detected picture contains floating paper.
In this embodiment, floating paper detection is performed on at least one frame of to-be-detected picture extracted from a to-be-detected video, where the floating paper is a sub-display window inserted into the to-be-detected video and unrelated to the content of the to-be-detected video, and whether the to-be-detected video contains floating paper is then determined according to the detection result of the at least one frame of to-be-detected picture, which can improve video classification accuracy.
In an optional embodiment, as shown in FIG. 1b, step S2 specifically includes:
if it is detected that at least one frame of to-be-detected picture contains floating paper, determining that the to-be-detected video contains floating paper.
Specifically, when it is detected that only one frame of to-be-detected picture contains floating paper, or that multiple frames of to-be-detected pictures contain floating paper, it is determined that the to-be-detected video contains floating paper; otherwise, it is determined that the to-be-detected video does not contain floating paper.
In this embodiment, floating paper detection is performed on at least one frame of to-be-detected picture extracted from the to-be-detected video, the floating paper being a sub-display window inserted into the to-be-detected video and unrelated to its content; if it is detected that at least one frame of to-be-detected picture contains floating paper, it is determined that the to-be-detected video contains floating paper, which can improve video classification accuracy.
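The frame-to-video decision rule of this embodiment can be sketched as follows. This is a minimal illustration rather than the patented implementation; `frame_contains_floating_paper` is a hypothetical stand-in for whichever per-frame detector (feature matching or the image classifier) is used.

```python
def video_contains_floating_paper(frames, frame_contains_floating_paper):
    """Return True if any extracted frame is detected to contain floating paper.

    frames: iterable of to-be-detected pictures (any representation).
    frame_contains_floating_paper: per-frame detector callable returning
    True/False (a hypothetical stand-in for the detectors described in
    the embodiments).
    """
    return any(frame_contains_floating_paper(f) for f in frames)

# Example with a dummy detector that flags frames carrying an "overlay" mark:
frames = [{"overlay": False}, {"overlay": True}, {"overlay": False}]
print(video_contains_floating_paper(frames, lambda f: f["overlay"]))  # True
```

The rule is deliberately one-sided: a single positive frame suffices, while "does not contain" requires every extracted frame to test negative.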
In an optional embodiment, as shown in FIG. 1c, step S1 specifically includes:
S11: For multiple frames of to-be-detected pictures, extract the image features of each frame of to-be-detected picture.
The image features may be feature points of the to-be-detected picture, or feature regions of the to-be-detected picture.
S12: Compare the image features of the frames of to-be-detected pictures; if there are to-be-detected pictures containing the same image feature, determine that at least two frames of the to-be-detected pictures contain floating paper.
Specifically, when the image features are feature points of the to-be-detected pictures, the shape context features and scale-invariant feature transform (SIFT) features of the feature points are extracted, and the similarity of the feature points between the multiple frames of to-be-detected pictures is compared according to the shape context features and the SIFT features, so as to obtain a matching result of the similarity of the feature points between the to-be-detected pictures and thereby a matched feature region; that feature region is the same image feature. This example can be used to detect the case in which the position of the floating paper changes from frame to frame of the video.
When the image features are feature regions of the to-be-detected pictures, whether the frames of to-be-detected pictures contain the same feature region is compared, which may specifically be determined by pixel matching or by computing the similarity of the feature regions. This example can be used to detect the case in which the position of the floating paper is fixed across the frames of the video.
In this embodiment, the image features of each frame of to-be-detected picture are extracted and compared; if there are to-be-detected pictures containing the same image feature, it is determined that at least two frames of the to-be-detected pictures contain floating paper, and therefore that the to-be-detected video contains floating paper, which can improve video classification accuracy.
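The cross-frame comparison above can be illustrated with a small sketch. It is a simplified stand-in: real SIFT or shape-context descriptors are replaced by short lists of floats, and cosine similarity against an assumed threshold plays the role of the feature-point similarity matching.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two descriptor vectors of equal length.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def frames_share_feature(frame_descriptors, threshold=0.95):
    """Return True if two different frames contain near-identical descriptors.

    frame_descriptors: list of frames, each a list of descriptor vectors.
    In practice these would be SIFT / shape-context descriptors; plain
    float lists and the 0.95 threshold are illustrative assumptions.
    """
    for i in range(len(frame_descriptors)):
        for j in range(i + 1, len(frame_descriptors)):
            for d1 in frame_descriptors[i]:
                for d2 in frame_descriptors[j]:
                    if cosine_similarity(d1, d2) >= threshold:
                        return True  # same image feature found in two frames
    return False

# Two frames sharing one (hypothetical) overlay descriptor:
frames = [
    [[1.0, 0.0, 0.2], [0.3, 0.9, 0.1]],
    [[0.99, 0.01, 0.21], [0.0, 0.1, 1.0]],
]
print(frames_share_feature(frames))  # True
```

A brute-force all-pairs scan like this is quadratic in the number of descriptors; a production matcher would use an approximate nearest-neighbor index instead.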
In an optional embodiment, as shown in FIG. 1d, step S1 specifically includes:
S13: For a single frame of to-be-detected picture, extract feature points of the to-be-detected picture and neighboring feature points of the feature points.
The feature points may be SIFT feature points.
S14: Determine feature regions according to the similarity between the feature points and the neighboring feature points.
S15: If it is detected that the to-be-detected picture contains at least two feature regions, determine that the to-be-detected picture contains floating paper.
Specifically, given the characteristics of the video under detection, the pixels within a single frame are highly correlated, whereas an inserted floating paper is usually unrelated to the video content, and its pixels differ greatly from those of the extracted frame. Based on these characteristics, for a single frame of to-be-detected picture, the feature points of the picture and their neighboring feature points can be extracted, and feature regions can be determined according to the similarity between the feature points and their neighboring feature points; if the to-be-detected picture is detected to contain at least two feature regions, it is determined that the picture contains floating paper.
In this embodiment, the feature points of the to-be-detected picture and the neighboring feature points of the feature points are extracted, and feature regions are determined according to the similarity between the feature points and the neighboring feature points; if the to-be-detected picture is detected to contain at least two feature regions, it is determined that the to-be-detected picture contains floating paper, and therefore that the to-be-detected video contains floating paper, which can improve video classification accuracy.
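The single-frame region grouping of steps S13 to S15 can be sketched roughly as follows, under simplifying assumptions: each feature point is reduced to a position plus a scalar `appearance` value standing in for a real descriptor, and the distance and similarity thresholds are illustrative, not taken from the disclosure.

```python
def group_feature_regions(points, max_dist=2.0, max_appearance_diff=0.2):
    """Greedily group feature points into regions.

    points: list of (x, y, appearance) tuples. Two points join the same
    region when they are spatial neighbours AND their appearance is
    similar, mirroring the "similarity between a feature point and its
    neighboring feature points" criterion.
    """
    regions = []
    unassigned = list(points)
    while unassigned:
        seed = unassigned.pop(0)
        region = [seed]
        changed = True
        while changed:  # grow the region until no more points join it
            changed = False
            for p in list(unassigned):
                for q in region:
                    near = (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 <= max_dist ** 2
                    similar = abs(p[2] - q[2]) <= max_appearance_diff
                    if near and similar:
                        region.append(p)
                        unassigned.remove(p)
                        changed = True
                        break
        regions.append(region)
    return regions

def picture_contains_floating_paper(points):
    # At least two distinct feature regions => floating paper present.
    return len(group_feature_regions(points)) >= 2

# Background cluster around appearance ~0.5, overlay cluster around ~3.0:
pts = [(0, 0, 0.5), (1, 0, 0.55), (0, 1, 0.5), (10, 10, 3.0), (11, 10, 3.05)]
print(picture_contains_floating_paper(pts))  # True
```

The intuition matches the paragraph above: background pixels cluster into one coherent region, while an inserted overlay forms a second region with a markedly different appearance.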
In an optional embodiment, as shown in FIG. 1e, the method of this embodiment also includes:
S3: Take pictures known to contain floating paper and/or pictures known not to contain floating paper as training samples.
S4: Label the training samples according to whether they contain floating paper.
Specifically, before training, each picture needs to be labeled in order to distinguish pictures containing floating paper from pictures not containing floating paper. For example, pictures containing floating paper are labeled 1, and pictures not containing floating paper are labeled 0.
S5: Train on the labeled training samples with a deep learning classification algorithm to obtain an image classifier.
The classification algorithms that may be used include, but are not limited to, any of the following: the naive Bayes algorithm, artificial neural network algorithms, genetic algorithms, the K-nearest neighbor (KNN) classification algorithm, clustering algorithms, and so on.
Step S1 then specifically includes:
inputting the at least one frame of to-be-detected picture into the image classifier, and determining the detection result of the at least one frame of to-be-detected picture according to the classification result of the image classifier.
In this embodiment, an image classifier is trained, the at least one frame of to-be-detected picture is input into the image classifier, and the detection result of the at least one frame of to-be-detected picture is determined according to the classification result of the image classifier; whether the to-be-detected video contains floating paper is then determined according to the detection result of the at least one frame of to-be-detected picture, which can improve video classification accuracy.
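Since the K-nearest-neighbor (KNN) algorithm is among the options listed, a minimal sketch of classifying a frame's feature vector against the labeled training samples of steps S3 and S4 might look like the following; the 2-D feature vectors are hypothetical hand-picked values, not real image descriptors.

```python
import math
from collections import Counter

def knn_classify(train_samples, query, k=3):
    """Minimal K-nearest-neighbor classifier over labeled feature vectors.

    train_samples: list of (feature_vector, label) pairs, where label is
    1 for "contains floating paper" and 0 for "does not", following the
    labeling convention of step S4.
    """
    dists = sorted(
        (math.dist(vec, query), label) for vec, label in train_samples
    )
    # Majority vote among the k nearest training samples.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Labeled training samples (hypothetical 2-D features):
train = [([0.1, 0.1], 0), ([0.2, 0.0], 0), ([0.9, 0.8], 1), ([1.0, 0.9], 1)]
print(knn_classify(train, [0.95, 0.85], k=3))  # 1
```

A frame classified as 1 is then treated as a positive detection and fed into the per-video decision of step S2.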
Those skilled in the art should understand that obvious variations (for example, combinations of the enumerated modes) or equivalent replacements can also be made on the basis of the above embodiments.
Although the steps in the above video floating paper detection method embodiments are described in the above order, those skilled in the art should understand that the steps in the embodiments of the present disclosure are not necessarily performed in that order; they may also be performed in other orders such as reverse, parallel, or interleaved order. Moreover, on the basis of the above steps, those skilled in the art may also add other steps. These obvious variations or equivalent replacements should also fall within the protection scope of the present disclosure and are not repeated here.
The following are device embodiments of the present disclosure, which can be used to perform the steps implemented by the method embodiments of the present disclosure. For ease of description, only the parts related to the embodiments of the present disclosure are shown; for specific technical details not disclosed, please refer to the method embodiments of the present disclosure.
To solve the technical problem of how to improve the user experience, an embodiment of the present disclosure provides a video floating paper detection device. The device can perform the steps in the above video floating paper detection method embodiments. As shown in FIG. 2a, the device mainly includes a floating paper detection module 21 and a floating paper determination module 22. The floating paper detection module 21 is configured to perform floating paper detection on at least one frame of to-be-detected picture extracted from a to-be-detected video, the floating paper being a sub-display window inserted into the to-be-detected video and unrelated to the content of the to-be-detected video; the floating paper determination module 22 is configured to determine, according to the detection result of the at least one frame of to-be-detected picture, whether the to-be-detected video contains floating paper.
The to-be-detected picture may be one frame or multiple frames. When there are multiple frames, each single frame is detected separately, or detection is performed by comparing the multiple frames.
The sub-display window includes, but is not limited to, inserted advertisements, pornographic information, or handwritten textual information.
The detection result includes, but is not limited to: only one frame of to-be-detected picture contains floating paper, multiple frames of to-be-detected pictures contain floating paper, or no to-be-detected picture contains floating paper.
In this embodiment, the floating paper detection module 21 performs floating paper detection on at least one frame of to-be-detected picture extracted from a to-be-detected video, where the floating paper is a sub-display window inserted into the to-be-detected video and unrelated to the content of the to-be-detected video, and the floating paper determination module 22 then determines, according to the detection result of the at least one frame of to-be-detected picture, whether the to-be-detected video contains floating paper, which can improve video classification accuracy.
In an optional embodiment, based on FIG. 2a, the floating paper determination module 22 is specifically configured to: if it is detected that at least one frame of to-be-detected picture contains floating paper, determine that the to-be-detected video contains floating paper.
Specifically, when the floating paper detection module 21 detects that only one frame of to-be-detected picture contains floating paper, or that multiple frames of to-be-detected pictures contain floating paper, the floating paper determination module 22 determines that the to-be-detected video contains floating paper; otherwise, it determines that the to-be-detected video does not contain floating paper.
In this embodiment, the floating paper detection module 21 performs floating paper detection on at least one frame of to-be-detected picture extracted from the to-be-detected video, the floating paper being a sub-display window inserted into the to-be-detected video and unrelated to its content; if the floating paper determination module 22 detects that at least one frame of to-be-detected picture contains floating paper, it determines that the to-be-detected video contains floating paper, which can improve video classification accuracy.
In an optional embodiment, based on FIG. 2a, the floating paper detection module 21 is specifically configured to: for multiple frames of to-be-detected pictures, extract the image features of each frame of to-be-detected picture; compare the image features of the frames of to-be-detected pictures, and if there are to-be-detected pictures containing the same image feature, determine that at least two frames of the to-be-detected pictures contain floating paper.
The image features may be feature points of the to-be-detected picture, or feature regions of the to-be-detected picture.
Specifically, when the image features are feature points of the to-be-detected pictures, the shape context features and SIFT features of the feature points are extracted, and the similarity of the feature points between the multiple frames of to-be-detected pictures is compared according to the shape context features and the SIFT features, so as to obtain a matching result of the similarity of the feature points between the to-be-detected pictures and thereby a matched feature region; that feature region is the same image feature. This example can be used to detect the case in which the position of the floating paper changes from frame to frame of the video.
When the image features are feature regions of the to-be-detected pictures, whether the frames of to-be-detected pictures contain the same feature region is compared, which may specifically be determined by pixel matching or by computing the similarity of the feature regions. This example can be used to detect the case in which the position of the floating paper is fixed across the frames of the video.
In this embodiment, the floating paper detection module 21 extracts the image features of each frame of to-be-detected picture and compares them; if there are to-be-detected pictures containing the same image feature, it determines that at least two frames of the to-be-detected pictures contain floating paper, whereby the floating paper determination module 22 determines that the to-be-detected video contains floating paper, which can improve video classification accuracy.
In an optional embodiment, based on FIG. 2a, the floating paper detection module 21 is specifically configured to: for a single frame of to-be-detected picture, extract feature points of the to-be-detected picture and neighboring feature points of the feature points; determine feature regions according to the similarity between the feature points and the neighboring feature points; and if it is detected that the to-be-detected picture contains at least two feature regions, determine that the to-be-detected picture contains floating paper.
The feature points may be SIFT feature points.
Specifically, given the characteristics of the video under detection, the pixels within a single frame are highly correlated, whereas an inserted floating paper is usually unrelated to the video content, and its pixels differ greatly from those of the extracted frame. Based on these characteristics, for a single frame of to-be-detected picture, the feature points of the picture and their neighboring feature points can be extracted, and feature regions can be determined according to the similarity between the feature points and their neighboring feature points; if the to-be-detected picture is detected to contain at least two feature regions, it is determined that the picture contains floating paper.
In this embodiment, the floating paper detection module 21 extracts the feature points of the to-be-detected picture and the neighboring feature points of the feature points, and determines feature regions according to the similarity between the feature points and the neighboring feature points; if the to-be-detected picture is detected to contain at least two feature regions, the floating paper determination module 22 determines that the to-be-detected picture contains floating paper, and therefore that the to-be-detected video contains floating paper, which can improve video classification accuracy.
In an optional embodiment, as shown in FIG. 2b, the device of this embodiment also includes an image classifier training module 23. The image classifier training module 23 is configured to take pictures known to contain floating paper and/or pictures known not to contain floating paper as training samples; label the training samples according to whether they contain floating paper; and train on the labeled training samples with a deep learning classification algorithm to obtain an image classifier.
The floating paper detection module 21 is specifically configured to: input the at least one frame of to-be-detected picture into the image classifier, and determine the detection result of the at least one frame of to-be-detected picture according to the classification result of the image classifier.
Specifically, before training, the image classifier training module 23 needs to label each picture in order to distinguish pictures containing floating paper from pictures not containing floating paper. For example, pictures containing floating paper are labeled 1, and pictures not containing floating paper are labeled 0.
The classification algorithms that may be used include, but are not limited to, any of the following: the naive Bayes algorithm, artificial neural network algorithms, genetic algorithms, the K-nearest neighbor (KNN) classification algorithm, clustering algorithms, and so on.
In this embodiment, the image classifier training module 23 trains an image classifier, the at least one frame of to-be-detected picture is input into the image classifier, and the detection result of the at least one frame of to-be-detected picture is determined according to the classification result of the image classifier; the floating paper determination module 22 then determines, according to the detection result of the at least one frame of to-be-detected picture, whether the to-be-detected video contains floating paper, which can improve video classification accuracy.
For detailed descriptions of the working principle and technical effects of the video floating paper detection device embodiments, refer to the relevant descriptions in the above video floating paper detection method embodiments, which are not repeated here.
FIG. 3 is a hardware block diagram illustrating a video floating paper detection hardware device according to an embodiment of the present disclosure. As shown in FIG. 3, the video floating paper detection hardware device 30 according to an embodiment of the present disclosure includes a memory 31 and a processor 32.
The memory 31 is configured to store non-transitory computer-readable instructions. Specifically, the memory 31 may include one or more computer program products, and the computer program products may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), hard disks, flash memory, and the like.
The processor 32 may be a central processing unit (CPU) or another form of processing unit with data processing capability and/or instruction execution capability, and may control other components in the video floating paper detection hardware device 30 to perform desired functions. In an embodiment of the present disclosure, the processor 32 is configured to run the computer-readable instructions stored in the memory 31 so that the video floating paper detection hardware device 30 performs all or part of the steps of the video floating paper detection methods of the embodiments of the present disclosure described above.
Those skilled in the art should understand that, in order to solve the technical problem of how to obtain a good user experience, this embodiment may also include well-known structures such as a communication bus and interfaces; these well-known structures should also fall within the protection scope of the present disclosure.
For a detailed description of this embodiment, refer to the corresponding descriptions in the above embodiments, which are not repeated here.
FIG. 4 is a schematic diagram illustrating a computer-readable storage medium according to an embodiment of the present disclosure. As shown in FIG. 4, the computer-readable storage medium 40 according to an embodiment of the present disclosure has non-transitory computer-readable instructions 41 stored thereon. When the non-transitory computer-readable instructions 41 are run by a processor, all or part of the steps of the video feature comparison methods of the embodiments of the present disclosure described above are performed.
The above computer-readable storage medium 40 includes, but is not limited to: optical storage media (for example, CD-ROM and DVD), magneto-optical storage media (for example, MO), magnetic storage media (for example, magnetic tape or removable hard disk), media with built-in rewritable non-volatile memory (for example, memory card), and media with built-in ROM (for example, ROM cartridge).
For a detailed description of this embodiment, refer to the corresponding descriptions in the above embodiments, which are not repeated here.
FIG. 5 is a schematic diagram illustrating the hardware structure of a terminal according to an embodiment of the present disclosure. As shown in FIG. 5, the video floating paper detection terminal 50 includes the above video floating paper detection device embodiment.
The terminal may be implemented in various forms. The terminal in the present disclosure may include, but is not limited to, mobile terminals such as mobile phones, smartphones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), navigation devices, in-vehicle terminals, in-vehicle display terminals, and in-vehicle electronic rear-view mirrors, as well as fixed terminals such as digital TVs and desktop computers.
As an equivalent alternative implementation, the terminal may also include other components. As shown in FIG. 5, the video floating paper detection terminal 50 may include a power supply unit 51, a wireless communication unit 52, an A/V (audio/video) input unit 53, a user input unit 54, a sensing unit 55, an interface unit 56, a controller 57, an output unit 58, a memory 59, and so on. FIG. 5 shows a terminal having various components, but it should be understood that not all of the illustrated components are required to be implemented; more or fewer components may be implemented instead.
The wireless communication unit 52 allows radio communication between the terminal 50 and a wireless communication system or network. The A/V input unit 53 is configured to receive audio or video signals. The user input unit 54 may generate key input data according to commands input by the user to control various operations of the terminal. The sensing unit 55 detects the current state of the terminal 50, the position of the terminal 50, the presence or absence of a user's touch input to the terminal 50, the orientation of the terminal 50, the acceleration or deceleration movement and direction of the terminal 50, and the like, and generates commands or signals for controlling the operation of the terminal 50. The interface unit 56 serves as an interface through which at least one external device can be connected to the terminal 50. The output unit 58 is configured to provide output signals in a visual, audio, and/or tactile manner. The memory 59 may store software programs for the processing and control operations performed by the controller 57, or may temporarily store data that has been output or is to be output. The memory 59 may include at least one type of storage medium. Moreover, the terminal 50 may cooperate with a network storage device that performs the storage function of the memory 59 via a network connection. The controller 57 generally controls the overall operation of the terminal. In addition, the controller 57 may include a multimedia module for reproducing or playing back multimedia data. The controller 57 may perform pattern recognition processing to recognize handwriting input or picture-drawing input performed on the touch screen as characters or images. The power supply unit 51 receives external power or internal power under the control of the controller 57 and provides the appropriate power required to operate the elements and components.
Various implementations of the video feature comparison methods proposed by the present disclosure may be implemented in a computer-readable medium using, for example, computer software, hardware, or any combination thereof. For hardware implementation, the various implementations of the video feature comparison methods proposed by the present disclosure may be implemented using at least one of an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a digital signal processing device (DSPD), a programmable logic device (PLD), a field-programmable gate array (FPGA), a processor, a controller, a microcontroller, a microprocessor, or an electronic unit designed to perform the functions described herein; in some cases, the various implementations may be implemented in the controller 57. For software implementation, the various implementations may be implemented with a separate software module that allows at least one function or operation to be performed. The software code may be implemented as a software application (or program) written in any suitable programming language; the software code may be stored in the memory 59 and executed by the controller 57.
For a detailed description of this embodiment, refer to the corresponding descriptions in the above embodiments, which are not repeated here.
The basic principles of the present disclosure have been described above in conjunction with specific embodiments. However, it should be pointed out that the merits, advantages, effects, and the like mentioned in the present disclosure are merely examples and not limitations; these merits, advantages, and effects should not be regarded as necessarily possessed by every embodiment of the present disclosure. In addition, the specific details disclosed above are only for the purpose of illustration and ease of understanding, not limitation; the above details do not limit the present disclosure to being implemented with those specific details.
The block diagrams of devices, apparatuses, equipment, and systems involved in the present disclosure are only illustrative examples and are not intended to require or imply that they must be connected, arranged, or configured in the manner shown in the block diagrams. As those skilled in the art will recognize, these devices, apparatuses, equipment, and systems may be connected, arranged, or configured in any manner. Words such as "include", "comprise", and "have" are open-ended words that mean "including but not limited to" and may be used interchangeably with it. The words "or" and "and" as used here refer to the word "and/or" and may be used interchangeably with it, unless the context clearly indicates otherwise. The word "such as" as used here refers to the phrase "such as but not limited to" and may be used interchangeably with it.
In addition, as used here, an "or" used in an enumeration of items beginning with "at least one" indicates a disjunctive enumeration, so that, for example, an enumeration of "at least one of A, B, or C" means A or B or C, or AB or AC or BC, or ABC (i.e., A and B and C). Furthermore, the word "exemplary" does not mean that the described example is preferred or better than other examples.
It should also be pointed out that, in the systems and methods of the present disclosure, each component or each step can be decomposed and/or recombined. These decompositions and/or recombinations should be regarded as equivalent solutions of the present disclosure.
Various changes, substitutions, and alterations to the techniques described here may be made without departing from the techniques taught by the appended claims. Furthermore, the scope of the claims of the present disclosure is not limited to the specific aspects of the processes, machines, manufacture, compositions of events, means, methods, and actions described above. Processes, machines, manufacture, compositions of events, means, methods, or actions currently existing or to be developed later that perform substantially the same function or achieve substantially the same result as the corresponding aspects described here may be utilized. Accordingly, the appended claims include within their scope such processes, machines, manufacture, compositions of events, means, methods, or actions.
The above description of the disclosed aspects is provided to enable any person skilled in the art to make or use the present disclosure. Various modifications to these aspects will be readily apparent to those skilled in the art, and the general principles defined here may be applied to other aspects without departing from the scope of the present disclosure. Therefore, the present disclosure is not intended to be limited to the aspects shown here, but is to be accorded the widest scope consistent with the principles and novel features disclosed here.
The above description has been given for the purposes of illustration and description. Furthermore, this description is not intended to limit the embodiments of the present disclosure to the forms disclosed here. Although a number of example aspects and embodiments have been discussed above, those skilled in the art will recognize certain variations, modifications, changes, additions, and sub-combinations thereof.

Claims (12)

  1. A video floating paper detection method, characterized by comprising:
    performing floating paper detection on at least one frame of to-be-detected picture extracted from a to-be-detected video, the floating paper being a sub-display window inserted into the to-be-detected video and unrelated to the content of the to-be-detected video;
    determining, according to the detection result of the at least one frame of to-be-detected picture, whether the to-be-detected video contains floating paper.
  2. The method according to claim 1, wherein the step of determining, according to the detection result of the at least one frame of to-be-detected picture, whether the to-be-detected video contains floating paper comprises:
    if it is detected that at least one frame of to-be-detected picture contains floating paper, determining that the to-be-detected video contains floating paper.
  3. The method according to claim 1, wherein the step of performing floating paper detection on at least one frame of to-be-detected picture extracted from the to-be-detected video comprises:
    for multiple frames of to-be-detected pictures, extracting image features of each frame of to-be-detected picture;
    comparing the image features of the frames of to-be-detected pictures, and if there are to-be-detected pictures containing the same image feature, determining that at least two frames of the to-be-detected pictures contain floating paper.
  4. The method according to claim 1, wherein the step of performing floating paper detection on at least one frame of to-be-detected picture extracted from the to-be-detected video comprises:
    for a single frame of to-be-detected picture, extracting feature points of the to-be-detected picture and neighboring feature points of the feature points;
    determining feature regions according to the similarity between the feature points and the neighboring feature points;
    if it is detected that the to-be-detected picture contains at least two feature regions, determining that the to-be-detected picture contains floating paper.
  5. The method according to claim 1, wherein the method further comprises:
    taking pictures known to contain floating paper and/or pictures known not to contain floating paper as training samples;
    labeling the training samples according to whether they contain floating paper;
    training on the labeled training samples with a deep learning classification algorithm to obtain an image classifier;
    the step of performing floating paper detection on at least one frame of to-be-detected picture extracted from the to-be-detected video comprising:
    inputting the at least one frame of to-be-detected picture into the image classifier, and determining the detection result of the at least one frame of to-be-detected picture according to the classification result of the image classifier.
  6. A video floating paper detection device, characterized by comprising:
    a floating paper detection module, configured to perform floating paper detection on at least one frame of to-be-detected picture extracted from a to-be-detected video, the floating paper being a sub-display window inserted into the to-be-detected video and unrelated to the content of the to-be-detected video;
    a floating paper determination module, configured to determine, according to the detection result of the at least one frame of to-be-detected picture, whether the to-be-detected video contains floating paper.
  7. The device according to claim 6, wherein the floating paper determination module is specifically configured to: if it is detected that at least one frame of to-be-detected picture contains floating paper, determine that the to-be-detected video contains floating paper.
  8. The device according to claim 6, wherein the floating paper detection module is specifically configured to: for multiple frames of to-be-detected pictures, extract image features of each frame of to-be-detected picture; compare the image features of the frames of to-be-detected pictures, and if there are to-be-detected pictures containing the same image feature, determine that at least two frames of the to-be-detected pictures contain floating paper.
  9. The device according to claim 6, wherein the floating paper detection module is specifically configured to: for a single frame of to-be-detected picture, extract feature points of the to-be-detected picture and neighboring feature points of the feature points; determine feature regions according to the similarity between the feature points and the neighboring feature points; and if it is detected that the to-be-detected picture contains at least two feature regions, determine that the to-be-detected picture contains floating paper.
  10. The device according to claim 6, wherein the device further comprises:
    an image classifier training module, configured to take pictures known to contain floating paper and/or pictures known not to contain floating paper as training samples; label the training samples according to whether they contain floating paper; and train on the labeled training samples with a deep learning classification algorithm to obtain an image classifier;
    the floating paper detection module being specifically configured to: input the at least one frame of to-be-detected picture into the image classifier, and determine the detection result of the at least one frame of to-be-detected picture according to the classification result of the image classifier.
  11. A video floating paper detection hardware device, comprising:
    a memory, configured to store non-transitory computer-readable instructions; and
    a processor, configured to run the computer-readable instructions such that, when executed, the processor implements the video floating paper detection method according to any one of claims 1-5.
  12. A computer-readable storage medium, configured to store non-transitory computer-readable instructions which, when executed by a computer, cause the computer to perform the video floating paper detection method according to any one of claims 1-5.
PCT/CN2018/117711 2018-09-13 2018-11-27 Video floating paper detection method and device, and computer-readable storage medium WO2020052082A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811068698.5 2018-09-13
CN201811068698.5A CN109064494B (zh) 2018-09-13 2018-09-13 Video floating paper detection method and device, and computer-readable storage medium

Publications (1)

Publication Number Publication Date
WO2020052082A1 true WO2020052082A1 (zh) 2020-03-19

Family

ID=64761493

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/117711 WO2020052082A1 (zh) 2018-09-13 2018-11-27 视频漂浮纸片检测方法、装置和计算机可读存储介质

Country Status (2)

Country Link
CN (1) CN109064494B (zh)
WO (1) WO2020052082A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111738173A (zh) * 2020-06-24 2020-10-02 Beijing QIYI Century Science & Technology Co., Ltd. Video clip detection method and apparatus, electronic device, and storage medium

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170316285A1 (en) * 2016-04-28 2017-11-02 International Business Machines Corporation Detection of objects in images using region-based convolutional neural networks
CN107948640A (zh) * 2017-12-19 2018-04-20 Baidu Online Network Technology (Beijing) Co., Ltd. Video playback testing method and apparatus, electronic device, and storage medium
CN108038850A (zh) * 2017-12-08 2018-05-15 Tianjin University Automatic detection method for drainage pipeline anomaly types based on deep learning

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101453575B (zh) * 2007-12-05 2010-07-21 Institute of Computing Technology, Chinese Academy of Sciences Video subtitle information extraction method
CN100595780C (zh) * 2007-12-13 2010-03-24 Hefei Institutes of Physical Science, Chinese Academy of Sciences Automatic handwritten digit recognition method based on modular neural networks
US9202137B2 (en) * 2008-11-13 2015-12-01 Google Inc. Foreground object detection from multiple images
CN101448100B (zh) * 2008-12-26 2011-04-06 Xi'an Jiaotong University Fast and accurate video subtitle extraction method
CN101853398B (zh) * 2010-05-11 2012-07-04 Zhejiang University Chinese paper-cut recognition method based on spatially constrained feature selection and combination
CN103186780B (zh) * 2011-12-30 2018-01-26 LG Electronics (China) R&D Center Co., Ltd. Video subtitle recognition method and device
CN104966097B (zh) * 2015-06-12 2019-01-18 Chengdu Shulian Mingpin Technology Co., Ltd. Complex character recognition method based on deep learning
CN105184226A (zh) * 2015-08-11 2015-12-23 Beijing Xinchen Yangguang Technology Co., Ltd. Digit recognition method and device, and neural network training method and device
CN105718861B (zh) * 2016-01-15 2019-06-07 Beijing Bohui Technology Co., Ltd. Method and device for identifying the category of video stream data
CN107679552A (zh) * 2017-09-11 2018-02-09 Beijing Feisou Technology Co., Ltd. Scene classification method and system based on multi-branch training
CN108288077A (zh) * 2018-04-17 2018-07-17 Tianjin Hehuo Energy Saving Technology Co., Ltd. Waste paper classifier building device and method, and waste paper classification system and method

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170316285A1 (en) * 2016-04-28 2017-11-02 International Business Machines Corporation Detection of objects in images using region-based convolutional neural networks
CN108038850A (zh) * 2017-12-08 2018-05-15 Tianjin University Automatic detection method for drainage pipeline anomaly types based on deep learning
CN107948640A (zh) * 2017-12-19 2018-04-20 Baidu Online Network Technology (Beijing) Co., Ltd. Video playback testing method and apparatus, electronic device, and storage medium

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111738173A (zh) * 2020-06-24 2020-10-02 Beijing QIYI Century Science & Technology Co., Ltd. Video clip detection method and apparatus, electronic device, and storage medium

Also Published As

Publication number Publication date
CN109064494B (zh) 2021-09-21
CN109064494A (zh) 2018-12-21

Similar Documents

Publication Publication Date Title
WO2020052084A1 (zh) Video cover selection method and device, and computer-readable storage medium
TWI462035B (zh) 物件偵測後設資料
US8792722B2 (en) Hand gesture detection
US8750573B2 (en) Hand gesture detection
JP5510167B2 (ja) ビデオ検索システムおよびそのためのコンピュータプログラム
US11749020B2 (en) Method and apparatus for multi-face tracking of a face effect, and electronic device
US20170017844A1 (en) Image content providing apparatus and image content providing method
WO2020211624A1 (zh) Object tracking method, tracking processing method, corresponding devices, and electronic device
KR102402511B1 (ko) 영상 검색 방법 및 이를 위한 장치
WO2020052083A1 (zh) Method and device for identifying infringing pictures, and computer-readable storage medium
Escalante et al. A naive bayes baseline for early gesture recognition
EP2291722A1 (en) Method, apparatus and computer program product for providing gesture analysis
US9715638B1 (en) Method and apparatus for identifying salient subimages within a panoramic image
TW201546636A (zh) 註解顯示器輔助裝置及輔助方法
US9082184B2 (en) Note recognition and management using multi-color channel non-marker detection
WO2019105457A1 (zh) Image processing method, computer device, and computer-readable storage medium
US20150154718A1 (en) Information processing apparatus, information processing method, and computer-readable medium
Lim et al. Scene recognition with camera phones for tourist information access
US20160104052A1 (en) Text-based thumbnail generation
WO2019137259A1 (zh) Image processing method and device, storage medium, and electronic device
CN110619656A Face detection and tracking method and device based on binocular cameras, and electronic device
WO2020052085A1 (zh) Video text detection method and device, and computer-readable storage medium
US8498978B2 (en) Slideshow video file detection
CN110309324A Search method and related device
WO2020052082A1 (zh) Video floating paper detection method and device, and computer-readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18933371

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 24.06.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 18933371

Country of ref document: EP

Kind code of ref document: A1