CN112419132A - Video watermark detection method and device, electronic equipment and storage medium


Info

Publication number
CN112419132A
CN112419132A (application number CN202011225956.3A)
Authority
CN
China
Prior art keywords
watermark
video
image
video image
detection result
Prior art date
Legal status
Pending
Application number
CN202011225956.3A
Other languages
Chinese (zh)
Inventor
陈广
王雷
张波
苏正航
Current Assignee
Guangzhou Overseas shoulder sub network technology Co.,Ltd.
Original Assignee
Guangzhou Huaduo Network Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Guangzhou Huaduo Network Technology Co Ltd
Priority to CN202011225956.3A
Publication of CN112419132A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 1/00 General purpose image data processing
    • G06T 1/0021 Image watermarking
    • G06T 1/0085 Time domain based watermarking, e.g. watermarks spread over several images
    • G06T 7/00 Image analysis
    • G06T 7/0002 Inspection of images, e.g. flaw detection
    • G06T 2201/00 General purpose image data processing
    • G06T 2201/005 Image watermarking
    • G06T 2201/0065 Extraction of an embedded watermark; Reliable detection
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 Image acquisition modality
    • G06T 2207/10016 Video; Image sequence

Abstract

The application discloses a video watermark detection method and device, electronic equipment and a storage medium, wherein the method comprises the following steps: detecting a watermark in a video to be processed by using a pre-trained target detection network to obtain a watermark detection result for each frame of video image in the video; acquiring, according to the watermark detection results, a target video image in which no watermark is detected in the video to be processed; judging, according to the watermark detection results, whether a watermark is detected in an adjacent video image; when the adjacent video image detects a watermark, acquiring the watermark detection result of the adjacent video image as the watermark detection result of the target video image; and regenerating the watermark detection result of each frame of video image in the video to be processed according to the watermark detection result of the target video image. After the target video image in which no watermark is detected is obtained, the watermark detection result of the adjacent video image can be used as the watermark detection result of the target video image, so that missed detection of watermarked frames in the video is avoided.

Description

Video watermark detection method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of video processing technologies, and in particular, to a video watermark detection method, apparatus, electronic device, and storage medium.
Background
In the age of rapid development of multimedia technology, a large number of video files are produced. In some video files, producers add watermarks to the videos for advertising purposes, or to protect the copyright of the videos, track infringement, and the like. However, watermarked videos tend to degrade the viewing experience of the audience, and a video distributor does not want to spread other people's watermarks as part of the video content while distributing the video. Therefore, in some cases, watermark detection needs to be performed on a video so that the watermark can be removed when it is detected. However, current watermark detection methods have low accuracy, the watermark in the video is easily left partially unremoved, and the user's viewing experience of the video is consequently poor.
Disclosure of Invention
The embodiment of the application provides a video watermark detection method, a video watermark detection device, electronic equipment and a storage medium, and the accuracy of video watermark detection can be improved.
In a first aspect, an embodiment of the present application provides a video watermark detection method, where the method includes: detecting a watermark in a video to be processed by utilizing a pre-trained target detection network to obtain a watermark detection result of each frame of video image in the video to be processed; acquiring a target video image without the watermark detected in the video to be processed according to the watermark detection result; judging whether a watermark is detected in an adjacent video image according to the watermark detection result, wherein the adjacent video image is a video frame image adjacent to the target video image in the video to be processed; when the adjacent video images detect watermarks, acquiring watermark detection results of the adjacent video images as watermark detection results of the target video images; and regenerating the watermark detection result of each frame of video image in the video to be processed according to the watermark detection result of the target video image.
In a second aspect, an embodiment of the present application provides a video watermark detection apparatus, including: the target detection module is used for detecting the watermark of the video to be processed by utilizing a pre-trained target detection network to obtain the watermark detection result of each frame of video image in the video to be processed; the target acquisition module is used for acquiring a target video image of which the watermark is not detected in the video to be processed according to the watermark detection result; the adjacent judgment module is used for judging whether an adjacent video image detects a watermark or not according to the watermark detection result, wherein the adjacent video image is a video frame image adjacent to the target video image in the video to be processed; the result copying module is used for acquiring a watermark detection result of the adjacent video image as a watermark detection result of the target video image when the adjacent video image detects the watermark; and the result generation module is used for regenerating the watermark detection result of each frame of video image in the video to be processed according to the watermark detection result of the target video image.
In a third aspect, an embodiment of the present application provides an electronic device, including: a memory; one or more processors coupled with the memory; one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more programs configured to perform the video watermark detection method provided by the first aspect.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, where a program code is stored in the computer-readable storage medium, and the program code may be called by a processor to execute the video watermark detection method provided in the first aspect.
According to the video watermark detection method, the video watermark detection device, the electronic equipment and the storage medium, a pre-trained target detection network is used for detecting a watermark in a video to be processed, and a watermark detection result of each frame of video image in the video to be processed is obtained; acquiring a target video image without the watermark detected in the video to be processed according to the watermark detection result; judging whether a watermark is detected in an adjacent video image according to the watermark detection result, wherein the adjacent video image is a video frame image adjacent to the target video image in the video to be processed; when the adjacent video images detect watermarks, acquiring watermark detection results of the adjacent video images as watermark detection results of the target video images; and regenerating the watermark detection result of each frame of video image in the video to be processed according to the watermark detection result of the target video image. Therefore, the method and the device detect the watermark in the video through the pre-trained target detection network, can automatically position the watermark position in the video, do not need to manually search for a watermark area, and can use the continuity of video image frames and the smaller variability of the watermark position in adjacent frame video images to use the watermark detection result of the adjacent frame video images of the target video images as the watermark detection result of the target video images when the target video images without detected watermarks exist in the watermark detection result of the video obtained through the target detection network, thereby avoiding the problem of missing detection of the watermark in one or more frames of video images in the video and improving the accuracy of video watermark detection.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 shows a flowchart of a video watermark detection method according to an embodiment of the present application.
Fig. 2 is a flowchart illustrating a video watermark detection method according to another embodiment of the present application.
Fig. 3 shows another flowchart of a video watermark detection method according to another embodiment of the present application.
Fig. 4 shows a watermark sample diagram of a video watermark detection method provided by the present application.
Fig. 5 shows a flowchart of step S220 in the video watermark detection method according to the embodiment of the present application.
Fig. 6 shows a schematic flowchart of a video watermark detection method according to another embodiment of the present application.
Fig. 7 is a flowchart illustrating a video watermark detection method according to another embodiment of the present application.
Fig. 8 is a schematic overall flow chart illustrating a video watermark detection method according to an embodiment of the present application.
Fig. 9 shows a schematic diagram of watermark detection effect in the video watermark detection method according to the embodiment of the application.
Fig. 10 shows schematic effects before and after watermark restoration in a video watermark detection method according to an embodiment of the present application.
Fig. 11 shows a block diagram of a video watermark detection apparatus according to an embodiment of the present application.
Fig. 12 shows a block diagram of an electronic device according to an embodiment of the present application.
Fig. 13 illustrates a storage unit according to an embodiment of the present application, configured to store or carry program code for implementing a video watermark detection method according to an embodiment of the present application.
Detailed Description
In order to make the technical solutions better understood by those skilled in the art, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application.
Current watermark detection methods are typically fixed-position watermark detection and usually require the watermark region to be manually specified in advance. For example, for a watermark fixed at the lower right corner of a video, the watermark region may be manually pre-designated as the lower right corner region of the video, so that watermark detection and watermark removal can be performed directly on that region. However, for videos with complications such as changing watermark positions or changing watermark forms, such a watermark detection method cannot effectively detect the watermark and is prone to missed detection, so its detection accuracy is not high.
To overcome the foregoing drawbacks, embodiments of the present application provide a video watermark detection method, apparatus, electronic device, and storage medium. Watermarks in a video can be automatically located through a pre-trained target detection network, which improves the accuracy of watermark detection; meanwhile, through inter-frame smoothing, the watermark detection results of adjacent images can be copied to target images in which no watermark is detected, so that the appearance of one or more frames with missed watermarks, caused by the limited detection recall rate of the target detection network, is avoided, further improving the accuracy of video watermark detection. The scheme is described in detail below by way of specific embodiments.
Referring to fig. 1, fig. 1 is a schematic flowchart illustrating a video watermark detection method according to an embodiment of the present application, which can be applied to an electronic device, and the video watermark detection method may include:
step S110: and detecting the watermark in the video to be processed by utilizing a pre-trained target detection network to obtain a watermark detection result of each frame of video image in the video to be processed.
The video to be processed may be a video that needs to be subjected to watermark detection processing, and the source and format of the video are not limited, but are not listed here. Such as available for download from a local or network. In some embodiments, the video to be processed may be video without a watermark or video with one or more watermarks. When the watermark is added to the video, the watermark may be added to each frame of video image, or only to the image in a certain period of time of the video, and the added watermark may be a watermark whose number, shape, size, position and posture (such as rotation angle, etc.) are all fixed, or a dynamic watermark whose number, shape, size, position and posture are all not fixed (such as a "jittering" watermark of a short video platform). The specific video type and the duration, type and number of watermarks added to the video are not limited in the embodiments of the present application.
In this embodiment of the application, the video to be processed may be input to a pre-trained target detection network, so as to perform watermark detection on each frame of video image in the video through the target detection network, and then after the target detection network outputs a watermark detection result of each frame of video image, the electronic device may obtain the watermark detection result of each frame of video image in the video to be processed.
The target detection network may detect a specific target by applying a deep learning algorithm, i.e., a target detection algorithm, where the target is a watermark in this embodiment of the application. The target detection algorithm may be a target detection algorithm based on an image segmentation technique, a target detection algorithm based on image feature matching, a frequency domain-based method, and the like, which are not limited herein.
In some embodiments, the target detection network may be a two-stage detection network, such as R-CNN (Region-based Convolutional Neural Networks), Fast R-CNN, Faster R-CNN, and the like; a two-stage detection network generates a large number of candidate regions using methods such as an RPN (Region Proposal Network) or selective search, and generally has relatively high detection accuracy. In other embodiments, the target detection network may also be a one-stage detection network, such as a RetinaNet network, SSD (Single Shot MultiBox Detector), YOLO (You Only Look Once), and the like; a one-stage detection network can directly generate target regions on the basis of multi-scale anchors and often has a faster detection speed, which is not limited in this embodiment. For example, the target detection network of the present application may be a RetinaNet network, which has a relatively fast detection speed.
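As an illustration only, a minimal sketch of per-frame watermark detection with a RetinaNet-style detector is given below; the use of torchvision, the score threshold, and the assumption of a single "watermark" class are not from the patent, which only names RetinaNet as one possible network and trains it on the samples described later.

```python
# Sketch (assumptions): torchvision RetinaNet backbone, one foreground class,
# score threshold 0.5. The patent does not prescribe a specific framework.
import torch
import torchvision


def build_detector(num_classes=2):
    # num_classes counts background plus the "watermark" class
    model = torchvision.models.detection.retinanet_resnet50_fpn(
        weights=None, num_classes=num_classes)
    model.eval()
    return model


@torch.no_grad()
def detect_watermarks(model, frames, score_thresh=0.5):
    """frames: list of HxWx3 uint8 arrays; returns a list of box lists per frame."""
    results = []
    for frame in frames:
        img = torch.from_numpy(frame).permute(2, 0, 1).float() / 255.0
        out = model([img])[0]
        keep = out["scores"] >= score_thresh
        results.append(out["boxes"][keep].tolist())  # empty list => no watermark found
    return results
```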
In some embodiments, the pre-trained target detection network may be obtained by pre-training the neural network model according to a plurality of first training samples. The first training sample may include a first image sample and a watermark marking sample corresponding to the first image sample, where the watermark marking sample may include specific location information of a watermark in the first image sample. Therefore, the watermark of each frame of video image in the video to be processed can be identified and detected according to the pre-trained target detection network, namely the pre-trained target detection network can be used for outputting the watermark detection result corresponding to the video image according to the input video image. The watermark detection result may include position information of the watermark in the video image, which may be presented in the form of a detection frame or in the form of coordinate values, and is not limited here. In some embodiments, the first image samples in the first training samples may be images of watermarks at different positions to ensure the watermark location identification effect of the target detection network.
Step S120: and acquiring a target video image without the watermark detected in the video to be processed according to the watermark detection result.
Since the target detection network has a recall rate (that is, the ratio of the number of images in which a watermark is detected to the total number of images that actually carry a watermark; an ideal recall rate of 100% indicates that no image is missed), when the target detection network is used to detect the watermark in each frame of video image in the video to be processed, the watermarks in one or more frames may be missed. For example, if a 20 s video has 500 frames in total, every frame carries a watermark, and the recall rate of the target detection network is 99%, then on average the watermarks in about 5 frames will be missed. For the played video, even if only one frame still carries a watermark, playback brings a bad impression to the user. Therefore, in the embodiment of the application, in order to prevent the watermark in one or more frames from being missed, an inter-frame smoothing process flow may be added after the watermark detection results of the video output by the target detection network are obtained, so as to ensure that the watermark can be detected in every frame of video image, improve the effective recall rate of the target detection network, and avoid missed detection of watermarks in the video. The inter-frame smoothing process may be understood as determining the watermark detection result of the target image by using the watermark detection result of an image adjacent to the target image.
Specifically, in the embodiment of the present application, after a watermark detection result of each frame of video image in a to-be-processed video output by a target detection network is obtained, a video image in which a watermark is not detected in the to-be-processed video may be obtained as a target video image in the inter-frame smoothing process flow. That is to say, in the embodiment of the present application, any video image may not enter the inter-frame smoothing process flow, but only a video image in which a watermark is not detected may enter the inter-frame smoothing process flow, so that the execution steps of the processor are reduced, and the watermark detection efficiency is improved.
It can be understood that when each frame of video image in the video to be processed detects the watermark, it may be considered that the target detection network has not missed detection, and the inter-frame smoothing process flow of the present application may not be executed. And when the target video image without the watermark detected exists in the video to be processed, the target video image is likely to be the image which is missed by the target detection network, and the inter-frame smoothing processing flow of the application can be executed.
In some embodiments, when the target detection network does not detect the watermark, a preset detection result may be output, so that whether each frame of video image is the target video image in which the watermark is not detected may be determined by respectively determining whether the watermark detection result of each frame of video image is the preset detection result. The preset detection result may be a watermark position information set whose element is empty, or may be preset characters such as "none", "no watermark detected", and the like, which is not limited herein.
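A small sketch of step S120 follows, assuming each frame's detection result is stored as a list of boxes (as in the sketch above); an empty list plays the role of the "preset detection result" meaning no watermark was detected.

```python
# Sketch: find the target video images in which no watermark was detected.
def find_undetected_frames(per_frame_boxes):
    """Return indices of frames whose detection result is empty."""
    return [i for i, boxes in enumerate(per_frame_boxes) if len(boxes) == 0]
```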
Step S130: and judging whether a watermark is detected in an adjacent video image according to the watermark detection result, wherein the adjacent video image is a video frame image adjacent to the target video image in the video to be processed.
In some scenarios, due to the limited time for the occurrence of the watermark, the video may not have the watermark from beginning to end, and therefore, when a target video image with no watermark detected exists in the video to be processed, the target video image may indeed have no watermark, and is not detected by the target detection network. Therefore, in the embodiment of the application, after a target video image in which a watermark is not detected in a video to be processed is acquired, whether a watermark is detected in an adjacent video image can be judged, wherein the adjacent video image is a video frame image adjacent to the target video image in the video to be processed, so that whether the target video image is an image missed for detection by a target detection network can be determined according to a watermark detection result of the adjacent video image. The adjacent video image may be a frame of video image adjacent to the target video image, or may be a plurality of frames of video images adjacent to the target video image, which is not limited herein.
It can be understood that if the adjacent video images also do not detect the watermark, it is likely that the watermark does not exist in the target video image, and it is likely that the target detection network fails to detect; if the adjacent video images detect the watermarks, the watermarks are likely to exist in the target video images, and the target detection network is likely to miss detection. Therefore, whether the target video image is the image which is missed to be detected by the target detection network can be determined by judging whether the adjacent video images detect the watermarks or not, so that the target video image which needs to enter the inter-frame smoothing processing flow can be further accurately identified, unnecessary operations of other video images are reduced, and the watermark detection efficiency is improved.
In some embodiments, each frame of video image in the video to be processed is sequenced according to time sequence, so that an adjacent video image corresponding to a time node adjacent to a target time node can be determined according to the target time node of the target video image in the video to be processed, and thus a watermark detection result of the adjacent video image can be obtained to determine whether a watermark is detected. In other embodiments, when the target detection network detects the watermark in the video to be processed, it is likely that each frame of video image in the video to be processed is input to the target detection network according to the time sequence for watermark detection, so that watermark detection results adjacent to the watermark detection results of the target detection network and the target video image before and after the watermark detection results of the target detection network and the target video image can also be directly obtained according to the detection sequence of the target video image in the target detection network, and the watermark detection results of the adjacent video images are used for confirming whether the watermark is detected. The foregoing description may be referred to for determining whether the correlation description of the watermark is detected according to the watermark detection result, and details are not described here.
Step S140: and when the adjacent video images detect the watermarks, acquiring the watermark detection results of the adjacent video images as the watermark detection results of the target video images.
In the embodiment of the application, when a watermark is detected in an adjacent video image adjacent to a target video image, and the watermark is not detected in the target video image, it can be considered that the watermark is likely to exist in the target video image, and the target video image is likely to be missed by a target detection network, so that the target video image can enter an inter-frame smoothing process flow. Specifically, the continuity of the video image frames and the smaller variability of the watermark positions in the video images of the adjacent frames can be utilized to directly obtain the watermark detection result of the adjacent video image adjacent to the target video image as the watermark detection result of the target video image, so that a watermark detection result with higher accuracy can be provided for the target video image missed to be detected by the target detection network, the target detection network does not need to be input again for re-detection, and the watermark detection efficiency is improved.
In some embodiments, when a target video image does not detect a watermark and an adjacent video image adjacent to the target video image also does not detect a watermark, it may be considered that a watermark does not exist in the target video image, which is likely not missed by the target detection network. Therefore, the inter-frame smoothing process flow does not need to be entered, that is, the step of obtaining the watermark detection result of the adjacent video image adjacent to the target video image as the watermark detection result of the target video image does not need to be executed.
Step S150: and regenerating the watermark detection result of each frame of video image in the video to be processed according to the watermark detection result of the target video image.
In the embodiment of the application, after the watermark detection result of the adjacent video image is used as the watermark detection result of the target video image, the watermark detection result of each frame of video image in the video to be processed can be regenerated according to the watermark detection result of the target video image, so that the watermark detection result of the video to be processed is updated, and the subsequent processing is facilitated. For example, according to the updated watermark detection result of the video to be processed, the watermark in the video to be processed is removed.
In some embodiments, the watermark detection result of each frame of video image in the video to be processed may be stored in a centralized manner according to a certain rule, such as in a matrix or set form, so that the watermark detection result of the adjacent video image may be directly copied, and the original watermark detection result of the target video image is replaced with the copied watermark detection result, so that a new watermark detection result of each frame of video image in the video to be processed may be regenerated. The specific watermark detection result updating method of the video to be processed is not determined here.
The video watermark detection method provided by the embodiment of the application detects watermarks in a video to be processed by utilizing a pre-trained target detection network, and obtains watermark detection results of each frame of video image in the video to be processed; acquiring a target video image without the watermark detected in the video to be processed according to the watermark detection result; judging whether a watermark is detected in an adjacent video image according to the watermark detection result, wherein the adjacent video image is a video frame image adjacent to the target video image in the video to be processed; when the adjacent video images detect watermarks, acquiring watermark detection results of the adjacent video images as watermark detection results of the target video images; and regenerating the watermark detection result of each frame of video image in the video to be processed according to the watermark detection result of the target video image. By utilizing the continuity of the video image frames and the smaller variability of the watermark positions in the adjacent frame video images, when the watermark is not detected in the target video image, the watermark detection result of the adjacent frame video image of the target video image is used as the watermark detection result of the target video image, so that the problem of missing detection of the watermark in one or more frames of video images in the video to be processed can be avoided, and the accuracy of video watermark detection is improved.
Referring to fig. 2, fig. 2 is a flowchart illustrating a video watermark detection method according to another embodiment of the present application, which can be applied to an electronic device, and the video watermark detection method can include:
step S210: and performing frame division processing on the video to be processed to obtain a video image sequence.
Because a video is formed by splicing frames of video images in time order, in some embodiments, after the video to be processed that needs watermark detection is acquired, the video may be framed by any of various existing video frame-extraction tools, so as to obtain the complete video image sequence of the video to be processed. The video image sequence can be understood as the continuous, time-ordered video image frame set {V_t | t = 1...n} generated after the video is decomposed into a plurality of video images.
For example, a 1-minute, 25 FPS video to be processed is decomposed into 1500 video image frames (1 minute × 60 seconds/minute × 25 frames per second). Here, FPS (Frames Per Second) is the number of picture frames per second. It should be noted that when the video to be processed is framed, the sampling rate (the number of video frames extracted per second) should not be smaller than the picture frame rate, so as to ensure as far as possible that all watermarks in the video to be processed are detected.
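A minimal sketch of the framing step is given below; the patent does not name a specific tool, so OpenCV's VideoCapture is used only as one possible choice.

```python
# Sketch: decompose a video into the ordered frame set {V_t | t = 1..n}.
import cv2


def split_into_frames(video_path):
    cap = cv2.VideoCapture(video_path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)  # BGR HxWx3 uint8 image for frame t
    cap.release()
    return frames
```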
Step S220: and detecting the watermark in the video image sequence by utilizing a pre-trained target detection network, and acquiring the watermark detection result of each frame of video image in the video image sequence.
In some embodiments, after obtaining a video image sequence after framing processing of a video to be processed, the video image sequence may be input to a pre-trained target detection network to perform watermark detection on each frame of video image in the video image sequence through the target detection network, and then after the target detection network outputs a watermark detection result of each frame of video image, the electronic device may obtain the watermark detection result of each frame of video image in the video image sequence. The target detection network trains a neural network in advance according to a first training sample, wherein the first training sample comprises a first image sample and a watermark marking sample corresponding to the first image sample.
In some embodiments, in order to ensure the generalization performance of the target detection network in the real complex short video application scene, the first image sample may be an image with a watermark at different positions, so as to ensure the watermark positioning identification effect of the target detection network.
As one approach, the first image sample may be an image of a randomly synthesized watermark at different locations. Specifically, referring to fig. 3, before step S220, the video watermark detection method of the present application may further include:
step S200: and acquiring a plurality of background samples and a plurality of watermark samples.
In some embodiments, multiple background samples and multiple watermark samples may be obtained before the first image samples of the present application are synthesized. A background sample may be any clean background picture without a watermark, and may be obtained in various ways such as from the internet, by downloading, or by reading locally, which is not limited here. A watermark sample may be any clean watermark image without a background; after a watermarked image with a background is obtained, it can be processed with an image processing tool such as Photoshop into a final clean, background-free watermark sample. For example, please refer to fig. 4, which shows an obtained set of background-free watermark samples {L_w | w = 1...m}, containing m watermarks in total.
In some embodiments, the watermark types in the watermark samples may be preset, so that the trained target detection network may detect only the watermarks of the preset watermark types. Therefore, a batch of watermarks needing to be removed can be determined in advance according to actual service requirements, and targeted watermark detection is realized.
Step S201: and randomly synthesizing the plurality of watermark samples and the plurality of background samples through a fusion algorithm to obtain a plurality of synthesized first image samples.
In some embodiments, after obtaining the plurality of background samples and the plurality of watermark samples, a first training sample set for training the target detection network of the present application may be generated. As one mode, the obtained multiple watermark samples and multiple background samples may be randomly synthesized by a fusion algorithm, so as to randomly synthesize the obtained background-free watermark sample to any position of the background sample, thereby obtaining multiple synthesized first image samples.
As one way, the fusion algorithm may be the Alpha-Blending fusion algorithm. So-called Alpha-Blending mixes a pixel of the background sample and a pixel of the watermark sample according to the value of an "Alpha" blending factor (which indicates how the pixel produces a special effect, i.e. the degree of semi-transparency). Specifically, the three RGB color components of the background-sample pixel and of the watermark-sample pixel may be separated; the three color components of the watermark pixel are each multiplied by the Alpha value, the three color components of the background pixel are each multiplied by the inverse value of Alpha, the results are added component by component, each resulting component is divided by the maximum value of Alpha, and finally the three color components are recombined into one output pixel. That is, the RGB values of the background pixel and the RGB values of the watermark pixel are mixed in proportion, yielding a blended RGB value.
The fusion formula of the fusion algorithm may be: I_merge = (1 - a) * I_bg + a * L_w, where I_bg represents a pixel value in the background sample, L_w represents a pixel value in the background-free watermark sample, I_merge represents the pixel value in the synthesized first image sample, and a represents the normalized Alpha value, i.e. Alpha/256, where Alpha generally takes a value from 0 to 255. Therefore, after the random fusion position is determined, the pixel values of the background sample and the watermark sample at the fusion position can be substituted into the formula for calculation.
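A sketch of this fusion at a random paste position follows; the fixed normalized alpha value and the assumption that the background is larger than the watermark are illustrative choices, not values from the patent.

```python
# Sketch: Alpha-Blending synthesis I_merge = (1 - a) * I_bg + a * L_w at a
# random position; returns the synthesized image and the synthesis position.
import random
import numpy as np


def synthesize_sample(background, watermark, alpha=0.8):
    bg = background.astype(np.float32)
    wm = watermark.astype(np.float32)
    h, w = wm.shape[:2]
    # assumes the background is at least as large as the watermark
    y = random.randint(0, bg.shape[0] - h)
    x = random.randint(0, bg.shape[1] - w)
    region = bg[y:y + h, x:x + w]
    bg[y:y + h, x:x + w] = (1 - alpha) * region + alpha * wm
    return bg.astype(np.uint8), (x, y, x + w, y + h)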
For example, if 300,000 clean background samples and 30 clean watermark samples are obtained, the 30 clean watermark samples may be randomly fused to arbitrary positions of the 300,000 clean background samples, so as to obtain 300,000 synthesized first image samples. In this way, each watermark sample is fused with roughly 10,000 background samples, enough images with watermarks at different positions can be obtained, and the detection precision of the target detection network is improved.
Step S202: and generating a watermark marking sample corresponding to each first image sample according to the synthesis position of the watermark sample in each first image sample.
In this embodiment of the application, after a plurality of synthesized first image samples are obtained, a watermark marking sample corresponding to each first image sample may be generated according to a synthesis position of a watermark sample in each first image sample. The watermark marking sample is used for representing position information of the watermark sample in the first image sample, and may be coordinate values of four corners of the watermark sample, or may be only coordinate values of opposite corners, such as coordinate values of the upper left corner and the lower right corner of the watermark sample, or a center point coordinate and a width and height value of the watermark sample, and the specific position information is not limited here, and only the position of the watermark sample needs to be determined.
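A small sketch of step S202 follows, turning the synthesis position returned above into a watermark marking sample in either of the label formats the text mentions; the dictionary keys are hypothetical names for illustration.

```python
# Sketch: build a watermark marking sample from the synthesis position.
def make_label(position, as_center=False):
    x1, y1, x2, y2 = position
    if as_center:  # center point plus width and height
        return {"cx": (x1 + x2) / 2, "cy": (y1 + y2) / 2,
                "w": x2 - x1, "h": y2 - y1}
    return {"x1": x1, "y1": y1, "x2": x2, "y2": y2}  # opposite-corner coordinates
```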
It can be understood that, by using the randomly synthesized first image samples with different watermarks at different positions as the training data set, the trained target detection network can obtain an accurate watermark detection result even for a video to be processed with complications such as changing watermark positions and changing watermark forms, thereby realizing accurate positioning of both dynamic and static watermarks.
Alternatively, in addition to the randomly synthesized first image samples described above, a batch of watermarked first image samples may be obtained directly. Specifically, a batch of real videos carrying the preset watermarks may be collected; after framing, the resulting video image sequences serve as first image samples, and the watermarked video images can be manually annotated with the watermark positions, thereby obtaining the watermark marking samples corresponding to the first image samples. In some embodiments, the randomly synthesized first image samples and the directly obtained first image samples may be used together as the training data set for training the target detection network, so that generalization ability can be obtained while the realism of the detection effect is ensured.
Since the watermark detection result output by the target detection network may have a problem of false detection, that is, the content in some detection boxes is not the watermark, in some embodiments, a classification network may be added to further reduce the false detection rate. Specifically, referring to fig. 5, after step S220, the video watermark detection method of the present application may further include:
step S221: and identifying the watermark detection result of each frame of video image by utilizing a pre-trained classification network, and acquiring the identification result of the watermark detection result of each frame of video image.
In some embodiments, after the watermark detection result of each frame of video image output by the target detection network is obtained, the watermark detection result of each frame may be input to a pre-trained classification network, so that watermark identification is performed on the watermark detection result of each frame through the classification network; after the classification network outputs the identification result of the watermark detection result of each frame of video image, the electronic device may obtain the identification result of the watermark detection result of each frame. The identification result may be used to characterize whether a watermark region in the watermark detection result output by the target detection network is really a watermark, that is, to determine whether false detections exist in the watermark detection result output by the target detection network.
In some embodiments, the classification network may be an efficient, lightweight ShuffleNet network based on deep learning, or may be a CNN (Convolutional Neural Network), which is not limited here and may be set reasonably according to actual scene requirements. The classification network may be obtained by training a neural network model in advance according to a large number of second training samples, where a second training sample may include a second image sample and a classification labeling sample corresponding to the second image sample. The classification labeling sample corresponding to the second image sample may be labeled only with whether or not the sample is a watermark, or may be labeled with a specific watermark type, which is not limited here.
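Purely as an illustration, a ShuffleNetV2 backbone from torchvision with a two-class head could serve as such a lightweight classifier; the specific variant and the binary watermark/non-watermark head are assumptions, not details from the patent.

```python
# Sketch (assumed): lightweight binary watermark classifier on ShuffleNetV2.
import torch
import torchvision


def build_classifier():
    model = torchvision.models.shufflenet_v2_x1_0(weights=None)
    model.fc = torch.nn.Linear(model.fc.in_features, 2)  # watermark / not watermark
    return model
```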
In some embodiments, in order to reduce workload and improve detection efficiency, the training sample data set of the classification network may be performed on the basis of the target detection network. Specifically, referring to fig. 6, before step S221, the video watermark detection method of the present application may further include:
step S2201: acquiring a watermark image in the first image sample.
In some embodiments, a first image sample of the first training samples may be obtained, so as to obtain the watermark image in it. As one mode, the region of the watermark image may be determined according to the watermark marking sample of the first image sample, and the watermark image may be cropped out of the first image sample, thereby obtaining the watermark image.
Step S2202: and carrying out boundary expansion on the watermark image, and acquiring the expanded watermark image as a second image sample.
Considering that watermark images in real scenes usually contain noise, and in order to increase the generalization capability of the classification network, after the watermark image is obtained, boundary expansion can be performed on the watermark image to obtain expanded watermark images of different sizes, which are used as second image samples, thereby improving the identification precision of the classification network. The boundary expansion may be an expansion of the watermark image by a small random size.
Step S2203: and generating a classification labeling sample corresponding to the second image sample according to the watermark image in the second image sample.
After the second image sample is obtained, the classification labeling sample corresponding to the second image sample can be generated according to the watermark image in the second image sample. The classification labeling sample may be generated according to the position coordinates of the watermark image and the watermark category, and may correspondingly include the position information and the type information of the watermark image.
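A sketch of steps S2201 to S2203 follows: the watermark region is cropped from a first image sample and its boundary is expanded by a small random margin to form a second image sample with its label; the margin range and label field names are assumptions for illustration.

```python
# Sketch: build a second image sample by cropping and boundary expansion.
import random


def make_second_sample(image, box, max_margin=10):
    x1, y1, x2, y2 = box
    m = random.randint(0, max_margin)          # small random expansion size
    h, w = image.shape[:2]
    x1e, y1e = max(0, x1 - m), max(0, y1 - m)
    x2e, y2e = min(w, x2 + m), min(h, y2 + m)
    crop = image[y1e:y2e, x1e:x2e]
    # label carries the watermark position inside the crop and its type
    label = {"box": (x1 - x1e, y1 - y1e, x2 - x1e, y2 - y1e), "type": "watermark"}
    return crop, label
```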
It can be understood that the training data set of the classification network is derived from the training data set of the target detection network, so that the trained classification network can obtain accurate watermark identification results while the workload of data collection is reduced.
Step S222: and determining the identification result in each frame of video image as the watermark detection result of the watermark as a new watermark detection result of each frame of video image.
It can be understood that after the watermark detection result of each frame of video image is identified through the classification network, the watermark detection result identified as the watermark can be obtained, and the watermark detection result identified as the non-watermark can also be obtained. The watermark detection result identified as the non-watermark is filtered to remove the watermark area wrongly detected by the target detection network, so that the watermark area correctly detected by the target detection network can be obtained, and the correctly detected watermark area is the watermark detection result of each frame of video image in the video to be processed. Specifically, the watermark detection result with the identification result being the watermark can be determined from the watermark detection results of all the video images, and the watermark detection result with the identification result being the watermark in each frame of video image is determined as the new watermark detection result of each frame of video image. Therefore, the target video image without the watermark detected in the video to be processed can be obtained according to the new watermark detection result of each frame of video image, namely, the subsequent interframe smoothing processing flow is carried out.
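A sketch of this filtering step follows; `classify` is a hypothetical helper standing in for the classification network, returning True when a cropped region is identified as a genuine watermark.

```python
# Sketch of step S222: keep only regions the classification network confirms.
def filter_detections(frame, boxes, classify):
    kept = []
    for (x1, y1, x2, y2) in boxes:
        crop = frame[int(y1):int(y2), int(x1):int(x2)]
        if classify(crop):
            kept.append((x1, y1, x2, y2))
    return kept  # new watermark detection result for this frame
```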
Step S230: and acquiring a target video image without the watermark detected in the video to be processed according to the watermark detection result.
Step S240: and judging whether a watermark is detected in an adjacent video image according to the watermark detection result, wherein the adjacent video image is a video frame image adjacent to the target video image in the video to be processed.
Step S250: and when the adjacent video images detect the watermarks, acquiring the watermark detection results of the adjacent video images as the watermark detection results of the target video images.
In the embodiment of the present application, steps S230 to S250 can refer to the foregoing embodiments, and are not described herein again.
In some embodiments, the adjacent video image may be a previous frame video image or a next frame video image adjacent to the target video image in the video to be processed, or both the previous frame video image and the next frame video image, which is not limited herein.
Specifically, when the adjacent video image is the previous frame of video image, forward smoothing is performed: suppose the watermark region set {bbox_k | k = 1...K} is detected in the i-th frame of video image, where bbox_k = {x1, y1, x2, y2}, (x1, y1) being the coordinates of the upper-left corner of the watermark region and (x2, y2) the coordinates of the lower-right corner. When no watermark is detected in the (i+1)-th frame of video image, the watermark detection result of the i-th frame, that is, the watermark region set {bbox_k | k = 1...K}, is copied to the (i+1)-th frame of video image as its watermark detection result.
Specifically, when the adjacent video image is the next frame of video image, backward smoothing is performed: suppose the watermark region set {bbox_k | k = 1...K} is detected in the (i+1)-th frame of video image. When no watermark is detected in the i-th frame of video image, the watermark detection result of the (i+1)-th frame, that is, the watermark region set {bbox_k | k = 1...K}, is copied to the i-th frame of video image as its watermark detection result, where i = 1...n-1.
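A compact sketch of this forward and backward smoothing follows, assuming the per-frame results are stored as lists of boxes, with an empty list meaning no watermark was detected.

```python
# Sketch of the inter-frame smoothing of steps S130-S150: a frame with no
# detected watermark borrows the result of an adjacent frame that did detect one.
def smooth_between_frames(per_frame_boxes):
    boxes = [list(b) for b in per_frame_boxes]
    for i in range(1, len(boxes)):            # forward smoothing (copy from previous frame)
        if not boxes[i] and boxes[i - 1]:
            boxes[i] = list(boxes[i - 1])
    for i in range(len(boxes) - 2, -1, -1):   # backward smoothing (copy from next frame)
        if not boxes[i] and boxes[i + 1]:
            boxes[i] = list(boxes[i + 1])
    return boxes
```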
Step S260: and regenerating the watermark detection result of each frame of video image in the video to be processed according to the watermark detection result of the target video image.
In the embodiment of the present application, step S260 may refer to the foregoing embodiments, and is not described herein again.
According to the video watermark detection method provided by the embodiment of the application, after the watermark in the video to be processed is detected by using the pre-trained target detection network and the watermark detection result of each frame of video image in the video to be processed is obtained, the pre-trained classification network can be used for identifying the watermark detection result of each frame of video image to obtain the identification result of the watermark detection result of each frame of video image, so that the negative influence caused by the false detection of the target detection network can be reduced. And then determining a new watermark detection result for each frame of video image according to the watermark detection result with the identification result of the watermark in each frame of video image, so as to obtain a target video image without the watermark detected in the video to be processed according to the new watermark detection result. And judging whether the adjacent video image detects the watermark or not according to the new watermark detection result, wherein the adjacent video image is a video frame image adjacent to the target video image in the video to be processed. When the adjacent video images detect the watermarks, acquiring watermark detection results of the adjacent video images as watermark detection results of the target video images; and regenerating the watermark detection result of each frame of video image in the video to be processed according to the watermark detection result of the target video image. By utilizing the continuity of the video image frames and the smaller variability of the watermark positions in the adjacent frame video images, when the watermark is not detected in the target video image, the watermark detection result of the adjacent frame video image of the target video image is used as the watermark detection result of the target video image, so that the problem of missing detection of the watermark in one or more frames of video images in the video to be processed can be avoided, and the accuracy of video watermark detection is improved.
Referring to fig. 7, fig. 7 is a flowchart illustrating a video watermark detection method according to another embodiment of the present application, which can be applied to an electronic device, and the video watermark detection method can include:
step S310: acquiring a standard watermark image to be detected;
although the watermark detection results of the adjacent video images of the target video image can be copied to the target video image through inter-frame smoothing processing, missing detection is avoided, and meanwhile, the influence of a wrong detection area is amplified. For example, the watermark detection results of the previous frame video image and the next frame video image of the target video image are copied to the target video image, and if the error detection areas exist in the previous frame video image and the next frame video image, the error detection areas of the target video image are doubled.
Therefore, in the embodiment of the present application, a false-detection-region removal process flow is added: the watermark detection result of each frame of video image can be compared with the standard watermark image to be detected, and when a watermark detection region differs from the standard watermark image, that region is a false detection region. Specifically, the standard watermark image to be detected may be acquired first. The standard watermark image to be detected can be a watermark image predetermined according to actual business requirements, so that whether the detected image of a watermark region is the watermark to be detected can be determined from the standard watermark image. In some embodiments, the standard watermark image to be detected may be the aforementioned clean, background-free watermark sample set {L_w | w = 1...m}.
Step S320: and acquiring the area image of each watermark area in the watermark detection result of each frame of video image.
In the embodiment of the present application, after the regenerated watermark detection result of each frame of video image in the video to be processed is obtained through inter-frame smoothing, the region image of each watermark region bbox_k in the watermark detection result of each frame can be acquired, so as to determine according to the region image whether it is a false detection. The region image containing the watermark region may be cropped out according to the coordinate information in the watermark detection result.
Step S330: and respectively calculating the similarity value of each region image in each frame of video image and the standard watermark image.
In the embodiment of the application, after the standard watermark image to be detected and the area image of each watermark area in the watermark detection result are acquired, the similarity value between each area image in each frame of video image and the standard watermark image can be respectively calculated, so as to determine whether the area image is the watermark image to be detected according to the similarity value, and thus, whether the area image is mistakenly detected can be determined.
In some embodiments, the standard watermark image and the area image may be scaled to a uniform size to ensure the accuracy of the result. As one mode, the area image may be scaled to a size consistent with the standard watermark image, the standard watermark image may be scaled to a size consistent with the area image, or the standard watermark image and the area image may be scaled to a preset size, which is not limited herein.
In some embodiments, the similarity between the region image and the standard watermark image may be determined using image-hash deduplication methods such as dHash, aHash and pHash, and the specific method is not limited here. For example, in the embodiment of the present application, the dHash algorithm, which was found in experimental testing to give a relatively good deduplication effect, may be used to calculate the similarity.
In some embodiments, when the similarity is calculated using the dHash method, the scaled region image and the standard watermark image may first be converted to grayscale, giving a grayed standard watermark image L'_w and a grayed region image bbox'_k. A difference array is then computed for each image: each pixel is compared row by row with the pixel to its right, and the difference bit is 1 when the pixel value is greater than or equal to its right neighbor and 0 otherwise; if each row of the image has p pixels, p-1 difference bits are generated per row. The difference bits are then converted into a hash value: each bit in the difference array is taken as one bit, every 8 bits form one hexadecimal value, and the hexadecimal values are concatenated into a string to obtain the final dHash value. Finally, the Hamming Distance between the dHash values of the region image and the standard watermark image is calculated, and the similarity value is determined from the Hamming distance: the larger the Hamming distance, the smaller the similarity value and the lower the similarity.
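A minimal sketch of this dHash comparison follows; it keeps the hash as a bit array rather than the hexadecimal string described above, and the 8x9 hash size is a conventional choice rather than a value from the patent.

```python
# Sketch: dHash of a region image and Hamming-distance-based similarity.
import cv2
import numpy as np


def dhash(image, size=8):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    small = cv2.resize(gray, (size + 1, size))
    diff = small[:, :-1] >= small[:, 1:]      # 1 when a pixel >= the pixel to its right
    return diff.flatten()


def hamming_similarity(region_img, standard_watermark):
    ha, hb = dhash(region_img), dhash(standard_watermark)
    distance = int(np.count_nonzero(ha != hb))
    return 1.0 - distance / ha.size           # larger distance => lower similarity
```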
Step S340: determining the watermark region corresponding to each region image whose similarity value is greater than a preset value in each frame of video image as the final watermark detection result of that frame of video image.
In the embodiment of the present application, after the similarity value between each region image in each frame of video image and the standard watermark image has been obtained, the region images whose similarity value is greater than the preset value can be selected in each frame of video image; such a region image can be regarded as the image of a correctly detected watermark region in the watermark detection result. Therefore, the watermark region corresponding to the region image can be determined as the final watermark detection result of the frame, that is, a final watermark detection result that is both free of missed detections and correct.
It can be understood that, when the similarity value is smaller than the preset value, the region image differs from the standard watermark image and is likely to be a watermark region erroneously detected by the target detection network, so the watermark region corresponding to a region image whose similarity value is smaller than the preset value can be removed from the original watermark detection result, thereby obtaining a final, correct watermark detection result.
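Building on the two hypothetical helpers sketched earlier (crop_watermark_regions and similarity), the false-detection filtering could be illustrated as below; the threshold value is an assumption, since the preset value is not specified in the text:

```python
SIMILARITY_THRESHOLD = 0.85  # assumption: the "preset value" would be tuned on validation data

def filter_false_detections(frame, detections, standard_watermark):
    """Keep only the watermark regions whose crop is similar enough to the standard watermark."""
    kept = []
    for box, region in zip(detections, crop_watermark_regions(frame, detections)):
        if similarity(region, standard_watermark) > SIMILARITY_THRESHOLD:
            kept.append(box)  # treated as a correctly detected watermark region
        # otherwise the region is treated as a false detection and removed from the result
    return kept
```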
In some embodiments, after determining the final correct watermark detection result for each frame of video image in the video to be processed, the electronic device may perform watermark removal on the video to be processed according to the watermark detection result. Specifically, after the falsely detected regions have been removed, the electronic device may perform interpolation over the retained watermark regions so as to fill them with pixels from outside the region, thereby achieving the effect of removing the watermark; the repaired video image frames may then be output and fused together to obtain a clean, watermark-free video. For example, referring to fig. 8, 9 and 10, fig. 8 shows an overall flowchart of video watermark detection provided by the present application, fig. 9 shows a schematic diagram of the effect of a watermark position detection result provided by the present application, where detection boxes 610 and 620 are the watermark positions obtained by detecting a video image 600, and fig. 10 shows a schematic diagram of the effect of video restoration after watermark removal provided by the present application; compared with fig. 9, the watermarks in the detection boxes 610 and 620 detected in fig. 9 have been removed in fig. 10.
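As one possible, non-authoritative way to realize the interpolation-based removal, OpenCV's inpainting can fill each detected region from the surrounding pixels. The sketch below assumes the same (x1, y1, x2, y2) box format as above and a fixed inpainting radius:

```python
import cv2
import numpy as np

def remove_watermarks(frame, detections):
    """Fill the detected watermark regions with pixels interpolated from outside the regions."""
    mask = np.zeros(frame.shape[:2], dtype=np.uint8)
    for (x1, y1, x2, y2) in detections:
        mask[int(y1):int(y2), int(x1):int(x2)] = 255  # mark the watermark region to repair
    # Telea inpainting reconstructs the masked area from its surroundings.
    return cv2.inpaint(frame, mask, 3, cv2.INPAINT_TELEA)
```

cv2.INPAINT_NS (Navier-Stokes based) would be an alternative flag; which interpolation the original method actually uses is not stated.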
According to the video watermark detection method provided by the embodiment of the application, after the inter-frame smoothing processing, in which the watermark detection result of an adjacent image is copied to a target image in which no watermark was detected and the watermark detection result of each frame of video image in the video to be processed is regenerated, adding a false-detection removal step reduces the amplification of erroneously detected regions caused by the inter-frame smoothing, improves the accuracy of the final watermark detection result, and thus improves the accuracy of video watermark detection.
Referring to fig. 11, fig. 11 is a block diagram illustrating the structure of a video watermark detection apparatus 400 according to an embodiment of the present application, where the video watermark detection apparatus 400 is applied to an electronic device. The video watermark detection apparatus 400 includes: a target detection module 410, a target acquisition module 420, an adjacent judgment module 430, a result copying module 440, and a result generation module 450. The target detection module 410 is configured to detect a watermark in a video to be processed by using a pre-trained target detection network, and obtain a watermark detection result of each frame of video image in the video to be processed; the target acquisition module 420 is configured to acquire, according to the watermark detection result, a target video image in the video to be processed in which no watermark is detected; the adjacent judgment module 430 is configured to judge, according to the watermark detection result, whether a watermark is detected in an adjacent video image, where the adjacent video image is a video frame image adjacent to the target video image in the video to be processed; the result copying module 440 is configured to, when a watermark is detected in the adjacent video image, acquire the watermark detection result of the adjacent video image as the watermark detection result of the target video image; and the result generation module 450 is configured to regenerate the watermark detection result of each frame of video image in the video to be processed according to the watermark detection result of the target video image.
In some embodiments, the target detection module 410 may be specifically configured to: performing frame processing on a video to be processed to obtain a video image sequence; detecting the watermark in the video image sequence by using a pre-trained target detection network, and obtaining a watermark detection result of each frame of video image in the video image sequence, wherein the target detection network is obtained by training a neural network in advance according to a first training sample, and the first training sample comprises a first image sample and a watermark marking sample corresponding to the first image sample.
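Purely as an illustration of the frame-splitting step (the original text does not prescribe an implementation), a sketch using OpenCV might look like this; the function name video_to_frames and the choice to hold all frames in memory are assumptions:

```python
import cv2

def video_to_frames(video_path):
    """Split the video to be processed into a sequence of frame images."""
    capture = cv2.VideoCapture(video_path)
    frames = []
    while True:
        ok, frame = capture.read()
        if not ok:  # no more frames to read
            break
        frames.append(frame)
    capture.release()
    return frames
```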
In some embodiments, the video watermark detection apparatus 400 may further include: a sample acquisition module, configured to acquire a plurality of background samples and a plurality of watermark samples; a sample synthesis module, configured to randomly synthesize the plurality of watermark samples and the plurality of background samples through a fusion algorithm to obtain a plurality of synthesized first image samples; and a sample labeling module, configured to generate, according to the synthesis position of the watermark sample in each first image sample, a watermark labeling sample corresponding to the first image sample.
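A minimal sketch of such a synthesis step is shown below, assuming the watermark sample carries an alpha channel, that alpha blending is the fusion algorithm used, and that the background is larger than the watermark; the original text does not specify the fusion details, and the bounding box returned here would serve as the watermark labeling sample:

```python
import random
import numpy as np

def synthesize_sample(background, watermark_rgba):
    """Paste a watermark (with alpha channel) at a random position on a background image,
    returning the composite first image sample and its bounding-box label (x1, y1, x2, y2)."""
    bh, bw = background.shape[:2]
    wh, ww = watermark_rgba.shape[:2]
    # Assumes bw >= ww and bh >= wh, i.e. the watermark fits inside the background.
    x1 = random.randint(0, bw - ww)
    y1 = random.randint(0, bh - wh)
    alpha = watermark_rgba[:, :, 3:4].astype(np.float32) / 255.0
    roi = background[y1:y1 + wh, x1:x1 + ww].astype(np.float32)
    blended = alpha * watermark_rgba[:, :, :3].astype(np.float32) + (1.0 - alpha) * roi
    composite = background.copy()
    composite[y1:y1 + wh, x1:x1 + ww] = blended.astype(np.uint8)
    return composite, (x1, y1, x1 + ww, y1 + wh)
```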
In some embodiments, the video watermark detection apparatus 400 may further include: a result identification module, configured to identify the watermark detection result of each frame of video image by using a pre-trained classification network and obtain the identification result of the watermark detection result of each frame of video image, where the identification result is used to represent whether a watermark region in the watermark detection result is a watermark, the classification network is obtained by training a neural network in advance according to a second training sample, and the second training sample includes a second image sample and a classification labeling sample corresponding to the second image sample; and a result judging module, configured to determine, as a new watermark detection result of each frame of video image, the watermark detection results in the frame whose identification result indicates a watermark.
In this embodiment, the target obtaining module 420 may be specifically configured to: and acquiring a target video image without the watermark detected in the video to be processed according to the new watermark detection result.
Further, in some embodiments, the video watermark detection apparatus 400 may further include: the watermark acquisition module is used for acquiring a watermark image in the first image sample; the watermark expansion module is used for carrying out boundary expansion on the watermark image and acquiring the expanded watermark image as a second image sample; and the watermark marking module is used for generating a classification marking sample corresponding to the second image sample according to the watermark image in the second image sample.
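For illustration only, the boundary expansion could be a fixed-margin crop around the labeled watermark box; the margin value, the box format, and the function name below are assumptions rather than details from the disclosure:

```python
def expand_watermark(image_sample, box, margin=20):
    """Crop the watermark from a first image sample with an extra border,
    so the resulting second image sample keeps some surrounding context.
    `box` is assumed to be (x1, y1, x2, y2); `margin` is an assumed pixel value."""
    h, w = image_sample.shape[:2]
    x1, y1, x2, y2 = box
    x1, y1 = max(0, x1 - margin), max(0, y1 - margin)
    x2, y2 = min(w, x2 + margin), min(h, y2 + margin)
    return image_sample[y1:y2, x1:x2]
```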
In some embodiments, the neighboring video images in the neighboring determination module 430 may be: and a previous frame video image or a next frame video image adjacent to the target video image in the video to be processed.
In some embodiments, the video watermark detection apparatus 400 may further include: the standard watermark acquisition module is used for acquiring a standard watermark image to be detected; the regional image acquisition module is used for acquiring a regional image of each watermark region in the watermark detection result of each frame of video image; the similarity calculation module is used for respectively calculating the similarity value of each region image in each frame of video image and the standard watermark image; and the similarity judgment module is used for determining the watermark region corresponding to the region image with the similarity value larger than the preset value in each frame of video image as the final watermark detection result of each frame of video image.
In some embodiments, the video watermark detection apparatus 400 may further include: and the watermark removing module is used for performing watermark removing processing on the video to be processed according to the watermark detection result.
The video watermark detection apparatus provided in the embodiment of the present application is used to implement the corresponding video watermark detection method in the foregoing method embodiment, and has the beneficial effects of the corresponding method embodiment, which are not described herein again.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses and modules may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the several embodiments provided in the present application, the coupling or direct coupling or communication connection between the modules shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or modules may be in an electrical, mechanical or other form.
In addition, functional modules in the embodiments of the present application may be integrated into one processing module, or each of the modules may exist alone physically, or two or more modules are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode.
Referring to fig. 12, fig. 12 is a block diagram illustrating the structure of an electronic device according to an embodiment of the present disclosure. The electronic device 700 may be an electronic device such as a server capable of running an application. The electronic device 700 in the present application may include one or more of the following components: a processor 710, a memory 720, and one or more applications, wherein the one or more applications may be stored in the memory 720 and configured to be executed by the one or more processors 710, the one or more applications being configured to perform the method described in the foregoing method embodiments.
Processor 710 may include one or more processing cores. The processor 710 interfaces with various components throughout the electronic device 700 using various interfaces and circuitry, and performs various functions of the electronic device 700 and processes data by running or executing the instructions, programs, code sets, or instruction sets stored in the memory 720 and invoking the data stored in the memory 720. Optionally, the processor 710 may be implemented in hardware using at least one of Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA). The processor 710 may integrate one or more of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and the like; the GPU is used for rendering and drawing display content; the modem is used to handle wireless communications. It is understood that the modem may also not be integrated into the processor 710 and may instead be implemented by a separate communication chip.
The Memory 720 may include a Random Access Memory (RAM) or a Read-Only Memory (ROM). The memory 720 may be used to store instructions, programs, code sets, or instruction sets. The memory 720 may include a stored program area and a stored data area, wherein the stored program area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the various method embodiments described above, and the like. The stored data area may also store data created during use by the electronic device 700, and the like.
Those skilled in the art will appreciate that the structure shown in fig. 12 is a block diagram of only a portion of the structure relevant to the present disclosure, and does not constitute a limitation on the electronic device to which the present disclosure may be applied, and that a particular electronic device may include more or less components than those shown, or combine certain components, or have a different arrangement of components.
Referring to fig. 13, a block diagram of a computer-readable storage medium according to an embodiment of the present disclosure is shown. The computer-readable storage medium 800 stores program code that can be called by a processor to execute the methods described in the above-described method embodiments.
The computer-readable storage medium 800 may be an electronic memory such as a flash memory, an EEPROM (electrically erasable and programmable read only memory), an EPROM, a hard disk, or a ROM. Alternatively, the computer-readable storage medium 800 includes a non-transitory computer-readable storage medium. The computer readable storage medium 800 has storage space for program code 810 for performing any of the method steps described above. The program code can be read from or written to one or more computer program products. The program code 810 may be compressed, for example, in a suitable form.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solutions of the present application, and not to limit the same; although the present application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not necessarily depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (11)

1. A method for video watermark detection, the method comprising:
detecting a watermark in a video to be processed by utilizing a pre-trained target detection network to obtain a watermark detection result of each frame of video image in the video to be processed;
acquiring a target video image without the watermark detected in the video to be processed according to the watermark detection result;
judging whether a watermark is detected in an adjacent video image according to the watermark detection result, wherein the adjacent video image is a video frame image adjacent to the target video image in the video to be processed;
when the adjacent video images detect watermarks, acquiring watermark detection results of the adjacent video images as watermark detection results of the target video images;
and regenerating the watermark detection result of each frame of video image in the video to be processed according to the watermark detection result of the target video image.
2. The method according to claim 1, wherein the detecting the watermark in the video to be processed by using the pre-trained target detection network to obtain the watermark detection result of each frame of video image in the video to be processed comprises:
performing frame processing on a video to be processed to obtain a video image sequence;
detecting the watermark in the video image sequence by using a pre-trained target detection network, and obtaining a watermark detection result of each frame of video image in the video image sequence, wherein the target detection network is obtained by training a neural network in advance according to a first training sample, and the first training sample comprises a first image sample and a watermark marking sample corresponding to the first image sample.
3. The method of claim 2, wherein before the detecting the watermark in the video image sequence by using the pre-trained target detection network and obtaining the watermark detection result of each frame of video image in the video image sequence, the method further comprises:
acquiring a plurality of background samples and a plurality of watermark samples;
randomly synthesizing the plurality of watermark samples and the plurality of background samples through a fusion algorithm to obtain a plurality of synthesized first image samples;
and generating a watermark marking sample corresponding to each first image sample according to the synthesis position of the watermark sample in each first image sample.
4. The method according to claim 2, wherein before the obtaining, according to the watermark detection result, a target video image of the video to be processed in which a watermark is not detected, the method further comprises:
identifying the watermark detection result of each frame of video image by using a pre-trained classification network, and obtaining the identification result of the watermark detection result of each frame of video image, wherein the identification result is used for representing whether a watermark area in the watermark detection result is a watermark or not, the classification network is obtained by training a neural network in advance according to a second training sample, and the second training sample comprises a second image sample and a classification labeling sample corresponding to the second image sample;
determining the identification result in each frame of video image as a watermark detection result of the watermark as a new watermark detection result of each frame of video image;
the obtaining a target video image without the watermark detected in the video to be processed according to the watermark detection result includes:
and acquiring a target video image without the watermark detected in the video to be processed according to the new watermark detection result.
5. The method according to claim 4, wherein before the identifying the watermark detection result of each frame of video image by using the pre-trained classification network and obtaining the identification result of the watermark detection result of each frame of video image, the method further comprises:
acquiring a watermark image in the first image sample;
performing boundary expansion on the watermark image, and acquiring the expanded watermark image as a second image sample;
and generating a classification labeling sample corresponding to the second image sample according to the watermark image in the second image sample.
6. The method according to any one of claims 1 to 5, wherein the adjacent video image is a previous frame video image or a next frame video image adjacent to the target video image in the video to be processed.
7. The method according to any one of claims 1 to 5, wherein after the regenerating the watermark detection result of each frame of video image in the video to be processed according to the watermark detection result of the target video image, the method further comprises:
acquiring a standard watermark image to be detected;
acquiring a region image of each watermark region in a watermark detection result of each frame of video image;
respectively calculating the similarity value of each region image in each frame of video image and the standard watermark image;
and determining the watermark region corresponding to the region image with the similarity value larger than the preset value in each frame of video image as the final watermark detection result of each frame of video image.
8. The method according to any one of claims 1 to 5, wherein after the regenerating the watermark detection result of each frame of video image in the video to be processed according to the watermark detection result of the target video image, the method further comprises:
and according to the watermark detection result, performing watermark removing treatment on the video to be treated.
9. A video watermark detection apparatus, characterized in that the apparatus comprises:
the target detection module is used for detecting the watermark of the video to be processed by utilizing a pre-trained target detection network to obtain the watermark detection result of each frame of video image in the video to be processed;
the target acquisition module is used for acquiring a target video image of which the watermark is not detected in the video to be processed according to the watermark detection result;
the adjacent judgment module is used for judging whether an adjacent video image detects a watermark or not according to the watermark detection result, wherein the adjacent video image is a video frame image adjacent to the target video image in the video to be processed;
the result copying module is used for acquiring a watermark detection result of the adjacent video image as a watermark detection result of the target video image when the adjacent video image detects the watermark;
and the result generation module is used for regenerating the watermark detection result of each frame of video image in the video to be processed according to the watermark detection result of the target video image.
10. An electronic device, comprising:
one or more processors;
a memory;
one or more applications, wherein the one or more applications are stored in the memory and configured to be executed by the one or more processors, the one or more applications configured to perform the method of any of claims 1-8.
11. A computer-readable storage medium, having stored thereon program code that can be invoked by a processor to perform the method according to any one of claims 1 to 8.
CN202011225956.3A 2020-11-05 2020-11-05 Video watermark detection method and device, electronic equipment and storage medium Pending CN112419132A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011225956.3A CN112419132A (en) 2020-11-05 2020-11-05 Video watermark detection method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011225956.3A CN112419132A (en) 2020-11-05 2020-11-05 Video watermark detection method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN112419132A true CN112419132A (en) 2021-02-26

Family

ID=74827071

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011225956.3A Pending CN112419132A (en) 2020-11-05 2020-11-05 Video watermark detection method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN112419132A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112752098A (en) * 2021-04-06 2021-05-04 腾讯科技(深圳)有限公司 Video editing effect verification method and device
CN112991137A (en) * 2021-04-26 2021-06-18 湖南映客互娱网络信息有限公司 Method for dynamically identifying and removing watermark position of picture
WO2023019682A1 (en) * 2021-08-19 2023-02-23 广东艾檬电子科技有限公司 Watermark removal method and apparatus, terminal device and readable storage medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130170695A1 (en) * 2010-08-27 2013-07-04 Fujitsu Limited Digital watermark embedding apparatus, digital watermark embedding method, and digital watermark detection apparatus
CN103974144A (en) * 2014-05-23 2014-08-06 华中师范大学 Video digital watermarking method based on characteristic scale variation invariant points and microscene detection
CN109741232A (en) * 2018-12-29 2019-05-10 微梦创科网络科技(中国)有限公司 A kind of image watermark detection method, device and electronic equipment
CN110798750A (en) * 2019-11-29 2020-02-14 广州市百果园信息技术有限公司 Video watermark removing method, video data publishing method and related device
CN111445376A (en) * 2020-03-24 2020-07-24 五八有限公司 Video watermark detection method and device, electronic equipment and storage medium
CN111741329A (en) * 2020-07-01 2020-10-02 腾讯科技(深圳)有限公司 Video processing method, device, equipment and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Xu Duanquan; Zhu Guangxi: "Video watermarking algorithm based on inter-frame prediction", Computer Engineering and Applications, no. 24 *
Wang Sheng; Xie Hui; Zhang Fuquan: "Semi-fragile image watermarking algorithm using edge detection and Zernike moments", Journal of Frontiers of Computer Science and Technology, no. 04 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112752098A (en) * 2021-04-06 2021-05-04 腾讯科技(深圳)有限公司 Video editing effect verification method and device
CN112752098B (en) * 2021-04-06 2021-06-04 腾讯科技(深圳)有限公司 Video editing effect verification method and device
CN112991137A (en) * 2021-04-26 2021-06-18 湖南映客互娱网络信息有限公司 Method for dynamically identifying and removing watermark position of picture
WO2023019682A1 (en) * 2021-08-19 2023-02-23 广东艾檬电子科技有限公司 Watermark removal method and apparatus, terminal device and readable storage medium

Similar Documents

Publication Publication Date Title
KR100645300B1 (en) Method and apparatus for summarizing and indexing the contents of an audio-visual presentation
CN112419132A (en) Video watermark detection method and device, electronic equipment and storage medium
Ma et al. Stage-wise salient object detection in 360 omnidirectional image via object-level semantical saliency ranking
US10438322B2 (en) Image resolution enhancement
CN111836118B (en) Video processing method, device, server and storage medium
CN111429341B (en) Video processing method, device and computer readable storage medium
CN110197149B (en) Ear key point detection method and device, storage medium and electronic equipment
CN110910322B (en) Picture processing method and device, electronic equipment and computer readable storage medium
JP2008011135A (en) Image processing device and image processing program
CN114302252A (en) Method and device for removing watermark from video, computer equipment and storage medium
CN113766147B (en) Method for embedding image in video, and method and device for acquiring plane prediction model
KR20230162010A (en) Real-time machine learning-based privacy filter to remove reflective features from images and videos
CN112085025B (en) Object segmentation method, device and equipment
CN111815689A (en) Semi-automatic labeling method, equipment, medium and device
CN112750065B (en) Carrier object processing and watermark embedding method, device and electronic equipment
CN111212196B (en) Information processing method and device, electronic equipment and storage medium
JP2980810B2 (en) Motion vector search method and apparatus
CN116342880A (en) Method for segmenting key content of video and electronic equipment
CN117745589A (en) Watermark removing method, device and equipment
CN117939036A (en) Video and audio synchronization method and device, electronic equipment and medium
CN117651972A (en) Image processing method, device, terminal equipment, electronic equipment and storage medium
CN117082269A (en) Interactive node adjustment method and device, electronic equipment and readable storage medium
CN113962964A (en) Specified object erasing method and device based on time sequence image data
CN113722513A (en) Multimedia data processing method and equipment
CN115695670A (en) Method for multi-line scanning of a portable scanning device and related product

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20210707

Address after: 510000 room 3010, 79 Wanbo 2nd Road, Nancun Town, Panyu District, Guangzhou City, Guangdong Province

Applicant after: Guangzhou Overseas shoulder sub network technology Co.,Ltd.

Address before: 511400 24th floor, building B-1, North District, Wanda Commercial Plaza, Wanbo business district, No.79 Wanbo 2nd Road, Nancun Town, Panyu District, Guangzhou, Guangdong Province

Applicant before: GUANGZHOU HUADUO NETWORK TECHNOLOGY Co.,Ltd.

TA01 Transfer of patent application right