CN118118651A - Smear detection method and device, electronic equipment, storage medium and program product - Google Patents


Info

Publication number
CN118118651A
Authority
CN
China
Prior art keywords
moments
detected
video images
original
displayed
Prior art date
Legal status
Pending
Application number
CN202410241777.0A
Other languages
Chinese (zh)
Inventor
徐劲松
蒲小华
Current Assignee
Shanghai Yijia Intelligent Technology Group Co ltd
Original Assignee
Shanghai Yijia Intelligent Technology Group Co ltd
Priority date
Filing date
Publication date
Application filed by Shanghai Yijia Intelligent Technology Group Co., Ltd.
Priority to CN202410241777.0A
Publication of CN118118651A
Legal status: Pending

Landscapes

  • Image Analysis (AREA)

Abstract

The disclosure relates to a smear detection method and device, electronic equipment, a storage medium and a program product, and relates to the technical field of video image processing. The method includes the following steps: acquiring original video images to be detected at a plurality of moments in a video stream, and respectively extracting a plurality of first feature detection points corresponding to the original video images to be detected at the plurality of moments; displaying the original video images to be detected at the plurality of moments in the video stream with a terminal display device to obtain corresponding video images to be detected displayed at the plurality of moments, and respectively extracting a plurality of second feature detection points corresponding to the video images to be detected displayed at the plurality of moments; and detecting, based on the plurality of first feature detection points and the plurality of second feature detection points at set adjacent moments among the plurality of moments, whether smear exists in the video images to be detected displayed at the plurality of moments corresponding to the original video images to be detected at the plurality of moments. Smear detection of video images can thus be realized.

Description

Smear detection method and device, electronic equipment, storage medium and program product
Technical Field
The disclosure relates to the technical field of video image processing, and in particular relates to a smear detection method and device, electronic equipment, a storage medium and a program product.
Background
A screen displays dynamic pictures by refreshing at a given refresh rate. For example, the video image corresponding to one moment in a video stream and the video image corresponding to the next moment are refreshed in turn, and more than 24 video images per second form continuous video image frames, i.e., a dynamic video. If smear occurs in the dynamic video, however, the user experience inevitably suffers.
An earlier application (application number 2022110082110) proposed a smear detection method, apparatus and electronic device to address the problem that detecting the smear level of a projector by manual observation introduces large detection errors and makes product consistency difficult to guarantee. Another application (application number 2021114563098) provides a smear detection method, device and storage medium for accurately and objectively measuring the color cast in the smear region of a display panel under test.
However, the method of application 2022110082110 requires a preset threshold set, which is difficult for a person skilled in the art to configure. The method of application 2021114563098 performs smear detection by determining a smear color offset value of the display panel under test; the colors displayed by its test object are limited (three colors), and since the white coordinate measurement is determined from the tristimulus values in the smear regions corresponding to those three colors, the method clearly cannot perform smear detection on video images containing many colors.
Based on the above, it is necessary to provide a smear detection method based on video stream images, so as to solve the problems that the smear of the video stream images is difficult to detect and the current smear detection based on the video stream images has poor universality.
Disclosure of Invention
The disclosure provides a smear detection method and device, an electronic device, a storage medium and a program product.
According to an aspect of the present disclosure, there is provided a smear detection method including:
acquiring original video images to be detected at a plurality of moments in a video stream, and respectively extracting, by using a preset convolutional neural network, a plurality of first feature detection points corresponding to the original video images to be detected at the plurality of moments;
Displaying original video images to be detected at a plurality of moments in the video stream by using a terminal display device to obtain corresponding video images to be detected displayed at a plurality of moments, and respectively extracting a plurality of second characteristic detection points corresponding to the video images to be detected displayed at the plurality of moments by using a preset convolutional neural network;
and detecting whether the to-be-detected video image displayed at a plurality of moments corresponding to the original to-be-detected video image at the plurality of moments is smeared or not based on the first feature detection points and the second feature detection points which are set at the adjacent moments in the plurality of moments.
Preferably, before the extracting of the plurality of first feature detection points corresponding to the plurality of time-original video images to be detected, respectively, a first target corresponding to the plurality of time-original video images to be detected is determined by using a preset target detection model;
respectively extracting a plurality of first feature detection points corresponding to the video images to be detected, which are original at a plurality of moments, in the first target by using a preset convolutional neural network; and/or,
The method for respectively extracting a plurality of first characteristic detection points corresponding to the video images to be detected, which are original at a plurality of moments, in the first target by using a preset convolutional neural network comprises the following steps:
Cutting the video images to be detected which are original at a plurality of moments by utilizing the first target to obtain cutting video images which are original at a plurality of moments and correspond to the first target; and respectively extracting a plurality of first characteristic detection points corresponding to the original clipping video images at a plurality of moments by using a preset convolutional neural network.
Preferably, before the extracting the plurality of second feature detection points corresponding to the video images to be detected, which are displayed at the plurality of moments, respectively, a second target corresponding to the video images to be detected, which are displayed at the plurality of moments, is determined by using a preset target detection model;
Respectively extracting a plurality of second feature detection points corresponding to the video images to be detected, which are displayed at a plurality of moments in the second target, by using a preset convolutional neural network; and/or,
The method for respectively extracting a plurality of second characteristic detection points corresponding to the video images to be detected, which are displayed at a plurality of moments in the second target, by using a preset convolutional neural network comprises the following steps:
cutting the video images to be detected displayed at a plurality of moments by using the second target to obtain cut video images displayed at a plurality of moments corresponding to the second target; and respectively extracting a plurality of second characteristic detection points corresponding to the clipping video images displayed at a plurality of moments by using a preset convolutional neural network.
Preferably, the method for detecting whether smear exists in the video images to be detected displayed at the plurality of moments corresponding to the original video images to be detected at the plurality of moments, based on the plurality of first feature detection points and the plurality of second feature detection points at the set adjacent moments among the plurality of moments, includes: detecting, by using a preset classifier and based on the plurality of first feature detection points and the plurality of second feature detection points at the set adjacent moments among the plurality of moments, whether a smear exists in the video images to be detected displayed at the plurality of moments corresponding to the original video images to be detected at the plurality of moments; or alternatively,
The method for detecting whether the smear exists in the video image to be detected displayed at the plurality of moments corresponding to the original video image to be detected at the plurality of moments based on the plurality of first feature detection points and the plurality of second feature detection points at the set adjacent moments in the plurality of moments comprises the steps of:
Registering the original video image to be detected and the displayed video image to be detected at the same set adjacent time by using a preset registration model to obtain a first registration matrix at a plurality of corresponding time;
Pairing the plurality of first characteristic detection points and the plurality of second characteristic detection points based on the first registration matrixes at the plurality of moments respectively to obtain a plurality of paired first characteristic detection point sets;
Detecting whether a smear exists in the video images to be detected displayed at a plurality of moments corresponding to the original video images to be detected at the plurality of moments based on the paired plurality of first feature detection point sets and the corresponding preset distances; and/or,
Before detecting whether the smear exists in the to-be-detected video image displayed at a plurality of moments corresponding to the to-be-detected video image at the plurality of moments original to-be-detected video images based on the paired plurality of first feature detection point sets and the corresponding preset distances, the method further comprises:
Registering the original video images to be detected corresponding to the set adjacent moments in the moments by using a preset registration model to obtain a second registration matrix corresponding to the moments;
pairing the plurality of first feature detection points based on the second registration matrix to obtain a plurality of paired second feature detection point sets;
Respectively calculating the feature detection point distances within the paired plurality of second feature detection point sets to obtain the corresponding preset distances; and/or,
The method for detecting whether the smear exists in the video image to be detected displayed at a plurality of moments corresponding to the original video image to be detected at a plurality of moments based on the paired first feature detection point sets and the corresponding preset distances comprises the following steps:
respectively calculating a plurality of characteristic detection point distances corresponding to the plurality of first characteristic detection point sets;
Detecting whether a smear exists in the video images to be detected displayed at a plurality of moments corresponding to the original video images to be detected at the plurality of moments based on the plurality of feature detection point distances and the corresponding preset distances; and/or,
The method for detecting whether the smear exists in the video image to be detected displayed at a plurality of moments corresponding to the original video image to be detected at a plurality of moments based on the plurality of feature detection point distances and the corresponding preset distances comprises the following steps:
if the plurality of feature detection point distances are each smaller than or equal to the corresponding preset distances, the to-be-detected video images displayed at the plurality of moments corresponding to the original to-be-detected video images at the plurality of moments are free of smear; otherwise, the to-be-detected video images displayed at the plurality of moments corresponding to the original to-be-detected video images at the plurality of moments have smear; or alternatively,
The method for detecting whether the smear exists in the video image to be detected displayed at a plurality of moments corresponding to the original video image to be detected at a plurality of moments based on the plurality of feature detection point distances and the corresponding preset distances comprises the following steps:
counting the number of feature detection point distances that are greater than the corresponding preset distances;
if the number is smaller than or equal to a set number, the to-be-detected video images displayed at the plurality of moments corresponding to the original to-be-detected video images at the plurality of moments are free of smear; otherwise, the to-be-detected video images displayed at the plurality of moments corresponding to the original to-be-detected video images at the plurality of moments are smeared.
According to an aspect of the present disclosure, there is provided a smear detection apparatus including:
The first feature detection point extraction unit is used for acquiring original video images to be detected at a plurality of moments in the video stream, and respectively extracting, by using a preset convolutional neural network, a plurality of first feature detection points corresponding to the original video images to be detected at the plurality of moments;
The second characteristic detection point extraction unit is used for displaying the original video images to be detected at a plurality of moments in the video stream by using the terminal display equipment to obtain the corresponding video images to be detected displayed at a plurality of moments, and respectively extracting a plurality of second characteristic detection points corresponding to the video images to be detected displayed at a plurality of moments by using a preset convolutional neural network;
the detection unit is used for detecting whether smear exists in the to-be-detected video images displayed at a plurality of moments corresponding to the original to-be-detected video images at the plurality of moments, based on the plurality of first feature detection points and the plurality of second feature detection points at set adjacent moments among the plurality of moments.
Preferably, the first feature detection point extraction unit includes: a first target detection unit; the first target detection unit is configured to determine, by using a preset target detection model, a first target corresponding to the video images to be detected that are original at the multiple moments, before the multiple first feature detection points corresponding to those video images are respectively extracted; and to respectively extract, by using a preset convolutional neural network, a plurality of first feature detection points corresponding to the video images to be detected that are original at the multiple moments within the first target; and/or,
The first feature detection point extraction unit further includes: a first clipping unit; the first clipping unit is configured to clip the video images to be detected, which are original at multiple moments, by using the first target, so as to obtain clipping video images, which are original at multiple moments and correspond to the first target, respectively; and to respectively extract, by using a preset convolutional neural network, a plurality of first feature detection points corresponding to the original clipping video images at the multiple moments; and/or,
The second feature detection point extraction unit includes: a second target detection unit; the second target detection unit is configured to determine, by using a preset target detection model, a second target corresponding to the video image to be detected displayed at the multiple moments before extracting multiple second feature detection points corresponding to the video image to be detected displayed at the multiple moments respectively;
Respectively extracting a plurality of second feature detection points corresponding to the video images to be detected, which are displayed at a plurality of moments in the second target, by using a preset convolutional neural network; and/or,
The second feature detection point extraction unit further includes: a second clipping unit; the second clipping unit is configured to clip the video images to be detected displayed at the multiple moments by using the second target, so as to obtain clipping video images displayed at the multiple moments corresponding to the second target; and respectively extracting a plurality of second characteristic detection points corresponding to the clipping video images displayed at a plurality of moments by using a preset convolutional neural network.
Preferably, the detection unit includes: a preset classifier; the preset classifier is used for detecting whether a smear exists in the to-be-detected video images displayed at a plurality of moments corresponding to the original to-be-detected video images at the plurality of moments, based on the plurality of first feature detection points and the plurality of second feature detection points at the set adjacent moments among the plurality of moments; or alternatively,
The detection unit includes: a first registration unit, a first pairing unit and a first detection unit; the first registration unit is used for registering the original video image to be detected and the displayed video image to be detected at the same set adjacent moments by using a preset registration model to obtain first registration matrices at the corresponding plurality of moments; the first pairing unit is configured to pair the plurality of first feature detection points and the plurality of second feature detection points based on the first registration matrices at the plurality of moments, so as to obtain a plurality of paired first feature detection point sets; the first detection unit is configured to detect, based on the paired plurality of first feature detection point sets and the corresponding preset distances, whether a smear exists in the to-be-detected video images displayed at the multiple moments corresponding to the original to-be-detected video images at the multiple moments; and/or,
The first detection unit includes: a second registration unit, a second pairing unit and a first distance calculation unit; the second registration unit is configured to register the original video images to be detected corresponding to the set adjacent moments among the multiple moments by using a preset registration model, so as to obtain second registration matrices corresponding to the multiple moments; the second pairing unit is used for pairing the plurality of first feature detection points based on the second registration matrices to obtain a plurality of paired second feature detection point sets; the first distance calculation unit is used for respectively calculating the feature detection point distances within the paired plurality of second feature detection point sets to obtain the corresponding preset distances; and/or,
The first detection unit includes: a first calculating unit and a second detection unit; the first calculating unit is used for respectively calculating a plurality of feature detection point distances corresponding to the paired plurality of first feature detection point sets; the second detection unit is configured to detect, based on the plurality of feature detection point distances and the corresponding preset distances, whether a smear exists in the to-be-detected video images displayed at the plurality of moments corresponding to the original to-be-detected video images at the plurality of moments; and/or,
The second detection unit includes: a judging unit; the judging unit is configured to determine that, if the plurality of feature detection point distances are each smaller than or equal to the corresponding preset distances, no smear exists in the video images to be detected displayed at the plurality of moments corresponding to the original video images to be detected at the plurality of moments; otherwise, the to-be-detected video images displayed at the plurality of moments corresponding to the original to-be-detected video images at the plurality of moments have smear; or alternatively,
The second detection unit includes: a statistics unit; the statistics unit is used for counting the number of feature detection point distances that are greater than the corresponding preset distances; if the number is smaller than or equal to a set number, the to-be-detected video images displayed at the plurality of moments corresponding to the original to-be-detected video images at the plurality of moments are free of smear; otherwise, the to-be-detected video images displayed at the plurality of moments corresponding to the original to-be-detected video images at the plurality of moments are smeared.
According to an aspect of the present disclosure, there is provided an electronic apparatus including:
A processor;
a memory for storing processor-executable instructions;
Wherein the processor is configured to execute the smear detection method described above.
According to an aspect of the present disclosure, there is provided a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the smear detection method described above.
According to an aspect of the present disclosure, there is provided a computer program product comprising a computer program/instruction which, when executed by a processor, implements the smear detection method described above.
In an embodiment of the present disclosure, the present disclosure proposes a method and an apparatus for detecting smear, an electronic device, a storage medium, and a program product, so as to solve the problems that a video stream image smear is difficult to detect and the current smear detection based on the video stream image has poor universality.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Other features and aspects of the present disclosure will become apparent from the following detailed description of exemplary embodiments, which proceeds with reference to the accompanying drawings.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the technical aspects of the disclosure.
FIG. 1 illustrates a flow chart of a smear detection method according to an embodiment of the present disclosure;
FIG. 2 shows a schematic diagram of a smear detection apparatus according to an embodiment of the disclosure;
FIG. 3 is a block diagram of an electronic device 800, shown in accordance with an exemplary embodiment;
fig. 4 is a block diagram illustrating an electronic device 1900 according to an example embodiment.
Detailed Description
Various exemplary embodiments, features and aspects of the disclosure will be described in detail below with reference to the drawings. In the drawings, like reference numbers indicate identical or functionally similar elements. Although various aspects of the embodiments are illustrated in the accompanying drawings, the drawings are not necessarily drawn to scale unless specifically indicated.
The word "exemplary" is used herein to mean "serving as an example, embodiment, or illustration. Any embodiment described herein as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments.
The term "and/or" is herein merely an association relationship describing an associated object, meaning that there may be three relationships, e.g., a and/or B, may represent: a exists alone, A and B exist together, and B exists alone. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality, for example, including at least one of A, B, C, may mean including any one or more elements selected from the group consisting of A, B and C.
Furthermore, numerous specific details are set forth in the following detailed description in order to provide a better understanding of the present disclosure. It will be understood by those skilled in the art that the present disclosure may be practiced without some of these specific details. In some instances, methods, means, elements, and circuits well known to those skilled in the art have not been described in detail in order not to obscure the present disclosure.
It will be appreciated that the above-mentioned method embodiments of the present disclosure may be combined with each other to form combined embodiments without departing from their principle and logic; due to space limitations, such combinations are not described in detail in the present disclosure.
In addition, the disclosure further provides a smear detection device, an electronic device, a computer-readable storage medium, and a program, all of which may be used to implement any of the smear detection methods provided in the disclosure; for the corresponding technical solutions and descriptions, refer to the method parts, which are not repeated here.
Fig. 1 shows a flowchart of a smear detection method according to an embodiment of the present disclosure. As shown in fig. 1, the smear detection method includes: step S101, acquiring original video images to be detected at a plurality of moments in a video stream, and respectively extracting a plurality of first characteristic detection points corresponding to the original video images to be detected at the plurality of moments by using a preset convolutional neural network; step S102, displaying original video images to be detected at a plurality of moments in the video stream by using a terminal display device to obtain corresponding video images to be detected displayed at a plurality of moments, and respectively extracting a plurality of second characteristic detection points corresponding to the video images to be detected displayed at the plurality of moments by using a preset convolutional neural network; step S103, detecting whether the to-be-detected video image displayed at a plurality of moments corresponding to the original to-be-detected video image at the plurality of moments is smeared or not based on the plurality of first characteristic detection points and the plurality of second characteristic detection points which are set at adjacent moments in the plurality of moments. The method solves the problems that the smear of the video stream image is difficult to detect and the current smear detection based on the video stream image is poor in universality.
Step S101, obtaining original video images to be detected at a plurality of moments in a video stream, and respectively extracting a plurality of first characteristic detection points corresponding to the original video images to be detected at the plurality of moments by using a preset convolutional neural network.
In the embodiments of the present disclosure and other embodiments, the video stream includes original video images to be detected at a plurality of moments; the video image corresponding to one moment and the video image corresponding to the next moment are refreshed in turn, and more than 24 video images per second form continuous video image frames, i.e., a dynamic video. Meanwhile, the plurality of moments in the video stream may be configured as 100, i.e., the video stream contains 100 original video images to be detected; a preset convolutional neural network is then used to respectively extract a plurality of first feature detection points corresponding to the 100 original video images to be detected.
In embodiments of the present disclosure and other embodiments, the present disclosure does not limit the type of video stream. For example, the type of the video stream may be configured as one or several of RTMP (push end, pull end), RTSP (push end), HLS (pull end), FLV (pull end), and the like.
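By way of non-limiting illustration, the following Python sketch shows one way the frame-acquisition step could be realized with OpenCV; the stream address, the frame count of 100 and the helper name grab_original_frames are illustrative assumptions rather than part of the disclosure.

```python
import cv2

def grab_original_frames(stream_url: str, num_frames: int = 100):
    """Read up to `num_frames` original video images to be detected from a stream.

    `stream_url` may be an RTSP/RTMP/HLS/FLV address or a local file path.
    """
    cap = cv2.VideoCapture(stream_url)
    frames = []
    while cap.isOpened() and len(frames) < num_frames:
        ok, frame = cap.read()
        if not ok:
            break
        frames.append(frame)
    cap.release()
    return frames

# Example (hypothetical RTSP address):
# originals = grab_original_frames("rtsp://camera.local/stream", 100)
```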
In embodiments of the present disclosure and other embodiments, the preset convolutional neural network may include: the device comprises a first convolution layer, a first pooling layer connected with the first convolution layer, a second convolution layer connected with the first pooling layer, a second pooling layer connected with the second convolution layer, and a full-connection layer connected with the second pooling layer, wherein the output of the full-connection layer is configured to obtain a plurality of first characteristic detection points corresponding to the original video images to be detected at a plurality of moments respectively. Wherein the first convolution layer may be configured as a convolution kernel of size 5×5, and the corresponding step size may be configured as 1; the second convolution layer may be configured as a convolution kernel of size 3×3, and the corresponding step size may be configured as 1. Wherein the first pooling layer and the second pooling layer are respectively configurable as maximum pooling.
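The concrete layer dimensions are not specified by the disclosure; the PyTorch sketch below only illustrates the described topology (5×5 convolution, max pooling, 3×3 convolution, max pooling, fully connected layer whose output is read as feature detection points), with the channel counts, input resolution and number of detection points assumed for the example.

```python
import torch
import torch.nn as nn

class FeaturePointNet(nn.Module):
    """Sketch of the preset convolutional neural network described above."""

    def __init__(self, num_points: int = 64, in_channels: int = 3):
        super().__init__()
        self.conv1 = nn.Conv2d(in_channels, 16, kernel_size=5, stride=1, padding=2)  # first convolution layer, 5x5, stride 1
        self.pool1 = nn.MaxPool2d(2)                                                 # first pooling layer (max pooling)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, stride=1, padding=1)           # second convolution layer, 3x3, stride 1
        self.pool2 = nn.MaxPool2d(2)                                                 # second pooling layer (max pooling)
        # Fully connected layer; its output is read as (x, y) coordinates of feature detection points.
        self.fc = nn.Linear(32 * 56 * 56, num_points * 2)  # assumes 224x224 inputs
        self.num_points = num_points

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.pool1(torch.relu(self.conv1(x)))
        x = self.pool2(torch.relu(self.conv2(x)))
        x = self.fc(x.flatten(1))
        return x.view(-1, self.num_points, 2)  # one (x, y) pair per detection point

# points = FeaturePointNet()(torch.randn(1, 3, 224, 224))  # -> shape (1, 64, 2)
```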
In embodiments of the present disclosure and other embodiments, the preset convolutional neural network may include: the device comprises a first convolution layer, a first pooling layer connected with the first convolution layer, an attention mechanism module connected with the first pooling layer, a second convolution layer connected with the attention mechanism module and the first pooling layer, a second pooling layer connected with the second convolution layer, and a full-connection layer connected with the second pooling layer, wherein the output of the full-connection layer is configured to obtain a plurality of first characteristic detection points corresponding to the original video images to be detected at a plurality of moments respectively.
In the embodiments of the present disclosure and other embodiments, the first convolution layer and the first pooling layer connected with the first convolution layer are used to perform feature extraction and pooling on the video images to be detected that are original at the multiple moments, so as to obtain an original feature matrix; attention extraction is performed on the original feature matrix by using the attention mechanism module to obtain a corresponding original weight feature matrix; the original feature matrix is multiplied by the original weight feature matrix to obtain a weighted original feature matrix; and feature compression is performed on the weighted original feature matrix by using the fully connected layer to obtain the plurality of first feature detection points corresponding to the original video images to be detected at the plurality of moments.
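As one assumed concrete form of the attention mechanism module described above, the sketch below computes per-channel weights from the pooled feature matrix and multiplies the feature matrix by those weights (a squeeze-and-excitation style weighting); the reduction factor is an assumption.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Assumed form of the attention mechanism module: per-channel weights in [0, 1]."""

    def __init__(self, channels: int, reduction: int = 4):
        super().__init__()
        self.weights = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),                     # squeeze spatial dimensions
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),                                # weight feature matrix
        )

    def forward(self, feat: torch.Tensor) -> torch.Tensor:
        # Multiply the original feature matrix by the weight feature matrix.
        return feat * self.weights(feat)
```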
In the embodiment of the disclosure, before the first feature detection points corresponding to the video images to be detected, which are original at the multiple moments, are respectively extracted, a first target corresponding to the video images to be detected, which are original at the multiple moments, is respectively determined by using a preset target detection model; and respectively extracting a plurality of first characteristic detection points corresponding to the video images to be detected, which are original at a plurality of moments, in the first target by using a preset convolutional neural network.
In embodiments of the present disclosure and others, the preset target detection model may be configured as a YOLO-series target detection model. The preset target detection model includes: a backbone layer, a neck layer connected with the backbone layer, and a prediction layer connected with the neck layer. The backbone layer is used to respectively perform feature extraction on the original video images to be detected at the multiple moments to obtain corresponding global feature matrices of different scales; the end of the backbone layer is provided with a first multi-scale Swin Transformer module, which is used to obtain the global feature matrices of different scales. The neck layer is used to process the global feature matrices of different scales to obtain a plurality of processed global attention feature matrices of different scales; the neck layer includes or is provided with a plurality of sequentially connected second multi-scale Swin Transformer modules for processing the global feature matrices of different scales. Based on the plurality of global attention feature matrices, the first targets corresponding to the original video images to be detected at the multiple moments are determined. The detected objects of the first target are configured as the persons and/or objects in the video images to be detected that are original at the multiple moments.
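The snippet below is only a stand-in sketch of this target-detection step: it uses an off-the-shelf detector from the ultralytics package to obtain detection boxes for persons/objects in each frame. The package, the weight file name and the omission of the Swin Transformer modules described above are all simplifying assumptions.

```python
from ultralytics import YOLO  # assumed off-the-shelf stand-in for the preset target detection model

detector = YOLO("yolov8n.pt")  # hypothetical pretrained weights

def detect_first_targets(frames):
    """Return, per frame, a list of (x1, y1, x2, y2) detection boxes for persons/objects."""
    targets = []
    for frame in frames:
        result = detector(frame, verbose=False)[0]
        boxes = result.boxes.xyxy.cpu().numpy()  # one row per detected object
        targets.append([tuple(map(float, b)) for b in boxes])
    return targets
```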
In an embodiment of the disclosure, the method for respectively extracting a plurality of first feature detection points corresponding to the video images to be detected, which are original at a plurality of moments, in the first target by using a preset convolutional neural network includes: cutting the video images to be detected which are original at a plurality of moments by utilizing the first target to obtain cutting video images which are original at a plurality of moments and correspond to the first target; and respectively extracting a plurality of first characteristic detection points corresponding to the original clipping video images at a plurality of moments by using a preset convolutional neural network.
In an embodiment of the present disclosure and other embodiments, the method for clipping the video images to be detected, which are original at multiple times, by using the first target to obtain the clipped video images, which are original at multiple times and correspond to the first target, includes: respectively calculating the lengths of 2 diagonals corresponding to each detection object detection frame in the first target and the vertex position information corresponding to the lengths; based on the set multiple, determining a region corresponding to each detection object by using the 2 diagonal lengths and the vertex position information corresponding to the diagonal lengths; and respectively cutting out each detected object from the video images to be detected which are original at a plurality of moments based on the corresponding area of each detected object, so as to obtain the cut video images which are original at a plurality of moments and correspond to the first target. Wherein, the person skilled in the art can configure the setting multiple according to actual needs.
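A sketch of the cropping rule described above: the endpoints of the two diagonals of each detection box define the object region, which is enlarged by the set multiple around its centre before cropping; the value 1.5 for the set multiple is only an example.

```python
import numpy as np

def crop_by_box(frame: np.ndarray, box, multiple: float = 1.5) -> np.ndarray:
    """Crop one detected object, enlarging its detection box by `multiple` around the centre."""
    x1, y1, x2, y2 = box
    # The two diagonals of the detection box, (x1, y1)-(x2, y2) and (x1, y2)-(x2, y1),
    # have equal length and their endpoints give the extent of the object region.
    cx, cy = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) / 2.0 * multiple   # enlarge by the set multiple
    half_h = (y2 - y1) / 2.0 * multiple
    h, w = frame.shape[:2]
    left, right = max(int(cx - half_w), 0), min(int(cx + half_w), w)
    top, bottom = max(int(cy - half_h), 0), min(int(cy + half_h), h)
    return frame[top:bottom, left:right]
```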
Step S102, displaying original video images to be detected at a plurality of moments in the video stream by using a terminal display device to obtain corresponding video images to be detected displayed at a plurality of moments, and respectively extracting a plurality of second characteristic detection points corresponding to the video images to be detected displayed at the plurality of moments by using a preset convolutional neural network.
In the embodiments of the present disclosure and other embodiments, a plurality of set refresh rates corresponding to a terminal display device are used to display a plurality of time-original video images to be detected in the video stream, so as to obtain a plurality of time-displayed video images to be detected corresponding to the plurality of set refresh rates, further determine that a set refresh rate corresponding to a smear exists, and further display the plurality of time-original video images to be detected in the video stream under the set refresh rate corresponding to the existence of the smear in a process that the terminal display device displays the plurality of time-original video images to be detected in the video stream.
In embodiments of the present disclosure and other embodiments, in the process that the terminal display device displays the video images to be detected that are original at multiple times in the video stream, the method for displaying the video images to be detected that are original at multiple times in the video stream at the set refresh rate corresponding to the presence of the smear includes: acquiring a configuration refresh rate of the terminal display equipment; if the configuration refresh rate is smaller than the set refresh rate corresponding to the existence of the smear, displaying original video images to be detected at a plurality of moments in the video stream at the configuration refresh rate; otherwise, the configuration refresh rate is reduced to the set refresh rate, and original video images to be detected at a plurality of moments in the video stream are displayed at the set refresh rate.
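The refresh-rate handling above reduces to a comparison between the device's configured refresh rate and the set refresh rate at which smear is known to occur; a minimal sketch, with the numeric rates as placeholders:

```python
def choose_display_refresh_rate(configured_rate: float, smear_test_rate: float) -> float:
    """Pick the refresh rate at which the terminal display device plays the video stream.

    `smear_test_rate` is the set refresh rate at which smear is known to occur
    (determined beforehand by displaying the stream at several set refresh rates).
    """
    if configured_rate < smear_test_rate:
        # The device cannot reach the set refresh rate: display at its configured rate.
        return configured_rate
    # Otherwise lower the configured rate to the set refresh rate for the smear test.
    return smear_test_rate

# Example with placeholder values: choose_display_refresh_rate(120.0, 60.0) -> 60.0
```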
In embodiments of the present disclosure and other embodiments, the terminal display device may be configured as the display of a User Equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a Personal Digital Assistant (PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or the like.
Likewise, in embodiments of the present disclosure and other embodiments, the preset convolutional neural network may include: the device comprises a first convolution layer, a first pooling layer connected with the first convolution layer, a second convolution layer connected with the first pooling layer, a second pooling layer connected with the second convolution layer, a full-connection layer connected with the second pooling layer, and a plurality of second characteristic detection points corresponding to the video images to be detected, which are displayed at a plurality of moments, are respectively configured by the output of the full-connection layer.
In embodiments of the present disclosure and other embodiments, the preset convolutional neural network may include: the device comprises a first convolution layer, a first pooling layer connected with the first convolution layer, an attention mechanism module connected with the first pooling layer, a second convolution layer connected with the attention mechanism module and the first pooling layer, a second pooling layer connected with the second convolution layer, and a full-connection layer connected with the second pooling layer, wherein the output of the full-connection layer is configured to obtain a plurality of second characteristic detection points corresponding to the video images to be detected, which are displayed at a plurality of moments. Wherein the first convolution layer may be configured as a convolution kernel of size 5×5, and the corresponding step size may be configured as 1; the second convolution layer may be configured as a convolution kernel of size 3×3, and the corresponding step size may be configured as 1. Wherein the first pooling layer and the second pooling layer are respectively configurable as maximum pooling.
In the embodiments of the present disclosure and other embodiments, feature extraction and pooling are performed on the video images to be detected displayed at the multiple moments by using the first convolution layer and the first pooling layer connected with the first convolution layer, so as to obtain a display feature matrix; attention extraction is performed on the display feature matrix by using the attention mechanism module to obtain a corresponding display weight feature matrix; the display feature matrix is multiplied by the display weight feature matrix to obtain a weighted display feature matrix; and feature compression is performed on the weighted display feature matrix by using the fully connected layer to obtain the plurality of second feature detection points corresponding to the video images to be detected displayed at the plurality of moments.
In the embodiment of the disclosure, before the second feature detection points corresponding to the video images to be detected displayed at the multiple moments are respectively extracted, respectively determining second targets corresponding to the video images to be detected displayed at the multiple moments by using a preset target detection model; and respectively extracting a plurality of second characteristic detection points corresponding to the video images to be detected, which are displayed at a plurality of moments in the second target, by using a preset convolutional neural network.
Likewise, in embodiments of the present disclosure and others, the preset target detection model may be configured as a YOLO-series target detection model. The preset target detection model includes: a backbone layer, a neck layer connected with the backbone layer, and a prediction layer connected with the neck layer. The backbone layer is used to respectively perform feature extraction on the video images to be detected displayed at the multiple moments to obtain corresponding global feature matrices of different scales; the end of the backbone layer is provided with a first multi-scale Swin Transformer module, which is used to obtain the global feature matrices of different scales. The neck layer is used to process the global feature matrices of different scales to obtain a plurality of processed global attention feature matrices of different scales; the neck layer includes or is provided with a plurality of sequentially connected second multi-scale Swin Transformer modules for processing the global feature matrices of different scales. Based on the plurality of global attention feature matrices, the second targets corresponding to the video images to be detected displayed at the multiple moments are determined. The detected objects of the second target are configured as the persons and/or objects in the video images to be detected displayed at the plurality of moments.
In an embodiment of the disclosure, the method for respectively extracting a plurality of second feature detection points corresponding to the video images to be detected displayed at the plurality of moments in the second target by using a preset convolutional neural network includes: cutting the video images to be detected displayed at a plurality of moments by using the second target to obtain cut video images displayed at a plurality of moments corresponding to the second target; and respectively extracting a plurality of second characteristic detection points corresponding to the clipping video images displayed at a plurality of moments by using a preset convolutional neural network.
In an embodiment of the present disclosure and other embodiments, the method for clipping the video images to be detected displayed at the multiple times by using the second target to obtain clipping video images displayed at the multiple times corresponding to the second target includes: respectively calculating the lengths of 2 diagonals corresponding to each detection object detection frame in the second target and the corresponding vertex position information of the lengths; based on the set multiple, determining a region corresponding to each detection object by using the 2 diagonal lengths and the vertex position information corresponding to the diagonal lengths; and respectively cutting out each detected object from the video images to be detected which are displayed at a plurality of moments based on the corresponding region of each detected object, so as to obtain cut video images which are displayed at a plurality of moments and correspond to the second target. Wherein, the person skilled in the art can configure the setting multiple according to actual needs.
Step S103, detecting whether the to-be-detected video image displayed at a plurality of moments corresponding to the original to-be-detected video image at the plurality of moments is smeared or not based on the plurality of first characteristic detection points and the plurality of second characteristic detection points which are set at adjacent moments in the plurality of moments.
In embodiments of the present disclosure and other embodiments, an intelligent smear detection method based on a classifier is provided. The method for detecting whether the smear exists in the video image to be detected displayed at the plurality of moments corresponding to the original video image to be detected at the plurality of moments based on the plurality of first feature detection points and the plurality of second feature detection points at the adjacent moments set at the plurality of moments comprises the steps of: and detecting whether the to-be-detected video image displayed at a plurality of moments corresponding to the original to-be-detected video image at the plurality of moments is smeared or not by utilizing a preset classifier based on the plurality of first characteristic detection points and the plurality of second characteristic detection points which are set at adjacent moments in the plurality of moments.
In embodiments of the present disclosure and others, the preset classifier may be configured as one or more of a support vector machine, a decision tree, a random forest, a K-nearest neighbor, logistic regression, adaptive enhancement, linear discriminant analysis, and a multi-layer perceptron.
In an embodiment of the present disclosure and other embodiments, a preset classifier is utilized to respectively obtain a plurality of first probability values indicating that a smear exists and a plurality of second probability values indicating that a smear does not exist based on the plurality of first feature detection points and the plurality of second feature detection points set at adjacent times in the plurality of times, and determine whether a smear exists in a to-be-detected video image displayed at a plurality of times corresponding to the original to-be-detected video image at the plurality of times based on the plurality of first probability values and the corresponding second probability values.
In an embodiment of the present disclosure and other embodiments, the method for determining whether a smear exists in a to-be-detected video image displayed at a plurality of moments corresponding to an original to-be-detected video image at the plurality of moments based on the plurality of first probability values and the corresponding second probability values includes: if a certain first probability value in the plurality of first probability values is larger than a corresponding second probability value, the to-be-detected video images displayed at a plurality of moments corresponding to the original to-be-detected video images at a plurality of moments have smear; otherwise, the to-be-detected video images displayed at the plurality of times corresponding to the to-be-detected video images at the plurality of times are free from smear.
In an embodiment of the present disclosure and other embodiments, the method for detecting, by using a preset classifier, whether a smear exists in the video images to be detected displayed at a plurality of moments corresponding to the original video images to be detected at the plurality of moments, based on the plurality of first feature detection points and the plurality of second feature detection points at the set adjacent moments among the plurality of moments, includes: respectively determining a plurality of paired first feature detection point sets corresponding to the plurality of first feature detection points and the plurality of second feature detection points; respectively calculating the differences between the paired feature detection points within each paired first feature detection point set to obtain corresponding difference feature vectors; and detecting, by using the preset classifier and based on the difference feature vectors, whether a smear exists in the video images to be detected displayed at the plurality of moments corresponding to the original video images to be detected at the plurality of moments.
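As an illustrative sketch of this classifier-based variant, the coordinate differences of each paired point set can be flattened into a difference feature vector and fed to a binary classifier; the use of scikit-learn's SVC, the hypothetical training data and the assumption of an equal number of paired points per frame are not specified by the disclosure.

```python
import numpy as np
from sklearn.svm import SVC

def difference_vectors(first_sets, second_sets):
    """Flatten coordinate differences of paired points; each set is an (N, 2) array.

    Assumes the same number N of paired detection points in every frame pair.
    """
    return np.stack([(np.asarray(s) - np.asarray(f)).ravel()
                     for f, s in zip(first_sets, second_sets)])

def fit_smear_classifier(first_sets, second_sets, labels):
    """labels: 1 = smear present, 0 = no smear, one label per frame pair (hypothetical data)."""
    clf = SVC(probability=True)  # assumed choice; the disclosure lists SVM, random forest, etc.
    clf.fit(difference_vectors(first_sets, second_sets), labels)
    return clf

def smear_probabilities(clf, first_sets, second_sets):
    """Return (first probability values, second probability values): P(smear), P(no smear)."""
    proba = clf.predict_proba(difference_vectors(first_sets, second_sets))
    return proba[:, 1], proba[:, 0]
```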
In an embodiment of the present disclosure and other embodiments, the method for determining a plurality of first feature detection point sets of the pairing corresponding to the plurality of first feature detection points and the plurality of second feature detection points includes: registering the original video image to be detected and the displayed video image to be detected at the same set adjacent time by using a preset registration model to obtain a first registration matrix at a plurality of corresponding time; and respectively pairing the plurality of first characteristic detection points and the plurality of second characteristic detection points based on the first registration matrixes at the plurality of moments to obtain a plurality of paired first characteristic detection point sets.
In embodiments of the present disclosure and other embodiments, one skilled in the art may configure the set adjacent time according to actual needs. For example, the set adjacent time may be configured to be 1ms, where whether there is smear in the video image to be detected displayed at a plurality of times corresponding to the video image to be detected at the plurality of times is detected based on the plurality of first feature detection points and the plurality of second feature detection points corresponding to the video image to be detected displayed at the previous 1ms or the next 1ms of the video image to be detected. Meanwhile, the set adjacent time may be configured to other values, for example, 0.1ms or 0.5ms, etc.
In an embodiment of the disclosure, the method for detecting whether a smear exists in a to-be-detected video image displayed at a plurality of moments corresponding to an original to-be-detected video image at the plurality of moments based on the plurality of first feature detection points and the plurality of second feature detection points at the set adjacent moments in the plurality of moments includes: registering the original video image to be detected and the displayed video image to be detected at the same set adjacent time by using a preset registration model to obtain a first registration matrix at a plurality of corresponding time; pairing the plurality of first characteristic detection points and the plurality of second characteristic detection points based on the first registration matrixes at the plurality of moments respectively to obtain a plurality of paired first characteristic detection point sets; and detecting whether the smear exists in the video image to be detected displayed at a plurality of moments corresponding to the original video image to be detected at a plurality of moments based on the paired first feature detection point sets and the corresponding preset distances.
In embodiments of the present disclosure and other embodiments, one skilled in the art may configure the preset registration model according to actual needs. For example, the preset registration model may be configured as a registration model corresponding to a point feature, a line feature, and a surface feature. For another example, the registration model is one or more of SIFT registration model, SURF registration model, affine transformation registration model, and the like.
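As one assumed concrete registration model, the sketch below estimates a homography between an original frame and the corresponding displayed frame from SIFT keypoint matches with OpenCV; the ratio-test threshold and RANSAC parameters are illustrative, and BGR color frames as read by OpenCV are assumed.

```python
import cv2
import numpy as np

def estimate_registration_matrix(img_a: np.ndarray, img_b: np.ndarray) -> np.ndarray:
    """Estimate a 3x3 homography mapping pixel coordinates of img_a onto img_b."""
    sift = cv2.SIFT_create()
    kp_a, des_a = sift.detectAndCompute(cv2.cvtColor(img_a, cv2.COLOR_BGR2GRAY), None)
    kp_b, des_b = sift.detectAndCompute(cv2.cvtColor(img_b, cv2.COLOR_BGR2GRAY), None)
    matches = cv2.BFMatcher(cv2.NORM_L2).knnMatch(des_a, des_b, k=2)
    good = [m for m, n in matches if m.distance < 0.75 * n.distance]  # Lowe ratio test
    src = np.float32([kp_a[m.queryIdx].pt for m in good]).reshape(-1, 1, 2)
    dst = np.float32([kp_b[m.trainIdx].pt for m in good]).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H
```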
In the embodiments of the present disclosure and other embodiments, before the registering the original video image to be detected and the displayed video image to be detected at the same set adjacent time by using the preset registration model, the method further includes: respectively counting a first number of original clipping video images at a plurality of moments corresponding to the first target and a second number of clipping video images displayed at the plurality of moments in the second target; if the first number is equal to the second number, registering original cut video images at a plurality of moments and video images to be detected displayed at a plurality of moments which are the same and set adjacent moments by using a preset registration model respectively to obtain a first registration matrix at a plurality of corresponding moments; otherwise, respectively calculating the similarity between the original clipping video images at the multiple moments and each image in the clipping video images displayed at the multiple moments; determining similar images in the original clipping video images at the multiple moments and the clipping video images displayed at the multiple moments based on the similarity and the preset similarity; and registering the original clipping video images at the multiple moments and similar images in the clipping video images displayed at the multiple moments by using a preset registration model respectively to obtain a first registration matrix at the corresponding multiple moments.
For example, in embodiments of the present disclosure and other embodiments, the first number of clipping video images that are original at the plurality of moments corresponding to the first target is 50, and the second number of clipping video images displayed at the plurality of moments within the second target is also 50. In this case the first number is equal to the second number, and the original clipping video images at the plurality of moments and the video images to be detected displayed at the plurality of moments at the same set adjacent moments are registered by using a preset registration model, respectively, to obtain the first registration matrices at the corresponding plurality of moments. For another example, the first number of original clipping video images at the plurality of moments corresponding to the first target is 50, while the second number of clipping video images displayed at the plurality of moments within the second target is 70, which indicates that new detected objects have appeared in the clipping video images displayed at the plurality of moments; or the first number of original clipping video images at the plurality of moments corresponding to the first target is 50, while the second number of clipping video images displayed at the plurality of moments within the second target is 35, which indicates that detected objects have disappeared from the clipping video images displayed at the plurality of moments. In these cases, the similarity between the original clipping video images at the plurality of moments and each image in the clipping video images displayed at the plurality of moments is calculated respectively; similar images in the original clipping video images at the plurality of moments and the clipping video images displayed at the plurality of moments are determined based on the similarity and a preset similarity; and the original clipping video images at the plurality of moments and the similar images in the clipping video images displayed at the plurality of moments are registered by using the preset registration model, respectively, to obtain the first registration matrices at the corresponding plurality of moments.
In the embodiments of the present disclosure and other embodiments, the algorithm for respectively calculating the similarity between the original cropped video images at the plurality of moments and each image in the cropped video images displayed at the plurality of moments may be configured as one or more of a cosine similarity algorithm, a mutual information algorithm, a structural similarity algorithm, a hash similarity algorithm, and the like.
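As a non-authoritative sketch of two of the listed metrics, the functions below compute an average-hash similarity and a histogram-based mutual information between two grayscale crops; the hash size, the bin count and the grayscale assumption are illustrative choices, not requirements of the disclosure.

```python
import numpy as np

def ahash_similarity(a, b, size=8):
    """Average-hash similarity: fraction of matching hash bits (grayscale crops)."""
    def ahash(img):
        h, w = img.shape[:2]
        img = img[: h - h % size, : w - w % size].astype(np.float64)
        blocks = img.reshape(size, h // size, size, w // size).mean(axis=(1, 3))
        return (blocks > blocks.mean()).ravel()
    return float((ahash(a) == ahash(b)).mean())

def mutual_information(a, b, bins=32):
    """Histogram-based mutual information between two grayscale crops of equal size."""
    hist, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = hist / hist.sum()
    px, py = pxy.sum(axis=1), pxy.sum(axis=0)
    outer = px[:, None] * py[None, :]
    nonzero = pxy > 0
    return float((pxy[nonzero] * np.log(pxy[nonzero] / outer[nonzero])).sum())
```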
In an embodiment of the present disclosure, before detecting whether smear exists in the video images to be detected displayed at the plurality of moments corresponding to the original video images to be detected at the plurality of moments based on the paired plurality of first feature detection point sets and the corresponding preset distances, the method further includes: registering, by using a preset registration model, the original video images to be detected corresponding to the set adjacent moments among the plurality of moments respectively, to obtain second registration matrices corresponding to the plurality of moments; pairing the plurality of first feature detection points based on the second registration matrices to obtain a plurality of paired second feature detection point sets; and respectively calculating the feature detection point distances among the paired plurality of second feature detection point sets to obtain the corresponding preset distances.
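A minimal sketch of this step is given below, assuming an OpenCV homography estimate can stand in for the preset registration model and assuming the first feature detection points at the two set adjacent moments are already index-aligned (at least four point pairs); the residual distances after projection then serve as the corresponding preset distances. These are illustrative assumptions, not the method prescribed by the disclosure.

```python
import numpy as np
import cv2  # OpenCV homography used here as a stand-in for the preset registration model

def registration_matrix(points_t, points_t1):
    """Estimate a registration (homography) matrix between index-aligned feature
    detection points extracted at two set adjacent moments (>= 4 pairs needed)."""
    src = np.asarray(points_t, dtype=np.float32).reshape(-1, 1, 2)
    dst = np.asarray(points_t1, dtype=np.float32).reshape(-1, 1, 2)
    H, _ = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)
    return H

def preset_distances(points_t, points_t1, H):
    """Project the points of moment t through H and measure the residuals to the
    points of moment t+1; these residuals act as the corresponding preset distances."""
    src = np.asarray(points_t, dtype=np.float32).reshape(-1, 1, 2)
    projected = cv2.perspectiveTransform(src, H).reshape(-1, 2)
    return np.linalg.norm(projected - np.asarray(points_t1, dtype=np.float32), axis=1)

# Example flow over the original (undisplayed) frames at set adjacent moments:
# H2 = registration_matrix(first_points_t, first_points_t1)
# presets = preset_distances(first_points_t, first_points_t1, H2)
```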
Likewise, in the embodiments of the present disclosure and other embodiments, before registering, by using the preset registration model, the original video images to be detected corresponding to the set adjacent moments among the plurality of moments to obtain the second registration matrices corresponding to the plurality of moments, the method further includes: respectively counting the first numbers of original cropped video images corresponding to the set adjacent moments among the plurality of moments corresponding to the first target; if the first numbers corresponding to the set adjacent moments are equal, registering, by using the preset registration model, the original cropped video images at the same set adjacent moments respectively, to obtain the second registration matrices at the corresponding plurality of moments; otherwise, respectively calculating the similarity between the images in the original cropped video images corresponding to the set adjacent moments among the plurality of moments corresponding to the first target; determining, based on the calculated similarity and the preset similarity, similar images in the original cropped video images corresponding to the set adjacent moments; and registering, by using the preset registration model, the similar images respectively, to obtain the second registration matrices corresponding to the plurality of moments.
In an embodiment of the disclosure, the method for detecting whether smear exists in the video images to be detected displayed at the plurality of moments corresponding to the original video images to be detected at the plurality of moments based on the paired plurality of first feature detection point sets and the corresponding preset distances includes: respectively calculating a plurality of feature detection point distances corresponding to the paired plurality of first feature detection point sets; and detecting whether smear exists in the video images to be detected displayed at the plurality of moments corresponding to the original video images to be detected at the plurality of moments based on the plurality of feature detection point distances and the corresponding preset distances.
In an embodiment of the present disclosure, the method for detecting whether smear exists in the video images to be detected displayed at the plurality of moments corresponding to the original video images to be detected at the plurality of moments based on the plurality of feature detection point distances and the corresponding preset distances includes: if each of the plurality of feature detection point distances is smaller than or equal to the corresponding preset distance, no smear exists in the video images to be detected displayed at the plurality of moments corresponding to the original video images to be detected at the plurality of moments; otherwise, smear exists in the video images to be detected displayed at the plurality of moments corresponding to the original video images to be detected at the plurality of moments. Alternatively, the method includes: counting the number of feature detection point distances that are greater than the corresponding preset distances; if this number is smaller than or equal to a set number, no smear exists in the video images to be detected displayed at the plurality of moments corresponding to the original video images to be detected at the plurality of moments; otherwise, smear exists in the video images to be detected displayed at the plurality of moments corresponding to the original video images to be detected at the plurality of moments.
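The two decision rules just described, together with the distance computation of the preceding paragraph, can be illustrated by the short sketch below; the set_number default is an arbitrary placeholder, not a value prescribed by the disclosure.

```python
import numpy as np

def has_smear_all(distances, presets):
    """Rule 1: no smear only if every paired distance is <= its preset distance."""
    return bool(np.any(np.asarray(distances) > np.asarray(presets)))

def has_smear_count(distances, presets, set_number=3):
    """Rule 2: smear only when the count of distances exceeding their preset
    distances is greater than the set number."""
    exceed = int(np.sum(np.asarray(distances) > np.asarray(presets)))
    return exceed > set_number
```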
In the embodiments of the present disclosure and other embodiments, the set number may be configured by those skilled in the art as required.
The smear detection method may be performed by a smear detection apparatus; for example, the smear detection method may be performed by a terminal device, a server or other processing device, where the terminal device may be user equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a personal digital assistant (PDA), a handheld device, a computing device, an in-vehicle device, a wearable device, or the like. In some possible implementations, the smear detection method may be implemented by way of a processor invoking computer-readable instructions stored in a memory.
It will be appreciated by those skilled in the art that, in the smear detection method of the above specific embodiments, the written order of the steps does not imply a strict execution order and does not constitute any limitation on the implementation; the specific execution order of the steps should be determined by their functions and possible internal logic.
Fig. 2 shows a block diagram of a smear detection apparatus according to an embodiment of the present disclosure. As shown in fig. 2, the smear detection apparatus includes: a first feature detection point extraction unit 101, configured to acquire original video images to be detected at a plurality of moments in a video stream, and respectively extract, by using a preset convolutional neural network, a plurality of first feature detection points corresponding to the original video images to be detected at the plurality of moments; a second feature detection point extraction unit 102, configured to display, by using a terminal display device, the original video images to be detected at the plurality of moments in the video stream to obtain the corresponding video images to be detected displayed at the plurality of moments, and respectively extract, by using the preset convolutional neural network, a plurality of second feature detection points corresponding to the video images to be detected displayed at the plurality of moments; and a detection unit 103, configured to detect whether smear exists in the video images to be detected displayed at the plurality of moments corresponding to the original video images to be detected at the plurality of moments based on the plurality of first feature detection points and the plurality of second feature detection points at the set adjacent moments among the plurality of moments. The apparatus thereby addresses the difficulty of detecting smear in video stream images and the poor universality of current smear detection based on video stream images.
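For orientation only, the sketch below shows one way a "preset convolutional neural network" could produce feature detection points, using a tiny PyTorch heatmap network followed by local-maximum suppression; the network architecture, the score threshold and the suppression kernel size are all illustrative assumptions rather than the model actually used by the units above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class KeypointNet(nn.Module):
    """Tiny heatmap network standing in for the preset convolutional neural network."""
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 1),  # one-channel keypoint score map
        )

    def forward(self, x):
        return torch.sigmoid(self.backbone(x))

def extract_detection_points(model, image, score_thresh=0.5, nms_kernel=5):
    """Return an N x 2 tensor of (x, y) detection points after local-maximum suppression."""
    with torch.no_grad():
        heat = model(image)                                   # shape (1, 1, H, W)
    pooled = F.max_pool2d(heat, nms_kernel, stride=1, padding=nms_kernel // 2)
    keep = (heat == pooled) & (heat > score_thresh)
    ys, xs = torch.nonzero(keep[0, 0], as_tuple=True)
    return torch.stack([xs, ys], dim=1)

# Usage with an untrained network (illustration only):
# frame = torch.rand(1, 3, 240, 320)
# points = extract_detection_points(KeypointNet().eval(), frame)
```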
In some embodiments, the functions or modules included in the apparatus provided by the embodiments of the present disclosure may be used to perform the methods described in the embodiments of the smear detection method, and the specific implementation of the methods may refer to the descriptions in the embodiments of the smear detection method, which are not repeated herein for brevity.
In an embodiment of the present disclosure, the first feature detection point extraction unit includes: a first target detection unit; the first target detection unit is configured to determine, by using a preset target detection model, first targets corresponding to the original video images to be detected at the plurality of moments before the plurality of first feature detection points corresponding to the original video images to be detected at the plurality of moments are respectively extracted; and to respectively extract, by using the preset convolutional neural network, the plurality of first feature detection points corresponding to the original video images to be detected at the plurality of moments within the first targets.
In an embodiment of the present disclosure, the first feature detection point extraction unit further includes: a first clipping unit; the first clipping unit is configured to clip the video images to be detected, which are original at multiple times, by using the first target, so as to obtain clipping video images, which are original at multiple times and correspond to the first target, respectively; and respectively extracting a plurality of first characteristic detection points corresponding to the original clipping video images at a plurality of moments by using a preset convolutional neural network.
In an embodiment of the present disclosure, the second feature detection point extraction unit includes: a second target detection unit; the second target detection unit is configured to determine, by using a preset target detection model, a second target corresponding to the video image to be detected displayed at the multiple moments before extracting multiple second feature detection points corresponding to the video image to be detected displayed at the multiple moments respectively; and respectively extracting a plurality of second characteristic detection points corresponding to the video images to be detected, which are displayed at a plurality of moments in the second target, by using a preset convolutional neural network.
In an embodiment of the present disclosure, the second feature detection point extraction unit further includes: a second clipping unit; the second clipping unit is configured to clip the video images to be detected displayed at the multiple moments by using the second target, so as to obtain clipping video images displayed at the multiple moments corresponding to the second target; and respectively extracting a plurality of second characteristic detection points corresponding to the clipping video images displayed at a plurality of moments by using a preset convolutional neural network.
In an embodiment of the present disclosure, the detection unit includes: a preset classifier; the preset classifier is used for detecting whether smear exists in the video images to be detected displayed at the plurality of moments corresponding to the original video images to be detected at the plurality of moments based on the plurality of first feature detection points and the plurality of second feature detection points at the set adjacent moments among the plurality of moments. Alternatively, the detection unit includes: a first registration unit, a first pairing unit and a first detection unit; the first registration unit is used for registering, by using a preset registration model, the original video image to be detected and the displayed video image to be detected at the same set adjacent moments to obtain first registration matrices at the corresponding plurality of moments; the first pairing unit is configured to pair the plurality of first feature detection points and the plurality of second feature detection points based on the first registration matrices at the plurality of moments respectively, to obtain a plurality of paired first feature detection point sets; the first detection unit is configured to detect whether smear exists in the video images to be detected displayed at the plurality of moments corresponding to the original video images to be detected at the plurality of moments based on the paired plurality of first feature detection point sets and the corresponding preset distances.
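As a hedged sketch of the classifier branch described above, the code below builds a simple displacement-statistics feature from paired first and second feature detection points and feeds it to a scikit-learn logistic regression standing in for the preset classifier; the training rows and labels are hand-made hypothetical placeholders used only to make the sketch runnable, not data from the disclosure.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression  # stand-in for the preset classifier

def displacement_features(first_points, second_points):
    """Per-frame feature vector (mean, max, std) of paired point displacements."""
    d = np.linalg.norm(np.asarray(second_points, float) - np.asarray(first_points, float), axis=1)
    return np.array([d.mean(), d.max(), d.std()])

# Hypothetical training rows (label 1 = smear, 0 = no smear), for illustration only.
X_train = np.array([[0.4, 1.0, 0.2], [0.5, 1.2, 0.3], [4.8, 9.5, 2.1], [6.1, 12.0, 3.0]])
y_train = np.array([0, 0, 1, 1])
preset_classifier = LogisticRegression().fit(X_train, y_train)

# smear_flag = preset_classifier.predict([displacement_features(pts_first, pts_second)])[0]
```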
In an embodiment of the disclosure, the first detection unit includes: a second registration unit, a second pairing unit and a first distance calculation unit; the second registration unit is configured to register, by using a preset registration model, the original video images to be detected corresponding to the set adjacent moments among the plurality of moments respectively, to obtain second registration matrices corresponding to the plurality of moments; the second pairing unit is used for pairing the plurality of first feature detection points based on the second registration matrices to obtain a plurality of paired second feature detection point sets; the first distance calculation unit is used for respectively calculating the feature detection point distances among the paired plurality of second feature detection point sets to obtain the corresponding preset distances.
In the embodiments of the present disclosure and other embodiments, the first detection unit further includes: a first calculation unit and a second detection unit; the first calculation unit is used for respectively calculating a plurality of feature detection point distances corresponding to the paired plurality of first feature detection point sets; the second detection unit is configured to detect whether smear exists in the video images to be detected displayed at the plurality of moments corresponding to the original video images to be detected at the plurality of moments based on the plurality of feature detection point distances and the corresponding preset distances.
In the embodiments of the present disclosure and other embodiments, the second detection unit includes: a judging unit; the judging unit is configured to determine that no smear exists in the video images to be detected displayed at the plurality of moments corresponding to the original video images to be detected at the plurality of moments if each of the plurality of feature detection point distances is smaller than or equal to the corresponding preset distance, and otherwise to determine that smear exists in the video images to be detected displayed at the plurality of moments corresponding to the original video images to be detected at the plurality of moments. Alternatively, the second detection unit includes: a statistics unit; the statistics unit is used for counting the number of feature detection point distances that are greater than the corresponding preset distances; if this number is smaller than or equal to the set number, no smear exists in the video images to be detected displayed at the plurality of moments corresponding to the original video images to be detected at the plurality of moments; otherwise, smear exists in the video images to be detected displayed at the plurality of moments corresponding to the original video images to be detected at the plurality of moments.
The disclosed embodiments also provide a computer readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the smear detection method described above. The computer readable storage medium may be a non-volatile computer readable storage medium.
The embodiment of the disclosure also provides an electronic device, including: a processor; and a memory for storing processor-executable instructions; wherein the processor is configured to implement the smear detection method described above. The electronic device may be provided as a terminal, a server, or a device in another form.
The disclosed embodiments also provide a computer program product comprising a computer program/instruction which, when executed by a processor, implements the smear detection method described above.
Fig. 3 is a block diagram of an electronic device 800, according to an example embodiment. For example, electronic device 800 may be a mobile phone, computer, digital broadcast terminal, messaging device, game console, tablet device, medical device, exercise device, personal digital assistant, or the like.
Referring to fig. 3, the electronic device 800 may include one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 generally controls overall operation of the electronic device 800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to perform all or part of the steps of the methods described above. Further, the processing component 802 can include one or more modules that facilitate interactions between the processing component 802 and other components. For example, the processing component 802 can include a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operations at the electronic device 800. Examples of such data include instructions for any application or method operating on the electronic device 800, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 804 may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.
The power supply component 806 provides power to the various components of the electronic device 800. The power components 806 may include a power management system, one or more power sources, and other components associated with generating, managing, and distributing power for the electronic device 800.
The multimedia component 808 includes a screen between the electronic device 800 and the user that provides an output interface. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. When the electronic device 800 is in an operational mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capabilities.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a Microphone (MIC) configured to receive external audio signals when the electronic device 800 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may be further stored in the memory 804 or transmitted via the communication component 816. In some embodiments, audio component 810 further includes a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, which may be a keyboard, click wheel, buttons, etc. These buttons may include, but are not limited to: homepage button, volume button, start button, and lock button.
The sensor assembly 814 includes one or more sensors for providing status assessments of various aspects of the electronic device 800. For example, the sensor assembly 814 may detect an on/off state of the electronic device 800 and the relative positioning of components, such as the display and keypad of the electronic device 800; the sensor assembly 814 may also detect a change in position of the electronic device 800 or a component of the electronic device 800, the presence or absence of user contact with the electronic device 800, the orientation or acceleration/deceleration of the electronic device 800, and a change in temperature of the electronic device 800. The sensor assembly 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor assembly 814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the electronic device 800 and other devices. The electronic device 800 may access a wireless network based on a communication standard, such as WiFi, 2G, or 3G, or a combination thereof. In one exemplary embodiment, the communication component 816 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In one exemplary embodiment, the communication component 816 further includes a Near Field Communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the electronic device 800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements, for performing the methods described above.
In an exemplary embodiment, a non-transitory computer readable storage medium is also provided, such as memory 804 including computer program instructions executable by processor 820 of electronic device 800 to perform the above-described methods.
Fig. 4 is a block diagram illustrating an electronic device 1900 according to an example embodiment. For example, electronic device 1900 may be provided as a server. Referring to FIG. 4, electronic device 1900 includes a processing component 1922 that further includes one or more processors and memory resources represented by memory 1932 for storing instructions, such as application programs, that can be executed by processing component 1922. The application programs stored in memory 1932 may include one or more modules each corresponding to a set of instructions. Further, processing component 1922 is configured to execute instructions to perform the methods described above.
The electronic device 1900 may also include a power component 1926 configured to perform power management of the electronic device 1900, a wired or wireless network interface 1950 configured to connect the electronic device 1900 to a network, and an input/output (I/O) interface 1958. The electronic device 1900 may operate based on an operating system stored in the memory 1932, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, a non-transitory computer readable storage medium is also provided, such as memory 1932, including computer program instructions executable by processing component 1922 of electronic device 1900 to perform the methods described above.
The present disclosure may be a system, method, and/or computer program product. The computer program product may include a computer readable storage medium having computer readable program instructions embodied thereon for causing a processor to implement aspects of the present disclosure.
The computer readable storage medium may be a tangible device that can hold and store instructions for use by an instruction execution device. The computer readable storage medium may be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. More specific examples (a non-exhaustive list) of the computer-readable storage medium include the following: a portable computer disk, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM or flash memory), a Static Random Access Memory (SRAM), a portable Compact Disc Read-Only Memory (CD-ROM), a Digital Versatile Disc (DVD), a memory stick, a floppy disk, a mechanically encoded device such as a punch card or a raised structure in a groove having instructions recorded thereon, and any suitable combination of the foregoing. Computer-readable storage media, as used herein, are not to be construed as transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through waveguides or other transmission media (e.g., light pulses through fiber-optic cables), or electrical signals transmitted through wires.
The computer readable program instructions described herein may be downloaded from a computer readable storage medium to a respective computing/processing device or to an external computer or external storage device over a network, such as the internet, a local area network, a wide area network, and/or a wireless network. The network may include copper transmission cables, fiber optic transmissions, wireless transmissions, routers, firewalls, switches, gateway computers and/or edge servers. The network interface card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium in the respective computing/processing device.
The computer program instructions for performing the operations of the present disclosure may be assembly instructions, Instruction Set Architecture (ISA) instructions, machine instructions, machine-related instructions, microcode, firmware instructions, state-setting data, or source or object code written in any combination of one or more programming languages, including object-oriented programming languages such as Smalltalk and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The computer readable program instructions may be executed entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider). In some embodiments, aspects of the present disclosure are implemented by personalizing electronic circuitry, such as programmable logic circuitry, Field Programmable Gate Arrays (FPGAs), or Programmable Logic Arrays (PLAs), with state information of the computer readable program instructions, the electronic circuitry being able to execute the computer readable program instructions.
Various aspects of the present disclosure are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the disclosure. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer-readable program instructions.
These computer readable program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions may also be stored in a computer readable storage medium that can direct a computer, programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable medium having the instructions stored therein includes an article of manufacture including instructions which implement the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions may also be loaded onto a computer, other programmable data processing apparatus, or other devices to cause a series of operational steps to be performed on the computer, other programmable apparatus or other devices to produce a computer implemented process such that the instructions which execute on the computer, other programmable apparatus or other devices implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The foregoing description of the embodiments of the present disclosure has been presented for purposes of illustration and description, and is not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the various embodiments described. The terminology used herein was chosen in order to best explain the principles of the embodiments, the practical application, or the technical improvement of the technology in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (10)

1. A smear detection method, comprising:
acquiring original video images to be detected at a plurality of moments in a video stream, and respectively extracting, by using a preset convolutional neural network, a plurality of first feature detection points corresponding to the original video images to be detected at the plurality of moments;
displaying the original video images to be detected at the plurality of moments in the video stream by using a terminal display device to obtain corresponding video images to be detected displayed at the plurality of moments, and respectively extracting, by using the preset convolutional neural network, a plurality of second feature detection points corresponding to the video images to be detected displayed at the plurality of moments;
and detecting whether the to-be-detected video image displayed at a plurality of moments corresponding to the original to-be-detected video image at the plurality of moments is smeared or not based on the first feature detection points and the second feature detection points which are set at the adjacent moments in the plurality of moments.
2. The smear detection method according to claim 1, comprising: before a plurality of first feature detection points corresponding to the video images to be detected, which are original at a plurality of moments, are respectively extracted, respectively determining first targets corresponding to the video images to be detected, which are original at a plurality of moments, by using a preset target detection model;
respectively extracting a plurality of first characteristic detection points corresponding to the video images to be detected, which are original at a plurality of moments, in the first target by using a preset convolutional neural network; and/or,
The method for respectively extracting a plurality of first characteristic detection points corresponding to the video images to be detected, which are original at a plurality of moments, in the first target by using a preset convolutional neural network comprises the following steps:
Cutting the video images to be detected which are original at a plurality of moments by utilizing the first target to obtain cutting video images which are original at a plurality of moments and correspond to the first target; and respectively extracting a plurality of first characteristic detection points corresponding to the original clipping video images at a plurality of moments by using a preset convolutional neural network.
3. A smear detection method according to any one of claims 1-2, comprising: before a plurality of second characteristic detection points corresponding to the video images to be detected displayed at a plurality of moments are respectively extracted, respectively determining second targets corresponding to the video images to be detected displayed at the plurality of moments by using a preset target detection model;
Respectively extracting a plurality of second characteristic detection points corresponding to the video images to be detected, which are displayed at a plurality of moments in the second target, by using a preset convolutional neural network; and/or,
The method for respectively extracting a plurality of second characteristic detection points corresponding to the video images to be detected, which are displayed at a plurality of moments in the second target, by using a preset convolutional neural network comprises the following steps:
cutting the video images to be detected displayed at a plurality of moments by using the second target to obtain cut video images displayed at a plurality of moments corresponding to the second target; and respectively extracting a plurality of second characteristic detection points corresponding to the clipping video images displayed at a plurality of moments by using a preset convolutional neural network.
4. The smear detection method according to any one of claims 1 to 3, wherein the method for detecting whether smear exists in the video images to be detected displayed at the plurality of moments corresponding to the original video images to be detected at the plurality of moments based on the plurality of first feature detection points and the plurality of second feature detection points at the set adjacent moments among the plurality of moments comprises: detecting, by using a preset classifier, whether smear exists in the video images to be detected displayed at the plurality of moments corresponding to the original video images to be detected at the plurality of moments based on the plurality of first feature detection points and the plurality of second feature detection points at the set adjacent moments among the plurality of moments; or alternatively,
The method for detecting whether the smear exists in the video image to be detected displayed at the plurality of moments corresponding to the original video image to be detected at the plurality of moments based on the plurality of first feature detection points and the plurality of second feature detection points at the set adjacent moments in the plurality of moments comprises the steps of:
Registering the original video image to be detected and the displayed video image to be detected at the same set adjacent time by using a preset registration model to obtain a first registration matrix at a plurality of corresponding time;
Pairing the plurality of first characteristic detection points and the plurality of second characteristic detection points based on the first registration matrixes at the plurality of moments respectively to obtain a plurality of paired first characteristic detection point sets;
Detecting whether a smear exists in the video image to be detected displayed at a plurality of moments corresponding to the original video image to be detected at a plurality of moments based on the paired first feature detection point sets and the corresponding preset distances; and/or,
Before detecting whether the smear exists in the to-be-detected video image displayed at a plurality of moments corresponding to the to-be-detected video image at the plurality of moments original to-be-detected video images based on the paired plurality of first feature detection point sets and the corresponding preset distances, the method further comprises:
Registering the original video images to be detected corresponding to the set adjacent moments in the moments by using a preset registration model to obtain a second registration matrix corresponding to the moments;
pairing the plurality of first feature detection points based on the second registration matrix to obtain a plurality of paired second feature detection point sets;
And respectively calculating the feature detection point distances among the plurality of second feature detection point sets of the pair to obtain corresponding preset distances.
5. A smear detection apparatus, comprising:
a first feature detection point extraction unit, configured to acquire original video images to be detected at a plurality of moments in a video stream, and respectively extract, by using a preset convolutional neural network, a plurality of first feature detection points corresponding to the original video images to be detected at the plurality of moments;
a second feature detection point extraction unit, configured to display, by using a terminal display device, the original video images to be detected at the plurality of moments in the video stream to obtain corresponding video images to be detected displayed at the plurality of moments, and respectively extract, by using the preset convolutional neural network, a plurality of second feature detection points corresponding to the video images to be detected displayed at the plurality of moments;
a detection unit, configured to detect whether smear exists in the video images to be detected displayed at the plurality of moments corresponding to the original video images to be detected at the plurality of moments based on the plurality of first feature detection points and the plurality of second feature detection points at the set adjacent moments among the plurality of moments.
6. The smear detection apparatus according to claim 5, wherein the first feature detection point extraction unit includes: a first target detection unit; the first target detection unit is configured to determine, by using a preset target detection model, first targets corresponding to the original video images to be detected at the plurality of moments before the plurality of first feature detection points corresponding to the original video images to be detected at the plurality of moments are respectively extracted; and to respectively extract, by using the preset convolutional neural network, the plurality of first feature detection points corresponding to the original video images to be detected at the plurality of moments within the first targets; and/or,
The first feature detection point extraction unit further includes: a first clipping unit; the first clipping unit is configured to clip the original video images to be detected at the plurality of moments by using the first target, so as to obtain clipping video images at the plurality of moments corresponding to the first target respectively; and to respectively extract, by using the preset convolutional neural network, a plurality of first feature detection points corresponding to the original clipping video images at the plurality of moments; and/or,
The second feature detection point extraction unit includes: a second target detection unit; the second target detection unit is configured to determine, by using a preset target detection model, a second target corresponding to the video image to be detected displayed at the multiple moments before extracting multiple second feature detection points corresponding to the video image to be detected displayed at the multiple moments respectively;
Respectively extracting a plurality of second characteristic detection points corresponding to the video images to be detected, which are displayed at a plurality of moments in the second target, by using a preset convolutional neural network; and/or,
The second feature detection point extraction unit further includes: a second clipping unit; the second clipping unit is configured to clip the video images to be detected displayed at the multiple moments by using the second target, so as to obtain clipping video images displayed at the multiple moments corresponding to the second target; and respectively extracting a plurality of second characteristic detection points corresponding to the clipping video images displayed at a plurality of moments by using a preset convolutional neural network.
7. The smear detection apparatus according to any one of claims 5 to 6, wherein the detection unit comprises: a preset classifier; the preset classifier is used for detecting whether smear exists in the video images to be detected displayed at the plurality of moments corresponding to the original video images to be detected at the plurality of moments based on the plurality of first feature detection points and the plurality of second feature detection points at the set adjacent moments among the plurality of moments; or alternatively,
The detection unit comprises: a first registration unit, a first pairing unit and a first detection unit; the first registration unit is used for registering, by using a preset registration model, the original video image to be detected and the displayed video image to be detected at the same set adjacent moments to obtain first registration matrices at the corresponding plurality of moments; the first pairing unit is configured to pair the plurality of first feature detection points and the plurality of second feature detection points based on the first registration matrices at the plurality of moments respectively, to obtain a plurality of paired first feature detection point sets; the first detection unit is configured to detect whether smear exists in the video images to be detected displayed at the plurality of moments corresponding to the original video images to be detected at the plurality of moments based on the paired plurality of first feature detection point sets and the corresponding preset distances; and/or,
The first detection unit comprises: a second registration unit, a second pairing unit and a first distance calculation unit; the second registration unit is configured to register, by using a preset registration model, the original video images to be detected corresponding to the set adjacent moments among the plurality of moments respectively, to obtain second registration matrices corresponding to the plurality of moments; the second pairing unit is used for pairing the plurality of first feature detection points based on the second registration matrices to obtain a plurality of paired second feature detection point sets; the first distance calculation unit is used for respectively calculating the feature detection point distances among the paired plurality of second feature detection point sets to obtain the corresponding preset distances.
8. An electronic device, comprising:
A processor;
a memory for storing processor-executable instructions;
Wherein the processor is configured to invoke the instructions stored by the memory to perform the smear detection method of any of claims 1 to 4.
9. A computer readable storage medium having stored thereon computer program instructions, which when executed by a processor, implement the smear detection method of any of claims 1 to 4.
10. A computer program product comprising computer programs/instructions which when executed by a processor implement a smear detection method as claimed in any one of claims 1 to 4.
CN202410241777.0A 2024-03-04 2024-03-04 Smear detection method and device, electronic equipment, storage medium and program product Pending CN118118651A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410241777.0A CN118118651A (en) 2024-03-04 2024-03-04 Smear detection method and device, electronic equipment, storage medium and program product

Publications (1)

Publication Number Publication Date
CN118118651A true CN118118651A (en) 2024-05-31

Family

ID=91213703

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410241777.0A Pending CN118118651A (en) 2024-03-04 2024-03-04 Smear detection method and device, electronic equipment, storage medium and program product

Country Status (1)

Country Link
CN (1) CN118118651A (en)

Similar Documents

Publication Publication Date Title
CN111310616B (en) Image processing method and device, electronic equipment and storage medium
CN110287874B (en) Target tracking method and device, electronic equipment and storage medium
CN112001321B (en) Network training method, pedestrian re-identification method, device, electronic equipment and storage medium
CN110674719B (en) Target object matching method and device, electronic equipment and storage medium
CN109522910B (en) Key point detection method and device, electronic equipment and storage medium
CN107692997B (en) Heart rate detection method and device
CN110688951A (en) Image processing method and device, electronic equipment and storage medium
CN111753822A (en) Text recognition method and device, electronic equipment and storage medium
CN109615006B (en) Character recognition method and device, electronic equipment and storage medium
CN110633700B (en) Video processing method and device, electronic equipment and storage medium
CN107944367B (en) Face key point detection method and device
CN111243011A (en) Key point detection method and device, electronic equipment and storage medium
CN109145970B (en) Image-based question and answer processing method and device, electronic equipment and storage medium
CN110781813B (en) Image recognition method and device, electronic equipment and storage medium
CN110781957A (en) Image processing method and device, electronic equipment and storage medium
CN110532956B (en) Image processing method and device, electronic equipment and storage medium
CN109670077B (en) Video recommendation method and device and computer-readable storage medium
CN113065591B (en) Target detection method and device, electronic equipment and storage medium
US20210326649A1 (en) Configuration method and apparatus for detector, storage medium
CN105678296B (en) Method and device for determining character inclination angle
CN113920465A (en) Method and device for identifying film trailer, electronic equipment and storage medium
CN114332503A (en) Object re-identification method and device, electronic equipment and storage medium
CN110633715B (en) Image processing method, network training method and device and electronic equipment
CN112148980A (en) Item recommendation method, device, equipment and storage medium based on user click
CN111652107A (en) Object counting method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination