CN113869310A - Dialog box detection method and device, electronic equipment and storage medium - Google Patents

Dialog box detection method and device, electronic equipment and storage medium

Info

Publication number
CN113869310A
Authority
CN
China
Prior art keywords
preset
area
region
dialog box
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111137112.8A
Other languages
Chinese (zh)
Inventor
钟东宏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Dajia Internet Information Technology Co Ltd
Original Assignee
Beijing Dajia Internet Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Dajia Internet Information Technology Co Ltd
Priority to CN202111137112.8A
Publication of CN113869310A

Landscapes

  • Image Analysis (AREA)

Abstract

The disclosure relates to a dialog box detection method, a dialog box detection device, an electronic device and a storage medium. The method comprises the following steps: performing binarization processing on a video frame image to be detected to obtain a binarized image; performing region detection on the binarized image to obtain the connected regions in the binarized image; determining, among the connected regions, a connected region that is a preset polygon as a preset polygonal area; and when a preset shape and/or a preset color exists at a preset position of the preset polygonal area, determining that the preset polygonal area is a dialog box of a preset dialog box type. Because the method, after determining the preset polygonal area, detects dialog boxes of the preset dialog box type by checking for the preset shape and/or the preset color, dialog boxes that differ only in local details can be detected accurately, which improves detection accuracy. No deep learning method and no manual data labeling are needed, which reduces labor cost.

Description

Dialog box detection method and device, electronic equipment and storage medium
Technical Field
The present disclosure relates to the field of image processing technologies, and in particular, to a dialog box detection method and apparatus, an electronic device, and a storage medium.
Background
Image target detection is a hot topic in image processing. Academia has produced much research on general and specific target detection, while practical work usually detects particular targets (such as human faces or vehicles). Current target detection methods fall mainly into traditional methods and deep learning methods, and deep learning methods have achieved strong results in target detection in recent years. Deep-learning-based target detection is currently a popular approach to many detection problems: a target detection network model is trained on a large amount of training data annotated with detection boxes, and an image to be detected is then input into the model to obtain the coordinates of the target's bounding box.
At present, dialog boxes are often manually added to video content to help describe the content or to make it more interesting and richer. However, some videos re-uploaded from external sources use dialog box templates provided by an application program other than the platform's own, so these special dialog boxes need to be detected. The appearance of such a special dialog box differs very little from that of a dialog box provided by the platform's own application program. If a deep-learning-based method is used for detection, the downsampling in the deep neural network prevents the method from distinguishing differences in target details well, so detection performance on the special dialog boxes is poor; in addition, a large amount of labeled training data is needed, which consumes considerable labor.
Disclosure of Invention
The present disclosure provides a dialog box detection method, apparatus, electronic device and storage medium, so as to at least solve the problems of poor dialog box detection accuracy and manpower waste in the related art. The technical scheme of the disclosure is as follows:
according to a first aspect of the embodiments of the present disclosure, there is provided a dialog box detection method, including:
carrying out binarization processing on a video frame image to be detected to obtain a binarized image;
carrying out region detection on the binary image to obtain a connected region in the binary image;
determining, among the connected regions, a connected region that is a preset polygon as a preset polygonal area;
and when a preset shape and/or a preset color exist at a preset position of the preset polygonal area, determining that the preset polygonal area is a dialog box of a preset dialog box type.
Optionally, when a preset shape and/or a preset color exists at a preset position of the preset polygonal area, determining that the preset polygonal area is a dialog box of a preset dialog box type, including:
determining a preset vertex in the preset polygonal area, and extracting a rectangular area or a square area with a preset size based on the preset vertex to be used as an area to be detected;
dividing the area to be detected into a first area and a second area based on a diagonal line of the area to be detected, wherein the first area is an area located above the diagonal line, and the second area is an area located below the diagonal line;
determining the pixel mean value of the first area and the pixel mean value of the second area in the binary image;
determining a ratio value of the pixel mean value of the first area to the pixel mean value of the second area as a first ratio value;
and when the first ratio value is greater than a judgment threshold, determining that the preset polygonal area is a dialog box comprising a triangular sharp corner.
Optionally, when the first ratio value is greater than a judgment threshold, determining that the preset polygonal area is a dialog box comprising a triangular sharp corner includes:
when the diagonal line is the diagonal line between the lower left corner and the upper right corner of the region to be detected, and the first ratio value is greater than the judgment threshold, determining that the preset polygonal area is a dialog box comprising a first type of triangular sharp corner;
and when the diagonal line is the diagonal line between the upper left corner and the lower right corner of the region to be detected, and the first ratio value is greater than the judgment threshold, determining that the preset polygonal area is a dialog box comprising a second type of triangular sharp corner.
Optionally, the judgment threshold is determined by the following steps:
for a plurality of image samples of the dialog box with triangular sharp corners, respectively determining a pixel mean value of a first area of each image sample as a first pixel mean value, and respectively determining a pixel mean value of a second area of each image sample as a second pixel mean value;
determining, for each image sample, the ratio of the second pixel mean value to the first pixel mean value as the second ratio value corresponding to that image sample;
determining a plurality of candidate thresholds according to the differences among the second ratio values corresponding to the image samples;
and determining the judgment threshold according to the candidate thresholds.
Optionally, when a preset shape and/or a preset color exists at a preset position of the preset polygonal area, determining that the preset polygonal area is a dialog box of a preset dialog box type, including:
when the preset polygonal area is a rectangular area, extracting a lower boundary range area and/or a right boundary range area of the rectangular area from the video frame image to be detected, and taking the lower boundary range area and/or the right boundary range area as the area to be detected;
and determining the pixel mean value of the region to be detected, and determining the preset polygonal region as a dialog box comprising preset colors when the pixel mean value is within a preset color range.
Optionally, the preset polygon is a rectangle;
determining, among the connected regions, a connected region that is a preset polygon as a preset polygonal area includes:
performing polygon approximation processing on the connected region, determining a polygon boundary of the connected region, and determining a polygon region in the connected region based on the polygon boundary of the connected region;
when the polygonal area has four polygonal boundaries, determining the included angle degree between every two intersected edges of the polygonal area;
and when the degree of the included angle between every two intersected edges is within a preset degree range, determining that the polygonal area is a rectangular area.
According to a second aspect of the embodiments of the present disclosure, there is provided a dialog box detection apparatus, including:
the binarization processing module is configured to perform binarization processing on a video frame image to be detected to obtain a binarization image;
the region detection module is configured to perform region detection on the binary image to obtain a connected region in the binary image;
a region determination module configured to perform determination of a connected region that is a preset polygon in the connected region as a preset polygon region;
a dialog box detection module configured to determine that the preset polygonal area is a dialog box of a preset dialog box type when a preset shape and/or a preset color exists at a preset position of the preset polygonal area.
Optionally, the dialog box detection module includes:
a first region-to-be-detected determining unit configured to perform determining a preset vertex in the preset polygonal region, and extract a rectangular region or a square region of a preset size as a region to be detected based on the preset vertex;
the area dividing unit is configured to divide the area to be detected into a first area and a second area based on a diagonal line of the area to be detected, wherein the first area is an area located above the diagonal line, and the second area is an area located below the diagonal line;
a pixel mean value determination unit configured to perform determination of a pixel mean value of the first region and a pixel mean value of a second region in the binarized image;
a determination data determination unit configured to perform determination of a ratio value of a pixel mean value of the first region to a pixel mean value of a second region as a first ratio value;
a first dialog box determination unit configured to determine, when the first ratio value is greater than a judgment threshold, that the preset polygonal area is a dialog box comprising a triangular sharp corner.
Optionally, the first dialog box determination unit is configured to perform:
when the diagonal line is the diagonal line between the lower left corner and the upper right corner of the region to be detected, and the first ratio value is greater than the judgment threshold, determining that the preset polygonal area is a dialog box comprising a first type of triangular sharp corner;
and when the diagonal line is the diagonal line between the upper left corner and the lower right corner of the region to be detected, and the first ratio value is greater than the judgment threshold, determining that the preset polygonal area is a dialog box comprising a second type of triangular sharp corner.
Optionally, the judgment threshold is determined by the following steps:
for a plurality of image samples of the dialog box with triangular sharp corners, respectively determining a pixel mean value of a first area of each image sample as a first pixel mean value, and respectively determining a pixel mean value of a second area of each image sample as a second pixel mean value;
determining, for each image sample, the ratio of the second pixel mean value to the first pixel mean value as the second ratio value corresponding to that image sample;
determining a plurality of candidate thresholds according to the second ratio value corresponding to each image sample;
and determining the judgment threshold according to the candidate thresholds.
Optionally, the dialog box detection module includes:
a second to-be-detected region determining unit configured to extract a lower boundary range region and/or a right boundary range region of the rectangular region from the to-be-detected video frame image when the preset polygonal region is a rectangular region, and take the lower boundary range region and/or the right boundary range region as the to-be-detected region;
and the second dialog box determining unit is configured to determine a pixel mean value of the region to be detected, and when the pixel mean value is within a preset color range, determine the preset polygonal region as a dialog box comprising a preset color.
Optionally, the preset polygon is a rectangle;
the region determination module includes:
a polygon approximation unit configured to perform polygon approximation processing on the connected region, determine a polygon boundary of the connected region, and determine a polygon region in the connected region based on the polygon boundary of the connected region;
an included angle determining unit configured to determine a degree of an included angle between two intersecting edges of the polygonal area when the polygonal area has four polygonal boundaries;
a rectangular area determination unit configured to determine the polygonal area to be a rectangular area when the degree of the included angle between the two intersecting sides is within a preset degree range.
According to a third aspect of the embodiments of the present disclosure, there is provided an electronic apparatus including:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the dialog box detection method according to the first aspect.
According to a fourth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium, wherein instructions in the storage medium, when executed by a processor of an electronic device, enable the electronic device to perform the dialog box detection method according to the first aspect.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer program product comprising a computer program or computer instructions which, when executed by a processor, implements the dialog box detection method of the first aspect.
The technical scheme provided by the embodiment of the disclosure at least brings the following beneficial effects:
Binarization processing is performed on a video frame image to be detected to obtain a binarized image; region detection is performed on the binarized image to obtain the connected regions in it; a connected region that is a preset polygon is determined as a preset polygonal area; and when a preset shape and/or a preset color exists at a preset position of the preset polygonal area, the preset polygonal area is determined to be a dialog box of a preset dialog box type. In other words, after the preset polygonal area is determined, dialog boxes of the preset dialog box type are detected by checking for the preset shape and/or the preset color.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain the principles of the disclosure and are not to be construed as limiting the disclosure.
FIGS. 1a-1c are schematic diagrams of the styles of dialog boxes to be detected in the present disclosure;
FIGS. 2a-2b are schematic diagrams of the styles of general dialog boxes in the present disclosure;
FIG. 3 is a flow diagram illustrating a dialog box detection method in accordance with an exemplary embodiment;
FIGS. 4a-4b are schematic diagrams of the partitioning of the area to be detected in the embodiments of the present disclosure;
FIG. 5 is a block diagram illustrating a dialog detection device in accordance with an exemplary embodiment;
FIG. 6 is a block diagram illustrating an electronic device in accordance with an example embodiment.
Detailed Description
In order to make the technical solutions of the present disclosure better understood by those of ordinary skill in the art, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the accompanying drawings.
It should be noted that the terms "first," "second," and the like in the description and claims of the present disclosure and in the above-described drawings are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the disclosure described herein are capable of operation in sequences other than those illustrated or otherwise described herein. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.
Figs. 1a to 1c are schematic diagrams of the styles of dialog boxes to be detected in the present disclosure, and Figs. 2a to 2b are schematic diagrams of the styles of general dialog boxes; a general dialog box may be a dialog box used by the application program itself. As shown in Figs. 1a to 1c and Figs. 2a to 2b, a dialog box to be detected differs only slightly from a general dialog box. A deep-learning-based detection network cannot distinguish such slight differences and therefore cannot detect these dialog boxes accurately.
Although the appearance of the object to be detected is similar to that of the non-detected object, the object to be detected is different from the non-detected object in some detail parts, so that a reasonable detection method can be designed by combining some image processing algorithms. Embodiments of the present disclosure provide the following processing logic flow, which may address this issue.
Fig. 3 is a flowchart illustrating a dialog box detection method according to an exemplary embodiment. As shown in Fig. 3, the method is used in an electronic device such as a server or a computer and includes the following steps.
In step S31, a binarization process is performed on the video frame image to be detected, so as to obtain a binarized image.
The video frame image to be detected is a color image.
First, binarization processing is performed on the video frame image to be detected, that is, the color video frame image is converted into a black-and-white image, which yields the binarized image.
In an exemplary embodiment, the binarizing processing on the video frame image to be detected to obtain a binarized image includes: carrying out gray level processing on the video frame image to be detected to obtain a gray level image of the video frame image to be detected; and carrying out binarization processing on the gray level image to obtain the binarized image.
First, gray-scale processing converts the three-channel color video frame image to be detected into a single-channel gray-scale image, which eliminates part of the noise interference. The gray-scale image is then converted into a binarized image based on a gray value threshold: pixels whose gray value is greater than the threshold are assigned 255, and pixels whose gray value is smaller than the threshold are assigned 0. The gray value threshold may be set as needed; for example, if the dialog box has a light color such as white or beige, the threshold may be set to 250.
After the video frame image to be detected is converted into the binary image, subsequent dialog box detection is carried out, interference caused by irrelevant pixel distribution can be eliminated, and detection accuracy is improved.
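The two stages of step S31 can be sketched in plain Python as follows. This is only an illustrative sketch, not the implementation of the disclosure: the helper names `to_gray` and `binarize`, the luma weights used for gray-scale conversion, and the choice to map a pixel exactly equal to the threshold to 0 are all assumptions, since the text does not specify them. In practice an image library such as OpenCV would typically be used instead.

```python
# Hypothetical helpers: gray-scale conversion followed by fixed-threshold
# binarization, operating on an image stored as nested lists of (R, G, B) tuples.

def to_gray(pixel):
    """Convert an (R, G, B) pixel to a gray value (assumed luma weights)."""
    r, g, b = pixel
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))


def binarize(color_image, gray_threshold=250):
    """Assign 255 to pixels brighter than the threshold and 0 to all others.

    A pixel exactly equal to the threshold is mapped to 0 (an assumption).
    """
    return [
        [255 if to_gray(px) > gray_threshold else 0 for px in row]
        for row in color_image
    ]


# A tiny 2 x 2 "frame": two near-white pixels and two darker pixels.
frame = [
    [(255, 255, 255), (10, 20, 30)],
    [(252, 251, 253), (200, 200, 200)],
]
binary = binarize(frame)
print(binary)
```

With a threshold of 250, only the near-white pixels survive as 255, which matches the example of light-colored dialog boxes given above.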
In step S32, region detection is performed on the binarized image to obtain a connected region in the binarized image.
A region detection algorithm traverses the binarized image, determines the connected regions in it, and obtains the position of each connected region. The region detection algorithm provided by OpenCV may be used.
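As an illustration of what region detection produces, the following pure-Python sketch labels connected regions with a breadth-first flood fill and reports a bounding box for each one. The function names, the use of 4-connectivity, and the bounding-box representation of a region's position are assumptions made for this sketch; the disclosure only states that OpenCV's region detection may be used.

```python
from collections import deque


def connected_regions(binary, foreground=255):
    """Return the 4-connected foreground regions as lists of (row, col) pixels."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    regions = []
    for y in range(height):
        for x in range(width):
            if binary[y][x] != foreground or seen[y][x]:
                continue
            # Breadth-first flood fill starting from an unvisited foreground pixel.
            queue = deque([(y, x)])
            seen[y][x] = True
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not seen[ny][nx] and binary[ny][nx] == foreground):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            regions.append(pixels)
    return regions


def bounding_box(pixels):
    """Return (left, top, right, bottom), inclusive, for a list of (row, col) pixels."""
    rows = [p[0] for p in pixels]
    cols = [p[1] for p in pixels]
    return (min(cols), min(rows), max(cols), max(rows))


binary = [
    [255, 255, 0, 0],
    [255, 255, 0, 255],
    [0, 0, 0, 255],
]
regions = connected_regions(binary)
boxes = [bounding_box(r) for r in regions]
print(boxes)
```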
In step S33, a connected region that is a preset polygon in the connected regions is determined as a preset polygon region.
The preset polygon is the shape of the area of a general dialog box. A dialog box to be detected differs only slightly from a general dialog box, and the preset shape and/or the preset color generally exists around the preset polygon, so the position of the general dialog box, that is, the preset polygonal area, needs to be found first.
Each connected region is processed separately to determine whether it is a preset polygonal area. If the polygon formed by the boundaries of a connected region is the preset polygon, the connected region is determined to be a preset polygonal area; otherwise the connected region is excluded. This processing yields the preset polygonal areas among the connected regions.
In one exemplary embodiment, the preset polygon is a rectangle;
determining, among the connected regions, a connected region that is a preset polygon as a preset polygonal area includes:
performing polygon approximation processing on the connected region, determining a polygon boundary of the connected region, and determining a polygon region in the connected region based on the polygon boundary of the connected region;
when the polygonal area has four polygonal boundaries, determining the included angle degree between every two intersected edges of the polygonal area;
and if the included angle degree between every two intersected edges is within a preset degree range, determining that the polygonal area is a rectangular area.
When the dialog box used by the application program is rectangular, the preset polygon is a rectangle, and it must be determined whether each connected region is a rectangular region. First, polygon approximation processing is performed on the connected regions to determine the polygon boundaries of each connected region; the region enclosed by the pairwise-intersecting polygon boundaries of a connected region is the polygonal region in that connected region. Polygonal regions with exactly four polygon boundaries are then retained, and those with fewer or more than four are filtered out. For each retained polygonal region, the included angle between every two intersecting sides is calculated; if every such angle is within the preset degree range, the polygonal region is determined to be a rectangular region. The preset degree range is 90 degrees plus or minus a preset degree, which may be, for example, 10 degrees or 15 degrees.
Based on the polygon approximation, the boundary number judgment and the included angle judgment, a roughly rectangular region can be screened out so as to screen out a dialog box of the application program and a dialog box similar to the dialog box used by the application program, and therefore the dialog box which is approximately rectangular can be detected accurately.
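The boundary-count and included-angle checks can be sketched as below. The sketch assumes that polygon approximation has already produced the polygon's vertices in order (for instance from a routine such as OpenCV's `approxPolyDP`) and that consecutive vertices are distinct; the function names and the default tolerance of 10 degrees are illustrative choices, not part of the disclosure.

```python
import math


def included_angles(vertices):
    """Return the angle in degrees between the two sides meeting at each vertex."""
    n = len(vertices)
    angles = []
    for i in range(n):
        px, py = vertices[i - 1]          # previous vertex
        cx, cy = vertices[i]              # current vertex
        nx, ny = vertices[(i + 1) % n]    # next vertex
        v1 = (px - cx, py - cy)
        v2 = (nx - cx, ny - cy)
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        norm = math.hypot(*v1) * math.hypot(*v2)
        # Clamp to guard against rounding slightly outside [-1, 1].
        cos_angle = max(-1.0, min(1.0, dot / norm))
        angles.append(math.degrees(math.acos(cos_angle)))
    return angles


def is_rectangle(vertices, tolerance_deg=10.0):
    """Accept polygons with four sides whose corner angles are 90 +/- tolerance."""
    if len(vertices) != 4:
        return False
    return all(abs(a - 90.0) <= tolerance_deg for a in included_angles(vertices))


print(is_rectangle([(0, 0), (10, 0), (10, 5), (0, 5)]))      # axis-aligned rectangle
print(is_rectangle([(0, 0), (10, 0), (10.5, 5), (0.5, 5)]))  # slightly skewed
print(is_rectangle([(0, 0), (10, 0), (14, 5), (4, 5)]))      # strongly skewed
```

The tolerance is what allows roughly rectangular regions to pass, as described above; a strict 90-degree test would reject boxes distorted by approximation error.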
In step S34, when a preset shape and/or a preset color exists at a preset position of the preset polygonal area, the preset polygonal area is determined to be a dialog box of a preset dialog box type.
Since the dialog box of the preset dialog box type to be detected generally has the preset shape and/or the preset color at the preset position of the preset polygonal area, a corresponding detection mode can be set for the dialog box of the preset dialog box type.
After the preset polygonal area is determined, the preset polygonal area can be detected by using at least one detection algorithm of the preset dialog box type, whether the preset polygonal area is a dialog box of the preset dialog box type or not is determined, and a specific preset dialog box type is determined. The method comprises the steps of firstly determining a preset position of a preset polygonal area, determining an area near the preset position, detecting whether a preset shape and/or a preset color exist in the area, if the preset shape and/or the preset color exist in the area, determining the preset polygonal area as a dialog box of a preset dialog box type, and obtaining a specific preset dialog box type based on the preset shape and/or the preset color existing in the area. The preset dialog box type can be a preset shape, a preset color, or a preset shape and a preset color.
In an exemplary embodiment, when a preset shape and/or a preset color exists at a preset position of the preset polygonal area, determining that the preset polygonal area is a dialog box of a preset dialog box type includes:
determining a preset vertex in the preset polygonal area, and extracting a rectangular area or a square area with a preset size based on the preset vertex to be used as an area to be detected;
dividing the area to be detected into a first area and a second area based on a diagonal line of the area to be detected, wherein the first area is an area located above the diagonal line, and the second area is an area located below the diagonal line;
determining the pixel mean value of the first area and the pixel mean value of the second area in the binary image;
determining a ratio value of the pixel mean value of the first area to the pixel mean value of the second area as a first ratio value;
and when the first ratio value is greater than a judgment threshold, determining that the preset polygonal area is a dialog box comprising a triangular sharp corner.
If the preset dialog box type is one in which a triangular sharp corner exists near a preset vertex, as shown in Figs. 1a and 1b, the position of the preset vertex in the preset polygonal region is first determined from the position of the preset polygonal region (if the preset polygonal region is a rectangular region or a hexagonal region, the preset vertex may be one of the two vertices of the lower side). In Figs. 1a and 1b, the preset polygonal region is a rectangular region and the preset vertex is the vertex at the lower left corner of the rectangular region. After the preset vertex is determined, a rectangular or square region of a preset size that includes the preset vertex is extracted and used as the region to be detected, in order to determine whether a triangular sharp corner exists in it. The preset size may be set according to the size of the triangular sharp corner to be detected, for example 10 × 10 pixels, 20 × 20 pixels or 30 × 30 pixels. When the region is extracted, the preset vertex may serve as one vertex of the rectangular or square region, or it may lie inside the region at a preset number of pixels from one of its vertices. For example, for the dialog boxes in Figs. 1a and 1b, when the extracted region is a square region, the preset vertex (i.e., the lower left vertex) of the rectangular region may be used as the upper left vertex of the square region, or the point two pixels above the preset vertex may be used as the upper left vertex of the square region; in either case the square region is extracted based on the preset size.
When the region to be detected is extracted, whether the preset vertex serves as a vertex of the region to be detected or lies inside it, that is, the relative position between the preset vertex and a vertex of the region to be detected (the upper left vertex for the dialog boxes shown in Figs. 1a and 1b), may be treated as a hyper-parameter. The hyper-parameter is tuned in advance for the dialog box to be detected in order to determine a suitable value.
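The extraction of the region to be detected for the case of Figs. 1a and 1b can be sketched as follows. The sketch assumes image coordinates in which y grows downward, so that "above" means a smaller y; the function names, the `upward_offset` parameter standing in for the hyper-parameter, and the clipping to the image bounds are assumptions for illustration.

```python
def region_to_detect(preset_vertex, size=20, upward_offset=0):
    """Return (left, top, right, bottom) of a square region, right/bottom exclusive.

    The square's upper left vertex is the preset vertex shifted upward by
    `upward_offset` pixels (0 places it exactly on the preset vertex).
    """
    x, y = preset_vertex
    top = y - upward_offset
    return (x, top, x + size, top + size)


def crop(image, box):
    """Crop a nested-list image to the box, clipping the box to the image bounds."""
    left, top, right, bottom = box
    height, width = len(image), len(image[0])
    left, top = max(left, 0), max(top, 0)
    right, bottom = min(right, width), min(bottom, height)
    return [row[left:right] for row in image[top:bottom]]


# Lower left vertex of a detected rectangle at (x=5, y=40); start two pixels above it.
box = region_to_detect((5, 40), size=20, upward_offset=2)
print(box)

# Cropping near the image border returns only the part that lies inside the image.
small = [[0] * 4 for _ in range(4)]
patch = crop(small, (2, 2, 6, 6))
print(len(patch), len(patch[0]))
```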
After the region to be detected is extracted, it is divided into a first region and a second region along one of its diagonals. The diagonal is chosen according to the type of triangular sharp corner to be detected; the region above the diagonal is the first region and the region below the diagonal is the second region. For example, for the dialog box shown in Fig. 1a, the region to be detected is divided along the diagonal between its lower left and upper right vertices, and for the dialog box shown in Fig. 1b, along the diagonal between its upper left and lower right vertices.
After the region to be detected is divided into the first region and the second region, the pixel mean value of the first region and the pixel mean value of the second region are computed in the binarized image, and the ratio of the pixel mean value of the first region to the pixel mean value of the second region is determined as the first ratio value. If the first ratio value is greater than the judgment threshold, the preset polygonal region is determined to be a dialog box comprising a triangular sharp corner.
The rule that a first ratio value greater than the judgment threshold indicates a dialog box comprising a triangular sharp corner applies when the dialog box region is rendered white in the binarized image.
By extracting the area to be detected near the preset vertex and dividing the area to be detected into the first area and the second area based on the diagonal line, the dialog box comprising the triangular-shaped sharp corner can be accurately judged based on the pixel mean value of the first area and the pixel mean value of the second area, and the judgment accuracy of the dialog box comprising the triangular-shaped sharp corner can be improved.
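The diagonal split and the first proportion value described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the disclosure: the region to be detected is assumed to be a 2D list of binarized pixel values (0 or 255) of at least 2 x 2 pixels, pixels lying exactly on the diagonal are skipped, and the names `diagonal_means`, `first_ratio`, `"tl_br"` and `"bl_tr"` are hypothetical.

```python
def diagonal_means(region, diagonal="tl_br"):
    """Return (mean above the diagonal, mean below the diagonal).

    diagonal="tl_br": diagonal from the upper-left to the lower-right corner.
    diagonal="bl_tr": diagonal from the lower-left to the upper-right corner.
    Both names are hypothetical labels used only in this sketch.
    """
    h, w = len(region), len(region[0])
    sums = {"above": 0, "below": 0}
    counts = {"above": 0, "below": 0}
    for y in range(h):
        for x in range(w):
            # Compare the pixel centre with the diagonal using integers only.
            # Row index y grows downward, so "above" means a smaller y.
            lhs = (2 * y + 1) * w
            if diagonal == "tl_br":
                rhs = (2 * x + 1) * h
            else:  # "bl_tr"
                rhs = (2 * (w - x) - 1) * h
            if lhs == rhs:
                continue  # pixel centre lies exactly on the diagonal
            side = "above" if lhs < rhs else "below"
            sums[side] += region[y][x]
            counts[side] += 1
    return sums["above"] / counts["above"], sums["below"] / counts["below"]


def first_ratio(region, diagonal="tl_br"):
    """Ratio of the first-area pixel mean to the second-area pixel mean."""
    above, below = diagonal_means(region, diagonal)
    if below == 0:
        # Assumption of this sketch: an all-black second area gives an
        # unbounded ratio when the first area contains any white pixel.
        return float("inf") if above > 0 else 0.0
    return above / below
```

A caller would compare `first_ratio(region, diagonal)` with the judgment threshold; how a zero mean in the second area is handled is a choice of this sketch and is not specified in the text.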
In an exemplary embodiment, when the first proportion value is greater than a judgment threshold, determining the preset polygonal area as a dialog box including a triangular cusp includes:
when the diagonal line is the diagonal line between the lower left corner and the upper right corner of the region to be detected, and the first proportional value is larger than the judgment threshold value, determining the polygonal region as a dialog box comprising a first triangular sharp corner;
and when the diagonal line is the diagonal line between the upper left corner and the lower right corner of the region to be detected, and the first proportional value is greater than the judgment threshold value, determining the preset polygonal region as a dialog box comprising a second triangular sharp corner.
The first type of triangular sharp corner is shown in fig. 1a, and the second type of triangular sharp corner is shown in fig. 1b. When judging whether the first triangular sharp corner is included, the region to be detected is divided by the diagonal between its lower left corner and its upper right corner, as shown in fig. 4a, and if the first proportion value is greater than the judgment threshold, the preset polygonal region is determined to be a dialog box including the first triangular sharp corner. When judging whether the second triangular sharp corner is included, the region to be detected is divided by the diagonal between its upper left corner and its lower right corner, as shown in fig. 4b, and if the first proportion value is greater than the judgment threshold, the preset polygonal region is determined to be a dialog box including the second triangular sharp corner. Different types of dialog boxes with triangular sharp corners can thus be distinguished through different diagonals, so that the dialog box of the corresponding type is accurately determined.
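The choice between the two types can be illustrated with a short sketch. It assumes a binarized region given as a 2D list of 0/255 values; the function names, the order in which the two diagonals are tried, and the handling of a zero mean below the diagonal are assumptions of this sketch and are not taken from the disclosure.

```python
def mean_ratio(region, diagonal):
    """Ratio of the pixel mean above the diagonal to the pixel mean below it."""
    h, w = len(region), len(region[0])
    above, below = [], []
    for y in range(h):
        for x in range(w):
            lhs = (2 * y + 1) * w
            # "tl_br": upper-left to lower-right; "bl_tr": lower-left to upper-right.
            rhs = (2 * x + 1) * h if diagonal == "tl_br" else (2 * (w - x) - 1) * h
            if lhs < rhs:
                above.append(region[y][x])
            elif lhs > rhs:
                below.append(region[y][x])
    mean_above = sum(above) / len(above)
    mean_below = sum(below) / len(below)
    if mean_below == 0:
        return float("inf") if mean_above > 0 else 0.0
    return mean_above / mean_below


def sharp_corner_type(region, threshold):
    """Return "first", "second", or None for the region to be detected."""
    # First type: diagonal between the lower left and upper right corners.
    if mean_ratio(region, "bl_tr") > threshold:
        return "first"
    # Second type: diagonal between the upper left and lower right corners.
    if mean_ratio(region, "tl_br") > threshold:
        return "second"
    return None
```

In practice the diagonal would be selected from the type of dialog box being searched for; trying both diagonals in sequence is only one possible arrangement.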
In one exemplary embodiment, the determination threshold is determined by the steps including: for a plurality of image samples of the dialog box with triangular sharp corners, respectively determining a pixel mean value of a first area of each image sample as a first pixel mean value, and respectively determining a pixel mean value of a second area of each image sample as a second pixel mean value; determining the second pixel mean value of each image sample and the proportion value of the first pixel mean value as a second proportion value corresponding to each image sample respectively; determining a plurality of candidate threshold values according to the second proportional value corresponding to each image sample; and determining the judgment threshold according to the candidate thresholds.
A plurality of image samples of dialog boxes with triangular sharp corners are acquired, and the position of the dialog box with the triangular sharp corner may be marked in each image sample. A region to be detected is determined based on that position and is divided into a first region and a second region based on a diagonal line of the region to be detected, where the first region is the region above the diagonal line and the second region is the region below the diagonal line. When the dialog box is binarized into a white region in the binarized image corresponding to the image sample, the pixel mean value of the first region of each image sample is determined as a first pixel mean value, and the pixel mean value of the second region of each image sample is determined as a second pixel mean value. The ratio of the second pixel mean value to the first pixel mean value of each image sample is determined as the second proportion value corresponding to that image sample, and the difference between 1 and each second proportion value is determined as a candidate threshold, that is, the candidate threshold corresponding to each image sample is determined according to the following formula:
$$thr = 1 - \frac{P_{bottom}}{P_{top}}$$
where $thr$ is the candidate threshold corresponding to an image sample, $P_{top}$ is the first pixel mean value of the image sample, $P_{bottom}$ is the second pixel mean value of the image sample, and $P_{bottom}/P_{top}$ is the second proportion value corresponding to the image sample.
According to the above formula, a plurality of candidate thresholds are obtained from the plurality of image samples. The distribution of the candidate thresholds is determined, the range in which most (for example, more than 50%) of the candidate thresholds lie is found statistically, and the judgment threshold is determined based on that range. For example, if among 100 image samples the second proportion values of 50 image samples all lie between 0.88 and 0.92 while the second proportion values of the other image samples are relatively dispersed, the judgment threshold may be determined to be 0.9. Determining the judgment threshold dynamically from the first pixel mean values and the second pixel mean values of a plurality of image samples improves the accuracy of the dialog box type judgment.
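The calibration procedure can be sketched as follows. The formula for the candidate threshold is the one given above; everything else is an assumption of this sketch: the text does not say how the range containing most candidates is located or how a single value is taken from it, so the fixed-width window (`window`, hypothetical) and the use of the mean of the candidates inside the densest window are illustrative choices only.

```python
def candidate_threshold(p_top, p_bottom):
    """thr = 1 - P_bottom / P_top for one image sample."""
    return 1.0 - p_bottom / p_top


def judgment_threshold(samples, window=0.04):
    """Pick a judgment threshold from (P_top, P_bottom) pairs.

    Assumption: the densest interval of width `window` stands in for
    "the range where most candidate thresholds are located", and the
    mean of the candidates inside it is returned.
    """
    candidates = sorted(candidate_threshold(t, b) for t, b in samples)
    best = []
    for start in candidates:
        inside = [c for c in candidates if start <= c <= start + window]
        if len(inside) > len(best):
            best = inside
    return sum(best) / len(best)
```

For instance, with the hypothetical samples `[(200, 20), (200, 22), (200, 18), (200, 100), (200, 160)]`, three candidates cluster around 0.9 and the two outliers are ignored.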
In another exemplary embodiment, when a preset shape and/or a preset color exists at a preset position of the preset polygonal area, determining that the preset polygonal area is a dialog box of a preset dialog box type includes: when the preset polygonal area is a rectangular area, extracting a lower boundary range area and/or a right boundary range area of the rectangular area from the video frame image to be detected, and taking the lower boundary range area and/or the right boundary range area as the area to be detected; and determining the pixel mean value of the region to be detected, and determining the preset polygonal region as a dialog box comprising preset colors when the pixel mean value is within a preset color range.
The dialog box shown in fig. 1c is a rectangular dialog box with specific colors in the areas near its lower boundary and right boundary. When detecting such dialog boxes, the lower boundary range area and/or the right boundary range area of the rectangular area is extracted from the video frame image to be detected and is used as the area to be detected. The pixel mean value of the area to be detected is computed, and it is judged whether the pixel mean value is within the preset color range; if so, the preset polygonal area is determined to be a dialog box including the preset color, and if not, the preset polygonal area is determined not to be a dialog box including the preset color. By computing the pixel mean values of the lower boundary range area and/or the right boundary range area of the preset polygonal area, a dialog box with the preset color at the preset position can be accurately judged, which improves the judgment accuracy for dialog boxes with the preset color.
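A minimal sketch of the color check is given below. It assumes the video frame is a 2D list of (R, G, B) tuples, that the boundary range areas are strips of `band` pixels taken just inside the rectangle, and that both strips must fall within the preset color range; the text allows "and/or", and the strip width, the per-channel range test, and all names here are hypothetical.

```python
def mean_color(image, x0, y0, x1, y1):
    """Mean (R, G, B) over image[y][x] for x0 <= x < x1 and y0 <= y < y1."""
    pixels = [image[y][x] for y in range(y0, y1) for x in range(x0, x1)]
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def has_preset_border_color(image, rect, low, high, band=2):
    """Check the lower and right boundary strips of rect = (x, y, w, h).

    low and high are per-channel bounds of the preset color range.
    """
    x, y, w, h = rect

    def in_range(mean):
        return all(lo <= v <= hi for v, lo, hi in zip(mean, low, high))

    # Strip along the lower boundary and strip along the right boundary.
    bottom = mean_color(image, x, y + h - band, x + w, y + h)
    right = mean_color(image, x + w - band, y, x + w, y + h)
    return in_range(bottom) and in_range(right)
```

Replacing `and` with `or` in the last line would give the variant in which only one of the two boundary strips needs to match.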
The dialog box detection method provided by the present exemplary embodiment performs binarization processing on a video frame image to be detected to obtain a binarized image, performs region detection on the binarized image to obtain the connected regions in the binarized image, and determines a connected region that is a preset polygon as a preset polygonal area. If a preset shape and/or a preset color exists at a preset position of the preset polygonal area, the preset polygonal area is determined to be a dialog box of a preset dialog box type. Because the dialog box of the preset dialog box type is detected by checking for the preset shape and the preset color after the preset polygonal area is determined, dialog boxes with local details can be detected accurately and the detection accuracy is improved. In addition, no deep learning method is needed for detection and no manual data labeling is required, which reduces labor cost.
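The region detection step can be illustrated with a simple flood fill. This is a sketch under assumptions rather than the method of the disclosure: the text does not state which connectivity or labeling algorithm is used, so 4-connectivity and a breadth-first search are chosen here for illustration, and the function names are hypothetical.

```python
from collections import deque


def connected_regions(binary):
    """Return one list of (x, y) pixels per 4-connected white region."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for sy in range(h):
        for sx in range(w):
            if binary[sy][sx] == 0 or seen[sy][sx]:
                continue
            # Breadth-first flood fill starting from an unvisited white pixel.
            queue, pixels = deque([(sx, sy)]), []
            seen[sy][sx] = True
            while queue:
                x, y = queue.popleft()
                pixels.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if 0 <= nx < w and 0 <= ny < h and binary[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            regions.append(pixels)
    return regions


def bounding_box(pixels):
    """Return (x, y, width, height) of the smallest box containing the pixels."""
    xs = [p[0] for p in pixels]
    ys = [p[1] for p in pixels]
    return min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1
```

Each region returned here would then be passed to the polygon check and to the shape or color tests at the preset position.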
Fig. 5 is a block diagram illustrating a dialog detection device according to an example embodiment. Referring to fig. 5, the apparatus includes a binarization processing module 51, a region detection module 52, a region determination module 53, and a dialog detection module 54.
The binarization processing module 51 is configured to perform binarization processing on a video frame image to be detected to obtain a binarized image;
the region detection module 52 is configured to perform region detection on the binarized image, resulting in a connected region in the binarized image;
the region determining module 53 is configured to perform determining a connected region that is a preset polygon in the connected regions as a preset polygon region;
the dialog box detection module 54 is configured to perform determining that the preset polygonal area is a dialog box of a preset dialog box type when a preset shape and/or a preset color exists at a preset position of the preset polygonal area.
Optionally, the dialog box detection module includes:
a first region-to-be-detected determining unit configured to perform determining a preset vertex in the preset polygonal region, and extract a rectangular region or a square region of a preset size as a region to be detected based on the preset vertex;
the area dividing unit is configured to divide the area to be detected into a first area and a second area based on a diagonal line of the area to be detected, wherein the first area is an area located above the diagonal line, and the second area is an area located below the diagonal line;
a pixel mean value determination unit configured to perform determination of a pixel mean value of the first region and a pixel mean value of a second region in the binarized image;
a determination data determination unit configured to perform determination of a ratio value of a pixel mean value of the first region to a pixel mean value of a second region as a first ratio value;
a first dialog box determination unit configured to perform, when the first scale value is greater than a determination threshold value, determining that the preset polygonal area is a dialog box including a triangular-shaped tip angle.
Optionally, the first dialog box determination unit is configured to perform:
when the diagonal line is the diagonal line between the lower left corner and the upper right corner of the region to be detected, and the first proportional value is larger than the judgment threshold value, determining the polygonal region as a dialog box comprising a first triangular sharp corner;
and when the diagonal line is the diagonal line between the upper left corner and the lower right corner of the region to be detected, and the first proportion value is greater than the judgment threshold value, determining the preset polygonal region as a dialog box comprising a second triangular sharp corner.
Optionally, the judgment threshold is determined by the following steps:
for a plurality of image samples of the dialog box with triangular sharp corners, respectively determining a pixel mean value of a first area of each image sample as a first pixel mean value, and respectively determining a pixel mean value of a second area of each image sample as a second pixel mean value;
determining the second pixel mean value of each image sample and the proportion value of the first pixel mean value as a second proportion value corresponding to each image sample respectively;
determining a plurality of candidate threshold values according to the second proportional value corresponding to each image sample;
and determining the judgment threshold according to the candidate thresholds.
Optionally, the dialog box detection module includes:
a second to-be-detected region determining unit configured to extract a lower boundary range region and/or a right boundary range region of the rectangular region from the to-be-detected video frame image when the preset polygonal region is a rectangular region, and take the lower boundary range region and/or the right boundary range region as the to-be-detected region;
and the second dialog box determining unit is configured to determine a pixel mean value of the region to be detected, and when the pixel mean value is within a preset color range, determine the preset polygonal region as a dialog box comprising a preset color.
Optionally, the preset polygon is a rectangle;
the region determination module includes:
a polygon approximation unit configured to perform polygon approximation processing on the connected region, determine a polygon boundary of the connected region, and determine a polygon region in the connected region based on the polygon boundary of the connected region;
an included angle determining unit configured to determine a degree of an included angle between two intersecting edges of the polygonal area when the polygonal area has four polygonal boundaries;
a rectangular area determination unit configured to determine the polygonal area to be a rectangular area when the degree of the included angle between the two intersecting sides is within a preset degree range.
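The included-angle test performed by these units can be sketched as follows, assuming the polygon approximation has already produced the vertices of the polygonal area in order. The degree range of 85 to 95 and the function names are hypothetical; the text only states that the range is preset. Vertices are assumed to be distinct so that no edge has zero length.

```python
import math


def corner_angles(polygon):
    """Angle in degrees between the two edges meeting at each vertex."""
    angles = []
    n = len(polygon)
    for i in range(n):
        px, py = polygon[i - 1]
        cx, cy = polygon[i]
        nx, ny = polygon[(i + 1) % n]
        ax, ay = px - cx, py - cy
        bx, by = nx - cx, ny - cy
        cos_t = (ax * bx + ay * by) / (math.hypot(ax, ay) * math.hypot(bx, by))
        # Clamp to [-1, 1] to guard against rounding before acos.
        angles.append(math.degrees(math.acos(max(-1.0, min(1.0, cos_t)))))
    return angles


def is_rectangle(polygon, low=85.0, high=95.0):
    """True when the polygon has four boundaries and every angle is in range."""
    return len(polygon) == 4 and all(low <= a <= high for a in corner_angles(polygon))
```

A tolerance around 90 degrees lets slightly skewed boundaries produced by the approximation still count as rectangular.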
Optionally, the binarization processing module includes:
the gray processing unit is configured to perform gray processing on the video frame image to be detected to obtain a gray image of the video frame image to be detected;
and the binarization processing unit is configured to perform binarization processing on the gray level image to obtain the binarization image.
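The two units above can be illustrated with a short sketch. It assumes the frame is a 2D list of (R, G, B) tuples, uses the ITU-R BT.601 luma weights for the gray conversion, and applies a fixed global threshold of 128; the weights, the threshold value, and the function names are assumptions of this sketch, since this section does not specify them.

```python
def to_gray(image):
    """Convert a 2D list of (R, G, B) tuples to gray values in 0..255."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
            for row in image]


def binarize(gray, threshold=128):
    """Map gray values at or above the threshold to 255 and the rest to 0."""
    return [[255 if v >= threshold else 0 for v in row] for row in gray]
```

With these choices a bright dialog box interior becomes white in the binarized image, which matches the assumption made by the triangular sharp corner test.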
With regard to the apparatus in the above-described embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment related to the method, and will not be elaborated here.
FIG. 6 is a block diagram illustrating an electronic device in accordance with an example embodiment. For example, the electronic device 600 may be provided as a server. Referring to fig. 6, electronic device 600 includes a processing component 622 that further includes one or more processors and memory resources, represented by memory 632, for storing instructions, such as application programs, that are executable by processing component 622. The application programs stored in memory 632 may include one or more modules that each correspond to a set of instructions. Further, the processing component 622 is configured to execute instructions to perform the dialog box detection method described above.
The electronic device 600 may also include a power component 626 configured to perform power management for the electronic device 600, a wired or wireless network interface 650 configured to connect the electronic device 600 to a network, and an input/output (I/O) interface 658. The electronic device 600 may operate based on an operating system stored in the memory 632, such as Windows Server, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
In an exemplary embodiment, a computer-readable storage medium comprising instructions is also provided, such as the memory 632 comprising instructions, where the instructions are executable by the processing component 622 of the electronic device 600 to perform the above-described dialog box detection method. Optionally, the computer-readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
In an exemplary embodiment, a computer program product is also provided, comprising a computer program or computer instructions, which when executed by a processor, implements the dialog box detection method described above.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.

Claims (10)

1. A dialog box detection method, comprising:
carrying out binarization processing on a video frame image to be detected to obtain a binarized image;
carrying out region detection on the binary image to obtain a connected region in the binary image;
determining a connected region that is a preset polygon, among the connected regions, as a preset polygonal area;
and when a preset shape and/or a preset color exist at a preset position of the preset polygonal area, determining that the preset polygonal area is a dialog box of a preset dialog box type.
2. The method according to claim 1, wherein when a preset shape and/or a preset color exists at a preset position of the preset polygonal area, determining the preset polygonal area as a dialog box of a preset dialog box type comprises:
determining a preset vertex in the preset polygonal area, and extracting a rectangular area or a square area with a preset size based on the preset vertex to be used as an area to be detected;
dividing the area to be detected into a first area and a second area based on a diagonal line of the area to be detected, wherein the first area is an area located above the diagonal line, and the second area is an area located below the diagonal line;
determining the pixel mean value of the first area and the pixel mean value of the second area in the binary image;
determining a ratio value of the pixel mean value of the first area to the pixel mean value of the second area as a first ratio value;
and when the first proportion value is larger than a judgment threshold value, determining the preset polygonal area as a dialog box comprising triangular sharp corners.
3. The method according to claim 2, wherein when the first proportion value is greater than a judgment threshold, determining the preset polygonal area as a dialog box including a triangular cusp comprises:
when the diagonal line is the diagonal line between the lower left corner and the upper right corner of the region to be detected, and the first proportional value is larger than the judgment threshold value, determining the polygonal region as a dialog box comprising a first triangular sharp corner;
and when the diagonal line is the diagonal line between the upper left corner and the lower right corner of the region to be detected, and the first proportional value is greater than the judgment threshold value, determining the preset polygonal region as a dialog box comprising a second triangular sharp corner.
4. The method of claim 2, wherein the decision threshold is determined by:
for a plurality of image samples of the dialog box with triangular sharp corners, respectively determining a pixel mean value of a first area of each image sample as a first pixel mean value, and respectively determining a pixel mean value of a second area of each image sample as a second pixel mean value;
determining the second pixel mean value of each image sample and the proportion value of the first pixel mean value as a second proportion value corresponding to each image sample respectively;
determining a plurality of candidate threshold values according to the second proportional value corresponding to each image sample;
and determining the judgment threshold according to the candidate thresholds.
5. The method according to claim 1, wherein when a preset shape and/or a preset color exists at a preset position of the preset polygonal area, determining the preset polygonal area as a dialog box of a preset dialog box type comprises:
when the preset polygonal area is a rectangular area, extracting a lower boundary range area and/or a right boundary range area of the rectangular area from the video frame image to be detected, and taking the lower boundary range area and/or the right boundary range area as the area to be detected;
and determining the pixel mean value of the region to be detected, and determining the preset polygonal region as a dialog box comprising preset colors when the pixel mean value is within a preset color range.
6. The method according to any one of claims 1 to 5, wherein the preset polygon is a rectangle;
determining a connected region that is a preset polygon, among the connected regions, as a preset polygonal area comprises:
performing polygon approximation processing on the connected region, determining a polygon boundary of the connected region, and determining a polygon region in the connected region based on the polygon boundary of the connected region;
when the polygonal area has four polygonal boundaries, determining the included angle degree between every two intersected edges of the polygonal area;
and when the degree of the included angle between every two intersected edges is within a preset degree range, determining that the polygonal area is a rectangular area.
7. A dialog box detection device, comprising:
the binarization processing module is configured to perform binarization processing on a video frame image to be detected to obtain a binarization image;
the region detection module is configured to perform region detection on the binary image to obtain a connected region in the binary image;
a region determination module configured to perform determination of a connected region that is a preset polygon in the connected region as a preset polygon region;
the dialog box detection module is configured to determine that the preset polygonal area is a dialog box of a preset dialog box type when a preset shape and/or a preset color exists at a preset position of the preset polygonal area.
8. An electronic device, comprising:
a processor;
a memory for storing the processor-executable instructions;
wherein the processor is configured to execute the instructions to implement the dialog box detection method of any of claims 1 to 6.
9. A computer-readable storage medium whose instructions, when executed by a processor of an electronic device, enable the electronic device to perform the dialog box detection method of any of claims 1 to 6.
10. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the dialog box detection method of any of claims 1 to 6.
CN202111137112.8A 2021-09-27 2021-09-27 Dialog box detection method and device, electronic equipment and storage medium Pending CN113869310A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111137112.8A CN113869310A (en) 2021-09-27 2021-09-27 Dialog box detection method and device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111137112.8A CN113869310A (en) 2021-09-27 2021-09-27 Dialog box detection method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN113869310A true CN113869310A (en) 2021-12-31

Family

ID=78991298

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111137112.8A Pending CN113869310A (en) 2021-09-27 2021-09-27 Dialog box detection method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113869310A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5315667A (en) * 1991-10-31 1994-05-24 International Business Machines Corporation On-line handwriting recognition using a prototype confusability dialog
CN106254933A (en) * 2016-08-08 2016-12-21 腾讯科技(深圳)有限公司 Subtitle extraction method and device
EP3327617A2 (en) * 2016-11-29 2018-05-30 Sap Se Object detection in image data using depth segmentation
CN109272016A (en) * 2018-08-08 2019-01-25 广州视源电子科技股份有限公司 Object detection method, device, terminal device and computer readable storage medium
CN109976637A (en) * 2019-03-27 2019-07-05 网易(杭州)网络有限公司 Dialog box method of adjustment, dialog box adjustment device, electronic equipment and storage medium
CN111681284A (en) * 2020-06-09 2020-09-18 商汤集团有限公司 Corner point detection method and device, electronic equipment and storage medium
CN111950424A (en) * 2020-08-06 2020-11-17 腾讯科技(深圳)有限公司 Video data processing method and device, computer and readable storage medium
CN115578483A (en) * 2021-06-21 2023-01-06 哈皮尼思(北京)文化科技有限公司 Method, device and equipment for generating strip-diffuse image and computer storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
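汪亭亭; 吴彦文; 艾学轶: "Learning fatigue recognition and intervention method based on facial expression recognition", Computer Engineering and Design (计算机工程与设计), no. 08, 28 April 2010 (2010-04-28) *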

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination